The preprocessing step is no longer necessary since the OpenNMT-py 2.0 release.
The doc in question was updated to reflect that.
Subword vs. word tokenization actually doesn’t matter much here: if you use subword tokenization and pass pretrained subword embeddings, it will work exactly the same as word tokenization with pretrained word embeddings.
If BPEmb requires a specific sentencepiece model, then that’s the one you need to use. See this entry for on-the-fly tokenization.
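As a rough sketch of what such a config could look like in OpenNMT-py 2.0 (the file paths and BPEmb filenames below are placeholders, not something the original post specifies; the option names are the standard `transforms`, `src_subword_model`, `src_embeddings`, and `embeddings_type` keys from the 2.0 YAML config):

```yaml
# Hypothetical example config -- adjust paths to your own data and BPEmb download.
data:
  corpus_1:
    path_src: data/src-train.txt
    path_tgt: data/tgt-train.txt
  valid:
    path_src: data/src-val.txt
    path_tgt: data/tgt-val.txt

# Apply sentencepiece on the fly, using the model that ships with BPEmb
transforms: [sentencepiece]
src_subword_model: bpemb/en.wiki.bpe.vs25000.model   # placeholder filename
tgt_subword_model: bpemb/en.wiki.bpe.vs25000.model

# Pretrained subword embeddings from BPEmb (word2vec text format)
src_embeddings: bpemb/en.wiki.bpe.vs25000.d300.w2v.txt   # placeholder filename
tgt_embeddings: bpemb/en.wiki.bpe.vs25000.d300.w2v.txt
embeddings_type: word2vec
word_vec_size: 300
```

The key point is that the sentencepiece model and the embeddings must come from the same BPEmb release, so that the subwords the transform produces line up with the rows of the embedding matrix.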
Maybe you should try a simpler setup without BPEmb first, to get your head around how it all works.