I want to use a pretrained BPEmb model. I used SentencePiece to create training.en.vocab and training.bn.vocab, which I use as the vocabulary files.
I downloaded the BPEmb W2V files and set them as the source and target embedding files.
Is this the correct pipeline for data preprocessing and training?
OS: Windows 10
model_dir: run/

data:
  train_features_file: TrainEn.txt
  train_labels_file: trainBn.txt
  eval_features_file: trainDevEn.txt
  eval_labels_file: trainDevBn.txt
  source_vocabulary: training.en.vocab
  target_vocabulary: training.bn.vocab

train:
  max_step: 5000
  batch_size: 40
  source_embedding:
    path: en.wiki.bpe.vs25000.d300.w2v.txt
    with_header: False
    case_insensitive: True
  target_embedding:
    path: bn.wiki.bpe.vs25000.d300.w2v.txt
    with_header: False
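One way to sanity-check this setup is to measure how many tokens in each vocabulary file actually have a vector in the matching embedding file, since subword pieces learned by my own SentencePiece run may differ from the pieces BPEmb was trained with. Below is a minimal sketch of such a check, not a definitive tool. It assumes the .vocab files use SentencePiece's "piece<TAB>score" line format and that the embedding files are plain text with one "token v1 v2 ..." entry per line; whether the first line is a "<count> <dim>" header is something to verify by opening the file. The function names vocab_tokens, embedding_tokens, and coverage are hypothetical helpers, not part of any library.

```python
from typing import Iterable, Set


def vocab_tokens(lines: Iterable[str]) -> Set[str]:
    """Collect tokens from vocab lines, assumed to be "piece<TAB>score"."""
    tokens = set()
    for line in lines:
        line = line.rstrip("\n")
        if line:
            # Keep only the piece; drop the score column if present.
            tokens.add(line.split("\t")[0])
    return tokens


def embedding_tokens(lines: Iterable[str], with_header: bool) -> Set[str]:
    """Collect tokens from embedding lines, assumed to be "token v1 v2 ...".

    Set with_header=True if the first line is a "<count> <dim>" header.
    """
    tokens = set()
    for i, line in enumerate(lines):
        if with_header and i == 0:
            continue  # skip the header line
        parts = line.rstrip("\n").split(" ")
        if len(parts) > 1:
            tokens.add(parts[0])
    return tokens


def coverage(vocab: Set[str], embedded: Set[str],
             case_insensitive: bool = False) -> float:
    """Return the fraction of vocab tokens that have an embedding entry."""
    if not vocab:
        return 0.0
    if case_insensitive:
        embedded = {t.lower() for t in embedded}
        hits = sum(1 for t in vocab if t.lower() in embedded)
    else:
        hits = sum(1 for t in vocab if t in embedded)
    return hits / len(vocab)
```

To use it, open training.en.vocab and en.wiki.bpe.vs25000.d300.w2v.txt with encoding="utf-8", pass the file objects to vocab_tokens and embedding_tokens, and print the result of coverage. A low fraction would suggest the vocabulary and the pretrained embeddings were built from different subword segmentations.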