Hello,
It doesn’t seem to matter whether my corpus has 10K, 100K, 1 million, or 20 million segments: training always takes about 4 hours on my GPU. I also see 100K training steps, again regardless of corpus size. This does not seem right. Before I dive into the scripts and hyperparameters, I wanted to check whether this is the expected default behavior. Here are the commands I use to preprocess, train, translate, and evaluate the model:
python3 preprocess.py -train_src data/src-train.txt -train_tgt data/tgt-train.txt -valid_src data/src-val.txt -valid_tgt data/tgt-val.txt -save_data data/demo
python3 train.py -data data/demo -save_model demo-model -world_size 1 -gpu_ranks 0
python3 translate.py -model enes_1mil/demo-model_step_100000.pt -src data/src-test.txt -output pred.txt -replace_unk -verbose
~/workspace/OpenNMT-py/tools/multi-bleu.perl data/tgt-test.txt < data/pred.txt
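For context, here is the back-of-envelope arithmetic behind my concern. A fixed step count means the number of passes over the data shrinks as the corpus grows. (The batch size of 64 sentences below is an assumption on my part, not something I've verified; if the trainer batches by tokens instead, the exact numbers change, but the pattern is the same.)

```python
# Rough epoch count for a fixed-step training run.
# ASSUMPTION: sentence-level batching at 64 sentences per step;
# token-based batching would change the arithmetic but not the trend.
TRAIN_STEPS = 100_000
BATCH_SIZE = 64  # sentences per step (assumed)

def epochs_seen(corpus_segments: int) -> float:
    """Approximate passes over the corpus during TRAIN_STEPS steps."""
    return TRAIN_STEPS * BATCH_SIZE / corpus_segments

for n in (10_000, 100_000, 1_000_000, 20_000_000):
    print(f"{n:>10,} segments -> {epochs_seen(n):8.2f} epochs")
```

Under that assumption a 10K-segment corpus would be seen about 640 times, while the 20M-segment corpus would be seen only about a third of once, yet both runs take the same wall-clock time.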
I get 35.68 BLEU (detok) on my EN>ES model trained on 20 million segments from the ParaCrawl corpus.
Any feedback is appreciated.
Thanks,
Steve