Inferences from non-training data all giving the same result in OpenNMT-tf

spaces · September 13, 2018, 2:56pm

Hi all,

I’ve been experimenting with OpenNMT-tf on my own data: series of numbers as the source and word sentences as the target. The source and target files are each about 20,000 lines long, with lines of generally uneven length.

When I run onmt-main infer --config config/opennmt-defaults.yml config/data/toy-ende.yml --features_file data/toy-ende/src-test.txt with a src-test file of new inputs, every output is the same, seemingly random sentence (one that is present in the target training data). What I was hoping to get from inference was predictions that differ from the training data: a mishmash of new sentences (likely of varying levels of sense).
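To put a number on how repetitive the output is, the predictions can be counted after saving them to a file (for example by redirecting the command's standard output). The following is only a minimal sketch: the function name summarize_predictions and the file name predictions.txt are hypothetical and not part of OpenNMT-tf.

```python
from collections import Counter


def summarize_predictions(lines):
    """Count how often each distinct predicted sentence occurs.

    `lines` is any iterable of strings, e.g. an open text file.
    Blank lines are ignored.
    """
    cleaned = [line.strip() for line in lines if line.strip()]
    counts = Counter(cleaned)
    return {
        "total": len(cleaned),                 # number of non-empty predictions
        "distinct": len(counts),               # number of different sentences
        "most_common": counts.most_common(3),  # top sentences with their counts
    }


# Example usage (predictions.txt is a hypothetical file name):
# with open("predictions.txt", encoding="utf-8") as f:
#     print(summarize_predictions(f))
```

If "distinct" comes out as 1, the model is producing a single sentence regardless of the input, which matches the behaviour described above.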

I’m running the NMTBig model, trained for 200,000 steps. The beam width is 12, the learning rate 1.0, the decay rate 0.7, the batch size 64, and the inference batch size 30.

Am I doing something wrong that causes these repeated results? And can anyone suggest what I can do to get the results I want?

guillaumekln · September 14, 2018, 8:08am

Hi,

It looks like you don’t have enough data to train the NMTBig model which, as the name suggests, is a large sequence-to-sequence model.

Maybe try to train NMTSmall and make sure to use train_and_eval to monitor how the evaluation loss evolves.
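As a rough illustration of what monitoring the evaluation loss can reveal, here is a small sketch that works on (step, evaluation loss) pairs copied by hand from the training logs. The helper names and the patience value are assumptions made for this example only; they are not part of OpenNMT-tf.

```python
def best_eval_step(history):
    """Return the (step, loss) pair with the lowest evaluation loss.

    `history` is a list of (step, eval_loss) pairs in training order.
    """
    return min(history, key=lambda pair: pair[1])


def is_overfitting(history, patience=3):
    """Return True if each of the last `patience` evaluations is worse
    than the best loss seen before them.

    With too few evaluations to judge, return False.
    """
    if len(history) <= patience:
        return False
    best_before = min(loss for _, loss in history[:-patience])
    return all(loss > best_before for _, loss in history[-patience:])
```

If the evaluation loss bottoms out early and then keeps rising while the training loss keeps falling, the model is probably memorizing the training set, and the checkpoint closest to the best step is likely the more useful one.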

spaces · September 15, 2018, 7:08pm

Thanks for the suggestion! I did some experiments with NMTSmall (200,000 steps), trained with train_and_eval, and it seems to give the same results as before.

I wonder if it’s my training data? My vocab file has some individual floats, as well as collections of 10-15 floats, all paired with what are fairly uniform sentences in my target data. Could this be the culprit?
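One way I could check this is to count how often each source token occurs. If most of the floats appear only once in the training data, the model may have very little to learn from on the source side. This is just a sketch that assumes whitespace tokenization; token_stats is a hypothetical helper, not an OpenNMT-tf function.

```python
from collections import Counter


def token_stats(lines):
    """Summarize token frequencies for whitespace-tokenized lines."""
    counts = Counter(tok for line in lines for tok in line.split())
    types = len(counts)
    singletons = sum(1 for c in counts.values() if c == 1)
    return {
        "types": types,                 # number of distinct tokens
        "tokens": sum(counts.values()), # total number of tokens
        "singleton_types": singletons,  # tokens that occur exactly once
        "singleton_ratio": singletons / types if types else 0.0,
    }


# Example usage (the path is a placeholder for the source training file):
# with open("src-train.txt", encoding="utf-8") as f:
#     print(token_stats(f))
```

A singleton ratio close to 1.0 would suggest that almost every float is a unique symbol to the model.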