I want to use OpenNMT to create “meaning representations” of a sentence. I basically want to “translate” an English sentence into a meaning representation. Previously, I used TensorFlow sequence-to-sequence models, and in that case character-level models clearly outperformed word-level models, which is why I want to use character-level input here as well.
I know OpenNMT does not have a separate setting for character-level input (although I read about possibly adding support), so I simply gave the characters as input, with + marking the spaces between words, e.g. m e a n i n g + r e p r e s e n t a t i o n instead of just meaning representation.
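To make the input format concrete, here is a minimal sketch of the conversion described above. The function names `to_char_level` and `from_char_level` are only illustrative and not part of OpenNMT, and the sketch assumes that words are separated by single spaces and that + does not otherwise occur in the data.

```python
def to_char_level(sentence: str, space_marker: str = "+") -> str:
    """Split a sentence into space-separated characters.

    Spaces between words are replaced by space_marker so that word
    boundaries survive the split. Assumes the marker does not occur
    in the text itself.
    """
    return " ".join(space_marker if ch == " " else ch for ch in sentence.strip())


def from_char_level(tokens: str, space_marker: str = "+") -> str:
    """Invert to_char_level: join the characters and restore the spaces."""
    return "".join(" " if tok == space_marker else tok for tok in tokens.split())


if __name__ == "__main__":
    print(to_char_level("meaning representation"))
    # m e a n i n g + r e p r e s e n t a t i o n
```

The same inverse function can be applied to the model output to get a readable meaning representation back.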
In the TensorFlow models, I obtained quite good results with 1 layer + 400 nodes. More did not fit in GPU memory. My main reason for (possibly) switching is that OpenNMT is a lot more memory efficient and can therefore fit larger models. However, I’ve not been able to come close to my previous results so far. The model does not seem to learn much, and when tested it only outputs a very general, default-looking meaning representation, no matter the input sentence.
This might be because the architecture was never designed to handle character-level input, but it might also be that I used the wrong parameter settings. I’m far from an expert, so any help would be very much appreciated. These are my settings:
For preprocessing/training:
src-words-min-frequency = 1
tgt-words-min-frequency = 1
src-seq-length = 500
tgt-seq-length = 500
sort = 1
shuffle = 1
src-word-vec-size = 500
tgt-word-vec-size = 500
layers = 2
rnn-size = 500
rnn-type = LSTM
dropout = 0.3
rnn-t = -brnn
batch-size = 12
optim = sgd
learning-rate = 1
max-grad-norm = 5
learning-rate-decay = 0.7
start-decay-at = 9
decay = default
curriculum = 0
For testing:
beamsize = 5
batch-size-test = 12
max-sent-length = 500
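Since character-level sequences are much longer than word-level ones, the length limits of 500 above matter more than usual. A quick check like the sketch below shows how many lines exceed the limit. It assumes the limit is counted in whitespace-separated tokens (so one token per character in my data), which is my understanding and not something I have verified; `count_over_limit` is just an illustrative helper name.

```python
def count_over_limit(lines, max_len=500):
    """Count lines with more than max_len whitespace-separated tokens."""
    return sum(1 for line in lines if len(line.split()) > max_len)


if __name__ == "__main__":
    sample = [
        "m e a n i n g + r e p r e s e n t a t i o n",  # 22 tokens
        " ".join("x" * 501),                            # 501 tokens
    ]
    print(count_over_limit(sample))  # 1
```

Running this over the character-level source and target files would show whether a large share of the training data falls outside the limit.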
If there’s anything that is obviously bad/suboptimal, please let me know. Even just some suggestions about what I should try next are very welcome. I want to try different settings, but fully training the model takes about a full day, so I’d rather only search in potentially fruitful directions, and right now I don’t really know where to start. Thanks in advance!