Lm.lua sample mode?

(Brian Albertalli) #1

Another question about lm.lua… I have tried everything I can think of and cannot get lm.lua to generate text from a trained model in sample mode. I’ve tried with and without features, ensured the test text was tokenized exactly the same as the training text, etc. However, no matter what I try, I’m getting the error

attempt to call method 'isContiguous' (a nil value)

from /nn/LookupTable.lua, which ultimately traces back to the sample function in onmt/lm/LM.lua. Is there a trick to formatting the text provided by the -src option? I’ve been putting my test text into a file since providing text at the command line gives me a “No such file or directory” error. What am I missing here?

Oh, and using the score option works fine. I’m only getting errors when trying to use the sample option. If anyone can clue me in to what is expected for -src, I’d greatly appreciate it. Thanks!

(Guillaume Klein) #2

It seems this mode was always broken. Can you try again after this commit:


(Brian Albertalli) #3

Thanks for fixing this so fast. It does generate text now. However, it stops as soon as it has completed one sentence (or reached the max_length, I assume). If the src is a full sentence, often this means it will add one word and exit. Ideally, the model would continue past the sentence boundary, and predict the next word based on words in the prior sentence. Is there any way to have it continue to produce text up to an arbitrary number of sentences?

If you’d rather have this conversation over at GitHub, let me know. I wish I could be more helpful, but I really don’t know Lua very well. Thanks again.

(Guillaume Klein) #4

It stops generating when the special end of sentence token is generated.
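
To illustrate the stopping rule, here is a minimal Python sketch of a sampling loop. It is not the code in onmt/lm/LM.lua: the `</s>` marker, the `sample` and `toy_next_token` functions, and the `max_sentences` parameter are all hypothetical names chosen for illustration. With `max_sentences=1` the loop behaves as described above; a larger value shows one possible way to continue past a sentence boundary.

```python
EOS = "</s>"  # assumed end-of-sentence marker; the real token name may differ


def sample(next_token, prefix, max_length, max_sentences=1):
    """Extend `prefix` one token at a time.

    Stops after `max_sentences` end-of-sentence tokens have been generated,
    or after `max_length` new tokens, whichever comes first.
    """
    tokens = list(prefix)
    sentences = 0
    for _ in range(max_length):
        token = next_token(tokens)  # the model sees all tokens so far
        tokens.append(token)
        if token == EOS:
            sentences += 1
            if sentences >= max_sentences:
                break
    return tokens


def toy_next_token(tokens):
    """Hypothetical stand-in for a trained model: emits EOS after every third word."""
    cycle = ["the", "cat", "sat", EOS]
    return cycle[len(tokens) % len(cycle)]


# A prefix that is already a full sentence gets a single token appended.
print(sample(toy_next_token, ["the", "cat", "sat"], max_length=20))
# Allowing two sentence ends lets generation continue across the boundary.
print(sample(toy_next_token, [], max_length=20, max_sentences=2))
```

Because the whole token history is passed back to `next_token`, words generated after the boundary can still depend on the previous sentence, provided the model was trained on inputs that span sentences.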

Maybe you can try training your language model on inputs containing multiple sentences? Or we can try refining the sample behavior based on your feedback and suggestions.
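
As a rough sketch of the first suggestion, the training file could be regrouped so that each line holds several consecutive sentences. This assumes the corpus has one tokenized sentence per line and that consecutive lines belong to the same text; `group_sentences` and `per_example` are hypothetical names, not part of OpenNMT.

```python
def group_sentences(lines, per_example=3):
    """Join consecutive non-empty lines into multi-sentence training examples."""
    sentences = [line.strip() for line in lines if line.strip()]
    return [
        " ".join(sentences[i:i + per_example])
        for i in range(0, len(sentences), per_example)
    ]


corpus = ["a b .", "c d .", "", "e f .", "g ."]
for example in group_sentences(corpus, per_example=2):
    print(example)
```

A model trained on such lines sees sentence boundaries in the middle of an example, so the end-of-sentence token would be predicted only after several sentences rather than after each one.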

(Brian Albertalli) #5

I see. For my use case, I was looking for more of a generative LM. No need to make any changes just for my sake. If I get some time, I can tinker with it and put in a PR on GitHub if I come up with something useful.

Thank you for your time. This is a great resource for the open-source community!