Makes sense… So to make sure I use everything at least once, I should set the sample size to:
a/.7 + b/.2 + c/.1
where a, b, and c are the sizes (in # of segments) of the respective sets.
I would rather say: Max(a/.7, b/.2, c/.1)
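To illustrate why the max is the right choice, here is a minimal sketch. The set sizes are hypothetical; the weights (.7/.2/.1) are the ones from this thread. Each set contributes `weight * sample_size` segments per epoch, so a set is fully covered once `sample_size >= size / weight`, and the smallest sample size covering all sets is the max of those ratios:

```python
# Hypothetical set sizes in # of segments, with the weights from the thread.
weights = {"a": 0.7, "b": 0.2, "c": 0.1}
sizes = {"a": 700_000, "b": 50_000, "c": 30_000}  # hypothetical values

# Set X is seen in full once sample_size >= size_X / weight_X,
# so take the max over all sets (not the sum).
sample_size = max(sizes[k] / weights[k] for k in weights)
print(int(sample_size))  # → 1000000 (driven by set "a": 700000 / 0.7)
```

Note that summing the ratios instead would overshoot: it would guarantee coverage but sample far more than needed.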
But, since I was a bit lost by the previous 3.5 factor…
Ah, I think you’re right.
No, unfortunately - it is not implemented for the moment. It is more complicated since idx_files can be misaligned, and it takes more work (several passes over the files) to make an efficient implementation. Please open an issue on GitHub if you need it.
I can’t find this in the documentation. Is this the same as -decay_method reset?
Yes, this is the revised name.
Whew, thanks!
Just to give an update on the previous run (with -max_batch_size 196): after 100 epochs of training (so 300,000,000 sentences fed to the training), the final PPL is 42.59, for about 7 days of training.
Did you plot the PPL curve? I am interested.
Thanks!
What’s the hardware set-up for this?