Dynamic Dataset

Makes sense… So to make sure I use everything at least once, I should set the sample size to:
a/.7 + b/.2 + c/.1
where a, b, and c are the sizes (in # of segments) of the respective sets.

I would rather say: Max(a/.7, b/.2, c/.1)
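The Max() version can be sketched in a few lines of Python (the subset sizes below are hypothetical, just to illustrate; the 0.7/0.2/0.1 weights are the ones from this thread):

```python
import math

def min_sample_size(sizes, weights):
    """Smallest total sample size such that each subset, drawn in
    proportion to its weight, is fully covered at least once."""
    return max(math.ceil(n / w) for n, w in zip(sizes, weights))

# Hypothetical segment counts for the a, b, c sets.
a, b, c = 70_000, 5_000, 2_000
print(min_sample_size([a, b, c], [0.7, 0.2, 0.1]))  # 100000
```

The max dominates because the subset with the largest size-to-weight ratio is the last one to be fully covered; the sum from the earlier post would overshoot.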

But, since I was a bit lost by the previous 3.5 factor…
:roll_eyes:


Ah, I think you’re right. :slight_smile:


Hi @jean.senellart,

Would this work with the -idx_files option as well?

No, unfortunately - it's not implemented for the moment. It is more complicated since idx_files may not be aligned, and an efficient implementation takes more work (several passes over the files). Please open an issue on GitHub if you need it.

I can’t find this in the documentation. Is this the same as -decay_method reset?

Yes, this is the revised name.

Whew, thanks!

Just to give an update on the previous run (with -max_batch_size 196): after 100 epochs of training (so 300,000,000 sentences fed to the training), the final PPL is 42.59, for about 7 days of training.
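As a quick sanity check on those numbers (a sketch; the per-epoch count is simply inferred from 300,000,000 sentences over 100 epochs):

```python
total_sentences = 300_000_000
epochs = 100
days = 7

per_epoch = total_sentences // epochs            # sentences per epoch
throughput = total_sentences / (days * 24 * 3600)  # average sentences/sec

print(per_epoch)           # 3000000
print(round(throughput))   # 496
```

So roughly 3M sentences per epoch at about 500 sentences/second on average over the week.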

Did you plot the PPL curve? I am interested.
Thanks!

What’s the hardware set-up for this?