Dynamic Dataset

(David Landan) #53

Makes sense… So to make sure I use everything at least once, I should set the sample size to:
a/.7 + b/.2 + c/.1
where a, b, and c are the sizes (in # of segments) of the respective sets.

(Etienne Monneret) #54

I would rather say: Max(a/.7, b/.2, c/.1)

But I was a bit lost by the previous 3.5 factor…

(David Landan) #55

Ah, I think you’re right. :)

(Shruti P) #56

Hi @jean.senellart,

Would this work with the -idx_files option as well?

(jean.senellart) #57

No, unfortunately, it is not implemented for the moment. It is more complicated since idx_files may not be aligned, and an efficient implementation takes more work (several passes over the files). Please open an issue on GitHub if you need it.

(David Landan) #58

I can’t find this in the documentation. Is this the same as -decay_method reset?

(Guillaume Klein) #59

Yes, this is the revised name.

(David Landan) #60

Whew, thanks!

(jean.senellart) #61

Just to give an update on the previous run (with -max_batch_size 196): after the 100-epoch training (so 300,000,000 sentences fed to the training), the final PPL is 42.59, for about 7 days of training.

(Vincent Nguyen) #62

Did you plot the PPL curve? I am interested.

(Terence Lewis) #63

What’s the hardware set-up for this?