OpenNMT Forum

OpenNMT-py transforms stats

Hi,
I saw there are some code chunks for transform statistics. However, they are not being printed.
What I want is to know how many sentences have been filtered out by the FilterTooLong transform, so that I can adjust src_seq_length.

IIRC, transform stats are printed once a dataset has been fully iterated over.
cc @Zenglinxiao

I guess the stats are reported once per thread, so we need to sum them?

Yes, in a multi-GPU setup there is actually one producer thread per GPU, and each of these has its own iterator (each looping over the data in a strided way).
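
To illustrate why summing works, here is a minimal sketch in plain Python. It does not use the OpenNMT-py API: `strided_shard`, `count_filtered` and `total_stats` are hypothetical helpers, and the length check is only a stand-in for what a filter like FilterTooLong does. It assumes each producer sees every `n_producers`-th example starting at its own rank, so the shards are disjoint and the per-producer counts add up to the count over the whole dataset.

```python
from collections import Counter


def strided_shard(examples, offset, stride):
    """Yield every `stride`-th example starting at index `offset`."""
    for i in range(offset, len(examples), stride):
        yield examples[i]


def count_filtered(examples, max_len):
    """Hypothetical length filter: count kept vs. filtered examples.

    An example is counted as filtered when it has more than `max_len`
    whitespace-separated tokens.
    """
    stats = Counter()
    for src in examples:
        if len(src.split()) > max_len:
            stats["filtered"] += 1
        else:
            stats["kept"] += 1
    return stats


def total_stats(examples, n_producers, max_len):
    """Compute stats per simulated producer, then sum them."""
    per_producer = [
        count_filtered(strided_shard(examples, rank, n_producers), max_len)
        for rank in range(n_producers)
    ]
    # Shards are disjoint, so adding the counters gives the global totals.
    return per_producer, sum(per_producer, Counter())


if __name__ == "__main__":
    data = ["a b", "a b c d e", "a", "a b c d e f", "a b c"]
    per_producer, total = total_stats(data, n_producers=2, max_len=3)
    for rank, stats in enumerate(per_producer):
        print(f"producer {rank}: {dict(stats)}")
    print(f"total: {dict(total)}")
```

In a real run the per-thread numbers would come from the logs rather than from a function like this, but the same addition applies as long as each producer has gone through its shard exactly once.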