Save validation translations at each epoch


(Matt Relich) #41

@jean.senellart Sorry, I got busy the last few days and didn’t respond. I like @dbl’s suggestion to use Damerau-Levenshtein distance divided by length. I see there is already an implementation of this now, but if needed I can test / work on it this coming weekend.
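The metric discussed here can be sketched in a few lines. This is a hedged illustration, not OpenNMT's code: it uses the restricted (optimal string alignment) variant of Damerau-Levenshtein over word tokens, and normalizes by reference length; the thread does not say which length OpenNMT divides by, so that choice is an assumption.

```python
def damerau_levenshtein(a, b):
    # Restricted Damerau-Levenshtein (optimal string alignment) distance
    # over token sequences: insertions, deletions, substitutions, and
    # adjacent transpositions each cost 1.
    d = [[0] * (len(b) + 1) for _ in range(len(a) + 1)]
    for i in range(len(a) + 1):
        d[i][0] = i
    for j in range(len(b) + 1):
        d[0][j] = j
    for i in range(1, len(a) + 1):
        for j in range(1, len(b) + 1):
            cost = 0 if a[i - 1] == b[j - 1] else 1
            d[i][j] = min(d[i - 1][j] + 1,         # deletion
                          d[i][j - 1] + 1,         # insertion
                          d[i - 1][j - 1] + cost)  # substitution
            if i > 1 and j > 1 and a[i - 1] == b[j - 2] and a[i - 2] == b[j - 1]:
                d[i][j] = min(d[i][j], d[i - 2][j - 2] + 1)  # transposition
    return d[len(a)][len(b)]

def dl_ratio(hyp_tokens, ref_tokens):
    # "Distance divided by length" -- normalized here by reference length
    # (an assumption; dividing by hypothesis length is equally plausible).
    if not ref_tokens:
        return 0.0 if not hyp_tokens else 1.0
    return damerau_levenshtein(hyp_tokens, ref_tokens) / len(ref_tokens)
```

Lower is better: 0.0 means the hypothesis matches the reference exactly, and values can exceed 1.0 when the hypothesis is much longer than the reference.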

Thanks again for including this!!


(David Landan) #42

I think it would be nice to have TER (translation error rate) as well, but it would likely take me quite a while to code up TER in lua, and I’m not sure about performance.

In theory, one could make a subprocess call to the latest Java implementation of TER, but I doubt we want such an external dependency.
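The part that makes TER hard to code up, as noted above, is the greedy shift search. A minimal sketch of the rest is straightforward; this omits shifts entirely (so it only upper-bounds true TER) and is an illustration, not the reference Java implementation or OpenNMT's version:

```python
def edit_distance(hyp, ref):
    # Word-level Levenshtein distance with a rolling one-row table:
    # insertions, deletions, and substitutions each cost 1.
    d = list(range(len(ref) + 1))
    for i, h in enumerate(hyp, 1):
        prev, d[0] = d[0], i
        for j, r in enumerate(ref, 1):
            prev, d[j] = d[j], min(d[j] + 1,          # deletion
                                   d[j - 1] + 1,      # insertion
                                   prev + (h != r))   # substitution
    return d[len(ref)]

def ter_no_shifts(hyp, ref):
    # TER = edits / reference length, where full TER also counts a block
    # shift as a single edit. Without the shift search this is only an
    # upper bound on the real score.
    return edit_distance(hyp, ref) / max(len(ref), 1)
```

The shift search (repeatedly moving the block whose relocation most reduces the remaining edit distance) is what dominates both the code size and the runtime of a real TER scorer.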


(jean.senellart) #43

Hi David, I will take TER - we have reimplemented it several times, so it should be quite fast.


(jean.senellart) #44

TER is now also available - as a metric for score.lua or as a validation metric. See http://opennmt.net/OpenNMT/tools/scorer/


(Matt Relich) #45

@jean.senellart I came back to this to see whether the per-epoch validation translation had been included in the updates, but it doesn’t seem so. Does that mean we will go with @vince62s’s script for unloading / reloading during training? It’s a pity, since it seems trivial to just run through the validation data during training.

If there is no plan to add this feature, then I will try to maintain my own script for this in case anyone else is interested. I should be getting back to this task in the next few days and will update my fork.

Cheers


(Guillaume Klein) #46

We can now save validation translations when using translation-based validation metrics (BLEU, D.-L. ratio). Would that work for you?


(Vincent Nguyen) #47

It would be preferable to have a flag to trigger the saving.
When using sampling with hundreds of epochs, saving every time is not convenient.


(Vincent Nguyen) #48

One additional thing, since @guillaumekln has already opened a PR:

Would you prefer an output with just the validation set translations, or, even more useful, a file with both reference and translation, plus a per-sentence score (BLEU or TER) at the beginning of each line?

(Similar to the output of analysis.perl in the Moses project.)


(Matt Relich) #49

@guillaumekln I’m with @vince62s on this one, I think having a flag to determine whether or not to save would be best, that way users can choose whether or not to turn it on. Having it also output the scores would be nice for additional analysis. Would you like me to take a stab at implementing this?


(Guillaume Klein) #50

Of course the feature will be behind an option.

I actually have something almost ready:

I can also add the scores on each line. Is there any request for a specific format?


(Matt Relich) #51

@guillaumekln Sweet, that looks great! So at the moment it would just dump the output, which is actually fine for me. If we also want the scores, then either CSV or some other delimiter (e.g. | or <>).


(Vincent Nguyen) #52

Suggestion:

field 1 = score
field 2 = translation
field 3 = reference
separator: |||

because | might already be the feature separator.
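The suggested line layout is easy to round-trip. A small sketch under assumptions: the helper names are hypothetical (not part of OpenNMT), the field order follows the suggestion above, and `|||` is padded with spaces so tokens containing `|` survive unambiguously.

```python
SEP = " ||| "  # ||| rather than |, since | may already be the word-feature separator

def write_validation_report(path, rows):
    # rows: iterable of (score, translation, reference) tuples,
    # written one sentence per line as: score ||| translation ||| reference
    with open(path, "w", encoding="utf-8") as f:
        for score, translation, reference in rows:
            f.write(SEP.join(["%.4f" % score, translation, reference]) + "\n")

def read_validation_report(path):
    # Inverse of the writer: yields (score, translation, reference) per line.
    with open(path, encoding="utf-8") as f:
        for line in f:
            score, translation, reference = line.rstrip("\n").split(SEP)
            yield float(score), translation, reference
```

Putting the score in a fixed first field keeps the file trivially sortable (e.g. `sort -n` to surface the worst sentences).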


(Guillaume Klein) #53

This is now on master.

Note that the comment I made earlier in this thread still applies:

However, as we set up the preprocessing, BLEU will be computed against gold sentences with resolved vocabulary, i.e. with OOV replaced by <unk> tokens.


(Matt Relich) #54

Sounds good. I will test this today. Thanks again for the feature addition!


(Matt Relich) #55

The output looks great, thank you again!