I use OpenNMT-py, and my translator requires a bit of additional data on
top of what is included in the checkpoint (BPE codes, for example).
I find it convenient to keep everything I need in one big file, so I add
these extra keys after the training is done.
I think it may be useful to let drop_checkpoint accept a dict of
user-defined data to store alongside the rest. Would you find it valuable?
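
To make the idea concrete, here is a minimal sketch of what such a parameter could look like. It is not OpenNMT-py's actual code: this `drop_checkpoint` signature, the `extra` argument, and the toy checkpoint contents are all hypothetical, and `pickle` stands in for `torch.save` so the example runs without PyTorch.

```python
import os
import pickle
import tempfile


def drop_checkpoint(checkpoint, path, extra=None):
    """Hypothetical stand-in, not OpenNMT-py's real drop_checkpoint.

    Merges user-defined entries into the checkpoint dict before saving.
    pickle is used instead of torch.save to keep the sketch dependency-free.
    """
    extra = extra or {}
    # Refuse to silently overwrite keys the trainer itself writes.
    clashes = set(checkpoint) & set(extra)
    if clashes:
        raise KeyError(f"extra keys would overwrite checkpoint keys: {sorted(clashes)}")
    merged = {**checkpoint, **extra}
    with open(path, "wb") as f:
        pickle.dump(merged, f)
    return merged


# Toy checkpoint (assumed layout); a real one holds weights, vocab, options, etc.
checkpoint = {"model": {"w": [0.1, 0.2]}, "epoch": 13}
path = os.path.join(tempfile.mkdtemp(), "model.pt")
drop_checkpoint(checkpoint, path, extra={"bpe_codes": ["t h", "th e"]})

with open(path, "rb") as f:
    loaded = pickle.load(f)
print(sorted(loaded))  # ['bpe_codes', 'epoch', 'model']
```

Rejecting key clashes is one possible design choice; it keeps user data from shadowing anything the trainer writes, at the cost of forcing users to pick distinct key names.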
Sure, that’s what I did. I just don’t want my fork to diverge too much, so if my changes are considered valuable for the project, I’d rather have them merged.
It is a private repo of my company, so, unfortunately, no. Also, the code is a bit hairy at the moment; I haven’t gone all the way to make it usable for anyone but myself.
If the problem itself is acknowledged as worth solving, I would make a cleaner patch on my own time, work with the maintainers to make sure it doesn’t violate any project principles I’m not aware of, and have it merged.