OpenNMT

Transfer tensor's weight to another tensor

Is there a way to transfer the weights of one tensor to another?
Let’s assume that I have already trained a model with these quotation marks, “some text in here”, and then I decide to update the vocabulary and add another pair of quotation marks, «some text in here», that should behave the same. Instead of training the model again, is there a way to transfer the weights of the existing quotation marks to the new ones?

No, this will require training. New embeddings must be learned for the new tokens, otherwise the model has no way to tell them apart.

But you can make this training shorter by using the OpenNMT-tf vocabulary update feature:

  • Add the new tokens to the vocabulary
  • Run onmt-main ... update_vocab ...
  • Update model_dir to the updated checkpoint directory
  • Continue the training on data containing the new tokens
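To give an idea of why the loss starts low after such an update, here is a minimal NumPy sketch of the general idea: rows for tokens that already existed are copied to their new positions, and only the new tokens get freshly initialized rows. This is not the OpenNMT-tf implementation; the function name `remap_embeddings`, the initialization scale, and the toy vocabulary are assumptions made for illustration.

```python
import numpy as np


def remap_embeddings(old_vocab, new_vocab, old_weights, rng=None):
    """Build an embedding matrix for new_vocab, reusing rows from old_weights.

    Hypothetical helper: tokens present in both vocabularies keep their
    trained row, tokens that are new get a small random row to be learned.
    """
    if rng is None:
        rng = np.random.default_rng(0)
    old_index = {token: i for i, token in enumerate(old_vocab)}
    dim = old_weights.shape[1]
    # Start from random rows (assumed init scale), then overwrite known tokens.
    new_weights = rng.normal(0.0, 0.01, size=(len(new_vocab), dim))
    new_weights = new_weights.astype(old_weights.dtype)
    for j, token in enumerate(new_vocab):
        if token in old_index:
            new_weights[j] = old_weights[old_index[token]]
    return new_weights


# Toy example: two new quotation marks appended to a tiny vocabulary.
old_vocab = ["<blank>", "<s>", "</s>", "“", "”"]
new_vocab = old_vocab + ["«", "»"]
old_weights = np.arange(len(old_vocab) * 4, dtype=np.float32).reshape(len(old_vocab), 4)
new_weights = remap_embeddings(old_vocab, new_vocab, old_weights)
```

In this sketch the rows for «  and » are random, which is why the continued training still needs data containing the new tokens.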

Update model_dir to the updated checkpoint directory
Continue the training on data containing the new tokens

Would this be like a fresh training run that starts from loaded weights?

No, this will reuse most of the weights from the checkpoint, so when the training continues the loss will start low.

Thank you.

I’m trying to update the vocab of an existing model with a few custom placeholder tokens. When I follow this procedure, I get this error:

ValueError: could not broadcast input array from shape (33744) into shape (33768)

The vocab is shared and the model is a custom big transformer with shared embeddings.

onmt-main --config my_config.yml --auto_config --model TransformerBigShared.py update_vocab --output_dir merged --src_vocab updated.vocab --tgt_vocab updated.vocab

Any hints?

Edit: Actually, the question is whether the procedure works when replacing tokens while keeping the same vocab size.

What OpenNMT-tf version are you using?

2.17.1. I know, I’m planning to update soon…

Updating the vocabulary of models with shared embeddings was fixed fairly recently, in version 2.18.1.
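For anyone hitting the same message: it has the form NumPy uses when an array is assigned into a destination of a different length. The two sizes in the report differ by 24 entries, which would be consistent with one weight still having the old vocabulary size while its destination already has the new one. That reading is an assumption, and the variable names below are made up for the illustration; this is a sketch of the error itself, not of the OpenNMT-tf code.

```python
import numpy as np

# Sizes taken from the reported error message.
old_size, new_size = 33744, 33768

# Hypothetical per-token vectors: one from the old checkpoint,
# one allocated for the updated vocabulary.
old_vector = np.zeros(old_size, dtype=np.float32)
new_vector = np.zeros(new_size, dtype=np.float32)

try:
    # Assigning the whole old vector into the larger one fails,
    # because the shapes cannot be broadcast together.
    new_vector[:] = old_vector
    error_message = None
except ValueError as exc:
    error_message = str(exc)

print(error_message)
print("size difference:", new_size - old_size)
```

The exact punctuation of the shapes in the message can vary between NumPy versions.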


Super, thanks!