Getting encoder embeddings for words from the model

I’m not the best coder, so I apologise in advance for any mistakes in my understanding of the architecture or terminology. I have a trained .pt translation model and I need to somehow extract embeddings for particular words from it (from the first layer of the encoder, if I’m not mistaken). Could you please guide me on how to do that using the OpenNMT-py source code?

Thank you in advance

If your point is just to extract the embeddings out of the model and use them outside, there is an old script here:

If you want to use them within your own code to do something else, then look at how the script above accesses them and it should become clearer.
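In case the script is hard to follow, here is a minimal sketch of the extraction idea using a toy vocabulary and matrix. In a real OpenNMT-py checkpoint the weights would come from `torch.load("model.pt")`, and the exact key holding the encoder embedding matrix depends on the OpenNMT-py version, so treat the names here as assumptions:

```python
import numpy as np

# Toy stand-in for the encoder embedding matrix of shape (vocab_size, emb_dim).
# In a real checkpoint you would do something like:
#   ckpt = torch.load("model.pt", map_location="cpu")
# and locate the encoder embedding weight inside ckpt["model"]
# (key names vary between OpenNMT-py versions).
vocab = {"<unk>": 0, "<pad>": 1, "hello": 2, "world": 3}
emb_matrix = np.arange(4 * 5, dtype=np.float32).reshape(4, 5)

def embedding_for(word, vocab, matrix):
    """Return the embedding row for `word`, falling back to <unk>."""
    idx = vocab.get(word, vocab["<unk>"])
    return matrix[idx]

vec = embedding_for("hello", vocab, emb_matrix)
print(vec.shape)  # (5,)
```

The same lookup works on the real matrix once you have it as an array: the word-to-index mapping comes from the vocabulary stored alongside the checkpoint.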

My initial plan was to replace the original encoder’s embeddings of specific words with alternative vectors (e.g. the average of two embeddings, an embedding scaled by 2, etc.), then translate with the model using these changed embeddings and see the results. Is there a way to do that?

I’m still not sure what you are trying to do, but on the encoder side it happens here:
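Whatever the end goal, the row replacement itself can be sketched like this on a toy matrix. In a real model you would apply the same in-place edit to the encoder’s embedding weight tensor (e.g. under `torch.no_grad()`) before running translation; the exact attribute path to that tensor is version-dependent and not shown here:

```python
import numpy as np

# Toy stand-in for the encoder embedding matrix (vocab_size x emb_dim).
vocab = {"cat": 0, "dog": 1, "pet": 2}
emb = np.array([[1.0, 2.0],
                [3.0, 4.0],
                [9.0, 9.0]])

def replace_with_average(emb, vocab, target, word_a, word_b):
    """Overwrite `target`'s row with the mean of `word_a`'s and `word_b`'s rows."""
    emb[vocab[target]] = (emb[vocab[word_a]] + emb[vocab[word_b]]) / 2.0
    return emb

replace_with_average(emb, vocab, "pet", "cat", "dog")
print(emb[vocab["pet"]])  # [2. 3.]

# Scaling a word's embedding by 2 is the same kind of in-place edit:
emb[vocab["cat"]] *= 2.0
```

Since translation only reads the embedding table, any rows edited this way are used as-is by the rest of the model.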

Hi! I am getting this error when I try to run the script: `ImportError: cannot import name 'dict_to_vocabs' from 'onmt.inputters.inputter'`. What could be the reason? Thank you in advance!