Hello, I have some doubts about the generated word embeddings for the NMTSmall architecture. As far as I understand word embeddings and neural architectures, the word embedding matrix is supposed to live in a `vocab_source_size × embedding_size` space, and the same holds for the target word embeddings. However, when I was playing around with the generated embeddings, I found that the dimensions do not correspond to the source/target vocabulary size. You can see the picture below…
I thought about the special tokens (`<s>`, `</s>`, and so on), but the vocabulary files contain the special tokens too, so I can't figure out why the embedding matrix is in a `(vocab_size + 2) × embedding_size` space. If you can explain the reason to me, I would appreciate it.
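To make the mismatch concrete, here is a minimal sketch of what I am seeing, with made-up sizes (the actual vocabulary size and embedding size in my run differ; the placeholder array just stands in for the checkpoint variable):

```python
import numpy as np

# Hypothetical sizes for illustration only.
vocab_size = 50_000      # number of entries in my vocabulary file
embedding_size = 512     # embedding dimension

# Shape I expected for the source embedding matrix:
expected_shape = (vocab_size, embedding_size)

# Shape the checkpoint actually reports (simulated with a zero array):
embedding_matrix = np.zeros((vocab_size + 2, embedding_size))

print(expected_shape)          # (50000, 512)
print(embedding_matrix.shape)  # (50002, 512) -- two extra rows
```

So there are exactly two more rows than vocabulary entries, which is the gap I am asking about.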
Thank you in advance.