I’m working on and learning about document-level NMT. I recently read about the Doc2vec algorithm for representing documents as embeddings.
My question might be naive, but would it be a good idea, or even a reasonable one, to add a document embedding to a Transformer in order to bring in some context information?
I’m not even thinking of a practical implementation yet; I just want to know whether it is a stupid assumption to think of document-level translation in this way.
I haven’t read the publication yet, but I guess doc2vec might be difficult to add to a translation model, because you need to run an inference step to get a document vector for each new document before translation.
I don’t know OpenNMT well enough to say whether it would be a reasonable idea to implement, or whether the document vector would be too costly to compute.
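For reference, that per-document inference step looks roughly like this with gensim's Doc2Vec implementation (a minimal sketch; the corpus and hyperparameters below are placeholders):

```python
# Minimal doc2vec sketch with gensim; corpus, vector_size and epochs
# are illustrative placeholders, not recommended settings.
from gensim.models.doc2vec import Doc2Vec, TaggedDocument

# Train once on a tokenized corpus of documents.
corpus = [
    TaggedDocument(words=["the", "cat", "sat"], tags=[0]),
    TaggedDocument(words=["dogs", "chase", "cats"], tags=[1]),
]
model = Doc2Vec(corpus, vector_size=64, min_count=1, epochs=20)

# At translation time, every *new* document needs a gradient-based
# inference pass to obtain its vector -- this is the extra cost.
new_doc = ["a", "new", "document", "to", "translate"]
doc_vector = model.infer_vector(new_doc, epochs=20)
print(doc_vector.shape)  # (64,)
```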
Also, what do you think of alternatives like ELMo for getting word embeddings with sentence context?
doc2vec provides an embedding for an entire document (or a batch of sentences), thereby capturing document-level context information.
You can always compute this document representation as a first step, before starting the translation of your document, with either a Transformer or an RNN encoder-decoder system, introducing modifications so the model takes this document vector representation into account.
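As a rough illustration of what such a modification could look like (purely a sketch under my own assumptions, not how OpenNMT does it): project the document vector to the model dimension and add it to every source token embedding before the encoder. Prepending it as an extra pseudo-token would be another variant. Names and dimensions below are hypothetical.

```python
# Hypothetical sketch: conditioning a Transformer encoder on a
# precomputed document vector. Dimensions are illustrative.
import torch
import torch.nn as nn

class DocConditionedEncoder(nn.Module):
    def __init__(self, vocab_size=32000, d_model=512, doc_dim=64):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, d_model)
        self.doc_proj = nn.Linear(doc_dim, d_model)  # map doc2vec dim -> d_model
        layer = nn.TransformerEncoderLayer(d_model, nhead=8, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=6)

    def forward(self, src_tokens, doc_vector):
        # src_tokens: (batch, src_len); doc_vector: (batch, doc_dim)
        x = self.embed(src_tokens)                      # (batch, len, d_model)
        x = x + self.doc_proj(doc_vector).unsqueeze(1)  # broadcast over tokens
        return self.encoder(x)
```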
Also, you can play around with the width of the context you use, taking either the entire document or a batch of sentences as context, keeping in mind that the topic can vary within a single document.
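To make that context-width knob concrete, here is a small hypothetical helper, assuming `model` is an already-trained gensim Doc2Vec model like the one in the sketch above:

```python
# Sketch: one vector for the whole document versus one vector per
# fixed-size batch of sentences. `model` is a trained gensim Doc2Vec.
def doc_vectors(model, sentences, batch_size=None):
    """sentences: list of token lists; returns one vector per context window."""
    if batch_size is None:                       # whole-document context
        windows = [sum(sentences, [])]
    else:                                        # batch-of-sentences context
        windows = [sum(sentences[i:i + batch_size], [])
                   for i in range(0, len(sentences), batch_size)]
    return [model.infer_vector(w) for w in windows]
```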
As far as I know, ELMo embeddings only capture sentence context; that is, they ignore inter-sentence information. Recall that NMT systems already handle intra-sentence context through how they build the source sentence representations before passing this information to the decoder. However, I am not aware of any approach that has used ELMo embeddings to check whether changing these word representations helps systems better manage sentence-level context information.
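For what it's worth, getting ELMo embeddings sentence by sentence looks roughly like this with AllenNLP's `ElmoEmbedder` (a sketch; this class shipped with allennlp 0.x and downloads pretrained weights on first use):

```python
# Sketch: ELMo contextual word embeddings, one sentence at a time --
# the model only ever sees intra-sentence context.
from allennlp.commands.elmo import ElmoEmbedder

elmo = ElmoEmbedder()  # default pretrained options/weights
vectors = elmo.embed_sentence(["The", "bank", "was", "closed"])
print(vectors.shape)  # (3 layers, 4 tokens, 1024 dims)
```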
I am quite interested in document-level MT, so feel free to write me a PM any time to keep discussing this topic.