Improving word feature labels by block text labeling for context

devinbostIL · July 3, 2017, 9:28pm

What would be required to extend the word features (OpenNMT Labeled word features) to support labeling blocks of text? For example, let’s say that I have statements like:

S: "What is the color of the blue block? The color of the blue block is blue!"
T: "Are you sure?"
S: "What is the color of the blue block? Yes I’m sure!"
T: "How do you know?"
S: "What is the color of the blue block? I know because you told me that it’s blue!"
T: "That is correct!"

I want to split them with labels like this:

S: "CONTEXT What is the color of the blue block? ||| MESSAGE The color of the blue block is blue!"
T: "MESSAGE Are you sure?"
S: "CONTEXT What is the color of the blue block? ||| MESSAGE Yes I’m sure!"
T: "MESSAGE How do you know?"
S: "CONTEXT What is the color of the blue block? ||| MESSAGE I know because you told me that it’s blue!"
T: "MESSAGE That is correct!"

I’d be happy to fork the code and submit a pull request to implement the changes if I can get some direction on what changes need to be made. What changes would need to be made to implement this feature?
(Either in OpenNMT or OpenNMT-py)
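
For reference, block labels like these can be approximated with the existing per-token word features by repeating the block label on every token. The sketch below assumes the `￨` (U+FFE8) token/feature separator used by OpenNMT's word-feature format; the helper names `tag_block` and `build_source` are hypothetical and not part of OpenNMT.

```python
# Assumption: OpenNMT word features attach to each token as token￨FEATURE,
# where the separator is U+FFE8. Check the docs of the version you use.
FEATURE_SEP = "\uFFE8"


def tag_block(text, label):
    """Attach the same block label to every whitespace-separated token."""
    return [f"{token}{FEATURE_SEP}{label}" for token in text.split()]


def build_source(context, message):
    """Build one source line whose tokens carry CONTEXT or MESSAGE features."""
    tokens = tag_block(context, "CONTEXT") + tag_block(message, "MESSAGE")
    return " ".join(tokens)


line = build_source("What is the color of the blue block?", "Yes I'm sure!")
print(line)
```

Every token then carries its block membership as a feature, so no change to the feature format itself is needed for this variant.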

jean.senellart · July 3, 2017, 9:50pm

Do I understand correctly that your statements are stateless Source/Target pairs, or is there a history to keep between pairs?

As for word features, what you are looking for is closer to a dual encoder: on the Source side there is one encoder for CONTEXT and one for MESSAGE. It is not extremely hard to do, and we have other use cases for such a dual encoder. I implemented something like that in SiameseRNN in this branch (SiameseRNN.lua).

However, an easy way to start experimenting is simply to paste CONTEXT and MESSAGE into a single sentence; the RNN will naturally learn about the two parts.
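
A minimal sketch of that concatenation step, assuming the `CONTEXT`/`MESSAGE` markers and the `|||` separator from the example in the first post. The function names and the in-memory `turns` list are hypothetical; in practice the resulting lines would be written to the parallel source and target files used for training.

```python
def make_source_line(context, message, sep="|||"):
    """Paste CONTEXT and MESSAGE into one source sentence."""
    return f"CONTEXT {context} {sep} MESSAGE {message}"


def make_target_line(message):
    """Prefix the target sentence with its MESSAGE marker."""
    return f"MESSAGE {message}"


context = "What is the color of the blue block?"

# (source message, target message) pairs taken from the example above.
turns = [
    ("The color of the blue block is blue!", "Are you sure?"),
    ("Yes I'm sure!", "How do you know?"),
]

src_lines = [make_source_line(context, src) for src, _ in turns]
tgt_lines = [make_target_line(tgt) for _, tgt in turns]

for src, tgt in zip(src_lines, tgt_lines):
    print(src)
    print(tgt)
```

The marker words and the separator are ordinary vocabulary tokens here, so the encoder only sees them as part of the input sequence.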

devinbostIL · July 3, 2017, 11:03pm

Thanks for the advice! I’ll look into the dual encoder approach.

The actual data consists of a set of two-person conversations like the one in the example I provided. So there actually is a conversational history, and it would be awesome if that could be captured. How would that change things?