Training chatbot with multiple inputs (to add context)

devinbostIL · May 12, 2017, 5:50pm

I’m training a chatbot involving student/teacher dialogue (think of it like an IT helpdesk chatbot model), but in the discussion, there’s often contextual information that is very important.
Example:
Johnny is trying to calculate the area of a rectangle. His problem is:
"You have a rectangle with height of 5 ft, x, and length of x^2. What is the area?"
Johnny (to teacher): "Is the length 25 ft? I still don’t get how to find the area"
Teacher (to Johnny): "Yes, the length is 25 ft because x = 5 and 5 * 5 = 25. Remember the formula for area?"
Johnny (to teacher): "Is it length times width?"
. . .

When training a straight NMT model, we would map Johnny’s message to the Teacher’s message. However, without the actual text of the problem that Johnny was given, the model could produce a Teacher response that belongs to a different area problem (one where a student made an incorrect guess), such as:
Teacher (to Johnny): "No, the length is 49 inches because x = 7 and 7 * 7 = 49. Remember the formula for area?"

That would obviously confuse a student pretty profoundly.

I suppose that I could just append the context question to the end of the student’s message like this:

Johnny (to teacher): "Is the length 25 ft? I still don’t get how to find the area | You have a rectangle with height of 5 ft, x, and length of x^2. What is the area?"
Teacher (to Johnny): "Yes, the length is 25 ft because x = 5 and 5 * 5 = 25. Remember the formula for area?"
Johnny (to teacher): "Is it length times width? | You have a rectangle with height of 5 ft, x, and length of x^2. What is the area?"
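
As a rough sketch of that preprocessing step, the snippet below builds source/target pairs from one dialogue and attaches the problem text to each student message. The names `add_context` and `build_pairs`, the `" | "` separator, and the `position` option are illustrative assumptions, not part of OpenNMT.

```python
def add_context(message, context, sep=" | ", position="end"):
    """Attach the problem text to a student message (hypothetical helper)."""
    if not context:
        return message
    if position == "end":
        return f"{message}{sep}{context}"
    if position == "start":
        return f"{context}{sep}{message}"
    raise ValueError("position must be 'end' or 'start'")


def build_pairs(turns, context, **kwargs):
    """Pair each student turn with the teacher turn that directly follows it.

    `turns` is a list of (speaker, text) tuples in dialogue order.
    """
    pairs = []
    for (speaker, text), (next_speaker, next_text) in zip(turns, turns[1:]):
        if speaker == "student" and next_speaker == "teacher":
            pairs.append((add_context(text, context, **kwargs), next_text))
    return pairs


problem = ("You have a rectangle with height of 5 ft, x, "
           "and length of x^2. What is the area?")
turns = [
    ("student", "Is the length 25 ft? I still don't get how to find the area"),
    ("teacher", "Yes, the length is 25 ft because x = 5 and 5 * 5 = 25. "
                "Remember the formula for area?"),
    ("student", "Is it length times width?"),
]

for source, target in build_pairs(turns, problem):
    print(source)
    print(target)
```

The source lines would go into the source training file and the teacher lines into the target file, one sentence per line. Passing `position="start"` puts the problem text in front of the message instead.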

guillaumekln · May 13, 2017, 5:52am

This sounds like a good approach to experiment with first. Did you try it already?

jean.senellart · May 14, 2017, 9:54pm

Hi @devinbostIL, do you know of an available dataset with such sequences that we could use to experiment? In the dataset I built here from OpenSubtitle, it is very hard to get a truly consistent sequence.
Something else we could do is pass the encoder output as input to the following sentence. This would avoid re-encoding the sentence and might also enable continuity across the whole dialogue sequence.

devinbostIL · May 18, 2017, 9:12pm

Hi @jean.senellart, I would love it if we could open source the data, but there are student privacy issues that might create a barrier to being able to do that right now.
I ran the initial conversational data (without including the problem context) on the standard OpenNMT model except with a hidden unit size of 1000, and we were able to get the perplexity down to an average of around 5.6.
The results were still pretty exciting despite my concerns. Here’s one conversational excerpt:

[05/16/17 11:15:59 INFO] SENT 11897: Ready
[05/16/17 11:15:59 INFO] PRED 11897: What does this problem ask you to find?
[05/16/17 11:15:59 INFO] PRED SCORE: -2.36
[05/16/17 11:15:59 INFO]
[05/16/17 11:15:59 INFO] SENT 11898: 12
[05/16/17 11:15:59 INFO] PRED 11898: How did you get that?
[05/16/17 11:15:59 INFO] PRED SCORE: -6.37
[05/16/17 11:15:59 INFO]
[05/16/17 11:15:59 INFO] SENT 11899: 12+12+12=38 ~~~ right
[05/16/17 11:15:59 INFO] PRED 11899: not quite
[05/16/17 11:15:59 INFO] PRED SCORE: -5.84
[05/16/17 11:15:59 INFO]
[05/16/17 11:15:59 INFO] SENT 11900: 12+12+12=36 right?
[05/16/17 11:15:59 INFO] PRED 11900: You got it! Click back if you need more help–have a wonderful day
[05/16/17 11:15:59 INFO] PRED SCORE: -5.79
[05/16/17 11:15:59 INFO]
[05/16/17 11:15:59 INFO] SENT 11901: 12 times 3= 36
[05/16/17 11:15:59 INFO] PRED 11901: You are ready to solve the problem. Have a nice day!
[05/16/17 11:15:59 INFO] PRED SCORE: -5.09
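
For reference, here is how a perplexity figure like the one above relates to log scores. This is a minimal sketch that assumes the standard definition (the exponential of the negative average per-token log-likelihood) and assumes a sentence score is the sum of its token log-probabilities; the function name is made up for illustration. Under that definition, a perplexity of 5.6 corresponds to an average per-token log-likelihood of about -1.72.

```python
import math


def perplexity(total_log_likelihood, num_tokens):
    """Perplexity from a summed log-likelihood (natural log) over num_tokens tokens."""
    if num_tokens <= 0:
        raise ValueError("num_tokens must be positive")
    return math.exp(-total_log_likelihood / num_tokens)


# Average per-token log-likelihood implied by a perplexity of 5.6.
avg_log_likelihood = -math.log(5.6)
print(round(avg_log_likelihood, 2))

# Round trip: 1000 tokens at that average give back the same perplexity.
print(round(perplexity(avg_log_likelihood * 1000, 1000), 2))
```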

devinbostIL · June 22, 2017, 9:53pm

I tried prepending the problem/context text to the front (left side) of the students’ input messages, and perplexity improved by more than 1 point on both training and cross-validation.

Interestingly, however, when I evaluated this context-trained model with the problem/context text omitted, providing only the student message, the output was bizarre.
Examples:

[06/19/17 10:47:23 INFO] SENT 1: What’s 10 plus 10?
[06/19/17 10:47:23 INFO] PRED 1: 10 10 10 10 10 10 10 10 10 10 10 10 10 10 10 10 10 10 10 10 10 10 10 10 10 10 10 10 10 10 10 10 10 10 10 10 10 10 10 10 10 10 10 10 10 10 10 10 10 10 10 10 10 10 10. 10 10 10 10 10 10 10 10 10 10 10 10 10 10 10 10 10 10 10 10 10 10 10 10 10. 10 10 10 10 10 10 10 10 10 10 10 10 10 10 10 10 10 10 10 10 10 10 10 10 10. 10 10 10 10 10 10 10 10 10 10 10 10 10 10 10 10 10 10 10 10 10 10 10 10 10 10 10 10 10 10 10 10 10 10 10 10 10 10 10 10 10 10 10 10 30, 20, 20, 30, 20, 30, 20, 30, 50, 20, 30, 50, 20, 30, 50, 20, 50, 20, 30, 50, 20, 50, 20, 30, 50, 20, 50, 20, 30, 50, 20, 50, 20, 30, 50, 20, 50, 20, 30, 50, 20, 30, 50, 20, 30, 50, 20, 50, 20, 30, 50, 20, 30, 50, 20, 50, 20, 30, 50, 20, 30, 30, 50, 20, 30, 30, 50, 20, 30, 30, 50, 20, 30, 50, 50, 30, 50, 50, 20, 30, 50, 50, 30, 50, 50, 30, 50, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10,
[06/19/17 10:47:23 INFO] PRED SCORE: -93.30

[06/22/17 15:44:23 INFO] SENT 1: I hate you
[06/22/17 15:44:23 INFO]
[06/22/17 15:44:23 INFO] BEST HYP:
[06/22/17 15:44:23 INFO] [-15.44] Students respect [removed for privacy] respect students respect students respect students respect students students respect students students. Ready to work?

[06/22/17 15:48:15 INFO] SENT 1: Wow teacher you are hot
[06/22/17 15:48:15 INFO]
[06/22/17 15:48:15 INFO] BEST HYP:
[06/22/17 15:48:15 INFO] [-11.08] All student student who student student student student student student student student missed their soccer days?
[06/22/17 15:48:15 INFO] [-11.42] All student student who student student student student student student student student student missed their soccer days?
[06/22/17 15:48:15 INFO] [-11.73] All student student who student student student student student student student student student student missed their soccer days?
[06/22/17 15:48:15 INFO] [-13.41] All student student who student student student student student student student student student student student student student missed their soccer days?
[06/22/17 15:48:15 INFO] [-14.35] All student student who student student student student student student student student student student student student student student missed their soccer days?
[06/22/17 15:48:15 INFO]
[06/22/17 15:48:15 INFO] PRED AVG SCORE: -0.69, PRED PPL: 2.00

So, it looks like this is a case of overfitting, but the repeated words are very interesting to me. The only sources of such repetition that I can think of are either student messages that say something inappropriate/strange, or the neural network picking up repetition across multiple student messages that start with the same prefixed problem/context text.
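
One way to put a number on the repetition in predictions like these is to measure how many n-grams in an output are repeats of an earlier one. The sketch below is only an illustration: the function name and the whitespace tokenization are assumptions, and any threshold for calling an output "degenerate" would have to be picked by inspecting real predictions.

```python
from collections import Counter


def repeated_ngram_fraction(text, n=2):
    """Fraction of n-grams in `text` that duplicate an earlier n-gram.

    Returns 0.0 when every n-gram is distinct and approaches 1.0 for
    outputs that loop on the same words.
    """
    tokens = text.split()
    ngrams = [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]
    if not ngrams:
        return 0.0
    counts = Counter(ngrams)
    repeated = sum(count - 1 for count in counts.values())
    return repeated / len(ngrams)


predictions = [
    "How did you get that?",
    "All student student who student student student student missed their soccer days?",
]
for prediction in predictions:
    print(round(repeated_ngram_fraction(prediction), 2), prediction)
```

Running this over the predictions from both models (with and without the context prefix at evaluation time) would show whether the looping is limited to inputs that lack the prefix.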

Any ideas?