I have been using OpenNMT-tf to translate synthesized English phrases into a simple form of knowledge representation (KR). I want OpenNMT to learn a generalized pattern that works even for words that are not in the training set (but are in the overall vocab), and I am looking for advice on how to make that happen.
For example, a training sample has the following form:
Source English: John gave the ball to Mary
Target KR: PTRANS ball John Mary
The target KR above is meant to indicate a physical transfer of the ball from John to Mary. Similarly, the following indicates a transfer in the opposite direction.
Source English: John took the ball from Mary
Target KR: PTRANS ball Mary John
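For reference, the training data is plain parallel text in the standard OpenNMT format: one sample per line, with the source and target files aligned. Below is a minimal sketch of the kind of generator that produces such pairs; the word lists are placeholders, not my actual data:

```python
import random

# Placeholder lists; the actual dataset uses many more names, actions, and objects.
NAMES = ["John", "Mary", "Alice", "Bob"]
OBJECTS = ["ball", "book", "pen"]

def make_pair():
    giver, receiver = random.sample(NAMES, 2)
    obj = random.choice(OBJECTS)
    if random.random() < 0.5:
        src = f"{giver} gave the {obj} to {receiver}"
    else:
        # "took ... from" describes the same transfer from the receiver's side.
        src = f"{receiver} took the {obj} from {giver}"
    # Either way, the KR records the transfer as: object, source, destination.
    tgt = f"PTRANS {obj} {giver} {receiver}"
    return src, tgt

# Write aligned source/target files, one sample per line, as OpenNMT expects.
with open("src-train.txt", "w") as fs, open("tgt-train.txt", "w") as ft:
    for _ in range(10000):
        src, tgt = make_pair()
        fs.write(src + "\n")
        ft.write(tgt + "\n")
```

The actual dataset covers many more names, actions, and objects, but every pair has exactly this shape.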
OpenNMT performs very well on a synthesized training set like the above, built from random combinations of many names, actions, and objects. However, if a test sample contains a name that OpenNMT never observed in the training data, the test always fails. For example, if the test English phrase is:
Test Source sample: John gave the ball to Jess
Here the name ‘Jess’ is in the vocab but never appeared in the training data. Running this through the trained model yields a prediction something like:
Predicted KR: PTRANS ball John Dan
The ‘Dan’ part varies from run to run, but it is always wrong. The correct answer should be:
Correct answer: PTRANS ball John Jess
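To be explicit about what I mean by "in the vocab but not in the training data": the vocabulary file given to OpenNMT lists one token per line, and it was built from a word list that is a superset of the tokens in the training pairs, so a name like ‘Jess’ has its own vocabulary entry even though no training pair contains it. A simplified excerpt (special tokens omitted):

```
John
Mary
Dan
Jess
ball
PTRANS
...
```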
For my purposes this is a big problem, since a system that only works with names seen during training is not acceptable.
I understand that OpenNMT is not designed to perform the kind of generalization that infers higher-level patterns from ground facts, and that this limitation is not specific to OpenNMT. What I am looking for is a way to get OpenNMT to behave closer to what I described above.
Any advice is appreciated.