Adding biased decoding

whitemamba · February 12, 2017, 2:31am

Hi everyone, I am trying to add biased decoding to the model, but I am unsure how to do this. Basically, my model is trying to add punctuation to the source sentences, so the only ‘words’ I want it to output are either the original word or, let’s say, a comma (for simplicity’s sake). Is there any way to bias decoding so that the only words the model considers are the original word and the comma, and nothing else?

whitemamba · February 12, 2017, 10:17am

By the way, this is how it was done in TensorFlow: http://atpaino.com/2017/01/03/deep-text-correcter.html

However, I am not that familiar with Lua and Torch yet, so I’m not sure how this can be ported to this model.

srush · February 13, 2017, 3:30pm

Oh yes, this is something that we are interested in as well.

The new beam search should make this possible, although you will have to modify the code.

In particular see this function:

https://github.com/OpenNMT/OpenNMT/blob/master/onmt/translate/DecoderAdvancer.lua#L144

--[[ Expand function. Expands beam by all possible tokens and returns the
  scores.

Parameters:

  * `beam` - an `onmt.translate.Beam` object.

Returns:

  * `scores` - a 2D tensor of size `(batchSize * beamSize, numTokens)`.

]]
function DecoderAdvancer:expand(beam)
  local state = beam:getState()
  local decOut = state[2]
  local out = self.decoder.generator:forward(decOut)
  local features = {}
  for j = 2, #out do
    local _, best = out[j]:max(2)
    features[j - 1] = best:view(-1)

It lets you return a tensor of size (batchSize * beamSize) at each step of beam search. If you return false for “bad” outputs, the beam search will ignore them.
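As a rough illustration of the "return false on bad outputs" idea, here is a minimal sketch in plain Lua. It uses ordinary tables in place of Torch tensors so that it runs without OpenNMT, and the names `keepFlags`, `lastTokens`, `allowed`, `COMMA`, and `sourceWord` are made up for the example rather than taken from the OpenNMT code. It also assumes you already know, for each hypothesis, the index of the source word aligned with the current step.

```lua
-- Sketch only: plain Lua tables stand in for Torch tensors.
-- keepFlags, lastTokens, allowed, COMMA and sourceWord are hypothetical
-- names chosen for this example; they are not part of the OpenNMT API.

-- lastTokens: array of length batchSize * beamSize holding the token index
--             that each hypothesis has just emitted.
-- allowed:    array of sets of the same length; allowed[i][tok] == true
--             means hypothesis i is permitted to emit token tok.
-- Returns an array of booleans: true keeps the hypothesis, false marks it
-- as a "bad" output.
local function keepFlags(lastTokens, allowed)
  local flags = {}
  for i, tok in ipairs(lastTokens) do
    flags[i] = allowed[i][tok] == true
  end
  return flags
end

-- Toy setup: batchSize = 1, beamSize = 3, and token index 4 plays the comma.
local COMMA = 4
local sourceWord = 3  -- assumed: index of the source word for this step
local lastTokens = { 3, 1, 4 }

-- Every hypothesis may emit only the source word or the comma.
local allowed = {}
for i = 1, #lastTokens do
  allowed[i] = { [sourceWord] = true, [COMMA] = true }
end

local flags = keepFlags(lastTokens, allowed)
for i, keep in ipairs(flags) do
  print(i, lastTokens[i], keep)
end
```

Here hypotheses 1 and 3 are kept (they emitted the source word and the comma) and hypothesis 2 is flagged as bad. In the real decoder the same per-hypothesis check would have to produce a flat tensor of size (batchSize * beamSize) rather than a Lua table, and how the source word is looked up for each hypothesis depends on what is stored in the beam state, which is not shown here.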

whitemamba · March 5, 2017, 12:51pm

Hello, thanks for your response! I tried to do what you suggested for a looong time, but haven’t really gotten anywhere, as I really can’t understand the code. Let’s say I only want it to output ‘a’ or ‘b’, which have indices 1 and 2 respectively in the dict. How do I check whether the token (which I understand is a tensor?) is equal to either ‘a’ or ‘b’?

Also, let’s say the original sentence is ‘c d e f’. Is the token the predicted word or the original word? And if it is the predicted word, is there a way to access the original word/sentence data?