I’m trying to run the command-line translation shown below from Python code, so that I can do some further processing on what translate returns, i.e. trans.log(0). Ultimately I want to get the following code to work:
for batch in data_iter:
    trans_batch = translator.translate_batch(
        batch=batch, src_vocabs=[src_vocab],
        attn_debug=False)
    translations = builder.from_batch(trans_batch)
    for trans in translations:
        print(trans.log(0))
    break
Below is the command that I use from the command line:
onmt_translate -model Model_1A_step_5000.pt -src data/test.txt -output data/pred.txt -gpu 0 --n_best 1
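Since the command above already writes its predictions to data/pred.txt, one interim way to post-process the results is to read the source and prediction files back and pair them line by line. The following is only a sketch in plain Python, not OpenNMT API: `read_lines` and `pair_lines` are hypothetical helper names, and it assumes both files are UTF-8 with one sentence per line and that --n_best 1 yields exactly one prediction per source line. It does not expose the scores that trans.log(0) prints.

```python
from pathlib import Path


def read_lines(path):
    """Return the lines of a UTF-8 text file without trailing newlines."""
    return Path(path).read_text(encoding="utf-8").splitlines()


def pair_lines(src_lines, pred_lines):
    """Pair each source sentence with the prediction at the same position.

    Assumes one prediction per source sentence (e.g. --n_best 1).
    """
    if len(src_lines) != len(pred_lines):
        raise ValueError(
            f"line count mismatch: {len(src_lines)} source lines "
            f"vs {len(pred_lines)} predictions"
        )
    return list(zip(src_lines, pred_lines))


def pair_files(src_path, pred_path):
    """Convenience wrapper: read both files and pair them line by line."""
    return pair_lines(read_lines(src_path), read_lines(pred_path))
```

After running the command, something like `pair_files("data/test.txt", "data/pred.txt")` would give a list of (source, prediction) tuples to process further.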
I’m working off the example from here, but I’m having a hard time filling in the correct code. I’d like to translate the file data/test.txt, but I’m struggling to get that to work with onmt.inputters.Dataset. The format of data/test.txt is below:
test.txt
this is a sentence to translate
Will we go play today?
Is it snowing outside?
I’ve tried loading that into src_data using this code from the example:
src_data = {"reader": onmt.inputters.str2reader["text"](), "data": src_val}
tgt_data = {"reader": onmt.inputters.str2reader["text"](), "data": tgt_val}
_readers, _data = onmt.inputters.Dataset.config(
    [('src', src_data), ('tgt', tgt_data)])
dataset = onmt.inputters.Dataset(
    vocab_fields, readers=_readers, data=_data,
    sort_key=onmt.inputters.str2sortkey["text"])
That way I’d be able to create the dataset object and work with the rest of the code.