Hello fellow researchers,
I am trying to run OpenNMT-tf (onmt) with multiple sources and multiple features.
My data.yml file looks like this:
model_dir: runQG/
data:
  train_features_file:
    - - train.txt.case
      - train.txt.pos
      - train.txt.bio
      - train.txt.ner
    - train.txt.src.txt
  train_labels_file: train.txt.target.txt
  eval_features_file:
    - - dev.txt.shuffle.dev.case
      - dev.txt.shuffle.dev.pos
      - dev.txt.shuffle.dev.bio
      - dev.txt.shuffle.dev.ner
    - dev.txt.shuffle.dev.source.txt
  eval_labels_file: dev.txt.shuffle.dev.target.txt
  source_1_1_vocabulary: case-vocab.txt
  source_1_2_vocabulary: pos-vocab.txt
  source_1_3_vocabulary: bio-vocab.txt
  source_1_4_vocabulary: ner-vocab.txt
  source_2_vocabulary: src-vocab.txt
  target_vocabulary: tgt-vocab.txt
And the model.py file looks like this:
from opennmt import models, inputters, encoders, layers, decoders
import tensorflow_addons as tfa


def model():
    return models.SequenceToSequence(
        source_inputter=inputters.ParallelInputter([
            # First source: four word-level features, concatenated.
            inputters.ParallelInputter(
                [
                    inputters.WordEmbedder(embedding_size=2),
                    inputters.WordEmbedder(embedding_size=16),
                    inputters.WordEmbedder(embedding_size=16),
                    inputters.WordEmbedder(embedding_size=16),
                ],
                combine_features=True,
                reducer=layers.ConcatReducer()),
            # Second source: the source text itself.
            inputters.WordEmbedder(embedding_size=300),
        ]),
        target_inputter=inputters.WordEmbedder(embedding_size=300),
        encoder=encoders.ParallelEncoder(
            [
                encoders.RNNEncoder(1, 300, dropout=0.2),
                encoders.RNNEncoder(2, 300, dropout=0.2, bidirectional=True),
            ],
            outputs_reducer=layers.ConcatReducer(axis=-1)),
        decoder=decoders.AttentionalRNNDecoder(
            num_layers=1,
            num_units=300,
            attention_mechanism_class=tfa.seq2seq.LuongAttention,
            dropout=0.2))
I have already trained the model, but when I run inference with this command:
onmt-main --config data.yml --auto_config infer --features_file test.case test.pos test.bio test.ner test.source.txt --predictions_file infer_dev.txt
I get the following error:
Traceback (most recent call last):
  File "/usr/local/bin/onmt-main", line 8, in <module>
    sys.exit(main())
  File "/usr/local/lib/python3.6/dist-packages/opennmt/bin/main.py", line 235, in main
    log_time=args.log_prediction_time)
  File "/usr/local/lib/python3.6/dist-packages/opennmt/runner.py", line 342, in infer
    prefetch_buffer_size=infer_config.get("prefetch_buffer_size"))
  File "/usr/local/lib/python3.6/dist-packages/opennmt/inputters/inputter.py", line 462, in make_inference_dataset
    prefetch_buffer_size=prefetch_buffer_size)
  File "/usr/local/lib/python3.6/dist-packages/opennmt/inputters/inputter.py", line 92, in make_inference_dataset
    dataset = self.make_dataset(features_file, training=False)
  File "/usr/local/lib/python3.6/dist-packages/opennmt/inputters/inputter.py", line 270, in make_dataset
    dataset = inputter.make_dataset(data, training=training)
  File "/usr/local/lib/python3.6/dist-packages/opennmt/inputters/inputter.py", line 265, in make_dataset
    raise ValueError("The number of data files must be the same as the number of inputters")
ValueError: The number of data files must be the same as the number of inputters
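My current guess at the cause, written as a small plain-Python sketch (it does not call OpenNMT-tf at all): the top-level ParallelInputter in model.py has two inputters, while the command line above hands it a flat list of five files. In data.yml the same files are nested as four feature files plus one source file. The helper names matches_top_level and nest_like_training are hypothetical and only illustrate the count check; I have not confirmed whether onmt-main can express this nesting on the command line.

```python
# Plain-Python illustration of the file-count mismatch; no OpenNMT-tf calls.
# The helper names below are hypothetical, not part of the library.

# Files exactly as passed to --features_file (a flat list of five).
flat_files = ["test.case", "test.pos", "test.bio", "test.ner", "test.source.txt"]

# The top-level ParallelInputter in model.py holds two inputters:
# a nested ParallelInputter (four features) and one WordEmbedder.
NUM_TOP_LEVEL_INPUTTERS = 2
NUM_NESTED_FEATURES = 4


def matches_top_level(files, num_inputters=NUM_TOP_LEVEL_INPUTTERS):
    """Return True if the number of top-level entries equals the number of inputters."""
    return isinstance(files, (list, tuple)) and len(files) == num_inputters


def nest_like_training(files, num_features=NUM_NESTED_FEATURES):
    """Group the first `num_features` files together, mirroring the layout in data.yml."""
    return [list(files[:num_features])] + list(files[num_features:])


nested_files = nest_like_training(flat_files)

print(matches_top_level(flat_files))    # five entries against two inputters
print(matches_top_level(nested_files))  # two entries against two inputters
print(nested_files)
```

If this reading is right, the inference input would need the same nested shape as train_features_file, that is, a list whose first element is itself a list of the four feature files.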
How can I fix it?
Thanks and best regards