Hi everyone,
I’m having trouble using source features in the v3 inline format (W1│F1 W2│F2 … Wn│Fn) while re-running some old experiments; previously the features lived in a second, parallel file. Below is the config .yaml file I use and the command I run to fine-tune my model, followed by the error I receive.
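For clarity, here is a quick snippet showing what one source line looks like in that inline format (the Spanish tokens and POS tags are made up for illustration; the separator is "│", U+2502):

```python
# Build one source line in the v3 inline-feature format:
# each token is glued to its feature with the "│" separator.
tokens = ["estamos", "aquí"]   # hypothetical tokens
pos_tags = ["VERB", "ADV"]     # one POS feature per token
line = " ".join(f"{tok}│{feat}" for tok, feat in zip(tokens, pos_tags))
print(line)  # estamos│VERB aquí│ADV
```

My eslse_train_es_pos_tagged.txt file follows this shape, one sentence per line.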
For context, when I run this without attempting to use features, everything functions as expected and my models continue to train.
Thank you!
My config file (eslse_pos_8400_run01.yaml):

# Finetune best model from run 1
## Where the samples will be written
save_data: tg-finetune
## Where the vocab(s) will be written
src_vocab: tg-finetune/tg-finetune.vocab.src
tgt_vocab: tg-finetune/tg-finetune.vocab.tgt
# Allow overwriting existing files in the folder
overwrite: True
#src_feats: None
# Corpus opts:
data:
    corpus_1:
        path_src: eslse_train_es_pos_tagged.txt
        path_tgt: eslse_train_gloss.txt
        transforms: [inferfeats]
        weight: 1
    valid:
        path_src: eslse_dev_es_tok.txt
        path_tgt: eslse_dev_gloss.txt
        transforms: [inferfeats]
# Train on a single GPU
world_size: 1
gpu_ranks: [0]
# Where to save the checkpoints - for finetuning must specify # of steps from final checkpoint!
save_model: tg-finetune/tg-finetune_8400_01
save_checkpoint_steps: 200
train_steps: 13400
valid_steps: 200
# Transform options
reversible_tokenization: "joiner"
# Features options
n_src_feats: 1
src_feats_defaults: "X"
feat_merge: "concat"
The command I run:
python3 ../OpenNMT-py/train.py --config eslse_pos_8400_run01.yaml --train_from tg-pretrain/models/tg-pretrain_03_step_8400.pt --reset_optim keep_states --log_file tg-features/eslse_ft_tat_8400_run01.log
The error I receive:
[2024-01-23 16:31:55,491 INFO] Weighted corpora loaded so far:
* corpus_1: 276
Traceback (most recent call last):
  File "/home/ubuntu/lse_exps/../OpenNMT-py/train.py", line 6, in <module>
    main()
  File "/home/ubuntu/OpenNMT-py/onmt/bin/train.py", line 67, in main
    train(opt)
  File "/home/ubuntu/OpenNMT-py/onmt/bin/train.py", line 52, in train
    train_process(opt, device_id=0)
  File "/home/ubuntu/OpenNMT-py/onmt/train_single.py", line 238, in main
    trainer.train(
  File "/home/ubuntu/OpenNMT-py/onmt/trainer.py", line 308, in train
    for i, (batches, normalization) in enumerate(self._accum_batches(train_iter)):
  File "/home/ubuntu/OpenNMT-py/onmt/trainer.py", line 238, in _accum_batches
    for batch, bucket_idx in iterator:
  File "/home/ubuntu/OpenNMT-py/onmt/inputters/dynamic_iterator.py", line 373, in __iter__
    for (tensor_batch, bucket_idx) in self.data_iter:
  File "/home/ubuntu/.local/lib/python3.10/site-packages/torch/utils/data/dataloader.py", line 633, in __next__
    data = self._next_data()
  File "/home/ubuntu/.local/lib/python3.10/site-packages/torch/utils/data/dataloader.py", line 1345, in _next_data
    return self._process_data(data)
  File "/home/ubuntu/.local/lib/python3.10/site-packages/torch/utils/data/dataloader.py", line 1371, in _process_data
    data.reraise()
  File "/home/ubuntu/.local/lib/python3.10/site-packages/torch/_utils.py", line 644, in reraise
    raise exception
KeyError: Caught KeyError in DataLoader worker process 0.
Original Traceback (most recent call last):
  File "/home/ubuntu/.local/lib/python3.10/site-packages/torch/utils/data/_utils/worker.py", line 308, in _worker_loop
    data = fetcher.fetch(index)
  File "/home/ubuntu/.local/lib/python3.10/site-packages/torch/utils/data/_utils/fetch.py", line 41, in fetch
    data = next(self.dataset_iter)
  File "/home/ubuntu/OpenNMT-py/onmt/inputters/dynamic_iterator.py", line 341, in __iter__
    for bucket, bucket_idx in self._bucketing():
  File "/home/ubuntu/OpenNMT-py/onmt/inputters/dynamic_iterator.py", line 278, in _bucketing
    yield (self._tuple_to_json_with_tokIDs(bucket), self.bucket_idx)
  File "/home/ubuntu/OpenNMT-py/onmt/inputters/dynamic_iterator.py", line 252, in _tuple_to_json_with_tokIDs
    bucket.append(numericalize(self.vocabs, example))
  File "/home/ubuntu/OpenNMT-py/onmt/inputters/text_utils.py", line 149, in numericalize
    for fv, feat in zip(vocabs["src_feats"], example["src"]["feats"]):
KeyError: 'src_feats'
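In case it helps narrow things down: the final frame boils down to the vocabs dict simply having no "src_feats" entry when the example does carry features. A minimal pure-Python sketch of that failure mode (the dict contents here are made up, not the real OpenNMT structures):

```python
# Sketch of the failing line in text_utils.numericalize: the vocabs
# mapping has no "src_feats" key, so the lookup raises KeyError before
# any feature is numericalized.
vocabs = {"src": ["estamos", "aquí"], "tgt": ["GLOSS"]}  # no "src_feats" key
example = {"src": {"feats": [["VERB", "ADV"]]}}          # shape is my assumption

caught = None
try:
    for fv, feat in zip(vocabs["src_feats"], example["src"]["feats"]):
        pass
except KeyError as err:
    caught = err
print(caught)  # 'src_feats'
```

My guess is that the features vocab was never built or loaded, perhaps because the pretrained checkpoint I pass to --train_from was created without n_src_feats, but I’m not sure how to fix that.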
Thanks in advance to anyone who can offer me some advice!