Hello,
I've been trying to use the automatic export on the best BLEU score, but as soon as I enable these options, it crashes.
train:
  save_checkpoints_steps: 500
  #max_step: 500000
  #single_pass: true
  # (optional) How many checkpoints to keep on disk.
  keep_checkpoint_max: 10
  #effective_batch_size: 1
eval:
  steps: 500
  # Available scorers: bleu, rouge, wer, ter, prf
  scorers: bleu
  export_on_best: bleu
  export_format: saved_model
  max_exports_to_keep: 2
infer:
  n_best: 3
  with_scores: true
2021-06-30 15:13:16.661000: W deprecation.py:534] From /usr/local/lib/python3.7/dist-packages/tensorflow/python/ops/array_ops.py:5049: calling gather (from tensorflow.python.ops.array_ops) with validate_indices is deprecated and will be removed in a future version.
Instructions for updating:
The `validate_indices` argument has no effect. Indices are always validated on CPU and never validated on GPU.
2021-06-30 15:13:20.703076: W tensorflow/core/grappler/costs/op_level_cost_estimator.cc:689] Error in PredictCost() for the op: op: "Softmax" attr { key: "T" value { type: DT_FLOAT } } inputs { dtype: DT_FLOAT shape { unknown_rank: true } } device { type: "GPU" vendor: "NVIDIA" model: "Tesla V100-SXM2-16GB" frequency: 1530 num_cores: 80 environment { key: "architecture" value: "7.0" } environment { key: "cuda" value: "11000" } environment { key: "cudnn" value: "8004" } num_registers: 65536 l1_cache_size: 24576 l2_cache_size: 6291456 shared_memory_size_per_multiprocessor: 98304 memory_size: 15395979264 bandwidth: 898048000 } outputs { dtype: DT_FLOAT shape { unknown_rank: true } }
[the same PredictCost() warning for the "Softmax" op is repeated 23 more times, up to 15:13:20.765847]
2021-06-30 15:16:44.707000: I training.py:202] Evaluation predictions saved to gdrive/MyDrive/VGR/en-fr/model/tf/OpenNMT-TF_fr_mtTransformer_tmtbpe_vs8000/eval/predictions.txt.500
Traceback (most recent call last):
  File "/usr/local/bin/onmt-main", line 8, in <module>
    sys.exit(main())
  File "/usr/local/lib/python3.7/dist-packages/opennmt/bin/main.py", line 326, in main
    hvd=hvd,
  File "/usr/local/lib/python3.7/dist-packages/opennmt/runner.py", line 281, in train
    moving_average_decay=train_config.get("moving_average_decay"),
  File "/usr/local/lib/python3.7/dist-packages/opennmt/training.py", line 145, in __call__
    evaluator, step, moving_average=moving_average
  File "/usr/local/lib/python3.7/dist-packages/opennmt/training.py", line 202, in _evaluate
    evaluator(step)
  File "/usr/local/lib/python3.7/dist-packages/opennmt/evaluation.py", line 343, in __call__
    score = scorer(self._labels_file, output_path)
  File "/usr/local/lib/python3.7/dist-packages/opennmt/utils/scorers.py", line 92, in __call__
    bleu = sacrebleu.corpus_bleu(sys_stream, [ref_stream], force=True)
  File "/usr/local/lib/python3.7/dist-packages/sacrebleu/compat.py", line 36, in corpus_bleu
    sys_stream, ref_streams, use_effective_order=use_effective_order)
  File "/usr/local/lib/python3.7/dist-packages/sacrebleu/metrics/bleu.py", line 277, in corpus_score
    raise EOFError("System and reference streams have different lengths!")
EOFError: System and reference streams have different lengths!
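
Since the error says the system and reference streams differ in length, a quick sanity check could be to compare the line counts of the saved predictions file and the evaluation reference file. The sketch below is only an assumption about how to do that. The helper names `count_lines` and `compare_line_counts` are made up for illustration and are not part of OpenNMT-tf or sacreBLEU, and the two paths are passed on the command line because the reference file path does not appear in the log above.

```python
import sys


def count_lines(path):
    """Return the number of lines in a UTF-8 text file."""
    with open(path, encoding="utf-8") as f:
        return sum(1 for _ in f)


def compare_line_counts(predictions_path, reference_path):
    """Return (n_predictions, n_references, lengths_match).

    Hypothetical helper: it only counts lines and does not score anything.
    """
    n_pred = count_lines(predictions_path)
    n_ref = count_lines(reference_path)
    return n_pred, n_ref, n_pred == n_ref


# Usage: python check_lengths.py <predictions file> <reference file>
if __name__ == "__main__" and len(sys.argv) == 3:
    n_pred, n_ref, match = compare_line_counts(sys.argv[1], sys.argv[2])
    print(f"predictions: {n_pred} lines, references: {n_ref} lines")
    print("lengths match" if match else "lengths differ")
```

Running it on the predictions.txt.500 file from the log and on the reference file used for evaluation should show whether the two files really have a different number of lines.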