Hello. I am a beginner with OpenNMT, and I am trying to convert a model I trained myself with OpenNMT-py to a CTranslate2 model.
However, I am encountering the error below. When training the OpenNMT-py model, I set "self_attn_type" to "scaled-dot", and my config.yaml is based on GitHub - ymoslem/OpenNMT-Tutorial (Neural Machine Translation (NMT) tutorial: data preprocessing, model training, evaluation, and deployment).
Could anyone please advise me on how to solve this problem? Thank you very much in advance.
Command
ct2-opennmt-py-converter --model_path model.ensl_step_14000.pt --output_dir enslo_ctranslate
Error message
"ValueError: The model you are trying to convert is not supported by CTranslate2. We identified the following reasons:
- Option --self_attn_type scaled-dot-flash is not supported (supported values are: scaled-dot)"
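The converter reports "scaled-dot-flash" even though my config says "scaled-dot", so the value stored inside the checkpoint apparently differs from what I set at training time. In case it is useful, this is how I would inspect and patch the saved option before converting. This is only a minimal sketch, assuming the standard OpenNMT-py checkpoint layout where the training options are pickled as a Namespace under the "opt" key; the patched filename is my own choice:

import torch

# Load the checkpoint on CPU. On recent PyTorch versions, weights_only=False
# is needed so the pickled options Namespace can be restored.
checkpoint = torch.load(
    "model.ensl_step_14000.pt", map_location="cpu", weights_only=False
)

# Show the attention type actually recorded at save time; I expect
# "scaled-dot-flash" here despite the config value.
print(checkpoint["opt"].self_attn_type)

# Overwrite it with the value CTranslate2 supports and save a patched copy.
checkpoint["opt"].self_attn_type = "scaled-dot"
torch.save(checkpoint, "model.ensl_step_14000.patched.pt")

If this is the right direction, I would then re-run the converter on model.ensl_step_14000.patched.pt.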
config.yaml
save_data: run
data:
  corpus_1:
    path_src: Moodle.en-sl.en-filtered.en.subword.train
    path_tgt: Moodle.en-sl.sl-filtered.sl.subword.train
    transforms: [filtertoolong]
    weight: 2
  valid:
    path_src: Moodle.en-sl.en-filtered.en.subword.dev
    path_tgt: Moodle.en-sl.sl-filtered.sl.subword.dev
    transforms: [filtertoolong]
    weight: 2
train_from: "model_back/model.ensl_step2_11000.pt"
update_vocab: true
reset_optim: "states"
self_attn_type: "scaled-dot"
src_vocab: run/source.vocab
tgt_vocab: run/target.vocab
src_vocab_size: 50000
tgt_vocab_size: 50000
src_seq_length: 150
tgt_seq_length: 150
src_subword_model: source.model
tgt_subword_model: target.model
log_file: train.log
save_model: models/model.ensl
early_stopping: 2
save_checkpoint_steps: 1000
seed: 3435
train_steps: 14000
valid_steps: 1000
warmup_steps: 500
report_every: 100
world_size: 1
gpu_ranks: [0]
bucket_size: 262144
num_workers: 0
batch_type: "tokens"
batch_size: 4096
valid_batch_size: 2048
max_generator_batches: 2
accum_count: [4]
accum_steps: [0]
model_dtype: "fp16"
optim: "adam"
learning_rate: 0.3
decay_method: "noam"
adam_beta2: 0.998
max_grad_norm: 0
label_smoothing: 0.1
param_init: 0
param_init_glorot: true
normalization: "tokens"
encoder_type: transformer
decoder_type: transformer
position_encoding: true
enc_layers: 6
dec_layers: 6
heads: 8
hidden_size: 512
word_vec_size: 512
transformer_ff: 2048
dropout_steps: [0, 11000, 12000, 13000]
dropout: [0.3, 0.5, 0.4, 0.4]
attention_dropout: [0.1, 0.1, 0.1, 0.1]
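For completeness, the Python equivalent of the converter command above, pointed at the patched checkpoint; a sketch assuming CTranslate2's ctranslate2.converters.OpenNMTPyConverter API (the patched filename comes from the snippet earlier in this post):

import ctranslate2

# Convert the patched OpenNMT-py checkpoint into a CTranslate2 model directory.
converter = ctranslate2.converters.OpenNMTPyConverter("model.ensl_step_14000.patched.pt")
converter.convert("enslo_ctranslate", force=True)  # force=True overwrites an existing directory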