Error after converting a Fairseq model

anderleich · July 5, 2021, 12:10pm

Hi,

I’m wondering how I can run a Fairseq model with the REST server. The newest version of CTranslate2 can convert Fairseq models, so I assume the converted model should work with the REST server. Is this possible?

I get the following error:

IndexError: variable encoder/layer_norm/beta not found

Any clues?

Thanks

guillaumekln · July 5, 2021, 12:24pm

What are the options used by this Fairseq model and how did you convert it?

(I’m moving this discussion to a separate topic since it is not directly related to the REST server.)

anderleich · July 5, 2021, 12:27pm

This is how I converted the model:

ct2-fairseq-converter --model_path models/en_es.pt --data_dir models/en_es/data --output_dir models/en_es.pt.ctrans2

I did not train the model myself. Data in models/ is what I have. I’ve tried translating some sentences with Fairseq and it works with that data.

guillaumekln · July 5, 2021, 12:32pm

Can you run the following command and report the output (make sure the path to the model is correct):

python3 -c 'from fairseq import checkpoint_utils; print(checkpoint_utils.load_checkpoint_to_cpu("models/en_es.pt")["args"])'
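If the one-liner is awkward to quote in your shell, the same check can be written as a small script. This is only a sketch: the helper names `extract_training_options` and `load_training_options` are made up for illustration, and the fallback to a `"cfg"` key is an assumption about newer Fairseq checkpoints, which may no longer store an `"args"` Namespace.

```python
def extract_training_options(state):
    """Return the training options stored in a loaded Fairseq checkpoint dict.

    Hypothetical helper: "args" is the key used in the command above;
    falling back to "cfg" is an assumption about newer checkpoints.
    """
    options = state.get("args")
    if options is None:
        options = state.get("cfg")
    return options


def load_training_options(checkpoint_path):
    """Load a checkpoint on CPU with Fairseq and return its training options."""
    # Imported lazily so extract_training_options works without Fairseq installed.
    from fairseq import checkpoint_utils

    state = checkpoint_utils.load_checkpoint_to_cpu(checkpoint_path)
    return extract_training_options(state)
```

Calling `print(load_training_options("models/en_es.pt"))` should then print the same information as the command above, provided the path to the model is correct.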

anderleich · July 5, 2021, 12:35pm

Namespace(activation_dropout=0.0, activation_fn='relu', adam_betas='(0.9, 0.98)', adam_eps=1e-08, adaptive_input=False, adaptive_softmax_cutoff=None, adaptive_softmax_dropout=0, arch='transformer', attention_dropout=0.0, best_checkpoint_metric='loss', bpe=None, bucket_cap_mb=25, clip_norm=0.0, cpu=False, criterion='label_smoothed_cross_entropy', cross_self_attention=False, curriculum=0, data='data', dataset_impl=None, ddp_backend='c10d', decoder_attention_heads=8, decoder_embed_dim=512, decoder_embed_path=None, decoder_ffn_embed_dim=2048, decoder_input_dim=512, decoder_layerdrop=0, decoder_layers=6, decoder_layers_to_keep=None, decoder_learned_pos=False, decoder_normalize_before=False, decoder_output_dim=512, device_id=0, disable_validation=False, distributed_backend='nccl', distributed_init_method=None, distributed_no_spawn=False, distributed_port=-1, distributed_rank=0, distributed_world_size=1, dropout=0.3, empty_cache_freq=0, encoder_attention_heads=8, encoder_embed_dim=512, encoder_embed_path=None, encoder_ffn_embed_dim=2048, encoder_layerdrop=0, encoder_layers=6, encoder_layers_to_keep=None, encoder_learned_pos=False, encoder_normalize_before=False, eval_bleu_detok='space', eval_bleu_remove_bpe=None, eval_tokenized_bleu=False, fast_stat_sync=False, find_unused_parameters=False, fix_batches_to_gpus=False, fixed_validation_seed=None, fp16=True, fp16_init_scale=128, fp16_scale_tolerance=0.0, fp16_scale_window=None, ignore_prefix_size=0, keep_interval_updates=-1, keep_last_epochs=-1, label_smoothing=0.1, layer_wise_attention=False, layernorm_embedding=False, lazy_load=False, left_pad_source=True, left_pad_target=False, load_alignments=False, log_format='tqdm', log_interval=1000, lr=[0.001], lr_scheduler='inverse_sqrt', max_epoch=60, max_sentences=None, max_sentences_valid=None, max_source_positions=1024, max_target_positions=1024, max_tokens=3584, max_tokens_valid=3584, max_update=0, maximize_best_checkpoint_metric=False, memory_efficient_fp16=False, 
min_loss_scale=0.0001, min_lr=1e-09, no_cross_attention=False, no_epoch_checkpoints=False, no_last_checkpoints=False, no_progress_bar=False, no_save=False, no_save_optimizer_state=False, no_scale_embedding=False, no_token_positional_embeddings=False, num_batch_buckets=0, num_workers=1, optimizer='adam', optimizer_overrides='{}', quant_noise_pq=0, quant_noise_pq_block_size=8, quant_noise_scalar=0, raw_text=False, required_batch_size_multiple=8, reset_dataloader=False, reset_lr_scheduler=False, reset_meters=False, reset_optimizer=False, restore_file='checkpoint_last.pt', save_dir='checkpoints', save_interval=1, save_interval_updates=0, seed=1, sentence_avg=False, share_all_embeddings=False, share_decoder_input_output_embed=False, skip_invalid_size_inputs_valid_test=False, source_lang='eu', target_lang='es', task='translation', tensorboard_logdir='', threshold_loss_scale=None, tokenizer=None, train_subset='train', truncate_source=False, update_freq=[32], upsample_primary=1, use_bmuf=False, use_old_adam=False, user_dir=None, valid_subset='valid', validate_interval=1, warmup_init_lr=1e-07, warmup_updates=4000, weight_decay=0.0)

guillaumekln · July 5, 2021, 12:41pm

This model seems pretty standard so it should work.

Can you verify the CTranslate2 version you are using to run the translation? I suspect this version is older than the one you use for converting the model.
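A quick way to check this is to compare the installed runtime version with the version of the converter. The sketch below uses only the standard library (Python 3.8+); the helper names are hypothetical, and the parsing assumes plain dotted numeric versions such as `2.1.0`, so it would fail on pre-release suffixes.

```python
from importlib import metadata


def parse_version(version):
    """Turn a dotted numeric version string into a tuple of ints for comparison."""
    return tuple(int(part) for part in version.split("."))


def runtime_supports_model(runtime_version, converter_version):
    """Return True if the runtime is at least as recent as the converter."""
    return parse_version(runtime_version) >= parse_version(converter_version)


def installed_ctranslate2_version():
    """Return the installed ctranslate2 version string, or None if it is not installed."""
    try:
        return metadata.version("ctranslate2")
    except metadata.PackageNotFoundError:
        return None
```

Run `installed_ctranslate2_version()` in the environment that serves translations and compare the result with the version in the environment where `ct2-fairseq-converter` was run.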

anderleich · July 5, 2021, 12:44pm

I was using ctranslate2==1.20.1. After upgrading ctranslate2 to the latest version (2.1.0), I now get the following error:

onmt.translate.translation_server.ServerModelError: Runtime Error: CUDA failed with error initialization error

I’m using CUDA 10.2 with pytorch==1.6

guillaumekln · July 5, 2021, 12:51pm

The CTranslate2 version used to run the model should not be older than the one used to convert it.

As you noted before, the latest CTranslate2 version requires CUDA 11. This is another issue.

anderleich · July 5, 2021, 12:52pm

Can I compile ctranslate2 with CUDA 10.2?

guillaumekln · July 5, 2021, 12:59pm

Yes. We still support CUDA >= 10.0 for source compilation.

You can get started with this example. The Python instructions are further down.

I’m closing this topic since the initial issue has been identified (the runtime version was too old to support models converted from Fairseq).
