@guillaumekln Thanks for the great ctranslate2 library.
Now that this release supports converting Transformer models trained with Fairseq, is it possible to convert the M2M100_418M model from Facebook AI too? I can’t find any straightforward examples of similar models that have been converted to ctranslate2 so far. The original model is here, and a Hugging Face Transformers version is available here.
I was able to convert the WMT16 model successfully, but it seems to have quite a different model structure.
import os
import ctranslate2

# relative path to where the script is run from
data_dir = os.path.join(
    "path...",
    "m2m100_418M"
)

# huggingface transformer m2m100 model
# Ref: https://huggingface.co/facebook/m2m100_418M
converter = ctranslate2.converters.FairseqConverter(
    os.path.join(data_dir, "pytorch_model.bin"), data_dir
)

output_dir = "/path/m2m_100/ctranslate2_model"
converter.convert(output_dir)
This is the error I got:
python3 python/m2m_100_converter.py
Traceback (most recent call last):
File "python/m2m_100_converter.py", line 23, in <module>
converter.convert(output_dir)
File "/<path>/github.com/OpenNMT/CTranslate2/python/ctranslate2/converters/converter.py", line 45, in convert
model_spec = self._load()
File "/<path>/github.com/OpenNMT/CTranslate2/python/ctranslate2/converters/fairseq.py", line 84, in _load
checkpoint = checkpoint_utils.load_checkpoint_to_cpu(self._model_path)
File "<path>/Library/Python/3.8/lib/python/site-packages/fairseq/checkpoint_utils.py", line 228, in load_checkpoint_to_cpu
args = state["args"]
KeyError: 'args'
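For anyone hitting the same KeyError: it is consistent with pytorch_model.bin being a plain Hugging Face state dict (parameter names mapped to tensors) rather than a Fairseq checkpoint, which, per the traceback, is expected to carry an "args" entry. Below is a minimal sketch of a pre-check, assuming the file has already been loaded into a dict (for example with torch.load(path, map_location="cpu")). The helper name is hypothetical, the "cfg" key for newer Fairseq checkpoints is my assumption, and the dicts are illustrative stand-ins rather than real checkpoints.

```python
def looks_like_fairseq_checkpoint(state):
    """Heuristic: Fairseq checkpoints are dicts that carry training
    metadata ("args", or "cfg" in newer releases) next to the weights."""
    return isinstance(state, dict) and ("args" in state or "cfg" in state)


# Stand-ins for loaded checkpoints (real ones hold tensors, not None).
# The key names are illustrative only.
hf_state = {"model.shared.weight": None, "lm_head.weight": None}
fairseq_state = {"args": object(), "model": {}}

for name, state in [
    ("pytorch_model.bin", hf_state),
    ("418M_last_checkpoint.pt", fairseq_state),
]:
    kind = "Fairseq-style" if looks_like_fairseq_checkpoint(state) else "not Fairseq-style"
    print(f"{name}: {kind}")
```

If the check fails, the Fairseq converter is probably the wrong tool for that file, whatever the model inside it is.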
I also tried with the original Fairseq M2M100_418M model from here and got an error.
Script:
import os
import ctranslate2

# relative path to where the script is run from
data_dir = os.path.join(
    "path...",
    "m2m100_original"
)

# original fairseq m2m100 model
# Ref: https://github.com/pytorch/fairseq/tree/master/examples/m2m_100
converter = ctranslate2.converters.FairseqConverter(
    os.path.join(data_dir, "418M_last_checkpoint.pt"), data_dir
)

output_dir = "/path/m2m100_original/ctranslate2_model"
converter.convert(output_dir)
Error:
python3 python/m2m_100_original_converter.py
External language dictionary is not provided; use lang-pairs to infer the set of supported languages. The language ordering is not stable which might cause misalignment in pretraining and finetuning.
Traceback (most recent call last):
File "python/m2m_100_original_converter.py", line 23, in <module>
converter.convert(output_dir)
File "<path>/OpenNMT/CTranslate2/python/ctranslate2/converters/converter.py", line 45, in convert
model_spec = self._load()
File "<path>/OpenNMT/CTranslate2/python/ctranslate2/converters/fairseq.py", line 92, in _load
task = fairseq.tasks.setup_task(args)
File "<path>/Python/3.8/lib/python/site-packages/fairseq/tasks/__init__.py", line 28, in setup_task
return TASK_REGISTRY[task_cfg.task].setup_task(task_cfg, **kwargs)
File "<path>/Python/3.8/lib/python/site-packages/fairseq/tasks/translation_multi_simple_epoch.py", line 106, in setup_task
langs, dicts, training = MultilingualDatasetManager.prepare(
File "<path>/Python/3.8/lib/python/site-packages/fairseq/data/multilingual/multilingual_data_manager.py", line 371, in prepare
dicts[lang] = load_dictionary(
File "<path>/Python/3.8/lib/python/site-packages/fairseq/tasks/fairseq_task.py", line 54, in load_dictionary
return Dictionary.load(filename)
File "<path>/Python/3.8/lib/python/site-packages/fairseq/data/dictionary.py", line 214, in load
d.add_from_file(f)
File "<path>/Python/3.8/lib/python/site-packages/fairseq/data/dictionary.py", line 227, in add_from_file
raise fnfe
File "<path>/Python/3.8/lib/python/site-packages/fairseq/data/dictionary.py", line 224, in add_from_file
with open(PathManager.get_local_path(f), "r", encoding="utf-8") as fd:
FileNotFoundError: [Errno 2] No such file or directory: '<path>/m2m100_original/dict.af.txt'
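The FileNotFoundError above comes from Fairseq looking for per-language dictionary files (dict.af.txt and so on) in data_dir. A later post in this thread passes a single dictionary instead, via --fixed_dictionary model_dict.128k.txt. Here is a hedged sketch that only assembles that command line. The helper name is hypothetical, the flags are the ones that appear in commands posted in this thread, and I am assuming that the flag makes the per-language files unnecessary.

```python
import shlex


def build_convert_command(model_path, data_dir, output_dir,
                          fixed_dictionary=None, force=False):
    """Assemble a ct2-fairseq-converter command as a list of arguments."""
    cmd = [
        "ct2-fairseq-converter",
        "--model_path", model_path,
        "--data_dir", data_dir,
        "--output_dir", output_dir,
    ]
    if fixed_dictionary is not None:
        # Single shared dictionary instead of dict.<lang>.txt files.
        cmd += ["--fixed_dictionary", fixed_dictionary]
    if force:
        cmd.append("--force")
    return cmd


cmd = build_convert_command(
    "418M_last_checkpoint.pt", ".", "m2m_100_418m_ct2",
    fixed_dictionary="model_dict.128k.txt",
)
print(shlex.join(cmd))
# To actually run the conversion: subprocess.run(cmd, check=True)
```

Building the argument list separately keeps the paths easy to swap between the 418M and 1.2B checkpoints.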
✗ aws-fb-test ~ $ ct2-fairseq-converter --model_path /root/fairseq/1.2B_last_checkpoint.pt --data_dir /root/fairseq/ --output_dir /tmp/out --force
Traceback (most recent call last):
File "/root/.pyenv/versions/3.8.1/bin/ct2-fairseq-converter", line 8, in <module>
sys.exit(main())
File "/root/.pyenv/versions/3.8.1/lib/python3.8/site-packages/ctranslate2/bin/fairseq_converter.py", line 18, in main
converters.FairseqConverter(args.model_path, args.data_dir).convert_from_args(args)
File "/root/.pyenv/versions/3.8.1/lib/python3.8/site-packages/ctranslate2/converters/converter.py", line 31, in convert_from_args
return self.convert(
File "/root/.pyenv/versions/3.8.1/lib/python3.8/site-packages/ctranslate2/converters/converter.py", line 45, in convert
model_spec = self._load()
File "/root/.pyenv/versions/3.8.1/lib/python3.8/site-packages/ctranslate2/converters/fairseq.py", line 92, in _load
task = fairseq.tasks.setup_task(args)
File "/root/fairseq/fairseq/tasks/__init__.py", line 44, in setup_task
return task.setup_task(cfg, **kwargs)
File "/root/fairseq/fairseq/tasks/translation_multi_simple_epoch.py", line 125, in setup_task
langs, dicts, training = MultilingualDatasetManager.prepare(
File "/root/fairseq/fairseq/data/multilingual/multilingual_data_manager.py", line 311, in prepare
if args.langtoks is None:
AttributeError: 'Namespace' object has no attribute 'langtoks'
But the biggest model, 12B (12b_last_chk_4_gpus.pt), gives a conversion error. Is it because the model is split across several GPUs?
Traceback (most recent call last):
File "/root/.local/share/virtualenvs/r10-MnTosGMW/bin/my-convert", line 13, in <module>
converter.convert(output_dir)
File "/root/.local/share/virtualenvs/r10-MnTosGMW/lib/python3.8/site-packages/ctranslate2/converters/converter.py", line 45, in convert
model_spec = self._load()
File "/root/.local/share/virtualenvs/r10-MnTosGMW/lib/python3.8/site-packages/ctranslate2/converters/fairseq.py", line 94, in _load
model_spec = _get_model_spec(args)
File "/root/.local/share/virtualenvs/r10-MnTosGMW/lib/python3.8/site-packages/ctranslate2/converters/fairseq.py", line 61, in _get_model_spec
utils.raise_unsupported(reasons)
File "/root/.local/share/virtualenvs/r10-MnTosGMW/lib/python3.8/site-packages/ctranslate2/converters/utils.py", line 16, in raise_unsupported
raise ValueError(message)
ValueError: The model you are trying to convert is not supported by CTranslate2. We identified the following reasons:
Option --arch transformer_wmt_en_de_big_pipeline_parallel is not supported (supported architectures are: transformer_wmt_en_de_big, transformer_tiny, transformer_vaswani_wmt_en_fr_big, transformer_wmt_en_de, transformer, transformer_vaswani_wmt_en_de_big, transformer_iwslt_de_en, transformer_wmt_en_de_big_t2t)
Yes, the 12B version is using a different model architecture in order to distribute it on several GPUs.
For now it is not supported, but I will check how CTranslate2 handles these gigantic models. A 48GB model is several times larger than anything we have tested.
I think the quantized 12B model would barely run on a 16GB GPU. But there are other issues to address before that. For example, the converter currently requires 2 times the model size in memory to run.
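To make the sizes above concrete, here is a rough back-of-the-envelope calculation. It is only a sketch under my own assumptions: exactly 12 billion parameters, 4 bytes per parameter for float32 and 1 byte for int8 quantization, decimal gigabytes, the "2 times" applied to the float32 size, and no allowance for activations or other runtime buffers.

```python
GB = 10**9  # decimal gigabytes (assumption)


def model_size_gb(num_params, bytes_per_param):
    """Approximate size of the weights alone, in GB."""
    return num_params * bytes_per_param / GB


params = 12 * 10**9                      # assumed parameter count for "12B"
fp32_gb = model_size_gb(params, 4)       # float32 weights
int8_gb = model_size_gb(params, 1)       # int8-quantized weights
convert_ram_gb = 2 * fp32_gb             # converter needs ~2x the model size

print(f"float32 weights: {fp32_gb:.0f} GB")        # 48 GB
print(f"int8 weights:    {int8_gb:.0f} GB")        # 12 GB
print(f"RAM to convert:  {convert_ram_gb:.0f} GB") # 96 GB
```

With roughly 12 GB of int8 weights, a 16GB GPU leaves only about 4 GB for everything else, which matches the "barely run" estimate.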
I’m having the same issue as @alexeir: I get an AttributeError: 'Namespace' object has no attribute 'langtoks' error when trying to convert the M2M-100 418M Fairseq model.
$ ls
418M_last_checkpoint.pt model_dict.128k.txt spm.128k.model
$ ct2-fairseq-converter --data_dir . --model_path 418M_last_checkpoint.pt --fixed_dictionary model_dict.128k.txt --output_dir m2m_100_418m_ct2
Traceback (most recent call last):
File "/home/argosopentech/temp/env/bin/ct2-fairseq-converter", line 8, in <module>
sys.exit(main())
File "/home/argosopentech/temp/env/lib/python3.10/site-packages/ctranslate2/converters/fairseq.py", line 340, in main
converter.convert_from_args(args)
File "/home/argosopentech/temp/env/lib/python3.10/site-packages/ctranslate2/converters/converter.py", line 50, in convert_from_args
return self.convert(
File "/home/argosopentech/temp/env/lib/python3.10/site-packages/ctranslate2/converters/converter.py", line 89, in convert
model_spec = self._load()
File "/home/argosopentech/temp/env/lib/python3.10/site-packages/ctranslate2/converters/fairseq.py", line 167, in _load
task = fairseq.tasks.setup_task(args)
File "/home/argosopentech/temp/env/lib/python3.10/site-packages/fairseq/tasks/__init__.py", line 46, in setup_task
return task.setup_task(cfg, **kwargs)
File "/home/argosopentech/temp/env/lib/python3.10/site-packages/fairseq/tasks/translation_multi_simple_epoch.py", line 127, in setup_task
langs, dicts, training = MultilingualDatasetManager.prepare(
File "/home/argosopentech/temp/env/lib/python3.10/site-packages/fairseq/data/multilingual/multilingual_data_manager.py", line 342, in prepare
if args.langtoks is None:
AttributeError: 'Namespace' object has no attribute 'langtoks'
I’m using ctranslate2 2.19.0 with Python 3.10.4 on Ubuntu 22.04.
The conversion works fine with fairseq==0.10.2, but they recently released fairseq==0.12.1, which produces this error. I will look into supporting this newer Fairseq version.
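Until then, a quick way to see which side of that line an environment is on is to check the installed Fairseq version before converting. This is a small sketch, not part of CTranslate2: the function names are hypothetical, and the two version strings are only the ones reported in this thread (other versions are simply unknown here).

```python
from importlib import metadata

KNOWN_WORKING = "0.10.2"  # reported working in this thread
KNOWN_FAILING = "0.12.1"  # reported to raise the 'langtoks' AttributeError


def fairseq_status(version):
    """Classify a Fairseq version string using only reports from this thread."""
    if version == KNOWN_WORKING:
        return "reported working"
    if version == KNOWN_FAILING:
        return "reported failing"
    return "not reported in this thread"


def installed_fairseq_version():
    """Return the installed fairseq version, or None if it is not installed."""
    try:
        return metadata.version("fairseq")
    except metadata.PackageNotFoundError:
        return None


version = installed_fairseq_version()
if version is None:
    print("fairseq is not installed")
else:
    print(f"fairseq {version}: {fairseq_status(version)}")
```

If it reports the failing version, installing the working one (pip install fairseq==0.10.2) is the workaround suggested by the reply above.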