OpenNMT

CTranslate2 conversion of Hugging Face fine-tuned model

Hi Guillaume,
Fine-tuning a Hugging Face model gives a model with the following structure:
config.json
optimizer.pt
pytorch_model.bin
rng_state.pth
scaler.pt
scheduler.pt
source.spm
special_tokens_map.json
target.spm
tokenizer_config.json
trainer_state.json
training_args.bin
vocab.json

Would it be possible to provide a CTranslate2 conversion script for this? It would be great to combine the speed of CTranslate2 with the convenience of fine-tuning Hugging Face pre-trained models.

Hi,

Are you referring to the MarianMT models in Hugging Face?

Yes. To be precise I’m referring to the Opus-MT models pre-trained by the Helsinki-NLP group. When fine-tuned the models have the structure shown above.

Maybe we will add it in the future, but right now you can already add your own converter for these models.

Since the model architecture is unchanged from the original Opus-MT models, you could start from the existing Marian converter and change what is specific to the Hugging Face version. For example, the layer names are different and the model configuration is read from a different file, but the main logic should be the same.


OK, thanks. I will have a go at this.

The latest version added a converter for Hugging Face’s Transformers. The MarianMT models are supported.

See the related documentation: Transformers — CTranslate2 2.17.0 documentation
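For anyone landing on this thread later, the workflow is roughly as follows. This is a minimal sketch: the checkpoint path and output directory names are placeholders, and the exact package versions you need may differ.

```shell
# Install CTranslate2 and the Hugging Face dependencies
# (sentencepiece is needed for the Marian/Opus-MT tokenizers).
pip install ctranslate2 transformers sentencepiece

# Convert the fine-tuned checkpoint directory (the one containing
# pytorch_model.bin, config.json, etc.) to a CTranslate2 model.
ct2-transformers-converter \
    --model path/to/finetuned-model \
    --output_dir ct2_model
```

The converted model can then be loaded in Python with `ctranslate2.Translator("ct2_model")` and run with `translate_batch`, tokenizing the input with the original Hugging Face tokenizer.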


Thanks so much, @guillaumekln. This will be extremely useful for me as I'm doing a lot of fine-tuning. I have just completed an end-to-end conversion and inference run with a fine-tuned Hugging Face model, and the process is straightforward.