Hi Guillaume,
Fine-tuning a Hugging Face model produces a checkpoint directory with the following files:
config.json  rng_state.pth  source.spm  tokenizer_config.json  vocab.json  optimizer.pt  scaler.pt  special_tokens_map.json  trainer_state.json
pytorch_model.bin  scheduler.pt  target.spm  training_args.bin
Is it doable to provide a CTranslate2 conversion script for this? It would be great to combine the speed of CTranslate2 with the convenience of fine-tuning Hugging Face pre-trained models.
Yes. To be precise, I’m referring to the Opus-MT models pre-trained by the Helsinki-NLP group. When fine-tuned, the models have the structure shown above.
Maybe we will add it in the future, but right now you can already add your own converter for these models.
Since the model architecture is unchanged from the original Opus-MT models, you could start from the existing Marian converter and adapt what is specific to the Hugging Face version: for example, the layer names are different and the model configuration should be read from a different location (the config.json shown above), but the main logic should be the same.
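For illustration, here is a minimal sketch of the first step such a converter would take: loading the fine-tuned checkpoint with transformers and listing its parameter names so they can be mapped to the names the Marian converter expects. The checkpoint path is a placeholder, and the name mapping mentioned in the comments is indicative rather than a complete specification.

```python
# Sketch: inspect a fine-tuned Hugging Face Marian checkpoint so its parameter
# names can be mapped to the names the CTranslate2 Marian converter expects.
# "finetuned-opus-mt" is a hypothetical path to the fine-tuned model directory.
from transformers import MarianConfig, MarianMTModel

checkpoint_dir = "finetuned-opus-mt"

# The model configuration is read from config.json in the checkpoint directory,
# instead of the configuration files shipped with the original Opus-MT release.
config = MarianConfig.from_pretrained(checkpoint_dir)
print("layers:", config.encoder_layers, "heads:", config.encoder_attention_heads)

model = MarianMTModel.from_pretrained(checkpoint_dir)

# Hugging Face uses names such as "model.encoder.layers.0.self_attn.q_proj.weight",
# while the original Marian export uses names like "encoder_l1_self_Wq". A custom
# converter would walk this state dict and rename each tensor accordingly.
for name, tensor in model.state_dict().items():
    print(name, tuple(tensor.shape))
```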
Thanks so much, @guillaumekln. This will be extremely useful for me as I’m doing a lot of fine-tuning. I have just completed an end-to-end conversion and inference with a fine-tuned Hugging Face model, and the process is straightforward.
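For anyone landing here later, the inference side can look roughly like this. It is a sketch only: the model directories are placeholders, and tokenization is done with the SentencePiece models (source.spm / target.spm) that ship with the fine-tuned checkpoint, as listed above.

```python
# Sketch of running a converted model with CTranslate2. "ct2_model" is the
# (hypothetical) output directory of the conversion step; the .spm files come
# from the fine-tuned checkpoint.
import ctranslate2
import sentencepiece as spm

sp_source = spm.SentencePieceProcessor(model_file="finetuned-opus-mt/source.spm")
sp_target = spm.SentencePieceProcessor(model_file="finetuned-opus-mt/target.spm")

translator = ctranslate2.Translator("ct2_model", device="cpu")

# Tokenize into subwords, translate, then detokenize the best hypothesis.
tokens = sp_source.encode("Hello world!", out_type=str)
results = translator.translate_batch([tokens])
print(sp_target.decode(results[0].hypotheses[0]))
```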