Convert T5 for CTranslate2

Would it be possible to add conversion support for Google T5 and Facebook BART from huggingface/transformers to CTranslate2? CTranslate2's performance is outstanding, and it is one of very few frameworks that support int8 quantization on GPU (ONNX, TensorFlow, and PyTorch don't support it).

Yes, I think we should look into adding these models. I would first need to review the exact architectures they use to estimate the amount of work.


2 posts were split to a new topic: Question about Intel MKL

A post was split to a new topic: Port OpenNMT-py models to HuggingFace

Both T5 and BART can now be converted to CTranslate2. See the examples here:

https://opennmt.net/CTranslate2/guides/transformers.html
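As a minimal sketch of the workflow described in that guide, the snippet below converts a T5 checkpoint with int8 quantization and runs it. The model name `t5-small` and output directory `t5-small-ct2` are illustrative choices, not something fixed by this thread; the same pattern applies to BART checkpoints.

```python
import ctranslate2
import transformers

# Convert the Hugging Face checkpoint to the CTranslate2 format
# with int8 weight quantization (directory name is an example).
converter = ctranslate2.converters.TransformersConverter("t5-small")
converter.convert("t5-small-ct2", quantization="int8")

# Load the converted model and translate one example sentence.
# T5 expects a task prefix in the input text.
tokenizer = transformers.AutoTokenizer.from_pretrained("t5-small")
translator = ctranslate2.Translator("t5-small-ct2")

text = "translate English to German: The house is wonderful."
tokens = tokenizer.convert_ids_to_tokens(tokenizer.encode(text))
results = translator.translate_batch([tokens])

output_tokens = results[0].hypotheses[0]
print(tokenizer.decode(tokenizer.convert_tokens_to_ids(output_tokens)))
```

The conversion can also be done from the command line with the `ct2-transformers-converter` script that ships with CTranslate2, as shown in the linked guide.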
