OpenNMT Forum

Convert T5 for ctranslate2

Would it be possible to add conversion support for Google T5 and Facebook BART from huggingface/transformers to CTranslate2? CTranslate2's performance seems outstanding, and it is one of the few frameworks that support int8 quantization on GPU (ONNX, TensorFlow, and PyTorch don't support it).
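For illustration, here is a minimal sketch of what such a conversion and quantized GPU inference could look like, following the general pattern of CTranslate2's converters and `Translator` API. It assumes T5 support is available in the converter, and the model names, output path, and prompt are just examples, not anything confirmed in this thread.

```python
# Sketch: convert a Hugging Face T5 checkpoint to CTranslate2 with int8
# weights, then run it on GPU. Model names and paths are placeholders.
import ctranslate2
import transformers

# One-time conversion step; int8 quantization is applied at this point.
converter = ctranslate2.converters.TransformersConverter("t5-small")
converter.convert("t5-small-ct2", quantization="int8", force=True)

# Load the converted model on GPU and translate a single example.
translator = ctranslate2.Translator("t5-small-ct2", device="cuda")
tokenizer = transformers.AutoTokenizer.from_pretrained("t5-small")

text = "translate English to German: The house is wonderful."
tokens = tokenizer.convert_ids_to_tokens(tokenizer.encode(text))
result = translator.translate_batch([tokens])

output_ids = tokenizer.convert_tokens_to_ids(result[0].hypotheses[0])
print(tokenizer.decode(output_ids, skip_special_tokens=True))
```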

Yes, I think we should look into adding these models. I would first need to review the exact architectures they use to estimate the amount of work.

