Hi. I want to run an OpenNMT-tf model on an Android device. We use CTranslate2 for inference on the server and on desktop, but we are having difficulties doing the same for Android (problems compiling SentencePiece, oneDNN, and MKL for Android). I also noticed that TensorFlow Lite export support was added in OpenNMT-tf 2.20. What is the current best practice for running an OpenNMT-tf model on Android devices?
I think CTranslate2 should give the best performance and flexibility, if you manage to compile it for Android. Note that for ARM processors you should compile CTranslate2 with OpenBLAS and/or Ruy (see the build options in the README).
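For what it's worth, a cross-compilation setup could look roughly like the sketch below, using the NDK's CMake toolchain. The NDK path, ABI, and API level are assumptions for a 64-bit ARM target, and the option list is not exhaustive; the README remains the authoritative reference for the build options.

```bash
# Sketch: cross-compile CTranslate2 for 64-bit ARM Android with the NDK
# CMake toolchain. $ANDROID_NDK, the ABI, and the API level are
# assumptions; adjust them to your target device.
git clone --recursive https://github.com/OpenNMT/CTranslate2.git
mkdir CTranslate2/build-android && cd CTranslate2/build-android

# Disable the x86-oriented backends (MKL, oneDNN) and enable Ruy,
# which provides the matrix multiplication kernels on ARM.
cmake .. \
  -DCMAKE_TOOLCHAIN_FILE="$ANDROID_NDK/build/cmake/android.toolchain.cmake" \
  -DANDROID_ABI=arm64-v8a \
  -DANDROID_PLATFORM=android-24 \
  -DWITH_MKL=OFF \
  -DWITH_DNNL=OFF \
  -DWITH_RUY=ON \
  -DBUILD_CLI=OFF
make -j"$(nproc)"
```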
On the other hand, it could be easier to get started with TensorFlow Lite if you don’t get errors while converting the model. However, I’m not sure about the actual performance.
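For the conversion step, the export goes through the `onmt-main` `export` run type. A minimal sketch, assuming a working training configuration in `my_config.yml` (a placeholder name); the exact flag and format names can vary between versions, so check `onmt-main export --help` for your installation:

```bash
# Sketch: export an OpenNMT-tf model to TensorFlow Lite (export format
# introduced in OpenNMT-tf 2.20). my_config.yml is a placeholder for
# your configuration; verify the flags with `onmt-main export --help`.
onmt-main --config my_config.yml --auto_config \
    export --output_dir export/tflite --format tflite
```

The resulting `.tflite` file can then be loaded on the device with the standard TensorFlow Lite Interpreter API.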
I have one more strategic question about CT2: is there a plan to support TFLite in the near future to improve the performance of Transformer models?
Thanks!
Hi @guillaumekln, sorry for the late reply.
TFLite now provides high-performance ML inference on Android; maybe the CT2 dev team has a plan to support this framework?
There is no plan to integrate TensorFlow Lite into CTranslate2. They are two different inference frameworks; you should pick one.