I can see examples where people take a TensorFlow model and convert it to TensorRT to speed up inference. Is it possible to do that with an OpenNMT model?
Thanks
NVIDIA did this with an OpenNMT-lua model and used it in their TensorRT benchmarks. Unfortunately, it required custom CUDA code for decoding that is not available in the public TensorRT release.
Yes, I can see that in their comparison table. Thanks so much!