Does the new version of OpenNMT-py (v3) include the new PyTorch 1.12 Better Transformer (A BetterTransformer for Fast Transformer Inference | PyTorch) that accelerates inference time?
As far as I know, the new version does not include these layers.
@vince62s Any plans to integrate the nn.modules listed in this blog post?
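For context, the fastpath discussed in that blog post can be exercised directly in stock PyTorch (>= 1.12) without any OpenNMT-py change. A minimal sketch, with illustrative sizes; the fastpath kicks in when the encoder is in eval mode, `batch_first=True`, and inference runs under `torch.no_grad()`:

```python
import torch

# Build a standard encoder; batch_first=True is one of the conditions
# for the Better Transformer inference fastpath in PyTorch 1.12+.
layer = torch.nn.TransformerEncoderLayer(d_model=512, nhead=8, batch_first=True)
encoder = torch.nn.TransformerEncoder(layer, num_layers=6)
encoder.eval()  # eval mode is also required for the fastpath

src = torch.rand(2, 10, 512)  # (batch, seq_len, d_model), illustrative values
with torch.no_grad():
    out = encoder(src)  # fused fastpath kernels are used when all conditions hold

print(out.shape)  # same shape as the input: (2, 10, 512)
```

This only covers the built-in `nn.TransformerEncoder`; OpenNMT-py uses its own layer implementations, which is why the fastpath does not apply to it automatically.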
Note that OpenNMT-py models can be converted to CTranslate2 which is our take on accelerated inference. CTranslate2 already implements these optimizations, and more!
While working on v3.0 I tested extensively with the implementation of this function, but it did not bring any improvement over our implementation, which is already optimized.
However, the right answer is: CTranslate2 is the best inference solution at this time for any kind of Transformer model.