outputs.backward(gradOutput) (7)
What's the difference between model.zero_grad() and optim.zero_grad()? (2)
Need a proper understanding of the preprocess step (5)
Having an issue while preprocessing the data (1)
Different RNN architecture for each layer of Encoder (3)
Further Memory Optimization (1)
Coding environment for the framework (2)
Validating a trained model (7)
Efficient implementation for training with uneven batches (3)
CUDA translation support (5)
GPU offloading in CTranslate (1)
How to get the flattened parameters and gradients of the Encoder & Decoder? (Does Encoder:getParameters() work?) (2)
Comparing OpenNMT and PyOpenNMT efficiency (1)
Changing the training criterion of a Seq2Seq model (2)
Adding biased decoding (4)
Attention on a specific word in the context (2) (28)
How should I choose parameters? (1)
Performance analysis number (2)
Improving the gpuid option (5)
Model specification (1)
Remove factored-generation from Beam Search (1)
Training.lua as a library (1)