The About section says that there are additional options on top of the baseline model, and it lists papers whose implementations are said to be available.
Could someone kindly tell me where I can find and use these "options" on top of the base Seq2Seq model?
More specifically, I want to experiment with:
- local attention (local-m / local-p)
- character-based word embeddings
- fast-forward connections
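
For context, by local-p attention I mean the predictive variant from Luong et al. (2015), where the model predicts a window centre p_t and weights alignment scores by a Gaussian around it. A rough NumPy sketch of what I hope is already implemented (all function and parameter names here are my own illustration, not from this repo):

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def local_p_attention(h_t, encoder_states, W_p, v_p, D=2):
    """Sketch of local-p attention (Luong et al., 2015).

    h_t:            decoder hidden state, shape (d,)
    encoder_states: source hidden states, shape (S, d)
    W_p, v_p:       parameters predicting the window centre
    D:              half-width of the attention window
    """
    S = encoder_states.shape[0]
    # Predicted window centre p_t in [0, S]: S * sigmoid(v_p^T tanh(W_p h_t))
    p_t = S / (1.0 + np.exp(-(v_p @ np.tanh(W_p @ h_t))))
    sigma = D / 2.0
    # Dot-product alignment scores against all source states
    align = softmax(encoder_states @ h_t)
    # Gaussian favouring positions near p_t (the paper additionally
    # restricts attention to [p_t - D, p_t + D]; omitted here for brevity)
    positions = np.arange(S)
    gauss = np.exp(-((positions - p_t) ** 2) / (2 * sigma ** 2))
    weights = align * gauss
    weights /= weights.sum()
    # Context vector as the weighted sum of source states
    context = weights @ encoder_states
    return context, weights
```

Local-m would be the simpler monotonic case where p_t is just set to the current target position rather than predicted.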
Any help is highly appreciated.