What is the Kaldi input format? Where can I find an example or definition?

It was mentioned in the release notes for the most recent release of OpenNMT (v0.7) that OpenNMT now “supports arbitrary vectors as inputs using the Kaldi text format on the source side.” I have been unable to find an example of what the Kaldi text format actually looks like. Can someone please provide an example or a link to the specification so that I can see how this input format is defined?

At this stage I would not recommend trying this until the pyramidal brnn model is fixed.
You can try the standard brnn model, but you will be limited by the sequence length in frames.

Having said that, if you are familiar with Kaldi, this refers to the text dump of Kaldi MFCC features.
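For readers who have not seen such a dump, here is a minimal sketch of what a Kaldi text-format feature archive generally looks like and how it could be read. In Kaldi, a text dump is typically produced with something like `copy-feats ark:feats.ark ark,t:feats.txt` (the file names are placeholders). Each utterance is an ID followed by `[`, then one line of floats per frame, with `]` closing the last frame. The sample values and the function name `parse_kaldi_text_ark` are made up for illustration, and real MFCC frames usually have more coefficients (often 13). Exactly how OpenNMT consumes this file is not covered in this thread, so treat the parser as an assumption about the layout, not as OpenNMT's reader.

```python
# Illustrative sample only: two utterances, 3 values per frame.
SAMPLE = """utt1  [
  1.0 2.0 3.0
  4.0 5.0 6.0 ]
utt2  [
  0.5 -0.5 0.25 ]
"""


def parse_kaldi_text_ark(text):
    """Parse Kaldi text-format matrices into {utterance_id: list of frames}."""
    features = {}
    utt_id = None
    rows = []
    for line in text.splitlines():
        tokens = line.split()
        if not tokens:
            continue
        if utt_id is None:
            # Header line: "<utt_id>  [" opens a new matrix.
            utt_id = tokens[0]
            tokens = tokens[1:]
            if tokens and tokens[0] == "[":
                tokens = tokens[1:]
            rows = []
            if not tokens:
                continue
        # A trailing "]" marks the last frame of the current utterance.
        closing = tokens[-1] == "]"
        if closing:
            tokens = tokens[:-1]
        if tokens:
            rows.append([float(t) for t in tokens])
        if closing:
            features[utt_id] = rows
            utt_id = None
    return features


if __name__ == "__main__":
    for utt, frames in parse_kaldi_text_ark(SAMPLE).items():
        print(utt, len(frames), "frames of dimension", len(frames[0]))
```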

Explaining further is beyond the scope of this forum; you may need to read the Kaldi documentation.

What’s up with the pyramidal brnn model?

The first pyramidal implementation summed timesteps when reducing. There is now a new implementation that concatenates them instead, and it seems to behave better, though we are not sure exactly why. More tests are still in progress.
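To make the difference between the two reductions concrete, here is a rough sketch on plain Python lists. It assumes a reduction factor of 2 between layers and drops a trailing unpaired frame; both choices, and the function names `reduce_sum` and `reduce_concat`, are illustrative assumptions and not the actual OpenNMT code.

```python
def reduce_sum(frames):
    """Halve the sequence by summing each adjacent pair of timesteps.

    The feature dimension stays the same. A trailing unpaired frame is dropped.
    """
    pairs = zip(frames[0::2], frames[1::2])
    return [[a + b for a, b in zip(x, y)] for x, y in pairs]


def reduce_concat(frames):
    """Halve the sequence by concatenating each adjacent pair of timesteps.

    The feature dimension doubles. A trailing unpaired frame is dropped.
    """
    pairs = zip(frames[0::2], frames[1::2])
    return [x + y for x, y in pairs]


if __name__ == "__main__":
    frames = [[1, 2], [3, 4], [5, 6], [7, 8]]  # 4 timesteps, dimension 2
    print(reduce_sum(frames))     # 2 timesteps, dimension 2
    print(reduce_concat(frames))  # 2 timesteps, dimension 4
```

Both variants halve the number of timesteps. Summing merges the two frames into one vector of the same size, while concatenation keeps both frames' values side by side at the cost of a wider input to the next layer.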