How to train from sequences of features instead of sentences


(DucAnhLe) #1

I want to train a model on sequences of features instead of sentences. The features vary widely, so I cannot build a dictionary for feature sequences the way I would for sentences. Can anyone help me?


(Guillaume Klein) #2

Can you be more specific about the kind of features you want to use? Are they just continuous vectors?


(DucAnhLe) #3

Hi Guillaume Klein,
Yes, they are continuous vectors. One example of an input is a sequence of points from online handwriting. We also extract more features, so the input is a sequence of real-valued vectors, as follows:
Input
[v1, v2, v3, …, vn]
Output
[c1, c2, c3, …, cm]
where vi is a vector of real numbers and cj is a character in the ground-truth transcription of the handwriting.
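
To make this layout concrete, here is a minimal Python sketch (not OpenNMT code). It assumes each input is a list of equal-sized real-valued vectors and each output is a string. Only the target side needs a dictionary; the source side is just padded to a common length. The names `build_char_vocab`, `encode_target`, `pad_sources` and the special tokens are hypothetical, chosen only for illustration.

```python
from typing import Dict, List, Tuple

Vector = List[float]


def build_char_vocab(targets: List[str]) -> Dict[str, int]:
    """Map every target character to an integer id (hypothetical special tokens first)."""
    vocab = {"<pad>": 0, "<s>": 1, "</s>": 2}
    for text in targets:
        for ch in text:
            if ch not in vocab:
                vocab[ch] = len(vocab)
    return vocab


def encode_target(text: str, vocab: Dict[str, int]) -> List[int]:
    """Turn a ground-truth string [c1, ..., cm] into ids with start/end markers."""
    return [vocab["<s>"]] + [vocab[ch] for ch in text] + [vocab["</s>"]]


def pad_sources(batch: List[List[Vector]]) -> Tuple[List[List[Vector]], List[int]]:
    """Pad feature sequences [v1, ..., vn] with zero vectors to the longest length.

    Returns the padded batch and the original lengths, so padding can be masked later.
    """
    lengths = [len(seq) for seq in batch]
    max_len = max(lengths)
    dim = len(batch[0][0])  # assumes every vector has the same dimension
    padded = [
        seq + [[0.0] * dim for _ in range(max_len - len(seq))]
        for seq in batch
    ]
    return padded, lengths
```

The source vectors are never looked up in a dictionary here; they would be fed to the encoder as they are.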

Thanks for your consideration


(Guillaume Klein) #4

This has been applied in:

and:

Currently this requires a bit of coding to change the data loading part and provide your own inputNetwork that fits your application:
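
As a rough illustration of the idea, and not the actual OpenNMT implementation, one possible input network for vector inputs is a linear projection that takes the place of the word-embedding lookup: each feature vector is mapped to the model dimension before it reaches the encoder. The class name `LinearInputNetwork`, its parameters, and the choice of a linear layer are all assumptions made for this sketch.

```python
import random
from typing import List

Vector = List[float]


class LinearInputNetwork:
    """Hypothetical stand-in for an embedding lookup: projects feature vectors linearly."""

    def __init__(self, feature_dim: int, model_dim: int, seed: int = 0) -> None:
        rng = random.Random(seed)
        # One weight row per output unit, small random initial values.
        self.weights = [
            [rng.uniform(-0.1, 0.1) for _ in range(feature_dim)]
            for _ in range(model_dim)
        ]
        self.bias = [0.0] * model_dim

    def forward(self, sequence: List[Vector]) -> List[Vector]:
        """Project every vector of the sequence from feature_dim to model_dim."""
        return [
            [
                sum(w * x for w, x in zip(row, vec)) + b
                for row, b in zip(self.weights, self.bias)
            ]
            for vec in sequence
        ]


if __name__ == "__main__":
    net = LinearInputNetwork(feature_dim=3, model_dim=4)
    projected = net.forward([[0.5, -1.0, 2.0], [0.0, 0.1, 0.2]])
    print(len(projected), len(projected[0]))
```

In a real toolkit the projection would be a trainable layer; the point here is only that no dictionary is involved on the source side.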


(DucAnhLe) #5

Thanks for your suggestions. I used Im2Text to recognize images of handwriting, but I want to recognize online handwriting (that is, recognize from sequences of points). I will try the second one that you suggested.
Thanks for your quick response.


(jean.senellart) #6

Hi @tintinkool - as @guillaumekln mentioned, the feature is already available in some branches. I will prepare a PR to make it available on master. Stay tuned.


(DucAnhLe) #7

Thank you @jean.senellart and @guillaumekln for your help. Could you share the files from your example in the topic "[WIP] Complex Encoders and vector inputs (used for ASR)"?
I will try to use this feature.

Thanks


(jean.senellart) #8

Hi @tintinkool - the code is available in the PR https://github.com/OpenNMT/OpenNMT/pull/168. I am preparing a short training example and will write a tutorial this week before merging.