Are composite tokens possible?

faiz95ahmed · March 21, 2020, 8:06pm

For seq2seq translation, instead of treating each multi-character token as a unique vocabulary entry, is it possible for every token to be a single character and for the model to output more than one character at a time?

Essentially, if the seq2seq model ordinarily outputs a one-hot vector (with the 1 marking whichever token is predicted), is it instead possible for it to output a vector containing multiple ones?

guillaumekln · April 17, 2020, 12:47pm

How would you know the order in which the characters appear?
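
To make the ordering problem concrete, here is a minimal sketch in plain Python. The four-character vocabulary and the helper names `multi_hot` and `one_hot_sequence` are made up for illustration and are not part of any library. It suggests that a single vector with multiple ones cannot tell apart two outputs that use the same characters in a different order, whereas a sequence of one-hot vectors can.

```python
# Hypothetical toy vocabulary of single-character tokens (an assumption for illustration).
VOCAB = ["a", "b", "c", "t"]

def multi_hot(chars):
    """Encode a group of characters as one vector with a 1 for each character present."""
    present = set(chars)
    return [1 if token in present else 0 for token in VOCAB]

def one_hot_sequence(chars):
    """Encode the same characters as an ordered list of one-hot vectors."""
    return [[1 if token == ch else 0 for token in VOCAB] for ch in chars]

# "cat" and "act" contain the same characters in a different order.
print(multi_hot("cat") == multi_hot("act"))                # True: order and repeats are lost
print(one_hot_sequence("cat") == one_hot_sequence("act"))  # False: order is preserved
```

Recovering the order from a multi-hot output would need extra information, such as one output slot per position, which is effectively what emitting one token per decoding step already provides.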