The input_sentence_size parameter in spm.SentencePieceTrainer.Train

Help me figure out the "input_sentence_size" parameter: what exactly is it used for, how should it be set, and why? How does it affect tokenization?
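For reference, here is a minimal sketch of the kind of call in question. As far as the SentencePiece training options describe it, input_sentence_size caps how many sentences the trainer loads from the corpus (0, the default, loads everything), and shuffle_input_sentence makes that cap a random sample rather than the first N lines. The helper names, the file paths, and the vocab_size value below are hypothetical placeholders, not recommended settings.

```python
def build_train_args(corpus_path, model_prefix,
                     input_sentence_size=0, shuffle=True):
    """Collect keyword arguments for spm.SentencePieceTrainer.train.

    build_train_args is a hypothetical helper, not part of SentencePiece.
    """
    return {
        "input": corpus_path,            # plain-text corpus, one sentence per line
        "model_prefix": model_prefix,    # writes <prefix>.model and <prefix>.vocab
        "vocab_size": 8000,              # placeholder value
        # Maximum number of sentences the trainer loads; 0 means "use all".
        "input_sentence_size": input_sentence_size,
        # Sample the sentences randomly instead of taking them in file order.
        # Only relevant when input_sentence_size > 0.
        "shuffle_input_sentence": shuffle,
    }


def train_tokenizer(corpus_path, model_prefix, input_sentence_size=0):
    """Train a model on at most `input_sentence_size` sentences (hypothetical wrapper)."""
    import sentencepiece as spm  # imported here so the helper above works without it

    spm.SentencePieceTrainer.train(
        **build_train_args(corpus_path, model_prefix, input_sentence_size)
    )
```

Calling train_tokenizer("corpus.txt", "tok", input_sentence_size=1_000_000) would then train on a sample of at most one million sentences. The parameter only changes which sentences the vocabulary is learned from, so it should affect tokenization only indirectly, through the pieces that end up in the trained model.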
