Batch type definition

In the train documentation, this is the description of batch type:

--batch_type {sents,tokens}, -batch_type {sents,tokens}
Batch grouping for batch_size. Standard is sents.
Tokens will do dynamic batching (default: sents)
What is batch type? What does it mean to choose sents or tokens as the batch type, and what is the difference between them?

It is the way the batches are built from your input data.
-batch_type sents -batch_size 32 will create batches of 32 sentences every time.
-batch_type tokens -batch_size 4096 will fill batches up to a maximum of 4096 tokens, so the number of sentences per batch is not fixed: it depends on how long the sentences are.
Token batching, in combination with the pooling mechanism (sample N x batch_size examples from your dataset, sort them by length, build batches, shuffle batches), allows for better hardware utilization and overall training speed.
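
To make the difference concrete, here is a minimal Python sketch of the two batching modes and the pooling steps described above. It is only an illustration, not the toolkit's actual code: the function names, the `pool_size` parameter (the number of examples sampled per pool) and the choice to count raw tokens without padding are all assumptions made for this example.

```python
import random


def batch_by_sents(examples, batch_size):
    """Sentence batching: every batch holds batch_size sentences (the last may be smaller)."""
    return [examples[i:i + batch_size] for i in range(0, len(examples), batch_size)]


def batch_by_tokens(examples, batch_size):
    """Token batching: keep adding sentences until the next one would push
    the batch past batch_size tokens. Assumption: raw token counts, no padding.
    A single sentence longer than batch_size still gets a batch of its own."""
    batches, current, current_tokens = [], [], 0
    for sent in examples:
        n_tokens = len(sent)
        if current and current_tokens + n_tokens > batch_size:
            batches.append(current)
            current, current_tokens = [], 0
        current.append(sent)
        current_tokens += n_tokens
    if current:
        batches.append(current)
    return batches


def pooled_batches(examples, batch_size, batch_type, pool_size, seed=0):
    """Pooling: take pool_size examples, sort them by length, build batches,
    then shuffle the batches so training does not see lengths in order."""
    rng = random.Random(seed)
    make_batches = batch_by_tokens if batch_type == "tokens" else batch_by_sents
    result = []
    for start in range(0, len(examples), pool_size):
        pool = sorted(examples[start:start + pool_size], key=len)
        batches = make_batches(pool, batch_size)
        rng.shuffle(batches)
        result.extend(batches)
    return result


# Toy corpus: eight "sentences" with 3, 9, 2, 7, 5, 4, 8 and 1 tokens.
corpus = [["w"] * n for n in (3, 9, 2, 7, 5, 4, 8, 1)]

sent_batches = pooled_batches(corpus, batch_size=2, batch_type="sents", pool_size=8)
token_batches = pooled_batches(corpus, batch_size=10, batch_type="tokens", pool_size=8)

print([len(b) for b in sent_batches])                       # sentences per batch
print([sum(len(s) for s in b) for b in token_batches])      # tokens per batch
```

With sents, each batch has the same number of sentences but a varying number of tokens. With tokens, the token count per batch is capped, so short sentences get packed together and long ones end up in smaller batches. Sorting by length inside each pool keeps sentences of similar length together, which is why less compute is wasted on padding.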
