Shards vs. batch size - Image2Text

Hi there,

I am new to OpenNMT-py and am trying to use the Image2Text v1.0.0.rc1 model with my own dataset.
I am confused about one point. In the preprocessing script, the entire dataset is split into “shards” of 500 data points each, while in the training script the “batch size” is set to 20. Could someone explain what is going on here? Is one shard at a time passed to the training step and then split into batches of 20?
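
To make the question concrete, here is a minimal sketch of what I imagine is happening. It is plain Python and not OpenNMT-py's actual code: the function names `make_shards` and `iter_batches` and the toy dataset of 1200 items are made up for illustration, and whether the real data loader behaves this way is exactly what I am asking.

```python
# Hypothetical illustration only: this is NOT OpenNMT-py code.
# It shows my assumption that shards are loaded one at a time
# and each shard is then cut into batches.

def make_shards(dataset, shard_size):
    """Split the dataset into consecutive shards of at most shard_size items."""
    return [dataset[i:i + shard_size] for i in range(0, len(dataset), shard_size)]


def iter_batches(shards, batch_size):
    """Yield batches of at most batch_size items, one shard at a time."""
    for shard in shards:  # only one shard is "in memory" per outer step
        for i in range(0, len(shard), batch_size):
            yield shard[i:i + batch_size]


dataset = list(range(1200))           # toy stand-in for 1200 image/text pairs
shards = make_shards(dataset, 500)    # shard size used in preprocessing
batches = list(iter_batches(shards, 20))  # batch size used in training

print([len(s) for s in shards])  # shard sizes: 500, 500, 200
print(len(batches))              # 25 + 25 + 10 = 60 batches
```

Under this assumption, a full shard of 500 data points would give 500 / 20 = 25 batches before the next shard is loaded.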

Thanks in advance!