I have a fairly small vocabulary (<100) and a solid number of samples (10k). How can I estimate how many iterations I need for a basic input -> output mapping (no word features)?
I think you should just set a high iteration count, monitor the metric you care about (e.g. the evaluation loss), and manually stop the training once that metric meets your expectations.
Then, for future training runs, you can use the iteration count at which you stopped to set a more precise value.
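If you would rather not watch the run by hand, the same idea can be automated as early stopping. Below is a minimal, framework-agnostic sketch, not tied to any particular toolkit: `train_step` and `eval_loss` are hypothetical callbacks you would wire to your own training code, and the defaults for `eval_every`, `patience`, and `min_delta` are placeholder assumptions to tune.

```python
from typing import Callable, Tuple


def train_with_early_stopping(
    train_step: Callable[[int], None],   # hypothetical: runs one training iteration
    eval_loss: Callable[[], float],      # hypothetical: returns current evaluation loss
    max_iterations: int = 100_000,       # deliberately high upper bound
    eval_every: int = 100,               # evaluate every N iterations (assumed value)
    patience: int = 5,                   # evaluations without improvement before stopping
    min_delta: float = 1e-4,             # smallest decrease that counts as improvement
) -> Tuple[int, float]:
    """Train until the evaluation loss stops improving.

    Returns (best_iteration, best_loss) so the iteration count can be
    reused as a starting point for later runs.
    """
    best_loss = float("inf")
    best_iteration = 0
    evals_without_improvement = 0

    for iteration in range(1, max_iterations + 1):
        train_step(iteration)

        # Only check the metric at evaluation intervals.
        if iteration % eval_every != 0:
            continue

        loss = eval_loss()
        if loss < best_loss - min_delta:
            # Meaningful improvement: remember it and reset the counter.
            best_loss = loss
            best_iteration = iteration
            evals_without_improvement = 0
        else:
            evals_without_improvement += 1
            if evals_without_improvement >= patience:
                break  # the loss has plateaued

    return best_iteration, best_loss
```

The returned `best_iteration` is the "knowledge" mentioned above: on a similar dataset you could set the iteration limit slightly above it instead of guessing.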