Training a back-translation model separately

I was wondering if anybody has come across any research on, or knows of any benefit in, training a model solely on back-translated data and then ensembling it with the base model.

The usual approach is to add the synthetic back-translated corpus on top of the parallel corpus and continue training the base model on the combined data.
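
To be concrete, by "the usual way" I mean something like the sketch below: concatenate the parallel and back-translated pairs (optionally oversampling the small parallel corpus) and shuffle before training. File names like `parallel.src` / `bt.src` are just placeholders, not my actual setup.

```python
# Minimal sketch of the usual data-mixing step, assuming plain-text,
# line-aligned source/target files. All paths are placeholders.
import random

def mix_corpora(parallel_src, parallel_tgt, synth_src, synth_tgt,
                out_src, out_tgt, parallel_oversample=1, seed=1234):
    """Concatenate parallel and back-translated pairs, optionally
    oversampling the parallel data, then shuffle the result."""
    def read_pairs(src_path, tgt_path):
        with open(src_path, encoding="utf-8") as fs, \
             open(tgt_path, encoding="utf-8") as ft:
            return list(zip(fs.read().splitlines(), ft.read().splitlines()))

    pairs = read_pairs(parallel_src, parallel_tgt) * parallel_oversample
    pairs += read_pairs(synth_src, synth_tgt)

    random.Random(seed).shuffle(pairs)
    with open(out_src, "w", encoding="utf-8") as fs, \
         open(out_tgt, "w", encoding="utf-8") as ft:
        for s, t in pairs:
            fs.write(s + "\n")
            ft.write(t + "\n")

# Example usage (placeholder file names):
# mix_corpora("parallel.src", "parallel.tgt", "bt.src", "bt.tgt",
#             "mixed.src", "mixed.tgt", parallel_oversample=2)
```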

Would there be any benefit in training a second model from scratch on the back-translated data only, and then merging (ensembling) it with the base model?
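
What I have in mind by "merging it back" is roughly decoder-level ensembling, i.e. averaging the two models' per-step probabilities at translation time, along the lines of this sketch. The toy classes, vocabulary size, and token ids are only placeholders standing in for the real NMT models.

```python
# Rough sketch of ensembling a base model and a back-translation-only
# model at decode time by averaging their next-token probabilities.
import torch
import torch.nn as nn

VOCAB = 100  # placeholder vocabulary size

class ToyDecoder(nn.Module):
    """Stand-in for a trained NMT model's next-token distribution."""
    def __init__(self):
        super().__init__()
        self.proj = nn.Linear(8, VOCAB)

    def next_token_logits(self, prefix):
        # A real model would condition on the source sentence and the
        # decoded prefix; here we just produce some logits.
        h = torch.randn(1, 8)
        return self.proj(h)

def ensemble_greedy_decode(models, weights, max_len=20, eos_id=2):
    """Greedy decoding with a weighted average of the models' probabilities."""
    prefix = [0]  # start-of-sentence id (placeholder)
    for _ in range(max_len):
        probs = sum(w * torch.softmax(m.next_token_logits(prefix), dim=-1)
                    for m, w in zip(models, weights))
        next_id = int(probs.argmax(dim=-1))
        prefix.append(next_id)
        if next_id == eos_id:
            break
    return prefix

base_model = ToyDecoder()  # would be trained on the 100k parallel corpus
bt_model = ToyDecoder()    # would be trained only on the back-translated corpus
print(ensemble_greedy_decode([base_model, bt_model], weights=[0.5, 0.5]))
```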

For context, the parallel corpus is about 100k sentence pairs (on which the base model is trained) and the monolingual corpus is about 1 million sentences (from which the back-translated data is generated), so it's a low-resource language setting.