At what stage of the learning rate would I want to switch to my domain-specific corpus for transfer learning? My parent corpus is 2.8M, and my domain specific has 40K. I was going to switch over when the learning rate is decreasing and around 0.0002, but I really have no idea how to choose when. For reference, the learning rate peaks at 0.001 (edited my original post to reflect this).
I think it’s easier to reason in number of training steps. Maybe start with 100k training steps on generic data before starting the adaptation. As you gain more experience, you will probably find a better value.