Hi there,
I’m trying to reset the learning rate after updating my model with a new vocabulary (to do fine-tuning).
I’m executing these commands:
onmt-update-vocab --model_dir run_gen_bpe64mix_enfr/ --output_dir run_genind_bpe64mix_enfr/ --src_vocab gen_bpe64mix_enfr_en_vocab.txt --tgt_vocab gen_bpe64mix_enfr_fr_vocab.txt --new_src_vocab genind_bpe64mix_enfr_en_vocab.txt --new_tgt_vocab genind_bpe64mix_enfr_fr_vocab.txt --mode replace
And then:
onmt-main train_and_eval --model_type TransformerFP16 --auto_config --config config_genind_bpe64mix_enfr.yml
where config_genind_bpe64mix_enfr.yml is:
data:
  eval_features_file: genind_bpe64mix_enfr_en_training_set_val.txt
  eval_labels_file: genind_bpe64mix_enfr_fr_training_set_val.txt
  source_words_vocabulary: genind_bpe64mix_enfr_en_vocab.txt
  target_words_vocabulary: genind_bpe64mix_enfr_fr_vocab.txt
  train_features_file: genind_bpe64mix_enfr_en_training_set_train.txt
  train_labels_file: genind_bpe64mix_enfr_fr_training_set_train.txt
eval:
  batch_size: 32
  eval_delay: 18000
  exporters: last
infer:
  batch_size: 32
  bucket_width: 5
model_dir: run_genind_bpe64mix_enfr/
params:
  average_loss_in_time: true
  beam_width: 4
  decay_params:
    model_dim: 512
    warmup_steps: 4000
  decay_type: noam_decay_v2
  label_smoothing: 0.1
  learning_rate: 2.0
  length_penalty: 0.6
  optimizer: LazyAdamOptimizer
  optimizer_params:
    beta1: 0.9
    beta2: 0.998
score:
  batch_size: 64
train:
  average_last_checkpoints: 5
  batch_size: 3072
  batch_type: tokens
  bucket_width: 1
  effective_batch_size: 25000
  keep_checkpoint_max: 5
  maximum_features_length: 100
  maximum_labels_length: 100
  sample_buffer_size: -1
  save_summary_steps: 100
  train_steps: 500000
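For reference, my understanding of noam_decay_v2 is that the effective learning rate is a function of the global training step (linear warmup, then inverse-square-root decay), so the value used at resume is determined by whatever step counter is carried over from the old checkpoint. Roughly (my approximation, not the exact OpenNMT-tf implementation):

def noam_lr(step, learning_rate=2.0, model_dim=512, warmup_steps=4000):
    # Linear warmup up to warmup_steps, then decay as 1/sqrt(step).
    return learning_rate * model_dim ** -0.5 * min(step ** -0.5, step * warmup_steps ** -1.5)

print(noam_lr(4000))    # ~0.0014 around the end of warmup
print(noam_lr(300000))  # ~0.00016 late in training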
When the in-domain model starts training, the initial learning rate equals the last value it had in the out-of-domain model, instead of restarting from the beginning of the schedule.
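The only workaround I can think of is to manually zero out the step counter in the converted checkpoint before launching training. A rough, untested sketch, assuming the step is stored under the standard Estimator variable name global_step (this keeps all other variables, including the optimizer slots, unchanged):

import numpy as np
import tensorflow as tf

checkpoint_path = tf.train.latest_checkpoint("run_genind_bpe64mix_enfr/")
reader = tf.train.load_checkpoint(checkpoint_path)

with tf.Graph().as_default():
    new_variables = []
    for name, _ in tf.train.list_variables(checkpoint_path):
        value = reader.get_tensor(name)
        if name == "global_step":
            # Zeroing the step should make noam_decay_v2 restart from the warmup phase.
            value = np.zeros_like(value)
        new_variables.append(tf.Variable(value, name=name))
    saver = tf.train.Saver(new_variables)
    with tf.Session() as sess:
        sess.run(tf.global_variables_initializer())
        saver.save(sess, checkpoint_path)

But I'm not sure this is the intended way to do it.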
Any help on how to reset an updated model with fresh hyperparameters?