Domain adaptation: Unable to converge

Hi,

I am using opennmt-tf to implement domain adaptation. When I train the out-of-domain model, everything seems fine and the test scores are reasonably good (about 55 BLEU). But when I start fine-tuning (after updating the vocabulary by replacement, not merging), there is a point during training when the loss switches from decreasing to increasing. When I evaluate on my in-domain test data, the scores (BLEU, NIST, TER, CHARCUT) are absolutely terrible.

I first thought it could be a case of overfitting, but I am providing about 9,000 validation pairs. Then I thought the learning rate might be too large, but I am using noam_decay, so I don’t understand why the loss starts growing after some training.
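For what it’s worth, if I understand the Noam schedule correctly (and assuming noam_decay_v2 follows the standard formula), the learning rate with my settings would be roughly:

    lr(step) = learning_rate * model_dim^(-0.5) * min(step^(-0.5), step * warmup_steps^(-1.5))
             = 2.0 * 512^(-0.5) * min(step^(-0.5), step * 4000^(-1.5))

That peaks around step 4000 (at roughly 1.4e-3) and only decreases afterwards, so I don’t see how the learning rate alone could explain the loss going up later in training.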

I am using a TransformerBigFP16 model with a BPE vocabulary of 16,000 tokens (built with SentencePiece). The out-of-domain dataset contains about 3.5M sentence pairs and the in-domain dataset about 350K pairs.

I have run out of ideas, so any suggestion would be much appreciated.
Thanks for your help!

I paste my configuration below in case it helps:
data:
  eval_features_file: gen_bpe16_enfr_en_training_set_val.txt
  eval_labels_file: gen_bpe16_enfr_fr_training_set_val.txt
  source_words_vocabulary: gen_bpe16_enfr_en_vocab.txt
  target_words_vocabulary: gen_bpe16_enfr_fr_vocab.txt
  train_features_file: gen_bpe16_enfr_en_training_set_train.txt
  train_labels_file: gen_bpe16_enfr_fr_training_set_train.txt
eval:
  batch_size: 32
  eval_delay: 18000
  exporters: last
infer:
  batch_size: 32
  bucket_width: 5
model_dir: run_gen_bpe16_enfr/
params:
  average_loss_in_time: true
  beam_width: 4
  decay_params:
    model_dim: 512
    warmup_steps: 4000
  decay_type: noam_decay_v2
  label_smoothing: 0.1
  learning_rate: 2.0
  length_penalty: 0.6
  optimizer: LazyAdamOptimizer
  optimizer_params:
    beta1: 0.9
    beta2: 0.998
score:
  batch_size: 64
train:
  average_last_checkpoints: 5
  batch_size: 3072
  batch_type: tokens
  bucket_width: 1
  effective_batch_size: 25000
  keep_checkpoint_max: 20
  maximum_features_length: 100
  maximum_labels_length: 100
  sample_buffer_size: -1
  save_summary_steps: 100
  train_steps: 500000

Hi,

Can you post the log containing the training loss?

Do you mean the loss values I am pasting below, or the whole log? It’s a pretty long one…

3.594903
2.3407786
2.8953185
2.2123384
2.1265566
2.0729685
2.0836256
2.0877967
2.0437484
2.1304693
2.0340884
2.0361493
2.5588007
2.0783994
2.0088189
1.9710742
1.927145
1.968841
1.9275397
1.9379642
1.9214863
1.8963902
1.7913756
1.9321657
1.8806634
1.9029595
1.894307
1.887802
1.8232077
1.8642011
1.7917312
1.8547529
1.8165398
1.8424934
1.8619095
1.8375453
1.820529
1.8374084
1.857864
1.8274229
1.912818
1.8122609
1.8037765
1.8357688
1.8092453
1.7883942
1.7858319
1.8090591
1.7866855
1.7988394
1.790377
1.7721804
1.8794479
1.7252439
1.8504361
1.7900478
1.807195
1.6605126
1.7777212
1.7502743
1.7703589
1.7505311
1.7545028
1.7497519
1.7269424
1.767694
1.7074696
1.7796307
1.7315817
1.7087538
1.718534
1.7185166
1.753002
1.6354934
1.7325753
1.6998314
1.7119218
1.6720655
1.9107888
1.7194217
1.7138324
1.687731
1.69345
1.6857113
1.6643233
1.7184142
1.6756268
1.6791601
1.6931643
1.7102078
1.6825179
1.6959205
1.6861043
1.6456906
1.6671295
1.6921798
1.6400843
1.6901145
1.6792969
1.6705881
1.6474017
1.6161093
1.6577443
1.6600204
1.6451666
1.6161939
1.6668515
1.661199
1.6397086
1.6781694
1.6462811
1.6568234
1.6376269
1.6492486
1.6426609
1.6176004
1.6675926
1.6429777
1.6713992
1.58966
1.5879731
1.6502807
1.6329719
1.6412604
1.6116272
1.6283773
1.6142501
1.6125228
1.6053187
1.5776653
1.6064944
1.6135114
1.5704806
1.6134452
1.7249973
1.5862668
1.6465307
1.6038868
1.6208088
1.5688549
1.6214094
1.6094334
1.6124468
1.6252387
1.6078013
1.6254953
1.6153181
1.55963
1.632555
1.6316885
1.5702282
1.6128315
1.5936102
1.6269352
1.5912563
1.6101131
1.5911793
1.5781463
1.5944788
1.6034952
1.5887034
1.5710433
1.5953568
1.6009209
1.5493222
1.5847436
1.6045063
1.5902575
1.5914891
1.5785329
1.5711986
1.5820819
1.5712925
1.572539
1.582708
1.5532155
1.5618569
1.5788212
1.5929981
1.556255
1.5522785
1.5942383
1.4820292
1.5492601
1.5803652
1.5789254
1.5630046
1.5260146
1.5686256
1.6402465
1.5561231
1.5609951
1.5455523
1.5571892
1.5684619
1.5466952
1.5362103
1.5424316
1.5540032
1.5492473
1.572797
1.5622905
1.5504674
1.515492
1.5734903
1.5470386
1.537569
1.544093
1.4993469
1.534531
1.5621783
1.5489739
1.5661373
1.5227611
1.5275675
1.5524688
1.5254643
1.5810817
1.6925077
1.5287985
1.5509005
1.498908
1.5493314
1.492324
1.5285941
1.5422301
1.50281
1.5235866
1.5462353
1.5360986
1.5128922
1.5349364
1.5294535
1.5612295
1.525029
1.5212809
1.5302099
1.5068693
1.5393982
1.5352612
1.5191729
1.5114306
1.5222597
1.5222449
1.5121636
1.5299044
1.5361862
1.5261192
1.5322522
1.5219676
1.506899
1.5096841
1.5203928
1.4876577
1.5256895
1.5161734
1.4896113
1.5093237
1.5283037
1.5206901
1.5041708
1.5037247
1.5136378
1.4923104
1.7075827
1.5025994
1.5146266
1.491577
1.5041102
1.509665
1.5145009
1.549818
1.5157326
1.4889479
1.5199704
1.4998243
1.4963125
1.4914149
1.5078465
1.4995509
1.5042605
1.4296525
1.4797763
1.5053706
1.5222145
1.5024145
1.4927297
1.5028638
1.5101964
1.5031959
1.506136
1.4964578
1.461403
1.4677688
1.5039409
1.4894185
1.4874637
1.5009094
1.4737407
1.5057094
1.4658473
1.4810783
1.4976552
1.5081654
1.4924518
1.4937379
1.4985969
1.4822451
1.4908594
1.4907442
1.4834429
1.4753208
1.5010669
1.4867032
1.4785516
1.4925354
1.501946
1.4775367
1.4838648
1.4953445
1.4912685
1.5010566
1.4936533
1.483069
1.495883
1.4959817
1.466005
1.5006047
1.4911327
1.476962
1.4845725
1.4796937
1.5086839
1.4914914
1.482675
1.4824442
1.4689919
1.4724559
1.481507
1.4679762
1.4755939
1.4843066
1.481433
1.4908712
1.4693519
1.4603184
1.469499
1.4877101
1.5252502
1.5129124
1.4915919
1.5036062
1.4872961
1.449856
1.4908562
1.4726233
1.4814329
1.479414
1.4767859
1.4811969
1.4651816
1.4748614
1.4708539
1.4836919
1.4761267
1.469212
1.4746183
1.4700763
1.4418243
1.4820246
1.4901788
1.4545588
1.4714909
1.4635488
1.4580024
1.475547
1.4842217
1.4553908
1.4797525
1.4730936
1.4275364
1.4722524
1.4863393
1.4672709
1.4603734
1.4529276
1.4623462
1.4663906
1.4613012
1.4718297
1.4485158
1.6452693
1.4679565
1.457354
1.4651926
1.4536006
1.4495981
1.4690624
1.4631987
1.4813201
1.4592739
1.4549683
1.4728363
1.4516805
1.4410368
1.4703702
1.4712188
1.4585133
1.4668179
1.4616519
1.4636463
1.4606409
1.4738965
1.4543793
1.5740632
1.5103606
1.4697765
1.4453198
1.4537894
1.4498346
1.4496708
1.4794157
1.465363
1.4521894
1.4834001
1.4508702
1.4662228
1.4614162
1.457515
1.4580246
1.4302828
1.4571804
1.5330914
1.4446019
1.6226892
1.4772918
1.4493704
1.4521068
1.441955
1.4468641
1.4529816
1.4625666
1.5016012
1.4681853
1.4646106
1.4370311
1.4612696
1.49827
1.4629244
1.4542552
1.4431652
1.4478883
1.456505
1.4580584
1.4387
1.452435
1.4625996
1.4434422
1.4473315
1.4653184
1.4489108
1.4501019
1.4340819
1.4603174
1.4437929
1.4430727
1.439629
1.4320654
1.45877
1.468
1.4583206
1.4531255
1.4513791
1.4427067
1.4515268
1.6053417
1.4598228
1.4452904
1.444906
1.4502949
1.449289
1.4619374
1.44894
1.4449693
1.4633993
1.4593542
1.4384185
1.4474936
1.4688485
1.4332088
1.4370545
1.466437
1.4706925
1.4262031
1.4491456
1.4507972
1.4334995
1.562538
1.4372131
1.4671872
1.4516114
1.4490012
1.4529966
1.4627879
1.4553913
1.4534174
1.4467103
1.4394287
1.5529437
1.4388244
1.4498674
1.4561867
1.4465406
1.4561027
1.4720691
1.4619524
1.4282289
1.4488758
1.4506297
1.4344664
1.4364141
1.4490744
1.4468197
1.4443972
1.4380426
1.4651495
1.4412276
1.4437361
1.4443223
1.4378793
1.4380435
1.4327257
1.4332561
1.6172622
1.455909
1.4420382
1.4484597
1.445128
1.4412497
1.4463905
1.433346
1.4416218
1.4352393
1.4414109
1.4369113
1.6325972
1.434018
1.443072
1.4414389
1.441422
1.448397
1.4375051
1.4330922
1.4417619
1.4421206
1.4489983
1.4449341
1.4288872
1.4441645
1.4290761
1.4350106
1.4566166
1.4398708
1.4446201
1.4427495
1.4353753
1.4352607
1.4415789
1.5522138
1.4386512
1.446007
1.4442348
1.4437085
1.4473714
1.4404213
1.4363979
1.4432471
1.440109
1.4361914
1.4435626
1.4675442
1.4650527
1.4434155
1.4333807
1.4508888
1.4417849
1.4437664
1.439161
1.4280953
1.441249
1.5481644
1.4416177
1.4337462
1.4469091
1.4279463
1.4416434
1.4849372
1.4386212
1.4311676
1.432493
1.4398062
1.4372338
1.4426756
1.4325567
1.4393929
1.4394438
1.4385386
1.4327275
1.4797231
1.4383793
1.4282438
1.4377487
1.4324051
1.5435121
1.5231067
1.5916307
1.4554245
1.4294605
1.4523509
1.4300779
1.4195893
1.4340476
1.437964
1.4381729
1.4304501
1.4370794
1.422253
1.4322585
1.4275037
1.4291993
1.4435349
1.456133
1.4306614
1.4391252
1.4467219
1.4440641
1.4214399
1.439323
1.4446862
1.4373149
1.4365478
1.4404327
1.4547325
1.4301479
1.4297435
1.4306138
1.4190532
1.441621
1.4401999
1.4302356
1.4446363
1.431734
1.431936
1.4426165
1.4568148
1.4265156
1.4477272
1.4925321
1.43064
1.4302974
1.4526345
1.4299171
1.4282317
1.4449158
1.4334954
1.4326231
1.439549
1.4288237
1.4299335
1.4437766
1.4567251
1.4218267
1.4341716
1.4345425
1.4263391
1.4437691
1.4289343
1.4407213
1.4412398
1.4800838
1.4478122
1.4365485
1.429808
1.4183162
1.4317615
1.4289732
1.4444015
1.4438987
1.4151514
1.4213588
1.455722
1.4456983
1.4247888
1.4279443
1.4259605
1.4322175
1.4283801
1.449236
1.4540402
1.4280304
1.4333668
1.4477072
1.4154612
1.4655932
1.4262
1.4565111
1.4359047
1.4481908
1.4300059
1.4423356
1.4287368
1.4389864
1.4421853
1.4193833
1.452519
1.4443613
1.4287579
1.5366303
1.4372275
1.4298834
1.431316
1.4452411
1.4381374
1.5134122
1.4313304
1.4180427
1.4327458
1.4298365
1.434255
1.4393088
1.4294001
1.4255598
1.4480594
1.4301572
1.4327781
1.4267067
1.4455262
1.4355428
1.4394785
1.4376161
1.4516312
1.4212103
1.4329374
1.4332637
1.4326224
1.4837381
1.4267802
1.4232924
1.4431456
1.4357734
1.418894
1.4874542
1.4590575
1.4512664
1.4320278
1.457648
1.4392639
1.4328761
1.4345351
1.4278533
1.4172503
1.4300101
1.5205479
1.4280523
1.429677
1.4446198
1.427476
1.4384875
1.4486434
1.4535363
1.4430494
1.4437739
1.436394
1.4422867
1.4266157
1.433103
1.4382702
1.4286815
1.4325857
1.454863
1.439379
1.4330215
1.4354179
1.4281516
1.4312716
1.4304644
1.4432539
1.4309573
1.4426574
1.4755347
1.4811571
1.4325403
1.4249622
1.430242
1.428812
1.4415901
1.4264853
1.4325554
1.4389927
1.4424342
1.5258416
1.430548
1.4254375
1.4371729
1.4252653
1.4254698
1.4280188
1.4335843
1.4411137
1.4372221
1.4208105
1.4392709
1.4602363
1.4287832
1.4306442
1.4229183
1.4270332
1.4418688
1.425751
1.4273802
1.4375482
1.43324
1.4200515
1.4361829
1.4521646
1.4294841
1.4584118
1.430477
1.4186697
1.4238728
1.4342108
1.4272654
1.4339178
1.4320453
1.4215683
1.4259708
1.4251779
1.4331337
1.4329731
1.4193931
1.437521
1.432459
1.4425505
1.4283979
1.437882
1.4714596
1.4269168
1.4325976
1.4289672
1.4473345
1.4352353
1.4385118
1.4240247
1.4290266
1.4409965
1.43464
1.4383968
1.45007
1.4377267
1.4136128
1.4313048
1.4327726
1.6139371
1.4378706
1.4240336
1.4260404
1.4174615
1.42412
1.4534802
1.414313
1.4301958
1.4510207
1.43058
1.4194685
1.4146079
1.4319302
1.4411244
1.4309504
1.4384596
1.4481399
1.4289345
1.4325562
1.4397106
1.4759903
1.4559891
1.4316484
1.4355843
1.4333578
1.4367121
1.5092766
1.4194288
1.4342849
1.4342674
1.447995
1.46421
1.4294878
1.4256122
1.4446355
1.4499722
1.4285418
1.4300586
1.471328
1.4350165
1.4323115
1.5555149
1.4885391
1.4385104
1.4487816
1.4402086
1.424456
1.5020901
1.4955158
1.4282522
1.5823045
1.4299144
1.4354537
1.4720755
1.4179854
1.4340147
1.4274633
1.436055
1.4448941
1.5190694
1.4302545
1.4295604
1.4366513
1.4869025
1.4650283
1.4352337
1.4370152
1.4289362
1.4318666
1.4254918
1.4272165
1.4410566
1.4254633
1.4255022
1.4272168
1.4335577
1.4458566
1.4363673
1.6254889
1.4254948
1.4326738
1.4239694
1.4252076
1.4370453
1.4461308
1.4192461
1.4212397
1.4225721
1.4246051
1.4330024
1.4242021
1.4236445
1.4486781
1.5358045
1.4333607
1.5888191
1.4294735
1.4385831
1.4530925
1.436447
1.435457
1.4308867
1.4284568
1.4369001
1.4366522
1.4391297
1.4311138
1.4391327
1.4397806
1.4512427
1.4809407
1.4315315
1.4465939
1.441573
1.4380289
1.4326397
1.444216
1.4468027
1.4363179
1.4418309
1.4291865
1.4231803
1.4280732
1.457514
1.4213295
1.444203
1.4372177
1.4247196
1.4318942
1.4385288
1.4456588
1.4273129
1.4345412
1.4466562
1.4394883
1.4259084
1.4304888
1.4409783
1.435986
1.4240628
1.430554
1.622189
1.4377483
1.4397975
1.4134334
1.4358562
1.4277349
1.4658164
1.4335722
1.4583104
1.4643726
1.4443862
1.440056
1.4303606
1.4730695
1.4403722
1.4486036
1.4342866
1.4300808
1.4224145
1.439579
1.4262475
1.4293145
1.437459
1.5426677
1.436485
1.4335073
1.4188119
1.4242659
1.4267646
1.463345
1.4304807
1.4431734
1.4220484
1.4403483
1.4393213
1.4381195
1.4318203
1.4333538
1.4343354
1.4261492
1.6094942
1.4449688
1.4359723
1.4476583
1.4215227
1.4284859
1.4718496
1.423505
1.436327
1.4427848
1.4403988
1.4385381
1.4290603
1.4482086
1.4432428
1.4633375
1.4217297
1.4324341
1.4352106
1.4419572
1.4273796
1.4619591
1.4283155
1.4524795
1.4354413
1.4457766
1.4613314
1.4321312
1.4514245
1.4259351
1.4397424
1.4467766
1.4579419
1.4341741
1.517029
1.4220675
1.4197295
1.4297777
1.4387102
1.4824338
1.4395723
1.424556
1.4266659
1.4308729
1.4300507
1.4345086
1.4318086
1.4226607
1.4253453
1.4245172
1.4745511
1.4185331
1.429593
1.4439989
1.4338197
1.4440643
1.5252978
1.4686831
1.4450741
1.4264852
1.4382212
1.4660058
1.4298804
1.4304371
1.4430833
1.432719
1.4962862
1.4217167
1.4514034
1.4416485
1.4337709
1.430719
1.4205056
1.4329245
1.4353878
1.4340405
1.4200432
1.4292059
1.4332184
1.4630159
1.4231634
1.4361211
1.4338663
1.4224946
1.4317191
1.4413884
1.4245487
1.4297769
1.4939773
1.4415425
1.4366091
1.452527
1.4357023
1.4239666
1.4467161
1.433133
1.4434451
1.4562932
1.4205793
1.4547515
1.4485748
1.4704133
1.4341009
1.4474485
1.4303159
1.5227787
1.4404681
1.4731841
1.4325004
1.4528241
1.4265238
1.4879282
1.457417
1.4528577
1.4238132
1.4325167
1.423643
1.4468874
1.432082
1.4216709
1.441115
1.4506444
1.420134
1.4559773
1.426016
1.4582127
1.454
1.4514658
1.4327898
1.4664013
1.4371697
1.572555
1.4828167
1.429392
1.4295089
1.4293884
1.4253092
1.4277148
1.4689633
1.4274733
1.4655646
1.5711166
1.43595
1.4956225
1.4545474
1.4275498
1.4240797
1.4629378
1.4298772
1.426518
1.4398956
1.4360229
1.4346793
1.4393581
1.4303262
1.4520128
1.4279516
1.429098
1.4496461
1.4261733
1.4281076
1.4462907
1.466148
1.4644614
1.434797
1.4312646
1.4887655
1.4362385
1.4314299
1.4611467
1.4175978
1.6096582
1.4390513
1.4250672
1.4327267
1.4275479
1.4318265
1.4250339
1.4284432
1.4260659
1.4572186
1.4322435
1.4278933
1.4359099
1.4386609
1.4474111
1.4463301
1.5520487
1.500772
1.4573263
1.433861
1.4449722
1.4312128
1.4318017
1.4358488
1.4349815
1.4208859
1.4316425
1.4345145
1.4365073
1.4461412
1.451095
1.4419576
1.4504223
1.4544096
1.4763335
1.4334564
1.4398527
1.4547187
1.4267361
1.4307712
1.4439168
1.4427016
1.4237262
1.437408
1.4364924
1.4832705
1.4470122
1.4354451
1.436618
1.4640968
1.4347792
1.4276967
1.433569
1.4316285
1.4311348
1.5289828
1.4508052
1.4359267
1.4180837
1.4682046
1.4320923
1.4461398
1.4409341
1.4499207
1.4326713
1.4276228
1.4411665
1.440106
1.4619251
1.4190967
1.4369
1.4378662
1.440247
1.4419285
1.4464724
1.4340944
1.4408695
1.4413481
1.4311032
1.4523103
1.4496446
1.4404446
1.4437087
1.4496894
1.442157
1.4329613
1.4436581
1.4369113
1.4379263
1.5795801
1.441459
1.438657
1.4472882
1.5554299
1.4788518
1.4411606
1.4391491
1.4531671
1.4431748
1.5887083
1.4412516
1.4507891
1.4390107
1.4349638
1.4415994
1.4541585
1.4383627
1.4339429
1.4460578
1.5528461
1.4579844
1.4368232
1.4245582
1.4467373
1.4286392
1.560092
1.438708
1.4720545
1.4380485
1.4474537
1.4510043
1.4379154
1.4454376
1.4466196
1.4364864
1.4329973
1.4382861
1.439529
1.4408605
1.4444872
1.4446836
1.4361255
1.4400423
1.4359049
1.480309
1.4490013
1.4339806
1.4378532
1.4410423
1.442446
1.4570014
1.4499522
1.4418085
1.4378309
1.4587109
1.4423811
1.4502554
1.4359549
1.4558778
1.4381967
1.4356933
1.461668
1.4438893
1.4443088
1.4332231
1.4339992
1.4283357
1.4443963
1.4314966
1.4369732
1.4294165
1.4385457
1.4412717
1.4482197
1.4628145
1.5706674
1.4976648
1.4414195
1.4442941
1.473849
1.5338902
1.4406025
1.590746
1.4307293
1.4575342
1.4315397
1.4282409
1.4337201
1.4504648
1.4483199
1.4360367
1.6424706
1.47219
1.4493319
1.4278214
1.4460856
1.4267676
1.4462695
1.4607861
1.4429929
1.455388
1.4582101
1.5719308
1.4328701
1.4462624
1.4382857
1.4421687
1.429934
1.6022124
1.453008
1.4396114
1.4409081
1.4458514
1.4304411
1.4423018
1.4428359
1.4341632
1.4500967
1.4456636
1.4413168
1.475835
1.4434776
1.4387966
1.4534419
1.4464052
1.4384272
1.4405713
1.4334431
1.4393032
1.4744662
1.5134201
1.4435637
1.439055
1.4557947
1.4345391
1.435426
1.4673555
1.4399722
1.4408545
1.4431297
1.4538954
1.445546
1.4489088
1.4800283
1.4299401
1.4390576
1.4512309
1.4621598
1.4391838
1.4417061
1.4588895
1.432018
1.474277
1.5018373
1.4554636
1.435486
1.443856
1.4676245
1.454743
1.4426124
1.4376029
1.456608
1.4502279
1.4375446
1.4972192
1.4399091
1.4398681
1.4427228
1.4619931
1.4999516
1.4491601
1.4620599
1.4486204
1.4520212
1.4444687
1.4439838
1.4489214
1.4441822
1.4595568
1.4743564
1.4738342
1.4907857
1.4567289
1.4611069
1.448564
1.4382781
1.4382256
1.4815347
1.4533099
1.4442793
1.4284931
1.4460919
1.4253519
1.445788
1.4676151
1.4389089
1.4694983
1.4390957
1.4669459
1.4691759
1.4589279
1.4600252
1.4576219
1.475575
1.4558216
1.4570103
1.473488
1.4734344
1.5020555
1.456775
1.5525299
1.4833516
1.6512734
1.4805459
1.4833392
1.508763
1.4642446
1.5084099
1.4506174
1.4551613
1.4523454
1.4510026
1.4570291
1.5127933
1.4608669
1.4653907
1.4503628
1.736922
1.464765
1.4613824
1.4626627
1.4504387
1.4636304
1.4485774
1.4747963
1.4523276
1.4570822
1.4610881
1.4644521
1.469941
1.4533569
1.4833555
1.4778165
1.4531722
1.4703592
1.4774882
1.4754251
1.4637419
1.4787375
1.4999518
1.46082
1.466927
1.4884515
1.4946808
1.4935384
1.4751267
1.4676224
1.4621372
1.4983962
1.4759961
1.4789441
1.4548
1.47314
1.4539164
1.4611878
1.4939367
1.4838833
1.4572763
1.4722651
1.528209
1.4667828
1.6473255
1.5886412
1.4821959
1.4604143
1.4787934
1.5125486
1.5137547
1.4570718
1.5025957
1.4689487
1.4815181
1.4829776
1.4798348
1.4772228
1.4946243
1.5016092
1.534481
1.4792506
1.4627926
1.4928609
1.4836736
1.4828186
1.4757948
1.494853
1.4898927
1.4941249
1.4852207
1.4679624
1.4679272
1.4843167
1.4820344
1.5087632
1.6605088
1.4712006
1.4829232
1.482357
1.4859625
1.4504566
1.5077592
1.4649692
1.4873585
1.4736162
1.5000998
1.4702584
1.4699382
1.5443777
1.5236341
1.4861085
1.4754695
1.4916141
1.4983854
1.4816934
1.4923688
1.4901505
1.5098892
1.4736675
1.5903944
1.4882212
1.4828292
1.4766961
1.5130057
1.4640255
1.4790208
1.5079416
1.4845034
1.476687
1.4863058
1.4815695
1.5376194
1.4925373
1.459176
1.4989312
1.4863102
1.4911195
1.5268091
1.4835633
1.492259
1.4688245
1.4773109
1.5112664
1.5033383
1.4816122
1.4893537

Here is the complete log. Thanks!

The loss is too low. I suspect a vocabulary issue. How did you generate the vocabulary that you then used for replacement?

Hi Guillaume,

First of all, thanks very much for your help.

I used SentencePiece, as you have suggested to other people on the forum.
I created the subword models by running:

spm_train --input=my_generic_normalized_file --model_type=bpe --model_prefix=gen --vocab_size=16000 --character_coverage=1.0 --user_defined_symbols=<list_of_normalized_tokens>

where <list_of_normalized_tokens> are tokens of the form ⦅CODE0⦆⦅CODE1⦆…⦅NUMBER0⦆⦅NUMBER1⦆…⦅URL0⦆⦅URL1⦆…
These tokens come from a previous normalization stage in which the original text is preprocessed to replace URLs, emails, codes, numbers, etc. with these placeholders. Since these elements are never translated into anything different, I think treating them as atomic symbols makes sense and should ease the learning process.
Once I have both BPE models (source and target language), I encode the normalized files into BPE-encoded files ready for training.
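Concretely, the encoding step is just something like the following (the input file names and the target-side model prefix are only illustrative; the model comes from the spm_train call above):

spm_encode --model=gen.model --output_format=piece < gen_normalized_en.txt > gen_bpe16_enfr_en_training_set_train.txt
spm_encode --model=gen_fr.model --output_format=piece < gen_normalized_fr.txt > gen_bpe16_enfr_fr_training_set_train.txt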

I follow exactly the same process to create the in-domain BPE models and obtain the BPE-encoded files for fine-tuning. The only difference is the input data, which this time contains only normalized in-domain text.
The generic dataset does not contain any in-domain data at all; I don’t know whether this approach is correct.

The initial in-domain model is obtained by updating the vocabulary in replacement mode (not merge), like this:

onmt-update-vocab --model_dir run_generic_bpebig16_deen/ --output_dir run_indomain_bpebig16_deen/ --src_vocab generic_bpebig16_vocab_deen_de.txt --tgt_vocab generic_bpebig16_vocab_deen_en.txt --new_src_vocab indomain_bpebig16_vocab_deen_de.txt --new_tgt_vocab indomain_bpebig16_vocab_deen_en.txt --mode replace

Then I continue training on the in-domain data. I am trying to follow the canonical domain adaptation approach as closely as I can.

I would suggest:

  1. Using the initial BPE models to encode the in-domain data
  2. Using the “merge” vocabulary update mode (don’t forget to update the configuration: the new vocabularies are generated in the output_dir); see the example command below
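For point 2, a minimal sketch based on the command you posted (only --mode and the output directory change; the directory name is just an example):

onmt-update-vocab --model_dir run_generic_bpebig16_deen/ --output_dir run_indomain_merged_bpebig16_deen/ --src_vocab generic_bpebig16_vocab_deen_de.txt --tgt_vocab generic_bpebig16_vocab_deen_en.txt --new_src_vocab indomain_bpebig16_vocab_deen_de.txt --new_tgt_vocab indomain_bpebig16_vocab_deen_en.txt --mode merge

Then point source_words_vocabulary and target_words_vocabulary in your data configuration to the merged vocabulary files written in that output directory, and set model_dir to it before resuming training.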

Can you give it a try?

Hi Guillaume,

Indeed, I’ll try both (in fact, it’s already work in progress).
Would you say the dataset for the fine-tuning stage should contain only in-domain data, or would you recommend including some amount of generic data too?