OpenNMT Forum

How do you use '-replace_unk' on server.py? (TranslationServer.py)

Hi there. When you use a translation option that takes no argument (a switch), such as ‘-replace_unk’, how do you write it in conf.json?

Looking at parse_opt() in class ServerModel (TranslationServer.py),
I suspect that one line of this function prevents you from using such options:

for (k, v) in opt.items():
    sys.argv += ['-%s' % k, str(v)] # <== this line
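
A minimal sketch of why this expansion is a problem for switches, using plain argparse. It assumes `-replace_unk` is declared as a `store_true` option; the parser below is a hypothetical stand-in, not the real OpenNMT parser.

```python
import argparse

# Stand-in parser (hypothetical): one valued option and one switch.
parser = argparse.ArgumentParser()
parser.add_argument('-beam_size', type=int, default=1)
parser.add_argument('-replace_unk', action='store_true')

opt = {'beam_size': 5, 'replace_unk': True}

# Same expansion as the loop above: every key is followed by str(value).
argv = []
for k, v in opt.items():
    argv += ['-%s' % k, str(v)]

# parse_known_args() returns unconsumed tokens instead of exiting with an error.
args, leftover = parser.parse_known_args(argv)
print(argv)      # ['-beam_size', '5', '-replace_unk', 'True']
print(leftover)  # ['True']
```

The switch itself is recognized, but the trailing `'True'` is left over; with `parse_args()` that leftover token would be reported as an unrecognized argument.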

So I wrote conf.json like:

{
    "models_root": "./available_models",
    "models": [
        {
            "id": 0,
            "model": "whatever.pt",
            "timeout": -1,
            "on_timeout": "unload",
            "load": true,
            "opt": {
                "gpu": 2,
                "beam_size": 5,
                "replace_unk": true # <== like a boolean
            }
        }
    ]
}
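
Note that the `# <== like a boolean` marker in the listing is only an annotation for this post. Assuming the server reads conf.json with Python's standard `json` module, a comment like that would be rejected, so it must not appear in the real file. A quick check, using a shortened stand-in for the config:

```python
import json

# Shortened stand-ins for conf.json, with and without the annotation.
with_comment = '{"opt": {"replace_unk": true # <== like a boolean\n}}'
without_comment = '{"opt": {"replace_unk": true}}'

try:
    json.loads(with_comment)
    comment_accepted = True
except json.JSONDecodeError:
    comment_accepted = False

conf = json.loads(without_comment)
print(comment_accepted)                 # False
print(repr(conf["opt"]["replace_unk"])) # True
```

The JSON literal `true` is loaded as the Python bool `True`, which is why `str(v)` turns it into the string `'True'`.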

Then I modified the line above as follows:

for (k, v) in opt.items():
    # 'is True' rather than '== True', because 1 == True in Python
    sys.argv += ['-%s' % k] if v is True else ['-%s' % k, str(v)]

This seems to be working fine, but I’m not sure whether this is a bug or whether I just don’t know the correct way to specify such switches. Can somebody tell me?
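
A slightly more defensive variant of that loop, as a sketch only. `opt_to_argv` is a hypothetical helper, not part of OpenNMT, and the parser is again a stand-in. It tests with `is True` (since `1 == True` in Python, an `==` test would also turn `"gpu": 1` into a bare switch) and skips options set to `false` instead of passing the string `'False'`.

```python
import argparse

def opt_to_argv(opt):
    """Flatten an opt dict from conf.json into an argv-style list (hypothetical helper)."""
    argv = []
    for k, v in opt.items():
        if v is True:
            argv.append('-%s' % k)       # switch: flag only, no value
        elif v is False:
            continue                     # disabled switch: leave it out
        else:
            argv += ['-%s' % k, str(v)]  # regular option: flag plus value
    return argv

# Stand-in parser (hypothetical) with the kinds of options used in conf.json.
parser = argparse.ArgumentParser()
parser.add_argument('-gpu', type=int, default=-1)
parser.add_argument('-beam_size', type=int, default=1)
parser.add_argument('-replace_unk', action='store_true')
parser.add_argument('-verbose', action='store_true')

opt = {'gpu': 1, 'beam_size': 5, 'replace_unk': True, 'verbose': False}
argv = opt_to_argv(opt)
args = parser.parse_args(argv)
print(argv)  # ['-gpu', '1', '-beam_size', '5', '-replace_unk']
```

Here `gpu` keeps its value `1`, `replace_unk` is passed as a bare flag, and `verbose` is omitted.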

Yes, I have the same question. I did the same thing, but instead of replacing unknown words with the untranslated source words, it simply leaves them out.

I would also like to get an answer to this, please.

As you can see in my configuration

    {
        "models_root": "./available_models",
        "models": [
            {
                "id": 101,
                "name": "PT-EN (bidirectional encoder of Long Short-Term Memory, Recurrent Neural Network)",
                "model": "BiLSTM2-650-600-Brasil_Celex2012-2017_stok_2019_05_27-07_30__step_795000.pt",
                "dynamic_dict": true,
                "timeout": 1000,
                "on_timeout": "to_cpu",
                "model_root": "/opt/models",
                "load": true,
                "opt": {
                    "gpu": 0,
                    "beam_size": 5,
                    "max_length": 650,
                    "batch_size": 32,
                    "share_vocab": true,
                    "replace_unk": true,
                    "verbose": true
                },
                "tokenizer": {
                    "type": "pyonmttok",
                    "mode": "aggressive",
                    "params": {
                        "no_substitution": false,
                        "spacer_annotate": false,
                        "spacer_new": false,
                        "joiner_annotate": true,
                        "joiner_new": false,
                        "case_markup": true,
                        "case_feature": false,
                        "preserve_placeholders": true,
                        "preserve_segmented_tokens": true,
                        "segment_case": true,
                        "segment_numbers": true,
                        "segment_alphabet_change": false
                    }
                }
            }
        ]
    }

replace_unk is part of the opt list.