Hello.
The opennmt-py server TPS is low. What can I do?
I hope someone can help me. Thanks.
We changed the dropout option from an int to a list.
I think your model was trained with the new list format, but you must have exposed it with a server.py that has not been updated with master.
Run git pull on your server and let me know.
Thank you so much.
Problem solved.
Hello @park, while starting the server I get the following error, even though my available_models directory is in the path: ng@ng:~/OpenNMT-py/. Please help, thanks.
python3 server.py --ip 0.0.0.0 --port 5000 --url_root "./translator" --config "./available_models/conf.json"
Traceback (most recent call last):
  File "server.py", line 129, in <module>
    debug=args.debug)
  File "server.py", line 24, in start
    translation_server.start(config_file)
  File "/home/ng/OpenNMT-py/onmt/translate/translation_server.py", line 80, in start
    with open(self.config_file) as f:
FileNotFoundError: [Errno 2] No such file or directory: '"./available_models/conf.json"'
I think I just figured out where the issue might be; I had not input my trained model. Working on that.
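For anyone hitting the same thing, a minimal conf.json pointing the server at a trained checkpoint could look like the sketch below (the model file name and id are placeholders, not from this thread); it is written out via the json module just to keep the example runnable:

```python
# Writes a minimal server config; "model" must name a trained checkpoint
# under models_root, which is exactly what was missing here.
import json

conf = {
    "models_root": "./available_models",
    "models": [
        {
            "id": 100,            # arbitrary model id used in requests
            "model": "model.pt",  # placeholder checkpoint name
            "load": True,         # preload the model at server start
            "opt": {"beam_size": 5},
        }
    ],
}
with open("available_models/conf.json", "w") as f:
    json.dump(conf, f, indent=2)
```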
Okay! Nice work.
Hi Park,
I am getting the following error message while loading the tokenizer.
[kishor@nvidiagpu OpenNMT-py]$ python3 server.py --ip "0.0.0.0" --port 7785 --url_root "/translator" --config "./available_models/conf_pyonmttok.json"
Pre-loading model 1
[2019-06-25 16:11:18,950 INFO] Loading model 1
[2019-06-25 16:11:19,622 INFO] Loading tokenizer
Traceback (most recent call last):
  File "server.py", line 129, in <module>
    debug=args.debug)
  File "server.py", line 24, in start
    translation_server.start(config_file)
  File "/home/kishor/OpenNMT/OpenNMT-py/onmt/translate/translation_server.py", line 102, in start
    self.preload_model(opt, model_id=model_id, **kwargs)
  File "/home/kishor/OpenNMT/OpenNMT-py/onmt/translate/translation_server.py", line 140, in preload_model
    model = ServerModel(opt, model_id, **model_kwargs)
  File "/home/kishor/OpenNMT/OpenNMT-py/onmt/translate/translation_server.py", line 227, in __init__
    self.load()
  File "/home/kishor/OpenNMT/OpenNMT-py/onmt/translate/translation_server.py", line 308, in load
    import pyonmttok
ImportError: /usr/local/lib/python3.5/site-packages/pyonmttok.cpython-35m-x86_64-linux-gnu.so: undefined symbol: _PyThreadState_UncheckedGet
[kishor@nvidiagpu OpenNMT-py]$
Could you please assist me in resolving this issue?
Regards,
Kishor.
It seems pyonmttok is not installed properly.
I suggest reinstalling the OpenNMT Tokenizer, or tokenizing your data with SentencePiece instead.
In my case I use a SentencePiece model.
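If it helps, a quick way to verify the reinstall (assuming something like pip install --upgrade pyonmttok) is to import the module and tokenize a line; the mode and sample sentence below are arbitrary:

```python
# Minimal sanity check that the pyonmttok binary module loads and runs;
# if the shared library is broken, the import itself fails as it did above.
import pyonmttok

tokenizer = pyonmttok.Tokenizer("conservative", joiner_annotate=True)
tokens, _ = tokenizer.tokenize("Hello world!")
print(tokens)
```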
Okay… got it…
I will try these.
Hi @pltrdy, I'm trying to run the server with the Arabic<>English pair and am having some trouble with the tokenization. I have BPE codes, generated with learn_bpe.py and applied to my corpora with apply_bpe.py (all with default options). The tokenization part of my conf.json looks like this:
... "tokenizer": { "type": "pyonmttok", "mode": "none", "params": { "joiner": "@@", "joiner_annotate": true, "bpe_model_path": "cee81450-e9af-0137-849b-107b44b00092/data.eng.bpe" } }, ...
Input:
Oh. Okay. Because I thought it was something different.
Greg barber.
pyonmttok-output:
O@@ h@@ .@@ @@ Okay@@ .@@ @@ Bec@@ aus@@ e@@ @@ I@@ @@ though@@ t@@ @@ it@@ @@ was@@ @@ some@@ thing@@ @@ different.
G@@ reg@@ @@ bar@@ ber.
apply_bpe-output:
Oh. Okay. Because I thought it was something different.
Gre@@ g bar@@ ber.
The two tokenizations differ. Maybe I'm not configuring something right? What is the difference between joiner (pyonmttok) and separator (apply_bpe)? What should I do to get the apply_bpe.py tokenization in the opennmt-py server? @park, any thoughts? I saw you were working with BPE yourself.
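For reference, the tokenizer section of the config above corresponds to roughly the following in code (values copied from that config), which may be useful for reproducing the mismatch outside the server:

```python
# Replicates the server-side pyonmttok setup from the conf.json above,
# so its output can be compared against apply_bpe.py directly.
import pyonmttok

tokenizer = pyonmttok.Tokenizer(
    "none",
    joiner="@@",
    joiner_annotate=True,
    bpe_model_path="cee81450-e9af-0137-849b-107b44b00092/data.eng.bpe",
)
tokens, _ = tokenizer.tokenize("Oh. Okay. Because I thought it was something different.")
print(" ".join(tokens))
```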
Hi,
What is the recommended way to use the server?
Let's say I have 10,000 sentences to translate: should I send them all in a single request, or send one request per sentence?
With the second option it seems server-side batching is unnecessary.
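A sketch of the first option, assuming the server runs on localhost:5000 with url_root /translator and a model with id 100 in conf.json (host, port, and id are assumptions, not from this thread):

```python
# Sends all sentences in one POST so the server can batch them internally.
import requests

sentences = ["First sentence.", "Second sentence."]  # ... up to 10,000
payload = [{"src": s, "id": 100} for s in sentences]
r = requests.post("http://localhost:5000/translator/translate", json=payload)
print(r.json())
```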
Thanks
Hi, thanks for sharing. I have read a lot of papers recently, but the steps are not logical to me.
I have two *.txt files for two languages (a parallel corpus).
At which point does SentencePiece come into the game? Must I train/prepare both txt files with SentencePiece first, or do I first preprocess with OpenNMT-py and then train SentencePiece on the same source text in a single file? Or something else?
What should the data look like at each step of that procedure in order to train a model with the SentencePiece option in OpenNMT-py?
I found nothing about that.
Many thanks for any information.
lmsverige
The OpenNMT-py preprocess expects tokenized data. So SentencePiece should be applied before using OpenNMT-py.
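A minimal sketch of that order of operations, assuming the sentencepiece pip package and placeholder file names (train.src.txt in, train.src.sp out); the same is done for the target side:

```python
# Train a SentencePiece model on the raw source text, then tokenize it.
# Vocab size and file names are placeholders, not recommendations.
import sentencepiece as spm

spm.SentencePieceTrainer.train(
    input="train.src.txt", model_prefix="spm_src", vocab_size=32000
)

sp = spm.SentencePieceProcessor(model_file="spm_src.model")
with open("train.src.txt") as fin, open("train.src.sp", "w") as fout:
    for line in fin:
        fout.write(" ".join(sp.encode(line.strip(), out_type=str)) + "\n")

# train.src.sp (plus its target-side counterpart) is what the OpenNMT-py
# preprocess step should then receive.
```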
Does that help?
You might want to have a look at this post for instance: Using Sentencepiece/Byte Pair Encoding on Model
I have the same problem. My server works fine, but I'm working on Windows, which does not support pyonmttok, so I configured the JSON file with SentencePiece, using a pretrained Wikipedia tokenizer trained on 275 languages. Yet the predictions are extremely poor; not even one sentence translates correctly, although it works just fine from the notebook view.
Everything is fine with the given REST API. However, adding
"opt": {..., "replace_unk": true,
        "phrase_table": "a_phr_table", ...}
does not give the desired result: the untranslated src token is just copied as it is, even if it is present in the table. The same options work fine in plain command-line onmt_translate without the REST API.
Any help to resolve this issue will be appreciated.