Currently, I am using the decoding feature (alternatives at a position) of a CTranslate2 model. It works correctly, but the response time is very slow. When I run the script, it returns the output alternatives after 26 seconds, and sometimes after 35-40 seconds.
Is this because of poor CPU performance, or something else? I would be thankful if somebody could help me resolve this problem.
I run CTranslate2 from Python, and here is the code that I use. It takes less than half a second. I do have a lot of RAM on a server machine, though. Still, you can run it on PythonAnywhere, for example, and see what you get.
Obviously, make sure you change the variables, and if it is for real work, update the tokenization and detokenization functions as needed.
I also wrote almost the same code, except with "cpu". But this solution is not working; even after specifying "cpu", I am still getting the response after 25-40 seconds.
Do you have another solution, or am I missing something?
Please have a look at the code and system specifications below.
Code:
import ctranslate2
import sentencepiece as spm

# Load the SentencePiece model before the tokenization helpers use it.
sp = spm.SentencePieceProcessor(model_file="wmtende.model")

def tokenize(data):
    return sp.encode(data, out_type=str)

def detokenize(data):
    return sp.decode(data)

# The device is passed as the string "cpu", not a bare name.
translator = ctranslate2.Translator("ende_ctranslate2/", device="cpu")

input_text = "This project is geared towards efficient serving of standard translation models but is also a place for experimentation around model compression and inference acceleration."

results = translator.translate_batch(
    [tokenize(input_text)],
    target_prefix=[tokenize("Dieses Projekt ist auf die")],
    num_hypotheses=5,
    return_alternatives=True)

for hypothesis in results[0]:
    print(detokenize(hypothesis["tokens"]))
System specifications:
Currently, I have Ubuntu installed in a virtual environment on my Windows 10 machine, and CTranslate2 runs from Ubuntu. I am sharing the specifications of both operating systems.
1. Windows:
Edition: Windows 10 Pro (Intel Core i7)
Processor: Intel® Core™ i7-5600U CPU @ 2.60 GHz, 2601 MHz, 2 core(s), 4 logical processor(s)
Installed RAM: 12.0 GB (11.9 GB usable)
System type: 64-bit OS
Do you know what is taking the most time in your script?
I would assume it is when creating the translator and loading the model. WSL (especially version 1) has very poor I/O performance, especially when reading files from the Windows filesystem.
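To check this, you can time each stage of the script separately and see which one dominates. A minimal sketch (the `timed()` helper is my own; the commented-out calls refer to the model files from the post above, so they are only illustrative):

```python
import time

def timed(label, fn, *args, **kwargs):
    # Run fn, print how long it took, and return its result.
    start = time.perf_counter()
    result = fn(*args, **kwargs)
    print(f"{label}: {time.perf_counter() - start:.2f}s")
    return result

# Applied to the script above (assumes the same model files exist):
# translator = timed("load model", ctranslate2.Translator,
#                    "ende_ctranslate2/", device="cpu")
# sp = timed("load tokenizer", spm.SentencePieceProcessor,
#            model_file="wmtende.model")
```

If "load model" accounts for nearly all of the 26-40 seconds, that points at slow file I/O rather than at translation itself.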
I don't think any part of my script is taking too much time.
If I run CTranslate2 on a native Linux OS instead of virtual Ubuntu, is there any possibility of getting a quicker response?
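Before reinstalling, it may be worth checking whether WSL file I/O is the bottleneck: time a raw read of the model file once from a Windows-mounted path (under /mnt/c/) and once from a path inside the Linux filesystem. A minimal sketch, where `read_time` is my own helper and the example paths are placeholders, not real locations from this thread:

```python
import pathlib
import time

def read_time(path):
    # Read the whole file and report elapsed seconds and bytes read.
    start = time.perf_counter()
    data = pathlib.Path(path).read_bytes()
    return time.perf_counter() - start, len(data)

# Example: compare the same model file on both filesystems, e.g.
# read_time("/mnt/c/models/ende_ctranslate2/model.bin")  # Windows side
# read_time("/home/you/ende_ctranslate2/model.bin")      # Linux side
```

A large difference between the two reads would suggest moving the model directory into the Linux filesystem rather than reinstalling the OS.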
Hi @atikaakmal, having spotted this thread I thought I'd mention that I am running CTranslate2 on WSL2 (with WSLg) so I can deploy a GUI. Even on my Asus Zenbook (8 GB RAM), speed is not an issue (real 0m1.652s, user 0m1.122s, sys 0m1.336s for translation of one sentence), but I am finding the translation output a lot poorer than when running the same underlying model (Transformer) using the commonly used Python script on my Asus Zenbook.

Edit: After installing the latest OpenNMT-tf and exporting with "--export_format ctranslate2", I am seeing a significant improvement in translation output.