What is the best way to clean up a ctranslate2.Translator after it has been initialized and used? Should the Python garbage collector be able to free the allocated memory?
self.translator is always None on every translation request, so the program creates a new PackageTranslation instance for each translation event. This could lead to a memory leak if ctranslate2 keeps some process-bound data around globally.
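One way to rule out the per-request re-creation as the culprit is to build the translator once and reuse it. A minimal caching sketch; get_cached and the model-path key are my own names, not Argos Translate or CTranslate2 API:

```python
# Generic one-instance-per-key cache; the factory runs only on the first lookup.
_cache = {}

def get_cached(key, factory):
    if key not in _cache:
        _cache[key] = factory()
    return _cache[key]

# Hypothetical usage with CTranslate2 (assumes a converted model directory):
# import ctranslate2
# translator = get_cached("model_dir", lambda: ctranslate2.Translator("model_dir"))
```

If memory stabilizes once the translator is shared, the leak is in the repeated construction rather than in translation itself.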
Have you tried forcing the garbage collector? I had a similar issue with DataFrames in my preprocessing, and forcing the garbage collector solved it for me.

import gc

n = gc.collect()
print("Number of unreachable objects collected by GC:", n)

I basically called gc.collect() right after I was done with a massive DataFrame to make sure the garbage collector reclaimed it quickly.
Just out of curiosity, did my suggestion solve your problem?
I’m not sure; I haven’t been able to reliably reproduce the memory leak myself, but other users have reported one. When I run this script on my computer the memory doesn’t noticeably increase.
import argostranslate.translate

from_code = "en"
to_code = "es"
translatedText = argostranslate.translate.translate("Hello World", from_code, to_code)
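For what it’s worth, tracemalloc only tracks Python-level allocations, and CTranslate2 allocates natively, so to confirm a leak I would watch the process RSS across repeated translations instead. A sketch with the stdlib resource module (Unix only):

```python
import resource

def peak_rss() -> int:
    # Peak resident set size of this process so far.
    # Units are platform-dependent: KiB on Linux, bytes on macOS.
    return resource.getrusage(resource.RUSAGE_SELF).ru_maxrss

before = peak_rss()
# ... run argostranslate.translate.translate(...) in a loop here ...
after = peak_rss()
print("peak RSS growth:", after - before)
```

If the growth keeps climbing with every batch of translations and never plateaus, that points at native memory held by CTranslate2 rather than uncollected Python objects.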
Hello Argo, were you able to find a working solution for the problem?
I am also encountering a similar issue. I am using
translator.translate_batch() in the API call, and memory usage increases with every request and does not go down even after I stop hitting the API. I need to restart the service to free up the memory.
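If the Translator object itself is pinning the memory, explicitly releasing it between restarts might help. A sketch; I believe ctranslate2.Translator exposes an unload_model() method, but treat that as an assumption and check the docs for your version:

```python
import gc

def release_translator(translator) -> None:
    # Ask CTranslate2 to drop the model weights, if the method exists.
    unload = getattr(translator, "unload_model", None)
    if unload is not None:
        unload()
    # Drop our reference so the Python wrapper can be collected too.
    del translator
    gc.collect()
```

This only helps if the memory is actually owned by the Translator; if CTranslate2 caches something process-wide, a restart may still be the only reliable fix.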
I still haven’t been able to find the source of the memory leak. We run CTranslate2 inside LibreTranslate on an Nginx server with multiple processes that restart automatically, so the memory leak doesn’t cause many problems in practice.