Hi all!
I have a CTranslate2 model and I'm trying to translate a large file (more than 50k sentences).
Each time I get the following error:
terminate called after throwing an instance of 'std::runtime_error'
what(): CUDA failed with error an illegal memory access was encountered
Aborted (core dumped)
I create a Translator object and call its translate_batch function. My input tokens look like this:
Sources encoded [['⦅GEN⦆', '■', 'grb'], ['⦅GEN⦆', '■', 'dvostruka'], ['⦅GEN⦆', '■', 'te'], ['⦅GEN⦆', '■', 'mala'], ['⦅GEN⦆', '■', 'svijeća'], ['⦅GEN⦆', '■', 'postoji'], ['⦅GEN⦆', '■', 'aktivnosti'], ['⦅GEN⦆', '■', 'prigovor'], ['⦅GEN⦆', '■', 'više'], ['⦅GEN⦆', '■', 'manje']]
I use the following config:
{'batch_type': 'tokens',
'beam_size': 1,
'length_penalty': 0.9,
'max_batch_size': 512,
'max_decoding_length': 256,
'replace_unknowns': True,
'sampling_topk': 10},
'translator': {'inter_threads': 2, 'intra_threads': 4}}
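For context, the setup described above can be sketched roughly as follows. This is a minimal sketch, not my exact script: the helper names `make_translate_fn` and `translate_in_chunks`, the `model_dir` argument, the `device="cuda"` default, and the chunk size are assumptions for illustration, and reading `.hypotheses[0]` assumes a CTranslate2 version whose `translate_batch` returns result objects.

```python
from typing import Callable, Iterator, List

Tokens = List[str]

# Translation options copied from the config in this post.
TRANSLATE_OPTIONS = {
    "batch_type": "tokens",
    "beam_size": 1,
    "length_penalty": 0.9,
    "max_batch_size": 512,
    "max_decoding_length": 256,
    "replace_unknowns": True,
    "sampling_topk": 10,
}


def make_translate_fn(model_dir: str, device: str = "cuda") -> Callable[[List[Tokens]], List[Tokens]]:
    """Build a function that translates one list of tokenized sentences.

    model_dir and device are placeholders (assumptions), not values from the post.
    """
    import ctranslate2  # imported lazily so the chunking helper works without it

    translator = ctranslate2.Translator(
        model_dir, device=device, inter_threads=2, intra_threads=4
    )

    def translate(batch: List[Tokens]) -> List[Tokens]:
        results = translator.translate_batch(batch, **TRANSLATE_OPTIONS)
        # Assumes result objects with a .hypotheses attribute; keep the best one.
        return [result.hypotheses[0] for result in results]

    return translate


def translate_in_chunks(
    translate_fn: Callable[[List[Tokens]], List[Tokens]],
    sources: List[Tokens],
    chunk_size: int = 1000,  # hypothetical value, chosen only for illustration
) -> Iterator[Tokens]:
    """Feed a large list of sentences to translate_fn one chunk at a time."""
    for start in range(0, len(sources), chunk_size):
        yield from translate_fn(sources[start:start + chunk_size])
```

Feeding the file in chunks like this, and printing the chunk's start index before each call, should at least show which part of the input is being processed when the process aborts.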
Do you have any idea what could be wrong? Thank you in advance.