Translate_batch then convert to df

SamuelLacombe · November 30, 2021, 5:18am

Hello,

I’ve been using CTranslate2 in batch mode for a little while. It has been working really well, but since I upgraded to the latest version of CTranslate2, I believe the output format has changed. At least, my code stopped working!

To consume the results of CTranslate2, I was converting them into a dataframe:

translation_ob = translator_model.translate_batch(dfChunk['SourceTokenized'].values.tolist(), return_scores=predictScore, normalize_scores=True, return_attention=False)
translationDF = pd.DataFrame(translation_ob)
df[['TargetTokenized', 'predictScore']] = translationDF[0].apply(pd.Series)

But that no longer works. I’ve tried many things, to no avail. Does anyone have a hint on how to do it? I’m trying to avoid a conventional loop!

Best regards,
Samuel

SamuelLacombe · November 30, 2021, 5:38am

Well, it seems I just needed to ask in order to come up with the solution!

For reference if anyone needs it:

pd.DataFrame([{'tokens': s.hypotheses[0], 'score': s.scores[0]} for s in translation_ob])
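
A slightly fuller sketch of the same idea, in case it helps. It assumes each item returned by translate_batch exposes `hypotheses` and `scores` attributes, as in the one-liner above, and that `scores` may be empty when scores are not requested (that part is my assumption). `FakeResult` and `results_to_df` are hypothetical names used only so the snippet runs without a model; with a real translator you would pass `translation_ob` directly.

```python
from dataclasses import dataclass, field
from typing import List

import pandas as pd


@dataclass
class FakeResult:
    """Hypothetical stand-in for one item returned by translate_batch."""
    hypotheses: List[List[str]]
    scores: List[float] = field(default_factory=list)


def results_to_df(results):
    """Keep the best hypothesis (and its score, if present) per result."""
    rows = [
        {
            "TargetTokenized": r.hypotheses[0],
            # Assumption: scores is empty when scores were not requested.
            "predictScore": r.scores[0] if r.scores else None,
        }
        for r in results
    ]
    return pd.DataFrame(rows, columns=["TargetTokenized", "predictScore"])


# Stand-in data; replace with the real translate_batch output.
translation_ob = [
    FakeResult([["Hello", "world"]], [-0.12]),
    FakeResult([["Good", "bye"]], [-0.34]),
]
translationDF = results_to_df(translation_ob)
```

When copying these columns back into an existing dataframe, keep in mind that pandas aligns on the index, so the new frame may need the same index as the chunk it came from.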