OpenNMT Forum

Apply OpenNMT to subtitles

Hello,

I am getting pretty good results using OpenNMT-py to translate plain text between different languages. So far I have been working with plain text split into sentences; now I would like to learn how to apply NMT not to plain text, but to subtitles in SRT format. That is, I want to keep a time correspondence between the original text and the predicted translation.

How difficult is this? Are there state-of-the-art algorithms to start from? Can you suggest some papers, code, libraries, etc. to get started? And finally, can I solve this task using OpenNMT alone?

Thank you so much!

Dear Ana,

If I got you right, you want to translate SRT files and place the timing back in the target.

When translating files with “outer” tags/placeables like these, you do not need to pass the tags to the translator. You simply save them to variables, pass only the plain text to the MT engine, concatenate the output with the timing, and finally save it to a file. You just need to loop over each sentence in the source file.

It is a simple coding problem; no sophisticated algorithm is needed.

It might help to use the Simple OpenNMT-py REST server.
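For example, here is a minimal sketch of that loop. The SRT handling is simplified (it assumes well-formed blocks separated by blank lines), and `translate` is a stand-in for whatever call you make to your MT engine or server:

```python
def translate_srt_text(srt_text, translate):
    """Translate SRT content block by block, keeping index and timing lines.

    `translate` is any callable mapping source text to target text; in
    practice it would wrap a request to the translation server.
    """
    out_blocks = []
    for block in srt_text.strip().split("\n\n"):
        lines = block.split("\n")
        index, timing = lines[0], lines[1]
        text = " ".join(lines[2:])          # join wrapped subtitle lines
        out_blocks.append("\n".join([index, timing, translate(text)]))
    return "\n\n".join(out_blocks) + "\n"
```

With the REST server, `translate` would simply POST the text to the server's translate endpoint and read the target text out of the JSON reply.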

Kind regards,
Yasmin

Hello Yasmin,

Thanks for the reply. However, the problem is that SRT subtitle files are not split into sentences with their timings; they are usually split into smaller “subsentences”, which I shouldn't send through the network one by one because all the context gets lost. So if I pass a complete sentence, as I am doing right now, I am not sure how to recover the timings of each subsentence of that sentence.

To illustrate, for anyone who is interested: imagine I want to translate from Spanish to English this sentence, split into 3 subsentences with their start and end timings:

1
00:00:10,660 --> 00:00:16,390
La video clase de hoy la vamos a
dedicar a ciudades históricas

2
00:00:16,490 --> 00:00:21,870
patrimonio y turismo es la segunda
parte de paisajes culturales urbanos

3
00:00:21,970 --> 00:00:24,650
como lo señalaba en mi anterior
video clase.

The complete original sentence is:

La video clase de hoy la vamos a dedicar a ciudades históricas patrimonio y turismo es la segunda parte de paisajes culturales urbanos como lo señalaba en mi anterior video clase.

whose corresponding translation in English could be:

Today’s video class will be dedicated to historic cities heritage and tourism is the second part of urban cultural landscapes as I pointed out in my previous video class.

I am able to get this translation, but how can I recover the start and end timestamps from the original SRT? I can assume that the complete sentence goes from 00:00:10,660 to 00:00:24,650, but is it possible to recover the timestamps of each of the 3 subsentences?

Maybe I am wrong, but it doesn't seem a trivial problem to me. However, I've seen a lot of videos with subtitles that (I suppose) have been automatically translated into many different languages (for example, on YouTube); any idea how they are generated?

Another option could be to translate subsentence by subsentence, i.e. to translate the first one:

La video clase de hoy la vamos a dedicar a ciudades históricas

to:

Today’s video class will be dedicated to historic cities

and keep the original timestamps (00:00:10,660 --> 00:00:16,390), but I think the predicted translation would be worse because a lot of context from the rest of the sentence gets lost.
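A third, rougher idea (just a heuristic I am imagining, not something I have seen published anywhere): translate the merged sentence once, then split the translation back into the same number of pieces, with word counts roughly proportional to the character lengths of the original subsentences, and reuse the original timestamps. A minimal sketch:

```python
def split_proportionally(translation, source_parts):
    """Split `translation` into len(source_parts) pieces whose word counts
    are roughly proportional to the character lengths of `source_parts`."""
    words = translation.split()
    total = sum(len(p) for p in source_parts)
    pieces, start, cum = [], 0, 0
    for i, part in enumerate(source_parts):
        cum += len(part)
        # The last piece takes whatever remains, to avoid rounding losses.
        end = len(words) if i == len(source_parts) - 1 \
            else round(len(words) * cum / total)
        pieces.append(" ".join(words[start:end]))
        start = end
    return pieces
```

The split points are of course only approximate and can fall in the middle of a phrase, but at least each piece stays aligned with the time span of its source segment.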

Regards!
Ana


Hello Ana, SDL Trados 2019 is claimed to be suitable for translating subtitles (https://www.sdltrados.com/events/webinars/2019/april/a-new-easy-way-to-translate-subtitles-in-sdl-trados-studio.html). There are plug-ins that will allow you to access OpenNMT from Trados. I have a colleague who specialises in plug-ins and might be able to help with a long-term solution.

Dear Ana,

What is the process you currently follow to figure out complete sentences? This could be a clue to the answer.

Kind regards,
Yasmin

Oh my! This is an interesting problem. A possible approach is:

  1. Build a tagged parse tree of the full translated sentence.
  2. Use a rule set that breaks a large tree into a few short sentences.
  3. Emit the rearranged sentences.

#2 is not a simple project! In my own project I am handling a much simpler use case: searching long sentences for coherent short phrases.

I have used the ‘allennlp’ distribution to get tagged parse trees for 100k sentences (MSCOCO caption text); it took some hacking and then about 12 hours on my laptop. The NLTK toolkit includes a parser and a library for standard parse trees.
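For anyone who wants to experiment without installing a full toolkit, here is a plain-Python sketch of the idea: read the bracketed parse string that such parsers emit, then collect clause-level constituents as candidate split points (the choice of `S`/`SBAR` as break labels is my own assumption, and a real rule set for step 2 would be much more involved):

```python
def parse_tree(s):
    """Parse a bracketed constituency string like
    '(S (NP (PRP I)) (VP (VBD left)))' into (label, children) tuples;
    leaves are plain token strings."""
    tokens = s.replace("(", " ( ").replace(")", " ) ").split()
    pos = 0

    def read():
        nonlocal pos
        pos += 1                          # skip '('
        label = tokens[pos]; pos += 1
        children = []
        while tokens[pos] != ")":
            if tokens[pos] == "(":
                children.append(read())
            else:
                children.append(tokens[pos]); pos += 1
        pos += 1                          # skip ')'
        return (label, children)

    return read()

def leaves(node):
    """Collect the surface words under a node, left to right."""
    if isinstance(node, str):
        return [node]
    _, children = node
    return [w for c in children for w in leaves(c)]

def phrases(node, labels=("S", "SBAR")):
    """Yield (label, text) for every subtree whose label is a clause label."""
    if isinstance(node, str):
        return
    label, children = node
    if label in labels:
        yield (label, " ".join(leaves(node)))
    for c in children:
        yield from phrases(c, labels)
```

Feeding it the bracketed output of a constituency parser then gives you the clauses, outermost first, which is a crude starting point for deciding where a long translated sentence could be broken.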
