Hello Yasmin,
thanks for the reply. However, the problem is that SRT subtitle files are not split into sentences with their timings; they are usually split into smaller "subsentences", which I shouldn't pass through the network on their own because all the context gets lost. So, if I pass a complete sentence as I am doing right now, I am not sure how to recover the timings of each subsentence of that sentence.
To illustrate what I am saying for anyone who is interested: imagine I want to translate this sentence from Spanish to English, where it is split into 3 subsentences with their start and end timings:
1
00:00:10,660 --> 00:00:16,390
La video clase de hoy la vamos a
dedicar a ciudades históricas
2
00:00:16,490 --> 00:00:21,870
patrimonio y turismo es la segunda
parte de paisajes culturales urbanos
3
00:00:21,970 --> 00:00:24,650
como lo señalaba en mi anterior
video clase.
The complete original sentence is:
La video clase de hoy la vamos a dedicar a ciudades históricas patrimonio y turismo es la segunda parte de paisajes culturales urbanos como lo señalaba en mi anterior video clase.
whose corresponding translation in English could be:
Today's video class will be dedicated to historic cities heritage and tourism is the second part of urban cultural landscapes as I pointed out in my previous video class.
I am able to get this translation, but how can I recover the start and end timestamps from the original SRT? I can assume that the complete sentence goes from 00:00:10,660 to 00:00:24,650, but is it possible to recover the timestamps of each of the 3 subsentences?
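One rough heuristic I can think of, only as a sketch and not something I have validated: cut the translated sentence into as many chunks as there are original subsentences, with word counts proportional to the character lengths of the source subsentences, and reuse the original timings in order. The function name `split_like_source` is hypothetical, and the approach assumes the translation roughly follows the source word order, which will not hold for every language pair.

```python
def split_like_source(source_parts, translation):
    """Split `translation` into len(source_parts) chunks whose word counts are
    roughly proportional to the character lengths of the source parts."""
    words = translation.split()
    total_chars = sum(len(part) for part in source_parts)
    chunks, start, seen = [], 0, 0
    for part in source_parts:
        seen += len(part)
        # Cumulative cut point in the translation, rounded to a whole word.
        end = round(len(words) * seen / total_chars)
        chunks.append(" ".join(words[start:end]))
        start = end
    return chunks


# The three subsentences and timings from the example above.
source_parts = [
    "La video clase de hoy la vamos a dedicar a ciudades históricas",
    "patrimonio y turismo es la segunda parte de paisajes culturales urbanos",
    "como lo señalaba en mi anterior video clase.",
]
timings = [
    "00:00:10,660 --> 00:00:16,390",
    "00:00:16,490 --> 00:00:21,870",
    "00:00:21,970 --> 00:00:24,650",
]
translation = (
    "Today's video class will be dedicated to historic cities heritage and "
    "tourism is the second part of urban cultural landscapes as I pointed "
    "out in my previous video class."
)

for number, (timing, chunk) in enumerate(
    zip(timings, split_like_source(source_parts, translation)), start=1
):
    print(f"{number}\n{timing}\n{chunk}\n")
```

The cut points fall wherever the proportions land, so a chunk can end in the middle of a phrase; a word alignment between source and translation would presumably give better boundaries.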
Maybe I am wrong, but it doesn't seem to me a trivial problem. However, I've seen a lot of videos with subtitles which (I suppose) have been automatically translated into a lot of different languages (for example, on YouTube); any idea how they generate them?
Another option could be to translate subsentence by subsentence, that is, to translate the first one:
La video clase de hoy la vamos a dedicar a ciudades históricas
to:
Today's video class will be dedicated to historic cities
and keep the original timestamps (00:00:10,660 --> 00:00:16,390), but I think the predicted translation would be worse because a lot of context from the rest of the sentence gets lost.
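For completeness, this second option could be sketched roughly as follows. It assumes the SRT file separates cues with blank lines, and `translate` is a hypothetical placeholder for whatever model call is used; each cue is translated in isolation, so the context loss mentioned above still applies.

```python
import re


def translate_cues(srt_text, translate):
    """Translate each SRT cue on its own, keeping its index and timing line."""
    translated_blocks = []
    # Cues are assumed to be separated by one or more blank lines.
    for block in re.split(r"\n\s*\n", srt_text.strip()):
        lines = block.splitlines()
        index, timing = lines[0], lines[1]
        # Join the wrapped text lines of the cue before translating.
        text = " ".join(lines[2:])
        translated_blocks.append("\n".join([index, timing, translate(text)]))
    return "\n\n".join(translated_blocks) + "\n"
```

Calling `translate_cues(srt_text, my_model_fn)` returns a new SRT string in which only the text lines have changed.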
Regards!
Ana