I am trying to build my own corpus from Hindi business news.
For machine translation, corpora usually consist of single sentences.
Does using sentences rather than paragraphs make any difference in terms of accuracy? I am just curious.
I prefer going sentence by sentence, but that way I need lengthy preprocessing: each sentence has to be translated into the target language to build the corpus.
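
To make the sentence-by-sentence option concrete, here is a minimal sketch of how I imagine the splitting step, assuming the Hindi text ends sentences with the danda (।), a question mark, or an exclamation mark followed by whitespace. The function name `split_sentences` and the sample text are made up for illustration. It deliberately does not split on "." so that decimal numbers and abbreviations stay intact, which means sentences ending in a plain full stop would not be separated.

```python
import re
from typing import List

# Assumption: a sentence ends with a danda, "?" or "!" followed by whitespace.
_SENTENCE_END = re.compile(r"(?<=[।?!])\s+")


def split_sentences(paragraph: str) -> List[str]:
    """Split a Hindi paragraph into sentences, keeping the end punctuation."""
    parts = _SENTENCE_END.split(paragraph.strip())
    # Drop empty fragments left over from blank or whitespace-only input.
    return [p.strip() for p in parts if p.strip()]


# Illustrative sample, not taken from a real news article.
sample = "शेयर बाजार में आज तेजी रही। सेंसेक्स 500 अंक चढ़ा। क्या यह रुझान जारी रहेगा?"

for sentence in split_sentences(sample):
    print(sentence)
```

Each sentence produced this way would then still need its translation paired with it, which is the lengthy part I mentioned.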