Grammar Correction with OpenNMT

guz18 · April 9, 2022, 10:28am

Hi guys, I am trying to create a model to correct specific grammar errors in Turkish. However, I could not get any output close to the expected output. Has anyone tried OpenNMT for grammar correction? How can I change the config file to get better results?

ymoslem · April 11, 2022, 8:52am

Dear Göksu,

I am no expert in the topic, but I have one paper on spelling correction. [paper][repository]

We need to talk about approaches first, rather than tools. There are multiple approaches to spelling/grammatical correction, including sequence-to-sequence learning, language modelling, and sequence labelling. Each approach also has sub-approaches; for example, it is common in grammatical correction tasks to use pre-trained contextualized and/or masked language models such as BERT and ELMo (e.g. Muralidhar et al., 2020).

I assume (and correct me if I am wrong) that you worked on grammatical correction as a sequence-to-sequence task like machine translation, i.e. a wrong sentence is translated into a grammatically correct sentence. The main issue with this approach is that it can replace words that are correct, or even generate words that do not exist in the original source. To alleviate this problem, some researchers investigated augmenting the architecture with a copying mechanism to move unchanged words from the source to the target (Zhao et al., 2019). This was one reason why, in our paper (Arabisc), I thought that using language modelling would give more control over the process, as it checks word by word rather than the sentence as a whole. Still, language modelling also has its issues, like the complexity of handling sub-words, and, depending on the architecture you use, it might get somewhat slow. That is why, again, pre-trained Transformer language models can be a good choice (e.g. see this article by Grammarly).
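One way to see how much a sequence-to-sequence model over-corrects is to diff its output against the source at the token level. The snippet below is only a minimal sketch using Python's standard `difflib`; the function names `diff_tokens` and `change_ratio`, the whitespace tokenization, and the English example sentences are assumptions for illustration and are not part of OpenNMT or of any of the cited papers.

```python
from difflib import SequenceMatcher


def diff_tokens(source: str, hypothesis: str):
    """Return the token-level edits that turn `source` into `hypothesis`.

    Tokens are split on whitespace (an assumption; a real setup would
    reuse the tokenizer the model was trained with).
    """
    src = source.split()
    hyp = hypothesis.split()
    matcher = SequenceMatcher(a=src, b=hyp, autojunk=False)
    edits = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag != "equal":
            # tag is one of "replace", "delete", "insert"
            edits.append((tag, src[i1:i2], hyp[j1:j2]))
    return edits


def change_ratio(source: str, hypothesis: str) -> float:
    """Fraction of source tokens that were replaced or deleted."""
    src = source.split()
    if not src:
        return 0.0
    changed = sum(len(old) for _, old, _ in diff_tokens(source, hypothesis))
    return changed / len(src)


if __name__ == "__main__":
    # Hypothetical source sentence and model output.
    for edit in diff_tokens("she go to school", "she goes to school"):
        print(edit)
    print(change_ratio("she go to school", "she goes to school"))
```

If the listed edits include many words that were already correct, or the change ratio is much higher than the error rate you expect in your data, that points to the over-correction problem described above rather than to a configuration detail.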

As I said, I am no expert here; however, if I had to build a spelling/grammatical correction model today, I would further investigate diverse methods that depend on pre-trained models such as BERT, ELMo, GPT-2, etc.

These are my two cents on the topic. I hope other colleagues will share their experience as well.

Kind regards,
Yasmin