Thoughts on MT Fluency, Context
I’m just starting my PhD (technically it starts in October), so I have more questions than anything worthwhile to add to the discussion, but here goes.
Fluency and Translation
I stumbled on this after reading a paper on fluency from ACL 2010.
Translation quality can be broken down into two aspects - adequacy and fluency:
- Adequacy depends on how well the information in the original text is carried over into the target language. People, places, verbs and nouns must all be present and joined with unambiguous grammar.
- Fluency is judged by how natural the translation seems. Articles like "a" and "the" can be left out without harming adequacy, but fluency will suffer.
Evaluating adequacy requires knowledge of the original text, whereas fluency can be measured with monolingual text alone. Arguably adequacy is more important than fluency.
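To make the "monolingual text only" point concrete, here is a minimal sketch of one way fluency could be scored: a bigram language model with add-one smoothing, trained on target-language sentences only. The class name BigramModel, the toy corpus and the choice of average log-probability per bigram are my own assumptions for illustration, not something taken from the paper.

```python
import math
from collections import Counter


class BigramModel:
    """Toy bigram language model with add-one (Laplace) smoothing."""

    def __init__(self, sentences):
        self.bigrams = Counter()    # counts of (previous word, word) pairs
        self.histories = Counter()  # counts of each word used as a history
        self.vocab = set()
        for sentence in sentences:
            tokens = ["<s>"] + sentence.split() + ["</s>"]
            self.vocab.update(tokens)
            for prev, word in zip(tokens, tokens[1:]):
                self.bigrams[(prev, word)] += 1
                self.histories[prev] += 1

    def avg_log_prob(self, sentence):
        """Average log-probability per bigram; higher means more fluent."""
        tokens = ["<s>"] + sentence.split() + ["</s>"]
        pairs = list(zip(tokens, tokens[1:]))
        v = len(self.vocab)
        total = sum(
            math.log((self.bigrams[pair] + 1) / (self.histories[pair[0]] + v))
            for pair in pairs
        )
        return total / len(pairs)


# Hypothetical monolingual training data (no source text involved).
corpus = [
    "the cat sat on the mat",
    "the dog sat on the rug",
    "a cat sat on a mat",
]
model = BigramModel(corpus)

with_articles = model.avg_log_prob("the cat sat on the mat")
without_articles = model.avg_log_prob("cat sat on mat")
print(with_articles > without_articles)
```

The score is normalised by length because a raw sentence probability would favour shorter output, which is exactly what dropping the articles produces. Note that nothing here checks adequacy: a fluent sentence about the wrong animal would score just as well.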
TODO: Are there metrics for evaluating adequacy and fluency?
Side-note on BLEU:
BLEU is used to evaluate almost all translation algorithms. It’s easy to calculate and works well with most languages. However, it only measures n-gram overlap with reference translations, so it captures adequacy and fluency indirectly at best.
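As a rough illustration of what "based on n-grams" means, below is a simplified, single-reference, sentence-level version of the BLEU idea: clipped n-gram precision combined with a brevity penalty. It is a sketch only. Real BLEU is computed over a whole corpus, normally uses up to 4-grams and can take several references. The function names and the max_n=2 default are my own choices.

```python
import math
from collections import Counter


def ngrams(tokens, n):
    """All contiguous n-grams in a token list, as tuples."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def modified_precision(candidate, reference, n):
    """Fraction of candidate n-grams found in the reference, with clipping."""
    cand_counts = Counter(ngrams(candidate, n))
    ref_counts = Counter(ngrams(reference, n))
    total = sum(cand_counts.values())
    if total == 0:
        return 0.0
    # Clip each n-gram count so repeating a matching word cannot inflate the score.
    clipped = sum(min(count, ref_counts[gram]) for gram, count in cand_counts.items())
    return clipped / total


def simple_bleu(candidate, reference, max_n=2):
    """Geometric mean of n-gram precisions times a brevity penalty."""
    precisions = [modified_precision(candidate, reference, n) for n in range(1, max_n + 1)]
    if min(precisions) == 0:
        return 0.0
    log_avg = sum(math.log(p) for p in precisions) / max_n
    c, r = len(candidate), len(reference)
    brevity_penalty = 1.0 if c > r else math.exp(1 - r / c)
    return brevity_penalty * math.exp(log_avg)


reference = "the cat sat on the mat".split()
scrambled = "mat the on sat cat the".split()

print(modified_precision(scrambled, reference, 1))  # every word matches
print(modified_precision(scrambled, reference, 2))  # no bigram matches
print(simple_bleu(reference, reference))
```

The scrambled sentence gets a perfect unigram score even though it is unreadable, and only the higher-order n-grams penalise it. That is the sense in which BLEU touches fluency only through local word order.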
TODO: Read more on correlation between BLEU and human evaluation.
Context
When I help friends translate odd bits of English and Japanese, they often give me very small snippets which are especially hard to translate on their own. I often have to ask more about the context in order to give a more accurate translation.
All the machine translation algorithms I’ve come into contact with are in exactly the same situation. They have knowledge of the language, they know that word X is translated to words a, b, c with various probabilities, and that previous words in the sentence can affect that.
Japanese is especially hard as it is a pro-drop language, in which pronouns can be omitted when they can be inferred from context. This aspect of Japanese is dealt with by zero-anaphora resolution, which is something already being researched.
The case I am thinking of is when one word in the source language can be translated into many in the target language, and the choice would depend on the larger context outside of the sentence.
The problem with writing an algorithm to deal with this would be deciding how many previous sentences of context it should remember, and the increased processing and storage required for this cache.
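To get a feel for the trade-off, here is a minimal sketch of such a cache: a fixed-size window over the most recent sentences, used to choose between candidate translations by counting cue words. Everything in it is hypothetical (the ContextWindow class, the choose_translation function, the hand-written cue words for "bank"). A real system would have to learn those associations from data rather than have them listed by hand.

```python
from collections import deque


class ContextWindow:
    """Remembers the words of the last max_sentences sentences."""

    def __init__(self, max_sentences=3):
        # deque with maxlen drops the oldest sentence automatically,
        # so memory use is bounded by the window size.
        self.sentences = deque(maxlen=max_sentences)

    def add(self, sentence):
        self.sentences.append(set(sentence.lower().split()))

    def words(self):
        """Union of all words currently inside the window."""
        return set().union(*self.sentences)


def choose_translation(candidates, context_words):
    """Pick the candidate whose cue words overlap most with the context.

    candidates maps each translation to a set of cue words (hypothetical).
    On a tie the first candidate listed wins.
    """
    return max(candidates, key=lambda t: len(candidates[t] & context_words))


# Hypothetical cue words for translating English "bank" into Japanese.
bank = {
    "銀行": {"money", "loan", "account"},   # financial institution
    "土手": {"river", "fishing", "water"},  # river bank
}

window = ContextWindow(max_sentences=2)
window.add("We walked along the river")
window.add("He went fishing")
first_choice = choose_translation(bank, window.words())

# Two more sentences push the river context out of the window.
window.add("It was late")
window.add("I needed a loan")
second_choice = choose_translation(bank, window.words())

print(first_choice, second_choice)
```

The max_sentences parameter is the "how much context" question in miniature: too small and the cue has already been forgotten by the time the ambiguous word turns up, too large and stale cues from an earlier topic start to outvote the recent ones.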
TODO: What algorithms exist to deal with wider context? Is it even feasible?