(23rd-Nov-2020)
• Machine translation is the task of reading a sentence in one natural language and emitting a sentence with the equivalent meaning in another language. Machine translation systems often involve many components. At a high level, there is often one component that proposes many candidate translations. Many of these translations will not be grammatical due to differences between the languages. For example, many languages put adjectives after nouns, so when translated to English directly they yield phrases such as “apple red.” The proposal mechanism suggests many variants of the suggested translation, ideally including “red apple.” A second component of the translation system, a language model, evaluates the proposed translations and can score “red apple” as better than “apple red.” The earliest use of neural networks for machine translation was to upgrade the language model of a translation system by using a neural language model (Schwenk et al., 2006; Schwenk, 2010). Previously, most machine translation systems had used an n-gram model for this component. The n-gram based models used for machine translation include not just traditional back-off n-gram models (Jelinek and Mercer, 1980; Katz, 1987; Chen and Goodman, 1999) but also maximum entropy language models (Berger et al., 1996), in which an affine-softmax layer predicts the next word given the presence of frequent n-grams in the context.
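The scoring role described above can be sketched with a toy bigram language model. This is a minimal illustration, not any of the cited systems: the tiny corpus, function names, and the add-one smoothing choice are all assumptions made for the example.

```python
import math
from collections import Counter

# Toy target-language corpus (hypothetical data, for illustration only).
corpus = [
    "the red apple is ripe",
    "she ate a red apple",
    "a green apple fell",
]

# Count unigrams and bigrams, with sentence-boundary markers.
unigrams = Counter()
bigrams = Counter()
for sentence in corpus:
    tokens = ["<s>"] + sentence.split() + ["</s>"]
    unigrams.update(tokens)
    bigrams.update(zip(tokens, tokens[1:]))

def score(sentence, alpha=1.0):
    """Add-one smoothed bigram log-probability of a candidate translation."""
    tokens = ["<s>"] + sentence.split() + ["</s>"]
    vocab = len(unigrams)
    logp = 0.0
    for prev, word in zip(tokens, tokens[1:]):
        logp += math.log(
            (bigrams[(prev, word)] + alpha) / (unigrams[prev] + alpha * vocab)
        )
    return logp

# The language model ranks the grammatical word order higher,
# because "red apple" appears in the training data and "apple red" does not.
candidates = ["red apple", "apple red"]
best = max(candidates, key=score)  # → "red apple"
```

In a real system the proposal component would emit many such candidates, and the language-model score would be combined with other feature scores before picking the final translation; a neural language model simply replaces `score` with a learned network.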