All of us have now used Google Translate, Reverso or DeepL at some point. You know, all these online translators that allow you to switch from one language to another! This is what’s referred to as machine translation (MT): the conversion of a text from one language to another, carried out by computers without any human intervention.
Some of these tools still leave something to be desired, and various questions always arise. Will a generic machine translation engine be suitable for my business needs? Is my data secure when using machine translation? What is neural machine translation and how can I integrate it into my own specific workflow?
Machine translation: origins, functioning and development
The story of machine translation goes back to the 1950s. Three types of systems have appeared successively over the years:
Rule-based systems (1980s)
These systems are based on machine translation software that combines dictionaries of common terms with sets of linguistic and grammatical rules. It’s advisable to add user dictionaries to these systems in order to improve translation quality, though the end result will still not necessarily be up to user expectations. Nevertheless, when used with specialist dictionaries, rule-based systems are generally capable of producing coherent and logical translations.
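To make the idea concrete, here is a minimal Python sketch of the rule-based approach, assuming a toy English-to-French dictionary and a single reordering rule (the adjective comes before the noun in English and, typically, after it in French). The word lists, the `translate` function and the `user_dict` parameter are all invented for illustration; real rule-based engines rely on far larger dictionaries and rule sets.

```python
# Toy general dictionary (English -> French). Entries are illustrative only,
# and gender agreement is deliberately ignored to keep the sketch short.
GENERAL_DICT = {"the": "la", "red": "rouge", "car": "voiture"}

# Minimal part-of-speech lists used by the reordering rule below.
ADJECTIVES = {"red"}
NOUNS = {"car"}


def translate(sentence, user_dict=None):
    """Translate word by word, applying one grammatical rule first."""
    # A user dictionary overrides the general one (specialist terminology).
    lexicon = {**GENERAL_DICT, **(user_dict or {})}
    words = sentence.lower().split()

    # Rule: English "adjective noun" becomes French "noun adjective".
    reordered = []
    i = 0
    while i < len(words):
        if i + 1 < len(words) and words[i] in ADJECTIVES and words[i + 1] in NOUNS:
            reordered += [words[i + 1], words[i]]
            i += 2
        else:
            reordered.append(words[i])
            i += 1

    # Dictionary lookup; unknown words are left untranslated.
    return " ".join(lexicon.get(word, word) for word in reordered)


print(translate("the red car"))                      # la voiture rouge
print(translate("the red car", {"car": "berline"}))  # la berline rouge
```

The second call shows the effect of a user dictionary: the rule stays the same, but the specialist term replaces the generic one.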
Statistical systems (1980-90)
These systems do not use any linguistic rules to carry out the translation. They instead translate using statistical models constructed automatically from corpora. The machine translation software analyses a separate body of existing texts for each language, which enables the production of relatively fluent but not always very logical translations.
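The article does not say which statistical model such engines use. As one illustration, the sketch below applies IBM Model 1, a classic textbook word-translation model, to a three-sentence English-German corpus invented for the purpose. The names `CORPUS`, `train_ibm1` and `best_translation` are hypothetical. The point is simply that word correspondences emerge from counting alone, with no linguistic rules supplied.

```python
from collections import defaultdict

# Toy sentence-aligned corpus (English, German), invented for illustration.
CORPUS = [
    ("the house", "das Haus"),
    ("the book", "das Buch"),
    ("a book", "ein Buch"),
]


def train_ibm1(corpus, iterations=20):
    """Estimate P(target word | source word) with the EM steps of IBM Model 1."""
    pairs = [(src.split(), tgt.split()) for src, tgt in corpus]
    src_vocab = {w for src, _ in pairs for w in src}
    tgt_vocab = {w for _, tgt in pairs for w in tgt}

    # Start from a uniform distribution: no linguistic knowledge is built in.
    prob = {(t, s): 1.0 / len(tgt_vocab) for s in src_vocab for t in tgt_vocab}

    for _ in range(iterations):
        count = defaultdict(float)
        total = defaultdict(float)
        # E-step: share each target word among the source words of its sentence.
        for src, tgt in pairs:
            for t in tgt:
                norm = sum(prob[(t, s)] for s in src)
                for s in src:
                    share = prob[(t, s)] / norm
                    count[(t, s)] += share
                    total[s] += share
        # M-step: turn the expected counts back into probabilities.
        for (t, s) in prob:
            prob[(t, s)] = count[(t, s)] / total[s]
    return prob


def best_translation(prob, source_word):
    """Return the target word with the highest probability for source_word."""
    candidates = {t: p for (t, s), p in prob.items() if s == source_word}
    return max(candidates, key=candidates.get)


model = train_ibm1(CORPUS)
print(best_translation(model, "house"))  # Haus
print(best_translation(model, "book"))   # Buch
```

Because the model only looks at word co-occurrence, it knows nothing about word order or grammar, which is consistent with output that reads smoothly in places but is not always logical.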
Neural algorithm-based systems (2015)
These systems can translate in real time and estimate the probability of word sequences. It’s an approach that enables translation engines to learn to translate by means of neural networks, formed from connections loosely modelled on those in the human brain. Neural machine translation systems produce better quality translations: compared with earlier systems, their output has been reported to contain 50% fewer word order errors, 17% fewer lexical errors and 19% fewer grammatical errors. Neural networks have even learned to correctly match the gender and case systems of different languages (without anybody teaching them how to do it!). Here’s an example of a phrase translated from English to French by both a generic machine translation engine that does not take the context of the sentence into account and a neural machine translation engine that has been trained in the field:
NMT, a major weapon in the machine translation armoury
Unlike other approaches such as statistical machine translation and rule-based machine translation, neural machine translation employs a large artificial neural network whose way of operating is loosely modelled on the human brain.
It’s the most advanced form of machine translation available, with enormous progress made in recent years due to artificial intelligence-based self-learning, big data and deep learning. Today, it’s possible to use neural machine translation engines as a basis for the production of professional translations.
These systems are able to repeatedly reproduce reliable translations and learn new languages. This enables them to continually improve the quality of the information translated. To get them into an operational state, they must be trained by a human. This means supplying the program with a substantial volume of data, the aim being to improve the reliability of the final results.
It’s also possible for people to train these programs to meet the specific needs of sectors that have specific professional terminology, such as the legal, financial and medical professions.
Two of the big five GAFAM companies are already adept at this kind of neural machine translation:
- Google, with its Google Neural Machine Translation (GNMT), a neural network available in eight languages
- Microsoft, with its Microsoft Translator mobile app, which can be used to translate documents in sixty different languages
Skype also has Skype Translator, which is effective at facilitating group chats involving up to 100 participants.
Put simply, NMT is useful in various sectors, especially e-commerce, provided that certain criteria that facilitate the translation process are met: a sufficient number of repetitions, a sufficient volume of specialist data available to train the translation engine, and a large enough volume of text to translate.
However, there are certain things that need to be taken into account…
The limitations of neural machine translation
As with other forms of machine translation, the disadvantage of NMT is that the source-text phrases need to be very clear and coherent if a quality translation is to be obtained. Every little ambiguity must be accounted for in the software beforehand to avoid ending up with a translation that no longer makes any sense. Neural machine translation also encounters difficulties when faced with highly technical language, rare words or proper nouns. There are various aspects that need to be dealt with before carrying out a neural machine translation:
- The clarity of the text to be translated (to avoid ambiguity issues)
- Training and human judgement when dealing with certain specific sectors (legal, medical, etc.)
- Data privacy management: it’s essential to be aware that publicly available translation engines save and store all the data and information they process on their servers. This means it’s difficult or indeed impossible to guarantee the confidentiality of customer data.
- The creativity aspect: a translation engine trains with what it considers to be the norm and will therefore always produce a translation which, from its point of view, is the most appropriate with respect to what it’s learned. However, the language used by specific brands (especially in the e-commerce sector) needs to provide differentiation (a “top” at Topshop might be a “t-shirt” at Zara, even though these two products are essentially the same kind of clothing item).
In order to deal with these kinds of issues, human involvement is essential.
The essential need for human involvement
Machine translation, even the neural type, still has one or two shortcomings when it comes to dealing with context. It’s for this reason that human verification is necessary, as certain subtleties are still beyond the capabilities of machine translation. Human expertise in project management, together with the advice and knowledge of neural machine translation specialists, is essential to the success of verified machine translation projects.
This leads us nicely to verified machine translation, or proofread machine translation. This is commonly referred to as post-edited machine translation, or PEMT. Managed by professional translators with knowledge and awareness of the issues involved in neural machine translation, this kind of work involves correcting the translations that are produced by automated systems in order to obtain a fluid and coherent final target text. There are two kinds of post-editing:
- Light post-editing, which consists of correcting the machine translation at a simple level, without going into depth
- Full post-editing, which requires extensive, in-depth correction on the part of the human translator
Light PEMT is used for the following kinds of errors: spelling or grammatical mistakes, mistranslations, inappropriate or offensive content, omitted words, etc. Full PEMT, on the other hand, is more useful for anything touching on terminological errors, individual sentence structure, punctuation or writing style and is used to make the text more natural and fluid.
Light PEMT is therefore used to make texts in foreign languages comprehensible in a general sense, whereas full PEMT is used to make the sentences produced by the machine translation engine perfect… hence the term verified or proofread machine translation.
Neural machine translation therefore represents a significant step forward in terms of the effectiveness and usefulness of machine translation engines. These translation engines, when enriched and trained on a per-industry basis, serve as a high-quality resource for the translators tasked with checking and proofreading the translated texts. In concrete terms, NMT offers undeniable advantages for businesses:
- Firstly, the considerable savings generated by machine translation;
- Secondly, significant savings in time: a professional human translator is capable of translating 2,000 to 3,000 words per day on average, whereas a translation engine needs just a few seconds to translate large volumes of text;
- And finally, the assurance of impeccable quality thanks to the human expertise of the team in charge of the project and the essential work carried out by the professional translators tasked with post-editing.
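To put the time saving into perspective, here is a rough back-of-the-envelope calculation based on the 2,000 to 3,000 words per day quoted above. The 60,000-word project size and the function name `human_translation_days` are assumptions made purely for illustration, and the estimate covers unaided human translation only, not post-editing.

```python
import math


def human_translation_days(word_count, words_per_day=(2000, 3000)):
    """Return (fastest, slowest) working days for one translator."""
    slow_rate, fast_rate = words_per_day
    fastest = math.ceil(word_count / fast_rate)
    slowest = math.ceil(word_count / slow_rate)
    return fastest, slowest


# Hypothetical 60,000-word project handled by a single translator.
print(human_translation_days(60000))  # (20, 30)
```

In other words, a text that a translation engine can process almost immediately would take one translator roughly four to six working weeks to translate from scratch.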
It is therefore essential that the work carried out by neural machine translation engines is combined with human support and assistance provided by translation professionals, i.e. localisation and translation project managers, developers and translators/proofreaders. Verified machine translation therefore enables you to ensure that your translations are rapid, coherent, less costly to produce and properly adapted to your field of expertise, your brand’s language and your target audience.