Machine Translation: Guardrails Not Included!

By Pauline Forté ● August 31, 2023 ● Topics: GDMLC Graduate Contribution, Machine Translation

Machine Translation first appeared in the 1950s, but it has moved to the forefront of the language industry in recent years thanks to huge strides in generative AI models, including ChatGPT and GPT-4 (March 2023). As a result, many companies are moving their localization strategy away from pure Human Translation (HT) toward a hybrid model that combines Human Translation and Machine Translation (MT).

For large companies, switching their localization technology solution requires making customer-centric strategic decisions and getting leadership buy-in, without the ability to guarantee cost savings or translations of satisfactory quality – at least not initially. The case of a global hospitality company that is evolving its digital presence from a limited reservation and property site scope, varying per language, to the full English customer journey in more than 20 languages provides valuable insight into the risks and benefits of introducing Machine Translation on a large scale.

Machine Translation (MT), or automated translation, refers to the process by which computer software translates a text from the source language into the target language without human intervention (Globalization and Localization Association). While machine translation software is still immature and often produces errors related to idiomatic and conceptual equivalency (Singh, 2022), language service providers (LSPs) reported that the percentage of end-client projects using machine translation climbed from 13% in 2019 to 24% in 2020 (CSA Research, 2021).

Yet English is king in this globally connected world (accounting for 25% of all internet content) – to the despair of many. Nearly half (44%) of global consumers are frustrated by the dominance of the English language online, according to an RWS survey report (“In Understanding We Trust”) of 6,500 respondents across 13 global markets. English writing is correlated to linear thinking, which can help explain why generative models are so proficient at producing content that is based on English contextual cues and terms.

Looking back in time is a great way to put machine translation into perspective. In 1930, Charles Ogden published his “Basic English” book, condensing the English language into 850 words and, in a way, setting the stage for modern-day MT. The purpose of Ogden’s system was to provide an international secondary language that is “simple enough to be quickly and easily learned, yet sufficiently flexible to be adequate for conveying information and expressing ideas.” This kind of “Controlled Language” presents itself as a strong base for getting good results from MT.

For a hospitality company managing millions of words per year, it takes machine translation to truly reach its customers globally in their language and tap into their cultural values. Indeed, customers are looking for consistent end-to-end experiences, online and in person, no matter where they are in their customer journey. As such, the brand’s global media channels (website on desktop and mobile, app, blogs, on-property marketing, etc.) must embed quality, relevant content at every step of the journey. According to CSA Research “Can’t Read, Won’t Buy – B2C” (2020), 65% of consumers prefer content in their own language. The same report shows that if a company chooses not to localize the shopping experience, it risks losing 40% or more of a potential market.

E-commerce companies must answer key questions in order to determine their localization strategy. Does the market require extensive translation-related investments? (Singh, 2022) Does the content delivered in a specific locale need to be of high quality (e.g. for creative/marketing campaigns)? Or can a certain level of translation quality, such as 80% and above, be satisfactory for large volumes of translations (e.g. hotel room descriptions)? In addition, these companies have choices to make in terms of their digital strategy and process, with the localization tools they want to use, including raw MT, customized MT, and/or machine translation post-editing (MTPE) with light post-editing or full post-editing. RWS’ global report indicates that when looking at customer experience (CX), 96% of global consumers agree that automatic, real-time translation should be an online service standard; yet only 31% definitely agree that brands are making this cultural effort.

Customized machine translation is trained to perform better for specific content types. As such, it makes sense for a hospitality brand to leverage its existing Translation Memory, Style Guides and Terminology glossaries that have matured over the years, and to avoid the greater risks of relying on out-of-the-box generative AI models with no trained output.

Working with a long-time localization vendor when launching a machine translation strategy is key. It allows the LSP to “simply” plug the customized MT into the translation management system, to replace the prior full human translation workflow. In the current language technology landscape, with 87 different brands of machine translation (Nimdzi, 2023), it’s a great advantage to partner with a vendor that can handle the whole localization lifecycle, from capturing the source content to processing the machine translations and maintaining proprietary TMs, Style Guides and Glossaries, all with Computer-Aided Translation (CAT) connected to the enterprise’s systems. These efforts can result in significant gains in terms of volume of translations and speed to market.

In this context, customized MT reduces the time to reach targeted markets from weeks to almost real time and allows for substantial cost savings. A scope that once only enabled a subset of pages and channels to be localized, depending on the language, can now turn into the full customer journey in all supported languages – hitting Sim-Ship by ensuring simultaneous content deliveries on a global scale.

That, of course, only happens after specific key performance indicators and measures of success have been established between the brand’s localization, product, engineering, and designated in-country language stakeholders, and the localization vendor. These KPIs include defining the MT levels of service (customized MT, MT with light post-editing, MT with full post-editing) based on content types, setting up quality metrics based on language tiers for MT suitability, and establishing a quality feedback implementation process to continuously improve the quality of MT output.
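As a rough illustration only, the agreed levels of service could be recorded as a simple lookup keyed by content type and language tier. The content type names, the tier numbering (1 = most suitable for MT) and every entry in the table below are hypothetical assumptions made for this sketch, not the setup of any particular brand or vendor.

```python
from enum import Enum


class ServiceLevel(Enum):
    """The three MT levels of service named in the article."""
    CUSTOMIZED_MT = "customized MT"
    LIGHT_POST_EDIT = "MT with light post-editing"
    FULL_POST_EDIT = "MT with full post-editing"


# Hypothetical routing table: (content type, language tier) -> level of service.
# Tier 1 is assumed to be the tier where MT output is most reliable.
ROUTING = {
    ("room_description", 1): ServiceLevel.CUSTOMIZED_MT,
    ("room_description", 2): ServiceLevel.LIGHT_POST_EDIT,
    ("marketing_campaign", 1): ServiceLevel.FULL_POST_EDIT,
    ("marketing_campaign", 2): ServiceLevel.FULL_POST_EDIT,
}


def service_level_for(content_type: str, language_tier: int) -> ServiceLevel:
    """Return the agreed level of service, defaulting to the most cautious one."""
    return ROUTING.get((content_type, language_tier), ServiceLevel.FULL_POST_EDIT)


if __name__ == "__main__":
    print(service_level_for("room_description", 1).value)
    print(service_level_for("legal_notice", 3).value)  # not in the table
```

Falling back to full post-editing for anything not listed is a deliberately conservative choice in this sketch: content that nobody has classified yet gets the most human review.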

As previously mentioned, the cost savings associated with MT can be huge, since it removes the notoriously expensive process of full human translation. However, at least during the MT implementation and measuring phases, e-commerce businesses must be ready to invest in the tools and strategy to assess translation quality. Translation encompasses so many complexities because of its subjective nature that one can wonder whether MT will ever produce translations that truly meet understandability, accuracy, and fluency standards – three common machine translation scoring criteria. A key metric in measuring quality and achieving idiomatic translation equivalence is the post-editing distance, which tracks the minimal number of edits necessary to change a string from the MT output into the final version in the target language. To determine the post-editing score, translation vendors measure post-editing effort, or the percent change of any given segment required to achieve the final translation.
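A minimal sketch of how post-editing distance and effort might be computed is shown here. It assumes a character-level Levenshtein distance and expresses effort as that distance divided by the length of the longer string; both choices are assumptions for illustration, since vendors may count edits at word level or normalize differently.

```python
def edit_distance(source: str, target: str) -> int:
    """Minimal number of single-character insertions, deletions and
    substitutions needed to turn source into target (Levenshtein distance)."""
    previous = list(range(len(target) + 1))
    for i, s_char in enumerate(source, start=1):
        current = [i]
        for j, t_char in enumerate(target, start=1):
            cost = 0 if s_char == t_char else 1
            current.append(min(
                previous[j] + 1,         # delete s_char
                current[j - 1] + 1,      # insert t_char
                previous[j - 1] + cost,  # substitute, or keep if equal
            ))
        previous = current
    return previous[-1]


def post_edit_effort(mt_output: str, final_text: str) -> float:
    """Percent change between the MT output and the post-edited segment.
    Normalizing by the longer string is an assumption of this sketch."""
    longest = max(len(mt_output), len(final_text))
    if longest == 0:
        return 0.0
    return 100.0 * edit_distance(mt_output, final_text) / longest


if __name__ == "__main__":
    mt = "The room have a sea view"
    final = "The room has a sea view"
    print(edit_distance(mt, final))
    print(round(post_edit_effort(mt, final), 1))
```

A low percentage suggests the post-editor changed little, which is the kind of signal that can support moving a content type from full to light post-editing.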

The challenge with MT quality is not only that it comes at a high cost, but also that it relies on continuously increasing volumes of data to give it more context and improve its output. So, it is essential for localization teams to set expectations with their leadership and in-market partners that it takes time – cadence to be agreed upon and refined as needed – for the translation process and machine to improve, by leveraging both the in-house translation management tools and open generative AI models. Upon reevaluation, the level of MTPE can be adjusted, resulting in cost savings.

A common fear in the language service industry is that MT will replace translators. As of right now, the consensus seems to be that it won’t, and that it instead displaces certain lower-value tasks, such as translating user-generated content, which is now done by MT because of the cost advantage. What MT is doing is raising expectations of translation and driving greater volumes of translation overall. Arle Lommel, Director of Data Services at CSA Research, explains that “Per CSA research, MT has created opportunities for translators. There’s the perception of harm, but the contention is that MT thus far has actually benefited translators by increasing the amount of work.”

The language industry is a transformative industry, and as such those involved in the process of localizing content aimed at reaching global audiences do not create anything. Instead, they are part of a fast-evolving ecosystem relying on humans to provide a necessary level of expertise, management, knowledge, and context that a machine is not able to provide in all situations. With new technology come new complexities that can be daunting at first for e-commerce businesses incorporating MT into their localization workflow. With too many choices of AI solutions and language services comes “analysis paralysis,” says Renato Beninatto, Chief Executive Officer at Nimdzi. “We’re competing with ‘free’ but there are more liability and risk when using machine translation. While MT expedites the process of translating words, translation is part of a more complex process that requires human intervention.”

For global marketing brands, developing a strategy to produce translated content rapidly and at high volume is critical to gaining and maintaining customers and market dominance – all without sacrificing turnaround times or the quality needed to engage and convert customers. Unbabel surveyed over 1,600 global marketers across eight countries to understand how they overcome the challenges of localizing content across cultures and scaling their business success internationally: 39% of marketers said they are using MT as part of their localization strategy, 83% of whom are confident in the quality of their translations.

Machine translation allows global brands to translate at the click of a button, but that is not enough to satisfy international customers’ needs and wants. Guardrails need to be put in place so that consumers in locale-specific markets understand, approve and engage with their products and services.

Pauline’s comments on the Global Digital Marketing and Localization Certification

This is quite an extensive course, so you need to ensure you plan time during and outside business hours to complete the lectures, quizzes, reading materials and final written assignment. I found Modules 4, 5 and 6 the most instructive and useful, and learned a great deal about digital globalization and localization concepts and resources.

REFERENCES

Beninatto, R., 2022, The MT Sommelier and The Paradox of Choice

Can’t Read, Won’t Buy – B2C, by CSA Research/Kantar, 2020

Cappelli, G., 2007, The translation of tourism-related websites and localization: problems and perspectives

Global Understanding. Unlocked, by RWS, 2023

In Understanding We Trust (PDF), by RWS, 2023

Language Technology Landscape, by Nimdzi, 2022

Lommel, A., 2023, CSA Research, Open Q&A discussion on MT

New Report: Global Trends in Marketing Localization for 2023, by Unbabel, 2022

Singh, N., 2022, Global Digital Marketing and Localization Certification

ABOUT THE AUTHOR

Pauline Forté

Since 2013, Pauline has gained experience as a subject matter expert in localization and content management. Initially focusing on French translations and revisions for property sites, reservation engines, and marketing campaigns, she then began managing the lifecycle of localization projects such as rebrands, campaign launches, and Hilton’s Covid-19 corporate information pages and alerts across 24 languages. She really enjoys working on localization system enhancements, workflow optimization and language quality management – while collaborating cross-functionally with internal stakeholders and LSPs. She’s also keen on keeping abreast of the ever-changing trends and developments in CAT tools and technology.

Born and raised in France, Pauline holds a Master of Arts in Journalism Research from the University of Arkansas at Little Rock, and recently completed the Global Digital Marketing & Localization Certification.

Before returning to the U.S. in 2013, Pauline spent four-and-a-half years living and working in Dubai, first as a features writer for the region’s largest newspaper Gulf News, and then as sub-editor and copywriter for a digital ad agency. In Dubai, she also played in a competitive women’s soccer league.

In her spare time, Pauline lives to travel, loves DIY projects, enjoys listening to all kinds of music, and turns into a passionate French national team supporter during international football/soccer competitions.

Disclaimer: Copyright © 2021 The Localization Institute. All rights reserved. This document and translations of it may be copied and furnished to others, and derivative works that comment on or otherwise explain it or assist in its implementation may be prepared, copied, published, and distributed, in whole or in part, without restriction of any kind, provided that the above copyright notice and this section are included on all such copies and derivative works. However, this document itself may not be modified in any way, including by removing the copyright notice or references to The Localization Institute, without the permission of the copyright owners. This document and the information contained herein is provided on an “AS IS” basis and THE LOCALIZATION INSTITUTE DISCLAIMS ALL WARRANTIES, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF THE INFORMATION HEREIN WILL NOT INFRINGE ANY OWNERSHIP RIGHTS OR ANY IMPLIED WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.
