Machine Translation Master Class Newsletter – Issue 6

Dr. Peng Wang

Since ChatGPT launched at the end of November 2022, it has caused a sensation. The hype is especially relevant to the language industry, because the Generative Pre-trained Transformer (GPT) family of language models can accomplish multiple natural language processing (NLP) tasks, including machine translation. In this newsletter, I will draw the connection between artificial general intelligence (AGI) and neural MT (NMT).

1. Modeling: AGI vs. NMT

In essence, most conventional NMT models and today's AGI models share a common heritage. In 2017, a Google research team published a paper titled "Attention Is All You Need", which proposed a model architecture now widely known as the Transformer. Though originally designed to solve the MT problem, the Transformer has proved effective for many other artificial intelligence (AI) tasks. It has even moved beyond NLP and become a feasible alternative to convolutional neural networks in computer vision.

While most NMT architectures contain both an encoder and a decoder, ChatGPT's underlying language model uses a multi-layer Transformer decoder only.
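The practical difference between the two designs comes down to the attention mask. Here is a minimal sketch in plain Python (no ML libraries, and no real model weights): an encoder lets every token attend to every other token, while a GPT-style decoder applies a causal mask so each token can attend only to itself and earlier positions.

```python
def encoder_mask(n):
    """Full (bidirectional) attention: all positions visible to each other."""
    return [[True] * n for _ in range(n)]

def decoder_mask(n):
    """Causal attention: position i sees positions 0..i only."""
    return [[j <= i for j in range(n)] for i in range(n)]

seq_len = 4
dec = decoder_mask(seq_len)

# The first token of a decoder can attend only to itself...
print(dec[0])   # [True, False, False, False]
# ...while the last token attends to the whole prefix.
print(dec[-1])  # [True, True, True, True]
```

The causal mask is what lets a decoder-only model generate text left to right, one token at a time, which is exactly the behavior ChatGPT exhibits.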

2. Game changer: data size

From GPT-1 to ChatGPT, the growing quantity of data gradually pushed the model to a tipping point. When we feed new language data to an engine, what we see intuitively is the increase in data volume, in the form of documents, phrases, words, or tokens. Inside the engine, these units build connections with other units in the neural network, and the weights of these connections are the model's parameters. The number of connections is far larger than the number of units: just imagine how many connections one million neurons can form among themselves. In other words, adding units scales the model up much faster than linearly.
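A back-of-envelope illustration of why connections outnumber units: in a single fully connected layer, the number of weights grows with the product of the two layer sizes. This is a simplified sketch, not how any specific GPT model counts its parameters.

```python
def dense_layer_params(n_in, n_out, bias=True):
    """Weights in one fully connected layer: n_in * n_out (+ n_out biases)."""
    return n_in * n_out + (n_out if bias else 0)

# 1,000 neurons feeding 1,000 neurons already yields a million connections,
# even though there are only 2,000 neurons in total.
print(dense_layer_params(1_000, 1_000, bias=False))  # 1000000
```

Stack many such layers, as a Transformer does, and the parameter count quickly reaches the billions while the token vocabulary stays comparatively small.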

One obvious benefit of scaling up a model is that the neural network becomes more capable of dealing with errors. In this regard, ChatGPT offers preliminary evidence that larger models are more robust in areas where issues such as bias make language models susceptible to error.

3. Data format: AGI vs. NMT

The fundamental idea behind generic language models is that language provides a flexible way to specify tasks, inputs, and outputs all as one sequence of symbols. For instance, a translation training example can be written as the sequence "translate to French" followed by the source and target text, and a reading comprehension example can begin with "answer the question" (see the GPT-2 paper). In other words, the model "learns" from the data itself what a chunk of information was created for. Conventional NMT engines, by contrast, are trained on aligned source and target segments. Building translation memory, for example, is a process that depends mainly on human judgment, though "non-intelligent" automation also plays a part.
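The contrast between the two data formats can be made concrete. In this hypothetical sketch, the function names and the `task: source => target` layout are illustrative only, not taken from any real training corpus:

```python
def gpt_style_example(task, source, target):
    """Generic LM format: task, input, and output as one flat symbol sequence."""
    return f"{task}: {source} => {target}"

def nmt_style_example(source, target):
    """Conventional NMT format: an aligned (source, target) segment pair."""
    return (source, target)

print(gpt_style_example("translate to French", "hello", "bonjour"))
# translate to French: hello => bonjour
print(nmt_style_example("hello", "bonjour"))
# ('hello', 'bonjour')
```

In the first format the task itself is part of the data, so one model can absorb many tasks; in the second, the task is fixed in advance and only the aligned segments vary.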


4. General vs. specialized purpose

ChatGPT opens a new path to automated translation, though more experiments are needed on the performance of AGI models trained on general-purpose text. Again, data is the key. For machine translation, we need to consider whether, across the whole naturally occurring training set, the amount of translation-related data is sufficient to train the model to perform translation successfully.

An advantage of generic models is that the same model can handle multiple tasks in an MT-driven localization process, e.g., summarization, grammar checking, and autocomplete. This is consistent with general practice in the language industry, as most commercial MT solutions come bundled with supporting technologies such as translation memory, terminology management, and project management. What remains to be figured out is which deployment plan brings more benefit in the long run: one model for all tasks (AGI) or multiple models integrated with conventional MT.
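The two deployment plans can be sketched as interfaces. In this toy example the "models" are stub functions (the real systems would be ML models); the point is only the shape of the choice: one generic model selected by an instruction in the prompt, versus a registry of specialized engines, one per task.

```python
def generic_model(prompt):
    """One model for all tasks; the task is named in the prompt itself."""
    task, _, text = prompt.partition(": ")
    if task == "summarize":
        return text.split(".")[0] + "."   # stub: keep the first sentence
    if task == "autocomplete":
        return text + " ..."              # stub: extend the text
    return text                           # stub: pass through

# Alternative plan: one specialized engine per task, integrated side by side.
specialized = {
    "summarize": lambda text: text.split(".")[0] + ".",
    "autocomplete": lambda text: text + " ...",
}

text = "MT drives the workflow. TM and terminology support it."
print(generic_model("summarize: " + text))   # MT drives the workflow.
print(specialized["summarize"](text))        # MT drives the workflow.
```

Functionally the outputs can match; the long-run question the section raises is about cost, maintenance, and control of the two architectures, not raw capability.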

5. A new era of hybrid intelligence 

The gradual acceptance of AGI ushers in a new era of hybrid intelligence, in which human intelligence and AI co-exist in our daily work and lives. Risks exist as well, of course. The question is how to mitigate the negative aspects of embracing AI and what kinds of control we can apply. In a commercial setting, considerations such as security and privacy are crucial.

To address the root cause of the problem, let us step back and examine the relationship between humans and reality from the perspective of data. Information involves two aspects: organizing and processing data in order to convert it into something meaningful. The underlying structures that organize units and rules into meaningful systems come from inside, that is, from the human mind. However much computing power becomes available, the dominant role of humans should remain the same.

The key is whether we can put AI in the hands of a much larger community of users, in particular domain experts who may not possess any coding skills. Given a more intuitive machine learning (ML) interface, they can customize models, ensure quality, and help develop ML systems. While developers need to create such ML interfaces for non-coders, domain experts, for example localization practitioners, need to improve their AI competence so that the rules ingrained in their minds can be built into ML systems. To some degree, responsible AI, including intention alignment and social justice, is realized through human-led technology.

Machine Translation Master Class


Dr. Peng Wang



Dr. Peng Wang is a part-time professor at the School of Translation and Interpretation at the University of Ottawa and a freelance conference interpreter with the Translation Bureau of the Canadian government. Before that, she was the CAT Tools Coordinator for the Graduate Studies in Interpreting and Translation program at the University of Maryland. She was the coach and curator of the automation track for LocWorldWide42. Her current research interests include cognitive interpreting/translation studies and AI, risk management of NMT implementation, terminology, and multilingual data analysis.

Dr. Wang began conducting corpus-based translation studies at the University of Liverpool and later worked in the corpus linguistics program at Northern Arizona University. She has rich experience teaching multilingual classes, with students ranging in age from 22 to 75, in over 10 language combinations, coming from the UAE, China, Italy, Spain, Germany, Morocco, Colombia, Mexico, and Haiti, to name just a few. She is an expert in approaching technology in the context of culture and common core humanity.


Disclaimer: Copyright © 2021 The Localization Institute. All rights reserved. This document and translations of it may be copied and furnished to others, and derivative works that comment on or otherwise explain it or assist in its implementation may be prepared, copied, published, and distributed, in whole or in part, without restriction of any kind, provided that the above copyright notice and this section are included on all such copies and derivative works. However, this document itself may not be modified in any way, including by removing the copyright notice or references to The Localization Institute, without the permission of the copyright owners. This document and the information contained herein is provided on an “AS IS” basis and THE LOCALIZATION INSTITUTE DISCLAIMS ALL WARRANTIES, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF THE INFORMATION HEREIN WILL NOT INFRINGE ANY OWNERSHIP RIGHTS OR ANY IMPLIED WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.

