While still in high school, Xinyi Liu worked briefly in a lab at Beihang University in Beijing and was surprised to see Chinese researchers regularly using Google Translate to generate the first draft of scientific papers in English. Translation is essential if scientists want to submit papers to high-level journals, almost all of them in English.

“It was normal for postdocs to just use Google Translate to first translate everything, then edit and polish it. But after the first translation, the whole article didn’t make sense,” said Liu, a student at the University of California, Berkeley, who specializes in molecular and cellular biology. “Literally all the words, all the terms were just randomly glued together.”

There had to be a better way, she thought.

So last year, when she saw a new seminar given by Rebecca Tarvin on breaking down language barriers in science, she signed up.

This class, which will be taught at UC Berkeley for the third time in the spring of 2023, was a trial balloon for Tarvin, an assistant professor of integrative biology. With a renewed campus-wide focus on diversity, equity, and inclusion, she and working groups within her department thought the class could help UC Berkeley address a long-standing problem in science: English, the dominant language of science, is a major barrier for scientists who are not native English speakers.

Foreign students and scientists are not the only ones at a disadvantage when science is communicated primarily in English. The same is true for many US-born students. As of fall 2020, approximately 40% of freshmen at UC Berkeley were first-generation students, and across the University of California’s 10-campus system, 39% of first-generation students grew up with a language other than English as their first language.

“A lot of our California students grew up translating for their parents,” Tarvin said. “Translation has been part of their life since they were very young.”

For Tarvin, the course, Breaking Language Barriers in Evolution and Ecology, was an “opportunity both to teach students translation literacy skills and to encourage students to become activists in this area of structural change,” she said. “And actually, I’ve seen a really positive reception to this kind of activism from the students, because they all seem to agree that it’s really important to overcome language barriers after taking the course.”

The class led Tarvin and some UC Berkeley graduate students, along with collaborators in Canada, Israel, and Hungary, to write a scholarly paper evaluating new machine translation tools that people around the world can use to make their scientific articles accessible to non-English speakers. The article appeared online this month in the journal Bioscience. Translations into Spanish, French, Portuguese and Hungarian, the languages of the co-authors, are also online.

“The idea here is that we’re trying to give people the tools and the motivation to translate their own scientific research,” Tarvin said. “Science does not need to be based on a single language. And there are many additional benefits that come from integrating multilingual approaches into every phase of science. For example, publishing in multiple languages will benefit society through better science communication.”

“Language can be a barrier, as well as a fantastic tool, for bringing people together,” said Emma Steigerwald, the paper’s first author and a graduate student in environmental science, policy and management at UC Berkeley. “It’s an obstacle that we can overcome by using this new technology. We explain the technology and how it can be implemented and the things we need to be aware of when using it, and all the wonderful and positive ways that science communication can be transformed by leveraging this new technology.”

Towards a multilingual scientific network

Until recently, computer translation was the butt of jokes. People shared amusing examples of mistranslations, often seeming to belittle languages other than English and, by implication, other cultures.

But machine learning, or artificial intelligence, has dramatically increased translation accuracy, to the point that tourists now use internet services like Google Translate to communicate with people in the countries they visit.

But for texts that contain a lot of jargon – much of it scientific, but also from many other academic fields – Google Translate is woefully inadequate.

“Translation quality is not for a journal,” said Ixchel Gonzalez Ramirez, one of the course’s graduate student mentors. “Often people have to pay to have their work translated by a professional translator, and it’s very expensive.”

The new article highlights some of the many services (most of them free) that can convert English scientific writing into other languages. Besides the well-known Google Translate platform, these include DeepL, which uses neural networks and claims to be much more accurate than competitors when translating between English and Chinese, Japanese, Romance languages or German; Baidu Translate, a service from Chinese internet company Baidu that initially focused on translating between English and Chinese; Naver Papago, a multilingual translator created by a company in South Korea; and Yandex.Translate, which uses statistical machine translation and focuses primarily on Russian and English.

“Translation is becoming more and more within the reach of anyone. Whether you are an expert or not, and whether you are even bilingual or not, the ability to translate is so accelerated by so many technologies that we have today,” Steigerwald said. “So how do we integrate that into our workflow as scientists, and how does that change the expectations that surround science communication?”

English is the lingua franca of science

Tarvin’s interest in translation grew out of the work of one of her graduate students, Valeria Ramírez Castañeda, who in 2020 published an article describing the costs incurred by fellow Colombian doctoral students who wished to publish or interact with colleagues in a world dominated by English.

As an evolutionary biologist interested in how some animals came to use poison, Tarvin decided to focus her new seminar on translating papers in the fields of evolution and ecology, although students who signed up eventually charted their own courses. She particularly sought out students, like Liu, and mentors, like Gonzalez Ramirez, who are bilingual or multilingual.

“Everyone in the class had some sort of family relationship with the language,” Tarvin said.

Tarvin also asked Mairi-Louise McLaughlin, professor of French and linguistics at UC Berkeley and expert in journalistic and literary translation, to talk to the class about how professionals approach translation and how translation affects meaning. This subject resonated with the students when they tried their hand at translating scientific summaries and sometimes entire articles.

Ruoming Cui, a sophomore who took the course in the spring of 2022, chose Baidu to translate scientific summaries. She immediately discovered that the long, complex sentences of English and the use of multiple words to describe a concept seemed redundant when translated into Chinese.

“We don’t usually do this in Chinese because it makes every sentence very long and it’s very tedious,” she said.

Liu added that without considerable polishing, many English translations are garbled.

“I heard that even if your result is amazing, if you write a confusing article because of the translation, people will be annoyed because they can’t understand what you’re doing,” Liu said. “And that will greatly affect how people validate the research or even if they read it. I think that’s a big hurdle in the scientific world.”

Steigerwald, Tarvin and their co-authors also realized that writing scientific papers in simpler English – something non-scientists have long encouraged – benefits English-speakers and non-English-speakers alike.

“If your first language isn’t English and you’re just trying to read the English version of the article, it will seem a lot less ambiguous and a lot more readable when the author has used plain language,” Steigerwald said. “But also, very importantly, when you’re going to translate that piece of text, machine learning tools will have a much easier time translating something that’s written in plain language. So that’s kind of a way of future-proofing your writing, so if someone wants to translate it into a million languages, they’ll have a much easier time when it’s written that way.”

There remain barriers to the widespread translation of scientific articles, including where to make them available and how to manage copyright. Most journals do not even accept articles that are not in English, and few explicitly allow co-publication of articles with a translation. Tarvin found that few journals have policies on translations, and due to general copyright restrictions, many publishers charge exorbitant fees to publish a translation online after publication.

“It’s quite amazing how many journals don’t allow you to freely publish translations after publication, and how few have platform support where you could even have just an abstract in a second or third language,” said Tarvin. “I think a major barrier to that is web platforms; not just publishing and copyright rules, but also the functionality of the platform.”

With the Breaking Barriers Seminar and now the Bioscience paper, Tarvin and her colleagues hope to gradually change the standard in science so that translating articles into other languages becomes the default, particularly into the language of the country where the research was carried out and the languages of the co-authors.

And the more translations, the more material to train machine translation systems to do a better job, gradually increasing the quality of scientific translation.

“In my lab, we translate a lot of our research, and now the people in Emma’s lab do that too,” Tarvin said. “I think sharing our positive attitude about it and how it can make a difference for people has influenced a small but growing group of people who are starting to integrate translation into their scientific workflow.”

Other co-authors of the Bioscience article are UC Berkeley PhD students Valeria Ramírez-Castañeda and Débora Brandt; András Báldi from the Institute of Ecology and Botany at the Ecological Research Center in Vácrátót, Hungary; postdoctoral fellow Julie Teresa Shapiro of Ben-Gurion University of the Negev in Be’er Sheva, Israel; and Lynne Bowker, professor of translation and interpretation at the University of Ottawa in Canada.