We show that content on the web is often translated into many languages, and the low quality of these multi-way translations indicates they were likely created using Machine Translation (MT). Multi-way parallel, machine generated content not only dominates the translations in lower resource languages; it also constitutes a large fraction of the total web content in those languages. We also find evidence of a selection bias in the type of content which is translated into many languages, consistent with low quality English content being translated en masse into many lower resource languages, via MT. Our work raises serious concerns about training models such as multilingual large language models on both monolingual and bilingual data scraped from the web.
↫ Brian Thompson, Mehak Preet Dhaliwal, Peter Frisch, Tobias Domhan, Marcello Federico
As a translator myself, this is entirely unsurprising. Translating is a craft, a skill, and much like with any other craft, you get what you pay for. If you pay your translator(s) a good rate, you get a good translation. If you pay your translator(s) a shit rate, you get a shit translation. If you pay nothing, you get nothing.
I’m definitely seeing more and more people in my industry integrate machine translations, but so far, it’s not been an actual issue – I have no qualms about accepting a job where I take a machine-translated text and whip it into shape and turn it into a human-readable, quality translation… As long as people pay me a reasonable rate for it. Working from a machine translation is often quicker and easier, so the going rate obviously reflects that.
The quality of machine translations is absolutely atrocious, however, and the idea of relying on it for texts other people – customers, clients, employees, etc. – are actually supposed to read and work from is terrifying. Google Translate is an effective tool for personal use, but throwing, I don’t know, your product’s manual at it and dumping the unedited result onto your customers is borderline criminal.
Pay nothing, get nothing.
#foss?
Isn’t the reason for this that we fully accept/expect developers to provide their time and skills for free, but don’t expect the same from other professionals?
Why aren’t language professionals offering their services for free to improve translations in software?
P.S. I am being a bit clickbaity on this to make the point, as there is a clear double standard here.
Adurbe,
Interesting point. Does anyone think there’s a good reason for this?
I don’t know about “expectations”, but I believe there’s a reason indeed that free software (in both senses, for this occurrence) exists.
It started with people wanting to solve their own problems and/or show off their craftsmanship, learning new things in the process (from research and/or from others). Once it turned into something functional, users (personal and corporate) got interested, and it led to the most skilled devs being employed to improve and maintain said software.
There have been free amateur translations floating about for some high profile books (I remember this was a thing for the latest Harry Potter releases around the time they were published, for example) that were pretty low quality, so that answers the “problem solving” part, but not the “showing off craftsmanship” part. There are also a fair number of collaborative translation projects, but none as high profile and visible as the most trendy software projects out there, I believe.
The reason might be that it’s harder to collaborate on a “minimum valuable product” for translations than for software. Or that the userbase is less visible, generating less interest. Or that translators are actually a rarer breed than developers. Or, more likely, that their skills are less celebrated than those of developers. Add to that the absence of a high profile, copyright-compliant translation project out there in the wild, and the incentive for a good pro translator to spend free time on such an endeavour is just nowhere to be found, I guess.
Thom Holwerda,
I see some similarity between what you are saying and what I experience as a developer. Tons of local businesses have been offshoring skills like web development. Since I am unable to match the prices of cheaper foreign labor, I’m seeing quite a lot of “sorry, we went with a cheaper Indian company”. Many of those offshored projects do go bad. I would laugh at their cheap follies if it weren’t impacting my profession so seriously. Sometimes, even after a bad experience, businesses still prefer to go back to the cheap offshore option rather than hire a local dev. I’ve seen this so many times that it irks me.
Anyway, here is the point that might apply to you as a translator, just like it does to me. If a business is adamant on paying less, these arguments for why ML translations are worse, even if entirely true, don’t necessarily mean you won’t lose business to it. It’s not our opinion that matters, but the opinion of those in the company who actually make the call.
I earnestly hope the best for your career, Thom, just be mindful that your points about ML don’t necessarily mean human translators are safe from automation.
https://www.baselinemag.com/artificial-intelligence-ai/duolingo-embraces-ai-the-impact-on-language-learning/
@Alfman
The problem is that those secondary languages often aren’t the primary marketing targets, so they get little budget support and are bunched in the “nice if we get some” category.
Until some local agency arises, the quality isn’t even on the radar. I recently had a debate about this with a Chinese company; they were adamant the machine translation to English was already good enough, in that it conveys the meaning. They were so convinced the MT is good enough that they can’t even be bothered making foreign language versions of their product brochures; they are happy to leave it up to Google Translate. It’s what they don’t know that hurts them, so I pointed out the MT won’t make the sale no matter how clear the result; it’s the impression it leaves that is far more important.
cpcf,
Most companies have no idea how to judge the quality of translations. To them, anything that’s returned as “Done” may equal “Good enough”. To really get the Q/A right, they not only have to commission the original translations, but even more translators to check those translations across all languages. That’s a lot of people involved!
That’s a good point. I’ve relied on Google Translate to help me read Chinese documentation that wasn’t available in English. It got me the information I needed, but it was by no means good.
Another thing I sometimes notice is when documentation is written in valid English but isn’t written by a technically savvy writer. Having translators who are competent in both translating and the technical subject matter is probably too much to ask for, though.
Google is very bad at automatic translation, that’s true. However, there are much better translation services out there, such as DeepL. It may occasionally spit out sentences with awkward grammar, but that can be fixed by anyone who has a decent handle on the language being translated to, without needing to know the language being translated from.
What you don’t understand, Thom, is that some jurisdictions require some kind of local-language manual to be provided, and machine translation or cheap human translation is the preferred way to achieve minimal compliance with the law.
(Before such laws were enacted, there were cases of no local-language manual being provided at all. For example, my dad bought a Panasonic HiFi in the early 2000s in Greece and it didn’t ship with a Greek manual at all.)