Oh boy.
Wikipedia editors have implemented new policies and restricted a number of contributors who were paid to use AI to translate existing Wikipedia articles into other languages after they discovered these AI translations added AI “hallucinations,” or errors, to the resulting article.
↫ Emanuel Maiberg at 404 Media
There seems to be this pervasive conviction among Silicon Valley techbro types, and many programmers and developers in general, that translation and localisation are nothing more than basic find/replace tasks that you can automate away. At first, we just needed to make corpora of two different languages kiss and smooch, and surely that would automate translation and localisation away if the corpora were large enough. When this didn’t turn out to work very well, they figured that if we made the words in the corpora tumble down a few pachinko machines and then made them kiss and smooch, yes, then we’d surely have automated translation and localisation.
Nothing could be further from the truth. As someone who has not only worked as a professional translator for over 15 years, but who also holds two university degrees in the subject, I keep reiterating that translation isn’t just a dumb substitution task; it’s a real craft, a real art, one you can have talent for, one you need to train for, and study for. You’d think anyone with sufficient knowledge of two languages can translate effectively between the two, but without a much deeper understanding of language in general and the languages involved in particular, as well as a deep understanding of the cultures in which the translation is going to be used, and a level of reading and text comprehension that goes well beyond that of most people, you’re going to deliver shit translations.
Trust me, I’ve seen them. I’ve been paid good money to correct, fix, and mangle something usable out of other people’s translations. You wouldn’t believe the shit I’ve seen.
Translation involves the kinds of intricacies, nuances, and context “AI” isn’t just bad at, but simply cannot work with in any way, shape, or form. I’ve said it before, but it won’t be long before people start getting seriously injured – or worse – because of the cost-cutting in the translation industry, and the effects that’s going to have on, I don’t know, the instruction manuals for complex tools, or the leaflet in your grandmother’s medications.
Because some dumbass bean counter kills the budget for proper, qualified, trained, and experienced translators, people are going to die.

Exhibit one: The Legend of Zelda
It was translated, I assume, using a Japanese/English dictionary, and it shows. Some lines are okay, some are cringeworthy, and some are confusing, but at least the dictionary was penned by a pro. My favorite is: “Master using it[,] take this”, which can be read with “Master” as a noun instead of a verb. The newer translation is “Take this[,] master using it”.
Pfaffa,
Yes, I suspect we’ve all seen examples of bad translations. In the case of older games, it was probably done by human translators who just weren’t very proficient in English.
This one makes me laugh…
“All your base are belong to us”
https://www.youtube.com/watch?v=qItugh-fFgg
In the case of Wikipedia, cited in the article, there’s just no realistic way to hire enough humans to translate all the text they have into all the languages they want to serve. Without AI, you’d end up with only a tiny fraction of the content available in other languages, and most content completely unavailable. Even if there were enough labor, most companies are unwilling to pay more to hire humans, even when they are more proficient.
Thom, I am sympathetic to your view that AI is not on par with human pros. As a developer I am frequently frustrated by corporate cost-saving measures that decrease the quality of the product. I see this with most of my clients, and it’s been one of my big gripes in the industry. We can highlight these problems, and maybe it’s important just to talk about it, but is there a solution? I genuinely don’t know what it is.
I’d argue that part of the solution is requiring that any form of machine translation that’s not human-vetted be applied by the end-user using something like Google Translate. That way, expectations are more realistically set.
(Similar to how, when I privately goof about with Stable Diffusion, I use a copy running on my own PC and that keeps me aware of how much energy it’s consuming and how much heat it’s generating… as well as letting me do it more in the winter than the summer so I can use what would otherwise be waste heat to work with my furnace instead of against my or someone else’s air conditioner.)
As for “human-vetted”, I’d say it’s certainly possible to be good enough for certain situations. I’ve read various Japanese-only things on Pixiv by OCRing them, hand-fixing the mis-OCRed characters using the character picker on Jisho, and then juggling whitespace (and using cut/paste to “temporarily hide parts of the text”) in Google Translate until I get a sense for which meanings are actually there and which ones are just it tripping over idioms or onomatopoeia.
For non-life-critical stuff, we are at a point where a responsible user can trade off time for knowledge of the source language. (To a certain point. I do need a touch of intuition from having watched a bunch of subtitled anime and a bookmarked guide to Japanese onomatopoeia.)
I am much more concerned by AI used in war than stealing my job. Pervasive surveillance and automated killing is being normalized. I worry about this aspect a lot more than job market disruptions or grandma’s label on her medication.
Kudos, that is actually a strong argument, one I would readily sign on to.
For a long time, all I wanted was better/stronger/smaller batteries. Right now it appears that batteries are humanity’s last line of defense.
For someone who doesn’t use AI, you sure seem to know an awful lot about what AI can and can’t do. I read the article in Serbian – the same language I’m writing this comment in. I dunno. Seemed pretty idiomatic and accurate to me.
LLMs are verisimilitude machines. As spicy autocomplete, they prioritize looking convincing, which means they’re very good at lulling you into not double-checking that their output is accurate every time you use them.
Same for Thai or Bahasa Indonesia. I trust the AI translation in both directions more than translations by native speakers, especially when domain knowledge of the subject matter is involved.
I have to disagree with the analysis here.
Even if done with AI, the spreading of knowledge and information is much more important than a grammatical error.
We have seen what happens if we leave it to people to translate the content on Wikipedia: an anglocentric knowledge base. Let’s use AI tooling to translate the content en masse. The community of members can then tweak or correct issues as they go. As ssokolow said, put up a banner to highlight that it wasn’t human-validated and we are good.
I’m sure we can all agree it’s much easier to fix an error than to write the document from scratch.
Adurbe,
If you read the Wikipedia source linked by the article, there’s a whole lot more nuance to the discussion.
https://en.wikipedia.org/wiki/Wikipedia:Administrators%27_noticeboard/Archive378
It’s quite a lively debate. Every angle we can think up here as armchair analysts has probably already been discussed. For example, some support machine translation but are critical of certain paid translators who don’t appear to be competent to review those translations. At first blush this makes perfect sense, but then someone else brings up the rate they’re paying translators through subcontracting…
Honestly, at those rates it seems like bad faith to expect professional results.
They performed an empirical analysis on the quality of translation from the various chatbots…
https://meta.wikimedia.org/wiki/OKA/Empirical_Evaluation_of_AI-Assisted_Translation
There’s a lot more information there that I found quite insightful. However, my honest take is that they really should be using tools that only do translation. These commercial chatbots are readily available, but I feel they are completely the wrong tool for the job! A less knowledgeable tool (one that is not capable of hallucinating) would be far more appropriate. I wonder if Wikipedia could build something themselves that works better than commercial chatbots. They should have all the training data they need.
The answer is no, Wikipedia can’t make a better AI engine – not with LLM tech, simply because of how it works. LLMs will always generate toward an average, and that always leaves plenty of room for the inappropriately anthropomorphized “hallucination” (“inaccurate averaged generation” would be a better term).
LLMs do not, and cannot, have any judgement, ever, because of how they work. They can only predict tokens toward an average.
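To make the “predicting toward an average” point concrete, here’s a toy sketch of greedy next-token selection. It is not any real model’s internals; the tokens and probabilities are entirely invented for illustration:

```python
# Toy illustration, NOT a real model: a greedy decoder always picks the
# statistically most likely continuation, so a common-but-wrong phrasing
# can beat a rare-but-correct one. All values below are made up.
next_token_probs = {
    # hypothetical P(next word | "take this medication ...")
    "daily": 0.55,          # the most common phrasing in the training data
    "weekly": 0.30,
    "intravenously": 0.15,  # the correct rendering in this invented source text
}

def greedy_pick(probs):
    """Greedy decoding: return whichever token has the highest probability."""
    return max(probs, key=probs.get)

print(greedy_pick(next_token_probs))  # prints "daily" -- the average wins, not the source
```

No judgement is involved anywhere in `greedy_pick`: the correct translation loses purely because it is statistically rarer, which is the failure mode being described here.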
CaptainN-,
I believe an LLM with more specialized training would probably perform better than one with generic training, AND do so using far fewer resources. Alas, specialized training takes work to develop, but I would expect better results than from generic LLMs that gobble up everything without receiving specialized translation training.
As a scientific matter we should be testing such conclusions empirically with double-blind tests such that the conclusion arises from the data and not preconceptions. But I also get that there’s not much interest in being fair to LLMs due to the social circumstances surrounding AI today.
Thom,
Is it just me, or does that sound like:
– the ice kings, claiming that ice from the fridge would kill people because only natural ice was sent by God?
– the hackney coachmen, claiming that speeds above 25 km/h would kill people?
Good luck with your argument, that technical manuals or medical leaflets need the depth of “Hamlet” or “Faust”.