Two for the techbro “‘AI’ cannot be biased” crowd:
A Moscow-based disinformation network named “Pravda” — the Russian word for “truth” — is pursuing an ambitious strategy by deliberately infiltrating the retrieved data of artificial intelligence chatbots, publishing false claims and propaganda for the purpose of affecting the responses of AI models on topics in the news rather than by targeting human readers, NewsGuard has confirmed. By flooding search results and web crawlers with pro-Kremlin falsehoods, the network is distorting how large language models process and present news and information. The result: Massive amounts of Russian propaganda — 3,600,000 articles in 2024 — are now incorporated in the outputs of Western AI systems, infecting their responses with false claims and propaganda.
↫ Dina Contini and Eric Effron at NewsGuard
It turns out pretty much all of the major “AI” text generators – OpenAI’s ChatGPT-4o, You.com’s Smart Assistant, xAI’s Grok, Inflection’s Pi, Mistral’s le Chat, Microsoft’s Copilot, Meta AI, Anthropic’s Claude, Google’s Gemini, and Perplexity’s answer engine – have been heavily infected by this campaign. Lovely.
From one genocidal regime to the next – how about a nice Amazon “AI” summary of the reviews for Hitler’s Mein Kampf?
The full AI summary on Amazon says: “Customers find the book easy to read and interesting. They appreciate the insightful and intelligent rants. The print looks nice and is plain. Readers describe the book as a true work of art. However, some find the content boring and grim. Opinions vary on the suspenseful content, historical accuracy, and value for money.”
↫ Samantha Cole at 404 Media
This summary was then picked up by Google, and dumped verbatim as Google’s first search result. Lovely.
Thom Holwerda,
I don’t know who’s saying that. The AI research I read generally reports that AI is biased particularly when the training set is biased.
Believe me, I understand the hatred… however, maybe we need to contemplate the possibility that this summary is a genuine reflection of those buying and reviewing the book. You may be scapegoating AI for what might be better argued as an indictment of humanity.
Maybe we could make the case that their opinions should be censored, but the irony there is that codifying censorship into AI would then be a direct source of bias.
Nope, the reviews were not like that at all. Read the linked article.
Thom Holwerda,
I agree with the themes of the first article, however to me that’s not “AI causes bias” so much as “bad training data”. We know that’s a problem. And yes, I concede the points about LLM grooming through politicized content generation. AI isn’t really the cause of this problem though. We know that Russia has been spamming social media with disinformation for at least a decade. AI platforms happen to be the newest messengers, but they are not the root cause. The question is how we deal with it. That I don’t know, but I do know shutting down AI doesn’t solve the problem.
I cannot independently verify the second article that contained the summary because Amazon has censored the entire listing (at least in the US). Given the type of people who would buy the book, it’s not evident to me that the AI summary is necessarily biased – for better or for worse.
I don’t say this to normalize nazism – a hateful paradigm if ever there was one – but because the world may have to face the disturbing reality that Nazi philosophy really is returning with a vengeance. Yes it’s appalling, but I honestly think these AI incidents should be a red flag to warn us about us; the AI is merely incidental.
My ramblings on this topic may not belong here, but I am so fearful that our democracy could be on the verge of collapsing into an authoritarian regime. Every year it feels like we’ve gotten closer, especially as the institutional guardrails and independent branches of government are crumbling. If this were a NASA countdown, we’d be at “*go* for authoritarianism” 🙁
The US is done. The experiment failed.
People use bias every day to process input. We use our experience and knowledge to make decisions. This can be flawed, and the flaws can be corrected in either a positive or a negative direction.
AIs are tuned, and the %ssholes in charge of the tuning are making the AI worse for lulz or profit.
Your love of AI is blinding you to the actual point. These favorable Mein Kampf sentiments in the training data were put there INTENTIONALLY, as part of an attack. Check your AI bias at the door, and understand the issue.
CaptainN-,
But that’s just it… it is not bias if it’s a good-faith summary of the posted reviews. I understand if you have a beef with the reviews. And to the extent that reviews are fake, I agree with you that’s a problem too. However, that’s a different problem from whether the AI faithfully summarized the reviews as they stand.
Ad hominem arguments like this don’t advance the discussion, so I kindly ask you to refrain from using them. I don’t even know where you got the idea that I love AI; it seems to be a misunderstanding of my actual opinion. The main objection I have to most of the AI haters is NOT that “I love AI”, but that they suggest AI won’t last. I think this is wishful thinking. Not only does AI keep evolving to do more jobs, but corporations will always be interested in more profits – employees are by far their greatest expense. My prediction that AI tools will become more ingrained doesn’t stem from AI bias or misunderstanding – you are completely off the mark there – but from predicting natural corporate behavior. My so-called “love of AI” isn’t a part of the equation. Even if I hated AI and everything it stands for, that wouldn’t necessarily change my predictions for the future.
Fair enough point on ad hominem.
It’s not that I don’t understand your point – I get it, the technology does what it’s described to do (we disagree on that too in the margins, but basically, that’s right). That isn’t the point being made here. These tools are being sold as something that can both be accurate and be “made safe” by correcting for “bias”. The article (and Thom) is pointing out that these tools haven’t achieved that second goal (I’d also argue they haven’t achieved – and cannot achieve – the first). So when I say you are missing the point – well, there it is.
The assertion I think you are missing is that these tools cannot be and have not been “made safe” in line with the promises being made. That fact will not stop people from using these things inappropriately.
CaptainN-,
So your point is that it’s being sold as something it’s not? I can agree there may be some misrepresentation going on. I’ve been trying to push the view that AI isn’t an oracle of truth nor should we expect it to be. The companies selling it need to be more upfront about the garbage in, garbage out aspects of LLMs.
Technically I still feel it’s important to distinguish between AI bias and data bias. Bad data can make a good AI produce bad results. I guess some people would like to define a good AI in terms of good results regardless of bad data, but I question whether that’s a mathematical possibility at all. If the quality of output weren’t dependent on the quality of input, that would more or less imply that any garbage input would work and the output is predetermined.
It’s just an ironic view of “bias”. An unbiased AI will reflect the input data as is, faults and all. If you want to filter the output by some moral code, we can certainly do it, but technically this so-called “moral AI” will have the “bias” and not the original. I’m not saying we can’t or shouldn’t add moral bias to the AI, but I still take issue semantically with the original unbiased AI being described as “biased”, when technically it’s the exact opposite. We should expect an unbiased AI to reflect biases in the input data. An AI that alters the input, even for moral reasons, is the one that’s technically biased.
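To make that semantic point concrete, here’s a toy sketch (purely hypothetical – not how Amazon’s or anyone else’s review summarizer actually works): the exact same “unbiased” summarization code produces a flattering or a damning summary depending entirely on the review set it’s handed. The skew lives in the data, not in the code.

```python
from collections import Counter

def summarize(reviews):
    """Report the majority sentiment of the supplied reviews, exactly as given."""
    counts = Counter(sentiment for sentiment, _ in reviews)
    majority, votes = counts.most_common(1)[0]
    return f"Customers are mostly {majority} ({votes}/{len(reviews)} reviews)."

# Hypothetical review sets; only the data differs, the code is identical.
organic_reviews  = [("critical", "grim"), ("critical", "boring"), ("positive", "readable")]
brigaded_reviews = [("positive", "a true work of art")] * 8 + [("critical", "grim")] * 2

print(summarize(organic_reviews))   # Customers are mostly critical (2/3 reviews).
print(summarize(brigaded_reviews))  # Customers are mostly positive (8/10 reviews).
```

Same function, opposite summaries – which is why I keep calling this a data problem rather than a model problem.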
I think our disagreement may be over what exactly “biased” means.
I would argue that AI/LLM is just a tool. It can do a lot of powerful stuff but it shouldn’t be seen as intrinsically right or wrong, good or bad, etc. Rather than fault the tool, people need to understand that it doesn’t have truth or morality except insofar as these concepts are well represented by the training data.
People who have been huffing paint fumes and reading Ycombinator Hackernews. Bad analogy, that’s more of a circle than a Venn diagram. :\
I’m sure it is. I’m sure the Nazis buying the book really like the book. LOL
There’s context and metadata missing, which is part of the problem. AI just strings together the data it has, minus any context or understanding. Frankenstein’s monster is a reflection of its creator, who is vapid, narcissistic, and ultimately empty.
Flatland_Spider,
Yes, IMHO that’s the important lesson. The perception that AI is true or infallible needs to be corrected. Not because the AI is at fault, but because AI isn’t an oracle that can differentiate between fact and fiction.
As a thought experiment: place an unbiased observer in a black box to observe events and then report on them. The reporting may well be very biased and one-sided, but not necessarily because of bias on the part of the observer. We could get very skewed results by gaslighting the observer. People need to understand that AI is vulnerable to being gaslit, and this is very hard to solve when it comes to processing unfettered information on the web.
>> People who have been huffing paint fumes and reading Ycombinator Hackernews. Bad analogy,…
…actually it’s a pretty good analogy. Biasing “ai” hallucinations is more or less like making it huff dichloroethane and read Mein Kampf at the same time.
>> I’m sure it is. I’m sure the Nazi’s buying the book really like the book. LOL
You{‘d|won’t} be surprised. The last Nazi I’ve seen said it was boring. Speaks volumes both about the fsckn Nazis and the book in question.
That’s not a bug. That’s a feature.
We’ve known this for years. Remember the MS chatbot Tay which lasted 16 hours? That was 2016. LOL
https://en.wikipedia.org/wiki/Tay_(chatbot)
They were trying to wash the racism/sexism/homophobia/transphobia/misogyny behind another algorithm, the way credit scores replaced the racist policy of redlining. They’re just too dumb to do it correctly.
They’re stupid motherf%ckers.
Probably unrelated, but do you remember when IBM tried to expand Watson’s training by letting it dig into some internet shit? Not trying to undermine your point, rather underline it. The punchline I still remember from back then can be roughly described as “lol, you have cancer”.
…that was also a jab at Alfman, whom I actually respect, for the phrase “bad training data”.
No “training data” is inherently “bad”; it can just be biased. Just like when you tell your children Russia is the Third Reich’s rightful descendant and Hitler did nothing wrong. If you repeat that enough, they’ll finally believe it.
Tin Worlock,
I wasn’t expecting a debate on the semantics of bad and biased, haha.
In my mind there seem to be two different ways the data can be construed as bad.
1) It can be factually wrong.
2) It can be subjectively immoral.
I’m still of the old-school mindset that facts are universal, a tenet of reality that we should all cling to. However, these past several years have been really eye-opening. Many people don’t care and are seemingly comfortable with, and enthusiastic about, elevating “alternative facts” above actual ones. Pursuit of truth is elitist now. This is such a regressive “does not compute” worldview to me, but as the RDF bubble grows, people like me who stubbornly believe in the importance of facts are being pushed out while those who reject truth get to run the world.
Wait till Thom finds out what the Chinese have been up to. Russia’s 3.6 million troll articles are child’s play compared to Beijing’s.
And 3.6 million troll articles isn’t even a drop in the bucket compared to what the CIA, MI6, BND, and DGSE have been up to.
I think a fundamental thing to understand (and explain to others) is what LLMs aim to achieve. Their “mission”, their problem to solve, is not furthering knowledge (because they cannot discern and do not understand the meaning of truth), but plausibly outputting something a human considers to be speech related to their prompts.
I recently published a review of a relatively simple and quite interesting article titled “ChatGPT is bullshit”, published in Springer Nature’s “Ethics and Information Technology” journal. I suggest people read it. Even with its seemingly aggressive title, it very successfully drives home the point that LLMs are not made to care about the truth of what they output: they are only meant to convince their readers it is a reasonable conclusion one would reach given enough data. The term “bullshit” is not something the authors pulled out of the void; it is a term used in their field to distinguish this from a lie (which actively opposes the truth).
I am in general agreement with your statements re LLMs. It looks cool at first, it has been pushed on us by the large tech corporations that have our attention and money, and it does substantially less than what they have told us it does. There are currently no LLM products that warrant the billions of dollars of expenditure in progress. When investors get wise to the lack of ability here, there may well be a serious stock market correction.
AndrewZ,
While training costs are high, there are no LLM products that cost “billions”; they’re more along the lines of tens of millions.
https://www.galileo.ai/blog/llm-model-training-cost
We also have to consider that while training costs are high, marginal costs are low. For better or for worse, this means that once a NN is trained, it’s fairly cheap to deploy at scale. While many people are portraying AI as expensive and inefficient, it must be measured against human training costs, expenses, and inefficiency, which are also notoriously high for employers. Over time I think that more employers are going to see these AIs as a way to automate many processes faster and cheaper than regular employees had been doing them.
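A quick back-of-the-envelope sketch of what I mean by amortization (every number below is made up purely for illustration, not a real figure for any model or vendor):

```python
# Hypothetical amortization of a one-time training bill over lifetime usage.
# All figures are invented just to show the shape of the argument.
training_cost = 50_000_000          # one-time training spend, USD (hypothetical)
queries_served = 10_000_000_000     # queries answered over the model's lifetime (hypothetical)
inference_cost_per_query = 0.002    # marginal compute cost per query, USD (hypothetical)

amortized_training = training_cost / queries_served
total_per_query = amortized_training + inference_cost_per_query

print(f"Amortized training cost per query: ${amortized_training:.4f}")  # $0.0050
print(f"Total cost per query:              ${total_per_query:.4f}")     # $0.0070
```

With made-up numbers like those, the one-time training bill all but disappears once usage is large enough, which is why I think the marginal cost is what employers will actually weigh against labor.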
There will always be winners and losers, and the stock market will go up and down – this is normal. But at the end of the day I still think AI is going to displace a whole lot of people: truck drivers, Hollywood effects, taking fast-food orders, translators, answering phones, music, video game map design, etc. And we’re still in the early days; all of these will keep improving.
I don’t want anyone to misconstrue this as my endorsement for AI to take human jobs, but I worry that too many people are not taking the threats seriously enough.
“Insightful and intelligent” doesn’t really mean “it’s the truth”; it just means, well, there are some insights to be found, but they could be in 1% of the sentences or in 99%.
Marx’s work is “insightful and intelligent” because of how well it described capitalism (I read Das Kapital), but it completely falls apart in its proposed “solution” (communism), which ultimately has led to the suffering of millions of people.
I haven’t read Hitler’s book, but perhaps it also gives you an “insight” into the problems in Germany during the 1920s; it’s just that the proposed solutions turned out to be even worse.
So yeah, I can understand how the AI ends up saying this when it is supposed to sell as many books as possible…