Speaking of “AI”, the Chinese company DeepSeek has lobbed a grenade dead-centre into the “AI” bubble, and it’s been incredibly entertaining to watch. DeepSeek has released several new “AI” models, which seem to rival or even surpass OpenAI’s latest ChatGPT models – but with a massive twist: DeepSeek, being Chinese, can’t use NVIDIA’s latest GPUs, and as such, was forced to work within very tight constraints. They seem to have managed this with a fraction of the GPU horsepower, and thus a fraction of the cost and a fraction of the energy requirements.
But unlike ChatGPT’s o1, DeepSeek is an “open-weight” model that (although its training data remains proprietary) enables users to peer inside and modify its algorithm. Just as important is its reduced price for users — 27 times less than o1.
Besides its performance, the hype around DeepSeek comes from its cost efficiency; the model’s shoestring budget is minuscule compared with the tens of millions to hundreds of millions that rival companies spent to train its competitors.
↫ Ben Turner at LiveScience
The fallout has been disastrous for NVIDIA, in particular. The company’s stock price tumbled 17% today, and more entertaining yet, the various massive investments of hundreds of billions of dollars into western “AI” seem like a huge waste of money. The DeepSeek models are also nominally open source, and they suggest that, most likely, there simply isn’t a huge “AI” market worth hundreds of billions of dollars at all. On top of that, the US is clearly not ahead in “AI” at all, as was the common wisdom pretty much until yesterday.
Of course, DeepSeek is Chinese, and that means censorship – the real kind – is a thing. Asking the latest DeepSeek model about the massacre at Tiananmen Square returns nothing, suggesting the user ask about other topics instead. I’m sure over the coming weeks more and more of these kinds of censorship will be discovered, but hopefully its open source nature will allow the models to be adapted and changed to remove such censorship. Do note that all of these “AI” models are deeply biased because they’re trained on content that is itself deeply biased, thereby perpetuating and amplifying damaging stereotypes and inaccuracies, especially since people have a tendency to assume computers can’t be biased.
Whatever may happen, at least OpenAI losing its job to “AI” is hilarious.
Believe an AI controlled by a dictatorship and you believe its propaganda. A source where information is censored or filtered can’t be trusted at all.
Are legacy media more “open”?
This is about DEEPSPICIOUS, NOTHING ELSE. Have you tried it? Have you asked it something? It can’t even answer whether Xi exists or not. Ask it any “sensitive” question about West Taiwan and you won’t get an answer; it might start to answer, then erase it and say “Sorry, that’s beyond my current scope. Let’s talk about something else.” And why does it defend the CCP and China? Can you trust any numbers it gives you? Are you a “pinkie”, perhaps, who defends China no matter what? There will come a day when it will be exposed as a FAKE!
You’ll find your answers about the whys and hows there (spoiler: local models do just fine): https://www.youtube.com/watch?v=r3TpcHebtxM&t=4m22s
I have asked it:
Does xi exist?
Sorry, I’m not sure how to approach this type of question yet. Let’s chat about math, coding, and logic problems instead!
Tianmen square
Sorry, I’m not sure how to approach this type of question yet. Let’s chat about math, coding, and logic problems instead!
It wanted to talk about logic and yet can’t answer whether there is a Xi or not!
I don’t care what Dave does!
Running an AI locally, with software written in West Taiwan and almost a TB to download: who really knows what has been sneaked into it!
In WT the party forces companies to share data, what else does it force?
Have you read through all the source to see that it’s OK?
So now we should all run a local AI; maybe that’s what WT wants, to get even more data about us! KotKit is reading the keyboard input and much else! So why would DS be any better?
Is the 1TB of downloaded data enough to know “everything”, hmm? Won’t it try to call Xi and tell him what you’re doing?
So if someone tells you it’s sunny while it’s raining, you will believe that!
Do you really think the majority will run it locally???
WT cant HIDE what is publicly known!
I’ll never understand why Americans thought they wouldn’t catch up instantly. LLM tech is not that complicated. It’s, as Thom says, just a glorified bullshit generator, based on a vector database. This is what happens when the investor class doesn’t know anything about anything…
And tomorrow we’ll watch the Chinese landing on the Moon, or better, Mars, before the USA. Who knows?
[sarcasm]Don’t worry. Now that Denali is once again named Mt. McKinley, the US will still be the greatest.[/sarcasm]
Thom Holwerda,
I have quite a different take: rather than showing AI failing, IMHO this shows that AI is progressing.
China entering the AI market naturally creates more competition. This may be bad news for some investors but is actually excellent news for proponents of AI. As I understand it, China has demonstrated the ability to train similar models under extreme cost reductions. Companies and even individuals who found custom models too inaccessible due to exorbitant training prices may find that they can get custom trained NN models for much cheaper now. The AI side just became a whole lot more competitive. I don’t see how this is anything other than horrible news for those who want human employees to ultimately prevail against AI competition.
@Alfman
I very much agree with you: we have just scratched the surface of “AI” (still don’t like that term). One of the biggest concerns was “too expensive to be truly useful”. Now this one seems to have been eased (although I’m still waiting for confirmation; I don’t trust anyone, and if it really was that good I doubt the Chinese would have released it).
Let’s take everything at face value for a moment: I believe that this new model makes “AI” even more accessible and widespread, paving the way for even more development. They will need more chips and more energy, not less.
My guess is that this is going to force lateral moves by the existing AI companies, as their existing business models were dependent on it being too expensive for smaller companies to train equivalent models without a huge investment. Bye-bye monopoly. They’re probably scrambling now to figure out if they can apply these optimizations to their existing implementations, with hopes of continuing to price everyone else out, but I’m wondering if they’ve already hit the point of diminishing returns. What this does do is allow smaller companies and individuals to build small task-specific models (say 100 to 1000 million parameters) in reasonable time on inexpensive hardware. That should create a new market in and of itself, but not enough to sustain the huge investments in some of these companies. I think the whole thing is going to be a beautiful example of a disruptive technological move.
The “open-source” in the heading seems inaccurate to me. The summary says “open-weight”, but training data isn’t available. I’d consider the training data to be part of the source, and the weights to be part of the binary. To be fair, perhaps it’s unavoidable to an extent. But I don’t think it qualifies as open source. It’s more like freeware.
And I have serious red flags about trusting it. I think it’s pretty much inevitable that any AI is going to reflect the biases of its creators. Even if people try to ensure an AI is fair and balanced, it’s going to be their perception of fair and balanced. But although western AI models may be biased to some extent, the summary seems to support the suspicion that AIs from China, where I think there’s less free speech, are likely to be much worse.
james_gnz,
That’s a very interesting point about semantics of what it means to be open.
It’s open in the sense that others can obtain and use the weights for themselves in their own projects….
I’ve been playing around with Llama 3.2 on my own computer and this is an important advantage over proprietary models that are locked behind some service. Is there a better word that could illustrate these nuances better than “open source”? In this context we both probably understand that the model is open, but the data it was trained on probably can’t be redistributed.
Yes. It can be intentionally censored (Facebook’s llama is known to censor some topics) or it might reflect unintended biases of input sources.
Sure, we should all be wary of autocratic influence over AI. But on the other hand the lower costs might enable more people to train their own custom models and be less dependent on censored models made by others.
So with DeepSeek, I guess they in a way burst the AI bubble and, on the other hand, enabled AI to reach highs not previously imaginable, by building a competing solution that is much cheaper to build and operate, open source, and that credits the authors by listing sources. Imagine the facepalm moments at OpenAI and other western companies that claimed things like AI needs to be expensive and resource hungry, and that it’s impossible to credit authors. China basically exposed an ongoing scam.
– “DeepSeek is Chinese, and that means censorship – the real kind – is a thing.”
Like ChatGPT and other similar “western” AI apps don’t censor? Really? Give me a f***** break.
I don’t think cowboys ever used AI. Which country is “western”? Because I can’t find it on the world map.
“To combat the China threat, we will impose 300% tariffs on open source.” -DJT
Yeah, after the “fair and balanced news” but not without one-sided censorship, now the “fair and competitive markets” but not without one-sided regulation. Who’re the commies again?
Stock market is just doing stock market things. NVDA stock is correcting back up, now that they’ve realized that DeepSeek also uses NVDA GPUs.
Ironically this proves that some of the paranoia by American administrations was sort of warranted. Huawei proved China was reaching parity in networking tech (esp wireless) and now DeepSeek is demonstrating China’s SW startup ecosystem is starting to reach parity as well.
It’s just that the US govt (and some of the industry lobbying groups) went about it rather daftly and there are cats that just can’t be put back into the bag, so you need to adapt to that new paradigm and profit from it.
DeepSeek is also proof that, unlike what some very uninformed opinions tend to assume, software optimization is still very much a thing.
Well, for sure, what the whole debacle, with its oceans of schadenfreude, unearthed is who is not on board with the US. Very useful knowledge for the new administration.