The team that makes Cockpit, the popular server dashboard software, decided to see if they could improve their PR review processes by adding “AI” into the mix. They decided to test both sourcery.ai and GitHub Copilot PR reviews, and their conclusions are damning.
About half of the AI reviews were noise, a quarter bikeshedding. The rest consisted of about 50% useful little hints and 50% outright wrong comments. Last week we reviewed all our experiences in the team and eventually decided to switch off sourcery.ai again. Instead, we will explicitly ask for Copilot reviews for PRs where the human deems it potentially useful.

This outcome reflects my personal experience with using GitHub Copilot in vim for about 1.5 years – it’s a poisoned gift. Most often it just figured out the correct sequence of ), ], and } to close, or automatically generating debug print statements – for that “typing helper” work it was actually quite nice. But for anything more nontrivial, I found it took me more time to validate the code and fix the numerous big and subtle errors than it saved me.

↫ Martin Pitt
“AI” companies and other proponents of “AI” keep telling us that these tools will save us time and make things easier, but every time someone actually sits down and does the work of testing “AI” tools out in the field, the end results are almost always the same: they just don’t deliver the time savings and other advantages we’re being promised, and more often than not, they create more work for people instead of less. Add in the financial costs of using and running these tools, as well as the energy they consume, and the conclusion is clear.
When the lack of effectiveness of “AI” tools out in the real world is brought up, proponents inevitably resort to “yes, it sucks now, but just you wait for the next version!” Then that next version comes, people test it out in the field again, it’s still useless, and those same proponents again resort to “yes, it sucks now, but just you wait for the next version!”, like a broken record. We’re several years into the hype, and that mythical “next version” still isn’t here.
We’re several years into the “AI” hype, and I still have seen no evidence it’s not a dead end and a massive con.
The mere fact that we are attempting to use computers to think for us and that they can form thoughts, even if they are wrong, is a miracle. We should be rejoicing at the emergence of a new kind of intelligence. Billions of years of evolution have led to this moment, and all we can do is complain that it’s not perfect immediately? We live in sad times.
This is such a puzzling response to an article providing yet another example of how all the massive amounts of money being spent on burning up the planet to feed this garbage, for the sake of making the ultra-rich richer and everyone else poorer, CONTINUE to fail to produce anything but negative value. I’d like to emphasize: computers can’t ‘form thoughts’; there is no ‘new kind of intelligence’ in LLMs; and a stochastic plagiarism machine is not a miracle.
My view of AI changed when I asked Grok to explain to me how to create a chaotic random generator in C. It was the first time I got an explanation as to why the one I wrote works so well. There is something there, and it has the potential to change society.
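(For context: a “chaotic random generator” of the kind described here typically just iterates a chaotic map. Below is a minimal Python sketch using the logistic map; it is purely illustrative, since the commenter’s actual C implementation isn’t shown.)

```python
# Minimal sketch of a chaotic pseudo-random bit generator built on the
# logistic map x -> r*x*(1-x), which behaves chaotically at r = 4.0:
# nearby seeds diverge rapidly, which is what makes the output look random.
def chaotic_bits(seed, n):
    """Yield n pseudo-random bits by iterating the logistic map."""
    x = seed  # seed must be in (0, 1), away from fixed points like 0.75
    for _ in range(n):
        x = 4.0 * x * (1.0 - x)      # one step of the logistic map
        yield 1 if x >= 0.5 else 0   # threshold the state to get a bit

print("".join(str(b) for b in chaotic_bits(0.123456789, 32)))
```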
First, you didn’t get an explanation, you got a string of words that sounds like a plausible explanation to you. They might be correct, they might be gibberish; it doesn’t matter to the LLM, because, and I cannot repeat this enough, the LLM is not thinking. It does not understand your code. (See also: the article we’re commenting on.) It does not ‘know’ anything and it cannot ‘explain’ anything. All it’s doing is spitting out the most likely word, again and again.
Second, it doesn’t just have potential, it’s already changing society, unambiguously for the worse. Technology companies are diverting resources away from things that actually produce value. New fossil fuel power plants are being built and old ones recommissioned just for ‘AI’. Companies are firing workers en masse because they think LLMs are good enough replacements. Entire artistic and creative disciplines are disappearing because there’s no longer a way to make a living.
Aankhen,
This is an interesting philosophical question. A dictionary cannot think, and an encyclopedia cannot think. While they may be right or they may be gibberish, the fact that they don’t think isn’t really the relevant question here. Consider a NN classifier trained on cell X-rays to identify which samples are most indicative of future cancer. Such classifiers don’t “think”, they clearly know much less about cancer than humans, and yet they are still useful and may predict cancer even better than humans.
LLMs are similar in that they should only be seen as tools. Saying “they don’t think” is not a very compelling reason to declare them worthless. They are not AGI and their worth doesn’t derive from them thinking.
It’s important to understand that an algorithm outputting one token at a time has no intrinsic bearing on the quality of its output. Mathematically, the only thing that matters is the output. The method by which the output is calculated, though interesting to us, is philosophically inconsequential versus an algorithm that produces the output differently.
This isn’t to say that LLMs don’t make mistakes, but the notion that “LLMs only predict output one word at a time” is not an objective sign of inferiority.
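(To make that argument concrete, here is a toy one-token-at-a-time sampler in Python; the hard-coded bigram table is a made-up stand-in for a trained model. The generation loop has the same shape whether the “model” behind it is trivial or enormous, which is the point being argued above.)

```python
import random

# Toy "one token at a time" generator. The bigram table stands in for
# model weights; the loop itself says nothing about output quality.
bigrams = {
    "the": ["cat", "dog"],
    "cat": ["sat", "ran"],
    "dog": ["barked"],
    "sat": ["."], "ran": ["."], "barked": ["."],
}

def generate(start):
    tokens = [start]
    while tokens[-1] in bigrams:
        # pick the next token given only the current one, then repeat
        tokens.append(random.choice(bigrams[tokens[-1]]))
    return " ".join(tokens)

print(generate("the"))  # e.g. "the cat sat ."
```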
I can certainly see a case being made for this. My opinions are obviously a bit different. I think AI is on track to increase overall productivity over time. To me, the ultimate problem with AI doesn’t lie in its failure to evolve, but in its success once it does. IMHO everybody is underestimating the scope and social costs of human displacement, which I think is coming.
Exactly.
Alfman,
Pardon my not quoting you but I’m still not sure what the syntax is here after all these years. Regarding the conversation about thinking, I wasn’t saying that forming thoughts is a prerequisite for being useful. Rather, I was responding to a comment containing misinformation about the nature of LLMs. (The same goes for what I said about guessing the next probable word.)
As far as productivity goes, I think LLMs could potentially be a valuable tool if the tech industry as a whole weren’t trying to make them the only tool. The fundamental issue is that LLMs present the appearance of omniscience when they’re anything but omniscient, and it’s human nature to mistake that for true omniscience. We’re already starting to treat them as deities who provide answers rather than one more tool for finding them. In the long term, this replaces reality with a consensual regurgitated fantasy and human expression with cliché, because LLM output by definition tends towards the mean.
And then there’s the fact that the progress of ‘AI’ development is only slowing further and further, requiring exponentially more power—literal and figurative—and resources for even more slight gains, with no more original content remaining unplagiarized… pardon me, used as training material. I doubt LLMs will go anywhere, but the bubble’s gonna pop, and we’ll all have to suffer the consequences.
Well, not the people running these companies. They’ll only get richer.
Aankhen,
Yeah it ought to be in the FAQ.
That’s kind of my point when I said an LLM isn’t AGI. I actually think *we* are at fault for treating it like a factual oracle. It’s really a thoughtless algorithm subject to garbage in, garbage out, especially given unvetted training material. While your points about misinformation are valid, I think it’s fair to ask what this says of humans, because the same criticism does apply to us. This is a common weakness with a lot of the “AI/LLM bad” arguments. Double-blind testing may help judge things more fairly.
Yes, I agree that’s a problem. We need people to be better educated about what an LLM is and isn’t. We need to view it less as a “deity”, to use your word, and more as a tool. When we do that, I think there is a better appreciation for what LLMs are good at. They are fantastic at interfacing with humans in our own languages, and IMHO this is where most of the opportunities reside. LLMs aren’t at their best on their own, but combined with other tools. For example, take an SQL database with an LLM front end; it would be an incredible boost to productivity (a sketch follows below). I think such developments are coming.
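(A minimal Python sketch of that idea, with a hypothetical `ask_llm` helper standing in for whatever completion API is used; the design point is that the database supplies the facts while the model only translates the question into SQL.)

```python
import sqlite3

def ask_llm(prompt):
    """Hypothetical LLM call; wire in your provider's client here."""
    raise NotImplementedError

def answer(question, db):
    # Hand the model the schema plus the question, get back one SELECT.
    schema = "\n".join(row[0] for row in db.execute(
        "SELECT sql FROM sqlite_master WHERE type = 'table'"))
    sql = ask_llm(f"Schema:\n{schema}\n\nWrite one read-only SQLite SELECT "
                  f"answering: {question}\nReturn only the SQL.")
    # Guardrail: refuse to run anything that isn't a plain SELECT.
    if not sql.lstrip().lower().startswith("select"):
        raise ValueError("refusing to run a non-SELECT statement")
    return db.execute(sql).fetchall()
```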
In some creative domains, like movies and the recording industry, the case for generative content is already apparent, and the quality continues to improve.
I don’t agree with you that progress is slowing. If it’s true that Moore’s Law is reaching its end, that could be a concern, but so far we’re still seeing node improvements, which means things are still getting more efficient. Also, we should not ignore software optimizations. Consider that China has already demonstrated that they can build models with a fraction of the resources used by Western companies. Our brute-force approaches can be improved on.
I do agree that DNN training is expensive and resource-intensive, but once developed, many of these LLM models actually scale very well. It absolutely amazes me how personal hardware keeps delivering more performance with less power. I’ve literally run Llama 3.1 8B in real time on a low-end 4060 in an old SFF that I bought used a decade ago with a mere 230W power supply, and it doesn’t even max out. I’m blown away by the level of progress.
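(For the curious, running such a model locally takes only a few lines with the llama-cpp-python bindings; the GGUF file name below is an assumption, and any quantized Llama 3.1 8B build should behave similarly.)

```python
# pip install llama-cpp-python
from llama_cpp import Llama

llm = Llama(
    model_path="./llama-3.1-8b-instruct.Q4_K_M.gguf",  # hypothetical local file
    n_gpu_layers=-1,  # offload all layers to the GPU (e.g. a 4060)
    n_ctx=4096,       # context window size
)

out = llm("Explain the logistic map in one sentence.", max_tokens=64)
print(out["choices"][0]["text"])
```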
I do see your point; you can make the case that training AI on publicly available web content should not be allowed. However, I will point out the hypocrisy. Every last one of us learns from content published on the web. When I have a question and find an answer on the web, my brain’s neurons create new links, and then I go about using this knowledge for myself. So do you and practically everyone else. Of course, people who are bent on criticizing AI will invent some technicality to blame AI while giving humans a pass. I imagine that you will want to do that too. However, on a philosophical level, why should copying knowledge onto an artificial NN be disallowed when it’s allowed on a natural NN? This is easier to answer if you hold the view that handicapping AI is a goal in and of itself, but if you don’t, then I find it far more difficult to justify.
Yes, there are plenty of over-hyped gimmicks. It’s a gold rush, and many early AI prospectors are going to fail. However, I also think you’re failing to consider where AI, including LLMs, has long-term staying power: not these gimmicks, but really boring corporate applications where companies benefit by replacing expensive employees with much cheaper bots. You call it a bubble, but I see it very differently. Initial teething trouble is obviously real, but there’s no reason it can’t be overcome, especially as bots become more specialized. Once bots are brought in, it’s going to be financially non-viable for corporations to go back to human labor. The hype will come and go, but the high price of human labor is what cements AI as a long-term industry rather than a bubble.
Aankhen, you are scientifically incorrect.
I’m afraid repeating something enough times doesn’t make it true. The complete lack of substance in your reply speaks volumes, and your words are frankly insulting to the very notion of scientific thought.
Papers by, say, Anthropic are out there if you are interested. You are not though, which is why I don’t feel my time will be well used trying to explain. My statement was simply for the record.
Your statement is duly noted and recorded: read the writing of purveyors of snake oil to understand the benefits of snake oil. It seems both your credibility and my willingness to engage have vanished.
Thom Holwerda,
It’s worked for Wayland, why not for AI 🙂
(j/k)
It’s one thing to be critical of the hype; I agree there’s too much nonsense innovation being hyped. I’m seeing a lot of AI slop in articles, which are very well written as far as grammar goes… better than mine 🙂 In these articles the AI speaks with a degree of authority, but their substance is so weak it’s really bad. I’m left wondering how many technical writers are losing their jobs to these AI authors. Yet I also wonder if those AI authors may be more profitable than the humans being replaced anyway. After all, they produce content more cheaply, and this content does rank highly in Google. I’ve also noticed more AI slop on YouTube with nearly photorealistic images, but the entire video is just weirdly artificial.
I’m not a fan of this. I think creative domains should be left to humans, but honestly, I’d be in denial if I said the improvements in quality over just a couple of years haven’t been dramatic. Hardware keeps getting more capable. As prices come down, training is going to become more specialized, and bot creators will be less dependent on the generic models we’re making fun of today. They are improving, and I just think it’s a mistake to assume that the growth of AI is done. I worry that a lot more humans are at risk of being displaced by AI.
Alfman,
For what it’s worth, I am coming to see your comments on generative AI as increasingly nuanced and well reasoned, despite me not always agreeing with you. I wanted to let you know that your viewpoint adds value for me 🙂
I would tend to class myself as a sceptic, at least in regard to the current hype cycle and gen AI being pushed into places it simply doesn’t belong. I also share a lot of Thom’s opinions about it. That said, it is refreshing to me that, time and time again on these articles, your comments are level-headed, reasonable, and well thought out. Keep doing what you’re doing, buddy!
Regards,
Phil
PhilPotter,
Wow, thanks, haha. Sometimes I feel like I’m being written off as an “AI fanboy zealot” because I believe AI is going to keep improving rather than crash. It’s not that I think all AI startups are going to change the world; most will likely fail, but it only takes a tiny subset of them to strike a chord with businesses that have money and an incentive to replace more expensive employees.
I don’t really know this guy, the clip just showed up on YouTube today, but I find it aligns with my POV quite well…
“Let’s Talk About AI…”
https://www.youtube.com/watch?v=Xx2SdGZ1S7U
I may have been guilty of assuming that of you at one point, but I know better these days.
Watched the video; it was interesting, particularly from a creative point of view. That said, the core tenet – that generative AI will keep getting better – is very much up for debate still. In my view, plenty of evidence shows the opposite. Appreciate your view as well though 🙂