Is machine learning, also known as “artificial intelligence”, really aiding workers and increasing productivity? A study by Upwork – which, as Baldur Bjarnason so helpfully points out, sells AI solutions and hence did not promote this study on its blog as it does with its other studies – reveals that this might not actually be the case.
Nearly half (47%) of workers using AI say they have no idea how to achieve the productivity gains their employers expect. Over three in four (77%) say AI tools have decreased their productivity and added to their workload in at least one way. For example, survey respondents reported that they’re spending more time reviewing or moderating AI-generated content (39%), invest more time learning to use these tools (23%), and are now being asked to do more work (21%). Forty percent of employees feel their company is asking too much of them when it comes to AI.
↫ Upwork research
This shouldn’t come as a surprise. We’re in a massive hype cycle when it comes to machine learning, and we’re being told it’s going to revolutionise work and lead to massive productivity gains. In practice, however, these tools just can’t measure up to the hyped promises, and are in fact making people do less and work more slowly. There are countless stories of managers being told by upper management to shove machine learning into everything, from products to employee workflows, whether it makes any sense to do so or not. I know from experience as a translator that machine learning can greatly improve my productivity, but the fact that certain types of tasks benefit from ML doesn’t mean every job suddenly thrives with it.
I’m definitely starting to see some cracks in the hype cycle, and this study highlights a major one. I hope we can all come down to earth again, and really take a careful look at where ML makes sense and where it does not, instead of giving every worker a ChatGPT account and blanket demanding massive productivity gains that in no way match the reality on the office floor.
And of course, despite demanding massive productivity increases, it’s not like workers are getting an equivalent increase in salary. We’ve seen massive productivity increases for decades now, while paychecks have not followed suit at all, and many people can actually buy less with their salary today than their parents could decades ago. Demands imposed by managers introducing AI are only going to make this discrepancy even worse.
I just had to register to make a comment on this
I have used chatai googles and m$ AI bots.
First of all one has to be extremly specific when asking for something, because the ai’s assume a lot without asking for input.
And for programming assist i wouldnt trust it at all without extensive testing, tried it many times just for curiosity to assist
in simple script making with bash like a 20-50 rows of code nothinh advanced my imho, Thats when i realized one has to be
extremly specific in asking sometimes it doesnt help at all, the ai assumes a lot without asing for extra input. It might know the language but it doesnt comprehend english langauge nd how its used,
If this is used proffesionaly if would give headaches more often than not
i tried it for some simple research and even if given direct orders it doesnt listen to them and still yapps about it.
if asking a question about linux and ex. virtualbox it will answer with windows included even if told speicificly to only
include linux
One cant save preserve settings told to it,
Would i recommend to use it professionally NO WAY, since has to recheck what was said if true or not, if it was for software development i would test rigorously first to if it works as wanted.
Am i impressed by AI no “insert swearword here” way
Any work, code or prose, needs review or testing. Who or what wrote it is irrelevant. Is AI a cure-all? No. Can it be a productivity multiplier? A _smarter_ search engine? Absolutely.
For example, a simple prompt to standardise spelling and simplify your post above results in the following – which I think speaks for itself.
Begin quote
> I just had to register to comment on this. I have used ChatGPT, Google’s AI, and Microsoft’s AI bots.
First of all, you have to be extremely specific when asking for something because the AIs assume a lot without asking for input.
For programming assistance, I wouldn’t trust it at all without extensive testing. I tried it many times out of curiosity to assist in simple script-making with Bash, like 20-50 lines of code, nothing advanced in my opinion. That’s when I realized you have to be extremely specific in your requests. Sometimes it doesn’t help at all because the AI assumes a lot without asking for extra input. It might know the programming language, but it doesn’t comprehend English and how it’s used.
If this is used professionally, it would give more headaches than not.
I tried it for some simple research, and even when given direct orders, it doesn’t listen and still goes off-topic. For example, if you ask a question about Linux and VirtualBox, it will include Windows in the answer, even if you specifically tell it to only include Linux.
You can’t save or preserve settings told to it.
Would I recommend using it professionally? No way, since you have to recheck what it says to see if it’s true or not. If it was for software development, I would test rigorously first to see if it works as intended.
Am I impressed by AI? No way.
Thom Holwerda,
I feel more aligned with your view here than on previous articles about AI.
There’s so much hype painting a completely false reality, but when you get down to the nitty gritty, everything is really a lot more nuanced. AI makes some tasks redundant, others less so. Most of the time I don’t see AI as wholly replacing employees, but rather being a productivity multiplier such that perhaps fewer employees are needed. Executives who buy into the hype may feel tempted to blanket AI across their workforce to be used as new miracle workers on day 1, but their expectations are unrealistically high. AI will keep getting better, but in most cases it still needs to be trained to do specific jobs proficiently. This takes time & money. IMHO AI will cause massive employee disruption in the long run, but it’s not a sudden transition. Despite this, specialized AIs will be trained to replace employees doing specific tasks, and I think that will cost far less than employees long term. However, for employers who just wanted off-the-shelf AI to use on day 1… well, today’s AI just isn’t at that point.
Jobs that involve writing natural language text may have to compete with AI that’s become fairly convincing these days. I’d be worried if I were a writer. AI is just a tool and can be put to good or bad uses. I suspect modern spammers have started using AI to create customized spam on blogs. They can be very convincing at pretending to be a regular user until you see the spam links.
I asked ChatGPT to add a pitch for a penis enlargement product to a discussion, as a spammer might, because it just cracks me up. Hopefully these are not too inappropriate.
Oh god, what have I done.
Joking aside, it’s fascinating that AI is able to grok the subject matter on its own and create a “meaningful” response. Obviously this sticks out like a sore thumb, but when given less ridiculous tasks I find LLMs do actually work quite well. Thom, I don’t know how you’d feel about it, but I think it would make for a very interesting article if you went and tested the abilities and shortcomings of today’s AI in greater detail.
Isn’t reference to Boeing as a pitch line in an ad a huge fail?
What next?
Just like Lehman Brothers is a great investment bank, our product will also take your penis to the sky…
cevvalkoala,
Yes, I had the same observation. That might have been my fault though; ChatGPT was following instructions to frame the content as a pitch. Another point is that ChatGPT’s training data is a couple of years behind, so it doesn’t have knowledge of recent news.
I asked ChatGPT “What are the risks of deploying CrowdStrike in enterprise?” and then “What is the worst case scenario?”, and it provided a large variety of examples including vendor lock-in, data privacy, compliance, cloud dependency, and so on, including these two that I cherry-picked:
So although it didn’t know about the CrowdStrike event, it was able to enumerate such a possibility. ChatGPT might be less suitable for finished output, but just in terms of brainstorming and generating rough drafts, it does these things very well IMHO.
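If you want to reproduce this kind of multi-turn prompting programmatically, a minimal sketch looks something like the following, assuming the official openai Python package and an API key in the OPENAI_API_KEY environment variable; the model name is just illustrative, not necessarily the one I used:

```python
# Minimal sketch of the brainstorming experiment above: two questions
# asked in sequence, keeping the conversation history so the follow-up
# ("What is the worst case scenario?") has context.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment
messages = []
for question in (
    "What are the risks of deploying CrowdStrike in enterprise?",
    "What is the worst case scenario?",
):
    messages.append({"role": "user", "content": question})
    response = client.chat.completions.create(
        model="gpt-4o-mini",  # illustrative; any chat-capable model works
        messages=messages,
    )
    answer = response.choices[0].message.content
    messages.append({"role": "assistant", "content": answer})
    print(answer)
```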
*nod* The only thing I’ve personally found chatbot-style “A.I.” to be better at than its predecessor technologies, once you take into account the skill needed to vet its results, the time needed to make sure it gives good results, and the cost of ongoing maintenance, is situations where you need a meta-search engine and you’re not sure how to formulate your initial set of search keywords for a topic… and, even then, it has to be the right kind of query to avoid it confidently returning nonsense.
As a few recent examples of the results I got from Perplexity.ai:
“What movie am I thinking of that has this quote?” generally works well to cut through the morass of text-formatting variations if the answer is something as well documented as “Inside Man”. (“And therein, as the bard would say, lies the rub”, said dramatically.)
“What is this book I’m remembering from my childhood?” is guaranteed to confidently return utter nonsense if the book was retired from the local library in the 90s and originally published in the 1960s, 1970s, or 1980s: its contents probably only exist on the Internet locked behind the Internet Archive’s e-borrowing interface, and it may or may not have been OCRed. (Multiple books. One example being Wish Come True by Mary Q. Steele, thanks to someone on /r/TipOfMyTongue/)
The results for “What reference book would you recommend for an author who wants to write … accurately?” return promising citations, but I’m still going to need to verify that the summarization of what was in the cited Reddit threads and blog posts is real, then vet the resulting suggestions as if they just turned up somewhere like Amazon’s “people who liked this also looked at” suggestions. (I’m trying to start a writing hobby and want to expand my “quick, trustworthy reference material” shelf beyond the Writers Helping Writers series by Ackerman and Puglisi.)
…but, really, I’m not surprised it’d do well there. Natural language processing for the purpose of linguistically fuzzy search has always been a weak spot for conventional search engines.
A fundamentally statistical plausible-gibberish generator with no deeper “mental model”, being asked to write code or anything else with ongoing maintenance requirements as more than a “help me search for the correct API to achieve this goal” tool? Give me a break.
ssokolow (Hey, OSNews, U2F/WebAuthn is broken on Firefox!),
Well, you are asking LLMs that are trained on very broad knowledge to code, but that’s like asking a human who only knows about code in the abstract to write real code without any specific training. Consider a parallel example of asking a generic LLM to play chess. It will sort of do it if you ask it to, but as the game progresses it becomes more and more comical. Further into the game, it had me laughing so hard…
https://youtu.be/hKzsmv6B8aY?t=137
Factually, we know that NN training methods exist that regularly beat our best humans at chess, so what went wrong with ChatGPT’s chess moves? Well, ChatGPT was trained on natural speech patterns, including those found in chess games, but it lacks any sort of chess training whatsoever. Absolutely none. This struggle should not surprise us. At the same time, we should not jump to the conclusion that more specialized training isn’t coming. Too many people are assuming that today’s AI represents some kind of AI plateau, but that’s not the case. More specialized AIs will crop up to match and beat us in more specialized niches.
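If you want to watch the degradation happen yourself, here’s a rough sketch of how you could count unusable moves, assuming the python-chess package; get_llm_move() is a hypothetical placeholder for whatever chat API you use:

```python
# Count how often a generic LLM proposes unparseable or illegal chess
# moves as a game progresses. python-chess enforces the rules;
# get_llm_move() is a hypothetical stand-in for your chat API.
import chess

def get_llm_move(fen: str) -> str:
    """Placeholder: ask the LLM for a move in UCI notation (e.g. 'e2e4'),
    given the current position as a FEN string."""
    raise NotImplementedError  # wire up your chat API of choice here

board = chess.Board()
unusable = 0
for ply in range(80):  # up to 40 full moves
    if board.is_game_over():
        break
    try:
        board.push_uci(get_llm_move(board.fen()))
    except ValueError:
        unusable += 1  # malformed or illegal move; skip this ply
print(f"Unusable moves proposed: {unusable}")
```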
I think I’ll refer to Thom’s more recent post. I don’t know about you, but I don’t find it would save me much time to have someone else (LLM or otherwise) write code for me if I then still insist on taking the time to understand the codebase so I can feel confident in my ability to maintain it.
For stuff where I’m quite familiar with the APIs and language, and someone else wrote that specific code, it’s a bit of a wash at best… so why not just write it myself in the first place and get something that’s likely to be easier for my future self to re-familiarize himself with?
ssokolow (Hey, OSNews, U2F/WebAuthn is broken on Firefox!),
That’s fine, although I still consider LLMs to be at the infant stage. I expect that AI will not only evolve in terms of much improved code generation, but also code maintenance as well. I feel there’s a lot of innovative potential here. Maybe we can discuss this if you’d like?
I understand what you are saying. Two comments:
1) Sometimes I need to revisit code I wrote years ago, but with so many other projects in the interim, TBH I forget most of the code anyway. Is this just a me problem? Haha.
2) Familiarity is relative to the tools you are working with. If you write your code in assembly, you will be familiar with your code in assembly. If you write your code in a higher level language like C++, you will be familiar with your code in C++. Does assembly familiarity matter when you are writing in C++? By extension, does familiarity with the C++ code matter if you are using higher level AI to generate it?
Obviously I know we’re not there yet, but when we do get there, familiarity with source code might not even matter anymore.
Kver made this point when asking whether Star Trek characters need to be familiar with source code.
https://www.osnews.com/story/140379/the-impact-of-ai-on-computer-science-education/#comment-10442128
Perhaps source code no longer matters and the AI itself becomes the new programming interface.
Anyway, this is such a deep topic, probably too deep for WordPress.
Management setting up unrealistic goals and expectations has always been the main source of burnout through the ages, at least in the tech industry. AI is now the latest technology that is too hyped and too misunderstood by decision makers in many organizations.
The more things change…
At this stage it’s left me underwhelmed. What little use I’ve tried to make of it has been nothing but disappointing; it seems to sit in an “Uncanny Valley of Error”, and perhaps NQR (“not quite right”) is the best descriptor. I’ve tried it for really difficult problems, which it gets wrong, and I’ve tried it for mind-numbingly easy but tedious problems, which it equally gets wrong. So in the two areas where I need the most help, the areas where it could deliver the greatest benefit, it’s effectively useless.
Keep in mind, my notion of easy or difficult is subjective: what I think is difficult may be easy for some other specialist, and what I think is easy may not be for someone less specialised.
I don’t find this very surprising. Reviewing/editing can be more taxing than writing in the first place. With just plain humans involved.
Remember when we used to have physical “word processors” that became programmes like MS Word? I like to call the output of large language models “ultra-processed words”, part of a range of “ultra-processed content” (UPC) from so-called AI. This is a riff on “ultra-processed foods” (UPF).
Wikipedia: “Epidemiological data suggest that consumption of ultra-processed foods is associated with higher risks of many diseases, including obesity, type 2 diabetes, cardiovascular diseases, asthma, specific cancers, and all-cause mortality”
Perhaps the effect of UPC on our mental faculties and social discourse will mirror that of UPF on our bodies and health services?
https://en.wikipedia.org/wiki/Ultra-processed_food
Sometimes, to punish prisoners, wardens take away work privileges. Why is Silicon Valley hell-bent on depriving humanity of its sense of purpose?
I very much take this “research” with a heavy dose of salt. Upwork’s entire business model is lowest-common-denominator work. If you are unaware, it’s basically a bidding market that rewards doing IT tasks/projects at substantially below market rates, with Upwork taking a cut, of course. For extra icing, you have to Pay Them to make a bid, and Pay More to push it up the rankings shown to the “client”.
On a less kind day, I’d say its model is barely above exploitation.
Proliferation of AI is a direct threat to their business model, so this “research” feels like the oil industry telling me solar power isn’t very good.
This doesn’t surprise me at all. There are many in my field, which isn’t related to programming, trying to use AI to automate redundant tasks. I tried, and found I could do the work on my own using templates or boilerplate text in far less time than it took to write a prompt, then review and inevitably rewrite what the AI produced.
We rolled Copilot out to a test group of users; they were excited at first but largely stopped using it. We don’t see the value in the high cost of the license and won’t be rolling it out further. I used it myself for a while, and yes, it’s integrated into Office and Teams and can save you a few minutes here or there, but not for the price. Its image generation is poor for educational content; we ended up buying stock images instead.
It’s also obnoxious in email drafting and coaching, unless you love corporate speak and extreme obsequiousness.
It can be useful for reviewing files for specific content; that’s about the best function we found. A more advanced find function.
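For what it’s worth, that “advanced find” pattern is easy to sketch outside of Copilot too. This is a hypothetical example, assuming the official openai Python package, an API key in OPENAI_API_KEY, and an illustrative model name and docs folder:

```python
# Hypothetical sketch of "advanced find": ask a chat model whether each
# file matches a natural-language query, and print the paths that do.
from pathlib import Path
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment
query = "Does this text discuss license costs for collaboration software?"

for path in Path("docs").glob("*.txt"):  # illustrative folder and pattern
    text = path.read_text(encoding="utf-8")[:8000]  # keep the prompt small
    response = client.chat.completions.create(
        model="gpt-4o-mini",  # illustrative model name
        messages=[{
            "role": "user",
            "content": f"{query}\nAnswer yes or no.\n\n{text}",
        }],
    )
    answer = response.choices[0].message.content.strip().lower()
    if answer.startswith("yes"):
        print(path)
```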