Date: 2024-05-08
Yuxuan Shui, the developer behind the X11 compositor picom (a fork of Compton), published a blog post detailing their experiences with using GitHub Copilot for a year.
I had free access to GitHub Copilot for about a year, I used it, got used to it, and slowly started to take it for granted, until one day it was taken away. I had to re-adapt to a life without Copilot, but it also gave me a chance to look back at how I used Copilot, and reflect – had Copilot actually been helpful to me?
Copilot definitely feels a little bit magical when it works. It’s like it plucked code straight from my brain and put it on the screen for me to accept. Without it, I find myself getting grumpy a lot more often when I need to write boilerplate code – “Ugh, Copilot would have done it for me!”, and now I have to type it all out myself. That being said, the answer to my question above is a very definite “no, I am more productive without it”. Let me explain.
↫ Yuxuan Shui
The two main reasons why Shui eventually realised Copilot was slowing them down were its unpredictability and its slowness. It’s very difficult to understand when, exactly, Copilot will get things right, which is not a great thing to have to deal with when you’re writing code. They also found Copilot incredibly slow, with its suggestions often taking 2-3 seconds or longer to appear – much slower than the suggestions from the clangd language server they use.
Of course, everybody’s situation will be different, and I have a suspicion that if you’re writing code in incredibly popular languages, say, Python or JavaScript, you’re going to get more accurate and possibly faster suggestions from Copilot. As Shui notes, it probably also doesn’t help that they’re writing an independent X11 compositor, something very few people are doing, meaning Copilot hasn’t been trained on it, which in turn means the tool probably has no clue what’s going on when Shui is writing their code.
As an aside, my opinion on GitHub Copilot is clear – it’s quite possibly the largest case of copyright infringement in human history, and in its current incarnation it should not be allowed to continue to operate. As I wrote over a year ago:
If Microsoft or whoever else wants to train a coding “AI” or whatever, they should either be using code they own the copyright to, get explicit permission from the rightsholders for “AI” training use (difficult for code from larger projects), or properly comply with the terms of the licenses and automatically add the terms and copyright notices during autocomplete and/or properly apply copyleft to the newly generated code. Anything else is a massive copyright violation and a direct assault on open source.
Let me put it this way – the code to various versions of Windows has leaked numerous times. What if we train an “AI” on that leaked code and let everyone use it? Do you honestly think Microsoft would not sue you into the stone age?
↫ Thom Holwerda
It’s curious that as far as I know, Copilot has not been trained on Microsoft’s own closed-source code, say, to Windows or Office, while at the same time the company claims Copilot is not copyright infringement or a massive open source license violation machine. If what Copilot does is truly fair use, as Microsoft claims, why won’t Microsoft use its own closed-source code for training?
We all know the answer.
Deeply questionable legality aside, do any of you use Copilot? Has it had any material impact on your programming work? Is its use allowed by your employer, or do you only use it for personal projects at home?
The only “A.I.” tool I use is the copy of Stable Diffusion I pootle around with privately as a brainstorming aid. (Much as talking to a friend about writing ideas helps to break you out of being fixated on a single solution, seeing Stable Diffusion interpret what you wrote differently can also help that way… sort of like pair coders and linters are both useful, but neither is a substitute for the other.)
…and the reason it’s the only one I can use is that I can run it offline, where I don’t have to worry about someone deciding the trial period has gone on long enough and I need to pay for it now.
ssokolow,
+1.
AI will continue to become more useful, but being locked into proprietary services is a huge setback. Open AI models are clearly important, but it does bring up the question of whether open community AI software will be able to compete against proprietary AI services if large AI models are destined to be vendor-locked by corporations that have significantly more resources to train AI models.
Edit: should have proof read this before submitting, haha.
Well, a ChatGPT plugin on this site that proofreads and corrects automatically would be really nice and helpful 🙂
I wrote a fierce response to a request yesterday. I asked ChatGPT to “make it friendly”, it rephrased the text for me, and nobody has been offended yet. I love it!
Thom Holwerda,
My objection to this logic is that it vilifies AI learning from copyrighted works while overlooking the fact that humans have been doing it forever. It leaves me with the impression that the copyright argument against AI but not humans may be a form of prejudice. If so, can we justify this prejudice against AI and should it be codified into law?
I used Copilot for work for about two years, starting pretty soon after it was introduced. My experience is that its quality is highly dependent on the language you are using.
Python is often startlingly good, as long as you can describe your plan in comments or there is sufficient context for what you are doing. The first time I noticed this was when I was trying to make a bunch of charts from some complex data structures using matplotlib. A two line comment and it popped out roughly thirty lines that very nicely graphed what I needed. I ended up slightly tweaking the legend, but that was it. It does tend to use slightly older Python styles, so I don’t see many of the features added after 3.5 or so.
Its C and C++ output is often laughably terrible. I regularly saw hallucinated libraries and function calls, nonsensical use of pointers and arithmetic, and in the case of C++ absolutely no understanding of the STL. It was good at generating or extending basic boilerplate and OK at generating test cases. The one thing it was consistently good at was generating textbook algorithms when I would otherwise look them up.
For writing documentation, it was useless.
Absolutely same experience here!
Barty,
This is just a theory, but is it possible that python libraries are better defined/documented?
There are times when using C libraries that I found the documentation missing/out of date/wrong and I want to pound my head on the wall. Projects like openssl and ffmpeg are notorious for inadequate documentation and breaking changes such that I’ve had to resort to debugging the library source code myself just to figure out how to use it correctly. A NN solution is probably going to be at a major loss here since even if it follows the documentation the code can still be wrong.
It makes me wonder if languages that have good documentation are easier for AI to get right? For all my complaints about PHP, at least it’s well documented. If somebody tries AI generation for PHP let me know how well it works 🙂
Spot on! And it would be so much fun, because what exactly is “A.I”? Which algorithms to ban exactly? Just anything with pattern matching and probabilities? How about Markov Chains and Monte Carlo simulations? How are those less “A.I” than Language Models?
And please can we stop calling it “A.I”? There is nothing intelligent about it (yet), just pattern recognition (an excellent one, though!). Only when it knows that it’s lying (and it does that a lot) and when it can take “no” for an answer can we talk about intelligence traits.
Besides that: I love those tools! Best API quick search ever, especially when writing boilerplate code, text for the auditors or management, or code in an unfamiliar syntax (like R or Julia).
Btw, I am actively publishing code on GitHub and I have zero concern about AI, although I do earn my money with it: anything AI can do with it is trivial anyway, and any smart Indian will figure it out himself when you give him a budget. Writing the new code and understanding the customer’s requirements is where the frog has its curls.
Example: SQL is a standard, and every RDBMS publishes its API and SQL implementation online. It should be the most comfortable field for an “A.I” because there are so many patterns to learn. Yet give any “A.I” a simple task: to write you a SQL parser or formatter. It is a simple technical task, with no deeper knowledge or experience needed and no room for interpretation. Go and see what happens, and you will be chilled about all those tools.
Andreas Reichel,
People have a tendency to read too much into “AI”. In computer science we’ve been using the term for rather mundane tasks that are intelligent in the context of the task but not for anything else. AI merely means it contains artificial intelligence, not that it approaches human intelligence. Rather than redefining AI to exclude basic intelligence, I’m more in favor of introducing a new term for higher level intelligence, which we’ve dubbed “artificial general intelligence”.
https://en.wikipedia.org/wiki/Artificial_general_intelligence
By that definition, I seriously think we’d have to exclude humans.
We’re still in the early phases of AI, and it’s likely to improve with time. The static models are limited by training data. What we are asking AI to do today is akin to having a human read a book about an unfamiliar language and then write flawless programs without ever having run the compiler or tested the software. I’m impressed that it works as well as it does today. Future evolution will start giving AI the ability to compile and test its own code; this should help back-fill gaps in training data the same way human coders do. This will prove to be important for AI to be able to gain even more knowledge than was available in the training set.
> What we are asking AI to do today is akin to having a human read a book about an unfamiliar language and then write flawless programs without ever having run the compiler or tested the software. I’m impressed that it works as well as it does today.
100% d’accord.
> Future evolution will start giving AI the ability to compile and test its own code; this should help back-fill gaps in training data the same way human coders do.
This one is interesting though! I agree that it should be able to compile and test during a learning phase. But once it has learned, should it not be able to write flawless code — at least within the syntax and API of a language?
I find that a very fascinating question, because indeed I have no good answer for how to differentiate between “learning” and “executing”. My first instinct was: it’s different. But when thinking about it, it is indeed the same, because every execution is learning at the same time when hitting new challenges.
Andreas Reichel,
I agree that in principle a learned pattern should be more or less flawless. In practice, though, I’m not really sure if “100% perfect” is possible using straight-up static NNs. Consider that the weights and activations in machine learning are often stored as half-precision floats, which are only a mathematical approximation.
https://en.wikipedia.org/wiki/Bfloat16_floating-point_format
https://en.wikipedia.org/wiki/Half-precision_floating-point_format
These approximations, compounded over deeply nested neural pathways, will likely introduce some mathematical errors in the model. I’m tempted to call this “the kraken”, as in Kerbal Space Program – the lack of simulation precision results in unexpected chaotic behaviors that defy the simulation’s programming. Obviously this tradeoff is made for performance and memory reasons, but in theory the lack of precision could cause a “perfect” NN to diverge somewhat.
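As a toy illustration of the compounding described above (not a model of any real network), the Python sketch below rounds a running total to IEEE 754 half precision after every addition, using the standard `struct` module's `e` format. The helper names `to_half` and `accumulate` are made up for this example, and real training frameworks commonly accumulate in higher precision precisely to limit this effect.

```python
import struct


def to_half(x: float) -> float:
    """Round a Python float (binary64) to the nearest IEEE 754 half-precision value."""
    return struct.unpack("<e", struct.pack("<e", x))[0]


def accumulate(step: float, count: int, half: bool) -> float:
    """Add `step` to a running total `count` times.

    When `half` is True, the step and every intermediate total are rounded
    to half precision, mimicking arithmetic carried out entirely in float16.
    """
    total = 0.0
    if half:
        step = to_half(step)
    for _ in range(count):
        total += step
        if half:
            total = to_half(total)
    return total


double_total = accumulate(0.1, 10_000, half=False)
half_total = accumulate(0.1, 10_000, half=True)

print(f"float64 total: {double_total}")
print(f"float16 total: {half_total}")
```

Under these assumptions the float64 total ends up at roughly 1000, while the float16 total stops growing at 256: from there on, neighbouring half-precision values are 0.25 apart, so adding 0.1 rounds straight back to the same number.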
IMHO, in order to work towards the “general” in Artificial General Intelligence, it’s not enough to train a NN and call it a day; that NN has to be able to perform its own proper testing/debugging/research.
Insightful!
> that NN has to be able to perform its own proper testing/debugging/research.
And here it gets interesting: Why exactly should it do that? It has no purpose and no fear of dying. You can beat it into pattern matching, but you can’t enforce curiosity or creativity. Without our intrinsic hunger for life, there would be no curiosity and no will to develop or to conquer. As long as this magic spark (the equivalent of the will to live) does not happen, we won’t see any intelligence, just mechanical skill.
Copilot is a great productivity tool for us, especially once it has been trained enough on our own patterns from our repositories.
There is definitely no going back for us.
It’s genuinely useless. If you use a tool and it works fine two times then doesn’t work, how many times will you keep using that tool before you give up? And even worse, when it doesn’t work, it creates MORE work than not using a tool at all! Until it can consistently write functional code that runs the first time damn near 100% of the time, it’s completely useless.
And my experience with Copilot is that it far too frequently creates gibberish that looks seemingly accurate. I’ve seen it create repeated formatting errors and call modules in Ansible that didn’t even exist. It was wrong more often than it was right, and then it was up to me to try and debug the mess.
No thanks, it’s got a long way to go before it’s worth hassling with. It’s entirely possible that changes, and I will be following it closely, but right now it’s basically useless for me.
cmdrlinux,
Going by this standard, you’d have to concede that human programmers are completely useless as well, haha.
Fair enough. We can probably agree that it will be important for AI to gain the ability to automatically test and improve the code on a real tool chain. This feature is likely to come to AI solutions in the future.
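To make the "test and improve on a real tool chain" idea a bit more concrete, here is a minimal Python sketch of a generate-and-test loop. Everything in it is hypothetical: `CANDIDATES` stands in for successive model outputs, and `run_tests` and `refine` are names invented for this example rather than part of any actual product. A real system would also need to sandbox the generated code instead of calling `exec` directly.

```python
from typing import Iterable, List, Optional, Tuple

# Hypothetical stand-ins for model output: successive attempts at `add(a, b)`.
CANDIDATES = [
    "def add(a, b):\n    return a - b\n",  # wrong operator
    "def add(a, b)\n    return a + b\n",   # syntax error (missing colon)
    "def add(a, b):\n    return a + b\n",  # correct
]


def run_tests(source: str) -> Optional[str]:
    """Compile and exercise one candidate.

    Returns None when every check passes, otherwise a short error message.
    """
    namespace: dict = {}
    try:
        exec(compile(source, "<candidate>", "exec"), namespace)
        assert namespace["add"](2, 3) == 5, "add(2, 3) should be 5"
        assert namespace["add"](-1, 1) == 0, "add(-1, 1) should be 0"
    except Exception as exc:  # SyntaxError, AssertionError, KeyError, ...
        return f"{type(exc).__name__}: {exc}"
    return None


def refine(candidates: Iterable[str]) -> Tuple[Optional[str], List[str]]:
    """Try candidates in order and return the first that passes, plus the errors seen."""
    feedback: List[str] = []
    for source in candidates:
        error = run_tests(source)
        if error is None:
            return source, feedback
        # A real system would hand this message back to the model
        # before asking for the next attempt.
        feedback.append(error)
    return None, feedback


accepted, feedback = refine(CANDIDATES)
print(accepted)
print(feedback)
```

The point of the sketch is only that compiler and test failures become explicit feedback instead of being left for the human to discover.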