The NetBSD project seems to agree with me that code generated by “AI” like Copilot is tainted, and cannot be used safely. The project has added a new guideline banning code generated by such tools from being added to NetBSD unless explicitly permitted by “core”, roughly NetBSD’s equivalent of “technical management”.
Code generated by a large language model or similar technology, such as GitHub/Microsoft’s Copilot, OpenAI’s ChatGPT, or Facebook/Meta’s Code Llama, is presumed to be tainted code, and must not be committed without prior written approval by core. ↫ NetBSD Commit Guidelines
GitHub Copilot is copyright infringement and open source license violation at an industrial scale, and as I keep reiterating – the fact Microsoft is not training Copilot on its own closed-source code tells you all you need to know about what Microsoft thinks about the legality of Copilot.
Legalese will not stop Copilot’s adoption. Corps look for ease of implementation, and less need to pay humans in development. That said, the justice system will side with Microsoft in this regard. That’s why they wasted no brain cycles on considering the law here: they have the resources to fight and prevail against literally any open source entity. I’m on the side of NetBSD in this, though.
To me this is still questionable. If a human learns how an algorithm works, even from copyrighted source, and then proceeds to rewrite expressions of that algorithm into their own code, copyright allows that without permission. Taking existing works and republishing them in new words is easy as pie for an LLM. Traditionally, copyright law would treat these as new expressions, not copyright infringement. It might be a nondisclosure violation if an NDA were in play. Alternatively, it might be patent infringement or trademark infringement if those applied. However, if the expressions are new, then there’s no case for copyright infringement by traditional standards. We’d have to redefine what copyrights are to ban these AIs from learning from existing works. I’m struggling to find a rationale to restrict AI from doing what humans have always done.
Would your opinion change if you could select your desired license from a drop-down box and Copilot would generate code using an LLM trained only on code under compatible licenses? I am curious if this would still bug you.
Well, the legality and hypocrisy are two different matters. Of course they’re both topics worthy of discussion.
> GitHub Copilot is copyright infringement and open source license violation at an industrial scale,
And just to put things into context: I publish plenty of Open Source on GitHub, so I would be affected.
I feel NetBSD is one of those projects so chronically starved for development that this seems like a clear focus on the wrong priority.