We are trying out opening Copilot automatically when Windows starts on widescreen devices with some Windows Insiders in the Dev Channel. This can be managed via Settings > Personalization > Copilot. Note that this is rolling out so not all Insiders in the Dev Channel will see this right away.
↫ Amanda Langowski, Brandon LeBlanc at the official Windows blog
You will use the copyright infringement tool, Windows user.
Yes, it probably violates copyright. I think. A strong argument either way ends up convincing me one way or the other. But anyway, I don’t know how to describe these LLM AI chatbots: not worth the hype, but also kind of useful. It’s like getting excited about socks. Or a screwdriver. But I guess if you were seeing socks or a screwdriver for the first time, you’d be pretty impressed, especially if you’d tried to live without them.
It’s a productivity tool. You can ask it to do tasks that previously were the domain of human assistants. I definitely see the value even though it’s not general AI. Heck, if it were general AI, I’m not sure it’d really be any more useful as an assistant.
Thom Holwerda,
I’d genuinely like to know if you feel that humans should be barred from doing the same things with the same sources?
What is the difference in your view?
The difference is that when someone makes a movie and is inspired by all the other movies they saw… The assumption is that they at least paid to see those movies.
Thom,
I still cannot see how this is any different than how humans learn and imitate each other. We even have a saying: “good artists copy, great artists steal”.
Here the AI is just becoming the “greatest artist”, since it can copy (and steal) from the entire human knowledge, even those in other languages or domains (it can now understand images too in what is called a “multi-modal llm”)
Anyway, since we don’t have a categorization for copyright for reading vs learning, I am not sure many of the claims, except the most egregious examples, will hold up in court.
(To be fair, we have been charging more for college coursebooks. But I think that is already a problem).
A big difference is accountability.
No one is gonna get mad about a program, because it’s just working the way it was designed.
While a real person will forever be tainted by what he/she did, even if it happened decades before.
So no, a program is not a person. And its authors are the ones responsible for any of its wrongdoings.
Anyway, after blatantly reaping others’ works and admitting so, AI programmers will surely find the right balance of what counts as acceptable appropriation, both socially and legally, and get away with it.
Nico57,
I would argue AI faces repercussions, too. At least its owners will pay for its mistakes in the process.
Remember what happened when Cruise tried to push their “robo taxi” service before it was good enough? They had to shut down the service entirely.
One-off mistakes happen, yes. But if you are systematically causing major incidents, and have no way to “learn” from those mistakes, society will also shun that AI out of existence.
Thom Holwerda,
For this hypothetical scenario to be fair, we ought to presume the copies are legitimately acquired in both the human and AI cases. In that case, the difference doesn’t seem to be so much the training as what happens afterwards, right?
Let’s say the productivity of an NN AI were about EQUAL to a human’s. Would you still have a reason to complain that authors weren’t fairly compensated then? After all, the payment to creators would be equal in both cases. That’s the thing: the “AI infringes copyright” argument doesn’t really stem from the AI doing anything differently than a human would. I believe the real difference here simply boils down to the AI’s huge productivity advantage: AI can do the work of one person, or thousands, and maybe even millions.
This is a legitimate concern for creators, and indeed all of us, but I’d like to put forward the notion that this isn’t genuinely a “copyright” problem, but rather a larger socioeconomic one. This doesn’t dismiss the problems, but maybe it changes the lens through which we view them.
Alfman,
To be blunt, I was more worried last year, compared to now. But seeing the progress in “democratization of AI”, I might feel a bit relieved.
Many artists are using AI tools to improve their work, and the outputs are still “clearly recognizable” if used verbatim. So for a “senior” level artist, it becomes much easier to delegate repetitive work. (And for senior programmers, or senior copy editors, and so on).
But one concern is that many “entry level” jobs (think fiverr or upwork) will become more difficult to find. Plenty of people start that way building their portfolio designing simple logos, or adding views to existing mobile applications. Now GPT can do that in a few seconds.
sukru,
Yeah, it has definitely started affecting entry-level tasks already, but I take the long view on AI, and I think it is only going to improve with time, displacing higher-skilled jobs.
We should expect the language models to become incrementally more sophisticated (e.g. GPT-3 to GPT-4 brought dramatic improvements). But the same technology only gets us so far. It’s not enough for an NN to provide an answer through neural weights; to do real math, AI needs to be able to perform computations, and I think one of the next AI revolutions will be combining these deep-knowledge language models with a programmable state machine. Even a simple lisp-like state machine should be able to improve the language model’s ability to do mathematical induction and recursion.
I think the language models (with appropriate training) will be able to solve complex mathematical problems very accurately by providing state machine instructions rather than using the NN to solve them. And I think this will be a revolution, because AI language models are much faster than humans at writing algorithms, and state machines are much faster than humans at following algorithms. Putting the two together while training the NN to an expert level may realistically end up challenging even high-level workers.
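The division of labor described above could be sketched roughly like this (everything here is hypothetical, a toy illustration of the idea rather than any real system): the language model emits instructions for a tiny stack machine, and the machine, not the network’s weights, performs the exact arithmetic.

```python
# Hypothetical sketch: an LLM emits a program for a tiny stack machine,
# and the machine (not the neural weights) does the exact computation.

def run(program, x):
    """Execute a tiny stack-machine program on input x."""
    stack = [x]
    for op, *args in program:
        if op == "push":
            stack.append(args[0])          # push a literal value
        elif op == "mul":
            b, a = stack.pop(), stack.pop()
            stack.append(a * b)            # exact multiplication
        elif op == "add":
            b, a = stack.pop(), stack.pop()
            stack.append(a + b)            # exact addition
    return stack.pop()

# Imagine the model, asked "compute 3x + 7", emitted this program:
program = [("push", 3), ("mul",), ("push", 7), ("add",)]
print(run(program, 10))  # 37
```

The point of the sketch: the model only has to get the *algorithm* right once; the deterministic interpreter then applies it exactly, for any input, with none of the arithmetic errors NNs are prone to.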
I’m not sure if this is “democratizing” or not. Some may view this as a beneficial tool that lets people extend their own skillset and perform above their class. Others will certainly view it as a competitive threat. I guess time will tell, but I think it’s naive to look at the AI today and be dismissive of it. I am confident it’s going to keep getting better.
BTW,
https://techcrunch.com/2024/01/09/duolingo-cut-10-of-its-contractor-workforce-as-the-company-embraces-ai/
BTW, Duolingo just had layoffs relevant to the discussion…
I think this will become more normal, whether it’s employees who are explicitly laid off in favor of AI, or new jobs going to AI instead of human workers.
Alfman,
The language models, like GPT-3 or early Bard, already have access to outside information, be it Google/Bing web search or Wolfram Alpha.
How this works is actually quite simple and clever. The model outputs a stream of “tokens”, and some of the tokens have special meanings, like [END_OF_TEXT], which tells the “main” loop to stop (or switch sides in the dialogue). Some of them are function calls, and cause more tokens to be injected by those plugins.
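That main loop might look something like the sketch below (the special-token names and the plugin interface are invented for illustration; real systems use their own token vocabularies and tool APIs):

```python
# Illustrative only: the special tokens and the "search plugin" are made up.
END_OF_TEXT = "[END_OF_TEXT]"
CALL_SEARCH = "[CALL_SEARCH]"

def fake_search(query):
    # Stand-in for a web-search or Wolfram Alpha plugin.
    return ["result:", query]

def decode_loop(model_tokens):
    """Consume the model's token stream, dispatching special tokens."""
    output = []
    tokens = iter(model_tokens)
    for tok in tokens:
        if tok == END_OF_TEXT:
            break                              # main loop stops here
        if tok == CALL_SEARCH:
            query = next(tokens)               # next token is the argument
            output.extend(fake_search(query))  # inject the plugin's tokens
        else:
            output.append(tok)                 # ordinary text token
    return output

print(decode_loop(["weather", CALL_SEARCH, "paris", END_OF_TEXT, "ignored"]))
# → ['weather', 'result:', 'paris']
```

The key design point is that the model itself never executes anything; it just learns to emit the right markers, and a dumb outer loop does the dispatching.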
But the models themselves are not being trained to learn logic and basic math. How far they can go is another question.
Nope, no such assumption can be made. Public libraries are a thing. Borrowing movies from friends is a thing. Free-to-view content is a thing.
This is why using the phrase “intellectual property” instead of “copyright” will eventually corrupt your brain: Property doesn’t expire after 95 years, “intellectual property” somehow does. Carjacking is illegal and a crime, borrowing a movie from a library or friend without permission from the studio that made it to avoid paying money to see it isn’t. “Intellectual property” isn’t really property.
And the thing is, the law doesn’t care about the “intellectual property” buzzword, the law cares about the legal concept of “copyright” (aka a limited-time state-granted monopoly on the right to make copies of a given work, with exceptions for fair use).
In my opinion, the major difference between computers and biological beings (not just humans) is that we’re imperfect. You might watch a movie, read a book/article or have an experience, but you will never perfectly recollect it. In fact, your recollection gets a little more imprecise each time you attempt to remember (https://www.psychologytoday.com/us/basics/memory/how-memory-works). That means we have the ability to evolve an idea simply by filling in the blanks. A computer cannot do this. With perfect recollection, there are no blanks to fill in, and so there’s no creativity and no original thought. All you see when looking at nature is animals’ ability to adapt to new and old information in creative ways. But when looking at an answer from an AI system, all you see is someone else’s thought, and that’s unoriginal.
teco.sb,
Sure, I agree that NNs that create imperfect art can be more interesting than those that create perfect copies of something. But it doesn’t automatically follow that all artificial NNs are only good at making perfect copies. They’re often “fuzzy” approximations by design, to improve generalization and not be so hyper-focused on the training set.
Out of curiosity, if we designed NNs to be more forgetful, do you think this would have a positive impact on your perception of AI?
I’ll concede that these language models are static in nature and don’t evolve continuously, and I understand why you criticize that. However, language models by themselves really aren’t the end game, and I disagree with your take on AI in general. We’ve already thrown evolutionary NNs at solving complex problems like chess and go without the benefit of any human instruction beyond the rules. The solutions they come up with are completely original and regularly beat the best humans. Eventually these different kinds of AI will be fused together, gaining the capacity to train themselves. When they do, human intelligence may no longer be the highest form of intelligence on the planet.
Hi Alfman,
I think both you and sukru, below, are making the same mistake in thinking that simply applying randomness or noise to a data set (by specifically “forgetting” certain aspects of what it learned) is an approximation to a biological brain.
The brain doesn’t just forget. It forgets, then adds something in its place to keep the memory whole. This is one reason eye-witness accounts can be fickle. Studies have shown that we don’t, for example, just forget someone’s face; we forget that face, then invent a whole new face to put in its place based on our personal experiences. While this can be approximated with randomness, it’s not a random process.
Evolutionary algorithms are very interesting, but they reach their solutions through iteration. This is a mechanical process. I remember an article from a few years ago where they put an algorithm to play chess and it found a bug in the software that allowed it to essentially cheat and win every game. If I remember correctly, it crashed the software through an illegal move and won by default. All within the rules. Also, I do not find quickly churning through every possibility, however nonsensical, until you find a solution that works to be intelligence.
[q]Also, the machine learning in neural networks and human brain inner working are so similar, neuroscience and ml communities heavily feed from each other:
https://webcache.googleusercontent.com/search?q=cache:X7ZIEnWKrwAJ:https://towardsdatascience.com/the-fascinating-relationship-between-ai-and-neuroscience-89189218bb05&sca_esv=598312327&hl=en&gl=us&strip=1&vwsrc=0[/q]
I’m really not arguing that computers can consider more parameters than humans. Actually, that’s exactly what makes us “creative” in comparison to a machine. Since we are not able to identify minute variations in large data sets, we have to creatively think of solutions. The examples in that article are prime examples of what computers do best. But while they mimic the human brain, they do not operate the same way.
Just to be clear… What I’m saying is that, at least in their current iteration, AIs cannot be creative. Because of that, whatever they create is not copyrightable (as has already been legally established, at least in the US), and the results they produce are derivative works because they cannot be considered “new” or “unique” thought. Even humans sometimes run afoul of copyright laws by making their work too close to the original (there are numerous examples of that in history).
Hi Alfman,
I think you fail to realize just how important randomness is to creativity and the human brain, otherwise we could never be anything more than the input that makes us up. We’d follow our brain’s programming and that’s it.
But that’s how neural nets work. I’ll grant you artificial ones aren’t replicating human behavior perfectly (I doubt we’d want that anyway), but I think artificial and biological NNs have a lot of similarities even if we’re not there yet.
You say this without irony, but honestly it sounds so human. NNs are imperfect for perfect things like math and making exact copies.
It is early in AI evolution still, but I suspect computers could objectively pass creativity tests for humans today using double blind A/B testing.
teco.sb,
Alfman already gave a very good answer with the rundown of where AI stands wrt. human brain.
I want to add one point on creativity: It will eventually come.
The early language models (like BART) were able to understand context, and find relevant passages, but were unable to articulate an answer.
Next generation was able to generate coherent (but often incorrect) English passages.
GPT-4 reached a point where the model could do logical reasoning (there are many examples of asking “I have four apples, gave 2 oranges to Mary, how many apples have I left?”, which the model answers correctly by distinguishing between actions and types).
Actual response from GPT:
These are called “emergent properties”. As the model increases in memory capacity, it also increases in mental capacity for additional tasks it was not directly programmed for.
(Though I am sure there are now efforts to fine-tune those properties).
Anyway, back to creativity:
https://ai-scholar.tech/en/articles/alignment%2Fcreativity_of_llm#:~:text=Current%20LLMs%2C%20which%20make%20autoregressive,values%2C%20for%20better%20or%20worse.
As we have discussed, AI can generate creative solutions to problems within an existing set of parameters.
It has not reached the point yet where it can generate completely novel things.
And it might take a while.
(The human brain is estimated to have more than 1,000x capacity of the current state of the art models).
teco.sb,
The neural networks are also exactly like that.
They do have more memory, true. But they are not like classical computers, where we store entire texts in a database.
In fact, modern learning algorithms heavily involve forgetting. Sometimes it is harmful (even called “catastrophic”), other times it is very beneficial: https://arxiv.org/pdf/2307.09218.pdf
Also, machine learning in neural networks and the inner workings of the human brain are so similar that the neuroscience and ML communities heavily feed off each other:
https://webcache.googleusercontent.com/search?q=cache:X7ZIEnWKrwAJ:https://towardsdatascience.com/the-fascinating-relationship-between-ai-and-neuroscience-89189218bb05&sca_esv=598312327&hl=en&gl=us&strip=1&vwsrc=0
And of course there is work in integrating the two (for example for prosthetics control): https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10053494/
Microsoft has only pushed it out on their (opt-in) Dev channel. They want people to beta test their new features, so they often enable the latest thing by default to get eyes on it. It doesn’t mean that it remains default in the GM Windows.
Why do people have sympathy for the copyright industry? What have copyright maximalists done for society besides create a world where you can’t keep what you buy and can’t fix it when it breaks (or they choose to break it)?
Food for thought though: if these AI companies care that little about poking bears with teeth (the copyright people can easily sue you into the ground), how much do you think they care about the privacy and/or “intellectual property” of the average Joe who is not a massive corporation? You might counter and say “but the average Joe doesn’t have anything worth taking”. Maybe that argument is valid to an extent, but remember that there is an entire industry today which is based on tracking your entire life in order to profit from it later. So not really. “Big surveillance” clearly shows that the average person has plenty of data worth stealing, and that is before taking into consideration e.g. innovative or patentable ideas that the user may also have.
It is not “people”. It is real engineers, who enjoy nothing more than their inventions and epiphanies spreading, vs. “something with media” types who desperately try to wring the last cent out of their tangible assets. It’s literally the horse breeders fighting the automobile.
“Swing Riots” was the term I was looking for: https://en.wikipedia.org/wiki/Swing_Riots
They bet on the right horse of course.
Greetings,
the Ottomans and Indians and Greeks called and asked when you’ll pay your fair share for trigonometry, algebra, and numbers. There is also some bad news: since the claim for the wheel is still pending in court, you are not allowed to fairly use it until ownership has been settled.
I can’t wait for some court to rule that the use of copyrighted material to tune the weights on a neural network is not copyright infringement, this will shut up the Thoms who claim it is without providing a single reason why it should be. PROTIP: I’ve been using your website to tune the weights on the neural network in my brain ever since I stumbled on this website. It’s called “reading comprehension”.
Much like that Supreme Court ruling that told Oracle’s lawyers that Google engaged in fair use when they copied the Java APIs shut up the Apple fanboys for good (they had hoped Oracle collecting a license per Android phone would eliminate the price advantage of midrange/upper-midrange Android phones).
I know all creatives and knowledge workers are clinging to copyright infringement as their secret hope of wiping out the threat of LLMs making their home turf obsolete. I just don’t think it will bring the urgently sought deliverance.
Truth of the matter is that LLMs don’t copy verbatim if given enough training data. They consume the offered text and then build a statistical representation of the sequences encountered. The more material they get, the less literal the reproduction becomes.
If you train an LLM on 1 text, it can only give statistical variations on that specific text. If you give it a whole dump truck of material, it will build statistical variations over the whole dump truck, thereby making the connections between certain words much more general, leading to a general summary on subjects, because all texts on a specific subject give a certain context.
Storing the gist of subject matter in a statistical model is highly transformative, and that falls under the fair use doctrine. IMO (IANAL), it’s also the defense to use for LLMs. The more an LLM processes, the more generalized it becomes and the less it will reproduce specific works, because all information on a subject alters the statistics, making it a more distilled version of that subject.
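A toy bigram model illustrates the statistical intuition (real LLMs are vastly more complex, but the principle that more data dilutes verbatim reproduction is the same): trained on one text, the model’s word-successor statistics can only replay that text; trained on several, each word acquires multiple possible successors and exact reproduction becomes less likely.

```python
from collections import defaultdict

def train_bigrams(texts):
    """Count word-successor statistics over a corpus of texts."""
    successors = defaultdict(set)
    for text in texts:
        words = text.split()
        for a, b in zip(words, words[1:]):
            successors[a].add(b)  # record every word seen after 'a'
    return successors

one_text = ["the cat sat on the mat"]
many_texts = one_text + ["the dog sat by the door", "the cat ran to the mat"]

# With one text, "the" can only continue the original sentence:
print(sorted(train_bigrams(one_text)["the"]))    # ['cat', 'mat']
# With more texts, the statistics generalize across all of them:
print(sorted(train_bigrams(many_texts)["the"]))  # ['cat', 'dog', 'door', 'mat']
```

With only one source, sampling from this model walks straight back through the training text; with a whole “dump truck” of material, the same statistics blend many sources and verbatim output becomes the exception, not the rule.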
Does it suck that a computer can now do in a few seconds what took you a few hours or days, and for which you studied years? Yes it does, but the cat is out of the bag. The tool is here and progress stops for no one. Time to find greener pastures. Sooner for the mediocre than the gifted, but change is coming for us all.
I’d be much more sympathetic to copyright infringement arguments in general if copyright maximalism hadn’t resulted in the last century of culture being locked up behind it. Change copyright terms to something more reasonable, like, say, a decade, then we can talk.
anevilyak,
It started as a 14 year term and went up from there.
https://en.wikipedia.org/wiki/History_of_copyright_law_of_the_United_States
I agree it’s totally corrupt, not only in terms of length, but in terms of copyright holders using the law abusively to impede new creations based on fair use, which is the exact opposite of copyright’s purpose. Personally I have no faith that the copyright situation will improve, though, since the big money behind its corruption is still around.