Tech-makers assuming their reality accurately represents the world create many different kinds of problems. The training data for ChatGPT is believed to include most or all of Wikipedia, pages linked from Reddit, a billion words grabbed off the internet. (It can’t include, say, e-book copies of everything in the Stanford library, as books are protected by copyright law.) The humans who wrote all those words online overrepresent white people. They overrepresent men. They overrepresent wealth. What’s more, we all know what’s out there on the internet: vast swamps of racism, sexism, homophobia, Islamophobia, neo-Nazism.
Tech companies do put some effort into cleaning up their models, often by filtering out chunks of speech that include any of the 400 or so words on “Our List of Dirty, Naughty, Obscene, and Otherwise Bad Words,” a list that was originally compiled by Shutterstock developers and uploaded to GitHub to automate the concern, “What wouldn’t we want to suggest that people look at?” OpenAI also contracted out what’s known as ghost labor: gig workers, including some in Kenya (a former British Empire state, where people speak Empire English) who make $2 an hour to read and tag the worst stuff imaginable — pedophilia, bestiality, you name it — so it can be weeded out. The filtering leads to its own issues. If you remove content with words about sex, you lose content of in-groups talking with one another about those things.
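To make that filtering problem concrete, here is a minimal sketch of blocklist filtering. It assumes the filter drops any document containing a listed word as a whole word. The three-word `BLOCKLIST`, the helper names, and the sample documents are all hypothetical, chosen for illustration; this is not the actual list or any company's pipeline.

```python
import re

# Hypothetical three-word blocklist for illustration; the list described
# above has roughly 400 entries.
BLOCKLIST = {"sex", "porn", "nude"}

def contains_blocked_word(text, blocklist=BLOCKLIST):
    """Return True if any whole word in the text is on the blocklist."""
    words = re.findall(r"[a-z']+", text.lower())
    return any(word in blocklist for word in words)

def filter_corpus(documents, blocklist=BLOCKLIST):
    """Keep only the documents that contain no blocked word."""
    return [doc for doc in documents
            if not contains_blocked_word(doc, blocklist)]

# Made-up sample documents.
corpus = [
    "Recipe: slow-cooked beans with garlic.",
    "Support group thread: talking to your doctor about safe sex.",
    "Click here for free porn.",
]

kept = filter_corpus(corpus)
print(kept)
```

In this toy corpus the support-group post is discarded along with the spam, because the filter sees only the word and not who is speaking or why.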
These things are not AI. Repeat after me: these things are not AI. All they do is statistically predict the best next sequence of words based on a corpus of texts. That’s it. I’m not worried about these things leading to SkyNet – I’m much more worried about smart people falling for the hype.
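For anyone who wants to see what "statistically predict the next word from a corpus" looks like at its very simplest, here is a toy bigram sketch. It is far cruder than a large language model and is not how ChatGPT is implemented; the function names and the tiny corpus are made up for illustration.

```python
from collections import Counter, defaultdict

def train_bigrams(text):
    """Count, for every word, which words follow it in the corpus."""
    counts = defaultdict(Counter)
    words = text.lower().split()
    for current, following in zip(words, words[1:]):
        counts[current][following] += 1
    return counts

def predict_next(counts, word):
    """Return the most frequent follower of word, or None if unseen."""
    followers = counts.get(word.lower())
    if not followers:
        return None
    return followers.most_common(1)[0][0]

def autocomplete(counts, start, length=5):
    """Greedily extend start by up to length predicted words."""
    output = [start.lower()]
    for _ in range(length):
        next_word = predict_next(counts, output[-1])
        if next_word is None:
            break
        output.append(next_word)
    return " ".join(output)

# Made-up miniature corpus.
corpus = ("the cat sat on the mat . the cat ate the fish . "
          "the dog sat on the rug .")
model = train_bigrams(corpus)

print(predict_next(model, "the"))
print(autocomplete(model, "sat", length=2))
```

The sketch only ever looks one word back and always picks the single most frequent follower, which is the "autocomplete" picture in its barest form.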
ChatGPT and the like are a glorified autocomplete.
It’s a bit more advanced than that. It’s entertaining watching it generate GURPS characters based on historical figures, as well as trying to convince it that the Flat Earth theory is true (it wasn’t convinced). Also, what was the point of the white man hate in the article? The internet has all sorts of things, good and bad, because of the idiots spewing forth on it, but that for sure isn’t tied to only one skin color or gender. All it proves is that humans are terrible when they have any sort of anonymity.
We may have to consider these are limitations of the medium.
Think about what would happen if you stuck a real human being on the other side of a chat window. The experience is very similar: “just a glorified autocomplete predicting the best next sequence of words.” Yes, of course the fact that we know the secret formula takes the magic out of it, but surely any fair metric for intelligence must not merely concern itself with the fact that words are output sequentially, but actually consider what those words are saying.
In scenario A we have a computer program that generated a given OUTPUT for a given INPUT.
In scenario B we have a human that produced the exact same OUTPUT given the exact same INPUT.
Now suppose an observer judges the intelligence of the conversation in both scenarios. Hopefully everyone sees the hubris involved in claiming that one is intelligent and the other is not. It’s the words themselves that must be indicative of intelligence, not the method of generating words.
Of course. The models being used today are static. At best, it’s like taking a snapshot of everything one knows, but not being able to learn any further. This is a limitation of today’s NN training methods; however, looking further down the line, I do predict dynamic neural nets that will be able to learn by “first hand” experience. Frankly, it could become harder for humans to compete at that stage. AI reaching human-level intelligence is going to be an important milestone, let’s say by earning a doctorate in every field. I’m still not sure people are actually going to be impressed though, haha.
People who say “This is not AI” clearly misunderstand the term, possibly imbuing it with some Hollywood-esque traits.
People who say “This is glorified autocomplete” clearly misunderstand the technology, and likely have not had enough imagination or real world reason to use it for what it is amazing at, or they would have realised their characterisation was stupid.
Some PEOPLE are like glorified autocomplete.