Normally I’m not that interested in reporting on news coming from OpenAI, but today is a little different – the company launched SearchGPT, a search engine that’s supposed to rival Google, but at the same time, they’re also kind of not launching a search engine that’s supposed to rival Google. What?
We’re testing SearchGPT, a prototype of new search features designed to combine the strength of our AI models with information from the web to give you fast and timely answers with clear and relevant sources. We’re launching to a small group of users and publishers to get feedback. While this prototype is temporary, we plan to integrate the best of these features directly into ChatGPT in the future. If you’re interested in trying the prototype, sign up for the waitlist.
↫ OpenAI website
Basically, before adding a more traditional web search-like feature set to ChatGPT, the company is first breaking those features out into a separate, temporary product that users can test, before parts of it are integrated into OpenAI’s main ChatGPT product. It’s an interesting approach, and with just how stupidly popular and hyped ChatGPT is, I’m sure they won’t have any issues assembling a large enough pool of testers.
OpenAI claims SearchGPT will be different from, say, Google or AltaVista, by employing a conversation-style interface with real-time results from the web. Sources for search results will be clearly marked – good – and additional sources will be presented in a sidebar. True to the ChatGPT-style user interface, you can keep “talking” after hitting a result to refine your search further.
I may perhaps betray my still relatively modest age, but do people really want to “talk” to a machine to search the web? Any time I’ve ever used one of these chatbot-style user interfaces – including ChatGPT – I find them cumbersome and frustrating, like they’re just adding an obtuse layer between me and the computer, when I’d rather just be instructing the computer directly. Why try to verbally massage a stupid autocomplete into finding a link to an article I remember from a few days ago, instead of just typing in a few quick keywords?
I am more than willing to concede I’m just out of touch with what people really want, so maybe this really is the future of search. I hope I can always just disable nonsense like this and throw keywords at the problem.
Bring back Altavista’s NEAR operator. Before Google started enshittifying, that’s all I ever wanted.
I know this hasn’t been a popular opinion on osnews, but I welcome AI innovation and believe there is value for people.
Alas, equal access to information on the web may be at risk as more platforms close themselves off and grant preferential treatment for access to their data…
https://arstechnica.com/gadgets/2024/07/non-google-search-engines-blocked-from-showing-recent-reddit-results/
I realize there’s a lot of nuance here. Google are paying reddit for user generated content. On the one hand, there will be those who say this is good because google are paying reddit to allow google’s robots to train AI. But on the other hand, it raises serious antitrust concerns. As long as google sweetens the pot more than others can, it keeps the data out of the hands of competitors and startups, who become blocked because they cannot financially compete with google. A separate point is that reddit are selling data that they haven’t paid for themselves – it’s 100% user content. I’m sure their lawyers have written the terms of service to cover this, but I’d still expect some users to be upset over their content being sold to AI companies without the creators having agreed.
> I know this hasn’t been a popular opinion on osnews, but I welcome AI innovation and believe there is value for people.
One of the first things I had to learn in life: “popular” is often orthogonal to “quality”.
All I can say is: since ChatGPT came along I only rarely use Google, and maybe 90% of my searches run through ChatGPT now. I certainly do welcome a ChatGPT-based search engine. Lots of value to me, at least.
First off, putting random words in quotes is annoying as ever. Stay on brand, Thom!
Ok, anyway. To me, chatbots and search engines are, and should remain, separate.
I can see how a chatbot can be a good interface for finding certain types of data that would involve, say, cross-referencing different sources, or sifting through garbage, alternative spellings, different languages, etc. to find what I need.
However, what I really want is a text search with no interpretation, localization, translation, query rewriting, etc. etc. Just literally, verbatim, what I have typed. A product like that can sit alongside a chat interface. Sadly, I think that ship has already sailed. Google hasn’t returned what I ask for in years, only what it “thinks” 😉 I asked for. And it’s shit.
The worsening of search engines (mostly Google) is making my professional life difficult.
When I started my career, I mostly handled on-premises solutions and we had dedicated vendor contacts. Changes were discussed together and, besides security patches, functionality modifications were few and far between.
Now everything breaks: Azure has trillions of versions of the same APIs, and AWS changes just as often. Documentation usually doesn’t keep up (completing the documentation used to be an inherent part of writing software), and in-app help files or command output don’t perfectly match actual functionality.
In the past, I could refer to documentation, even offline documentation, and deploy correctly and successfully. Now it is a bag of hurt to find out how to get started, and then to discover, after you think you understand something, that default behaviours changed or that syntax “evolved”. Google used to help me fill the gaps in documentation, but now useful results are buried deep on page 5 and beyond, and my daily life is a frustrating exercise of 50% hunting docs, 30% implementing, and 20% anger, trying to find out why things don’t work as they should.
In this sense, chatgpt-like products are being helpful for me, but only for the more conservative deployments. Try anything cutting-edge and you are back to googling and hoping for the best, and explaining to your boss that “you don’t know why things are working like they are” and “you don’t know why they break when they do”.
Spot on and nailed!
I think LLMs only add value to search interfaces if they are reliably capable not only of turning a user’s natural-language query into search-engine keywords (most digitally literate people can do that just fine) but also of processing and integrating data from multiple sources (including weighing them against each other to weed out misinformation) according to the user’s instructions.
That goes way beyond current naive Retrieval Augmented Generation approaches, into more agent-like systems that are currently the subject of very active research.
The issue is the sheer amount of processing power it takes: orders and orders of magnitude more than classic search queries. With current state-of-the-art technology it just cannot be free.
Where the current RAG approach (translate -> search -> retrieve -> summarize) really shines, though, is voice-based interfaces like the Gemini-powered Google Assistant. Only now has it become genuinely useful.
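To make that pipeline concrete, here is a rough, self-contained Python sketch of the translate -> search -> retrieve -> summarize loop. Everything in it is a stand-in invented for illustration (the tiny corpus, the naive keyword scorer, and the stub summarize() that a real system would replace with a grounded LLM call); it is not how Gemini, Google Assistant, or any particular product actually implements RAG.

```python
# Minimal sketch of a naive RAG loop: translate -> search -> retrieve -> summarize.
# The corpus, scorer, and "summarizer" below are toys used only to show the flow.

from collections import Counter

CORPUS = {
    "https://example.org/searchgpt": "SearchGPT is a prototype that combines AI models with web results.",
    "https://example.org/rag": "Retrieval augmented generation retrieves documents and feeds them to a model.",
    "https://example.org/unrelated": "A recipe for sourdough bread with a long fermentation.",
}

def translate(query: str) -> list[str]:
    """'Translate' a natural-language query into bare keywords (naive tokenizer)."""
    return [w.lower().strip("?.,!") for w in query.split() if len(w) > 3]

def search(keywords: list[str], k: int = 2) -> list[str]:
    """Rank documents by keyword overlap and return the top-k URLs with a nonzero score."""
    scores = {
        url: sum(Counter(text.lower().split())[kw] for kw in keywords)
        for url, text in CORPUS.items()
    }
    return [url for url, score in sorted(scores.items(), key=lambda x: -x[1]) if score][:k]

def retrieve(urls: list[str]) -> list[str]:
    """Fetch the text behind each hit (here just a dictionary lookup)."""
    return [CORPUS[u] for u in urls]

def summarize(query: str, snippets: list[str], sources: list[str]) -> str:
    """Stub for the generation step: a real system would prompt an LLM with the snippets."""
    cited = "\n".join(f"- {s} ({u})" for s, u in zip(snippets, sources))
    return f"Q: {query}\nRelevant material:\n{cited}"

if __name__ == "__main__":
    q = "What is retrieval augmented generation?"
    hits = search(translate(q))
    print(summarize(q, retrieve(hits), hits))
```

Even this toy version makes the cost argument visible: every answer requires a retrieval pass plus a generation pass, which is where the extra orders of magnitude over a classic keyword lookup come from.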
I’m one of those people who consider themselves “good” at Google. I use the various operators with loc: and allintext: being my favorites.
It is, however, cumbersome. Having a ChatGPT-esque system where I can give as much or as little context as I want is appealing.
More interesting will be how it’s monetised. I already pay for ChatGPT; if I can “just pay” for my search engine instead of being bombarded by discreet ads and tracking, I’m all for it.
I also consider myself an expert in the cumbersome arts of Google-fu. But I must admit, Google isn’t the same product it was 5 or 10 years ago. It’s actively gotten worse, in a way that feels intentional.
So a few thoughts – I’m stoked to see some new blood in the bloody waters of search engine competition. I actually think an LLM is a perfect way to build a competitive product. As much as I can’t stand the hype cycle that LLMs have produced, the truth is those systems are VERY good at natural language processing – and that is essentially what a search engine must be.
The other exciting potential in this case is the business model. I don’t know how LLMs can ever become profitable, given the expense each query incurs, but they are so far using a direct user payment model. This is similar to the old newspapers – the user and the payer are the same. This is a great idea. Newspapers used a hybrid revenue model, also getting revenue from ads – that’d be fine too, as long as it’s always balanced to keep ad desires from driving “feature” additions and behavioral changes on the platform. Google’s 100% ad-money model is atrocious, and it is probably largely responsible for the decline of the platform. I’m extra pumped to be rid of that model.
That said, I really don’t trust OpenAI due to its questionable leadership – hopefully we’ll see a few additional players in this space.
CaptainN–,
I agree with your post. But regarding AI competition do you have any comments on the link I posted earlier?
https://arstechnica.com/gadgets/2024/07/non-google-search-engines-blocked-from-showing-recent-reddit-results/
It seems possible that the future internet could become blocked off to new startups and that only rich companies with established cash flows will be able to buy themselves into the AI club. I kind of get “net neutrality” vibes here.
I always find the reddit argument an odd one. So much of it is trash and/or toxic.
NOT having reddit results will probably improve the quality of results over what is currently returned – e.g. a reddit post with the same question as yours but no resolution/answer.
I understand the concern about net neutrality, but the irony is that by taking the money, reddit is reducing its traffic and interactions, which in turn will lower the value of its data. It will also give the competition space to grow into, without this “dominant player” filling all the results.
Adurbe,
Yeah, I don’t go out of my way to restrict results to reddit like some people do.
I often find that the Stack Exchange sites have a better format. SE’s problem is that they incentivized a culture of overzealous moderation, handing out points for moderating even when moderation isn’t called for, to the overall detriment of the platform. Of course it’s important to stop spam/abuse, but on the whole a light touch is better. Even with the best intentions, heavy-handed enforcement against normal users creates a bad experience for users and visitors. To be fair, I think SE has gotten better since a few years ago. Maybe they realized moderators were ruining their own platform?
It’s not clear if reddit’s position is strong enough to demand (and get) payment. It’s like a game of chicken. You are right that Reddit’s withholding content from chatgpt/bing/duckduckgo/etc will necessarily reduce traffic to reddit from competing search providers. Technically this is bad for both sides, but since most search traffic comes from google’s monopoly anyways, reddit may just consider google’s competitors expendable if they don’t pay up.
https://gs.statcounter.com/search-engine-market-share
Reddit executives may have calculated that closed access plus google’s $60M/year is at least as profitable and sustainable as providing open access to all search engines.
I’m not so sure. As a DDG user, I think a status quo where google gets more exclusive access to web content sets a bad precedent for competition. I do understand your point that users on bing/DDG might direct traffic to a new service, but who’s to say that a new service that manages to reach critical mass won’t sell out, pulling the same move as reddit? A closed internet could become the new normal, and I’m not thrilled at the prospect.
This is a solution for a problem that is at least two other problems down the road. The people this kind of search would help are the people that OpenAI is having a hell of a time trying to draw into the platform to begin with. The tech literate (the only ones really even using ChatGPT) will MUCH prefer the traditional search engine approach. The non-literate will continue to blindly ignore ChatGPT unless their employers force them to use it. Regardless, this search capability is going to cost money and time and result in very little usage. You can’t convince Grandma Sally to learn a brand new tool (and PAY for it, no less), and she really is the person this kind of tool would ultimately help. Sadly, she is also the least likely to understand when it coughs up some hallucinated misinformation.
TLDR – This is a dumpster fire.
Plus, we already have enough capacity problems with datacenters and their power supply.
Stop enshittifying search engines and then using generative A.I. as an excuse to un-enshittify the results; just write a decent, power-efficient search engine.
It’s bad enough that it took so long for us to start to get past that 90s “We can use scripting languages for everything because Moore’s Law” mindset and start to shift a significant amount of that effort toward more resource-efficient languages like Go and Rust.
ssokolow (Hey, OSNews, U2F/WebAuthn is broken on Firefox!),
I’m not going to defend search engine enshittification… but I will say that even if search engines did their job nearly perfectly, AI would still have merit in terms of introducing more powerful tools capable of doing more than search engines ever could.
There have been so many instances when I’ve needed to do product research: searching, opening dozens of tabs, reading documentation, skimming through hundreds of reviews, compiling data into spreadsheets, etc. Even with good search engines, research tasks can still end up being a laborious process because all the data aggregation is being done manually. Having an AI assistant that can do this work for you is a killer feature. I’ll grant you we may not be there yet, but frankly that goal is quite appealing and I do believe it is within reach.
For me, my gripe isn’t so much the AI’s existence, but where it runs and who controls it. I’m not a fan of technology becoming gated behind corporate silos. I’d strongly prefer for AI technology to exist as products we can run on our own machines. Alas, over the past decades tech companies have learned to maximize profits with rent-seeking behaviors that involve engineering technology to trap customers inside their services. Unfortunately, AI will be coming of age inside these enshittification circumstances, where consumer freedom is bad for business. This is my biggest gripe with where AI is headed.
To rival google search results nowadays, you just have to be altavista with 20 employees. Nothing you search for on google is accurate (try a simple maths formula), and all news is slanted to Brin’s world view.
Quant is much better, but I dislike that they supplement their own searches with Bing results; let’s just call it what it is: “crap”.
Quant is set to be the new major search engine in Sweden, but the name is a problem. Q in Swedish is just the K sound, as Q is not used in any Swedish words at all (only in immigrant names and when reading foreign languages), and the “ua” sound does not have a letter in Swedish either. So most people just pronounce it “kunt”.
For the past two weeks I have conducted an experiment. I have set Perplexity AI as my default search engine, both for work and for school. I am never going back.
Perplexity has been significantly more helpful than Google on everything from obscure coding questions to concepts of quantum mechanics (really, I had a paper). It even does a good job on current events. And it cites references.
I lived through the search engine revolution of the ’90s. This is the most extraordinary transformation since.
This looks really good, thanks for the link.
I briefly tested it by searching for “continuous compounding”, and it promptly(!!!) showed the explanation of how discrete compounding becomes continuous for an infinitesimally small tenor. Try this with Google to no avail.
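For reference, the standard limit being described, with principal $P$, annual rate $r$, time $t$, and $n$ compounding periods per year:

$$\lim_{n \to \infty} P\left(1 + \frac{r}{n}\right)^{nt} = P\,e^{rt}$$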
I have been using Perplexity for months. I don’t believe it indexes the web directly; it processes search results from the usual sources and provides focused answers quickly. It can make errors in calculations. Also, it might be an hour behind: it strongly denied Trump had been shot when asked about an hour after the fact, before relenting. It provides search results as well, but as backdrop rather than foreground.
I will feed it a link to an article and ask it for the main point. “News” often buries the main point halfway through so you have to scroll past ads; Perplexity extracts the actual information in less than a second. My Google news feed is actually very good at selecting topics I want to learn about, except for the garbage packaging. I would love a news feed browser that extracted information from feeds rather than making you process the junk yourself, or copy and paste links into an AI. The problem is that the feed sources will eventually not get paid. Maybe we need a new information model where you pay something like a nanodollar per bit of useful textual info. The AIs can concentrate articles into that text stream no problem, i.e. measure useful info content. That rate translates to roughly $1 per three hours of reading. This feedback loop would cause enhancement rather than degradation of web content. Imagine that!
As long as AI search results include answers from Quora, it is going to keep coming up with answers like “you should eat at least a handful of gravel each day for a vitamin boost” and “Napoleon ate so many puff pastries that he died, and the pastry is today named in his honour”.
NaGERST,
Haha. People need to understand that AI is vulnerable to garbage in, garbage out. Frankly, I’m pretty sure people are too.