Yesterday, I wanted to leave a link to a search query in a Facebook comment. Then this happened. Please Google, fix your damn query URLs. This is unusable, user-hostile crap.
What kind of asshat uses a proprietary Gtk-based editor that doesn’t even have as many features as gedit?
You want to start an editor war off the back of an article about URLs? Really?
FTFY
Google has a neat fix. Next time try g.co for Google URLs and goo.gl for any random URL:
http://goo.gl/
http://g.co/
And, how exactly would shaving off a handful of characters from the domain name help with the extremely long part of the url after the domain name?
Are you from the past or just trolling? Do you know how this works? Try this URL from this website:
http://osne.ws/jxq
https://en.wikipedia.org/wiki/URL_shortening
Thom isn’t ranting about the length of the URL. He is ranting about the query format in the URL.
That will also get saved for redirection by the shortened URL. So it doesn’t really matter how much crap is in the query string.
Edited 2012-04-18 16:09 UTC
I shouldn’t need to use additional services nor should I have to figure out which part of a URL I can safely delete. When I quickly want to pass someone a link to a goddamn *search query*, it shouldn’t be more than domain.com/string. Simple. Having it span 4 lines is idiotic.
Hmm, I guess that depends on whether you define the full URL – query string and all – as part of the usable site or just hidden mechanics (some browsers even hide query strings in the address bar until you click into them).
In theory, you should never need to see the query string because of how hyperlinks and forms work (e.g. readable text with a hidden URL behind an anchor tag) – so part of me thinks you’re being a little unreasonable.
However, it’s also true that people do still copy and paste URLs (as you did here), so a more user-friendly query string does make some sense for usability.
At the end of the day, there are simple workarounds and they don’t add a huge overhead on your time, so is it really worth Google investing time in rewriting the mechanics of their search engine just to make it more copy-and-paste friendly? Maybe just a small “copy this link to clipboard” button – which copies a tiny URL – would be a better solution?
Hyperlinks eliminate the need to post links in plaintext.
I understand, but these are variables used for specific purposes (stats, referral payouts, etc.). If you don’t want to pass on that information, you can just paste something shorter like this: http://google.com/search?q=something
If you bothered to click the link you’d see the former is a URL shortener (such as TinyURL) and not a substitute domain name.
I mean seriously, it takes all of 5 seconds to check and would have saved you days of looking foolish.
And if you bother to click the second link (which I did), you’d notice that all it does is shave a few characters off the domain, it doesn’t touch the parts after the domain.
Who’s the idiot now?
You, because the 2nd link is another URL shortener, but a private one for Google services. Thus it’s not just a replacement domain either.
It’s not a neat fix. In general using short URLs is bad, since they obscure the target.
Suggestion: Try the preview function of tinyurl.com
It lets the user go to tinyurl’s site first and see the full url before proceeding to the target.
It makes it a longer process but it helps those that are suspicious of using the interwebs.
And you still get the short URL to post or email.
Does someone with a lot of time on their hands want to sit down and analyze the meanings of the different parts of the URL? Could be interesting to figure out what is so important to Google anyway.
The only part of the query string needed is the ‘q’ bit, e.g.: google.com?q=search+term
The rest (from what I’ve seen in the past) is largely analytics (someone please correct me if I’ve got that wrong)
And chop out all the unneeded parts. The query URL will still work after you do that.
http://www.google.com/#q=fiona+apple+riding+a+unicorn
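The “chop out the unneeded parts” step can also be done programmatically – here is a minimal sketch using Python’s standard urllib.parse (the extra parameter names in the example URL are made up for illustration, not taken from an actual Google URL):

```python
from urllib.parse import urlsplit, parse_qsl, urlencode, urlunsplit

def strip_tracking(url, keep=("q",)):
    """Rebuild a search URL keeping only the parameters listed in `keep`."""
    parts = urlsplit(url)
    kept = [(k, v) for k, v in parse_qsl(parts.query) if k in keep]
    return urlunsplit((parts.scheme, parts.netloc, parts.path,
                       urlencode(kept), ""))

# Hypothetical long URL with made-up tracking parameters:
long_url = ("http://www.google.com/search?q=fiona+apple"
            "&client=firefox&ie=UTF-8&sourceid=navclient")
print(strip_tracking(long_url))
# → http://www.google.com/search?q=fiona+apple
```

Whether ‘q’ is really the only parameter worth keeping is the commenter’s observation above, not a documented guarantee.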
Now, why the hell did they change the normal questionmark (?) to a pound sign (#)? It used to work (a couple of years) before.
aargh,
“Now, why the hell did they change the normal questionmark (?) to a pound sign (#)? It used to work (a couple of years) before.”
AnyoneEB and I already commented on why, but maybe it wasn’t clear enough.
Due to web browser restrictions javascript pages (particularly those updating their own content via AJAX calls) cannot update parameters after the ‘?’ symbol without forcing a full page reload. This is an annoying restriction but it’s what we web developers have to live with since no browsers I am aware of have ever fixed it. Consequently AJAX web developers have to accept that page URLs no longer reflect the javascript content that is being displayed on the page. The URL either must be generic, or it will be wrong as the page’s content is updated. (Or we avoid using AJAX to refresh sections of the page and revert to a full postback).
The exception of course is for data after ‘#’, since from the early days of netscape it was used to jump to different “bookmarks” in the web page. It never generated a postback.
To be sure, it’s a terrible practice since pages using ‘#’ to pass parameters don’t work without javascript. It’s a total misuse of the way they are supposed to work. Had Thom or anyone else posted those links, it’d probably cause unintended accessibility problems. But there you go, hopefully you now understand the motivation in a nutshell.
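To make the ‘?’ vs ‘#’ distinction concrete, here is a small sketch using Python’s standard urllib.parse, just to show which part of the URL the server ever sees (nothing here is Google-specific):

```python
from urllib.parse import urlsplit

# '?' parameters are part of the HTTP request, so changing them means a new
# page load; '#' fragments stay in the browser and are never sent to the server.
query_style = urlsplit("http://www.google.com/search?q=popcorn")
fragment_style = urlsplit("http://www.google.com/#q=popcorn")

print(query_style.query)        # 'q=popcorn' -- sent to the server
print(fragment_style.query)     # ''          -- the server sees nothing
print(fragment_style.fragment)  # 'q=popcorn' -- only client-side JavaScript reads this
```

Which is exactly why the ‘#’ form breaks without JavaScript: the server receives a request for plain “/” and has no idea what you searched for.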
Thom Holwerda,
When I search from duck duck go’s engine, I get urls like this:
http://duckduckgo.com/?t=lm&q=popcorn
Which is obviously better than google’s monstrosity, however it suggests that you might have edited the duckduckgo.com url yourself, in which case you should have given google the same benefit. The following works fine for example:
http://www.google.com/#q=popcorn
Doing the same queries from the firefox search box yielded these for me:
http://duckduckgo.com/?t=lm&q=popcorn
http://www.google.com/search?q=popcorn
So my question is: did you edit the duck duck go url by hand? If not where did you get it from?
* Side note:
I don’t particularly like google’s use of anchor tags to pass information to a webpage because it’s a fundamentally flawed use of URLs. For instance, try opening google’s anchor URL with javascript disabled and it will not work. The reason they probably do it is to work around a browser limitation when updating the URL and using AJAX, but it’s certainly a bad design.
My DuckDuckGo just leaves a ?q= param; looks like the &t= comes from when DDG is integrated in software, to track the user base coming from that particular place.
Good to see you’re actually using it though.
http://help.duckduckgo.com/customer/portal/articles/448610–t-
joehms22,
Well that was a good catch!
It is a Linux Mint customization, and though it hadn’t occurred to me, this is how the Linux Mint project gets to share in the profit from its users’ searches.
I guess anyone clicking on my earlier link and then clicking on ads might well contribute ad proceeds to Linux Mint!
Thom, you need to get this profit sharing arrangement for osnews! (…not that I personally click on ads)
duckduckgo.com/?t=osn&q=popcorn
I did not edit the URLs in any way.
This happened to me only on a few occasions and maybe it was just a fluke at those times, but there have been times where I sent someone a google query URL and it did not return the same results that I saw. What I *think* happens is that if two people are signed into their google/gmail accounts, google tries to adapt queries to your personal preferences/click history, so the same query would produce two different results – but that is completely anecdotal.
The specific example I am thinking of was that I sent someone a google query of ‘Groovy’. My first result is the Groovy scripting language. I can’t even remember what the first result was on the other person’s search but it was not for the scripting language Groovy. It was like 3 or 4 links down for him.
runjorel,
I’ve definitely seen cases where different people see different results. I don’t know this for a fact, but I always assumed it was a deliberate design to provide localized and/or context-sensitive results. It’s possible it was a bug, but I don’t think so, and for the record neither of us were signed in.
There are two reasons for this.
The first is the so-called “filter bubble” phenomenon. Google delivers content ranked in the order they think you want to see it. And there are presumably more factors taken into account than just your click or search history when you’re logged in (cookies, geolocation, language settings, ISP, … endless possibilities).
A simple explanation of this has been published by the Duckduckgo people at http://dontbubble.us in order to raise more awareness to this problem.
The second reason is obviously just an updated search index. The web content changes constantly and so does Google’s search index. If you look at your search results a day later, you may also get different results.
I’d assume there’s also a third reason: the index and its results not being strictly deterministic – operating on a best-effort basis within preset time delays, momentary availability of nearby resources, and such.
The siblings explain pretty well why bubbling happens, but if you want to be aware of it, there’s a Chrome extension that displays other possible search results for the same query from different locations: http://bobble.gtisc.gatech.edu/software.html .
I totally agree with Thom. Those URLs are retarded.
And I personally don’t like transparent search customisation; I want to get the same results no matter what account, or machine, or location I do the search from. Indeed, filter bubble is a sick invention.
What do you mean? It’s the best thing since sliced bread. I’m happy to be finally getting the results that I want even if the name is generic.
On the other hand, I am aware that everything I do online influences what I see, so I make sure to keep things separate. That is why I always have 3 browsers open at work (programmer) and I never sign into Facebook unless in private mode.
Nitpicking I suppose, but it isn’t a query string. Google is shoving everything into the fragment portion of the URI. Not sure why they are doing that, kinda silly.
It’s a common trick for websites that make heavy use of AJAX to change their content. With Google Instant you can change the search query without loading a new page; Google puts the changed search query into the part of the URL after the #, which can be changed by JavaScript without loading a new page. This makes the back button work better with AJAX pages.
It’s not just the query URLs. It’s also the search result URLs.
Let’s say I search for “The FreeBSD Project”. As one of the first results, I get this:
“The FreeBSD Project”
Okay, looks fine. The associated URL is this:
http://www.google.com/url?sa=t&rct=j&q=freebsd&source=web&cd=1&ved=…
But the real URL of the search result should be this:
http://www.freebsd.org/
This is a fully qualified name including protocol, server and trailing /, which is fine. That is the information that I consider “the result of the web search”.
Let’s assume I want to send that search result to a friend, like “I found this item you’ve been asking me for, look at it and see if it helps”, which one of the results do you think is better?
Also note that in this example, I’m not supposed to actually open the result (in a tab or a new window); instead I just “copy link address” which should be sufficient. I’m also not supposed to look at the status line which displays http://www.google.com/url?sa=t&rct=j&q=ibm%20i5%2Fos%20…
With the “The FreeBSD Project” example above, I had the opportunity of copying the “green address” below the google result link. With this example, I don’t have even that anymore.
Reason? The real URL to the search result is “abbreviated”. The green text reads:
http://www.gateway400.org/…/ …
For some shorter results, it may appear like this (from one of the next results):
common.es/p/manuales/sec_sg246668_security_guide_v5r4.pdf
This is something I could easily copy & paste, even though it misses protocol and server. But that\’s not a big deal here.
However, and that is my final statement according to those two examples, “copy link address” should provide the target of the search operation, not a “google-affected” URL with information that may harm privacy or security.
It has worked that way in the past. Why has it been disimproved so badly?
off-topic: I went into the database to fix your comment: you used italics for the links, but the close tags were attached to the URLs by our system :/. Fail.
Fixed it!
It seems you fixed my post before I edited it again, so some additional info is lost. I intended the links to be fully visible (instead of being clickable) for better illustration of what I see as the problem. Also some quotes turned into backslash-quotes. You’ve been too fast. 🙂
I hope it’s okay that I repost the full (modified and re-styled) message. If not, feel free to delete it. I’ve been careful to check everything in preview, and I’ve also added some newlines to make the database happy.
*** start edited post ***
It’s not just the query URLs. It’s also the search result URLs.
Let’s say I search for “The FreeBSD Project”. As one of the first results, I get this:
“The FreeBSD Project”
Okay, looks fine. The associated URL is this:
http://www.google.com/url?
sa=t&
rct=j&
q=freebsd&source=web&cd=1&ved=0CDUQFjAA&
url=http%3A%2F%2Fwww.freebsd.org%2F
&ei=PMqGT-5FMrAtAb6ubHTBg&
usg=AFQjCNFNUcBDJeqme1f1qWHxQ2sbygMFNQ
(NB: I’ve disassembled the result into several lines. In reality, it’s one long line of course.)
But the real URL of the search result should be this:
http://www.freebsd.org/
This is a fully qualified name including protocol, server and trailing /, which is fine. That is the information that I consider “the result of the web search”.
Note that you can’t even cut this address as “valid text” from the search result’s URL: The part “http%3A%2F%2Fwww.freebsd.org%2F” would need replacements for : and / to become valid.
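Decoding that embedded address is mechanical, for what it’s worth – a Python sketch with the standard urllib.parse (the redirect URL below is trimmed to the relevant parameters):

```python
from urllib.parse import urlsplit, parse_qs

redirect = ("http://www.google.com/url?sa=t&rct=j&q=freebsd"
            "&url=http%3A%2F%2Fwww.freebsd.org%2F")

# parse_qs percent-decodes values, so %3A -> ':' and %2F -> '/':
target = parse_qs(urlsplit(redirect).query)["url"][0]
print(target)  # http://www.freebsd.org/
```

Which of course just underlines the complaint: the browser could hand you the real target directly instead of making you (or a script) dig it out.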
Let’s assume I want to send that search result to a friend, like “I found this item you’ve been asking me for, look at it and see if it helps”, which one of the results do you think is better?
Also note that in this example, I’m not supposed to actually open the result (in a tab or a new window); instead I just “copy link address” which should be sufficient. I’m also not supposed to look at the browser’s status line which displays
http://www.freebsd.org/
which is not what “copy link address” will contain (misleading!).
It can be even worse. Let’s say I’m searching for some topic and find a PDF file which I consider a valid search result.
Its name, shown in blue “link text”:
[PDF] System i Roadmap and i5OS V6R1 Preview – ISV – 1-08-08
The “copy link address” function of the browser gives this:
http://www.google.com/url?
sa=t&
rct=j&
q=ibm%20i5%2Fos%20manual
&source=web&cd=5&ved=0CE8QFjAE&url=http%3A%2F%2Fwww.gatewa y400.org%2Fdocuments%2FGateway400%2FHandouts%2FSystem~ *~@~2520i%2520Roadmap%2520and%2520i5OS%2520V6R1%25 20Preview%2520-%2520ISV%2520-25201-08-08.pdf
&ei=gAKFT4O9BsrFswaN0fTBBg&
usg=AFQjCNGin6kNGlUleKwdEbrXhPiI3pLTmA
With the “The FreeBSD Project” example above, I had the opportunity of copying the “green address” below the blue google result link. With this example, I don’t have even that anymore.
Reason? The real URL to the search result is “abbreviated”. The green text reads:
http://www.gateway400.org/…/ …
For some shorter results, it may appear like this (from one of the next results):
common.es/p/manuales/sec_sg246668_security_guide_v5r4.pdf
This is something I could easily copy & paste, even though it misses protocol and server. But that’s not a big deal here.
However, and that is my final statement according to those two examples, “copy link address” should provide the target of the search operation, not a “google-affected” URL with information that may harm privacy or security.
It has worked that way in the past. Why has it been disimproved so badly?
*** edit: improved readability ***
The reason result links are not direct links is to allow Google to track which results are used.
Sure, on one hand it can be bothersome and despicable tracking. But on the other it allows them to draw connections between search terms and the results people actually find interesting.
Note: Nothing stops them of making it more user friendly.
On one website we use a similar process with user-friendly JavaScript: a simple click passes through our special URL, but right-click and copy link works as expected. Google breaks that by using the mousedown event to block any attempt to copy the link.
Retarded. Google can get the localization information from the headers and IP. Why the need for all the extra params?
Hiev,
“Retarded. Google can get the localization information from the headers and IP. Why the need for all the extra params?”
I won’t delve into whether it’s retarded or nay, but tagging extra tracking params onto a URL allows a website to track sessions in browsers that disable cookies. Furthermore, it allows websites to unambiguously track navigation between pages in a session. Also, because google controls adsense/doubleclick/google analytics/youtube/etc networks, it could enable them to track sessions across 3rd party domains as well (that is scary).
* Note I don’t actually know what google tracks this way, but I assume it tracks everything it can.
I guess one WTFLOL deserved another WTFLOL.
I’m looking at features of that editor and comparing it with vim: check, check, minimap – that would be kinda hard to do in text mode, check, check, all check.