General Question

LostInParadise's avatar

How has Google gotten so good at guessing what I meant?

Asked by LostInParadise (25388points) April 9th, 2009

Google has gotten very good at piecing together requests. It goes beyond simple spell checking. They will take apart words run together and recognize names of organizations. Sometimes they will suggest an alternative request even when mine is exactly as I intended. It must somehow be based on previous queries, but how do they store the information? They can’t possibly store every previous query. Do you have any idea of the type of algorithm they use?


18 Answers

Zen's avatar

Oh you know…

NaturalMineralWater's avatar

It’s all about the algorithm.

mattbrowne's avatar

I’m not sure. Technically, there are at least 2 ways of doing it.

1) Using the cookie feature of your browser
2) Knowing who you are after you sign into your Google account

The second feature is heavily used not only by Google, but also Amazon. Overall the techniques belong to the field of personalization.

From Wikipedia: Google was the first big search engine to introduce personalized results on a massive scale. Weighing a number of factors including but not limited to user history, bookmarks, community behaviour and site click-through rate and stickiness, Google is providing results that are specific to what they believe you are searching for. Currently this service is only available to those who are logged into their Google account.

There are three broad methods of personalization:

1. Implicit
2. Explicit
3. Hybrid

Implicit and explicit methods both deal with gathering information about the user. Implicit personalization gathers that information without asking the user for it. Explicit personalization, on the other hand, is usually done by asking the user to rate items, fill in forms, click radio buttons, etc. Hybrid personalization combines the two approaches to get the best of both worlds.

allen_o's avatar

Because big brother is watching you

squirbel's avatar

Cookies, and knowing who you are after you sign in to your account. [just echoing mattbrowne because he is right.]

And cookies store everything. Every query. Feel free to erase them every now and again…

That won’t erase the search history recorded on Google’s side, though [whenever you are logged in].

Lupin's avatar

Have you checked out Yahoo’s search with suggestions turned on? I found that to be much better than Google. I use Firefox and can easily switch between about a dozen search engines, and I like Yahoo better. I recently read someplace that Google is playing catch-up with this feature. As if it’s possible to ever have Google play catch-up on anything.

richardhenry's avatar

Google have essentially built a badass spell checker (read up on stemming), and then ordered the results by query popularity. By not comparing words to an English dictionary, and instead building their own dictionary of every word ever typed into Google, and then ordering those by popularity, they have a good spread of words you probably meant, including non-English words and organisation names and brands.

Google do store and make available internally every single query (although this sounds impossible to us), by lumping them together into query counts. These query counts also power Google Trends and much of their Zeitgeist data.
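To make the idea concrete, here is a minimal sketch of a popularity-ranked spell checker built on lumped-together query counts. It is not Google's actual implementation: the query log, the one-edit candidate search, and the names `QUERY_COUNTS`, `edits1` and `suggest` are all assumptions made up for illustration.

```python
from collections import Counter

# Hypothetical query log; real counts would come from aggregated search traffic.
QUERY_LOG = ["flickr"] * 50 + ["flicker"] * 20 + ["fcuk"] * 5 + ["slapper"] * 8

# "Lumping together": one counter entry per distinct query term.
QUERY_COUNTS = Counter(QUERY_LOG)

ALPHABET = "abcdefghijklmnopqrstuvwxyz"


def edits1(word):
    """Return every string one edit away (delete, swap, replace, insert)."""
    splits = [(word[:i], word[i:]) for i in range(len(word) + 1)]
    deletes = [a + b[1:] for a, b in splits if b]
    swaps = [a + b[1] + b[0] + b[2:] for a, b in splits if len(b) > 1]
    replaces = [a + c + b[1:] for a, b in splits if b for c in ALPHABET]
    inserts = [a + c + b for a, b in splits for c in ALPHABET]
    return set(deletes + swaps + replaces + inserts)


def suggest(query, counts=QUERY_COUNTS):
    """Return a more popular term within one edit of query, or None.

    The query itself wins ties, so a term is only "corrected" when some
    neighbour has been typed strictly more often.
    """
    candidates = {w for w in edits1(query) if w in counts}
    candidates.add(query)
    best = max(candidates, key=lambda w: (counts[w], w == query, w))
    return None if best == query else best
```

With this toy data, `suggest("slappr")` returns `"slapper"`, while `suggest("flickr")` and `suggest("fcuk")` return `None` because nothing nearby is more popular. Because the dictionary is just whatever people typed, brand names and non-English words need no special handling.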

@mattbrowne The impression I’ve gotten from reading the Google blog is that these are not based on personalization. A user receives exactly the same “is this what you meant?” recommendation in every case, logged in or out. Unlike a recommendations engine, this engine is based on interpreting error through natural language processing.

I believe the correction is also heuristic, and learns from its mistakes for everyone if a lot of people end up choosing a non-corrected search result. This way, Google can, in a short space of time and without any systems administration or moderation, determine that a search for fcuk doesn’t need to be corrected.
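The feedback loop described here could look something like the sketch below. It is only a guess at the mechanism: the class name `SuggestionFeedback` and both thresholds are hypothetical, and nothing here reflects how Google really records clicks.

```python
class SuggestionFeedback:
    """Track whether users accept or ignore a "did you mean" suggestion."""

    def __init__(self, min_impressions=100, max_ignore_rate=0.8):
        # Both thresholds are made-up values for illustration.
        self.min_impressions = min_impressions
        self.max_ignore_rate = max_ignore_rate
        self.shown = {}    # query -> times a suggestion was displayed
        self.ignored = {}  # query -> times the user kept the original query

    def record(self, query, accepted_suggestion):
        """Log one impression and whether the user clicked the suggestion."""
        self.shown[query] = self.shown.get(query, 0) + 1
        if not accepted_suggestion:
            self.ignored[query] = self.ignored.get(query, 0) + 1

    def should_suggest(self, query):
        """Keep suggesting until enough users have ignored the correction."""
        shown = self.shown.get(query, 0)
        if shown < self.min_impressions:
            return True  # not enough evidence yet
        return self.ignored.get(query, 0) / shown < self.max_ignore_rate
```

If nearly everyone who searches for fcuk ignores the correction, `should_suggest("fcuk")` flips to `False` once enough impressions accumulate, with no moderator involved.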

richardhenry's avatar

You have to realise that the technology in this case is probably a relatively simple heuristic spell checker, but because Google can make available an incredibly vast set of popularity data and a vast dictionary ordered by usage, it’s really f’ing good.

mattbrowne's avatar

@richardhenry – Hmm… so nothing changes after a user enters a Google user name and password?

richardhenry's avatar

@mattbrowne Not as far as I know. It’s certainly not the basis for how the technology operates, if they employ it.

squirbel's avatar

Time to start wearing our tinfoil! I’m selling them for 50¢ apiece!

richardhenry's avatar

@squirbel Can I have one for each of my heads?

squirbel's avatar

dirty, dirty man.


richardhenry's avatar

Here’s another example that supports my theory that it is heuristic:

Searching for ‘flickr’ (we all know this one)

Searching for ‘flashr’ (a programming library)

Searching for ‘slappr’ (doesn’t exist)

If someone built “slappr”, a brand new website offering god knows what, and a lot of people ended up on it by searching for that term, Google would no longer correct ‘slappr’ to ‘slapper’.

Real world example:

A website I built called Cursebird used to offer the correction ‘curse bird’ in the Google search for ‘cursebird’, but as the popularity has increased it is now a valid word in the Google dictionary and the correction no longer appears.

richardhenry's avatar

The other interesting point is that only 7% of Cursebird’s traffic comes from Google. At a guess, roughly 5k or more clicks in a month (edit: actually, probably fewer; I’m assuming it also accounts for the supposed relevance of the term in the pages being clicked on) was the amount required to push the term and get it validated. I’m sure this changes from group to group.
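The ‘cursebird’ versus ‘curse bird’ behaviour can be sketched as a split check with a popularity cutoff. This is an assumption-laden toy: the function name `split_suggestion`, the scoring rule, and the default cutoff of 5,000 (borrowed from the rough guess above) are all hypothetical.

```python
def split_suggestion(query, counts, min_joined_count=5000):
    """Suggest a two-word split unless the joined form is popular enough.

    counts maps a term to how often it has been seen; min_joined_count
    is an illustrative cutoff, not a known Google value.
    """
    if counts.get(query, 0) >= min_joined_count:
        return None  # the joined word has "validated" itself

    best, best_score = None, 0
    for i in range(1, len(query)):
        left, right = query[:i], query[i:]
        # Score a split by its rarer half, so both halves must be real words.
        score = min(counts.get(left, 0), counts.get(right, 0))
        if score > best_score:
            best, best_score = f"{left} {right}", score
    return best
```

With counts such as `{"curse": 9000, "bird": 12000, "cursebird": 100}` the function suggests `"curse bird"`; once `"cursebird"` itself passes the cutoff, it returns `None` and the correction disappears.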

Judi's avatar

It is very close to becoming self aware, then they will know all our weaknesses and take over the world.

squirbel's avatar

/hands judi a free tinfoil hat. “It’s on me!”

Judi's avatar

Thanks. I need it often.
