Allow me to introduce myself: my name is Wentworth Miller. You might remember me from such TV classics as ‘Prison Break’, ‘Prison Break Season 2’ and ‘Prison Break Season 3’. I have recently had ear-enlargement surgery. My one true vocation is getting myself wrongfully thrown into high-security prisons and then taking on the gruelling task of breaking out. Well, actually, I have a slight interest in another, more sensible activity: I find search technology quite fascinating, and I’d like to learn more about massively parallel computing and how it improves the effectiveness and accuracy of information retrieval. In short, is the internet one day going to wake up, become sentient, and automatically inherit the personality of Tim Berners-Lee, its father, or worse, Deuce Bigalow: Male Gigolo?

Semantics, the meaning in communication, has always been fantastically neglected in search. Not even Google have mastered the analysis and interpretation of the intent behind a searcher’s usually insufficient query phrase. So far they seem to have focused their vast resources on protecting their search results from less-than-savoury gaming tactics. One gets the feeling they will never get around to actually turning their search algorithm into an intelligent one.

Imagine a single human being, sitting at Google’s main data centre, being fed consecutive query phrases and having to look over billions of documents, discovering which ones contain what the writer of the query actually needed, and then giving them back in order of relevance. This person, of course, could not complete even a single query within the lifespan of our universe. But if you gave him a handful of relevant documents and asked him to pick the one that best suited the searcher’s request, he would do it far better than Google’s algorithm.

The faculties that allow the analyst to do this are his own experience and his own imagination. Google’s algo, by contrast, cannot know that the hypothetical searcher’s query “lance armstrong” clearly indicates that they never wanted any information about Neil Armstrong the astronaut. Even when the searcher types “lance armstrong cyclist winner of tour de france”, they will likely be given, somewhere in their results, documents relating to Neil Armstrong, cycling, Tour de France, France, de, etc. The problem here is that Google could not detect that the searcher in fact only wanted to know who Lance Armstrong was. It assumed that they might perhaps have needed a summary of the country of France. Would the job not be easier for the search engine if it focused only on the absolutely relevant results?
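To make the Lance Armstrong example concrete, here is a toy sketch in Python. It is emphatically not how Google works; the three documents and the function names (`any_term_match`, `all_terms_match`) are made up purely for illustration. It only shows how matching on individual words drags in Neil Armstrong and France, while insisting on every word being present does not.

```python
# A toy illustration only: three invented documents and two naive matchers.
# Nothing here reflects how any real search engine ranks results.

DOCS = [
    "Lance Armstrong won the Tour de France",
    "Neil Armstrong walked on the moon",
    "France is a country in western Europe",
]


def tokenize(text):
    """Lower-case the text and split it on whitespace."""
    return text.lower().split()


def any_term_match(query, documents):
    """Keep every document that shares at least one word with the query."""
    terms = set(tokenize(query))
    return [doc for doc in documents if terms & set(tokenize(doc))]


def all_terms_match(query, documents):
    """Keep only documents that contain every word of the query."""
    terms = set(tokenize(query))
    return [doc for doc in documents if terms <= set(tokenize(doc))]


if __name__ == "__main__":
    # Loose matching: "armstrong" alone is enough to pull in Neil.
    print(any_term_match("lance armstrong", DOCS))
    # Strict matching: only the document containing both words survives.
    print(all_terms_match("lance armstrong", DOCS))
```

Neither matcher understands intent, of course. The strict one merely happens to guess right for this query, and it would return nothing at all for the longer “lance armstrong cyclist winner of tour de france”, because no toy document contains the word “cyclist”.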

There is also the case where the searcher, let’s call him Steve, surfs the net using his faithful search engine, not looking for exact explanations but for varied results, perhaps to get an unbiased comparison of something; a product, maybe. Steve types in “bathroom paint colours” because he wishes to browse vaguely over a variety of options. In this case the randomness of Google’s results may serve him well.

The skill comes in determining whether the query is one type or the other. A human can do this most of the time because he or she can weigh up the factors involved. We aren’t going to be asked about Lance Armstrong and respond with explanations about Neil Armstrong; we simply don’t have time to communicate everything all at once, and neither does Steve, in this case.

So where does this leave us? The minute someone develops a system that can interpret query phrases as fast as a search engine, and as instinctively as a person, is the minute that the internet wakes up, realises its own existence, sees the folly of mankind, and self-destructs the Earth before we can do any real damage.
