Not that this isn’t widely understood, but James Robertson does a nice job of putting search context in, erm, context:

If I type HDTV in, I’ve provided no extra context – no information on whether I need a definition, or information on buying, or what have you. It’s a crap shoot. Seattle Hotels has that extra context – not only are you interested in hotels, but you are specifically interested in Hotels in Seattle. The difference between the two result sets is all about the amount of context provided.

I wonder; when a site offloads search to Google via a search form, as many do, does Google use what it knows about that site to provide context for the search?

Some playing around with the Google custom search page suggests that it may not. I first did a search for “CDF” restricted to w3.org, and the top two results were the Channel Definition Format and Compound Document Formats links, as you’d expect. But when I broadened the scope of the search to the entire Web by selecting “Search WWW”, those two were way down the list, with the second link not even on the first page. Interesting.
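For anybody who wants to repeat the comparison, here is a rough sketch of the two queries. It assumes the public `site:` operator as a stand-in for the custom search form's radio button (I haven't checked what parameters that form actually submits), and `build_search_url` is just a name made up for this example.

```python
from urllib.parse import urlencode

GOOGLE_SEARCH = "https://www.google.com/search"


def build_search_url(terms, site=None):
    """Build a Google search URL, optionally restricted to one site.

    Assumption: the restriction uses the `site:` query operator rather
    than whatever the custom search form itself sends.
    """
    query = f"{terms} site:{site}" if site else terms
    return f"{GOOGLE_SEARCH}?{urlencode({'q': query})}"


# The two searches being compared: scoped to w3.org, then the whole Web.
restricted = build_search_url("CDF", site="w3.org")
whole_web = build_search_url("CDF")
print(restricted)
print(whole_web)
```

The point of the comparison is that the second URL carries no trace of the site the search box lived on, so any context would have to come from somewhere else, such as the referring page.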

It seems like an obvious long-tail-ish hack, but I don’t recall hearing anybody mention it being used. But I’m hardly a search guru. Anybody know?

Update: Michael Bernstein sent me a link to what appears to be Google’s Site Flavored Search:

Site-flavored Google search delivers web search results that are customized to individual websites. Simply fill out a profile describing your website’s content, and when you add a site-flavored search box to your site, your users will get search results that are “flavored” to be more attuned to their interests.

When you go through it, though, it asks for your site URL, then presents its analysis using some circa-1995 Yahoo directory ontology. For example, it told me my site was in the “Internet”, “Programming”, and “Software” categories. Ok, but surely PageRank’s got a lot more to say about that, no? Not in terms of some pre-fab ontology, but in relation to other sites?

Anyhow, after that you click the “Generate HTML” button, and it gives you some HTML to include on your site, which contains this line:

<input type=hidden name=interests value=58|62|65>

… which seems to represent those three categories. Ok, but that seems kinda crude, no? It reminds me of del.icio.us, only centralized (their ontology), and not Web friendly (numbers instead of URIs).
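To make the "numbers instead of URIs" complaint concrete, here is a minimal sketch that pulls the category IDs out of that hidden field. It only assumes what the line itself shows, namely that the value is a `|`-separated list of integers; `InterestsParser` and `parse_interests` are hypothetical names, and I don't know how the IDs map to category names, so the sketch doesn't try.

```python
from html.parser import HTMLParser


class InterestsParser(HTMLParser):
    """Collect the category IDs from an <input name=interests> field."""

    def __init__(self):
        super().__init__()
        self.interests = []

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        if tag == "input" and attributes.get("name") == "interests":
            raw = attributes.get("value") or ""
            # Assumption: the value is a "|"-separated list of integer IDs.
            self.interests = [int(part) for part in raw.split("|") if part]


def parse_interests(snippet):
    """Return the list of integer category IDs found in the snippet."""
    parser = InterestsParser()
    parser.feed(snippet)
    parser.close()
    return parser.interests


SNIPPET = "<input type=hidden name=interests value=58|62|65>"
print(parse_interests(SNIPPET))
```

All you get back is three opaque integers. Nothing in the markup says what they mean, and there is nothing to dereference, which is the sense in which it isn't Web friendly.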

So what am I missing? Why is Google doing this, and not something based on PageRank?
