One important component has been missing from this site ever since it went live back in 2003: search. As you might guess, Blogger doesn't provide this functionality, so whenever I needed to find something from an old post, I had to rely on Google's site-specific search.

The other day I decided that it was high time I added a search box to the site, but just throwing in a search box that directed you to one of the many search engines wasn't very appealing. It just looks out of place, and you're essentially directing visitors away from the site, only to have them come back in through a search result.

So I used the MSN Search API, which uses the SOAP protocol, to build a search page that fit in with the rest of the site. Now I know that most of you are scratching your heads and wondering - eh, why MSN Search? Search equals Google, no? Not quite. I personally use Google for all my web searches on a daily basis, and I'll admit that my initial intention was to use the Google API, but I experimented a bit and the results changed my mind. Read on...

1) Relevance: a good search engine should be able to take in a search query and return the "best" possible result first. Now the definition of "best" is subjective in most cases, but when you have a smaller subset of data (in this case, a single site as compared to the entire web), it's sometimes quite obvious what the best result for a set of keywords should be. Let me illustrate this with a couple of examples.

Let's say a visitor came in and wanted to see what I had to say about desktop search apps. Google's first result links to the general reviews page, while MSN's first result takes you straight to my Windows Desktop Search review. In fact, the WDS review doesn't even appear in Google's search results, even though the page has been online for several months already.

Let's consider a more general example. If you remember, I went on a ski trip on New Year's Eve and posted the pictures a few weeks later. Searching for "ski-trip" on MSN Search returns a result that takes me right to the pictures page. The same search query on Google returns only a link to a page with a post linking to the pictures page. Again, surprisingly, the pictures page is absent from Google's results, even though it explicitly includes the term "ski-trip."

Just two examples right there, but I did try several other combinations of keywords and phrases, and the results matched my predictions.

2) Freshness: relevance is important, but effectiveness also depends largely on the freshness of the search index; i.e., how often the index is updated. I was unintentionally led into comparing this aspect of Google's and MSN's indexes some time ago when we lost a chunk of the OSNN forum database. I was trying to salvage significant threads and posts from search engine caches, and the results were quite interesting - MSN's search results contained recent threads that the Google bot hadn't indexed yet. Apparently, MSN's crawlers are quite a bit more aggressive than their Google counterparts. I suppose this could be an issue if you had bandwidth constraints, but if you want fresh search results, it definitely works to your advantage.

That incident prompted me to do a similar experiment with site-specific search, limited only to this domain. Search for Honda, and you'll notice that the post about the ad appears in MSN's results, and, as you might have guessed, doesn't appear in Google's. Try searching for the blonde joke. Or try looking for XMPP (referring to the Google Talk post). Same results. In fact, searching for unique terms contained in posts made about 21 days ago returns nothing in Google. Odd!
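If you want to repeat this kind of comparison on your own site, here's a rough sketch in Python of the bookkeeping involved. It doesn't call either search API; it only builds a site-restricted query with the `site:` operator and then checks which URLs show up in one engine's results but not the other's. The function names, the `example.com` domain, and the sample URLs are all placeholders I made up for illustration, not my actual results.

```python
from urllib.parse import urlsplit


def site_query(domain, terms):
    """Build a query restricted to one domain using the `site:` operator."""
    return "site:{} {}".format(domain, terms.strip())


def normalize(url):
    """Reduce a URL to host + path so that a leading 'www.' or a
    trailing slash doesn't make the same page look like two pages."""
    parts = urlsplit(url)
    host = parts.netloc.lower()
    if host.startswith("www."):
        host = host[4:]
    path = parts.path.rstrip("/") or "/"
    return host + path


def missing_from(reference, other):
    """Return the URLs in `reference` that never appear in `other`,
    keeping the order of `reference`."""
    seen = {normalize(u) for u in other}
    return [u for u in reference if normalize(u) not in seen]


if __name__ == "__main__":
    # Hypothetical result lists, pasted in by hand from each engine.
    engine_a = ["http://example.com/pictures/ski-trip/",
                "http://example.com/2006/01/post.html"]
    engine_b = ["http://www.example.com/2006/01/post.html"]

    print(site_query("example.com", "ski-trip"))
    for url in missing_from(engine_a, engine_b):
        print("only in engine A:", url)
```

Run the same query against both engines, paste the result URLs into the two lists, and whatever `missing_from` returns are the pages one index has and the other doesn't.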

Anyway, the actual point of this post was to let you know that we finally have search that looks and feels like it's part of the site. But now you know the reasoning behind my choice too. :)
