Google’s blog search

September 14th, 2005

Google has released a beta version of Google Blog Search, a service that finds blog posts. Its options support searching within date ranges, for posts by a given author or in a given language, and restricting searches by URL or to blogs with certain words in their titles. Google’s FAQ on this has a few interesting details, including:

  • “The goal of Blog Search is to include every blog that publishes a site feed (either RSS or Atom).”
  • They discover blogs by monitoring ping services (e.g., weblogs.com) and started indexing in June 2005.
  • While the query results are posts, “when there are entire blogs that seem to be a good match for your query, these will appear in a short list just above the main search results.”
  • You can subscribe to a query’s results as an RSS or Atom feed.
  • Google’s usual search operators (e.g., link:, site:, intitle:) are supported, plus some blog-specific ones.

Some observations:

  • After some experimentation, I think Google identifies entire blogs relevant to a query by matching the query against blog titles, which has led me to change the title of our blog.
  • Google doesn’t index the entire post page. It ignores, for example, text in the post template. I’m not sure whether it gets the post text from the feed or tries to extract the post’s content from the surrounding page template. The former is easy to do, but some people only include a truncated version of their post in the feed. The latter is more robust, but also more difficult to do and do right.
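
To make the "easy" feed-based option concrete, here is a minimal Python sketch of pulling post text out of an RSS 2.0 feed. It is only an illustration of the idea, not a description of what Google actually does. The sample feed, the function names and the "ends with an ellipsis" truncation check are all my own assumptions; a real indexer would also need to handle Atom, namespaced full-content elements and malformed feeds.

```python
import xml.etree.ElementTree as ET
from html.parser import HTMLParser

# Hypothetical sample feed, made up for this example.
SAMPLE_FEED = """<rss version="2.0"><channel><title>Example blog</title>
<item><title>Full post</title><description>&lt;p&gt;All of the &lt;b&gt;post&lt;/b&gt; text.&lt;/p&gt;</description></item>
<item><title>Short post</title><description>Only the first words...</description></item>
</channel></rss>"""


class _TextOnly(HTMLParser):
    """Collects the text nodes of an HTML fragment and drops the tags."""

    def __init__(self):
        super().__init__()
        self.parts = []

    def handle_data(self, data):
        self.parts.append(data)


def strip_html(fragment):
    """Return the plain text of an HTML fragment with whitespace collapsed."""
    parser = _TextOnly()
    parser.feed(fragment)
    parser.close()
    return " ".join("".join(parser.parts).split())


def posts_from_rss(xml_text):
    """Extract title and body text for each <item> in an RSS 2.0 feed."""
    root = ET.fromstring(xml_text)
    posts = []
    for item in root.iter("item"):
        body = strip_html(item.findtext("description", default=""))
        posts.append({
            "title": item.findtext("title", default=""),
            "text": body,
            # Crude guess (an assumption): a trailing ellipsis often
            # marks a feed that only carries an excerpt of the post.
            "maybe_truncated": body.endswith(("...", "…", "[...]")),
        })
    return posts


if __name__ == "__main__":
    for post in posts_from_rss(SAMPLE_FEED):
        print(post["title"], "|", post["text"], "|", post["maybe_truncated"])
```

The feed gives you the post body with none of the template text around it, which is why this route is attractive. The catch is the second item: when the feed carries only an excerpt, the indexer sees only the excerpt.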

Technorati and other blog search services will have to work hard to compete with this.