Last week I noticed that some of our blog posts took a long time to show up in the Google Blog search index. During the past year, Google has been very fast at indexing blog posts, typically taking less than five minutes from the time a post is made to when it shows up in their blog search index. But this week it seemed that our posts, or at least some of them, took more than twelve hours to be indexed.
Yesterday I tried to track a post on the IT job market that I wrote just before 11:00am (GMT-5). It showed up in Google Feed Reader quickly enough but had not yet appeared in Google Blog Search when I finally went to bed 14 hours later. When I checked at 9:00am today, it was there, so it took somewhere between 14 and 22 hours.
It’s not the case that all posts are being delayed: do a Google Blog search for a popular term (e.g., TV) sorted by date and you’ll see posts made in the past few minutes. Nor do I think it’s related to PageRank, since their blog search ingest is based on pings rather than crawling. Besides, our blog enjoys a reasonable rank. Finally, it can’t be the case that Google’s systems are being overwhelmed by new blogs, since the growth of the Blogosphere has slowed.
So I’m puzzled about what is going on. (goomtitag)
Update 1: Posted at 9:49, in Google Feed Reader at 10:14, indexed by Google Blog Search by ~19:15 and in Google’s main index about the same time. Maybe this is a clue — it used to be the case that a post hit the blog index within a few minutes and showed up in the main index after about twelve hours. This post hit both indexes around the same time — after about ten hours. Maybe there is now just one (logical) index.
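For anyone who wants to check the arithmetic on those timestamps, here is a minimal sketch in Python. The `lag` helper is my own illustration and not part of any Google tool; it assumes both times fall on the same day, and the 19:15 figure is only approximate.

```python
from datetime import datetime

def lag(posted: str, seen: str) -> str:
    """Return the delay between two same-day HH:MM times as 'XhYYm'."""
    fmt = "%H:%M"
    delta = datetime.strptime(seen, fmt) - datetime.strptime(posted, fmt)
    minutes = int(delta.total_seconds() // 60)
    return f"{minutes // 60}h{minutes % 60:02d}m"

posted = "9:49"                                # time the post was published
print("feed reader:", lag(posted, "10:14"))    # when it appeared in the feed reader
print("blog search:", lag(posted, "19:15"))    # approximate time it was indexed
```

That works out to 25 minutes for the feed reader and a bit under nine and a half hours for the blog search index, which I rounded to about ten hours above.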
Update 2: Hmmm. Another post seems to have made it into Google’s main index before it got into the blog search index. I imagine that Google revisited our blog home page as part of its regular crawl and picked up the new post.