Half of Swoogle’s hits are from referer log spammers

February 4th, 2006

We are using bbclone to generate reports on Swoogle access. Look at today’s top 10 referers as of 3:00pm:

  www.legaladvocate.net  246     26.14%
  www.myjavaserver.com   152     16.15%
  www.google.com         125     13.28%
  dannyayers.com         44      4.68%
  lucky7.to              34      3.61%
  ebiquity.umbc.edu      25      2.66%
  www.google.de          18      1.91%
  planetrdf.com          18      1.91%
  mail.google.com        18      1.91%
  groups.google.com      14      1.49%

Numbers one and five are clearly spam sites, and number two is suspicious, too. The first, for example, appears to be about poker, though the site name is legaladvocate. The site’s text is obviously automatically generated nonsense. All of the links point to subpages in the same domain with a similar structure and content. I assume that once the site achieves a high PageRank, it will be repurposed or sold.

So, it seems like nearly 50% of our hits are due to referer log spamming: those three referers alone account for 26.14% + 16.15% + 3.61% = 45.9% of the total. I’d guess Swoogle was picked because its URL appeared in recent posts found through a blog search engine or a ping server.
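
For anyone who wants to check the arithmetic, here is a minimal Python sketch that adds up the suspect rows from the table above. The names TOP_REFERERS, SUSPECT_RANKS and suspect_totals are made up for this illustration and are not part of bbclone; the numbers are copied by hand from today's report, and treating ranks 1, 2 and 5 as spam is my own judgment, not something bbclone flags.

```python
# (referer, hits, percent as reported by bbclone), copied by hand
# from the top-10 list in this post.
TOP_REFERERS = [
    ("www.legaladvocate.net", 246, 26.14),
    ("www.myjavaserver.com", 152, 16.15),
    ("www.google.com", 125, 13.28),
    ("dannyayers.com", 44, 4.68),
    ("lucky7.to", 34, 3.61),
    ("ebiquity.umbc.edu", 25, 2.66),
    ("www.google.de", 18, 1.91),
    ("planetrdf.com", 18, 1.91),
    ("mail.google.com", 18, 1.91),
    ("groups.google.com", 14, 1.49),
]

# Assumption: ranks 1, 2 and 5 are the referers this post treats as spam.
SUSPECT_RANKS = (1, 2, 5)


def suspect_totals(rows, ranks):
    """Return (hits, percent) summed over the given 1-based ranks."""
    picked = [rows[rank - 1] for rank in ranks]
    hits = sum(count for _, count, _ in picked)
    # Round to two places to match the precision of the report.
    percent = round(sum(pct for _, _, pct in picked), 2)
    return hits, percent


if __name__ == "__main__":
    hits, percent = suspect_totals(TOP_REFERERS, SUSPECT_RANKS)
    print(f"{hits} hits, {percent}% of the total")
```

That comes to 432 hits, or about 45.9% of the total. Dropping www.myjavaserver.com, which is only suspicious, leaves 280 hits (29.75%) from the two clear spam sites.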