UMBC ebiquity
Welcome to the Splogosphere: 75% of new pings are spings (splogs)

Pranam Kolari, 10:23am 15 December 2005

In the blogosphere, pings are notifications sent by updated blogs to ping servers. A major recent problem is unjustified pings, also known as spings, sent by splogs (spam blogs). Splogs have drawn a lot of discussion lately, including an interesting thread on post piracy that Steve Rubel initiated on Micropersuasion.

The problem of splogs prompted us to analyze pings from weblogs.com, which publishes hourly pings as changes.xml. We have been collecting these pings for the last 4 weeks, for a total of 40 million pings from around 14 million (so claimed) blogs. To begin with, we fetched these blogs and applied a language identification technique implemented by James Mayfield to determine the language of each. As expected, most of the pings were from blogs authored in English, but we were able to identify blogs in many other languages as well. For instance, the charts below show the distribution of pings from blogs authored in Italian, over a day and over a week. Each bar denotes the number of pings per hour.


[Charts: Pings over a day; Pings over 8 days]

All times are in GMT; clearly, Italian-authored blogs display a specific blogging pattern.
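Per-hour ping counts like those in the charts can be computed by bucketing each entry of a changes.xml snapshot by hour. The following is a minimal sketch, not our actual pipeline. It assumes the feed has a `weblogUpdates` root with an `updated` timestamp and `weblog` entries whose `when` attribute gives seconds elapsed before that timestamp; the sample data and the function name `pings_per_hour` are made up for illustration.

```python
import xml.etree.ElementTree as ET
from collections import Counter
from datetime import timedelta
from email.utils import parsedate_to_datetime

# Hypothetical sample in the assumed changes.xml layout (not real data).
SAMPLE = """<weblogUpdates version="1" updated="Thu, 15 Dec 2005 10:00:00 GMT" count="3">
  <weblog name="Blog A" url="http://a.example/" when="30"/>
  <weblog name="Blog B" url="http://b.example/" when="4000"/>
  <weblog name="Blog C" url="http://c.example/" when="60"/>
</weblogUpdates>"""


def pings_per_hour(xml_text):
    """Count pings per GMT hour in one changes.xml snapshot."""
    root = ET.fromstring(xml_text)
    # Assumption: 'updated' is an RFC 822 style date for the snapshot time.
    updated = parsedate_to_datetime(root.get("updated"))
    counts = Counter()
    for weblog in root.iter("weblog"):
        # Assumption: 'when' is the number of seconds before 'updated'.
        ping_time = updated - timedelta(seconds=int(weblog.get("when")))
        counts[ping_time.strftime("%Y-%m-%d %H:00")] += 1
    return counts


hourly = pings_per_hour(SAMPLE)
for hour, n in sorted(hourly.items()):
    print(hour, n)
```

Restricting the loop to blogs identified as a given language (Italian, say) would give one bar per hour for that language.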

In the next step we used our work on splog detection to detect splogs (and hence spings) among the English blogs. Our detection mechanism is close to 90% accurate. As shown in the charts below, pings from blogs average around 8K per hour and those from splogs average around 25K per hour.


[Charts: Blog Pings; Splog Pings]

Clearly almost 3 out of 4 pings are spings! Going back further to the source of these spings, we observed that more than 50% of claimed blogs pinging weblogs.com are splogs.
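The three-out-of-four figure follows directly from the hourly averages quoted above. A minimal sketch of that arithmetic, using the rounded 8K and 25K values from this post (the variable and function names are illustrative only):

```python
# Rounded hourly averages quoted in the post.
blog_pings_per_hour = 8_000
splog_pings_per_hour = 25_000


def sping_share(blog_pings, splog_pings):
    """Fraction of all pings that come from splogs."""
    return splog_pings / (blog_pings + splog_pings)


share = sping_share(blog_pings_per_hour, splog_pings_per_hour)
print(f"{share:.1%}")  # 75.8%
```

Because the detector is only close to 90% accurate, this share should be read as an estimate rather than an exact count.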

Given how interesting these preliminary statistics are, the scope for further analysis, and the interest in the resulting dataset, we decided to continuously monitor the pingosphere. We now do so “live” on the updated blogs published by weblogs.com (delayed by an hour), and have made the results publicly available at http://memeta.umbc.edu. The site lists blogging patterns for many other languages, and compares splogs with blogs. All of our work is part of a larger project, memeta, aimed at analyzing the content and structure of the blogosphere.

We hope our effort is a good complement to existing services (e.g., FightSplog, SplogReporter and SplogSpot) towards combating splogs. We currently publish only simple ping statistics on this site, but do stay tuned for fresh splog and classified blog dumps and much more!

UPDATE: Matthew Hurst from BlogPulse points us to an interesting analysis he has done on a day of weblogs.com pings.


26 Responses to “Welcome to the Splogosphere: 75% of new pings are spings (splogs)”

  1. Micro Persuasion Says:

    Study Says 75% of All Blog Pings Are Spam

    According to a study performed at UMBC eBiquity Research Group at the University of Maryland, nearly three out of every four pings to blog ping servers are from splogs – or what they’re calling spings! They also found that more

  2. B.L. Ochman's weblog - Internet and corporate blogging strategy, and online marketing trends, with news and commentary Says:

    Study: 75% of Blog Pings are from Spam Blogs

    A new study at UMBC eBiquity Research Group at the University of Maryland proves that almost three of every four pings to blog servers are from spam blogs, or splogs, which they say constitute 50% of all claimed blogs. Frankly, I don’t see that blog s…

  3. » Study claims 75% of blog pings come from spam blogs The Blog Herald: more blog news more often Says:

    [...] A study by UMBC eBiquity Research Group has found that nearly three out of every four pings to blog ping servers are from spam blogs and more than 50% of blogs pinging weblogs.com are spam blogs. The report can be found here. [...]

  4. Rauru Blog » Blog Archive » Sping: Pings from Spam Blogs Says:

    [...] According to a report from the University of Maryland's eBiquity group, an analysis of weblogs.com pings concluded that 75% of pings from the English-language sphere are spings from splogs. How they determined whether a blog is a splog is, as usual, a mystery, but eBiquity claims that its algorithm can tell them apart with 90% accuracy. When non-English blogs are included, the sping rate appears to drop to around 50%, but that does not make the situation any less terrible. [...]

  5. Spings Says:

    [...] Hard as it may be to believe, there are still some blogs written by human beings. As for the rest… welcome to the Splogosphere: 75% of the pings generated today are spings (pings made by splogs). Blogs, like this one, use pings to notify services such as bitacoras.com or technorati that they have a new post. Splogs, which are automatically generated blogs, use this same mechanism, and they are wrecking the effectiveness of ping servers. Spam is overflowing in the blogosphere; it is now a mature communication channel [...]

  6. blogvp Says:

    Study Says 75% Of All Blog Pings Are Spam: It’s Not For Me

    Some study says that 75% of all pings are spam, but this is not the case for me, or at least with the blogs that I run. There is no mention of what makes “splog” or a spam blog, so maybe a little bit more of information on that would be ni…

  7. Headshift Says:

    Spings and Splogs

    Welcome to the Splogosphere: 75% of new pings are spings.

  8. 75% Blog Pings are Spam Says:

    [...] In an effort to up my weekly posting of words with the suffix -log: As shown in the charts below pings from blogs average around 8K per hour and those from splogs average around 25K. Clearly almost 3 out of 4 pings are spings! Going back further to the source of these spings, we observed that more than 50% of claimed blogs pinging weblogs.com are splogs. [...]

  9. Fight Splog » Study shows 75% of pings are from splogs Says:

    [...] A new study shows that 75% of pings are generated from splogs. The report analyzes pings from weblogs.com over a four week period. we used our work on splog detection to detect splogs (and hence spings) among the english blogs. Our detection mechanism is close to 90% accurate. As shown in the charts below pings from blogs average around 8K per hour and those from splogs average around 25K. [...]

  10. /blogosphere/ID/ Says:

    75% of Pings are Splogs

    According to a study just released by the Ebiquity Group, 75% of all blog pings belong to splogs (these splog pings now known by the moniker ‘spin…

  11. Jason Herschel Says:

    It would be nice to know where these pings are originating. I assume that the majority are coming through web applications such as Ping-O-Matic or Pingoat. Despite attempts to put ‘splog’ filters in place to combat this problem, the ease with which you can send, and even automate the sending of, pings through these services is problematic. Spings will remain a problem as long as this is the case.

  12. Small Business Blogging Says:

    Study: 75% of Blog Pings are from Splogs

    A new study at UMBC eBiquity Research Group at the University of Maryland shows that almost three out of four pings to blog servers are from spam blogs (splogs). Those source of spings constitute 50% of all claimed blogs. They claimed that their detec…

  13. Uppkopplat.se - nyhetsblogg om bloggar, sökmotorer e-handel och marknadsföring Says:

    [...] A new study from the University of Maryland, which analyzed 40 million pings at Weblogs, shows that three out of four pings come from spam blogs, or so-called sblogs. More than half of all blogs that ping Weblogs are sblogs. The research group has also set up a service where visitors can see the share of sblogs versus real blogs over the past seven days. [...]

  14. Dick Hardt - Blame Canada » Structured Blogging is Happening Says:

    [...] 4. Spam issues. One kind of structure that is needed is the identity of the blog. 75% of new pings are spings (splogs). Splogs and spings are degrading the value of the real time web. Efforts are under way to resolve this problem, and hopefully we have all learned from our anti-spam experiences on how to do this right. [...]

  15. logbook.biz » Blog Archive » The Blogosphere is Full of Splogs Says:

    [...] found on Bloggers Blog: Blogging the Blogsphere eBiquity has conducted a study that found 75% of new pings are splogs. Micropersuasion.com says this problem needs to be solved: Clearly this issue is bigger than everyone probably is imagining, despite what David Sifry says. This must be solved now. Who besides Mark Cuban is taking the lead on this? The future of the blogosphere is at stake here. This has to be addressed at the publisher level. Does anyone care about this or is everyone busy building new features? Memeta is also providing current data on the amount of splogs being published on this page which includes graphs that show the amount of blogs and splogs pinged over the last seven days. The latest graphs show a blogosphere that is over 50% splog. Memeta also mentioned several other splog fighting sites and tools: FightSplog, SplogReporter and SplogSpot. related posts: [...]

  16. Splog filtering through human ranking - Migs Paraz - Random Takes Says:

    [...] Tailrank’s feed blog index is available for licensing. One highlight is that it’s splog free, not just from algorithms, but because people filter them. Spam pinging/spinging has been on my mind ever since I read the Welcome to the Splogosphere: 75% of new pings are spings (splogs) post at UMBC Ebiquity. [...]

  17. Jeff Barr’s Blog » Links for Wednesday, December 28, 2005: Ajax, Blogging, Splogs Says:

    [...] Welcome to the Splogosphere. [...]

  18. marc-o.net » » Le pay-per-click se meurt Says:

    [...] According to a recent study from the University of Baltimore, 75% of new pings received by weblogs.com come from splogs. According to Wired (which discusses other kinds of pay-per-click abuse), this race for clicks and Page Rank reportedly earns several thousand dollars for sploggers, who build networks of splogs stuffed with cross-links, stolen content and ads. The problem is known to Google, which paradoxically hosts splogs itself (on Blogger) and thus contributes to the decay of its own system. [...]

  19. Harry Chen Thinks Aloud » Blog Archive » Fighting Splogs in the Blogosphere Says:

    [...] Pranam Kolari, a UMBC doctoral student, has discovered nearly 75 percent of blog updates that registered with weblogs.com are bogus (more technical details). So why do people spam blog ping servers? The motivation behind splogs is the same as that for any other form of spam – it comes down to money. [...]

  20. SEO Expert » Blog Archive » 3 Out of 4 New Pings Are Spings Says:

    [...] A new study at UMBC eBiquity Research Group proves that almost three of every four pings to blog servers are from spam blogs, or splogs. Those interested to see their findings on the pingosphere over time can check it out http://memeta.umbc.edu [...]

  21. Mary Skyers Says:

    Technorati, an authority on what’s going on in the blogosphere, is currently tracking 19.9 million sites. As of October 2005, it’s seeing an average 70,000 new blogs created each day. A new blog created every minute, on average.

  22. Patricia Lacey Says:

    Perseus Development randomly surveyed 3,634 blogs on eight leading blog-hosting services to develop a blogosphere model. Based on this research, Perseus estimates 4.12 million blogs have been created by the eight providers.

  23. Paul Brown Says:

    If trends remain constant, nearly 13.1 million Technorati-tracked blogs are abandoned by their authors every two months, of which 5.3 million blogs are abandoned after a single post. What percentage of these blogs become spam remains an elusive figure.

  24. John Bushelle Says:

    One thing’s for certain: 100 percent of abandoned blogs add to the cyber clutter crawled and indexed by search engine spiders. Small wonder the major engines have stopped focusing on index growth, with the blogosphere doubling in size every five months.

  25. SEO Case Study Says:

    Hi, the link to view the splog analysis is not working.
    So many blogs are abandoned after a short time that the SE’s could probably add some additional criteria when evaluating the authority of a blog.

  26. Splog software from Hell Says:

    [...] been working on the problem of identifying splogs (spam blogs) for the past six months. Our studies show that almost 75% of the blog posts from weblogs.com’s feed and about 25% of fresh [...]