<?xml version="1.0" encoding="utf-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>UMBC ebiquity &#187; memeta</title>
	<atom:link href="http://ebiquity.umbc.edu/blogger/category/web/memeta/feed/" rel="self" type="application/rss+xml" />
	<link>http://ebiquity.umbc.edu/blogger</link>
	<description>EBB is the ebiquity research group's blog at the University of Maryland, Baltimore County (UMBC).  We focus on technologies that facilitate the design, implementation and control of distributed, intelligent information systems -- mobile and pervasive computing, ad hoc networking, multiagent systems, knowledge representation and reasoning, and the semantic web.  As the tides of technology ebb and flow, we hope the good ideas wash up on our beach and the bad ones drift back out to sea.</description>
	<lastBuildDate>Fri, 20 Nov 2009 13:50:39 +0000</lastBuildDate>
	<generator>http://wordpress.org/?v=2.8.4</generator>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
			<item>
		<title>Sifry&#8217;s state of the blogosphere</title>
		<link>http://ebiquity.umbc.edu/blogger/2006/02/06/sifrys-state-of-the-blogosphere/</link>
		<comments>http://ebiquity.umbc.edu/blogger/2006/02/06/sifrys-state-of-the-blogosphere/#comments</comments>
		<pubDate>Mon, 06 Feb 2006 19:39:04 +0000</pubDate>
		<dc:creator>Tim Finin</dc:creator>
				<category><![CDATA[Blogging]]></category>
		<category><![CDATA[memeta]]></category>
		<category><![CDATA[splog]]></category>

		<guid isPermaLink="false">http://ebiquity.umbc.edu/blogger/?p=476</guid>
		<description><![CDATA[Technorati&#8217;s David Sifry has posted another State of the Blogosphere report with lots of interesting statistics.  Highlights include

 Technorati tracks 50K posts an hour from 27M blogs. 
 The number of blogs doubles every six months.  
 Splogs and spings are increasing.  
 Tagging is increasingly popular. 

]]></description>
			<content:encoded><![CDATA[<p>Technorati&#8217;s David Sifry has posted another <a href="http://www.sifry.com/alerts/archives/000419.html">State of the Blogosphere</a> report with lots of interesting statistics.  Highlights include</p>
<ul>
<li> Technorati tracks 50K posts an hour from 27M blogs. </li>
<li> The number of blogs doubles every six months.  </li>
<li> Splogs and spings are increasing.  </li>
<li> Tagging is increasingly popular. </li>
</ul>
]]></content:encoded>
			<wfw:commentRss>http://ebiquity.umbc.edu/blogger/2006/02/06/sifrys-state-of-the-blogosphere/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>Welcome to the Splogosphere: 75% of new pings are spings (splogs)</title>
		<link>http://ebiquity.umbc.edu/blogger/2005/12/15/welcome-to-the-splogosphere-75-of-new-blog-posts-are-spam/</link>
		<comments>http://ebiquity.umbc.edu/blogger/2005/12/15/welcome-to-the-splogosphere-75-of-new-blog-posts-are-spam/#comments</comments>
		<pubDate>Thu, 15 Dec 2005 15:23:55 +0000</pubDate>
		<dc:creator>Pranam Kolari</dc:creator>
				<category><![CDATA[Blogging]]></category>
		<category><![CDATA[GENERAL]]></category>
		<category><![CDATA[Machine Learning]]></category>
		<category><![CDATA[Semantic Web]]></category>
		<category><![CDATA[Technology]]></category>
		<category><![CDATA[Web]]></category>
		<category><![CDATA[memeta]]></category>
		<category><![CDATA[splog]]></category>

		<guid isPermaLink="false">http://ebiquity.umbc.edu/blogger/?p=429</guid>
		<description><![CDATA[Recent studies at UMBC show that 75% of pings from English language weblogs come from spam blogs, or splogs.]]></description>
			<content:encoded><![CDATA[<p>In the blogosphere, pings are notifications sent by updated blogs to <a href="http://en.wikipedia.org/wiki/Ping_blog">PingServers</a>. A major issue recently has been unjustified pings, also known as <a href="http://en.wikipedia.org/wiki/Sping">Spings</a>, sent by <a href="http://en.wikipedia.org/wiki/Splog">Splogs</a>.  Splogs have been discussed a lot recently, including an interesting   thread on  <a href="http://www.micropersuasion.com/2005/12/blog_post_pirac.html"> post piracy </a>  that Steve Rubel  <a href="http://www.micropersuasion.com/2005/12/blog_content_th.html"> initiated</a>  on Micropersuasion.</p>
<p>The problem of splogs prompted us to analyze pings from <a href="http://weblogs.com">weblogs.com</a>, which publishes <a href="http://weblogs.com/api.html">hourly pings</a> as changes.xml. We have been collecting these pings over the last 4 weeks, for a total of 40 million pings from around 14 million (so claimed) blogs. To begin with, we fetched these blogs and applied a language identification technique implemented by James Mayfield. As expected, most of the pings were from blogs authored in English, but we were able to identify blogs in many other languages as well. For instance, the charts below show the distribution of pings from blogs authored in Italian, over a day and over a week. Each bar denotes the number of pings per hour.<br />
<center><br />
<img src="http://memeta.umbc.edu/stats/ebb.ping.it.1.png" alt="Pings over a day" /><br />
<img src="http://memeta.umbc.edu/stats/ebb.ping.it.7.png" alt="Pings over a week" /><br />
</center><br />
All times are in GMT; clearly Italian authored blogs display a specific blogging pattern.</p>
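<p>As a minimal sketch of how hourly charts like these could be tallied, assuming each ping carries a Unix timestamp: the function name and the sample data are hypothetical and are not taken from the memeta pipeline.</p>
<pre>
```python
from collections import Counter
from datetime import datetime, timezone


def pings_per_hour(timestamps):
    """Count pings in each GMT hour of the day (0-23)."""
    counts = Counter()
    for ts in timestamps:
        # Interpret the timestamp in UTC so the buckets match the GMT charts.
        hour = datetime.fromtimestamp(ts, tz=timezone.utc).hour
        counts[hour] += 1
    # Return one bar per hour, including hours with no pings.
    return [counts[h] for h in range(24)]


# Hypothetical sample: three pings given as Unix timestamps.
sample = [0, 1800, 3600]
histogram = pings_per_hour(sample)
```
</pre>
<p>Each entry of <i>histogram</i> corresponds to one bar in a per-day chart; a per-week chart would bucket by day and hour instead.</p>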
<p>In the next step we used our <a href="http://ebiquity.umbc.edu/paper/html/id/269/">work on splog detection</a> to detect splogs (and hence spings) among the English blogs. Our detection mechanism is close to 90% accurate. As shown in the charts below, pings from blogs average around 8K per hour and those from splogs average around 25K.<br />
<center><br />
<img src="http://memeta.umbc.edu/stats/ebb.ping.blog.7.png" alt="Blog Pings" /><br />
<img src="http://memeta.umbc.edu/stats/ebb.ping.splog.7.png" alt="Splog Pings" /><br />
</center><br />
Clearly almost 3 out of 4 pings are spings! Going back further to the source of these spings, we observed that more than 50% of claimed blogs pinging weblogs.com are splogs.</p>
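<p>The "3 out of 4" figure follows from the hourly averages quoted above. A quick check, using the rounded averages from the text (the function name is hypothetical):</p>
<pre>
```python
def sping_share(blog_pings_per_hour, splog_pings_per_hour):
    """Fraction of all pings that come from splogs."""
    total = blog_pings_per_hour + splog_pings_per_hour
    return splog_pings_per_hour / total


# Rounded hourly averages quoted in the post: about 8K blog pings, 25K splog pings.
share = sping_share(8_000, 25_000)
print(f"{share:.0%}")  # prints 76%
```
</pre>
<p>Since the detector is only close to 90% accurate, this share is an estimate rather than an exact count.</p>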
<p>Given how interesting these preliminary statistics are, the scope for further analysis, and the interest in the resulting dataset, we decided to continuously monitor the <i>pingosphere</i>. We now do it &#8220;live&#8221; on updated blogs published by weblogs.com (delayed by an hour), and have made the results publicly available at <a href="http://memeta.umbc.edu">http://memeta.umbc.edu</a>.  The site lists blogging patterns for many other languages, and compares splogs with blogs. All of this work is part of a larger project, memeta, aimed at analyzing the content and structure of the blogosphere. </p>
<p>We hope our effort is a good complement to existing services (e.g., <a href="http://www.fightsplog.com/">FightSplog</a>, <a href="http://www.splogreporter.com/"> SplogReporter</a> and <a href="http://splogspot.com/pages/submit">SplogSpot</a>) towards combating splogs. We currently publish only simple ping statistics on this site, but do stay tuned for fresh splog and classified blog dumps and much more!</p>
<p>UPDATE: <a href="http://datamining.typepad.com/data_mining/">Matthew Hurst</a> from BlogPulse <a href="http://blogsearch.google.com/blogsearch?hl=en&#038;q=site%3Adatamining.typepad.com+24&#038;btnG=Search+Blogs">points us to an interesting analysis</a> he has done on a day of weblogs.com pings.</p>
]]></content:encoded>
			<wfw:commentRss>http://ebiquity.umbc.edu/blogger/2005/12/15/welcome-to-the-splogosphere-75-of-new-blog-posts-are-spam/feed/</wfw:commentRss>
		<slash:comments>26</slash:comments>
		</item>
	</channel>
</rss>
