<?xml version="1.0"?>

<!DOCTYPE rdf:RDF [
  <!ENTITY rdf "http://www.w3.org/1999/02/22-rdf-syntax-ns#">
  <!ENTITY rdfs "http://www.w3.org/2000/01/rdf-schema#">
  <!ENTITY xsd "http://www.w3.org/2001/XMLSchema#">
  <!ENTITY owl "http://www.w3.org/2002/07/owl#">
  <!ENTITY cc "http://web.resource.org/cc/#">
  <!ENTITY event "http://ebiquity.umbc.edu/ontology/event.owl#">
  <!ENTITY person "http://ebiquity.umbc.edu/ontology/person.owl#">
  <!ENTITY assert "http://ebiquity.umbc.edu/ontology/assertion.owl#">]>

<!--
  This ontology document is licensed under the Creative Commons
  Attribution License. To view a copy of this license, visit
  http://creativecommons.org/licenses/by/2.0/ or send a letter to
  Creative Commons, 559 Nathan Abbott Way, Stanford, California
  94305, USA.
-->

<rdf:RDF 
  xmlns:rdf = "&rdf;"
  xmlns:rdfs = "&rdfs;"
  xmlns:xsd = "&xsd;"
  xmlns:owl = "&owl;"
  xmlns:cc = "&cc;"
  xmlns:event = "&event;"
  xmlns:person = "&person;"
  xmlns:assert = "&assert;">
  <event:Event rdf:about="http://ebiquity.umbc.edu/event/html/id/158/The-Multi-Relational-Blogosphere-Empirical-Characterization-and-Spam-Protection">
    <rdfs:label><![CDATA[The Multi-Relational Blogosphere: Empirical Characterization and Spam Protection]]></rdfs:label>
    <event:title><![CDATA[The Multi-Relational Blogosphere: Empirical Characterization and Spam Protection]]></event:title>
    <event:speaker><person:Alumnus rdf:about="http://ebiquity.umbc.edu/person/html/Pranam/Kolari/"><person:name><![CDATA[Pranam Kolari]]></person:name><rdfs:label><![CDATA[Pranam Kolari]]></rdfs:label></person:Alumnus></event:speaker>
    <event:startDate rdf:datatype="&xsd;dateTime">2006-05-10T13:00:00-05:00</event:startDate>
    <event:endDate rdf:datatype="&xsd;dateTime">2006-05-10T14:30:00-05:00</event:endDate>
    <event:location><![CDATA[ITE 325/346]]></event:location>
    <event:abstract><![CDATA[<P>Weblogs, or blogs, have become an important new way to publish information,
engage in discussions and form communities. Blogs collectively constitute 
the blogosphere, a highly influential and dynamic subset of the Web. The 
nature of their content and publishing infrastructure requires that they be
modeled, harvested and analyzed differently from the rest of the Web.
</P>
<P>
We first propose a model for the blog graph that extends the more general 
web graph. The web is viewed as a graph G(V, E) where V is the set of pages
and E represents hyperlinks between them. With a focus on the blogosphere, 
we view the web graph at a much finer granularity. Each entity v in the set
V can also be associated with subsets constituted by the blogosphere or web
news sources. In addition, every post hosted by a blog can be considered to
consist of a Title, Content, Time, Tag, Author and Comment. This 
multi-relational conceptualization and its instantiation are made possible
through structured publishing on the blogosphere, enabled by RSS (RDF Site
Summary), OPML (Outline Processor Markup Language), DC (Dublin Core) and 
FOAF (Friend of a Friend), all of which are popular metadata
vocabularies.
</P>
<P>
We next propose to characterize instances of this multi-relational model,
including local content and the link structure involving various entities. We 
will identify the boundaries of the blogosphere, clarify what features 
make it different from the rest of the Web, and study the nature of spam. Such
a characterization will be based on both publicly available blog data sets 
and those collected using our own system, which is capable of 
discovering and harvesting blogs. We will share our experiences in 
implementing a blog harvesting system, the approaches we employed and their 
effectiveness, and provide new mechanisms that could be useful for timely 
content harvesting on the blogosphere.
</P>
<P>
We then propose to tackle spam afflicting the multi-relational blogosphere.
We will formally distinguish spam in blogs from e-mail spam 
and generic web spam. We will provide algorithms and techniques that 
employ both local and relational models to detect and eliminate spam posts 
and the blogs hosting them. We will then explore how such techniques can be
made to adapt and learn in an adversarial classification setting, as 
mechanisms employed by spammers evolve. Based on our analysis of the 
algorithms and their cost, we will finally recommend a multi-step approach
to eliminate spam blogs.
</P>]]></event:abstract>
    <event:tag><![CDATA[blog]]></event:tag>
    <event:tag><![CDATA[blogosphere]]></event:tag>
    <event:tag><![CDATA[splog]]></event:tag>
    <event:tag><![CDATA[web spam]]></event:tag>
    <event:host><person:PrincipalFaculty rdf:about="http://ebiquity.umbc.edu/person/html/Tim/Finin/"><person:name><![CDATA[Tim Finin]]></person:name><rdfs:label><![CDATA[Tim Finin]]></rdfs:label></person:PrincipalFaculty></event:host>
  </event:Event>

  <rdf:Description rdf:about="">
    <cc:license rdf:resource="http://creativecommons.org/licenses/by/2.0/" />
  </rdf:Description>

</rdf:RDF>
