UMBC ebiquity research group Building intelligent systems in open, heterogeneous, dynamic, distributed environments

Archive for February, 2006

Game theoretic analysis of the toilet seat problem

February 7th, 2006, by Tim Finin, posted in Humor

Here’s a practical, scientific result.

A Game Theoretic Approach to the Toilet Seat Problem, Richard Harter, Science Creative Quarterly, number four, January 2006.

The toilet seat problem has been the subject of much controversy. In this paper we consider a simplified model of the toilet seat problem. We shall show that for this model there is an inherent conflict of interest which can be resolved by an equity solution.

Consider a bathroom with one omnipurpose toilet (also known as a WC) which is used for two toilet operations which we shall designate as #1 and #2. The toilet has an attachment which we shall refer to as the seat (but see remark 1 below) which may be in either of two positions which we shall designate as up and down.

Swoogle: over 1,000,000 Semantic Web documents

February 6th, 2006, by Tim Finin, posted in OWL, RDF, Semantic Web, Swoogle, Web

Sometime today the UMBC Swoogle Semantic Web search engine discovered and indexed its millionth document. Of these, about 77% are valid RDF documents, 15% are HTML documents with embedded RDF, and 8% appear to be RDF documents but cannot be parsed.

MIT and Cambridge to build free wireless mesh network

February 6th, 2006, by Tim Finin, posted in Computing Research, Mobile Computing

The MIT Tech reports on a plan in which MIT is collaborating with the city of Cambridge to deploy a free wireless mesh network. The article has some interesting technical details and says that the plan is based on MIT’s Roofnet project, an experimental 802.11b/g mesh network in development at MIT CSAIL.

A collaboration with MIT researchers may provide Cambridge with a free, city-wide, wireless internet service as early as late summer. The project will rely on a mesh networking technology that allows individual computers to become new access points, projecting the reach of the network beyond its original antennas.

Traditionally, a wireless network is centralized around one wireless access point, which communicates with a wireless card in any laptop or desktop computer, Hart said. Mesh technology allows individual computers to propagate the network and act as new access points, making it unnecessary for a user to be within range of the original wireless signal, she said.

[the wireless access points] are constructed from $15 commercial access points purchased from the software manufacturer NETGEAR, he said. The 40 milliwatt chip inside the commercial product is replaced with a 400 milliwatt chip and ‘hacked’ to include computer code that enables the mesh technology, he said.

The code, which is publicly available, was written by an MIT research group called Roofnet. Daniel E. Aguayo G, a Roofnet researcher, said that though they were not the first to write a code for mesh technology, they were the first to conduct a large-scale test of their software. …
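The idea of nodes relaying for one another can be illustrated with a small reachability sketch. This is only a toy model under stated assumptions, not Roofnet's routing code: the node names, coordinates, and the 100-unit radio range are all hypothetical, and a real mesh would also have to account for link quality and interference.

```python
from collections import deque
from math import dist

def reachable(nodes, gateway, radio_range):
    """Return the names of nodes that can reach `gateway` over one or more hops.

    `nodes` maps a node name to an (x, y) position. Two nodes are assumed
    to be linked whenever they are within `radio_range` of each other.
    """
    seen = {gateway}
    queue = deque([gateway])
    while queue:
        current = queue.popleft()
        for name, position in nodes.items():
            if name not in seen and dist(nodes[current], position) <= radio_range:
                seen.add(name)
                queue.append(name)
    return seen

# Hypothetical layout: "b" is out of range of the gateway but in range of "a".
nodes = {"gateway": (0, 0), "a": (80, 0), "b": (160, 0), "c": (400, 0)}

# Centralized network: only nodes within range of the single access point.
direct = {n for n, p in nodes.items() if dist(nodes["gateway"], p) <= 100}

# Mesh network: every connected node also acts as a relay.
mesh = reachable(nodes, "gateway", 100)
```

In this layout the centralized network covers only `gateway` and `a`, while the mesh also covers `b` by relaying through `a`. Node `c` stays unreachable either way because no relay is close enough.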

Is my document indexed by Swoogle?

February 6th, 2006, by li ding, posted in AI, Ontologies, Semantic Web, Swoogle, Web

“Swoogle has indexed millions of Semantic Web Documents, but how do I know that mine has been indexed?” Here is a simple way: try your URL with the Swoogle Track Back Service. Here are several examples that show how it works:

  • It helps us track the evolution of an ontology - say the protégé ontology
  • http://protege.stanford.edu/plugins/owl/protege
    ——————————————————————————–
    About this URL
    The latest ping on [2006-01-29] shows its status is [Succeed, changed into SWD].
    Its latest cached original snapshot is [2006-01-29 (3373 bytes)]
    Its latest cached NTriples snapshot is [2006-01-29 (41 triples)].
    ——————————————————————————–
    We have found 7 cached versions.
    2006-01-29: Original Snapshot (3373 bytes), NTriples Snapshot (41 triples)
    2005-08-25: Original Snapshot (3373 bytes), NTriples Snapshot (41 triples)
    2005-07-16: Original Snapshot (2439 bytes), NTriples Snapshot (35 triples)
    2005-05-20: Original Snapshot (2173 bytes), NTriples Snapshot (30 triples)
    2005-04-10: Original Snapshot (1909 bytes), NTriples Snapshot (28 triples)
    2005-02-25: Original Snapshot (1869 bytes), NTriples Snapshot (27 triples)
    2005-01-24: Original Snapshot, NTriples Snapshot (31 triples)

  • We may also check the growth of FOAF documents.
  • http://www.csee.umbc.edu/~dingli1/foaf.rdf
    ——————————————————————————–
    About this URL
    The latest ping on [2006-01-29] shows its status is [Succeed, changed into SWD].
    Its latest cached original snapshot is [2006-01-29 (6072 bytes)]
    Its latest cached NTriples snapshot is [2006-01-29 (98 triples)].
    ——————————————————————————–
    We have found 6 cached versions.
    2006-01-29: Original Snapshot (6072 bytes), NTriples Snapshot (98 triples)
    2005-07-16: Original Snapshot (6072 bytes), NTriples Snapshot (98 triples)
    2005-06-19: Original Snapshot (5053 bytes), NTriples Snapshot (80 triples)
    2005-04-17: Original Snapshot (3142 bytes), NTriples Snapshot (50 triples)
    2005-04-01: Original Snapshot (1761 bytes), NTriples Snapshot (29 triples)
    2005-01-24: Original Snapshot, NTriples Snapshot (29 triples)

  • Finally, this service may also help us learn the life cycle of a semantic web document: it was created, actively maintained, lingered around for a while and finally died (i.e. went offline).
  • http://simile.mit.edu/repository/fresnel/style.rdfs.n3
    ——————————————————————————–
    About this URL
    The latest ping on [2006-02-02] shows its status is [Failed, http code is not 200 (or406)].
    Its latest cached original snapshot is [2005-03-09 (15809 bytes)]
    Its latest cached NTriples snapshot is [2005-03-09 (149 triples)].
    ——————————————————————————–
    We have found 3 cached versions.
    2005-03-09: Original Snapshot (15809 bytes), NTriples Snapshot (149 triples)
    2005-02-25: Original Snapshot (12043 bytes), NTriples Snapshot (149 triples)
    2005-01-26: Original Snapshot, NTriples Snapshot (145 triples)
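For anyone who wants to chart this kind of history, here is a minimal sketch that parses the version lines of a report and computes how the triple count changed between snapshots. It assumes the report text has been copied by hand from the trackback page in the format shown above; the function names are hypothetical and this is not part of any Swoogle API.

```python
import re
from datetime import date

# Version lines copied from the Protege ontology report above.
REPORT = """\
2006-01-29: Original Snapshot (3373 bytes), NTriples Snapshot (41 triples)
2005-08-25: Original Snapshot (3373 bytes), NTriples Snapshot (41 triples)
2005-07-16: Original Snapshot (2439 bytes), NTriples Snapshot (35 triples)
2005-05-20: Original Snapshot (2173 bytes), NTriples Snapshot (30 triples)
2005-04-10: Original Snapshot (1909 bytes), NTriples Snapshot (28 triples)
2005-02-25: Original Snapshot (1869 bytes), NTriples Snapshot (27 triples)
2005-01-24: Original Snapshot, NTriples Snapshot (31 triples)
"""

# The byte size is optional because the oldest snapshot does not list one.
VERSION_LINE = re.compile(
    r"(\d{4})-(\d{2})-(\d{2}): Original Snapshot(?: \((\d+) bytes\))?, "
    r"NTriples Snapshot \((\d+) triples\)"
)

def parse_versions(text):
    """Return (date, size_in_bytes_or_None, triple_count) tuples, oldest first."""
    versions = []
    for line in text.splitlines():
        match = VERSION_LINE.search(line)
        if match:
            year, month, day, size, triples = match.groups()
            versions.append((
                date(int(year), int(month), int(day)),
                int(size) if size else None,
                int(triples),
            ))
    return sorted(versions, key=lambda version: version[0])

def triple_deltas(versions):
    """Return (date, change_in_triples) for each snapshot after the first."""
    return [
        (later[0], later[2] - earlier[2])
        for earlier, later in zip(versions, versions[1:])
    ]

versions = parse_versions(REPORT)
deltas = triple_deltas(versions)
```

Here `deltas` shows the ontology shrinking by four triples between the first two snapshots, then growing steadily until it stops changing after 2005-08-25.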

NOTICE: Yesterday we posted a form that directed you to the Swoogle trackback service. Unfortunately, the form failed when it was called from outside our firewall because a Swoogle API key is required. We didn’t notice at first, because we were inside the firewall when we tested it. When we did, we deleted the post, but PlanetRDF had already picked up the post and it was still in our database. The form has now been removed, but you can still go to the Swoogle web site and try the trackback service there.

Sifry’s state of the blogosphere

February 6th, 2006, by Tim Finin, posted in Blogging, memeta, splog

Technorati’s David Sifry has posted another State of the Blogosphere report with lots of interesting statistics. Highlights include

  • Technorati tracks 50K posts an hour from 27M blogs.
  • The number of blogs doubles every six months.
  • Splogs and spings are increasing.
  • Tagging is increasingly popular.

Well, Is my document indexed by Swoogle or not?!

February 6th, 2006, by Tim Finin, posted in RDF, Semantic Web, Swoogle

Yesterday we posted directions on how to tell if your Semantic Web document is in Swoogle’s database. Unfortunately, our directions suggested using a service that, if called from outside our firewall, requires a Swoogle API key. (This is separate from being a registered Swoogle user.) We didn’t notice at first, because we were inside the firewall when we tested it. When we did, we deleted the post, but PlanetRDF had already picked up the post and it was still in our database. We’re working to straighten this out and hope to have the service available soon.

FIPA’s P2P Nomadic Agent standards

February 5th, 2006, by Tim Finin, posted in AI, Agents, Mobile Computing, Pervasive Computing

FIPA is an IEEE Computer Society standards organization that promotes agent-based technology and the interoperability of its standards with other technologies. Jim Odell reports that FIPA’s P2P Nomadic Agent Working Group has released a draft of its specification. The group describes its focus as:

“The objective is to define a specification for P2P Nomadic Agents, capable of running on small or embedded devices, and to support distributed implementation of applications for consumer devices, cellular communications and robots, etc. over a pure P2P network. This specification will leverage presence and search mechanisms of underlying P2P infrastructures such as JXTA, Chord, Bluetooth, etc. In addition, this working group will propose the minimal required modifications of existing FIPA specifications to extend their reach to P2P Nomadic Agents. Potential application fields for P2P Nomadic Agents are healthcare, industry, offices, home, entertainment, transport/traffic.”

There is also a document from the Review of FIPA Specification Study Group that reviews and critiques the current inventory of 25 specifications.

coComment tracks blog conversations

February 5th, 2006, by Tim Finin, posted in Blogging, Web

coComment is a free service to help keep track of comment-based conversations in the blogosphere. After registering, you add their bookmarklet to your browser. When making a comment on a blog using any of the most common platforms (e.g., WordPress, Blogger), you first click on the bookmarklet and then submit your comment. The bookmarklet sends a copy of your comment to coComment, which adds it to their database along with the context. The result is that you can visit their page and see the comments you’ve made, and you can also add some code to your own blog(s) to show recent comments. Here’s what it should look like:



One thing that’s missing, IMHO, is the ability to register your comments with several IDs. I’d like to have my personal ID, but also define it as part of a group ebiquity ID. We could put code to link the ebiquity group ID comments on our ebiquity group blog.

Btw — to sign up you need an invitation code. To get an invitation code, just enter your email address to be notified when one is available. You may get it almost immediately in email, like I did.

Yahoo and AOL propose email postage

February 5th, 2006, by Akshay Java, posted in GENERAL, Web

According to reports, Yahoo and AOL are planning to test a new system for email postage.

America Online and Yahoo, two of the world’s largest providers of e-mail accounts, are about to start using a system that gives preferential treatment to messages from companies that pay from a quarter of a cent to 1 cent each to have them delivered. The Internet companies say this will help them identify legitimate mail and cut down on junk e-mail, identity-theft scams and other scourges of users of their services.

This would raise interesting issues, especially around micropayments for individual users and for mailing lists or groups. One major hurdle could be public acceptance: it is hard to expect people to pay for something they have so far been getting for free. Also, as noted in the Wikipedia article on email spam, cost-based methods might not prove effective, since spammers often use hijacked computers and accounts to send messages. Instead, one might use ‘virtual money’: for example, points could be associated with each user. On signup you get some points, and as you use the service you build up points by earning a good reputation (which in turn may increase your quota for sending email). Another interesting approach is to use social networks and trust for email filtering, as described in the paper “Reputation Network Analysis for Email Filtering” by Jennifer Golbeck and Dr. James Hendler from the Mindswap lab.
This space has become more important than ever, not just because of email spam but also because of the extent to which spam has entered the blogosphere.
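To make the ‘virtual money’ idea a little more concrete, here is a rough sketch of how points might translate into a sending quota. Everything in it is an assumption made for illustration: the starting balance, the reward and penalty sizes, the idea of penalizing spam reports at all, and the quota formula are hypothetical and do not describe any system proposed by Yahoo or AOL.

```python
from dataclasses import dataclass

# All constants below are hypothetical values chosen for illustration.
SIGNUP_POINTS = 10        # points granted when an account is created
GOOD_MAIL_REWARD = 1      # points earned when a recipient accepts a message
SPAM_REPORT_PENALTY = 5   # points lost when a recipient reports spam
MESSAGES_PER_POINT = 5    # daily sending quota granted per point held

@dataclass
class Sender:
    points: int = SIGNUP_POINTS

    def record_feedback(self, accepted):
        """Adjust reputation points after a recipient reacts to a message."""
        if accepted:
            self.points += GOOD_MAIL_REWARD
        else:
            # Never drop below zero, so the quota cannot become negative.
            self.points = max(0, self.points - SPAM_REPORT_PENALTY)

    def daily_quota(self):
        """Number of messages this sender may send per day."""
        return self.points * MESSAGES_PER_POINT

sender = Sender()
quota_at_signup = sender.daily_quota()

sender.record_feedback(accepted=True)
sender.record_feedback(accepted=True)
quota_after_good_mail = sender.daily_quota()

sender.record_feedback(accepted=False)
quota_after_spam_report = sender.daily_quota()
```

With these made-up numbers, one spam report costs more than several accepted messages earn. That asymmetry is a design choice of this sketch: it is meant to make a hijacked account lose its quota quickly.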

Search the Enron email corpus online

February 5th, 2006, by Tim Finin, posted in AI, NLP

The Enron email corpus is a collection of hundreds of thousands of email messages from the infamous Enron corporation that researchers have been using to improve and evaluate techniques for analyzing email, e.g., NLP analysis, information extraction, sentiment detection, social network analysis, information flow, etc. It has become important because it is the only substantial collection of real email that is public. In the ebiquity lab, for example, Akshay Java has worked with UMBC’s Institute for Language and Information Technologies to bring their NLP technology to bear on the messages.

InBoxer has put up an Enron Email site that lets anyone explore and search the collection on the Web. InBoxer is not a research group, but a company that sells an “anti-risk appliance” that is used to detect when email that is about to be sent or has been sent violates policy. (There should be a good market for this in the Government, too!).

You can also surf the corpus via a simple database interface at UC Berkeley.

William Cohen of CMU describes the collection:

This dataset was collected and prepared by the CALO Project (A Cognitive Assistant that Learns and Organizes). It contains data from about 150 users, mostly senior management of Enron, organized into folders. The corpus contains a total of about 0.5M messages. This data was originally made public, and posted to the web, by the Federal Energy Regulatory Commission during its investigation. … The dataset here does not include attachments, and some messages have been deleted “as part of a redaction effort due to requests from affected employees”.

Now it’s convenient to explore corporate malfeasance on the Web.

Software–Defined Radio Could Unify Wireless World

February 5th, 2006, by Amit, posted in Mobile Computing, Technology Impact

Technicians in Ireland are testing a device capable of skipping between incompatible wireless standards by tweaking its underlying code. A report from NewScientist states:

The device can impersonate a multitude of different wireless devices since it uses reconfigurable software to carry out the tasks normally performed by static hardware… The technology promises to let future gadgets jump between frequencies and standards that currently conflict. A cellphone could, for example, automatically detect and jump to a much faster Wi-Fi network when in a local hotspot.

Half of Swoogle’s hits are from referer log spammers

February 4th, 2006, by Tim Finin, posted in Blogging, Semantic Web, Swoogle, Web, splog

We are using bbclone to generate reports on Swoogle access. Look at today’s top 10 referers as of 3:00pm:

  www.legaladvocate.net  246     26.14%
  www.myjavaserver.com   152     16.15%
  www.google.com         125     13.28%
  dannyayers.com         44      4.68%
  lucky7.to              34      3.61%
  ebiquity.umbc.edu      25      2.66%
  www.google.de          18      1.91%
  planetrdf.com          18      1.91%
  mail.google.com        18      1.91%
  groups.google.com      14      1.49%

Entries one and five are clearly spam sites, and entry two is suspicious, too. The first, for example, appears to be about poker, though the site name is legaladvocate. The site’s text is obviously automatically generated nonsense. All of the links point to subpages in the same domain with a similar structure and content. I assume that once the site achieves a high PageRank, it will be repurposed or sold.

So, it seems like nearly 50% of our hits are due to referer log spamming. I’d guess Swoogle was picked by finding its URL on recent posts found on a blog search engine or a ping server.
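The arithmetic behind that estimate can be checked with a few lines of Python. This is just a back-of-the-envelope sketch: the hit counts come from the table above, the total number of hits is inferred from the first row's count and percentage rather than taken from bbclone, and treating all three flagged referers as spam is an assumption.

```python
# (referer, hits) pairs copied from the top-10 table above.
referers = [
    ("www.legaladvocate.net", 246),
    ("www.myjavaserver.com", 152),
    ("www.google.com", 125),
    ("dannyayers.com", 44),
    ("lucky7.to", 34),
    ("ebiquity.umbc.edu", 25),
    ("www.google.de", 18),
    ("planetrdf.com", 18),
    ("mail.google.com", 18),
    ("groups.google.com", 14),
]

# Entries one and five are clearly spam; entry two is only suspected.
suspected_spam = {"www.legaladvocate.net", "www.myjavaserver.com", "lucky7.to"}

# The percentages are relative to all referred hits, not just the top 10,
# so infer the total from the first row: 246 hits is 26.14% of the total.
total_hits = round(246 / 0.2614)

spam_hits = sum(hits for name, hits in referers if name in suspected_spam)
spam_share = spam_hits / total_hits

print(f"{spam_hits} of about {total_hits} hits ({spam_share:.1%}) look like referer spam")
```

That works out to 432 of roughly 941 hits, or about 45.9%, which matches the "nearly 50%" figure if the suspicious second entry is counted as spam.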
