UMBC ebiquity research group Building intelligent systems in open, heterogeneous, dynamic, distributed environments
16 May 2008, 01:50:44 EDT  
Archive for February, 2005

SIGMOD Workshop on the Web and Databases

February 18th, 2005, by Tim Finin, posted in Web, Semantic Web

The Eighth International Workshop on the Web and Databases, WebDB 2005, will be held June 16-17, 2005 in Baltimore in association with ACM SIGMOD/PODS 2005. Submitted papers and demonstration proposals are due 23 March.

How TiVo does its collaborative filtering

February 18th, 2005, by Harry Chen, posted in Technology Impact, Machine Learning

There is an interesting paper that describes how TiVo computes its recording recommendations.

The abstract:

We describe the TiVo television show collaborative recommendation system which has been fielded in over one million TiVo clients for four years. Over this install base, TiVo currently has approximately 100 million ratings by users over approximately 30,000 distinct TV shows and movies. TiVo uses an item-item (show to show) form of collaborative filtering which obviates the need to keep any persistent memory of each user’s viewing preferences at the TiVo server. Taking advantage of TiVo’s client-server architecture has produced a novel collaborative filtering system in which the server does a minimum of work and most work is delegated to the numerous clients. Nevertheless, the server-side processing is also highly scalable and parallelizable. Although we have not performed formal empirical evaluations of its accuracy, internal studies have shown its recommendations to be useful even for multiple user households. TiVo’s architecture also allows for throttling of the server so if more server-side resources become available, more correlations can be computed on the server allowing TiVo to make recommendations for niche audiences.
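To make the item-item idea a bit more concrete, here is a rough Python sketch of the split the abstract describes: the server computes show-to-show correlations from anonymous ratings, and each client combines those correlations with its own ratings. The abstract does not say which correlation measure or rating scale is used, so the Pearson correlation, the -3 to +3 scale, the show names, and every function name below are assumptions for illustration only.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation of two equal-length rating lists (0.0 if undefined)."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / sqrt(var_x * var_y)

def show_correlations(rating_sets):
    """Server side: correlate each pair of shows over the rating sets containing both.

    rating_sets is a list of {show: rating} dicts with no user identity attached.
    """
    shows = sorted({show for ratings in rating_sets for show in ratings})
    table = {}
    for i, a in enumerate(shows):
        for b in shows[i + 1:]:
            pairs = [(r[a], r[b]) for r in rating_sets if a in r and b in r]
            if len(pairs) < 2:
                continue  # too little overlap to say anything
            corr = pearson([p[0] for p in pairs], [p[1] for p in pairs])
            table[(a, b)] = table[(b, a)] = corr
    return table

def predict(my_ratings, show, table):
    """Client side: correlation-weighted average of this household's own ratings."""
    num = den = 0.0
    for rated_show, rating in my_ratings.items():
        weight = table.get((show, rated_show), 0.0)
        if weight > 0:  # only positively correlated shows contribute
            num += weight * rating
            den += weight
    return num / den if den else None

# Hypothetical ratings on a -3..+3 scale.
sample = [
    {"A": 3, "B": 3, "C": -2},
    {"A": 2, "B": 3, "C": -3},
    {"A": -1, "B": -2, "C": 2},
    {"A": -3, "B": -2},
]
correlations = show_correlations(sample)
guess_for_b = predict({"A": 3}, "B", correlations)
```

Because `predict` only needs the correlation table and the local ratings, the server in this sketch never has to store per-user preferences, which is the property the abstract highlights.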

See PVRBLog

Swoogle namespace searches

February 15th, 2005, by Tim Finin, posted in Swoogle, Semantic Web

We’ve added a new feature to Swoogle’s web interface that allows one to search for RDF documents that use a particular namespace. To use this, include a search term of the form ns:<NS> where <NS> is either a URI for the namespace or an abbreviation for one of the most common namespaces.

This example query searches for all RDF documents that use the cobra namespace (ns:http://daml.umbc.edu/ontologies/cobra/0.4/). A second example (pet person ns:foaf) finds RDF documents using the FOAF namespace and containing the lexemes ‘pet’ and ‘person’. (The ‘lexemes’ are word-like components in the local name part of URIs. Swoogle maintains indexes between URIs and documents and between URIs and lexemes. Lexemes are recognized by a kind of morphological analysis in which, for example, favoritePetFood is decomposed into {favorite, pet, food}.)
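For readers curious what that decomposition might look like, here is a small Python sketch that splits a URI's local name on case changes. It is only a guess at the general idea, not Swoogle's actual analysis; the function names, the regular expression, and the example.org URI are hypothetical.

```python
import re

def local_name(uri):
    """Return whatever follows the last '#' or '/' in a URI."""
    return re.split(r"[#/]", uri)[-1]

def lexemes(name):
    """Split a camelCase name into lowercase word-like pieces.

    The pattern picks out acronyms (RDF), capitalized or lowercase
    words (Pet, favorite), and digit runs, in that order of preference.
    """
    parts = re.findall(r"[A-Z]+(?![a-z])|[A-Z]?[a-z]+|\d+", name)
    return [part.lower() for part in parts]

example = lexemes(local_name("http://example.org/onto#favoritePetFood"))
```

An index from lexemes to URIs could then be built by running `lexemes` over every local name and recording which URIs produced each word.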

Thanks to Ryusuke Masuoka for prompting us to add this namespace search feature. The namespace abbreviations that we currently recognize are:

rdf http://www.w3.org/1999/02/22-rdf-syntax-ns
dc http://purl.org/dc/elements/1.1
rss1 http://purl.org/rss/1.0
mvcb http://webns.net/mvcb
rdfs http://www.w3.org/2000/01/rdf-schema
foaf http://xmlns.com/foaf/0.1
dcterms http://purl.org/dc/terms
dctype http://purl.org/dc/dcmitype
owl http://www.w3.org/2002/07/owl
daml http://www.daml.org/2001/03/daml+oil

It’s easy to add more, so let us know if you have favorites you recommend adding.

Swoogle’s database contains much more metadata about the documents it has discovered than it exposes in its simple web interface. We are always interested in improving the interface and have found it pretty easy to add features. We are eager to hear from users or potential users who want to do searches they don’t find possible or easy. If that’s you, please let us know by posting a comment to one of the Swoogle forums or by sending email to swoogle-developer@cs.umbc.edu.

Using Google to learn the meanings of words

February 15th, 2005, by Harry Chen, posted in Ontologies, Web, Machine Learning

The Web is the largest database on Earth, and Google has the largest index of this database. Two researchers at the University of Amsterdam have proposed a new system that uses Google search to learn and distinguish the meanings of words.

Their work is based on the theory that the meaning of a word can usually be gleaned from the words used around it. Take the word “rider”. Its meaning can be deduced from the fact that it is often found close to words like “horse” and “saddle”.

Instead of relying on a common sense knowledge base such as Cyc, the researchers use Google search to measure how closely two words relate to each other.

To do this, it needs to build a word tree - a database of how words relate to each other. It might start off with any two words to see how they relate to each other. For example, if it googles “hat” and “head” together it gets nearly 9 million hits, compared to, say, fewer than half a million hits for “hat” and “banana”. Clearly “hat” and “head” are more closely related than “hat” and “banana”.

To gauge just how closely, Vitanyi and Cilibrasi have developed a statistical indicator based on these hit counts that gives a measure of a logical distance separating a pair of words. They call this the normalised Google distance, or NGD. The lower the NGD, the more closely the words are related.
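The article does not spell out the indicator, but as I understand Cilibrasi and Vitanyi's definition, NGD(x, y) = (max(log f(x), log f(y)) - log f(x, y)) / (log N - min(log f(x), log f(y))), where f counts the pages containing a term (or both terms) and N is the total number of pages indexed. Here is a minimal Python sketch under that assumption. Only the two joint hit counts come from the quote above; the single-word counts and the index size are made-up numbers for illustration.

```python
from math import log

def ngd(fx, fy, fxy, n):
    """Normalised Google distance from hit counts.

    fx, fy: hits for each word alone; fxy: hits for both together;
    n: assumed total number of indexed pages.
    """
    lx, ly, lxy = log(fx), log(fy), log(fxy)
    return (max(lx, ly) - lxy) / (log(n) - min(lx, ly))

N = 8_000_000_000        # hypothetical index size
HAT = 50_000_000         # hypothetical hits for "hat"
HEAD = 200_000_000       # hypothetical hits for "head"
BANANA = 20_000_000      # hypothetical hits for "banana"
HAT_HEAD = 9_000_000     # "nearly 9 million" per the article
HAT_BANANA = 500_000     # "fewer than half a million" per the article

hat_head = ngd(HAT, HEAD, HAT_HEAD, N)
hat_banana = ngd(HAT, BANANA, HAT_BANANA, N)
```

With these numbers `hat_head` comes out smaller than `hat_banana`, which matches the article's point that a lower NGD means more closely related words.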

See also: “Google’s search for meaning”, New Scientist.

The Little JavaScripter

February 13th, 2005, by Tim Finin, posted in Programming, Web

Wow, Douglas Crockford’s The Little JavaScripter made me vow to take a more serious look at JavaScript. Like most of my colleagues, I’ve thought of it as another weak, poorly designed, and inelegant programming language. Crockford points out that

“JavaScript has much in common with Scheme. It is a dynamic language. It has a flexible datatype (arrays) that can easily simulate s-expressions. And most importantly, functions are lambdas. Because of this deep similarity, all of the functions in The Little Schemer can be written in JavaScript. The syntaxes of these two languages are very different, so some transformation rules are needed.

I have prepared a file containing primitive functions (cons, cdr, etc.), a pair of functions (p and s) for converting from s-expressions to text and back, and most of the functions in the book, expressed in JavaScript. Pay particular attention to The Applicative Order Y Combinator, one of the most strange and wonderful artifacts of Computer Science.”

Now, if I can only figure out how to use the Y operator on one of our web pages…
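For anyone who hasn't met it, here is the applicative-order Y combinator transcribed into Python rather than Crockford's JavaScript. It follows the shape of the definition in The Little Schemer; the `factorial` and `length` examples are my own illustrations.

```python
def Y(le):
    """Applicative-order Y combinator: builds a recursive function from `le`
    without `le` ever referring to itself by name."""
    return (lambda f: f(f))(lambda f: le(lambda x: f(f)(x)))

# `le` receives a stand-in for "the recursive function" and returns
# one step of the computation written in terms of that stand-in.
factorial = Y(lambda fac: lambda n: 1 if n == 0 else n * fac(n - 1))
length = Y(lambda length_of: lambda items: 0 if not items else 1 + length_of(items[1:]))
```

The inner `lambda x: f(f)(x)` delays the self-application until the function is actually called, which is what keeps this from looping forever in a language that evaluates arguments eagerly.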

mSpace: an iTunes-like semantic search engine

February 11th, 2005, by Harry Chen, posted in KR, Semantic Web

mSpace is an interaction model to help explore relationships in information. It is a research project developed by the School of Electronics & Computer Science (ECS) at the University of Southampton.

mSpace combines information semantics with a flexible user interface. It allows users to explore information they have only limited knowledge about. One problem that the mSpace project tries to address is the following:

What if you want to find something from a domain where you have a general interest but not specific knowledge? How would you find classical music you might enjoy if you don’t know what Beethoven or Berlioz sounds like? What a Sonata is? The difference between Baroque or Romantic? What do you type into Google?

You can find mSpace technical reports and demo apps here.

W3C workshop on Frameworks for Semantics in Web Services

February 10th, 2005, by Tim Finin, posted in Conferences, Semantic Web

The W3C will hold a workshop on Frameworks for Semantics in Web Services June 9-10, 2005 at the Digital Enterprise Research Institute (DERI) in Innsbruck, Austria. Position papers must be submitted to obtain an invitation to participate and are due by 22 April 2005.

CA school tracks students with mandatory RFID badges

February 10th, 2005, by Tim Finin, posted in Security, Pervasive Computing

The Brittan Elementary School in California now requires students to wear RFID badges that can track their every move. Students must wear identification cards around their necks with their picture, name, and grade and an RFID tag. The system was imposed, without parental input, to simplify attendance-taking, reduce vandalism, and improve student safety. The district superintendent told parents concerned about privacy that their children could be disciplined for boycotting the badges.

“It’s not an option, (The badge) is just like a textbook, you have to have it. I’m charged with running the school district and I get to make those kinds of rules.”

The badges were developed by InCom Corp., a company co-founded by the parent of a former Brittan student. The company has paid the school several thousand dollars for agreeing to the experiment, and has promised a royalty from each sale if the system takes off. See stories here and here and a NYT article describing parent protests.

Walmart on the Web - Watch Out the REST!

February 9th, 2005, by Pranam Kolari, posted in Web, GENERAL

We had an interesting discussion about Walmart, UPC bar codes, and RFID in our research group today. The point was how this powerhouse has affected the adoption of some technologies in non-Web commerce.

I came across some interesting statistics that point to Walmart’s growing dominance on the Web. From the Internet Stock Blog:

The top 10 with visitors in December ’04, December ’03 (in millions) and year-over-year growth were:

eBay, 50.9, 49.9, 2%
Amazon, 42.5, 37.4, 14%
Wal-Mart Stores, 23.8, 16.7, 42%
Yahoo Shopping, 22.6, 21.5, 5%
Shopping.com, 19.1, 17.1, 11%
Target, 17.5, 13.9, 26%
Dell, 17.5, 12.7, 38%
Best Buy, 17.3, 12.9, 34%
Overstock.com, 14.7, 8.6, 71%
Expedia, 12.7, 11.4, 11%

eBay, with 2% growth, is no match for Walmart’s 42%. Walmart says “Watch out!” to the rest of the Web.
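As a quick sanity check, the growth column can be recomputed from the two visitor columns. The short Python sketch below does this for a few of the rows; the function and variable names are mine. Since the visitor counts are already rounded to one decimal place, the recomputed percentages agree with the published ones only to within about a percentage point.

```python
def yoy_growth(current, previous):
    """Year-over-year growth as a percentage of the previous year's value."""
    return (current - previous) / previous * 100

# Millions of visitors: (December '04, December '03), copied from the list above.
visitors = {
    "eBay": (50.9, 49.9),
    "Amazon": (42.5, 37.4),
    "Wal-Mart Stores": (23.8, 16.7),
    "Overstock.com": (14.7, 8.6),
}

growth = {site: yoy_growth(now, before) for site, (now, before) in visitors.items()}

for site, pct in growth.items():
    print(f"{site}: {pct:.1f}%")
```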

Google Maps

February 8th, 2005, by Pavan, posted in GENERAL

Google has launched Google Maps, and it already seems to have an edge over its competitors. First of all, it is FAST. It is also closely tied to their Local Search, so you see all the pizza places on the map.
I wonder when they will integrate this with Keyhole to show images of the location.

Looks like Google will make us depend on it for almost any information. What’s next, Google?

I know Google “does no evil”, but what scares me is: what if they decide to?

List of Semantic web tools

February 8th, 2005, by Pavan, posted in Ontologies, AI, Semantic Web, GENERAL

Developers Guide to Semantic Web Toolkits for different Programming Languages

We are collecting links to Semantic Web toolkits for different programming languages and evaluating for each toolkit:

  • which features are offered (APIs, query languages, storage, reasoning support)
  • the strength of the development effort (number of developers involved, latest release)
  • the activity level of the toolkit’s user community (number of downloads, active mailing list)

As expected, Java seems to be the preferred language, though Python and Perl are catching up. JENA 2.1 has had 24600 downloads and KAON 1.2.7 has had 14200.

Ask Jeeves Acquires Bloglines

February 8th, 2005, by Harry Chen, posted in Blogging, GENERAL

It’s now official that Ask Jeeves has acquired Bloglines. You can read about this acquisition here.

Ask Jeeves has acquired Bloglines, and we’re excited about becoming the newest member of their portfolio of web services. We view this as a huge step forward for Bloglines, and a chance to achieve our mission of making RSS news reading and blogging a part of everyone’s internet experience. You can learn more about the transaction by reading our press release or reviewing our Frequently Asked Questions.
