UMBC ebiquity research group Building intelligent systems in open, heterogeneous, dynamic, distributed environments

Archive for February, 2008

Hypertable 0.9 alpha

February 8th, 2008, by Tim Finin, posted in Database, Semantic Web, Web, Web 2.0

Hypertable 0.9 alpha is out.

“Hypertable is a high performance distributed data storage system designed to support applications requiring maximum performance, scalability, and reliability. Hypertable will be particularly invaluable to any organization that needs to manage rapidly evolving data to support demanding real-time applications. Modeled after Google’s well known Bigtable project, Hypertable is designed to manage the storage and processing of information on a large cluster of commodity servers, providing resilience to machine and component failures. Hypertable seeks to set the open source standard for highly available, petabyte scale, database systems. ” (link)

Update: LinuxWorld has an article, Zvents releases open-source cluster database, on the release along with a podcast with Doug Judd, principal search architect for Zvents.

ICWSM 2008 early registration ends Fri 2/15

February 7th, 2008, by Tim Finin, posted in Semantic Web, Social media, Web

The Second International Conference on Weblogs and Social Media (ICWSM 2008) will be held March 30 - April 2, 2008 at the Hilton in Seattle, Washington. Registration is now open and the deadline for early registration is Friday, February 15. The conference will bring together academic and industrial practitioners to present and to discuss new research, applications, thoughts and ideas that are shaping the future of social media analysis. The conference aims to bring together researchers from different subject areas including computer science, linguistics, psychology, statistics, sociology, multimedia and semantic web technologies.

The program includes an impressive line-up of invited speakers: Bernardo Huberman (HP Labs), who will speak on “Social Dynamics in the Age of the Web,” David Sifry (Founder, Technorati, Sputnik, and Linuxcare), and Brad Fitzpatrick (Google, LiveJournal Founder). Two tutorials are planned, including “Subjectivity and Sentiment Analysis” by Jan Wiebe (Univ. of Pittsburgh) and “Graph Mining Techniques for Social Media Analysis” by Mary McGlohon and Christos Faloutsos (CMU).

DARPA budget up 10% but AI program cut 17%

February 5th, 2008, by Tim Finin, posted in AI, Computing Research, Funding

A Wired article, DARPA Nabs Big Bucks for Mach 6 Planes, Giant Robotic Blimps, Next-Gen Networks, summarizes the news in the proposed 2009 DARPA budget.

“DARPA, the Pentagon’s mad science division, got a $324 million boost in the Defense Department’s new budget — a ten percent increase. Which means lots more cash for giant blimps, next-gen wireless networks, Mach 6 planes, shape-shifting drones, and improvised bomb-beaters. … But not everything in the DARPA budget got bumped up. The agency’s much-ballyhooed efforts at “Cognitive Computing” took at $30 million cut, to $145 million. Which could mean that even the Pentagon’s most wide-eyed visionaries see thinking machines are still far, far off in the distance.” (link)

DARPA has traditionally been an important funding source for basic computer science research. While the ORCA program got a healthy increase of $53M, this is the only CS-related program mentioned.

2007 Turing award goes to model checking developers

February 5th, 2008, by Tim Finin, posted in CS, Computing Research, GENERAL, KR

The ACM named Edmund Clarke, E. Allen Emerson and Joseph Sifakis winners of the prestigious 2007 A.M. Turing Award for their research on Model Checking.

From the ACM announcement:

“Their innovations transformed this approach from a theoretical technique to a highly effective verification technology that enables computer hardware and software engineers to find errors efficiently in complex system designs. This transformation has resulted in increased assurance that the systems perform as intended by the designers. … Clarke of Carnegie Mellon University, and Emerson of the University of Texas at Austin, working together, and Sifakis, working independently for the Centre National de la Recherche Scientifique at the University of Grenoble in France, developed this fully automated approach that is now the most widely used verification method in the hardware and software industries.” (link)

Policy-controlled dynamic spectrum access

February 4th, 2008, by Tim Finin, posted in Mobile Computing, Semantic Web

Next Friday, UMBC alumnus Filip Perich (PhD 2004) will talk about his recent work using policies expressed in the Semantic Web language OWL to control how software radios manage access to the radio spectrum.

We present an overview of the policy-controlled dynamic spectrum access technology, which provides an order-of-magnitude improvement to wireless communications and spectrum management in terms of spectrum access, capacity, planning requirements, ease of use, reliability, and jam resistance. We describe the current radio systems developed by Shared Spectrum Company as part of the DARPA XG program and provide results from field testing and benefit studies.

The talk, SSC Dynamic Spectrum Access Technology, will be 1:00pm-2:30pm in ITE 229 on Friday 8 February 2008.

Reuters Calais: free text to Semantic Web services

February 2nd, 2008, by Tim Finin, posted in NLP, OWL, RDF, Semantic Web, Social media, Web, Web 2.0

Reuters has released an API for its Calais Web service. The free service discovers entities, events and relations in text and returns the results in the form of RDF data. The services use information extraction technology from ClearForest, which Reuters acquired in April 2007.

“The Calais web service automatically attaches rich semantic metadata to the content you submit – in well under a second. Using natural language processing, machine learning and other methods, Calais categorizes and links your document with entities (people, places, organizations, etc.), facts (person ‘x’ works for company ‘y’), and events (person ‘z’ was appointed chairman of company ‘y’ on date ‘x’). The metadata results are stored centrally and returned to you as industry-standard RDF constructs accompanied by a Globally Unique Identifier (GUID). Using the Calais GUID, any downstream consumer is able to retrieve this metadata via a simple call to Calais.” (link)

The semantic types it recognizes and uses in its annotations are a basic set typical of information extraction systems and include entities, facts, events and categories. See, for example, the description of the person entity type. The brief API documentation describes how to call the web services and interpret the results. As an example of the semantic metadata types supported by Calais, a preprocessed sample content set of about 350 Business and Economic news articles from WikiNews for the year 2007 is available.

The service is free for both commercial and non-commercial purposes with a limit, but a generous one, on the number of service calls a registered developer can make in a day. A sample Java application is available that reads input from STDIN, writes output to STDOUT and takes processing parameters from a configuration file.

    updates: The sample application requires Java 6 to run! Here’s an example of input and the RDF output.

Making such a service freely available on the Web has the potential to be a disruptive move. Reuters will sponsor “a number of contests and bounties for applications developed using the Calais API.” An initial “bounty” of $5,000 is offered for “A highly configurable plugin for WordPress that enriches a blog with several capabilities” based on OpenCalais.

The kind of content extraction that Calais does falls considerably short of full language understanding. However, it does represent the state of the art in scalable, domain-independent information extraction, is immediately useful, and is an important step toward the ultimate goal of full NLP.

Twine in the New York Times

February 2nd, 2008, by Tim Finin, posted in GENERAL, Semantic Web, Social media, Web 2.0

Tomorrow’s New York Times has a very positive story on Twine in the business section, An Online Organizer That Helps Connect the Dots.

“How often have you wasted time searching through page after page of e-mail messages, Web sites, notes, news feeds and YouTube videos on your computer, trying to find an important item? If the answer is “too often,” a San Francisco company, Radar Networks, is testing a free, Web-based application, called Twine, that may provide some robotic secretarial help in organizing and retrieving documents.”

Happily, the story mentions that Twine is using Semantic Web technology:

“Twine is based on technologies created for the developing semantic Web — foreseen as a smarter Web where machines may someday be able to process the meaning of words and phrases in documents and even routinely answer direct questions.”

Google social graph API

February 2nd, 2008, by Tim Finin, posted in Blogging, Semantic Web, Social media, Web, Web 2.0

Late this week Google released the Google social graph API, which provides structured access to information Google has extracted from public FOAF and XFN data on the Web. Google says it mines the Web for these “and other publicly declared connections”. I wonder what that means? Brad Fitzpatrick gives a three minute explanation in this video. This is exciting and likely to give a push to any number of emerging themes, including data portability, linked data, and the Semantic Web in general. There’s lots of comment from the usual suspects and also on the SWIG IRC.

By the way, Brad Fitzpatrick will give an invited talk at the 2008 International Conference on Weblogs and Social Media at the end of March in Seattle.

Here’s a simple call to the API, starting with the ebiquity blog:

  http://socialgraph.apis.google.com/lookup?q=ebiquity.umbc.edu%2Fblogger%2F&fme=1&pretty=1

You can see from the results that they are returned using JSON. The possible parameters and what they mean are given here.
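For anyone who wants to generate such lookup URLs from a script, here is a minimal Python sketch. It only builds the request URL shown above and then decodes it again; it does not contact the service. The helper name build_lookup_url is hypothetical, and the fme and pretty flags are simply copied from the example call (see the linked parameter documentation for what they mean).

```python
from urllib.parse import urlencode, urlsplit, parse_qs

# Endpoint taken from the example call in the post.
BASE_URL = "http://socialgraph.apis.google.com/lookup"


def build_lookup_url(node, **flags):
    """Build a lookup URL for one node (hypothetical helper, not part of any Google library).

    `node` is the page to start from; `flags` are extra query parameters
    such as fme=1 and pretty=1, passed through unchanged.
    """
    params = {"q": node}
    params.update(flags)
    # urlencode percent-encodes the slashes in the node as %2F.
    return BASE_URL + "?" + urlencode(params)


url = build_lookup_url("ebiquity.umbc.edu/blogger/", fme=1, pretty=1)
print(url)

# Decoding the query string recovers the original node.
query = parse_qs(urlsplit(url).query)
print(query["q"][0])
```

The printed URL should match the one above character for character, which is a quick check that the query string is encoded the way the API call expects.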
