UMBC ebiquity research group Building intelligent systems in open, heterogeneous, dynamic, distributed environments
Semantic Web

Archive for the 'Semantic Web' Category

Tracking news memes: lipstick on Joe the Plumber

July 13th, 2009, by Tim Finin, posted in Semantic Web, Social media

Here’s a great graphic on the rise and fall of memes in the news from the KDD 2009 paper, Meme-tracking and the Dynamics of the News Cycle by Leskovec, Backstrom and Kleinberg. Click on the image to see a larger version.

Tracking memes in news

Here’s the paper’s abstract.

“Tracking new topics, ideas, and “memes” across the Web has been an issue of considerable interest. Recent work has developed methods for tracking topic shifts over long time scales, as well as abrupt spikes in the appearance of particular named entities. However, these approaches are less well suited to the identification of content that spreads widely and then fades over time scales on the order of days —the time scale at which we perceive news and events. We develop a framework for tracking short, distinctive phrases that travel relatively intact through on-line text; developing scalable algorithms for clustering textual variants of such phrases, we identify a broad class of memes that exhibit wide spread and rich variation on a daily basis. As our principal domain of study, we show how such a meme-tracking approach can provide a coherent representation of the news cycle—the daily rhythms in the news media that have long been the subject of qualitative interpretation but have never been captured accurately enough to permit actual quantitative analysis. We tracked 1.6 million mainstream media sites and blogs over a period of three months with the total of 90 million articles and we find a set of novel and persistent temporal patterns in the news cycle. In particular, we observe a typical lag of 2.5 hours between the peaks of attention to a phrase in the news media and in blogs respectively, with divergent behavior around the overall peak and a “heartbeat”-like pattern in the handoff between news and blogs. We also develop and analyze a mathematical model for the kinds of temporal variation that the system exhibits.”

(via the NYT)

detexify: draw a symbol, get the LaTeX command

July 12th, 2009, by Tim Finin, posted in GENERAL, Semantic Web

The perfect document preparation system has yet to be invented, and I’ve tried many over the years, starting with TJ6. It’s surely impossible for any one system to be best, given the range of documents most of us have to produce: letters, memos, resumes, scientific papers, dissertations, books, etc. Microsoft Word is great for many of these, but like many, I’ve concluded that LaTeX is still the best for academic papers or large, complex documents. I think this graph attributed to Marko Pinteric says it elegantly.


Microsoft Word vs LaTeX

That LaTeX is so widely used is remarkable, given that it has been more than 25 years since it was first released and it was based on the somewhat arcane TeX. But LaTeX has its problems too, and one of them is remembering all of the commands to generate the many symbols that we like to use to make our papers seem more profound.

Detexify is a neat Web service that lets you draw a mathematical symbol with your mouse, interprets the result, and shows you what LaTeX command to use to generate it.

detexify reads your drawing and recommends a LaTeX command

It works pretty well! You can look at the source code, mostly in Ruby, on GitHub and contribute. Or you can volunteer to help train the system on new symbols.

(via Hacker News)

CFP: Semantics for the rest of us Workshop at 8th Int. Semantic Web Conference

July 9th, 2009, by Tim Finin, posted in Conferences, OWL, RDF, Semantic Web, Web, iswc
IMPORTANT DATES
Submissions 10 Aug 09
Notification 19 Aug 09
Final copy 2 Sept 09
Workshop 26 Oct 09

Semantics for the Rest of Us: Variants of Semantic Web Languages in the Real World is a workshop that will be held at the 8th International Semantic Web Conference on 26 October 2009 in Washington, DC.

The Semantic Web is a broad vision of the future of personal computing, emphasizing the use of sophisticated knowledge representation as the basis for end-user applications’ data modeling and management needs. Key to the pervasive adoption of Semantic Web technologies is a good set of fundamental “building blocks” – the most important of these are representation languages themselves. W3C’s standard languages for the Semantic Web, RDF and OWL, have been around for several years. Instead of strict standards compliance, we see “variants” of these languages emerge in applications, often tailored to a particular application’s needs. These variants are often either subsets of OWL or supersets of RDF, typically with fragments of OWL added. Extensions based on rules, such as SWRL and N3 logic, have been developed as well as enhancements to the SPARQL query language and protocol.

This workshop will explore the landscape of RDF, OWL and SPARQL variants, specifically from the standpoint of “real-world semantics”. Are there commonalities in these variants that might suggest new standards or new versions of the existing standards? We hope to identify common requirements of applications consuming Semantic Web data and understand the pros and cons of a strictly formal approach to modeling data versus a “scruffier” approach where semantics are based on application requirements and implementation restrictions.

The workshop will encourage active audience participation and discussion and will include a keynote speaker as well as a panel. Topics of interest include but are not limited to

  • Real world applications that use (variants of) RDF, OWL, and SPARQL
  • Use cases for different subsets/supersets of RDF, OWL, and SPARQL
  • Extensions of SWRL and N3Logic
  • RIF dialects
  • How well do the current SW standards meet system requirements?
  • Real world “semantic” applications using other structured representations (XML, JSON)
  • Alternatives to RDF, OWL or SPARQL
  • Are ad hoc subsets of SW languages leading to problems?
  • What level of expressive power does the Semantic Web need?
  • Does the Semantic Web require languages based on formal methods?
  • How should standard Semantic Web languages be designed?

We seek two kinds of submissions: full papers up to ten pages long and position papers up to five pages long. Format papers according to the ISWC 2009 instructions. Accepted papers will be presented at the workshop and be part of the workshop proceedings.

Organizers:

Journal of Web Semantics maintains high impact factor

July 6th, 2009, by Tim Finin, posted in AI, Semantic Web, Web

The latest Journal Citation Reports (2009) published by Thomson Reuters shows that the Journal of Web Semantics continues to enjoy a very high impact factor. The 2008 measure was 3.023, which was the 12th highest out of the 94 journals in the category of Computer Science, Artificial Intelligence.

Thomson Reuters’ journal impact factor is a measure of the frequency with which the average article in a journal has been cited in a particular year. The 2008 impact factor is computed as the citations received in 2008 to all articles published in 2006 and 2007, divided by the number of “source items” published in 2006 and 2007.
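The calculation described above can be sketched in a few lines of Python. This is only an illustration of the ratio: the function name and the counts in the example are made up for the sketch and are not the journal's actual citation data.

```python
def impact_factor(citations_in_year, source_items_prior_two_years):
    """Two-year impact factor for year Y.

    citations_in_year: citations received in year Y to articles
        published in years Y-1 and Y-2.
    source_items_prior_two_years: number of "source items" published
        in years Y-1 and Y-2.
    """
    if source_items_prior_two_years <= 0:
        raise ValueError("need at least one source item")
    return citations_in_year / source_items_prior_two_years


# Hypothetical counts, chosen only to show the arithmetic.
example = impact_factor(citations_in_year=300, source_items_prior_two_years=100)
print(round(example, 3))
```

With these invented counts, 300 citations to 100 source items gives an impact factor of 3.0.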

ISWC 2009 student support

July 5th, 2009, by Tim Finin, posted in Semantic Web

2009 International Semantic Web Conference
The US National Science Foundation (NSF) and the Semantic Web Science Association (SWSA) plan to contribute funds to support participation by full-time students in the 2009 International Semantic Web Conference. SWSA and NSF anticipate providing 10,000€ and $20,000 respectively, with NSF funds being earmarked to support students enrolled at U.S. universities. We anticipate that the SWSA funds will support 15 awards of 600-800€, and that the NSF funds will support 13 awards of approximately $1500.

Confirmation of the funding, as well as details on how to apply will be available on the ISWC 2009 Web site.

Last year’s student fellows made significant contributions to the conference, and we look forward to this year’s fellows being similarly engaged. In selecting applications for travel support, preference will be given to students selected to participate in the doctoral consortium, followed by students who are first author on a paper accepted at the conference, followed by students who have other authorship on a conference or workshop paper.

Applications are due August 21, with notification of success by September 7.

Direct questions to iswc09_fellowships@cs.umbc.edu.

NOSQL: distributed key-value data stores

July 2nd, 2009, by Tim Finin, posted in Database, Semantic Web, Web

ComputerWorld has an article on the “nosql” movement and a recent nosql meetup held in San Francisco, No to SQL? Anti-database movement gains steam. Nosql systems are distributed, non-relational data stores that typically use a simple key-value approach to indexing and retrieving data and use a simple procedural query API rather than a sophisticated declarative query language.

“The inaugural get-together of the burgeoning NoSQL community crammed 150 attendees into a meeting room at CBS Interactive. Like the Patriots, who rebelled against Britain’s heavy taxes, NoSQLers came to share how they had overthrown the tyranny of slow, expensive relational databases in favor of more efficient and cheaper ways of managing data.

“Relational databases give you too much. They force you to twist your object data to fit a RDBMS [relational database management system],” said Jon Travis, principal engineer at Java toolmaker SpringSource, one of the 10 presenters at the NoSQL confab (PDF). NoSQL-based alternatives “just give you what you need,” Travis said.”
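To make the contrast between a "simple procedural query API" and a declarative query language concrete, here is a toy sketch in Python. It is an in-memory stand-in, not the API of any of the systems mentioned in this post; the class, method and key names are all hypothetical.

```python
class KeyValueStore:
    """Toy in-memory key-value store (hypothetical, for illustration only)."""

    def __init__(self):
        self._data = {}

    def put(self, key, value):
        # Store or overwrite the value under this key.
        self._data[key] = value

    def get(self, key, default=None):
        # Look up a value by its exact key.
        return self._data.get(key, default)

    def delete(self, key):
        # Remove the key if present; do nothing otherwise.
        self._data.pop(key, None)

    def items(self):
        return list(self._data.items())


def find_keys_where(store, field, expected):
    """Procedural 'query': the caller scans the records and tests a field."""
    return sorted(
        key
        for key, record in store.items()
        if isinstance(record, dict) and record.get(field) == expected
    )


store = KeyValueStore()
store.put("user:1", {"name": "Ann", "city": "Baltimore"})
store.put("user:2", {"name": "Bob", "city": "Boston"})
store.put("user:3", {"name": "Cy", "city": "Baltimore"})

print(store.get("user:2"))
print(find_keys_where(store, "city", "Baltimore"))
```

In a relational database the second lookup would be a one-line declarative query along the lines of SELECT id FROM users WHERE city = 'Baltimore'. Here the application writes the scan itself, which is the trade-off the nosql systems accept in exchange for a simpler, easier to distribute store.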

There were presentations on nine different ‘nosql’ databases: Voldemort, Cassandra, Dynomite, HBase, Hypertable, CouchDB, VPork, MongoDB as well as general presentations by Google’s Jonas Karlsson and Cloudera’s Todd Lipcon.

Johan Oskarsson of Last.fm wrote a debriefing post on his blog.

“The relatively young but rapidly growing “nosql” community met last Thursday in San Francisco. The idea was to give attendees a solid introduction to how distributed, non relational databases work as well as an overview of the various projects out there.”

and provides links to the presentation slides and videos. You can also search for NOSQL on Vimeo to get the videos.

I learned of this meeting on Hacker News, where you can find some interesting comments.

Of course there are many popular key-value stores that are not designed to support the highly-scalable distributed needs of many Web applications. I found, for example, that as a persistent RDF store for rdflib, Sleepycat outperformed MySQL.

CFP: JWS special issue on Semantic Web and Social Media

June 27th, 2009, by Tim Finin, posted in Blogging, Semantic Web, Social media, Wikipedia
important dates
abstracts 21 Sept 09
submissions 01 Oct 09
notification 15 Dec 09
final copy 15 Jan 10
publication April 10

The Journal of Web Semantics will publish a special issue on Data Mining and Social Network Analysis for integrating Semantic Web and Web 2.0 in the spring of 2010. The special issue will be edited by Bettina Berendt, Andreas Hotho and Gerd Stumme and initial abstracts for papers must be submitted via the Elsevier EES system by September 21, 2009.

The special issue invites contributions that show how synergies between Semantic Web and Web 2.0 techniques can be successfully used. Since both communities work on network-like data structures, analysis methods from different fields of research could form a link between those communities. Techniques can be – but are not limited to – social network analysis, graph analysis, machine learning and data mining methods.

Relevant topics include

  • ontology learning from Web 2.0 data
  • instance extraction from Web 2.0 systems
  • analysis of Blogs
  • discovering social structures and communities
  • predicting trends and user behaviour
  • analysis of dynamic networks
  • using content of the Web for modelling
  • discovering misuse and fraud
  • network analysis of social resource sharing systems
  • analysis of folksonomies and other Web 2.0 data structures
  • analysis of Web 2.0 applications and their data
  • deriving profiles from usage
  • personalized delivery of news and journals
  • Semantic Web personalization
  • Semantic Web technologies for recommender systems
  • ubiquitous data mining in Web (2.0) environment
  • applications

Bing vs. Google, side by side comparison

June 1st, 2009, by Tim Finin, posted in Google, Security, Semantic Web, Social media, sEARCH

Microsoft’s new Bing search engine is getting a lot of interest. Glenn McDonald posts about a nice side-by-side Bing vs Google comparator that he developed. It makes it easy to compare how the two services do on a range of different types of searches. Here are the ones that Glenn said he found useful in developing his initial opinion.

I sense from some of these queries that he is probing the systems where an advanced search engine can exploit a little bit of semantic knowledge. For example, recognizing that a user’s query “boston to asheville” matches a common pattern, “[place] to [place]”, and that she is probably interested in information about how to travel from the first location to the second. It seems like Google has been working on adding more such patterns, at least for the low hanging fruit.
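A very rough sketch of this kind of pattern matching, in Python, might look like the following. It is a guess at the general idea, not a description of how Google or Bing work; the tiny place list and the function name are hypothetical stand-ins for real geographic knowledge.

```python
import re

# Hypothetical mini-gazetteer; a real system would use far richer data.
KNOWN_PLACES = {"boston", "asheville", "baltimore", "washington"}

# Matches "<something> to <something>" with the pieces captured by name.
ROUTE_PATTERN = re.compile(
    r"^\s*(?P<origin>.+?)\s+to\s+(?P<destination>.+?)\s*$",
    re.IGNORECASE,
)


def parse_route_query(query):
    """Return (origin, destination) if the query looks like a travel request.

    The surface pattern alone is not enough ("how to cook" also matches it),
    so both sides must also be known places. Returns None otherwise.
    """
    match = ROUTE_PATTERN.match(query)
    if match is None:
        return None
    origin = match.group("origin").lower()
    destination = match.group("destination").lower()
    if origin in KNOWN_PLACES and destination in KNOWN_PLACES:
        return origin, destination
    return None


print(parse_route_query("boston to asheville"))
print(parse_route_query("how to cook"))
```

The regular expression only captures the surface form. The "little bit of semantic knowledge" is the check that both sides are places, which is what separates a travel query from an ordinary phrase containing the word "to".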

Of course, if everyone hits on this site it may get throttled or blocked by either or both of the search engines. @Glenn, would you be willing to share your code?

(spotted on Hacker News)

Price Waterhouse Coopers bullish on the Semantic Web

May 29th, 2009, by Tim Finin, posted in AI, Database, Semantic Web

Price Waterhouse Coopers is one of the largest “professional services” organizations and has always been strong on technology consulting and advice. The Spring issue of their quarterly Technology Forecast journal focuses on the Semantic Web. This is from the table of contents:


  • 04 Spinning a data Web. Semantic Web technologies could revolutionize enterprise decision making and information sharing. Here’s why.
  • 20 Making Semantic Web connections. Linked Data technology can change the business of enterprise data management.
  • 16 Traversing the Giant Global Graph. Tom Scott of BBC Earth describes how everyone benefits from interoperable data.
  • 28 From folksonomies to ontologies. Uche Ogbuji of Zepheira discusses how early adopters are introducing Semantic Web to the enterprise.
  • 40 How the Semantic Web might improve cancer treatment. M. D. Anderson’s Lynn Vogel explores new techniques for combining clinical and research data.
  • 46 Semantic technologies at the ecosystem level. Frank Chum of Chevron talks about the need for shared ontologies in the oil and gas industry.

You can download the free 58-page report here. You can also read a note on the issue in ReadWriteWeb, which focuses on linked data and interoperability.

“A new PricewaterhouseCoopers Technology report explains how the Semantic Web and Linked Data can help enterprises manage their large scale data better. The PwC Center for Technology and Innovation team spent several months researching and analyzing the problem of data silos in enterprises – and what solutions are being developed to help with that problem. The answer, according to PwC, is Semantic Web techniques. PwC believes that the Semantic Web offers a practical way to address the problem of large-scale data integration. …”

(Spotted on public-lod@w3.org)

Google Wave as a new communication model

May 28th, 2009, by Tim Finin, posted in Agents, Google, Semantic Web, Social media

Google Wave looks interesting. Google describes it as “a new tool for communication and collaboration on the web” and it’s a funny mix of email, instant messaging, wikis, and Facebook wall interactions. Or maybe IRC for the new century. This is from a post, Went Walkabout. Brought back Google Wave, on the Google blog.

“A “wave” is equal parts conversation and document, where people can communicate and work together with richly formatted text, photos, videos, maps, and more. Here’s how it works: In Google Wave you create a wave and add people to it. Everyone on your wave can use richly formatted text, photos, gadgets, and even feeds from other sources on the web. They can insert a reply or edit the wave directly. It’s concurrent rich-text editing, where you see on your screen nearly instantly what your fellow collaborators are typing in your wave. That means Google Wave is just as well suited for quick messages as for persistent content — it allows for both collaboration and communication. You can also use “playback” to rewind the wave and see how it evolved.”

Google Wave is not available yet, but you can sign up to be notified when it’s launched.

Here’s a random thought. Our models for communication in multiagent systems (e.g., KQML and FIPA) were informed by if not based on email and, to a lesser degree, IM. If Wave is a useful new communication model for humans, does it have a counterpart for software agents? If so, I suspect that ideas from the Semantic Web will be useful to provide a “rich content” for agents.

For more views, see posts by O’Reilly, TechCrunch, BusinessWeek and Gabor Cselle.

Wolfram Alpha is live, API description online

May 15th, 2009, by Tim Finin, posted in Semantic Web

Wolfram|Alpha is live. A document describing the Wolfram|Alpha API can be found in Google’s cache.

Stephen Wolfram wrote today in a blog post, Wolfram|Alpha Is Launching: Made Possible by Mathematica, on its relation to Mathematica.

“Wolfram|Alpha defines a new direction in computing—that would have simply not have been possible without Mathematica, and that in time will add some remarkable new dimensions to Mathematica itself. In terms of technology, Wolfram|Alpha is a uniquely complex software system, which has been entirely developed and deployed with Mathematica and Mathematica technologies. … When we launch Wolfram|Alpha this weekend, it will be running Mathematica on about 10,000 processor cores, using gridMathematica-based parallelism. And every single query that comes into the system will be served with webMathematica.”

And now, for a real test…

(spotted on Hacker News)

UPDATE: (5/18) The API document is officially now available.

Google supports RDFa and Microformats

May 12th, 2009, by Tim Finin, posted in Google, RDF, Semantic Web

Google has announced that it will begin to recognize structured information encoded as metadata in either RDFa or microformats and use the metadata in search results snippets for reviews and people.

“Structured data makes the web a better place. It also helps Google better understand and present your page in search results. … Google’s first use of this data will be in search results snippets for two kinds of objects: Reviews and People. Providing more detail in search results helps users to understand the value of your pages. When users get more information showing how your page is relevant to their search, they’re more likely to click through to see the full page. … At Google, we believe in openness, so we are using two open standards to allow you to annotate structured data on your site: microformats and RDFa. Both standards allow markup of information on your pages.”

This is a case where Google is following Yahoo, which announced more general support for RDFa and microformats last fall in their SearchMonkey.

We expect that this is work in progress. While it’s great that Google is supporting RDFa annotations, they are asking people to start with the new RDF vocabulary defined at their site http://www.data-vocabulary.org/ rather than reusing or integrating with existing, widely used vocabularies. Let’s hope that they embrace the LOD vision in the near future.
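To give a feel for what such annotations look like, here is a small Python sketch that pulls property values out of an RDFa-style review snippet. The snippet itself is invented, and the namespace URL and the v: property names are assumptions modeled on the Google vocabulary mentioned above, so check them against the vocabulary's own documentation. The code is also not a real RDFa parser: it only collects the text of elements that carry a property attribute.

```python
from html.parser import HTMLParser

# Invented review markup; namespace and property names are assumptions.
SNIPPET = """
<div xmlns:v="http://rdf.data-vocabulary.org/#" typeof="v:Review">
  <span property="v:itemreviewed">Example Pizza Place</span>
  reviewed by <span property="v:reviewer">A. Reader</span>:
  <span property="v:rating">4.5</span>
</div>
"""


class PropertyCollector(HTMLParser):
    """Collect the text inside elements that have a `property` attribute."""

    def __init__(self):
        super().__init__()
        self.found = {}
        self._current = None

    def handle_starttag(self, tag, attrs):
        prop = dict(attrs).get("property")
        if prop:
            self._current = prop
            self.found[prop] = ""

    def handle_data(self, data):
        # Only keep text that sits inside an annotated element.
        if self._current is not None:
            self.found[self._current] += data

    def handle_endtag(self, tag):
        # The sketch assumes annotated elements are not nested.
        self._current = None


def extract_properties(html):
    collector = PropertyCollector()
    collector.feed(html)
    return {name: text.strip() for name, text in collector.found.items()}


print(extract_properties(SNIPPET))
```

A search engine that understands the vocabulary can turn values like these into a richer results snippet, for example by showing the rating and reviewer next to the page title.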
