Archive for the 'AI' Category
October 30th, 2009, by Tim Finin, posted in Ontologies, RDF, Semantic Web
Like many newspapers, the New York Times links the first mention of well-known entities in its articles to a reference page. For example, a mention of Barack Obama links to a page that collects basic information on President Obama along with links to relevant stories and other resources that the Times has created.
Now the Times is also using RDF to publish some of this information as linked open data. Yesterday the Times announced the publication of an LOD collection covering about 5,000 people at http://data.nytimes.com/ under a Creative Commons 3.0 Attribution License, and it plans to put its full collection of 30K topics online soon.
“Over the last several months we have manually mapped more than 5,000 person name subject headings onto Freebase and DBPedia. And today we are pleased to announce the launch of http://data.nytimes.com and the release of these 5,000 person name subject headings as Linked Open Data.
…
Over the next several months, we plan to expand http://data.nytimes.com to include each of the nearly 30,000 subject headings we use to power Times Topics pages, a collection that includes locations, organizations and descriptors in addition to person names.”
October 27th, 2009, by Tim Finin, posted in AI, KR, OWL, Ontologies, Semantic Web
OWL 2, the new version of the Web Ontology Language, officially became a W3C standard yesterday. From the W3C press release:
“Today W3C announces a new version of a standard for representing knowledge on the Web. OWL 2, part of W3C’s Semantic Web toolkit, allows people to capture their knowledge about a particular domain (say, energy or medicine) and then use tools to manage information, search through it, and learn more from it. Furthermore, as an open standard based on Web technology, it lowers the cost of merging knowledge from multiple domains.”
October 25th, 2009, by Tim Finin, posted in AI, Agents, Social media
Golden Balls is a UK game show with a final round, Split or Steal, that is similar to the prisoner’s dilemma. The two contestants have to simultaneously choose to split the prize or try to steal it. If both choose split, they each get half. If one chooses split and the other steal, then the stealer gets it all. If they both choose steal, neither gets anything. While the payoff matrix is not exactly that of the PD, it has a similar effect on the strategy. Check out this video of a Split or Steal round for £100,000. (Spotted on Hacker News)
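The payoff rules above are easy to write down as a small table. Here is a minimal sketch (the names are our own, purely illustrative) that encodes each outcome as a fraction of the prize and checks how stealing compares with splitting. It suggests why the strategy resembles the PD even though the matrix differs: stealing is never worse than splitting, but against a stealer both choices pay nothing, so the dominance is only weak.

```python
from fractions import Fraction

SPLIT, STEAL = "split", "steal"

# Share of the prize I receive, keyed by (my_choice, other_choice).
# These values come straight from the rules described in the post.
PAYOFF = {
    (SPLIT, SPLIT): Fraction(1, 2),
    (SPLIT, STEAL): Fraction(0),
    (STEAL, SPLIT): Fraction(1),
    (STEAL, STEAL): Fraction(0),
}


def weakly_dominates(a, b):
    """True if choice a is never worse than b and strictly better at least once."""
    diffs = [PAYOFF[(a, other)] - PAYOFF[(b, other)] for other in (SPLIT, STEAL)]
    return all(d >= 0 for d in diffs) and any(d > 0 for d in diffs)


def strictly_dominates(a, b):
    """True if choice a is strictly better than b whatever the other player does."""
    return all(PAYOFF[(a, other)] > PAYOFF[(b, other)] for other in (SPLIT, STEAL))


if __name__ == "__main__":
    print(weakly_dominates(STEAL, SPLIT))    # True
    print(strictly_dominates(STEAL, SPLIT))  # False
```

In the classic PD, defecting is strictly better in both columns; here the tie at steal/steal is the difference the post alludes to.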
October 16th, 2009, by Tim Finin, posted in AI, KR, NLP, Ontologies, Semantic Web
Wolfram|Alpha is an interesting query answering system developed by Wolfram Research that is a blend of a question answering system and a Semantic Web alternative. It tries to interpret and answer queries expressed as a sequence of words from a large collection of interlinked tables. Oh, and Mathematica is thrown in for free. A free Web version was released last spring.
The news today is that Wolfram|Alpha has released an API, as noted in their blog:
“The API allows your application to interact with Wolfram|Alpha much like you do on the web—you send a web request with the same query string you would type into Wolfram|Alpha’s query box and you get back the same computed results. It’s just that both are in a form your application can understand. There are plenty of ways to tweak and control the results, as well.”
The pricing plan runs from $60/month for 1,000 queries (6 cents a query) to $220K for up to 10M queries/month (2.2 cents a query). Programming language bindings are available for Java, PHP, Perl, Python, Ruby and .NET.
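The per-query figures follow directly from the two price points. A quick check, assuming the $220K figure is a monthly price like the $60 one and that the full query allowance is used (the function name is ours):

```python
def cents_per_query(monthly_price_usd, queries_per_month):
    """Effective cost per query in cents, assuming the whole allowance is used."""
    return 100 * monthly_price_usd / queries_per_month


# The two tiers quoted in the post.
low_tier = cents_per_query(60, 1_000)
high_tier = cents_per_query(220_000, 10_000_000)

if __name__ == "__main__":
    print(f"{low_tier:.1f} cents/query")   # 6.0 cents/query
    print(f"{high_tier:.1f} cents/query")  # 2.2 cents/query
```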
Their original web interface remains free, but the TOS specifies that it “may be used only by a human being using a conventional web browser to manually enter queries one at a time.”
October 6th, 2009, by Tim Finin, posted in Machine Learning, Privacy, Semantic Web, Social media
In the fall of 2007, two MIT students carried out a class project exploring how presumably private data could be inferred from an online social networking system. Their experiment was to predict the sexual orientation of Facebook users who make their basic information public by analyzing friendship associations. As reported in the Boston Globe last month, the students had not yet published their results.
Well, now they have: the paper appears in the October issue of First Monday, “one of the first openly accessible, peer–reviewed journals on the Internet”.
The paper has a lot of detail on the methodology for collecting the data and how it was analyzed. Here’s the abstract.
“Public information about one’s coworkers, friends, family, and acquaintances, as well as one’s associations with them, implicitly reveals private information. Social networking Web sites, e–mail, instant messaging, telephone, and VoIP are all technologies steeped in network data — data relating one person to another. Network data shifts the locus of information control away from individuals, as the individual’s traditional and absolute discretion is replaced by that of his social network. Our research demonstrates a method for accurately predicting the sexual orientation of Facebook users by analyzing friendship associations. After analyzing 4,080 Facebook profiles from the MIT network, we determined that the percentage of a given user’s friends who self–identify as gay male is strongly correlated with the sexual orientation of that user, and we developed a logistic regression classifier with strong predictive power. Although we studied Facebook friendship ties, network data is pervasive in the broader context of computer–mediated communication, raising significant privacy issues for communication technologies to which there are no neat solutions.”
As we had previously noted, this data mining exercise only accesses information that Facebook users explicitly choose to make public. The authors note that their analysis “relies on public self–identification of same–gender interest in Facebook profiles as a sentinel value for LGB identity”. The privacy vulnerability is that, by default, the friendship relations of a Facebook account are public, and you cannot control the privacy settings of your friends. So if you leave your friend list public and many of your Facebook friends open up their profiles, it may be possible to draw reasonable inferences about your age, gender, political leanings, sexual preferences and other attributes.
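To make the general idea concrete, here is a toy sketch of a one-feature logistic regression of the kind the abstract mentions. It is not the authors' code, model, or data: the attribute is generic, the ten data points are invented purely for illustration, and the plain gradient-descent fitting routine is our own assumption. The single feature is the fraction of a user's friends who publicly declare some attribute.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def fit_logistic(xs, ys, lr=0.5, epochs=5000):
    """Fit p(y=1 | x) = sigmoid(w*x + b) by batch gradient descent on log loss."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(epochs):
        grad_w = grad_b = 0.0
        for x, y in zip(xs, ys):
            err = sigmoid(w * x + b) - y  # derivative of log loss w.r.t. the logit
            grad_w += err * x
            grad_b += err
        w -= lr * grad_w / n
        b -= lr * grad_b / n
    return w, b


# Invented toy data, NOT from the paper.
# x: fraction of a user's friends who publicly declare the attribute.
# y: 1 if the user has the attribute, else 0.
FRACTIONS = [0.00, 0.01, 0.02, 0.03, 0.05, 0.08, 0.10, 0.15, 0.20, 0.30]
LABELS = [0, 0, 0, 0, 1, 0, 1, 1, 1, 1]

W, B = fit_logistic(FRACTIONS, LABELS)


def predict(fraction):
    """Estimated probability that a user has the attribute."""
    return sigmoid(W * fraction + B)


if __name__ == "__main__":
    for f in (0.0, 0.05, 0.30):
        print(f"{f:.2f} -> {predict(f):.2f}")
```

The point of the sketch is only that the input is entirely public information about other people, which is why tightening your own profile settings does not close the hole.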
September 21st, 2009, by Tim Finin, posted in AI, Machine Learning, Semantic Web, Social media
Netflix announced today that BellKor’s Pragmatic Chaos team was awarded the $1M Netflix Grand Prize.
“It is our great honor to announce the $1M Grand Prize winner of the Netflix Prize contest as team BellKor’s Pragmatic Chaos for their verified submission on July 26, 2009 at 18:18:28 UTC, achieving the winning RMSE of 0.8567 on the test subset. This represents a 10.06% improvement over Cinematch’s score on the test subset at the start of the contest. We congratulate the team of Bob Bell, Martin Chabbert, Michael Jahrer, Yehuda Koren, Martin Piotte, Andreas Töscher and Chris Volinsky for their superb work advancing and integrating many significant techniques to achieve this result.”
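For readers who have not followed the contest, RMSE is the root mean squared error between predicted and actual ratings, so lower is better. The sketch below defines it and back-computes the baseline score implied by the two figures in the announcement. The baseline itself is not quoted above, so treat the derived value as an inference from 0.8567 and 10.06%, and the function names as our own.

```python
import math


def rmse(predicted, actual):
    """Root mean squared error between two equal-length rating sequences."""
    if len(predicted) != len(actual) or not predicted:
        raise ValueError("need two non-empty sequences of equal length")
    squared_error = sum((p - a) ** 2 for p, a in zip(predicted, actual))
    return math.sqrt(squared_error / len(actual))


def implied_baseline(winning_rmse, improvement):
    """Baseline RMSE implied by a relative improvement: win = base * (1 - improvement)."""
    return winning_rmse / (1.0 - improvement)


# Figures quoted in the announcement: RMSE 0.8567, a 10.06% improvement.
baseline = implied_baseline(0.8567, 0.1006)

if __name__ == "__main__":
    print(rmse([2.0, 4.0], [1.0, 5.0]))  # 1.0
    print(round(baseline, 4))
```

The Grand Prize threshold was a 10% improvement, which is why such small differences in the fourth decimal place mattered.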
Netflix announced that it will hold a new Netflix Prize 2 contest with details to be released.
What about the Ensemble’s last-minute entry, the one that seemed to top BellKor’s?
“Team BellKor’s Pragmatic Chaos edged out team The Ensemble with the winning submission coming just 24 minutes before the conclusion of the nearly three-year-long contest. Historically the Leaderboard has only reported team scores on the quiz subset. The Prize is awarded based on teams’ test subset score. Now that the contest is closed we will be updating the Leaderboard to report team scores on both the test and quiz subsets.”
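The “24 minutes” can be checked against the two timestamps Netflix published: the winning submission time quoted above and the contest closing time quoted in the July 26th post further down this page. A small sketch using only the standard library:

```python
from datetime import datetime, timezone

# Timestamps as quoted in the Netflix announcements (both UTC).
winning_submission = datetime(2009, 7, 26, 18, 18, 28, tzinfo=timezone.utc)
contest_close = datetime(2009, 7, 26, 18, 42, 37, tzinfo=timezone.utc)

margin = contest_close - winning_submission

if __name__ == "__main__":
    print(margin)  # 0:24:09
```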
As part of the final submission, teams were required to submit papers describing the approach. Here are the three that the winning team delivered.
The New York Times Bits blog also has an article, Netflix Awards $1 Million Prize and Starts a New Contest.
September 4th, 2009, by Tim Finin, posted in AI, Humor
September 3rd, 2009, by Tim Finin, posted in NLP, Semantic Web, Search
HealthBase is a ‘semantic search engine’ for healthcare information that is driven by content mined from “millions of authoritative health sources” including WebMD, Wikipedia, PubMed, and Mayo Clinic’s health site. TechCrunch first described it as the ultimate medical content search engine but then had a follow-up article reporting that HealthBase thinks you can get rid of Jews with alcohol and salt. Language Log had some more fun exploring HealthBase.
We thought we’d see what HealthBase thought of the Semantic Web, and it turns out that if you are experiencing the Semantic Web as a condition, there are several recommended treatments.

And as a treatment itself, the Semantic Web gets a pretty positive assessment from HealthBase.

August 23rd, 2009, by Tim Finin, posted in AI
Sean Luke has made available an open set of lecture notes on metaheuristic algorithms, Essentials of Metaheuristics. Sean defines a metaheuristic as
“A common but unfortunate name for any stochastic optimization algorithm intended to be the last resort before giving up and using random or brute-force search. Such algorithms are used for problems where you don’t know how to find a good solution, but if shown a candidate solution, you can give it a grade. The algorithmic family includes genetic algorithms, hill-climbing, simulated annealing, ant colony optimization, particle swarm optimization, and so on.”
Such AI algorithms are also often called weak methods, but I like the term metaheuristic better.
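As a concrete instance of “if shown a candidate solution, you can give it a grade”, here is a minimal hill-climbing sketch. It is our own illustration rather than code from Sean’s notes; the grading function, step size, and iteration count are arbitrary assumptions.

```python
import random


def hill_climb(grade, start, step=0.1, iterations=2000, seed=0):
    """Repeatedly tweak the current best candidate and keep the tweak if it grades higher."""
    rng = random.Random(seed)  # seeded so runs are repeatable
    best, best_grade = start, grade(start)
    for _ in range(iterations):
        candidate = best + rng.uniform(-step, step)
        candidate_grade = grade(candidate)
        if candidate_grade > best_grade:
            best, best_grade = candidate, candidate_grade
    return best, best_grade


def toy_grade(x):
    """A made-up grading function with a single peak at x = 3."""
    return -(x - 3.0) ** 2


if __name__ == "__main__":
    solution, score = hill_climb(toy_grade, start=0.0)
    print(solution, score)
```

Note that the climber never looks inside toy_grade; it only asks for grades. On a function with several peaks this simple version can get stuck on a local one, which is what techniques like simulated annealing try to address.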
The lecture notes look great and the chapters can be used independently for self study or to augment topics in a graduate or undergraduate course. Thanks Sean!
(via Don Miner.)
August 21st, 2009, by Tim Finin, posted in AI, Agents, Semantic Web, Social media, Technology Impact
RAEng report on Social, legal and ethical issues of autonomous systems
The Royal Academy of Engineering has released a report on the social, legal and ethical issues involving autonomous systems — systems that are adaptive, learn and can make decisions without the intervention or supervision of a human.
The report, Autonomous Systems: Social, Legal and Ethical Issues (pdf), was based on a roundtable discussion “from a wide range of experts, looking at the areas where autonomous systems are most likely to emerge first, and discussing the broad ethical issues surrounding their uptake.”
While autonomous systems have broad applicability, the report focuses on two areas: transportation (e.g., autonomous road vehicles) and personal care (e.g., smart homes).
“Autonomous systems, such as fully robotic vehicles that are “driverless” or artificial companions that can provide practical and emotional support to isolated people, have a level of self-determination and decision making ability with the capacity to learn from past performance. Autonomous systems do not experience emotional reactions and can therefore perform better than humans in tasks that are dull, risky or stressful. However they bring with them a new set of ethical problems. What if unpredicted behaviour causes harm? If an unmanned vehicle is involved in an accident, who is responsible – the driver or the systems engineer? Autonomous vehicles could provide benefits for road transport with reduced congestion and safety improvements but there is a lack of a suitable legal framework to address issues such as insurance and driver responsibility.
…
The technologies for smart homes and patient monitoring are already in existence and provide many benefits to older people, such as allowing them to remain in their own home when recovering from an illness, but they could also lead to isolation from family and friends. Some users may be unfamiliar with the technologies and be unable to give consent to their use.”
The RAEng report recommends “engaging early in public consultation” and working to establish “appropriate regulation and governance so that controls are put in place to guide the development of these systems”.
rdf:SeeAlso Autonomous tech ‘requires debate’; Scientists ponder rules and ethics of robo helpers; Robot cats could care for older Britons.
(via Mike Wooldridge)
July 27th, 2009, by Tim Finin, posted in AI, Machine Learning, Social media, Web
Who won the Netflix Prize? The Ensemble or BellKor’s Pragmatic Chaos?
Who won the Netflix Prize? According to a post in the NYT Bits blog, Netflix Challenge Ends, But Winner Is In Doubt, it’s still very much up in the air.
“So The Ensemble won, right? Not necessarily. In an e-mail message Sunday night, Chris Volinsky, a scientist at AT&T Research and a leader of the BellKor’s team, said: “Our team is in first place as we were contacted by Netflix to validate our entry.” And in an online forum, another member of the BellKor team, Yehuda Koren, a researcher for Yahoo in Israel, said his team had “a better Test score than The Ensemble,” despite what the rival team submitted for the leaderboard.
So is BellKor the winner? Certainly not yet, according to a Netflix spokesman, Steve Swasey. “There is no winner,” he said.
A winner, Mr. Swasey said, will probably not be announced until sometime in September at an event hosted by Reed Hastings, Netflix’s chief executive. The movie rental company is not holding off for maximum P.R. effect, Mr. Swasey said, but because the winner has not yet been determined.
The Web leaderboard, he explained, is based on what the teams submit. Next, Netflix’s in-house researchers and outside experts have to validate the teams’ submissions, poring over the submitted code, design documents and other materials. “This is really complex stuff,” Mr. Swasey said.
A leading member of The Ensemble, Domonkos Tikk, a Hungarian computer scientist, did not sound too hopeful. “We didn’t get any notification from Netflix,” Mr. Tikk said in a phone interview from Hungary. “So I think the chances that we won are very slight. It was a nice try.”
It seems strange that Netflix called the BellKor team first, since according to the Leaderboard the Ensemble team submitted the top entry.
UPDATE 7/28: Today’s NYT has a good article on the Netflix Prize and the role of teamwork in developing machine learning systems, Netflix Competitors Learn the Power of Teamwork.
July 26th, 2009, by Tim Finin, posted in AI, Machine Learning, Social media, Web
Netflix has announced that the Netflix Prize contest is now closed. Presumably, The Ensemble is the winner, subject to final qualification.
“We are delighted to report that, after almost three years and more than 43,000 entries from over 5,100 teams in over 185 countries, the Netflix Prize Contest stopped accepting entries on 2009-07-26 18:42:37 UTC. The closing of the contest is in accordance with the Rules — thirty (30) days after a submitted prediction set achieved the Grand Prize qualifying RMSE on the quiz subset.
…
Qualified entries will be evaluated as described in the Rules. We look forward to awarding the Grand Prize, which we expect to announce in a few weeks. However if a Grand Prize cannot be awarded because no submission can be verified by the judges, the Contest will reopen. We will make an announcement on the Forum after the Contest judges reach a decision.”
So what’s left for the judges to do? The rules say that “a panel of senior Netflix engineers and qualified independent judges” need to “ensure that the provided algorithm description and source code could reasonably have generated the prediction sets submitted”. To do this, the candidate winner must produce the algorithm along with a description of how it works. And, of course, before receiving the prize the winner has to grant Netflix
“an irrevocable, royalty free, fully paid up, worldwide non-exclusive license under the Participants’ copyrights, patents or other intellectual property rights in the winning algorithm (“Winning Algorithm”) to reproduce, distribute, display, and create derivative works from the Winning Algorithm and also to make, have made, use, sell, offer for sale, and import products that would otherwise infringe the Winning Algorithm.”
The Netflix Prize was a great idea and generated a lot of interest around the world. It’s been good for the field of AI and its machine learning sub-field, especially. Congratulations to the Ensemble team and condolences to BellKor’s Pragmatic Chaos. I wish there could have been two winners.
UPDATE 7/27: Wait! The winner is still in doubt.