Zombie apocalypse on the Internet

October 21st, 2008

John Markoff has an article on botnets, A Robot Network Seeks to Enlist Your Computer, in today’s New York Times. It focuses on the efforts that Microsoft is taking to combat the botnet problem.

“In a windowless room on Microsoft’s campus here, T. J. Campana, a cybercrime investigator, connects an unprotected computer running an early version of Windows XP to the Internet. In about 30 seconds the computer is “owned.” An automated program lurking on the Internet has remotely taken over the PC and turned it into a “zombie.” That computer and other zombie machines are then assembled into systems called “botnets” — home and business PCs that are hooked together into a vast chain of cyber-robots that do the bidding of automated programs to send the majority of e-mail spam, to illegally seek financial information and to install malicious software on still more PCs.

“The mean time to infection is less than five minutes,” said Richie Lai, who is part of Microsoft’s Internet Safety Enforcement Team, a group of about 20 researchers and investigators.”

One item I found interesting is that some botnet programs have their own ‘antivirus software’ to eliminate any competition and even use standard measures to keep their newly acquired machine safe.

“Mr. Campana said the Microsoft investigators were amazed recently to find a botnet that turned on the Microsoft Windows Update feature after taking over a computer, to defend its host from an invasion of competing infections.”

Motorola developing social-network-friendly Android mobile phone

October 20th, 2008

My Treo 650 is long in the tooth and I’m anxious to replace it. I’d love an iPhone, but am not ready to switch service providers and am also somewhat wary of its closed nature. So an Android-based phone is intriguing. Now here is an interesting development: BusinessWeek reports that Motorola Readies Its Own Android Social Smartphone:

“As the wireless world awaits the Oct. 22 debut of the first phone based on the Google-backed Android software, engineers at Motorola (MOT) are hard at work on their own Android handset. Motorola’s version will boast an iPhone-like touch screen, a slide-out qwerty keyboard, and a host of social-network-friendly features, BusinessWeek.com has learned.”

This is a bit of a no-brainer, and the iPhone is sure to support social media, probably well before these Motorola phones hit the street, which is expected in the second quarter of 2010. The BusinessWeek article notes that:

“In the next year, social networking phones are expected to be a hit with the 16- to 34-year-old crowd, analysts say. According to consultancy Informa (INF), the number of mobile social-networking users will rise from 2.3% of global cell-phone users at the end of 2007 to as many as 23% of all mobile users by the end of 2012.”

Andrew Sullivan on why he blogs

October 19th, 2008

Andrew Sullivan has an article, Why I Blog, in the November issue of The Atlantic in which he talks about blogging and why he does it.

“From the first few days of using the form, I was hooked. The simple experience of being able to directly broadcast my own words to readers was an exhilarating literary liberation. Unlike the current generation of writers, who have only ever blogged, I knew firsthand what the alternative meant. I’d edited a weekly print magazine, The New Republic, for five years, and written countless columns and essays for a variety of traditional outlets. And in all this, I’d often chafed, as most writers do, at the endless delays, revisions, office politics, editorial fights, and last-minute cuts for space that dead-tree publishing entails. Blogging—even to an audience of a few hundred in the early days—was intoxicatingly free in comparison. Like taking a narcotic.”

Sullivan is a good writer and an early adopter of the blogging form. He is often controversial, unusually provocative, and worth reading.

Jim Parker to defend dissertation: Detecting Malicious Behavior in Ad-hoc Networks, 9am 10/23/08

October 19th, 2008

This Thursday (9am 10/23) UMBC Ph.D. student Jim Parker will defend his dissertation on Observation Techniques for Detecting Malicious Behavior in Ad-hoc Networks. Detecting malicious behavior in MANETs is a tricky problem on which Jim has made considerable headway. Here’s his abstract.

A mobile ad-hoc network (MANET) is a collection of wireless, self-organizing nodes, each capable of routing network traffic and having the ability to be mobile. A MANET has no central authority nor fixed network infrastructure, and the dynamic nature and openness of MANETs lead to potential vulnerabilities. Since there is no guarantee of connection to the wired Internet, accepted security practices involving third-party authentication servers become an unrealistic expectation. Even with authentication, there is the potential for abuse.

Our research has focused on being the “eyes and ears” for trust evaluation. We have developed an extensive simulation to investigate the viability of detecting malicious and faulty node behavior in MANETs. We first show detection capability at the network layer and introduce two techniques for reacting to malicious behavior. We then demonstrate detection using information from multiple layers of the OSI stack. Finally, we tie everything together by combining the detection techniques with a field communications scenario.

Sarah Palin defeats bot in Loebner Prize competition

October 14th, 2008

I guess this is the ultimate question for a Turing Test. At least for this Fall.

Reporter Will Pavia of The Times was one of the judges at the 2008 Loebner Prize competition last week. In a story in The Times yesterday, Machine takes on man at mass Turing Test, he revealed the question that gave away one of the cold, lifeless, mechanical bots.

“The other correspondent was undoubtedly a robot. I asked it for its opinion on Sarah Palin, and it replied: ‘Sorry, don’t know her.’ No sentient being could possibly answer in this way.”

Of course, this could have been an ironic response from a clever person who was mocking VP candidate Palin’s stock question of “Who is Barack Obama?”.

(Spotted on Language Log.)

Akshay Java on Mining Social Media Communities and Content

October 14th, 2008

Akshay Java will defend his dissertation, Mining Social Media Communities and Content, at 10:30am this Thursday in ITE 325. Here’s the abstract.

Social Media is changing the way we find information, share knowledge and communicate with each other. The important factor contributing to the growth of these technologies is the ability to easily produce “user-generated content”. Blogs, Twitter, Wikipedia, Flickr and YouTube are just a few examples of Web 2.0 tools that are drastically changing the Internet landscape today. These platforms allow users to produce, annotate and share information with their social network. Their combined content accounts for nearly four to five times that of edited text being produced each day on the Web. Given the vast amount of user-generated content and easy access to the underlying social graph, we can now begin to understand the nature of online communication and collaboration in social applications. This thesis presents a systematic study of the social media landscape through the combined analysis of its special properties, structure and content.

First, we have developed techniques to effectively mine content from the blogosphere. The BlogVox opinion retrieval system is a large scale blog indexing and content analysis engine. For a given query term, the system retrieves and ranks blog posts expressing sentiments (either positive or negative) towards the query terms. We evaluate the system on a large, standard corpus of blogs with available human-verified relevance assessments for opinions. Further, we have developed a framework to index and semantically analyze syndicated feeds from news websites. This system semantically analyzes news stories and builds a rich fact repository of knowledge extracted from real-time feeds.

Communities are an essential element of social media systems and detecting their structure and membership is critical in several real-world applications. Many algorithms for community detection are computationally expensive and generally do not scale well for large networks. In this work we present an approach that benefits from the scale-free distribution of node degrees to extract communities efficiently. Social media sites frequently allow users to provide additional meta-data about the shared resources, usually in the form of tags or folksonomies. We have developed a new community detection algorithm that can combine information from tags and the structural information obtained from the graphs to detect communities. We demonstrate how structure and content analysis in social media can benefit from the availability of rich meta-data and special properties.

Finally, we study social media systems from the user perspective. We present an analysis of how a large population of users subscribes and organizes the blog feeds that they read. It has revealed several interesting properties and characteristics of the way we consume information. With this understanding, we describe how social data can be leveraged for collaborative filtering, feed recommendation and clustering. Recent years have seen a number of new social tools emerge. Microblogging is a new form of communication in which users can describe their current status in short posts distributed by instant messages, mobile phones, email or the Web. We present our observations of the microblogging phenomena and user intentions by studying the content, topological and geographical properties of such communities.

The course of this study spans an interesting period in Web’s history. Social media is connecting people and building online communities by bridging the gap between content production and consumption. Through our research, we have highlighted how social media data can be leveraged to find sentiments, extract knowledge and identify communities. Ultimately, this helps us understand how we communicate and interact in online, social systems.

Bots fail to win Loebner Prize, Elbot takes bronze

October 13th, 2008

None of the six bots that made the Loebner Prize Competition finals won the prize, but Fred Roberts’ Elbot was declared the best of the lot, winning a bronze medal. Only five of the bots managed to start.

Apparently the sixth was busy elsewhere, rumored to be furiously buying and selling Credit Default Swaps on the weekend market.

The Guardian reports that

Elbot emerged as the winner, after scooping a 25% success rate at convincing the judges that it was actually human. That’s not enough to please the ghost of Turing, but it was enough to pick up Elbot’s owner, Fred Roberts, a cash prize. Fred’s invention had a few tricks up its sleeve, including trying to throw the judges off their game by explicitly referring to itself as a machine.

“Hi. How’s it going?” one judge began.

“I feel terrible today,” Elbot replied. “This morning I made a mistake and poured milk over my breakfast instead of oil, and it rusted before I could eat it.”

The BBC has a video on the competition.

SMOOTH: an efficient method for probabilistic knowledge integration

October 12th, 2008

In this week’s ebiquity meeting (10:30am Tue Oct 14), PhD student Shenyong Zhang will present his recent work with Yun Peng on SMOOTH, a new, efficient method for modifying a joint probability distribution to satisfy a set of inconsistent constraints. It extends the well-known “iterative proportional fitting procedure” (IPFP), which works only with consistent constraints. Compared to existing methods, SMOOTH is computationally more efficient and insensitive to the data. Moreover, SMOOTH can be easily integrated with Bayesian networks for Bayesian reasoning with inconsistent constraints. A paper on this work, An Efficient Method for Probabilistic Knowledge Integration, will appear in the proceedings of The 20th IEEE International Conference on Tools with Artificial Intelligence next month.
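As background, the classic IPFP that SMOOTH extends repeatedly rescales a joint distribution so that each marginal matches its constraint in turn; with consistent constraints it converges to the distribution closest (in KL divergence) to the starting joint. Here is a minimal Python sketch for a 2×2 joint with made-up, consistent marginal constraints (SMOOTH’s handling of inconsistent constraints is not shown):

```python
# Sketch of the iterative proportional fitting procedure (IPFP) on a
# 2x2 joint distribution P(X, Y), with marginal constraints on X and Y.
# The numbers below are invented for illustration only.

def ipfp(joint, row_marginal, col_marginal, iters=100):
    """Rescale `joint` until its row and column marginals match the
    given (consistent) constraints."""
    p = [row[:] for row in joint]
    for _ in range(iters):
        # Scale each row so the X marginal matches its constraint.
        for i, target in enumerate(row_marginal):
            s = sum(p[i])
            if s > 0:
                p[i] = [v * target / s for v in p[i]]
        # Scale each column so the Y marginal matches its constraint.
        for j, target in enumerate(col_marginal):
            s = sum(p[i][j] for i in range(len(p)))
            if s > 0:
                for i in range(len(p)):
                    p[i][j] *= target / s
    return p

p = ipfp([[0.3, 0.2], [0.1, 0.4]], [0.6, 0.4], [0.5, 0.5])
```

Each pass matches one set of marginals exactly while slightly disturbing the other; alternating the two drives the joint toward satisfying both, which is exactly where inconsistent constraints make the plain procedure break down.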

NLTK: a natural language processing toolkit in Python

October 11th, 2008

NLTK looks very useful.

“NLTK — the Natural Language Toolkit — is a suite of open source Python modules, data and documentation for research and development in natural language processing. NLTK contains code supporting dozens of NLP tasks, along with 40 popular corpora and extensive documentation, including a 375-page online book. Distributions for Windows, Mac OSX and Linux are available.”

The development of NLTK is led by Steven Bird, Edward Loper, and Ewan Klein.

(Spotted on Language Log.)

Earn $100 by designing an ontology for one of these domains

October 11th, 2008

This offer just showed up in a Google alert triggered by its mention of Swoogle. Some poor Australian student (poor in ethics and ability, not money) is willing to pay $100 to have someone do his project for a Semantic Web course.

homeworkanytimehelp4 is behind on several assignments and in a bit of a fix. He needs his ontology assignment done by 12 October, just two days after he posted his offer.

Is this cheating? Well, the studentOfFortune.com site has thought deeply about this, and it turns out that it’s not.

Q: It still seems like cheating
A: We’ve thought long and hard about this. We believe that users who write solutions which not only help provide answers but also help teach how the answers were achieved will be the solutions that are purchased more often than not. And for that reason, we believe that Student of Fortune is a teaching and research tool, not a tool for cheating. But it’s up to you how you use it. We’re not going to judge you. We’re just here to help.

Times are hard right now. If you are tempted to help homeworkanytimehelp4, you owe it to yourself to find out if the dollars are USD or AUD.

NRC study questions use of datamining for counterterrorism

October 7th, 2008

The National Research Council released a report on the effectiveness of collecting and mining personal data, such as phone, medical, and travel records or Web sites visited, as a tool for combating terrorism. The report, titled Protecting Individual Privacy in the Struggle Against Terrorists: A Framework for Program Assessment, was produced by a multi-year study carried out at the request of DHS and NSF.

While the NRC’s press release on the study notes that routine datamining can help in “expanding and speeding traditional investigative work”, it questions the effectiveness of automated datamining and behavioral surveillance.

“Far more problematic are automated data-mining techniques that search databases for unusual patterns of activity not already known to be associated with terrorists, the report says. Although these methods have been useful in the private sector for spotting consumer fraud, they are less helpful for counterterrorism precisely because so little is known about what patterns indicate terrorist activity; as a result, they are likely to generate huge numbers of false leads. Such techniques might, however, have some value as secondary components of a counterterrorism system to assist human analysts. Actions such as arrest, search, or denial of rights should never be taken solely on the basis of an automated data-mining result, the report adds.
    The committee also examined behavioral surveillance techniques, which try to identify terrorists by observing behavior or measuring physiological states. There is no scientific consensus on whether these techniques are ready for use at all in counterterrorism, the report says; at most they should be used for preliminary screening, to identify those who merit follow-up investigation. Further, they have enormous potential for privacy violations because they will inevitably force targeted individuals to explain and justify their mental and emotional states.”

The report suggested criteria and questions addressing both the technical effectiveness as well as impact on privacy to help agencies and policymakers evaluate data-based counterterrorism programs. It also calls for oversight and both technical and policy safeguards to protect privacy and prevent “mission creep”. Declan McCullagh has a good summary of the key recommendations.

The 352-page report can be downloaded from the National Academies Press site for $37.00.

Chatterbots vie for $100K Loebner Prize

October 5th, 2008

On Sunday October 12, six computer chatterbots will sit down with six human judges at the University of Reading and try to convince them that they are not machines, but humans. The winner might take away the grand Loebner Prize worth $100,000. The Loebner Prize competition is a modified and simplified Turing test intended as a measure of machine intelligence. Here’s how Wikipedia describes it.

“The Loebner Prize is an annual competition that awards prizes to the Chatterbot considered by the judges to be the most humanlike of those entered. The format of the competition is that of a standard Turing test. In the Loebner Prize, as in a Turing test, a human judge is faced with two computer screens. One is under the control of a computer, the other is under the control of a human. The judge poses questions to the two screens and receives answers. Based upon the answers, the judge must decide which screen is controlled by the human and which is controlled by the computer program.”

This year, the competition is taking place at Reading under the direction of Professor Kevin Warwick. The thirteen initial entries have been reduced to six finalists:

Jeremy Gardiner
Qiong John Li
Peter Cole & Benji Adams
Elizabeth Perreau
Simon Edwards
Robert Scott Mitchell

The competition was started in 1990 by Hugh Loebner, who put up a set of cash prizes, including one worth $100,000 for the “first chatterbot that judges cannot distinguish from a real human in a Turing test that includes deciphering and understanding text, visual, and auditory input.” A fact of local interest is that Hugh Loebner worked at UMBC as the assistant director of computing in the 1980s. He left UMBC to run his family’s business, which at the time was doing well manufacturing roll-up disco dance floors for parties.

Over the years the Loebner Prize competition has come under considerable criticism from the AI research community. A common opinion among AI researchers is that the competition is more about publicity than science and encourages people to do well by exploiting tricks and competition-specific strategies rather than working on the fundamental problems underlying the development of intelligent machines. This article in Salon, Artificial stupidity, summarizes the positions.

Here are some stories on the 2008 Loebner Prize competition in the press: ‘Intelligent’ computers put to the test, Invasion of ‘human’ robots, and Artificial Conversational Entities: Can A Machine Act Human and Be Given ‘Rights’?.