Parsing Perl considered undecidable

January 28th, 2008

Jeffrey Kegler on offers a proof that static parsing of Perl programs is undecidable in his post Perl Cannot Be Parsed: A Formal Proof. The approach is to show that parsing a tricky example (one short line of code!) can be reduced to solving the Halting problem. Here’s a piece of it.

Kennedy’s Lemma: If you can parse Perl, you can solve the Halting Problem.

To prove Kennedy’s Lemma, we assume that we can parse Perl. In particular this means we can take the following devilish snippet of code, concocted by Randal Schwartz, and determine the correct parse for it:

    whatever / 25 ; # / ; die "this dies!";

Schwartz’s Snippet can parse two different ways: if whatever is nullary (that is, takes no arguments), the first statement is a division in void context, and the rest of the line is a comment. If whatever takes an argument, Schwartz’s Snippet parses as a call to the whatever function with the result of a match operator, then a call to the die() function.

See his post for the complete proof.

Duncan Watts on influence, tipping points and marketing

January 27th, 2008

Fastcompany has a long article, Is the Tipping Point Toast?, on social-network researcher Duncan Watts, who’s on leave from his position as Professor of Sociology at Columbia and working for Yahoo Research. The article focuses on Watt’s challenges to the importance of “influentials” typified by Maclom Gladwell’s popular book, The Tipping Point.

“In the past few years, Watts–a network-theory scientist who recently took a sabbatical from Columbia University and is now working for Yahoo –has performed a series of controversial, barn-burning experiments challenging the whole Influentials thesis. He has analyzed email patterns and found that highly connected people are not, in fact, crucial social hubs. He has written computer models of rumor spreading and found that your average slob is just as likely as a well-connected person to start a huge new trend. And last year, Watts demonstrated that even the breakout success of a hot new pop band might be nearly random. Any attempt to engineer success through Influentials, he argues, is almost certainly doomed to failure.” link

According to the article, Watts work at Yahoo Research is refining the concept of big-seed marketing that he and Jonah Peretti proposed in a note in HBR, Viral Marketing for the Real World. The idea is to marry “viral-marketing tools with old-fashioned mass media in a way that yields far more predictable results than “purely” viral approaches like word-of-mouth marketing”.

Anonymous, leaderless resistance and Scientology

January 26th, 2008

Leaderless resistance is defined on Wikipedia as

“…a political resistance strategy in which small, independent groups (covert cells) challenge an established adversary such as a government. Leaderless resistance can encompass anything from non-violent disruption and disobedience to bombings, assassinations and other violent agitation. Leaderless cells lack bidirectional, vertical command links operating without a hierarchical command.” (link)

It’s challenging to combat a leaderless resistance because one can’t use the usual methods to discover participants by exploiting the social networks of known members.

Today’s new communication infrastructures make it easier for such distributed resistance movements to take hold and grow. Information, instructions and loose coordination can be spread via Web pages, Blogs, text messages, IRCs, mailing lists, etc.

A colleague Chris Diehl at JHU APL suggested the Estonian cyberwar might be a good example to study how the Blogosphere was used for this by combining sentiment analysis, geotagging and temporal analysis. This cyber attack was a subject of a recent colloquium at APL. It’s a great idea, but one made more challenging by the fact that the attack is over and would involve dealing with content in Estonian, which, although not exactly a low-density language, is also not one that has been extensively studied by computational linguists.

But maybe there is another example of an Internet-driven leaderless resistance, going on right now, that would be good to study as it unfolds. A group that calls itself Anonymous has announced it intends to launch an online DDOS attack on Scientology as part of a campaign against the organization.


The message is spread in part by YouTube videos starting on 21 January. There is also the Wikipedia page on Project Chanology which was created on 24 January 2008, an Anonymous Scientology Widget that counts down to (I suppose) when participating members should take action, and lost of mentions on forums, blogs and other forms of social media.

Linuxhaxor has instructions for what to do, which are offered only for educational purposes.

“This guide is for information purpose only, I, the site owner, do not encourage people to go about and follow these steps or Chanology in anyway to carry this attack, or any attack to any organization or any person. If you agree to follow these steps and help them carry this attack you are fully responsible for any consequences whatsoever. This act is illegal in many states and countries. ”

Wired just ran a story on this leaderless resistance effort, Anonymous Hackers Shoot For Scientologists, Hit Dutch School Kids, and there are plenty more online.

Finally, you can track the online interest through this Blogpulse trend graph comparing Blogosphere mentions of (1) “Tom Cruise” (2) Scientology and (3) anonymous+scientology and also the Google Trends graph comparing Google searches for the same three terms. Click on the graphs to see the current results.

Mentions of scientology, tom cruise and anonymous via Blogpulse

Google searches for scientology, Tom Cruise and anonymous

Tom Cruise is in there because he’s rumored to be the second most important person in the Church of Scientology and his recent Scientology indoctrination video that surfaced on YouTube may have been the tipping point for some.

UMBC offers class on Anatomy of a Video Game in Spring 2008

January 25th, 2008

This Spring UMBC will mount our first “regular” undergraduate class as part of its new programs on games, animation and interactive media. The class, Anatomy of a Video Game, will be taught by UMBC Alumna Katie Hirsch, who graduated with dual degrees in Computer Science and Visual Arts and who works at and Breakaway Games in Hunt Valley MD.

“This class dissects the process of developing a video game from an introductory perspective. The class will give artist and programmers an opportunity to focus on their specific areas of interest within the development pipeline while learning to work across their disciplines. The class will include production and design as well as art and programming specific topics.”

This course, as well as several others this spring, will take advantage of UMBC’s new GAIM Lab that is equipped with a generous gift of 20 Xbox consoles from Microsoft. shows your neighborhood news

January 25th, 2008 is another site that shows news, civic information and blogs posts relevant to a location. See the site for Catonsville MD, the area around UMBC for an example.

A VC blog calls this a hyperlocal site. (They are an investor). It’s similar to everyblock but it covers nearly 12,000 cities instead of three. The coverage, however, does not seem as deep.

They recognize the location of a blog posts as follows. A blogger registers he feed with and they monitor the posts. She geotags blog posts with the a location using one of four methods: (1) a link to a Google map with the location, (2) a blog category or tag that looks like a Zip code, (3) an inline text tag like or [where 1000 Hilltop Cir, Baltimore, MD 21250], or (4) geoRSS

everyblock shows what is new in your neighborhood

January 24th, 2008

Everyblock launched yesterday as a site that shows you news and other item that are about or relevant to your neighborhood. You enter your address, postal code or neighborhood name and see news articles, civic documents like crime reports and building permits, blog posts, craigslist entries, Yelp reviews, and Flickr photos associated with the area around it.

Currently only three cities are covered by everyblock — New York, Chicago and San Francisco. For an example, see the everyblock page for NYC’s Chelsea neighborhood or around the University of Chicago.

I suspect that much of the work that goes into a system like everyblock is selecting the right sources. For example, how you access online local government documents for each state or city will differ. You would want to mine the local newspapers for news items and focusing on them would make disambiguating geonames and addresses easy, since your first order approximation would be that every geo-reference is local. There is also work in adapting to the APIs for other services, like Yelp and craigslist, that each has its own way to sorting items into geographic regions.

There are good data sources for the GIS information and services, free or paid, that will do the geocoding and reverse geocoding of names and addresses. The Earth is a finite place and we have a lot of data about what is where, at least in the more developed parts of it.

Although everyblock claims to include relevant blog posts, I’ve not seen any yet. This is a harder problem, unless you get bloggers to add explicit geographic metadata or register their blogs with a location, like does.

Changes in the mailing list

January 21st, 2008

On Tuesday 22 January the agents mailing list ( will be offline between 21:00 and 23:00 UTC as we transition from Majordomo to GNU Mailman. Mail sent to the list at this time will bounce.

The agents list was begun in 1994 by Ray Johnson, then at the Lockheed Palo Alto AI Center and moved to UMBC in 1996. Majordomo represented the state of the art for mailing list software in 1996, but development stopped sometime around 2001. Moving to Mailman will make it easier for us to manage the list and let users manage a wider range of their own subscription options. The list currently has about 2000 subscribers.

If you are a subscriber to either the UMBC agents or agents-digest lists, your subscription will be transferred to the new Mailman-supported list. Subscribers to the old agents-digest list will get a daily digest of messages. Using the agents administration page you can elect to receive messages as they are sent or to get them in digest form. We’ve assigned subscribers random passwords, so you will need to recover your password before making any changes.

You can edit your Mailman configuration now, but we won’t start sending out mail using Mailman until the Tuesday evening. I’ll send out an announcement via the re-hosted list when I know it’s enabled.

An address entered in the Mailman admin page must match your subscribed address exactly. If you are not sure which of your email address is subscribed, check the message headers to see if that reveals it. Failing that, you can try asking the old system by sending poor old an email message with the command “which ” in the message body, where is a string you believe to be in your subscribed address. As a last resort, ask me for help (

You can continue to send mail to the list agents mailing list using the address agents at If the sending address is recognized as a subscriber, your message will distributed immediately and without moderation. Otherwise, you will be notified that your it awaits moderation, which might take a day or two.

In our old majordomo system, we maintained a separate list of additional pre-approved sending addresses. In general, if your sending address is not the same as your subscribed address, you should change the subscribed address. If you want to be able to send unmoderated messages from several accounts (e.g., your .edu and gmail accounts), you can always subscribe all of your accounts and disable email delivery for all but one.

Messages sent through the Mailman system will be available in an archive. The archive of old majordomo-era traffic is in disarray, but I think we have virtually all of the messages from 1994-2007. Eventually we’ll get it sorted out and online for posterity.

Our old moderation list was so inundated with spam and bounces from bad addresses that it became virtually impossible to moderate effectively. We anticipate that the new system will address both of these problems well and we will be thus be able to manage the moderation process better.

You can get more information about the list as well as manage subscriptions on the admin page and from the Mailman user guide. There are sure to be a few issues when we start using Mailman. If you have questions or suggestions about the list configuration, please let me know or send a message to the list if you think it should be of interest to the community.

Social Web Technologies, UMBC, Spring 2008, MW 7:10pm

January 20th, 2008

The success of social media sites like Facebook, Youtube, Myspace, and Flickr reflect a change in the way people use the Web. Current estimates are that over one third of all new material being added to the Web is user-generated content from sites like these.

Business leaders and VCs have taken note, and this is probably the most active area for new Internet startups and new ventures from established businesses. There’s a demand for people who understand the phenomenon, the technologies that make it work and the new technologies that will enable the next wave of successful services and sites. Who knows, maybe you can be the next Mark Zuckerberg. (If you do, please remember your alma mater.)

If you are still looking for a course to round out your Spring schedule, you might consider CMSC 491S/691S — ‘Social Web Technologies’. This special topics course will meet MW 7:10-8:25pm and be taught by Dr. Harry Chen, who received his Ph.D. from UMBC in 2004. For format will be seminar-style and project-oriented. It will cover (1) how social web technologies such as blogs, social networking, social bookmarking, photo/video sharing and folksonomies can improve the productivity of people and (2) how to apply the latest Web technologies (e.g., Machine learning, text mining, RDF, PhP, GIS, SOA, Javascript) to create social web applications. More information.

How Dr. Seuss would prove the halting problem undecidable

January 19th, 2008

I just discovered (via an extraordinary proof of the halting problem by linguist Geoffrey Pullum, now at the University of Edinburgh. What’s unusual about it is that it’s written as a poem in the style of Dr. Seuss.

Geoffrey K. Pullum, Scooping the loop snooper: An elementary proof of the undecidability of the halting problem. Mathematics Magazine 73.4 (October 2000), 319-320.

It’s a marvelous proof, sure to liven up any undergraduate theory of computation class. But I noticed errors in the proof — not logical errors, but a transcriptional ones in the form of a mangled word, perhaps introduced by an OCR system. The third line of the fifth stanza reads “that would take and program and call P (of course!)” which has problems in syntax, semantics, rhythm and meter. I’d guess it should be “that would take any program and call P (of course!)”. Similarly, “the” in the third line in the third stanza should probably be “they”. Most of the online version I found had these errors, but I eventually found what I take to be a correct version on the QED blog. I’ve not been able to get to the original version in Mathematical Magazine to verify the corrected version which I include below.


Scooping the Loop Snooper
an elementary proof of the undecidability of the halting problem

Geoffrey K. Pullum, University of Edinburgh

No program can say what another will do.
Now, I won’t just assert that, I’ll prove it to you:
I will prove that although you might work til you drop,
you can’t predict whether a program will stop.

Imagine we have a procedure called P
that will snoop in the source code of programs to see
there aren’t infinite loops that go round and around;
and P prints the word “Fine!” if no looping is found.

You feed in your code, and the input it needs,
and then P takes them both and it studies and reads
and computes whether things will all end as they should
(as opposed to going loopy the way that they could).

Well, the truth is that P cannot possibly be,
because if you wrote it and gave it to me,
I could use it to set up a logical bind
that would shatter your reason and scramble your mind.

Here’s the trick I would use – and it’s simple to do.
I’d define a procedure – we’ll name the thing Q –
that would take any program and call P (of course!)
to tell if it looped, by reading the source;

And if so, Q would simply print “Loop!” and then stop;
but if no, Q would go right back to the top,
and start off again, looping endlessly back,
til the universe dies and is frozen and black.

And this program called Q wouldn’t stay on the shelf;
I would run it, and (fiendishly) feed it itself.
What behaviour results when I do this with Q?
When it reads its own source, just what will it do?

If P warns of loops, Q will print “Loop!” and quit;
yet P is supposed to speak truly of it.
So if Q’s going to quit, then P should say, “Fine!” –
which will make Q go back to its very first line!

No matter what P would have done, Q will scoop it:
Q uses P’s output to make P look stupid.
If P gets things right then it lies in its tooth;
and if it speaks falsely, it’s telling the truth!

I’ve created a paradox, neat as can be –
and simply by using your putative P.
When you assumed P you stepped into a snare;
Your assumptions have led you right into my lair.

So, how to escape from this logical mess?
I don’t have to tell you; I’m sure you can guess.
By reductio, there cannot possibly be
a procedure that acts like the mythical P.

You can never discover mechanical means
for predicting the acts of computing machines.
It’s something that cannot be done. So we users
must find our own bugs; our computers are losers!

Read the rest of this entry »

Semantic Wave 2008: Industry Roadmap to Web 3.0

January 18th, 2008

ReadWriteWeb reports that Project10X has released a 400 page report entitled Semantic Wave 2008 Report: Industry Roadmap to Web 3.0 and Multibillion Dollar Market Opportunities. The full report will set you back $3,495, but you can get a free 27 page executive summary, a $235 value. Project10X describes their Semantic Wave report as follows.

“It is the first comprehensive industry study of the next stage of internet evolution — Web 3.0. This landmark 400-page report is written for executives, developers, designers, entrepreneurs, investors, and others who want to better understand semantic technologies, the business opportunities they present, and the ways Web 3.0 will change how we use and experience the internet. The semantic wave is a “long wave” of innovation and investment that will bring fundamental shifts in paradigm, technology, and economics. Over the next decade semantic technologies will drive trillion dollar global economic expansions, transforming industries as well as our experience of the internet. ”

The report also includes a supplier directory with more than 270 companies that are researching and developing semantic technology products and services and an annotated bibliography.

Hot showers considered NP-complete

January 18th, 2008

Guaranteeing that you can take a hot shower is NP complete, at lest in one formalization the problem by Christina Matzke and Damien Challet in a recent paper.

Christina Matzke, Damien Challet, Taking a shower in Youth Hostels: risks and delights of heterogeneity, arXiv:0801.1573v1 , 10 January, 2008. … Tuning one’s shower in some hotels may turn into a challenging coordination game with imperfect information. The temperature sensitivity increases with the number of agents, making the problem possibly unlearnable. Because there is in practice a finite number of possible tap positions, identical agents are unlikely to reach even approximately their favorite water temperature. Heterogeneity allows some agents to reach much better temperatures, at the cost of higher risk.

Spotted on the physics arXiv blog.

Who you gonna believe, me or your lying sensor-units

January 15th, 2008

Dario Floreano’s research at EPFL includes studying the role of evolution in robotics. According to a note in Discover Magazine, Robots Evolve And Learn How to Lie, he’s observed the discovery of lying.

“Floreano and his colleagues outfitted robots with light sensors, rings of blue light, and wheels and placed them in habitats furnished with glowing “food sources” and patches of “poison” that recharged or drained their batteries. Their neural circuitry was programmed with just 30 “genes,” elements of software code that determined how much they sensed light and how they responded when they did. The robots were initially programmed both to light up randomly and to move randomly when they sensed light. To create the next generation of robots, Floreano recombined the genes of those that proved fittest—those that had managed to get the biggest charge out of the food source.
    By the 50th generation, the robots had learned to communicate—lighting up, in three out of four colonies, to alert the others when they’d found food or poison. The fourth colony sometimes evolved “cheater” robots instead, which would light up to tell the others that the poison was food, while they themselves rolled over to the food source and chowed down without emitting so much as a blink.”