Archive for November, 2006
November 14th, 2006, by Tim Finin, posted in Uncategorized
David Yager, distinguished professor of Art and Design, has founded a new lab, the Innovation and Design Lab (IDL). The lab is currently working on a program with the Johns Hopkins Medical Institute (JHMI) Children’s Center, “Hospital of the Future: The Living Laboratory”.
They are using outcomes-based design to come up with some very interesting ideas, from bearings to ceiling tiles to spam control. IDL is a transdisciplinary lab working with mechanical engineers, information systems, comp sci, design and the social sciences. They are currently seeking some corporate partners for their research.
Professor Yager is responsible for some of the most successful programs on campus, such as CAVC and IRC. If those are any indication, this new venture should be very interesting.
I am going to accompany him on his safety rounds at the JHMI Children’s Center in a couple of weeks. I look forward to seeing what their problems are and how I can use AI and pervasive computing to help.
November 12th, 2006, by Tim Finin, posted in Uncategorized
The Washington Post has an article, New Clicks in the Arab World, about how blogging and bloggers are challenging longtime cultural and political restrictions. The story features “Saudi Arabia’s most popular blogger”, 31-year-old Fouad al-Farhan:
“Farhan is part of a growing wave of young Arabs who have turned to blogging to bypass the restrictions on free expression in a predominantly authoritarian, conservative and Muslim region. Blogging is so novel here that the equivalent term in Arabic, tadween, to chronicle, was coined only this year. But it has spread rapidly among the increasingly urban youth and in the process has loosened the limits of what’s open for discussion. Activists have used their blogs to organize demonstrations and boycotts, and to criticize corruption and government policies. The less politically inclined have turned them into forums for heated debates on religion and a place to share personal stories and sexual fantasies.”
An interesting (to me) fact is that al-Farhan studied Computer Science at Ball State University in Indiana, receiving an MS degree. The article describes attempts to establish a blogging advocacy group:
This month, the Kingdom of Saudi Arabia Bloggers, founded by Farhan and a group of his friends, will post their charter online and open membership to male and female bloggers. Members will then vote for a president, male or female, and make amendments to the charter by majority vote. Meetings will be held online.
The internet has long been a force that is difficult to control, giving voice to people and groups even when oppressive countries try to silence them.
“But with the medium’s growing clout and appeal in the Arab world, the inevitable crackdown has followed. At least six Egyptian bloggers were jailed for a time earlier this year, and several blogs in Bahrain and Saudi Arabia have been blocked by the state-owned bodies that control Internet access.”
But, as John Gilmore is reputed to have said, “The Net interprets censorship as damage and routes around it”. For more context, recall that David Sifry’s October 2006 State of the Blogosphere report noted that Farsi is now one of the top-ten languages on the Blogosphere. And I doubt they are getting much of a boost from Farsi-language splogs!
November 11th, 2006, by Tim Finin, posted in Uncategorized
NYT reporter John Markoff has a story in tomorrow’s Times on Web 3.0, envisioned as the infusion of AI techniques and capabilities into the current Web. In Entrepreneurs See a Web Guided by Common Sense, he discusses attempts to make Web applications and services smarter:
From the billions of documents that form the World Wide Web and the links that weave them together, computer scientists and a growing collection of start-up companies are finding new ways to mine human intelligence. Their goal is to add a layer of meaning on top of the existing Web that would make it less of a catalog and more of a guide — and even provide the foundation for systems that can reason in a human fashion. That level of artificial intelligence, with machines doing the thinking instead of simply following commands, has eluded researchers for more than half a century.
The article mentions stealth mode startups Radar Networks and Metaweb as well as IBM’s Web Fountain and Cyc.
Sadly, the article is pretty much content-free from a technology perspective. The Semantic Web, for example, is mentioned, but almost in passing:
Both Radar Networks and Metaweb have their roots in part in technology development done originally for the military and intelligence agencies. Early research financed by the National Security Agency, the Central Intelligence Agency and the Defense Advanced Research Projects Agency predated a pioneering call for a semantic Web made in 1999 by Tim Berners-Lee, the creator of the World Wide Web a decade earlier.
We should have invited John to come to Athens. Maybe we can get him to Vancouver for the AI and the Web track at AAAI-07.
November 11th, 2006, by Tim Finin, posted in Uncategorized
One thing I learned from Gio Wiederhold was the Heilmeyer catechism. Or is it Heilmeier catechism? George Heilmeyer/Heilmeier was the director of ARPA in the mid 1970s and required proposals for new programs to answer these questions.
- What is the problem, why is it hard?
- How is it solved today?
- What is the new technical idea; why can we succeed now?
- What is the impact if successful?
- How will the program be organized?
- How will intermediate results be generated?
- How will you measure progress?
- What will it cost?
I’ve found this very helpful over the years when writing research proposals and have suggested that our students use it to organize proposals for projects, theses and dissertations. Occasionally I would google for it and was always surprised at how few references I found. Recently, I discovered that there are quite a few references to it, but as Heilmeier’s Catechism.
What is the correct spelling? Google asks if you meant to enter Heilmeier when you type Heilmeyer, but it’s going on statistics. And we know that they often lie, especially when the information comes from the Web. When you search for “george (H) Heilme?er”, here’s what you get:
| spelling | Web | site:.mil | site:darpa.mil |
| heilmeyer | | | |
| heilmeier | | | |
Looking at some of the top results of the heilmeier search, however, convinces me that it’s probably the right spelling. The sources are from "professional" sites, including the National Academy of Engineering, magazines, the Smithsonian, MITRE, IEEE, and ACM. I have more trust in the information on these sites than I do in random web pages like this one.
So, I guess I will be updating my page soon.
November 8th, 2006, by Tim Finin, posted in Uncategorized
Blogtalk reloaded was an interdisciplinary conference on all manner of social software, held on 2-3 October in Vienna. In addition to the papers, the talks were recorded and put online. Some of the talks look very interesting and touch on the use of ontologies and on the semantic web. (spotted on SWIG IRC)
November 6th, 2006, by Tim Finin, posted in Uncategorized
If you have photos on Flickr from ISWC, the International Semantic Web Conference, please add them to the public Flickr ISWC group. Feel free to add photos from any of the past ISWCs as well as the current one.
November 6th, 2006, by Tim Finin, posted in Uncategorized
David Sifry has published the latest in his series of reports on the State of the Blogosphere, noting trends and changes. As usual, there’s lots of interesting material. One trend is a leveling off of the number of new blogs and posts, which he attributes in part to Technorati getting better at filtering out splogs.
As we’ve said in the past, some of the new blogs in our index are Spam blogs or ’splogs’. The good news is Technorati has gotten much better at preventing these kinds of blogs from getting into our indexes in the first place, which may be a factor in the slight slowing in the average of new blogs created each day. The spikes in red on the chart above shows the increased activity that occurs when spammers create massive numbers of fake blogs and try to get them into our indexes. As the chart shows, we’ve done a much better job over the last quarter at nearly eliminating those red spikes. While last quarter I reported about 8% of new blogs that get past our filters and make it into the index are splogs, I’m happy to report that that number is now more like 4%. As always, we’ll continue to be hyper-focused on making sure that new attacks are spotted and eliminated as quickly as possible. My gut feeling is that since we’re better at dealing with Spam now, even some of the blue areas in last quarter’s graph were probably accountable to spam, which would mean that rather than the bumpy ride shown above, we’re actually seeing a steady increased (but slower) growth of the blogosphere. Hopefully we’ll be able to have a more detailed analysis of these issues next quarter.
The data on the globalization of the Blogosphere is also very interesting and, I think, significant. Here’s a fascinating observation that provides evidence that the blogosphere is not just the plaything of the current generation of young people in the developed world.
Coincident with a rise in blog posts about escalating Middle East tensions throughout the summer and fall, Farsi has moved into the top 10 languages of the blogosphere, indicating that blogging continues to play a critical role in debates about the important issues of our time
November 5th, 2006, by Tim Finin, posted in Uncategorized
ISWC06 is, of course, the 2006 International Semantic Web Conference.
I arrived at ATL last night and took the ISWC bus to Athens, which is about 60 miles outside Atlanta. Chris Thomas, a PhD student at UGA, did a great job of finding us all and telling us about Athens and UGA during the 90 minute ride. He even offered to provide a personal tour of the Athens night life. I passed. It’s surprisingly cold here (I was expecting Georgia to be warmer in early November), but the University of Georgia campus is quite attractive and their conference center is very nice.
Today I’m participating in the Semantic Web Policy Workshop. Grit Denker of SRI started off our workshop by giving an invited talk on their work on using policies to control agile, software-controlled radios used in wireless communication. This was a great talk to start the workshop, since it’s grounded in a real application with some non-toy requirements, like delivering policy decisions in milliseconds and reasoning with complicated numerical constraints. Consequently, their policies are not represented in OWL but in a custom language that is supported by a policy reasoner implemented in Prolog.
November 4th, 2006, by Tim Finin, posted in Uncategorized
Jeff Howe of crowdsourcing.com has written a WiredNews article, Gannett to Crowdsource News, describing changes and reorganizations in the Gannett Company, publisher of USA Today and more than 90 other US dailies.
The initiative emphasizes four goals: Prioritize local news over national news; publish more user-generated content; become 24-7 news operations, in which the newspapers do less and the websites do much more; and finally, use crowdsourcing methods to put readers to work as watchdogs, whistle-blowers and researchers in large, investigative features.
News publishers are in a panic, of course, due to declining circulation and ad revenue. Even though profits for the big companies remain very good, their stock prices have fallen due to fears that the circulation trends will continue downward.
Of all the pilot projects the company has conducted over the last few months, the most promising would seem to be the crowdsourcing of in-depth investigations into government malfeasance. Crowdsourcing involves taking functions traditionally performed by employees and using the internet to outsource them to an undefined, generally large group of people. The compensation is usually far less than what an employee might make for performing the same service. Well-known examples include Wikipedia and iStockphoto.
“We’ve already had some really amazing results with the crowdsourcing element of this,” said Jennifer Carroll, Gannett’s VP for new media content. “Most of us got into this business because we were passionate about watchdog journalism and public service, and we’ve just watched those erode. We’ve learned that no one wants to read a 400-column-inch investigative feature online. But when you make them a part of the process they get incredibly engaged.”
I’m quite conflicted about this. Getting readers more involved with local newspapers sounds like a winning idea and crowdsourcing is one way to do it. But it is a terrible idea if publishers see this as a way to lower costs by reducing their need for beat reporters. Journalism is a longstanding, honorable profession with a strong concern for ethics and good practice. Being a reporter is a challenging job requiring many skills and it doesn’t pay all that well. Crowdsourcing can’t replace most of what reporters do.
November 3rd, 2006, by Tim Finin, posted in Uncategorized
Alexander Ratushnyak is the first winner of the Hutter Prize for Lossless Compression of Human Knowledge. Ratushnyak’s paq8hp5 submission was able to compress the first 100MB of Wikipedia to 17,073,018 bytes, a 6.8% improvement over the baseline. Ratushnyak’s program is the first to use semantic associations between words to achieve higher text compression. Ratushnyak, who is a member of the Moscow State University Compression Project, announced that he will share his 3416€ prize (500€ for each percentage point of improvement) with Przemyslaw Skibinski of the University of Wroclaw Institute of Computer Science for his contributions to the underlying PAQ compression algorithm.
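The prize arithmetic can be checked with a short sketch. It assumes that "100MB" means 10^8 bytes, and the names are only illustrative. The post does not give the baseline size, so the sketch back-derives an approximate one from the prize amount rather than quoting an official figure.

```python
# Figures taken from the post; everything else is derived from them.
COMPRESSED_BYTES = 17_073_018   # size of the winning paq8hp5 submission
PRIZE_EUR = 3416                # prize awarded
EUR_PER_POINT = 500             # euros per percentage point of improvement
INPUT_BYTES = 100 * 10**6       # assumption: "100MB" means 10^8 bytes


def implied_improvement_percent(prize_eur, eur_per_point):
    """Improvement (in percent) that a given prize amount corresponds to."""
    return prize_eur / eur_per_point


def implied_baseline_bytes(compressed_bytes, improvement_percent):
    """Baseline size such that compressed_bytes is improvement_percent smaller."""
    return compressed_bytes / (1 - improvement_percent / 100)


improvement = implied_improvement_percent(PRIZE_EUR, EUR_PER_POINT)
baseline = implied_baseline_bytes(COMPRESSED_BYTES, improvement)
ratio = COMPRESSED_BYTES / INPUT_BYTES

print(f"implied improvement: {improvement:.3f}%")
print(f"implied baseline:    about {baseline:,.0f} bytes")
print(f"compressed/original: {ratio:.1%}")
```

The 6.8% quoted in the announcement is the implied improvement figure rounded to one decimal place.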
November 2nd, 2006, by Pranam Kolari, posted in Uncategorized
Netcraft released their November 2006 Web Server Survey. It’s a great milestone that rose to the top of TechMeme and was featured on Slashdot. Looking closely, I was amazed at the nature of the comments on Slashdot.
…how many of them are ad/pr0n/phishing-laden cybersquats, how many are “my first webpage” single-page sites, how many contain the default IIS … In short, how many of them are actual, funct^M usable, ongoing websites? That’s what I want to know. link
50 million more sites or 50 million more domain name squatters? link
How many of these “new” domains are those horrible “parked domains” that advertise their own sale and link to other sites (presumably to lift their google ranking)? link
and many more. Yes, there is reason for concern.
Just to put this in perspective, here are all the 4-letter .info domains that pinged weblogs.com in October. Looking closely, it appears as though all of these domains were generated by permuting the letters of the alphabet. Similar domains exist across the Web. Most of them appear to be spam.
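To give a sense of scale, here is a minimal sketch of how names like these could be enumerated mechanically, and how one might measure how much of that name space a set of pinging domains covers. It enumerates every combination of four letters, which is an assumption about how the names were generated. The function names and the sample list are hypothetical, and the actual weblogs.com ping data is not reproduced here.

```python
from itertools import product
from string import ascii_lowercase


def four_letter_info_domains():
    """Yield every name from aaaa.info to zzzz.info (26**4 of them)."""
    for letters in product(ascii_lowercase, repeat=4):
        yield "".join(letters) + ".info"


def is_four_letter_info(domain):
    """True if the domain is exactly four ASCII letters followed by .info."""
    name, dot, tld = domain.lower().rpartition(".")
    return (
        dot == "."
        and tld == "info"
        and len(name) == 4
        and all(c in ascii_lowercase for c in name)
    )


def coverage(observed_domains):
    """Fraction of the 4-letter .info name space present in observed_domains."""
    seen = {d.lower() for d in observed_domains if is_four_letter_info(d)}
    return len(seen) / 26**4


# Hypothetical sample standing in for a list of domains seen at a ping server.
sample = ["abcd.info", "abce.info", "ABCF.info", "example.com", "abc.info"]
print(coverage(sample))
```

A coverage value close to 1 over a short period would suggest bulk registration rather than ordinary sites, although that threshold is a judgment call and not something measured here.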
If the menace of spam on the web is not controlled, who knows, 200 million won’t take all that long.
November 1st, 2006, by Tim Finin, posted in Uncategorized
Today’s Washington Post has an article, First Ears, Then Hearts and Minds, on DARPA’s continuing efforts to develop automatic, real-time spoken language translation. This is part of a 50+ year investment by the DoD in developing human language technology, including speech. A current driver, of course, is the extreme shortage of Arabic linguists, translators and interpreters in the military.
One recently deployed result is the Phraselator, a PDA-like translation device developed by VoxTec, an Annapolis-based company. Early versions of the device were used in Afghanistan in 2001 and more recent ones are in use in Iraq. VoxTec’s Phraselator is a one-way device that recognizes a set of pre-defined phrases and plays a corresponding recorded translation. Since the speech, language and domain models are in software, it can be easily ported to new languages or domains.
The DoD is also using a similar translation device developed by Integrated Wave Technologies that lets one enter key phrases that are then turned into appropriate Arabic sentences.
“You say ‘house search’ and then it will say in Arabic: ‘We’re here to search your house. Please stay in this room. Do you have any weapons?’” said Tim McCune, the company’s president.
The limitation of both devices is that they are one-way — they do not allow a two-way conversation.
“In years past, there wasn’t a great need for the individual soldier to speak a foreign language to do his mission,” said Wayne Richards, branch chief for technology implementation at U.S. Joint Forces Command. But in Iraq and Afghanistan, soldiers are increasingly interacting with Iraqi civilians, giving advice at checkpoints or guidance during home searches, he said. “During those door-to-door searches, the soldiers need to be able to calm them down and reassure them,” Richards said. “We’re fighting for hearts and minds. But if I can’t tell her, ‘Ma’am, please calm down,’ . . . that wouldn’t be assuring.”
DARPA has an ongoing research program, Translation System for Tactical Use (TRANSTAC), in which IBM, SRI and CMU were recently funded to develop the next generation of portable speech-to-speech translation systems. SRI’s IraqComm, for example, “performs bidirectional, speech-to-speech machine translation between English and colloquial Iraqi Arabic.” IBM’s MASTOR is a software-only solution that can run on a PDA or laptop computer and is designed as a “two-way, free form speech translator that assists human communication using natural spoken language for people who do not share a common language.” CMU’s project is developing a “two-way translation between English and Arabic Iraqi” and “investigating issues surrounding the rapid deployment of new languages, especially, low-resource languages and colloquial dialects.”
While progress in speech-to-speech translation is steady, it is also slow. It will be many years before we have the Universal Translator seen in Star Trek. Not only could that device handle virtually all alien languages, it could even communicate with non-biological life forms. It could not, however, talk to lawyers.