Archive for April, 2009
April 30th, 2009, by Tim Finin, posted in Google, Social media
Google has produced a special Mexico Flu Trends page to aggregate flu-related search queries from users in Mexico and various states within Mexico.
“We’ve created experimental estimates of flu activity in Mexico using aggregated search data. Unlike Google Flu Trends for U.S., this data has not been validated against confirmed cases of flu. After conferring with US and Mexican health officials, we’ve decided to share these initial results to provide additional information on the evolving epidemic.”
An article in the New York Times, To Aid Mexico, Google Expands Flu Tracking, quotes one expert on the limitations of the Google data:
Dr. Henry L. Niman, a biochemist in Pittsburgh who runs Recombinomics, a Web site that tracks the genetics of flu cases worldwide, said that Google’s service appeared to provide only limited advance warning. “I am not saying that it is not useful. It probably works to complement other sources of surveillance and data,” he said.
April 28th, 2009, by Tim Finin, posted in Programming
Guido van Rossum has been blogging about the lack of support for optimizing tail recursion in Python (he’s agin it). His most recent post, Final Words on Tail Calls, includes this paragraph near the end.
‘And here it ends. One other thing I learned is that some in the academic world scornfully refer to Python as “the Basic of the future”. Personally, I rather see that as a badge of honor, and it gives me an opportunity to plug a book of interviews with language designers to which I contributed, side by side with the creators of Basic, C++, Perl, Java, and other academically scorned languages — as well as those of ML and Haskell, I hasten to add. (Apparently the creators of Scheme were too busy arguing whether to say “tail call optimization” or “proper tail recursion.” :-)’
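For readers who have not followed the debate, here is a minimal sketch of what is at stake. It is not taken from Guido's posts, and the function names are made up for illustration. The first function is tail recursive (the recursive call is the last thing it does), yet CPython still pushes a new stack frame for every call, so a deep enough input raises RecursionError. The second is the loop that Python programmers usually write instead.

```python
import sys


def sum_to_recursive(n, acc=0):
    """Sum 1..n with a tail call: nothing remains to do after the recursive call."""
    if n == 0:
        return acc
    return sum_to_recursive(n - 1, acc + n)


def sum_to_loop(n):
    """Same computation as a loop, which uses constant stack space."""
    acc = 0
    while n > 0:
        acc += n
        n -= 1
    return acc


def recursion_overflows(n):
    """Return True if the tail-recursive version exhausts the call stack for n."""
    try:
        sum_to_recursive(n)
    except RecursionError:
        return True
    return False


if __name__ == "__main__":
    print(sum_to_recursive(100), sum_to_loop(100))
    # Going deeper than the interpreter's recursion limit fails for the
    # recursive version only, because tail calls are not eliminated.
    print(recursion_overflows(sys.getrecursionlimit() + 100))
    print(sum_to_loop(10**6))
```

A language with tail call elimination would run the first version in constant stack space, which is the feature being argued over.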
April 27th, 2009, by Tim Finin, posted in Humor, Social media, Twitter
While we can use Twitter for news or reports on unfolding events from the field, it’s a noisy channel. As usual, Randall Munroe captures it well. I especially like its highlighting how Twitter’s search page lets you know there have been dozens of new matching tweets since you searched a moment ago. It seems that the flu-related tweets are arriving faster than anyone can read them.
April 26th, 2009, by Tim Finin, posted in Google, sEARCH, Semantic Web, Social media
Google has had a special “flu trends” site up for many months that provides “up-to-date estimates of flu activity in the United States based on aggregated search queries.”
They have found that how many people search for flu-related topics is a leading indicator for reports on how many people actually have flu symptoms. They believe that this metric “may indicate flu activity up to two weeks ahead of traditional flu surveillance systems”. Click on the flash video below to see the relationship between the flu searches and flu symptoms.
So, is Google magic? The explanation for why changes in the level of flu searches precede changes in the level of flu symptoms is more mundane.
“So why bother with estimates from aggregated search queries? It turns out that traditional flu surveillance systems take 1-2 weeks to collect and release surveillance data, but Google search queries can be automatically counted very quickly. By making our flu estimates available each day, Google Flu Trends may provide an early-warning system for outbreaks of influenza.”
You can get the details in a recent article in Nature:
J. Ginsberg, M. Mohebbi, R. Patel, L. Brammer, M. Smolinski and L. Brilliant, Detecting influenza epidemics using search engine query data, Nature 457, 1012-1014 (19 February 2009).
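To make the "leading indicator" idea concrete, here is a small sketch of one way to check whether one series leads another: correlate the first series with a delayed copy of the second and see which delay fits best. This is not the method from the Nature paper. The numbers are invented for illustration, and the second series is constructed as the first one delayed by two steps, so the answer is known in advance.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    dev_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    dev_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (dev_x * dev_y)


def lagged_correlation(leading, trailing, lag):
    """Correlate leading[t] with trailing[t + lag] over the overlapping part."""
    if lag == 0:
        return pearson(leading, trailing)
    return pearson(leading[:-lag], trailing[lag:])


def best_lag(leading, trailing, max_lag):
    """Return the lag in 0..max_lag with the highest correlation."""
    return max(range(max_lag + 1),
               key=lambda lag: lagged_correlation(leading, trailing, lag))


# Invented toy data, NOT real Flu Trends or CDC numbers.
searches = [10, 12, 15, 22, 35, 50, 48, 40, 30, 20, 14, 11]
# Hypothetical reports series: the search series delayed by two periods.
reports = [10, 10] + searches[:-2]

if __name__ == "__main__":
    for lag in range(5):
        print(lag, round(lagged_correlation(searches, reports, lag), 3))
    print("best lag:", best_lag(searches, reports, 4))
```

With real data the peak would be far less clean, and a high lagged correlation alone says nothing about why the first series moves earlier.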
Of course, such leading indicators may not correlate well if there is a “black swan” flu epidemic or even if there is an unfounded fear of one. Sometimes the crowds are wise, but often not. Remember when we all thought technology stocks, or rather real estate, was a good thing to invest in?
The Google site also allows you to look at the data by state as well. Click on the image below to try it out.
April 25th, 2009, by Tim Finin, posted in GENERAL
View H1N1 Swine Flu in a larger map. Pink markers are suspect. Purple markers are confirmed. Deaths lack a dot in marker. Created and maintained by niman.
Click graph to see an updated Google Search trend for ‘flu’
Click graph to see an updated Blog Pulse for ‘flu’
Trend for ‘flu’ on Twitter. Follow CDCemergency for authoritative news from the CDC or check the Twitter flu chatter to see what people are saying.
April 25th, 2009, by Tim Finin, posted in GENERAL, Social
“Conservatism and cognitive ability are negatively correlated”. How’s that for a provocative opening sentence in an academic paper! Lazar Stankov of the National Institute of Education in Singapore reports this finding in a paper published earlier this year in the Elsevier journal Intelligence.
Lazar Stankov, Conservatism and cognitive ability, Intelligence, v37, n3, pp. 294-304, May-June 2009.
I’ve only scanned the paper, but it looks like a serious study. Here’s the abstract:
“Conservatism and cognitive ability are negatively correlated. The evidence is based on 1254 community college students and 1600 foreign students seeking entry to United States’ universities. At the individual level of analysis, conservatism scores correlate negatively with SAT, Vocabulary, and Analogy test scores. At the national level of analysis, conservatism scores correlate negatively with measures of education (e.g., gross enrollment at primary, secondary, and tertiary levels) and performance on mathematics and reading assessments from the PISA (Programme for International Student Assessment) project. They also correlate with components of the Failed States Index and several other measures of economic and political development of nations. Conservatism scores have higher correlations with economic and political measures than estimated IQ scores.”
The paper describes a meta-analysis based on data from three studies that employed the same set of psychological measures. Twenty-two of these measures were selected, drawn from four domains: personality, social attitudes, values, and social norms. While the paper finds strong support for the hypothesis that low cognitive ability is associated with high conservatism, it doesn’t make any statements about causality.
There is room for disagreement about the definition of conservatism and its projection to the 22 measures. The following narrative definition of conservatism is given, which is broad and dominated by personal and social aspects. It’s clearly not limited to the political or economic domain.
“The Conservative syndrome describes a person who attaches particular importance to the respect of tradition, humility, devoutness and moderation as well as to obedience, self-discipline and politeness, social order, family, and national security and has a sense of belonging to and a pride in a group with which he or she identifies. A Conservative person also subscribes to conventional religious beliefs and accepts the mystical, including paranormal, experiences. The same person is likely to be less open to intellectual challenges and will be seen as a responsible “good citizen” at work and in the society while expressing rather harsh views toward those outside his or her group.”
If you can’t access the paper on Elsevier’s Science Direct digital library, you can look at three key tables here: Table 1, Table 2, and Table 3.
April 24th, 2009, by Tim Finin, posted in Social media, Twitter
A graduate student at the University of Wisconsin, Madison has developed a system that allows a person to send tweets just by thinking.
Researchers use brain interface to post to Twitter
In early April, Adam Wilson posted a status update on the social networking Web site Twitter — just by thinking about it. Just 23 characters long, his message, “using EEG to send tweet,” demonstrates a natural, manageable way in which “locked-in” patients can couple brain-computer interface technologies with modern communication tools.
April 21st, 2009, by Tim Finin, posted in Games, UMBC
The 4th annual UMBC Digital Entertainment Conference will be held 10-6 Saturday, April 25, 2009 in Lecture hall 2. This event is organized by the UMBC Game Developers Club and is free and open to the public. This year’s conference will feature speakers from local studios who will talk about programming, game design and art in game development, including:
- Justin Boswell, Senior Programmer, Firaxis
- Barry Caudill, Executive Producer, Firaxis
- Dave Inscore, Studio Art Director, Big Huge Games
- Eric Jordan, Programmer, Firaxis
- Martin Kau, Concept Artist, Big Huge Games
- Jon Shafer, Designer/Programmer, Firaxis
You can find more information and RSVP on the Facebook DEC page.
April 18th, 2009, by Tim Finin, posted in Semantic Web, Web, Web 2.0
ReadWriteWeb has a post up on The Web of Data: Creating Machine-Accessible Information that focuses on Linked Open Data.
“In the coming years, we will see a revolution in the ability of machines to access, process, and apply information. This revolution will emerge from three distinct areas of activity connected to the Semantic Web: the Web of Data, the Web of Services, and the Web of Identity providers. These webs aim to make semantic knowledge of data accessible, semantic services available and connectable, and semantic knowledge of individuals processable, respectively. In this post, we will look at the first of these Webs (of Data) and see how making information accessible to machines will transform how we find information.”
I did find the three ‘Webs’ mentioned in their intro (data, services and identity providers) to be interesting. The first two are standard components of the envisioned future Web but their third, a web of identity providers, less so. I am unsure whether it’s meant to refer to authentication services and protocols (e.g., OAuth) or maybe some kind of named entity recognition services from text. The former is certainly necessary for web services and APIs to work more seamlessly, but doesn’t seem to me to be as significant a problem as developing highly interoperable and integrable Webs of data and services. Of course, I am probably unaware of the subtleties involved in getting this right while maintaining security and appropriate privacy. In any case, I look forward to the articles to follow.
April 16th, 2009, by Tim Finin, posted in GENERAL
Here’s an interesting paper that will appear in SIGMOD’09 comparing the MapReduce paradigm to parallel conventional databases. The benchmark study described in the paper showed that the parallel database approach performed significantly faster, although it took longer to load the data.
A Comparison of Approaches to Large-Scale Data Analysis, Pavlo, Paulson, Rasin, Abadi, DeWitt, Madden, and Stonebraker.
There is currently considerable enthusiasm around the MapReduce (MR) paradigm for large-scale data analysis. Although the basic control flow of this framework has existed in parallel SQL database management systems (DBMS) for over 20 years, some have called MR a dramatically new computing model. In this paper, we describe and compare both paradigms. Furthermore, we evaluate both kinds of systems in terms of performance and development complexity. To this end, we define a benchmark consisting of a collection of tasks that we have run on an open source version of MR as well as on two parallel DBMSs. For each task, we measure each system’s performance for various degrees of parallelism on a cluster of 100 nodes. Our results reveal some interesting trade-offs. Although the process to load data into and tune the execution of parallel DBMSs took much longer than the MR system, the observed performance of these DBMSs was strikingly better. We speculate about the causes of the dramatic performance difference and consider implementation concepts that future systems should take from both kinds of architectures.
Benchmark details are available so others can recreate the trials.
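For anyone who has not seen the two styles side by side, here is a toy sketch of the same aggregation written both ways: as explicit map, shuffle, and reduce steps, and as a declarative SQL query. It runs in a single process using Python's built-in sqlite3 module, so it says nothing about performance and is not the paper's benchmark code. The table, column names, and records are invented for illustration.

```python
import sqlite3
from collections import defaultdict

# Invented toy records: (source, amount) pairs.
records = [("a", 1.0), ("b", 2.5), ("a", 3.0), ("c", 0.5), ("b", 1.5)]


def map_phase(record):
    """Emit (key, value) pairs for one input record."""
    source, amount = record
    yield source, amount


def shuffle(pairs):
    """Group emitted values by key, as the framework would between phases."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Combine all values for one key into a single result."""
    return key, sum(values)


def mapreduce_totals(rows):
    """Total amount per source, written as map, shuffle, reduce."""
    pairs = (pair for row in rows for pair in map_phase(row))
    return dict(reduce_phase(k, vs) for k, vs in shuffle(pairs).items())


def sql_totals(rows):
    """The same totals expressed declaratively as a GROUP BY query."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE visits (source TEXT, amount REAL)")
    conn.executemany("INSERT INTO visits VALUES (?, ?)", rows)
    result = conn.execute(
        "SELECT source, SUM(amount) FROM visits GROUP BY source"
    ).fetchall()
    conn.close()
    return dict(result)


if __name__ == "__main__":
    print(mapreduce_totals(records))
    print(sql_totals(records))
```

The SQL version leaves the grouping strategy to the database, while the MapReduce version spells it out in user code, which is roughly the development-complexity trade-off the abstract mentions.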
April 15th, 2009, by Tim Finin, posted in Ebiquity, Social media, Twitter
We were happy to see recent UMBC alumnus Akshay Java’s work on Twitter mentioned in an article, Utility in the Jumble of Tweets, in yesterday’s New York Times.
“Some developers are creating tools to help companies keep an eye on the buzz. Akshay Java, a scientist at Microsoft, is trying to figure out a way to identify which experts are most influential on given topics by automatically analyzing the content of their tweets and who is in their Twitter network. Companies like Microsoft could use that information to figure out which twitterers they should contact to create buzz about a new product.”
April 10th, 2009, by Tim Finin, posted in GENERAL
Oh, if only!