UMBC ebiquity
Archive for August, 2010

Welcome picnic for CSEE grad students, 1:30-3:30 Mon 8/30

August 29th, 2010, by Tim Finin, posted in UMBC

The UMBC ACM Student Chapter and CSEE Department are hosting a Welcome/Welcome Back picnic for all new and returning CSEE graduate students, faculty and staff this coming Monday, 30 August. It will be held from 1:30pm to 3:30pm in the atrium of the Engineering and Computer Science (ECS) building. Food and drinks will be provided.

To get to the atrium, walk from the second floor of ITE to the ECS building and you will enter the atrium. Please come out on the day before classes and enjoy some food while catching up.

Everyone is also encouraged to attend Convocation 2010, the formal opening of the academic year at UMBC, from 3:30 to 4:30 pm in the Retriever Activities Center. President Hrabowski will address the gathering and Wendy Salkind, Presidential Teaching Professor for 2010-13, will make brief remarks.

After Convocation, all faculty, staff and students are invited to yet another free Community Picnic on the UMBC Quad from 4:30 to 7:00pm. The rain location will be the Residence Life Dining Hall.

UMBC launches new cybersecurity graduate programs

August 27th, 2010, by Tim Finin, posted in cybersecurity, Security, UMBC

UMBC has established two new graduate programs in cybersecurity education, one leading to a Master’s in Professional Studies (MPS) degree in cybersecurity and another to a graduate certificate in cybersecurity strategy and policy. Both are designed for students and working professionals who aspire to make a difference in the security, stability, and functional agility of the national and global information infrastructure. The programs will begin in January 2011.

Yahoo! using Bing search engine in US and Canada

August 24th, 2010, by Tim Finin, posted in Google, Search, Semantic Web, Social media

Microsoft’s Bing team announced on their blog that the Bing search engine is “powering Yahoo!’s search results” in the US and Canada for English queries. Yahoo also has a post on their Yahoo! Search Blog.

The San Jose Mercury News reports:

“Tuesday, nearly 13 months after Yahoo and Microsoft announced plans to collaborate on Internet search in hopes of challenging Google’s market dominance, the two companies announced that the results of all Yahoo English language searches made in the United States and Canada are coming from Microsoft’s Bing search engine. The two companies are still racing to complete the transition of paid search, the text advertising links that run beside and above the standard search results, before the make-or-break holiday period — a much more difficult task.”

Combining the traffic from Microsoft and Yahoo will give Bing a more significant share of the Web search market. That should help both companies by providing a larger stream of search-related data that can be exploited to improve search relevance, ad placement and trend spotting. It will also help to foster competition with Google focused on developing better search technology.

Hopefully, Bing will be able to benefit from the good work done at Yahoo! on adding more semantics to Web search.

Middle-earth dictionary attack

August 24th, 2010, by Tim Finin, posted in Humor, Security

Middle-earth dictionary attack

Researchers install PAC-MAN on Sequoia voting machine w/o breaking seals

August 23rd, 2010, by Tim Finin, posted in Games, Security, Social media, Technology Impact

Here’s a new one for the DIY movement.

Security researchers J. Alex Halderman and Ariel Feldman demonstrated PAC-MAN running on a Sequoia voting machine last week at the EVT/WOTE Workshop held at the USENIX Security conference in DC.

Amazingly, they were able to install the game on a Sequoia AVC Edge touch-screen DRE (direct-recording electronic) voting machine without breaking the original tamper-evident seals.

Here’s how they describe what they did on Halderman’s web site:

“What is the Sequoia AVC Edge?

It’s a touch-screen DRE (direct-recording electronic) voting machine. Like all DREs, it stores votes in a computer memory. In 2008, the AVC Edge was used in 161 jurisdictions with almost 9 million registered voters, including large parts of Louisiana, Missouri, Nevada, and Virginia, according to Verified Voting.

What’s inside the AVC Edge?

It has a 486 SLE processor and 32 MB of RAM—similar specs to a 20-year-old PC. The election software is stored on an internal CompactFlash memory card. Modifying it is as simple as removing the card and inserting it into a PC.

Wouldn’t seals expose any tampering?

We received the machine with the original tamper-evident seals intact. The software can be replaced without breaking any of these seals, simply by removing screws and opening the case.

How did you reprogram the machine?

The original election software used the pSOS+ embedded operating system. We reformatted the memory card to boot DOS instead. (Update: Yes, it can also run Linux.) Challenges included remembering how to write a config.sys file and getting software to run without logical block addressing or a math coprocessor. The entire process took three afternoons.”

You can find out more from the presentation slides from the EVT workshop, Practical AVC-Edge CompactFlash Modifications can Amuse Nerds. They sum up their study with the following conclusion.

“In conclusion, we feel our work represents the future of DREs. Now that we know how bad their security is, thousands of DREs will be decommissioned and sold by states over the next several years. Filling our landfills with these machines would be a terrible waste. Fortunately, they can be recycled as arcade machines, providing countless hours of amusement in the basements of the nations’ nerds.”

Google unemployment index estimates and predicts unemployment

August 20th, 2010, by Tim Finin, posted in Google, Social media

The Google Unemployment Index is an economic indicator based on queries sent to Google’s search engine related to unemployment, social security, welfare, and unemployment benefits. Since some of these search terms are probably leading indicators, it can also be used to predict upcoming changes in the actual unemployment rate.

The index is based on queries tracked via Google Insights for Search that are tuned to different countries. You can also focus on particular regions or metropolitan areas and compare the index across several locations. Here’s an example comparing Florida (blue) and Maryland (red).
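As a rough illustration of the leading-indicator idea, the sketch below checks at which lag a search-volume series correlates best with an unemployment series. The data are made up for the example, and `pearson` and `best_lag` are hypothetical helper names; this is not how the index itself is computed.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def best_lag(search_index, unemployment, max_lag):
    """Return (lag, r): the lag, in periods, at which earlier search volume
    correlates most strongly with later unemployment."""
    results = []
    for lag in range(max_lag + 1):
        # Pair search_index[t] with unemployment[t + lag].
        xs = search_index[: len(search_index) - lag]
        ys = unemployment[lag:]
        results.append((lag, pearson(xs, ys)))
    return max(results, key=lambda pair: pair[1])


# Synthetic monthly data: unemployment follows the search index two months later.
search_index = [10, 12, 15, 14, 18, 22, 21, 25, 30, 28, 26, 24]
unemployment = [5.0, 5.0] + [4.0 + s / 10 for s in search_index[:-2]]

lag, r = best_lag(search_index, unemployment, max_lag=4)
print(f"best lag: {lag} months, r = {r:.2f}")
```

With this synthetic series the best lag is two months by construction; with real data the peak would be much less clean.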

Smart Grid: the collision of energy and information

August 19th, 2010, by Tim Finin, posted in Machine Learning, UMBC

The Maryland Clean Energy Technology Incubator (CETI) at bwtech@UMBC will host a seminar series this Fall with a focus on the Smart Grid. The series will discuss the issues and opportunities and speculate on expected business opportunities in this major restructuring of the electric grid. Huge investments (tens of billions of dollars) are committed to the Smart Grid for the coming decade.

About six seminars are planned for Fall 2010, to be held (mostly) on Wednesdays from 4-6pm, and UMBC faculty, staff and students are encouraged to participate. Each will include a ~45 minute presentation followed by a lively discussion and an opportunity to socialize and enjoy light refreshments.

The first speaker, Peter Kelly-Detwiler, leads a group at Constellation Energy that is developing new methods for data analysis and presentation. He is an “entrepreneur” within Constellation with 20 years of experience in the energy field and he has a perspective on the Smart Grid like few others.

A smart grid perspective: finding value in the collision of energy and information

Peter Kelly-Detwiler, Constellation Energy

4-6pm Wednesday, 8 September 2010
2nd floor Courtyard Conference Room
UMBC Tech Center

Many people have heard of the term “smart grid” and there are many varying interpretations of what it means. But everybody can agree on three things:

  • It involves increased and timely access to information
  • There’s money in it
  • It will create new and unforeseen technologies and entrepreneurial opportunities

The discussion will center around why smart grid is needed, how an energy provider views the challenges and opportunities, the forces we see gathering on the horizon, and how Constellation Energy is responding. Issues related to power grid economics, volatility, risk management, and customer elasticities and perspectives will be addressed.

Peter Kelly-Detwiler is Senior Vice President of Energy Technology Services for Constellation NewEnergy, Inc., a subsidiary of Constellation Energy Group. He and his company-wide team oversee the integration of efficiency technologies and applications that help customers better manage their total energy bills and create optimal energy solutions. Peter has 20 years of experience in the energy industry. His accomplishments include managing the development of energy efficiency projects and reviewing economic impact of energy products.

Please RSVP to Bjorn Frogner, the CETI Entrepreneur in Residence, if you plan to attend.

Probability-based processor might speed AI applications

August 18th, 2010, by Tim Finin, posted in GENERAL, Semantic Web, Social media

Analog computers were a hot idea, in the 1950s! But I find this intriguing because I’ve come around to the position that a lot of our human “intelligence” is the result of acquiring and using probabilistic models. So supporting this in hardware might be a big win, especially for low-cost, low-power devices. It will also support lots of other common tasks in social computing, image processing and language technology.

Technology Review has a short article, A New Kind of Microchip, on a computer chip being developed by Lyric Semiconductor that processes signals representing probabilities rather than digital bits.

“A computer chip that performs calculations using probabilities, instead of binary logic, could accelerate everything from online banking systems to the flash memory in smart phones and other gadgets. … And because that kind of math is at the core of many products, there are many potential applications. “To take one example, Amazon’s recommendations to you are based on probability,” says Vigoda. “Any time you buy [from] them, the fraud check on your credit card is also probability [based], and when they e-mail your confirmation, it passes through a spam filter that also uses probability.”

All those examples involve comparing different data to find the most likely fit. Implementing the math needed to do this is simpler with a chip that works with probabilities, says Vigoda, allowing smaller chips to do the same job at a faster rate. A processor that dramatically speeds up such probability-based calculations could find all kinds of uses.”

Lyric’s chip is called LEC and was developed with support from DARPA. According to Wired, it is 30 times smaller than current digital error correction technology. Although small, it yields “a Pentium’s worth of computation,” according to Lyric CEO Vigoda. His 2003 dissertation at MIT was on a related topic, Analog Logic: Continuous-Time Analog Circuits for Statistical Signal Processing.

You can also read about the LEC chip in a story in yesterday’s NYT, A Chip That Digests Data and Calculates the Odds.
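The spam-filter example in the Technology Review quote comes down to Bayes’ rule, which is the kind of probability calculation such a chip is meant to speed up. Here is a minimal sketch of that calculation in ordinary software. The function name and all of the numbers are invented for illustration, and nothing here reflects how Lyric’s hardware actually works.

```python
def posterior_spam(p_spam, p_word_given_spam, p_word_given_ham):
    """Bayes' rule: probability a message is spam given that it contains a word."""
    p_ham = 1.0 - p_spam
    # Total probability of seeing the word in any message.
    evidence = p_word_given_spam * p_spam + p_word_given_ham * p_ham
    return p_word_given_spam * p_spam / evidence


# Invented numbers for illustration only.
p = posterior_spam(p_spam=0.2, p_word_given_spam=0.5, p_word_given_ham=0.05)
print(f"P(spam | word) = {p:.3f}")
```

A real filter combines evidence from many words, so it repeats this multiply-and-normalize step a great many times per message.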

UMBC ranked #4 in IT degrees among US research universities

August 18th, 2010, by Tim Finin, posted in CS, UMBC

For the past twenty years, UMBC has had a large number of students majoring in information technology. Our Computer Science and Information Systems programs are among the largest on campus and newer ones like Computer Engineering and Bioinformatics are growing.

Last week I had a chance to look at the latest information from the Department of Education’s National Center for Education Statistics, which is available from NSF’s WebCASPAR site. Data from the IPEDS Completions Survey shows that UMBC is fourth among U.S. research universities in the production of IT degrees and certificates.

In this analysis, I averaged the numbers from the two most recent years available — 2007 and 2008. Here are the top ten in terms of total production in the Carnegie classification categories RU/VH and RU/H.

average yearly production in 2007 and 2008
Penn State
University of Southern California
Johns Hopkins University
New Jersey Institute of Technology
Georgia Tech
University of California-Irvine

In this group, UMBC also ranks #2, #21 and #31 for undergraduate, MS and PhD degree production, respectively. Here’s a graph of the top 50 — click through for a larger version.

Top 50 producers of IT degrees among US research universities

Looking at all schools shows the University of Phoenix generates the most IT grads, with an average of 3318 students over 2007 and 2008! Here are the top 15 schools of any type.

average yearly production in 2007 and 2008
University of Phoenix
Community College of the Air Force
University of Maryland University College
Strayer College
ECPI College of Technology
DePaul University
Penn State
Rochester Institute of Technology
University of Southern California
DeVry Institute of Tech
Johns Hopkins University
New Jersey Institute of Technology
Baker College of Flint

Usability determines password policy

August 16th, 2010, by Tim Finin, posted in Policy, Privacy, Security, Social media

Some online sites let you use any old five-character string as your password for as long as you like. Others force you to pick a new password every six months and it has to match a complicated set of requirements — at least eight characters, mixed case, containing digits, letters, punctuation and at least one umlaut. Also, it better not contain any substrings that are legal Scrabble words or match any past password you’ve used since the Bush 41 administration.

A recent paper by two researchers from Microsoft concludes that an organization’s usability requirements are the main factor that determines the complexity of its password policy.

Dinei Florencio and Cormac Herley, Where Do Security Policies Come From?, Symposium on Usable Privacy and Security (SOUPS), 14–16 July 2010, Redmond.

We examine the password policies of 75 different websites. Our goal is understand the enormous diversity of requirements: some will accept simple six-character passwords, while others impose rules of great complexity on their users. We compare different features of the sites to find which characteristics are correlated with stronger policies. Our results are surprising: greater security demands do not appear to be a factor. The size of the site, the number of users, the value of the assets protected and the frequency of attacks show no correlation with strength. In fact we find the reverse: some of the largest, most attacked sites with greatest assets allow relatively weak passwords. Instead, we find that those sites that accept advertising, purchase sponsored links and where the user has a choice show strong inverse correlation with strength.

We conclude that the sites with the most restrictive password policies do not have greater security concerns, they are simply better insulated from the consequences of poor usability. Online retailers and sites that sell advertising must compete vigorously for users and traffic. In contrast to government and university sites, poor usability is a luxury they cannot afford. This in turn suggests that much of the extra strength demanded by the more restrictive policies is superfluous: it causes considerable inconvenience for negligible security improvement.
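One simple way to make policy “strength” concrete is to count the bits needed to enumerate every password of the minimum length over the required character set: minimum length times log2 of the character-set size. The sketch below assumes that measure. The function name and the two example policies are illustrative and are not taken from the paper’s data.

```python
from math import log2


def policy_strength_bits(min_length, charset_size):
    """Bits of strength for the weakest password a policy accepts:
    min_length * log2(charset_size)."""
    return min_length * log2(charset_size)


# Illustrative policies (assumed, not from the paper's data).
lax = policy_strength_bits(min_length=6, charset_size=26)     # lowercase letters only
strict = policy_strength_bits(min_length=8, charset_size=94)  # printable ASCII, no space

print(f"lax: {lax:.1f} bits, strict: {strict:.1f} bits")
```

This counts only what the policy demands, not how users actually choose passwords, so it is an upper bound on the guessing effort rather than a measure of real-world security.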

h/t Bruce Schneier

An ontology of social media data for better privacy policies

August 15th, 2010, by Tim Finin, posted in Policy, Privacy, Security, Semantic Web, Social media

Privacy continues to be an important topic surrounding social media systems. A big part of the problem is that virtually all of us have a difficult time thinking about what information about us is exposed and to whom and for how long. As UMBC colleague Zeynep Tufekci points out, our intuitions in such matters come from experiences in the physical world, a place whose physics differs considerably from the cyber world.

Bruce Schneier offered a taxonomy of social networking data in a short article in the July/August issue of IEEE Security & Privacy. A version of the article, A Taxonomy of Social Networking Data, is available on his site.

“Below is my taxonomy of social networking data, which I first presented at the Internet Governance Forum meeting last November, and again — revised — at an OECD workshop on the role of Internet intermediaries in June.

  • Service data is the data you give to a social networking site in order to use it. Such data might include your legal name, your age, and your credit-card number.
  • Disclosed data is what you post on your own pages: blog entries, photographs, messages, comments, and so on.
  • Entrusted data is what you post on other people’s pages. It’s basically the same stuff as disclosed data, but the difference is that you don’t have control over the data once you post it — another user does.
  • Incidental data is what other people post about you: a paragraph about you that someone else writes, a picture of you that someone else takes and posts. Again, it’s basically the same stuff as disclosed data, but the difference is that you don’t have control over it, and you didn’t create it in the first place.
  • Behavioral data is data the site collects about your habits by recording what you do and who you do it with. It might include games you play, topics you write about, news articles you access (and what that says about your political leanings), and so on.
  • Derived data is data about you that is derived from all the other data. For example, if 80 percent of your friends self-identify as gay, you’re likely gay yourself.”

I think most of us understand the first two categories and can easily choose or specify a privacy policy to control access to information in them. The rest however, are more difficult to think about and can lead to a lot of confusion when people are setting up their privacy preferences.

As an example, I saw some nice work at the 2010 IEEE International Symposium on Policies for Distributed Systems and Networks, “Collaborative Privacy Policy Authoring in a Social Networking Context” by Ryan Wishart et al. from Imperial College, that addressed the problem of incidental data in Facebook. For instance, if I post a picture and tag others in it, each of the tagged people can contribute additional policy constraints that can narrow access to it.
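The narrowing behavior can be pictured as a set intersection: each tagged person’s constraint can only remove viewers, never add them. The sketch below is a simplification under that assumption, with hypothetical user names, and is not the mechanism from the Wishart et al. paper.

```python
def effective_audience(owner_allowed, tagged_constraints):
    """Intersect the owner's audience with each tagged person's constraint,
    so every added constraint can only narrow access, never widen it."""
    audience = set(owner_allowed)
    for allowed in tagged_constraints:
        audience &= set(allowed)
    return audience


# Hypothetical users and policies.
owner_allowed = {"alice", "bob", "carol", "dave"}
tagged_constraints = [
    {"alice", "bob", "carol"},  # constraint from the first tagged person
    {"bob", "carol", "erin"},   # constraint from the second tagged person
]

print(sorted(effective_audience(owner_allowed, tagged_constraints)))
```

Note that "erin" never gains access even though one tagged person would allow her, because the owner’s policy did not include her in the first place.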

Lorrie Cranor gave an invited talk at the workshop on Building a Better Privacy Policy and made the point that even P3P privacy policies are difficult for people to comprehend.

Having a simple ontology for social media data could help us move forward toward better privacy controls for online social media systems. I like Schneier’s broad categories and wonder what a more complete treatment defined using Semantic Web languages might be like.
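As a first step toward such a treatment, here is a small sketch that encodes Schneier’s six categories as data, recording who creates each kind of data and who controls it. This is plain Python rather than a Semantic Web language, all class and field names are hypothetical, and the values for service, behavioral and derived data are my own assumptions rather than anything stated in the taxonomy.

```python
from dataclasses import dataclass
from enum import Enum


class Party(Enum):
    SUBJECT = "the person the data is about"
    OTHER_USER = "another user"
    SITE = "the social networking site"


@dataclass(frozen=True)
class DataCategory:
    name: str
    created_by: Party
    controlled_by: Party


# The first three entries follow the descriptions in the taxonomy; the
# created_by/controlled_by values for the last three are assumptions.
TAXONOMY = [
    DataCategory("disclosed", Party.SUBJECT, Party.SUBJECT),
    DataCategory("entrusted", Party.SUBJECT, Party.OTHER_USER),
    DataCategory("incidental", Party.OTHER_USER, Party.OTHER_USER),
    DataCategory("service", Party.SUBJECT, Party.SITE),
    DataCategory("behavioral", Party.SITE, Party.SITE),
    DataCategory("derived", Party.SITE, Party.SITE),
]


def outside_subject_control(taxonomy):
    """Names of the categories that the data subject does not control."""
    return [c.name for c in taxonomy if c.controlled_by is not Party.SUBJECT]


print(outside_subject_control(TAXONOMY))
```

Even this toy version makes the point that only one of the six categories is fully under the control of the person the data describes.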

Papers with more references are cited more often

August 15th, 2010, by Tim Finin, posted in Semantic Web, Social media

The number of citations a paper receives is generally thought to be a good and relatively objective measure of its significance and impact.

Researchers are naturally interested in knowing how to attract more citations to their papers. Publishing the results of good work helps of course, but everyone knows there are many other factors. Nature News reports on research by Gregory Webster that analyzed the 53,894 articles and review articles published in Science between 1901 and 2000.

The advice the study supports is “cite and you shall be cited”.

A long reference list at the end of a research paper may be the key to ensuring that it is well cited, according to an analysis of 100 years’ worth of papers published in the journal Science.
     The research suggests that scientists who reference the work of their peers are more likely to find their own work referenced in turn, and the effect is on the rise, with a single extra reference in an article now producing, on average, a whole additional citation for the referencing paper.
     “There is a ridiculously strong relationship between the number of citations a paper receives and its number of references,” Gregory Webster, the psychologist at the University of Florida in Gainesville who conducted the research, told Nature. “If you want to get more cited, the answer could be to cite more people.”

A plot of the number of references listed in each article against the number of citations it eventually received reveals that almost half of the variation in citation rates among the Science papers can be attributed to the number of references that they include. And, contrary to what people might predict, the relationship is not driven by review articles, which could be expected, on average, to be heavier on references and to garner more citations than standard papers.
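“Almost half of the variation” is the language of R², the share of variance in one variable explained by a straight-line fit on another. The sketch below shows how that number is computed for a simple linear fit. The function name is hypothetical and the data points are made up, not taken from the study.

```python
def r_squared(xs, ys):
    """Share of the variance in ys explained by a least-squares line on xs."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    # For a one-variable linear fit, R^2 is the squared Pearson correlation.
    return (sxy * sxy) / (sxx * syy)


# Made-up (references, citations) pairs, not data from the study.
references = [5, 10, 15, 20, 25, 30]
citations = [8, 6, 20, 15, 30, 22]

print(f"R^2 = {r_squared(references, citations):.2f}")
```

An R² near 0.5 would correspond to the “almost half” in the article. As always, a strong fit by itself says nothing about whether citing more causes a paper to be cited more.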
