UMBC ebiquity

Archive for October, 2010

Recorded Future analyses streaming Web data to predict the future

October 30th, 2010, by Tim Finin, posted in AI, Datamining, Google, Machine Learning, NLP, Search, Semantic Web, Social media

Recorded Future is a Boston-based startup, backed by Google and In-Q-Tel, that uses sophisticated linguistic and statistical algorithms to extract time-related information about entities and events from streams of Web data. Their goal is to help their clients understand how the relationships between entities and events of interest are changing over time and to make predictions about the future.

Recorded Future system architecture

A recent Technology Review article, See the Future with a Search, describes it this way.

“Conventional search engines like Google use links to rank and connect different Web pages. Recorded Future’s software goes a level deeper by analyzing the content of pages to track the “invisible” connections between people, places, and events described online.
   “That makes it possible for me to look for specific patterns, like product releases expected from Apple in the near future, or to identify when a company plans to invest or expand into India,” says Christopher Ahlberg, founder of the Boston-based firm.
   A search for information about drug company Merck, for example, generates a timeline showing not only recent news on earnings but also when various drug trials registered with the website clinicaltrials.gov will end in coming years. Another search revealed when various news outlets predict that Facebook will make its initial public offering.
   That is done using a constantly updated index of what Ahlberg calls “streaming data,” including news articles, filings with government regulators, Twitter updates, and transcripts from earnings calls or political and economic speeches. Recorded Future uses linguistic algorithms to identify specific types of events, such as product releases, mergers, or natural disasters, the date when those events will happen, and related entities such as people, companies, and countries. The tool can also track the sentiment of news coverage about companies, classifying it as either good or bad.”
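To make the idea of extracting dated events from text concrete, here is a toy sketch in Python. It is not Recorded Future's method: the keyword lists, the `extract_future_events` function, and the rule "any year later than the publication year counts as a future date" are all assumptions made for illustration. A real system would rely on trained linguistic models rather than keyword matching.

```python
import re
from datetime import date

# Hypothetical keyword lists; a real system would use trained NLP models.
EVENT_KEYWORDS = {
    "product_release": ("release", "launch"),
    "merger": ("merger", "acquire", "acquisition"),
    "natural_disaster": ("earthquake", "hurricane", "flood"),
}

# Matches four-digit years such as 1999 or 2012.
YEAR_PATTERN = re.compile(r"\b(?:19|20)\d{2}\b")


def extract_future_events(text: str, published: date) -> list[dict]:
    """Return events whose sentence mentions a year after the publication year."""
    events = []
    # Naive sentence splitting on terminal punctuation followed by whitespace.
    for sentence in re.split(r"(?<=[.!?])\s+", text):
        lowered = sentence.lower()
        years = [int(m.group(0)) for m in YEAR_PATTERN.finditer(sentence)]
        future_years = [y for y in years if y > published.year]
        if not future_years:
            continue
        for event_type, keywords in EVENT_KEYWORDS.items():
            if any(keyword in lowered for keyword in keywords):
                events.append(
                    {"type": event_type, "year": min(future_years), "sentence": sentence}
                )
    return events


if __name__ == "__main__":
    sample = "Acme plans to launch a new phone in 2012. Acme reported earnings in 2009."
    for event in extract_future_events(sample, date(2010, 10, 30)):
        print(event["type"], event["year"])
```

The sample sentence about "Acme" is made up. Only the first sentence yields an event, because the second mentions a year that is already in the past relative to the publication date.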

Pricing for access to their online services and API starts at $149 a month, but there is a free Futures email alert service through which you can get the results of some standing queries on a daily or weekly basis. You can also explore the capabilities they offer through their page on the 2010 US Senate Races.

“Rather than attempt to predict how the races will turn out, we have drawn from our database the momentum, best characterized as online buzz, and sentiment, both positive and negative, associated with the coverage of the 29 candidates in 14 interesting races. This dashboard is meant to give the view of a campaign strategist, as it measures how well a campaign has done in getting the media to speak about the candidate, and whether that coverage has been positive, in comparison to the opponent.”

Their blog reveals some insights on the technology they are using and much more about the business opportunities they see. Clearly the company is leveraging named entity recognition, event recognition and sentiment analysis. A short document, A White Paper on Temporal Analytics, has some details on their overall approach.

Chinese Tianhe-1A is fastest supercomputer

October 28th, 2010, by Tim Finin, posted in High performance computing, Multicore Computation Center

China’s Tianhe-1A is being recognized as the world’s fastest supercomputer. It has 7168 NVIDIA Tesla GPUs and achieved a Linpack score of 2.507 petaflops, a 40% speedup over Oak Ridge National Lab’s Jaguar, the previous top machine. Today’s WSJ has an article:

“Supercomputers are massive machines that help tackle the toughest scientific problems, including simulating commercial products like new drugs as well as defense-related applications such as weapons design and breaking codes. The field has long been led by U.S. technology companies and national laboratories, which operate systems that have consistently topped lists of the fastest machines in the world.

But Nvidia says the new system in Tianjin—which is being formally announced Thursday at an event in China—was able to reach 2.5 petaflops. That is a measure of calculating speed ordinarily translated into a thousand trillion operations per second. It is more than 40% higher than the mark set last June by a system called Jaguar at Oak Ridge National Laboratory that previously stood at No. 1 on a twice-yearly ranking of the 500 fastest supercomputers.”

The NYT and HPCwire also have good overview articles. The HPCwire article points out that the Tianhe-1A has a relatively low Linpack efficiency compared to the Jaguar.

“Although the Linpack performance is a stunning 2.5 petaflops, the system left a lot of potential FLOPS in the machine. Its peak performance is 4.7 petaflops, yielding a Linpack efficiency of just over 50 percent. To date, this is a rather typical Linpack yield for GPGPU-accelerated supers. Because the GPUs are stuck on the relatively slow PCIe bus, the overhead of sending calculations to the graphics processors chews up quite a few cycles on both the CPUs and GPUs. By contrast, the CPU-only Jaguar has a Linpack/peak efficiency of 75 percent. Even so, Tianhe-1A draws just 4 megawatts of power, while Jaguar uses nearly 7 megawatts and yields 30 percent less Linpack.”
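The arithmetic behind the efficiency figures quoted above can be checked with a few lines of Python. The Tianhe-1A numbers (2.507 petaflops Linpack, 4.7 petaflops peak, 4 megawatts) come from the text. The Jaguar Linpack value is not stated there, so the sketch derives a rough estimate from the "30 percent less" remark and treats "nearly 7 megawatts" as 7. Both are assumptions, not measured figures.

```python
def linpack_efficiency(rmax_pflops: float, rpeak_pflops: float) -> float:
    """Fraction of theoretical peak achieved on the Linpack benchmark."""
    return rmax_pflops / rpeak_pflops


def pflops_per_megawatt(rmax_pflops: float, power_mw: float) -> float:
    """Linpack petaflops delivered per megawatt of power drawn."""
    return rmax_pflops / power_mw


# Figures quoted in the post for Tianhe-1A.
TIANHE_RMAX = 2.507   # petaflops, Linpack
TIANHE_RPEAK = 4.7    # petaflops, theoretical peak
TIANHE_POWER = 4.0    # megawatts

# Assumption: Jaguar's Linpack estimated as 30 percent less than Tianhe-1A's,
# and "nearly 7 megawatts" rounded to 7. These are rough, derived values.
JAGUAR_RMAX_ESTIMATE = 0.7 * TIANHE_RMAX
JAGUAR_POWER_ESTIMATE = 7.0

tianhe_efficiency = linpack_efficiency(TIANHE_RMAX, TIANHE_RPEAK)
tianhe_per_mw = pflops_per_megawatt(TIANHE_RMAX, TIANHE_POWER)
jaguar_per_mw = pflops_per_megawatt(JAGUAR_RMAX_ESTIMATE, JAGUAR_POWER_ESTIMATE)

if __name__ == "__main__":
    print(f"Tianhe-1A Linpack efficiency: {tianhe_efficiency:.1%}")
    print(f"Tianhe-1A petaflops per MW:   {tianhe_per_mw:.3f}")
    print(f"Jaguar petaflops per MW (est): {jaguar_per_mw:.3f}")
```

Under these assumptions the Tianhe-1A efficiency comes out at a little over 53 percent, matching the "just over 50 percent" in the quote, and its Linpack yield per megawatt is roughly two and a half times the Jaguar estimate.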

The (unofficial) “official” list of the fastest supercomputers is TOP500, which seems to be inaccessible at the moment, no doubt because of the heavy load caused by the news stories above. The TOP500 list is due for a refresh next month.

How Rapleaf is eroding our privacy on the Web

October 24th, 2010, by Tim Finin, posted in Privacy, Social media

RapLeaf knows what you did last summer.

The Wall Street Journal continues its exploration of how our privacy is eroding on the Web in a new article by Emily Steel, A Web Pioneer Profiles Users by Name. The article profiles the San Francisco startup RapLeaf, which defines its vision as follows.

“We want every person to have a meaningful, personalized experience – whether online or offline. We want you to see the right content at the right time, every time. We want you to get better, more personalized service. To achieve this, we help Fortune 2000 companies gain insight into their customers, engage them more meaningfully, and deliver the right message at the right time. We also help consumers understand their online footprint.”

RapLeaf ties email addresses to profiles with information about people and uses the profiles to target advertisements for clients. The article shows the information collected for one person, Linda Twombly of Nashua NH, and what some of the coded information means.

RapLeaf does allow you to see the information it has collected about you, but you have to create a RapLeaf account to see it. You might be surprised by how well it knows you. Visit this page to see if your browser has RapLeaf cookies. You can also use it to opt out your email addresses from the RapLeaf system.

To be fair, RapLeaf and other companies are not doing anything illegal and mainly collect information that people choose to make public on the Web. However, their use of cookies does allow them to aggregate and integrate information about individuals and to associate that information with email addresses, Facebook UIDs and dozens of other identifiers. The information can be used to help Web-based systems serve you better — but their idea of serving you better is likely to involve peppering you with targeted ads.

How RapLeaf collects information about Web users

WSJ: many Facebook apps transmit user IDs to advertising and tracking companies

October 17th, 2010, by Tim Finin, posted in Facebook, Privacy, Social media, Web

This Wall Street Journal article says that many of the most popular of the 550,000 Facebook apps (!) have been transmitting identifying information about users and their friends to dozens of advertising and Internet tracking companies.

“The apps reviewed by the Journal were sending Facebook ID numbers to at least 25 advertising and data firms, several of which build profiles of Internet users by tracking their online activities.

Defenders of online tracking argue that this kind of surveillance is benign because it is conducted anonymously. In this case, however, the Journal found that one data-gathering firm, RapLeaf Inc., had linked Facebook user ID information obtained from apps to its own database of Internet users, which it sells. RapLeaf also transmitted the Facebook IDs it obtained to a dozen other firms, the Journal found.

RapLeaf said that transmission was unintentional. “We didn’t do it on purpose,” said Joel Jewitt, vice president of business development for RapLeaf.”

Update: Facebook responds.

The Maverick Meerkat is here

October 10th, 2010, by Krishnamurthy Viswanathan, posted in GENERAL

Today is the 10th day of the 10th month of year ’10. Canonical today released Ubuntu 10.10 (Maverick Meerkat), sidestepping its usual Thursday release. Go to ubuntu.com to give it a spin. As usual, you can download it for free and burn it on a CD. It has all the great features that we are used to, plus a couple of cool new ones. Remember that you can always use the ISO to try out all the features in the new release without installing it.

The new Ubiquity installer has been redesigned to be easier to use, and it also installs drivers and downloads updates even as it is installing the OS. Their new service, Ubuntu One, offers 2 GB of free “personal cloud” space to users and also provides sharing and syncing options. A beta for the Microsoft Windows client is set to begin soon.

The server edition of Ubuntu 10.10 is touted as “the default open-source choice for cloud computing.” You can also try Ubuntu 10.10 server on Amazon EC2 for an hour, free.

Twitter turns to ads

October 10th, 2010, by Tim Finin, posted in Social media, Twitter, Web

Sic transit gloria mundi.

After building a huge audience, Twitter turns to ads to cash in:

“In the last two weeks, the company has introduced several advertising plans, courted Madison Avenue at Advertising Week, the annual industry convention, and promoted Dick Costolo, who has led Twitter’s ad program, to chief executive — all signs that Twitter means business about business.

Advertisers pay for Promoted Tweets to appear at the top of search results. … Promoted Tweets will eventually show up in Twitter timelines, not just when people search, based on the interests of people that users follow. Twitter also sells Promoted Trends, so advertisers can show up in the list of topics most discussed on Twitter, for $100,000 a day.”

It seems like AdBlock already suppresses the Promoted Tweets, at least this one.


Twitter promoted tweet

Google robot-controlled car frees users to text

October 9th, 2010, by Tim Finin, posted in Agents, AI, Google

No, this is not an article from The Onion, but Google is working on a computer-controlled car. Two articles for tomorrow’s New York Times describe a research project at Google on developing an autonomous vehicle. Here is a picture of the prototype.

Google autonomous vehicle

In the science section, John Markoff has a story, Google Cars Drive Themselves, in Traffic.

“Anyone driving the twists of Highway 1 between San Francisco and Los Angeles recently may have glimpsed a Toyota Prius with a curious funnel-like cylinder on the roof. Harder to notice was that the person at the wheel was not actually driving. A self-driving car developed and outfitted by Google, with device on roof, cruising along recently on Highway 101 in Mountain View, Calif. The car is a project of Google, which has been working in secret but in plain view on vehicles that can drive themselves, using artificial-intelligence software that can sense anything near the car and mimic the decisions made by a human driver.”

A companion article, also by Markoff, has some additional material, including this interesting note on the current approach.

“One main technique used by the Google team is known as SLAM, or simultaneous localization and mapping, which builds and updates a map of a vehicle’s surroundings while keeping the vehicle located within the map. To make a SLAM map, the car is first driven manually along a route while its sensors capture location, feature and obstacle data. Then a group of software engineers annotates the maps, making certain that road signs, crosswalks, street lights and unusual features are all embedded. The cars then drive autonomously over the mapped routes, recording changes as they occur and updating the map. The researchers said they were surprised to find how frequently the roads their robots drove on had changed.”
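The last step in the quote, driving a mapped route again and recording what has changed, can be sketched as a simple comparison between an annotated map and fresh observations. This is not a SLAM implementation and not Google's code. It assumes the hard parts (localizing the car and matching each observed landmark to a map entry by ID) are already done, and the function name `diff_map`, the landmark names, and the one-meter tolerance are all hypothetical.

```python
import math

Position = tuple[float, float]


def diff_map(
    annotated: dict[str, Position],
    observed: dict[str, Position],
    tolerance: float = 1.0,
) -> tuple[dict[str, Position], dict[str, Position], dict[str, Position]]:
    """Compare annotated landmark positions with fresh observations.

    Returns three dicts: landmarks newly seen, landmarks no longer seen,
    and landmarks whose position shifted by more than `tolerance`.
    """
    added = {k: v for k, v in observed.items() if k not in annotated}
    removed = {k: v for k, v in annotated.items() if k not in observed}
    moved = {}
    for key in annotated.keys() & observed.keys():
        (ax, ay), (ox, oy) = annotated[key], observed[key]
        if math.hypot(ox - ax, oy - ay) > tolerance:
            moved[key] = observed[key]
    return added, removed, moved


def update_map(annotated: dict[str, Position], observed: dict[str, Position]) -> dict[str, Position]:
    """Return a new map that reflects the latest observations."""
    added, removed, moved = diff_map(annotated, observed)
    updated = {k: v for k, v in annotated.items() if k not in removed}
    updated.update(added)
    updated.update(moved)
    return updated
```

Keeping the diff separate from the update makes it easy to log how often a route changes, which is the observation the researchers found surprising.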

The project was the idea of Stanford computer science professor Sebastian Thrun, who is also a Principal Engineer at Google, where he helped invent the Street View mapping service. Thrun led the Stanford team that developed Stanley, the robot car that won the 2005 DARPA Grand Challenge, a competition focused on developing autonomous vehicle technology.

It’s not clear what the business case is for this Google research project. But Google has the cash and the intellectual capital to develop something in this space that might actually make money.

In a Google blog post from earlier today, What we’re driving at, Thrun gives one motivation.

“Larry and Sergey founded Google because they wanted to help solve really big problems using technology. And one of the big problems we’re working on today is car safety and efficiency. Our goal is to help prevent traffic accidents, free up people’s time and reduce carbon emissions by fundamentally changing car use.

So we have developed technology for cars that can drive themselves. Our automated cars, manned by trained operators, just drove from our Mountain View campus to our Santa Monica office and on to Hollywood Boulevard. They’ve driven down Lombard Street, crossed the Golden Gate bridge, navigated the Pacific Coast Highway, and even made it all the way around Lake Tahoe. All in all, our self-driving cars have logged over 140,000 miles. We think this is a first in robotics research.”

Update: TechCrunch has an article speculating on the possible business applications, World-Changing Awesome Aside, How Will The Self-Driving Google Car Make Money?.

New Facebook Groups Considered Somewhat Harmful

October 7th, 2010, by Tim Finin, posted in Facebook, Privacy, Security, Social media

I always think of things I should have added in the hour after making a post. Sigh. Here goes…

The situation is perhaps not so different from mailing lists, Google groups or any number of similar systems. I can set up one of those and add people to them without their consent, even people who are not my friends. Even people whom I don’t know and who don’t know me. Such email-oriented lists can also have public membership lists. The only check on this is that most mailing list frameworks send a notice to people being added informing them of the action. But many frameworks allow the list owner to suppress such notifications.

But still, Facebook seems different, based on how the rest of it is configured and on how people use it. I believe that a common expectation would be that if you are listed as a member of an open or closed group, you are a willing member.

When you get a notification that you are now a member of the Facebook group Crazy people who smell bad, you can leave the group immediately. But we have Facebook friends, many of them in fact, who only check in once a month or even less frequently. Notifications of their being added to a group will probably be missed.

Facebook should fix this by requiring that anyone added to a group confirm that they want to be in the group before they become members. After fixing it, there’s lots more that can be done to make Facebook groups a powerful way for assured information sharing.

New Facebook Groups Considered Harmful

October 7th, 2010, by Tim Finin, posted in Facebook, Privacy, Security, Social, Social media

Facebook has rolled out a new version of groups announced on the Facebook blog.

“Until now, Facebook has made it easy to share with all of your friends or with everyone, but there hasn’t been a simple way to create and maintain a space for sharing with the small communities of people in your life, like your roommates, classmates, co-workers and family.

Today we’re announcing a completely overhauled, brand new version of Groups. It’s a simple way to stay up to date with small groups of your friends and to share things with only them in a private space. The default setting is Closed, which means only members see what’s going on in a group.”

There are three kinds of groups: open, closed and secret. Open groups have public membership listings and public content. Closed ones have public membership but private content. For secret groups, both the membership and content are private.
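The three kinds of groups amount to a small visibility table, which can be written down as a sketch in Python. The names `GroupKind`, `VISIBILITY`, `can_see_members` and `can_see_content` are made up for illustration and are not Facebook's API. The table simply restates the rules above: open groups expose members and content, closed groups expose members only, and secret groups expose neither.

```python
from enum import Enum


class GroupKind(Enum):
    OPEN = "open"
    CLOSED = "closed"
    SECRET = "secret"


# kind -> (membership visible to non-members, content visible to non-members)
VISIBILITY = {
    GroupKind.OPEN: (True, True),
    GroupKind.CLOSED: (True, False),
    GroupKind.SECRET: (False, False),
}


def can_see_members(kind: GroupKind, viewer_is_member: bool) -> bool:
    """Members always see the member list; others depend on the group kind."""
    return viewer_is_member or VISIBILITY[kind][0]


def can_see_content(kind: GroupKind, viewer_is_member: bool) -> bool:
    """Members always see the content; others depend on the group kind."""
    return viewer_is_member or VISIBILITY[kind][1]
```

The closed row is the one that matters for the privacy concern: an outsider cannot read the posts but can still see who belongs to the group.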

A key part of the idea is that the group members collectively define who is in the group, spreading the work of setting up and maintaining the group over many people.

But a serious issue with the new Facebook group framework is that a member can unilaterally add any of their friends to a group. No confirmation is required by the person being added. This was raised as an issue by Jason Calacanis.

The constraint that one can only add Facebook friends to a group one belongs to does offer some protection against ending up in unwanted groups (e.g., groups created by spammers). But it could still lead to problems. I could, for example, create a closed group named Crazy people who smell bad and add all of my friends without their consent. Since the group is not secret like this one, anyone can see who is in the group. Worse yet, I could then leave the group. (By the way, let me know if you want to join any of these groups.)

While this might just be an annoying prank, it could spin out of control: what might happen if one of your so-called friends adds you to the new, closed “Al-Qaeda lovers” group?

The good news is that this should be easy to fix. After all, Facebook does require confirmation for the friend relation and has a mechanism for recommending that friends like pages or try apps. Either mechanism would work for inviting others to join groups.
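The confirmation step proposed above can be sketched as a small data structure in which being added to a group only creates a pending invitation. This is a hypothetical model, not how Facebook implements groups. The `Group` class, its method names, and the way friendships are passed in as a plain set are all assumptions made to keep the example short.

```python
class Group:
    """A group where invited people become members only after confirming."""

    def __init__(self, name: str, founder: str) -> None:
        self.name = name
        self.members: set[str] = {founder}
        self.pending: set[str] = set()

    def invite(self, inviter: str, invitee: str, friends_of_inviter: set[str]) -> None:
        """Record an invitation; the invitee is NOT yet listed as a member."""
        if inviter not in self.members:
            raise PermissionError("only members can invite others")
        if invitee not in friends_of_inviter:
            raise PermissionError("members can only invite their own friends")
        if invitee not in self.members:
            self.pending.add(invitee)

    def confirm(self, invitee: str) -> None:
        """The invitee accepts and becomes a visible member."""
        if invitee not in self.pending:
            raise KeyError(f"no pending invitation for {invitee}")
        self.pending.remove(invitee)
        self.members.add(invitee)

    def decline(self, invitee: str) -> None:
        """The invitee refuses; nothing about them is ever shown publicly."""
        self.pending.discard(invitee)
```

With this design a friend who checks in once a month is never shown as a member in the meantime, because the public member list only contains people who have confirmed.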

We have started working with a new group-centric secure information sharing model being developed by Ravi Sandhu and others as a foundation for better access and privacy controls in social media systems. It seems like a great match.

See update.

How the DC Internet voting pilot was hacked

October 6th, 2010, by Tim Finin, posted in cybersecurity, Security, Social

University of Michigan professor J. Alex Halderman explains how his research group compromised the Washington DC online voting pilot in his blog post, Hacking the D.C. Internet Voting Pilot.

“The District of Columbia is conducting a pilot project to allow overseas and military voters to download and return absentee ballots over the Internet. Before opening the system to real voters, D.C. has been holding a test period in which they’ve invited the public to evaluate the system’s security and usability. … Within 36 hours of the system going live, our team had found and exploited a vulnerability that gave us almost total control of the server software, including the ability to change votes and reveal voters’ secret ballots. In this post, I’ll describe what we did, how we did it, and what it means for Internet voting.”

The problem was a shell-injection vulnerability that involved the procedure used to upload absentee ballots. Halderman concludes:

“The specific vulnerability that we exploited is simple to fix, but it will be vastly more difficult to make the system secure. We’ve found a number of other problems in the system, and everything we’ve seen suggests that the design is brittle: one small mistake can completely compromise its security. I described above how a small error in file-extension handling left the system open to exploitation. If this particular problem had not existed, I’m confident that we would have found another way to attack the system.”
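For readers unfamiliar with the term, the general shape of a shell-injection bug can be shown with a short Python sketch. This is not the D.C. system's code. The functions `run_unsafe` and `run_safe` and the malicious filename are hypothetical, and a harmless `echo` stands in for whatever program processes an uploaded file. The sketch assumes a POSIX system where `echo` is available.

```python
import subprocess


def run_unsafe(filename: str) -> str:
    """VULNERABLE: the user-supplied name is pasted into a shell command line."""
    result = subprocess.run(
        "echo received " + filename,  # the shell interprets ';' and other metacharacters
        shell=True,
        capture_output=True,
        text=True,
    )
    return result.stdout


def run_safe(filename: str) -> str:
    """Safer: arguments are passed as a list, so no shell ever parses the name."""
    result = subprocess.run(
        ["echo", "received", filename],
        capture_output=True,
        text=True,
    )
    return result.stdout


if __name__ == "__main__":
    # Hypothetical attacker-chosen filename with a command appended after ';'.
    malicious = "ballot.pdf; echo INJECTED"
    print(run_unsafe(malicious))  # the appended command runs as a second command
    print(run_safe(malicious))    # the whole string is treated as one argument
```

In the unsafe version the text after the semicolon runs as a separate command, so whoever chooses the filename can run commands on the server. Passing an argument list avoids this particular problem, though as Halderman notes, fixing one hole does not make a brittle design secure.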

Smart phones to absorb credit cards with RFID?

October 5th, 2010, by Tim Finin, posted in Apple, Mobile Computing, RFID

Fast Company has an article, Credit Cards Will Go Electronic, Then Disappear Into iPhone 5, predicting the merger of RFID-enabled credit cards and smart phones.

“Nokia plans to add antennas and RFID communications chips into its phones soon, and Apple has been patenting the heck out of the idea, but both companies were probably going to rely on an in-phone antenna loop. It seems increasingly certain Apple is going to bring RFID into common usage with the iPhone for 2011 (the iPhone 5) because there’s a new patent that shows just how far Apple has gone with design thinking for RFID. The patent shows how an RFID loop, powerful enough to act as both RFID tag or a tag-reader, can actually be built right into the complex layered circuitry of the iPhone (or iPod Touch) screen. We know Apple is fond of highly-polished design and integration, and this innovation is no exception. The screen has to be exposed by its very nature, which is good for RFID purposes — the wireless signal is unobstructed by other bulk in the smartphone, and it frees up Apple to do what it likes with the rest of the phone’s design.”

Maybe building RFID into smart phones will finally unleash the potential the technology offers for cool, people-oriented applications, as opposed to boring inventory management tasks. However, I don’t like the idea of not being able to use my credit card because my phone ran out of power.

Stuxnet worm update

October 5th, 2010, by Tim Finin, posted in cybersecurity

From Slashdot earlier today:

“Numerous Stuxnet related stories continue to flow through my bin today, so brace yourself: Unsurprisingly, Iran blames Stuxnet on a plot set up by the west designed to infect its nuclear facilities. A Symantec researcher analyzed the code and put forth attack scenarios. A threatpost researcher writes about the sophistication of the worm. Finally, Dutch multinationals have revealed that the worm is also attacking them. We may never know what this thing was really all about.”
