UMBC ebiquity
Web

Archive for the 'Web' Category

Recorded Future analyses streaming Web data to predict the future

October 30th, 2010, by Tim Finin, posted in AI, Datamining, Google, Machine Learning, NLP, Search, Semantic Web, Social media

Recorded Future, a Boston-based startup with backing from Google and In-Q-Tel, uses sophisticated linguistic and statistical algorithms to extract time-related information about entities and events from streams of Web data. Their goal is to help their clients understand how the relationships between entities and events of interest are changing over time and to make predictions about the future.

Recorded Future system architecture

A recent Technology Review article, See the Future with a Search, describes it this way.

“Conventional search engines like Google use links to rank and connect different Web pages. Recorded Future’s software goes a level deeper by analyzing the content of pages to track the “invisible” connections between people, places, and events described online.
   “That makes it possible for me to look for specific patterns, like product releases expected from Apple in the near future, or to identify when a company plans to invest or expand into India,” says Christopher Ahlberg, founder of the Boston-based firm.
   A search for information about drug company Merck, for example, generates a timeline showing not only recent news on earnings but also when various drug trials registered with the website clinicaltrials.gov will end in coming years. Another search revealed when various news outlets predict that Facebook will make its initial public offering.
   That is done using a constantly updated index of what Ahlberg calls “streaming data,” including news articles, filings with government regulators, Twitter updates, and transcripts from earnings calls or political and economic speeches. Recorded Future uses linguistic algorithms to identify specific types of events, such as product releases, mergers, or natural disasters, the date when those events will happen, and related entities such as people, companies, and countries. The tool can also track the sentiment of news coverage about companies, classifying it as either good or bad.”

Pricing for access to their online services and API starts at $149 a month, but there is a free Futures email alert service through which you can get the results of some standing queries on a daily or weekly basis. You can also explore the capabilities they offer through their page on the 2010 US Senate Races.

“Rather than attempt to predict how the races will turn out, we have drawn from our database the momentum, best characterized as online buzz, and sentiment, both positive and negative, associated with the coverage of the 29 candidates in 14 interesting races. This dashboard is meant to give the view of a campaign strategist, as it measures how well a campaign has done in getting the media to speak about the candidate, and whether that coverage has been positive, in comparison to the opponent.”

Their blog reveals some insights on the technology they are using and much more about the business opportunities they see. Clearly the company is leveraging named entity recognition, event recognition and sentiment analysis. A short white paper, A White Paper on Temporal Analytics, has some details on their overall approach.

How Rapleaf is eroding our privacy on the Web

October 24th, 2010, by Tim Finin, posted in Privacy, Social media

RapLeaf knows what you did last summer.

The Wall Street Journal continues its exploration of how our privacy is eroding on the Web in a new article by Emily Steel, A Web Pioneer Profiles Users by Name. The article profiles the San Francisco startup RapLeaf, which defines its vision as follows.

“We want every person to have a meaningful, personalized experience – whether online or offline. We want you to see the right content at the right time, every time. We want you to get better, more personalized service. To achieve this, we help Fortune 2000 companies gain insight into their customers, engage them more meaningfully, and deliver the right message at the right time. We also help consumers understand their online footprint.”

RapLeaf ties email addresses to profiles with information about people and uses the profiles to target advertisements for clients. The article shows the information collected for one person, Linda Twombly of Nashua NH, and what some of the coded information means.

RapLeaf does allow you to see the information it has collected about you, but you have to create a RapLeaf account to see it. You might be surprised by how well it knows you. Visit this page to see if your browser has RapLeaf cookies. You can also use it to opt out your email addresses from the RapLeaf system.

To be fair, RapLeaf and other companies are not doing anything illegal and mainly collect information that people choose to make public on the Web. However, their use of cookies does allow them to aggregate and integrate information about individuals and to associate that information with email addresses, Facebook UIDs and dozens of other identifiers. The information can be used to help Web-based systems serve you better — but their idea of serving you better is likely to involve peppering you with targeted ads.

How RapLeaf collects information about Web users

WSJ: many Facebook apps transmit user IDs to advertising and tracking companies

October 17th, 2010, by Tim Finin, posted in Facebook, Privacy, Social media, Web

This Wall Street Journal article says that many of the most popular of the 550,000 Facebook apps (!) have been transmitting identifying information about users and their friends to dozens of advertising and Internet tracking companies.

“The apps reviewed by the Journal were sending Facebook ID numbers to at least 25 advertising and data firms, several of which build profiles of Internet users by tracking their online activities.

Defenders of online tracking argue that this kind of surveillance is benign because it is conducted anonymously. In this case, however, the Journal found that one data-gathering firm, RapLeaf Inc., had linked Facebook user ID information obtained from apps to its own database of Internet users, which it sells. RapLeaf also transmitted the Facebook IDs it obtained to a dozen other firms, the Journal found.

RapLeaf said that transmission was unintentional. “We didn’t do it on purpose,” said Joel Jewitt, vice president of business development for RapLeaf.”

Update: Facebook responds.

Twitter turns to ads

October 10th, 2010, by Tim Finin, posted in Social media, Twitter, Web

Sic transit gloria mundi.

After building a huge audience, Twitter turns to ads to cash in:

“In the last two weeks, the company has introduced several advertising plans, courted Madison Avenue at Advertising Week, the annual industry convention, and promoted Dick Costolo, who has led Twitter’s ad program, to chief executive — all signs that Twitter means business about business.

Advertisers pay for Promoted Tweets to appear at the top of search results. … Promoted Tweets will eventually show up in Twitter timelines, not just when people search, based on the interests of people that users follow. Twitter also sells Promoted Trends, so advertisers can show up in the list of topics most discussed on Twitter, for $100,000 a day.”

It seems like AdBlock already suppresses the Promoted Tweets, at least this one.


Twitter promoted tweet

New Facebook Groups Considered Somewhat Harmful

October 7th, 2010, by Tim Finin, posted in Facebook, Privacy, Security, Social media

I always think of things I should have added in the hour after making a post. Sigh. Here goes…

The situation is perhaps not so different from mailing lists, Google Groups or any number of similar systems. I can set up one of those and add people to it without their consent, even people who are not my friends. Even people whom I don’t know and who don’t know me. Such email-oriented lists can also have public membership lists. The only check on this is that most mailing list frameworks send a notice to people being added informing them of the action. But many frameworks allow the list owner to suppress such notifications.

But still, Facebook seems different, based on how the rest of it is configured and on how people use it. I believe a common expectation is that if you are listed as a member of an open or closed group, you are a willing member.

When you get a notification that you are now a member of the Facebook group Crazy people who smell bad, you can leave the group immediately. But we have Facebook friends, many of them in fact, who only check in once a month or even less frequently. Notifications of their being added to a group will probably be missed.

Facebook should fix this by requiring that anyone added to a group confirm that they want to be in the group before they become members. After fixing it, there’s lots more that can be done to make Facebook groups a powerful way for assured information sharing.

New Facebook Groups Considered Harmful

October 7th, 2010, by Tim Finin, posted in Facebook, Privacy, Security, Social, Social media

Facebook has rolled out a new version of groups announced on the Facebook blog.

“Until now, Facebook has made it easy to share with all of your friends or with everyone, but there hasn’t been a simple way to create and maintain a space for sharing with the small communities of people in your life, like your roommates, classmates, co-workers and family.

Today we’re announcing a completely overhauled, brand new version of Groups. It’s a simple way to stay up to date with small groups of your friends and to share things with only them in a private space. The default setting is Closed, which means only members see what’s going on in a group.”

There are three kinds of groups: open, closed and secret. Open groups have public membership listings and public content. Closed ones have public membership but private content. For secret groups, both the membership and content are private.

A key part of the idea is that the group members collectively define who is in the group, spreading the work of setting up and maintaining the group over many people.

But a serious issue with the new Facebook group framework is that a member can unilaterally add any of their friends to a group. No confirmation is required from the person being added. This was raised as an issue by Jason Calacanis.

The constraint that one can only add Facebook friends to a group one belongs to does offer some protection against ending up in unwanted groups (e.g., by spammers). But it could still lead to problems. I could, for example, create a closed group named Crazy people who smell bad and add all of my friends without their consent. Since the group is not secret like this one, anyone can see who is in the group. Worse yet, I could then leave the group. (By the way, let me know if you want to join any of these groups.)

While this might just be an annoying prank, it could spin out of control. What might happen if one of your so-called friends adds you to the new, closed “Al-Qaeda lovers” group?

The good news is that this should be easy to fix. After all, Facebook does require confirmation for the friend relation and has a mechanism for recommending that friends like pages or try apps. Either mechanism would work for inviting others to join groups.
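To make the proposed fix concrete, here is a minimal Python sketch of the three group kinds and an invite-then-confirm step. It is only an illustration of the idea in this post, not how Facebook implements groups; the Group class, the VISIBILITY table and the friends dictionary are all hypothetical names invented for the example.

```python
from dataclasses import dataclass, field

# (membership is public, content is public) for each kind of group,
# following the open / closed / secret description in this post.
VISIBILITY = {
    "open": (True, True),
    "closed": (True, False),
    "secret": (False, False),
}


@dataclass
class Group:
    """Hypothetical group model; not Facebook's actual data model."""
    name: str
    kind: str
    members: set = field(default_factory=set)
    pending: set = field(default_factory=set)

    def invite(self, inviter, invitee, friends):
        """A member may invite a friend, who is only marked as pending."""
        if inviter not in self.members:
            raise PermissionError("only members can invite")
        if invitee not in friends.get(inviter, set()):
            raise PermissionError("members can only invite their friends")
        self.pending.add(invitee)

    def confirm(self, invitee):
        """The invitee becomes a member only after confirming."""
        if invitee not in self.pending:
            raise KeyError(invitee)
        self.pending.remove(invitee)
        self.members.add(invitee)

    def visible_members(self, viewer):
        """Membership list as seen by viewer, given the group kind."""
        members_public, _ = VISIBILITY[self.kind]
        if members_public or viewer in self.members:
            return set(self.members)
        return set()
```

With this design, being invited to a closed group never exposes someone in its public membership list until they call confirm, which is the check the current framework lacks.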

We have started working with a new group-centric secure information sharing model being developed by Ravi Sandhu and others as a foundation for better access and privacy controls in social media systems. It seems like a great match.

See update.

An agent-based model of the peer-review process

September 19th, 2010, by Tim Finin, posted in Agents, AI, Social media

The peer review process is central to most research disciplines and is used in the selection of papers for publication and research proposals for funding.

A new paper by Stefan Thurner and Rudolf Hanel develops an agent-based model of the scientific peer review process, Peer-review in a world with rational scientists: Toward selection of the average.

“… we are interested in the effects of rational referees, who might not have any incentive to see high quality work other than their own published or promoted. We find that a small fraction of incorrect (selfish or rational) referees can drastically reduce the quality of the published (accepted) scientific standard. We quantify the fraction for which peer review will no longer select better than pure chance. Decline of quality of accepted scientific work is shown as a function of the fraction of rational and unqualified referees. We show how a simple quality-increasing policy of e.g. a journal can lead to a loss in overall scientific quality, and how mutual support-networks of authors and referees deteriorate the system.”

Their agent model has several reviewer types:

  • The correct: Accepts good and rejects bad papers.
  • The stupid: This referee can not judge the quality of a paper (e.g. because of incompetence or lack of time) and takes a random decision on a paper.
  • The rational: The rational referee knows that work better than his/her own might draw attention away from his/her own work. For him there is no incentive to accept anything better than one’s own work, while it might be fine to accept worse quality.
  • The altruist: Accepts all papers.
  • The misanthropist: Rejects all papers.

I’ve known them all, as I am sure many of us have. As an editor or program chair I’ve met a few other types, including these:

  • The Bartleby: His or her response to an invitation is always “I would prefer not to.”
  • The Black Hole: Messages go in and nothing ever comes out.
  • The Gary Cooper: A person of few words, even when many are called for.
  • The Perseverator: Sees all sides of any decision and keeps them all carefully in balance. Usually recommends “major revision”.

I am sure I’ve overlooked some — suggest your own via a comment.

(h/t Shlomo Argamon)
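The five referee types from the Thurner and Hanel model listed above can be turned into a toy simulation. What follows is a rough sketch under my own simplifying assumptions, not the paper's actual model: paper quality and the referee's own quality are drawn uniformly from [0, 1], each paper is judged by a single referee, the correct referee uses a fixed threshold, and the stupid referee flips a fair coin. The function names decide and simulate are hypothetical.

```python
import random
from statistics import mean


def decide(kind, quality, threshold, own_quality, rng):
    """Return True if a referee of the given kind accepts the paper."""
    if kind == "correct":
        return quality >= threshold        # accepts good, rejects bad
    if kind == "stupid":
        return rng.random() < 0.5          # random decision (assumed fair coin)
    if kind == "rational":
        return quality <= own_quality      # never accepts work better than own
    if kind == "altruist":
        return True                        # accepts all papers
    if kind == "misanthropist":
        return False                       # rejects all papers
    raise ValueError(f"unknown referee kind: {kind}")


def simulate(referee_mix, n_papers=10_000, threshold=0.5, seed=0):
    """Mean quality of accepted papers for a given mix of referee kinds.

    referee_mix maps a referee kind to its relative weight in the pool.
    """
    rng = random.Random(seed)
    kinds = list(referee_mix)
    weights = [referee_mix[k] for k in kinds]
    accepted = []
    for _ in range(n_papers):
        quality = rng.random()        # assumed: uniform paper quality
        own_quality = rng.random()    # assumed: uniform referee quality
        kind = rng.choices(kinds, weights=weights)[0]
        if decide(kind, quality, threshold, own_quality, rng):
            accepted.append(quality)
    return mean(accepted) if accepted else float("nan")
```

Comparing simulate({"correct": 1.0}) with simulate({"correct": 0.5, "rational": 0.5}) gives a feel for the paper's qualitative claim that rational referees pull down the average quality of accepted work, though the numbers from this toy version should not be read as reproducing their results.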

Zuck opens up

September 13th, 2010, by Tim Finin, posted in Facebook, Social media

Jose Antonio Vargas profiles Mark Zuckerberg in this week’s New Yorker in The Face of Facebook, Mark Zuckerberg opens up. It’s a short piece, but I learned a few facts. One in fourteen people in the world has a Facebook account. All of Zuckerberg’s acquaintances call him Zuck. Zuck has eight hundred and seventy-nine Facebook friends. Zuck likes Ender’s Game and roasting goats. He considers himself an “awkward person”. Not mentioned in the article, but of possible interest, is that The Social Network opens on October 1.

Facebook Browser gets a low F1-score in my book

September 12th, 2010, by Tim Finin, posted in Semantic Web, Social media, Web

Facebook has rolled out Facebook Browser as what sounds like a simple and effective idea: recommend pages based on a user’s country and social network. My impression is mixed, however. While I like its top recommendation for me, I am already a fan. Its suggestions for the celebrities category are a bust: Rush Limbaugh, Glenn Beck, Michelle Malkin, Mark Levin, Red Green and Bill O’Reilly. And Movies? Don’t even go there! Maybe it’s trying to tell me I need a new set of friends? Inside Facebook summarizes Facebook Browser this way:

“Facebook has launched a new way to “Discover Facebook’s Popular Pages” called Browser. It shows icons of Pages that are popular in a user’s country, but factors in which Pages are popular amongst their unique friend network. When the Page icons are hovered over they display a Like button. Browser could cause popular Pages to get more popular, widening the gap between them and smaller Pages, similar to the frequently criticized and since abandoned Twitter Suggested User List.”

I think the idea is sound, though, and I like my Facebook friends. So, my conclusion is that Facebook needs to tweak the algorithm.

Follow UMBC Ebiquity on Twitter, Facebook and/or your feed reader

September 7th, 2010, by Tim Finin, posted in Ebiquity, Social media

We are generating short status messages for Ebiquity news and pushing them out to Twitter and Facebook. The messages generally have shortened links connecting back to the full item, which might be a new paper, an event or a blog post. For many, this will be a convenient way to track what is new on the Ebiquity site.

Now there are three easy ways to enjoy fresh Ebiquity news:

  • Check out the Ebiquity twitter page and follow @ebiquity if you want to have our tweets show up in your stream.
  • If Facebook is your thing, you can go to the UMBC Ebiquity Research Group page and click on the LIKE button to have the short Ebiquity updates show up on your wall.
  • If you’re old school, you can also view our combined news stream on Planet Ebiquity and/or get it as an atom RSS feed for your favorite feed reader.

SWSA seeks ISWC 2012 bids, 11th Int. Semantic Web Conf.

September 6th, 2010, by Tim Finin, posted in Semantic Web

The Semantic Web Science Association (SWSA) is seeking statements of interest from organizations or consortia interested in hosting the 11th International Semantic Web Conference, ISWC 2012. The conference series moves regularly between the Americas, Europe, and the Asia/Pacific region and we expect that the 2012 edition will be held in the Americas in late October or early November 2012.

Organizations wishing to host ISWC 2012 should contact SWSA President Professor James Hendler (swsa-president@aifb.uni-karlsruhe.de) who will work with the SWSA members who are co-ordinating the bidding process for ISWC 2012.

The process comprises two stages. During the first stage, statements of interest are solicited through an open call that requests responses using a simple form. Once the first stage is complete, SWSA will shortlist a number of applicants, who will be invited to submit a full proposal, using a standard form and budget template. More information about the ISWC Conference Series and the bidding process for hosting a conference in the series can be found in the ISWC Conference Guide.

The important dates for applying to host a Conference in 2012 are:

  • September 30, 2010: Deadline for receiving statements of interest
  • November 15, 2010: Notifications to shortlisted bids are sent out
  • January 15, 2011: Formal applications received from shortlisted bids
  • March 1, 2011: SWSA decides on location for the 2012 Conference

Economist on mining social networks

September 4th, 2010, by Tim Finin, posted in Social media

The Economist article Untangling the Social Web describes growing interest among business and government organizations in extracting information and making predictions by collecting and analyzing social network data. The article leads with an example of how mobile phone companies in the very competitive Indian market analyze their customers’ social networks to identify the most influential ones in order to “keep them on board with special discounts and promotions”. (See Social ties and their relevance to churn in mobile telecom networks.)

According to the Economist, there’s a big market for such software.

“By one estimate there are more than 100 programs for network analysis, also known as link analysis or predictive analysis. The raw data used may extend far beyond phone records to encompass information available from private and governmental entities, and internet sources such as Facebook. IBM, the supplier of the system used by Bharti Airtel, says its annual sales of such software, now growing at double-digit rates, will exceed $15 billion by 2015. In the past five years IBM has spent more than $11 billion buying makers of network-analysis software. Gartner, a market-research firm, ranks the technology at number two in its list of strategic business operations meriting significant investment this year.”
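As a rough illustration of the mobile phone example above, the simplest notion of an “influential” customer is one who is in contact with many distinct people. The Python sketch below ranks customers by that count (degree centrality) over a list of call records. This is my own stand-in technique, not the method used by the operators or vendors mentioned in the article, and the most_connected function and the (caller, callee) record format are assumptions made for the example.

```python
from collections import defaultdict


def most_connected(call_records, top_n=3):
    """Rank customers by number of distinct contacts (degree centrality).

    call_records is an iterable of (caller, callee) pairs; the direction
    of a call is ignored. Ties are broken alphabetically.
    """
    contacts = defaultdict(set)
    for caller, callee in call_records:
        contacts[caller].add(callee)
        contacts[callee].add(caller)
    ranked = sorted(contacts, key=lambda c: (-len(contacts[c]), c))
    return ranked[:top_n]
```

Real systems would presumably weigh call frequency, duration and the influence of a customer's contacts as well, so treat this as the smallest possible starting point.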

The article also touches on more sophisticated systems that integrate additional information, including V.S. Subrahmanian’s work on STOP:

“Called SOMA Terror Organization Portal, it analyses a wide range of information about politics, business and society in Lebanon to predict, with surprising accuracy, rocket attacks by the country’s Hizbullah militia on Israel. Attacks tend to increase, for example, as more money from Islamic charities flows into Lebanon. Attacks decrease during election years, particularly as more Hizbullah members run for office and campaign energetically. By the middle of 2010 SOMA was sucking up data from more than 200 sources, many of them newspaper websites. The number of sources will have more than doubled by the end of the year.”
