A panel on “Technologies to Understand it Now and Gain Insight in the Future” was part of the AAAI Weblogs Symposium today.
Panel: Technologies to Understand it Now and Gain Insight in the Future
Moderator: Mark Liberman
Participants: Chris Redlitz (President, Feedster); Michael Sippey (Vice President Product, Six Apart); Howard Kaushansky (CEO, Umbria Inc.); Tony Perkins (Founder and Creator, AlwaysOn Network); Andrew Bernstein (CEO, Cymfony); Carrie Grimes (Group Lead – Search Quality, Google Inc.); Cameron Marlow (Yahoo Research Berkeley)
The panel was organized around a set of questions. We summarize the panel's responses to each question below. Panelists are identified by their initials (for example, HK is Howard Kaushansky).
What information do you get from blogs? HK’s take is “market research”: putting a “finger on the pulse” of consumers. AB agrees with HK and also points to PR and marketing. AB discusses analyzing blogs alongside e-mails received by organizations and message-board discussions, and the task of correlating all of them, with blogs as the pivot. AB mentions splogs and how they are affecting their analysis. CR talks about the growth of the blogosphere. It all started as “the most recent post about X”; relevancy is becoming more important now. CR talks about the AOL connection and how AOL is using their index to track conversations in the blogosphere. CR ends by saying that traditional media is now open to including unedited content on its pages.
How good is it? How much does it matter? The discussion centered on spam and related issues. The question concerned how robust current analysis techniques are against spam. CG says not to worry about it, since data is always dirty, and notes that Blogger has cracked down on spam blog postings. CG moves away from the problem and suggests that query disambiguation and other issues are more important. HK brings the topic back by asking about the conflict between search engine revenues and spam. MS talks about how spammers go so far as to create paid accounts on TypePad, and how Six Apart does not allow automated content generation. Unlike CG, MS says the blogosphere does care about splogs. ML gives a great example in which content hijacked from his blog was listed on another site with its outlinks replaced by links to porn sites.
How will consumers use these analyses? CM raises an interesting point: privacy will become an issue in the next 5 years, as organizations increasingly generate blog data. IP rights for RSS feeds are another issue. MS says we will see a bifurcation between bloggers who want to be public and bloggers who don’t. MS also talks about privacy issues in the future. Some bloggers say, “I don’t want to be in Google’s index, but I want to talk about it with my friends.” MS says analysis engines have to worry about not having access to this data in the future. TP says privacy won’t be that important to the next generation. He says to go look at MySpace, and that tools should look at integrating blogging with social networking.
What do you need from researchers? CG gives a broad view and talks about personalization in general. MS says there are 3 million active LiveJournal (LJ) users and many communities. MS points to recommending communities as very important from their perspective. AB talks about having better tools for relevancy. CR supports the community view. TP talks about employee group blogs and how employees can make the entire organization more competitive. HK says there isn’t anything we cannot do with language technologies, but almost none of them are sufficiently accurate, so he suggests that researchers work on making them more accurate.