UMBC ebiquity research group Building intelligent systems in open, heterogeneous, dynamic, distributed environments
16 May 2008, 00:26:40 EDT  
Archive for the 'Social media' Category

Visualizing social networks

March 29th, 2008, by Tim Finin, posted in Social media

FAS Research is an Austrian company that specializes in social network analysis. Their site has a nice collection of social network visualizations that are artistic as well as informative.

Call for ISWC 2008 Research Papers

March 6th, 2008, by Tim Finin, posted in iswc, Social media, Web 2.0, Web, Semantic Web

The call for ISWC 2008 research papers for the Seventh International Semantic Web Conference is online. The track is co-chaired by Amit Sheth and Steffen Staab and has nineteen distinguished vice chairs and a program committee of experienced experts. Key dates for the research track are:

  • Abstracts due by 9 May 2008
  • Submissions due before 16 May 2008
  • Rebuttal phase during 14-16 June 2008
  • Notification sent by 11 July 2008
  • Camera ready due before 15 August 2008

Words your mobile phone is not allowed to say

March 3rd, 2008, by Tim Finin, posted in Social media, NLP, Humor, Mobile Computing

Language models are widely used in processing both written and spoken language. They are used for part of speech tagging, sense tagging, disambiguation, text similarity metrics, and many other tasks, including predicting the words a person intends when typing on a telephone keypad. The last application has some interesting wrinkles, as this video we spotted on Language Log explains.



The most popular predictive text system in use today is T9, developed by Nuance Communications. You can check out the video’s examples using this T9 demo.
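To make the keypad idea concrete, here is a minimal sketch of dictionary-based predictive text. It is not Nuance's T9 implementation, which is proprietary; it only assumes the standard letter layout on phone keys and ranks candidates with a simple unigram model. The word counts and the names `to_digits`, `build_index`, and `predict` are invented for illustration.

```python
from collections import defaultdict

# Standard letter layout on a phone keypad.
KEYPAD = {
    "2": "abc", "3": "def", "4": "ghi", "5": "jkl",
    "6": "mno", "7": "pqrs", "8": "tuv", "9": "wxyz",
}
LETTER_TO_DIGIT = {ch: d for d, letters in KEYPAD.items() for ch in letters}


def to_digits(word):
    """Map a word to the key sequence that types it, e.g. 'home' -> '4663'."""
    return "".join(LETTER_TO_DIGIT[ch] for ch in word.lower())


def build_index(word_counts):
    """Group words by key sequence, most frequent first (a unigram model)."""
    index = defaultdict(list)
    for word, count in word_counts.items():
        index[to_digits(word)].append((count, word))
    return {
        digits: [w for _, w in sorted(entries, key=lambda e: (-e[0], e[1]))]
        for digits, entries in index.items()
    }


# Hypothetical counts, invented for illustration only.
WORD_COUNTS = {"good": 50, "home": 40, "gone": 20, "hood": 5, "hello": 30}
INDEX = build_index(WORD_COUNTS)


def predict(digits):
    """Return candidate words for a key sequence, best guess first."""
    return INDEX.get(digits, [])


print(predict("4663"))  # several words share this key sequence
```

Because a lookup of this kind can only return words that are in its dictionary, any word left out of `WORD_COUNTS` is a word the phone will never offer.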

MIT NYTE project visualizes New York communications

March 1st, 2008, by Tim Finin, posted in Social media, GENERAL

AP has an article, MIT Creates Picture of NY Communications, that highlights the work of the New York Talk Exchange (NYTE) project at the MIT SENSEable City Laboratory.

“For the past two months, 24 hours a day, MIT researchers have been collecting the electronic communications of millions of New Yorkers — but not for salacious gossip or to protect national security. They’ve been building a census that shows, neighborhood by neighborhood, New York’s telephone and Internet links to other cities across the planet and how those connections change over time.” (link)

Globe Encounters visualizes in real time the volumes of Internet data flowing between New York and cities around the world. The size of the glow on a particular city location corresponds to the amount of IP traffic flowing between that place and New York City. A greater glow implies a greater IP flow.

Visualizations from the NYTE project are part of the Design and the Elastic Mind exhibit at the Museum of Modern Art, which focuses on the use of technology in design.

“New York Talk Exchange illustrates the global exchange of information in real time by visualizing volumes of long distance telephone and IP (Internet Protocol) data flowing between New York and cities around the world. In an information age, telecommunications such as the Internet and the telephone bind people across space by eviscerating the constraints of distance. To reveal the relationships that New Yorkers have with the rest of the world, New York Talk Exchange asks: How does the city of New York connect to other cities? With which cities does New York have the strongest ties and how do these relationships shift with time? How does the rest of the world reach into the neighborhoods of New York?” (link)

The data was provided to the MIT researchers by AT&T from voice and Internet traffic after being anonymized to remove any personal information.

Wikipedia research papers

February 28th, 2008, by Tim Finin, posted in Wikipedia, Social media, Web, Semantic Web

Mike Bergman has a comprehensive list of about 100 papers on Wikipedia as a knowledge source.

“Since about 2005 — and at an accelerating pace — Wikipedia has emerged as the leading online knowledge base for conducting semantic Web and related research. The system is being tapped for both data and structure. Wikipedia has arguably replaced WordNet as the leading lexicon for concepts and relations. Because of its scope and popularity, many argue that Wikipedia is emerging as the de facto structure for classifying and organizing knowledge in the 21st century.”

This complements a similar list on Wikipedia itself, Wikipedia in academic studies.

“Below is an incomplete list of academic conference presentations, peer-reviewed papers and other types of academic writing which focus on Wikipedia as their subject. Works that mention Wikipedia only in passing are unlikely to be listed. Unpublished works of presumably academic quality are listed in a dedicated section.”

(spotted on the dbpedia mailing list)

Join the ICWSM community on CrowdVine

February 26th, 2008, by Tim Finin, posted in Social media, Web 2.0, Web

We invite you to join the ICWSM 2008 social networking community site hosted by CrowdVine. ICWSM 2008 is the Second International Conference on Weblogs and Social Media, which will take place in Seattle between March 30 and April 2. If you are coming to ICWSM next month, you can use this site to help plan and shape the event, find and connect with people at the conference, and share your ideas and comments. If you aren’t able to make it to Seattle, it will provide a way for you to engage even though you can’t be there. Joining the ICWSM community on CrowdVine is easy and free, so please check it out.

No spam on Twitter?!

February 25th, 2008, by Tim Finin, posted in Twitter, Social media, splog, Blogging, Web

Can it be true? Russell Beattie posts that Twitter has nearly a million users and no spam or trolls. Spam does exist on Twitter, of course, but it does seem to be less of a problem than on the Blogosphere, the Web, or email. Maybe it’s because search engines don’t treat tweets like Web pages or blog posts.

Wisdom of the crowd control?

February 24th, 2008, by Tim Finin, posted in Wikipedia, Social media, Web 2.0, Web

Slate has an interesting article, The Wisdom of the Chaperones — Digg, Wikipedia, and the myth of Web 2.0 democracy, that explores who controls some of the popular social media sites. It turns out that the social web is more hegemonic than we thought.

“Social-media sites like Wikipedia and Digg are celebrated as shining examples of Web democracy, places built by millions of Web users who all act as writers, editors, and voters. In reality, a small number of people are running the show. According to researchers in Palo Alto, 1 percent of Wikipedia users are responsible for about half of the site’s edits. The site also deploys bots—supervised by a special caste of devoted users—that help standardize format, prevent vandalism, and root out folks who flood the site with obscenities. This is not the wisdom of the crowd. This is the wisdom of the chaperones.” (link)

The work cited is by the Augmented Social Cognition research group at PARC. See, for example, their post on the behavior of the most active Wikipedians. Very interesting.

I think it’s even worse, in many ways, on Digg, which the article also discusses.

“The same undemocratic underpinnings of Web 2.0 are on display at Digg.com. Digg is a social-bookmarking hub where people submit stories and rate others’ submissions; the most popular links gravitate to the site’s front page. The site’s founders have never hidden that they use a “secret sauce”—a confidential algorithm that’s tweaked regularly—to determine which submissions make it to the front page. Historically, this algorithm appears to have favored the site’s most active participants. Last year, the top 100 Diggers submitted 44 percent of the site’s top stories. In 2006, they were responsible for 56 percent.” (link)

Will rule by the few always be the case? Who knows. The article does point out that the moderation system used by Slashdot helps to broaden the elite and also describes a simple “write one, rate two” policy used by Helium, a site new to me. Helium is a community for freelance writers that helps them connect with publishers who will pay for articles on their topics. The publishers are vetted, so students seeking to buy term papers will have to look elsewhere.

Google slow to index blog posts?

February 24th, 2008, by Tim Finin, posted in Google, Social media, Blogging, Web

Last week I noticed that some of our blog posts took a long time to show up in the Google Blog search index. During the past year, Google has been very fast at indexing blog posts, typically taking less than five minutes from the time a post is made to when it shows up in their blog search index. But this week it seemed that our posts, or at least some of them, took more than twelve hours to be indexed.

Yesterday I tracked a post on the IT job market that I wrote just before 11:00am (GMT-5). It showed up in Google Feed Reader quickly enough but had not yet appeared in Google Blog Search when I finally went to bed 14 hours later. When I checked at 9:00am today, it was there, so indexing took somewhere between 14 and 22 hours.

It’s not the case that all posts are being delayed: do a Google Blog search for a popular term (e.g., TV) sorted by date and you’ll see posts made in the past few minutes. Nor do I think it’s related to PageRank, since their blog search ingest is based on pings rather than crawling. Besides, our blog enjoys a reasonable rank. Finally, it can’t be the case that Google’s systems are being overwhelmed by new blogs, because the growth of the Blogosphere has slowed.

So I’m puzzled about what is going on. (goomtitag)

Update 1: Posted at 9:49, in Google Feed Reader at 10:14, indexed by Google Blog Search by ~19:15 and in Google’s main index about the same time. Maybe this is a clue — it used to be the case that a post hit the blog index within a few minutes and showed up in the main index after about twelve hours. This post hit both indexes around the same time — after about ten hours. Maybe there is now just one (logical) index.

Update 2: Hmmm. Another post seems to have made it into Google’s main index before it got into the blog search index. I imagine that Google revisited our blog home page as part of its regular crawl and picked up the new post.

How to use XFN (XML Friends Network)

February 21st, 2008, by Tim Finin, posted in Social media, Web 2.0, Semantic Web

Brian Suda has a good, practical article on XFN on opera.dev — XFN encoding, extraction, and visualizations.

“In this article I will take a good look at XFN - the microformat for describing relationships between people. I will look briefly at what it is and the basic markup needed to add the information to your sites, before then going into depth, looking at the benefits you can get from that data by extracting it and using it in different ways.”

He covers the how and why of XFN and has good examples and code fragments. FOAF is only mentioned once in passing, however.
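For readers who have not seen XFN, the markup is simply a set of relationship keywords placed in the `rel` attribute of an ordinary link. The sketch below is not taken from Brian Suda's article; it is a minimal illustration, using only Python's standard `html.parser`, of how such links might be pulled out of a page. The sample HTML, the `example.org` URLs, and the names `XFNExtractor` and `extract_xfn` are made up for the example.

```python
from html.parser import HTMLParser

# Relationship values from the XFN 1.1 profile.
XFN_VALUES = {
    "friend", "acquaintance", "contact", "met", "co-worker", "colleague",
    "co-resident", "neighbor", "child", "parent", "sibling", "spouse",
    "kin", "muse", "crush", "date", "sweetheart", "me",
}


class XFNExtractor(HTMLParser):
    """Collect (href, [xfn values]) pairs from <a rel="..."> links."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        attrs = dict(attrs)
        # rel is a space-separated list; keep only the XFN keywords.
        rels = set((attrs.get("rel") or "").lower().split()) & XFN_VALUES
        if rels and attrs.get("href"):
            self.links.append((attrs["href"], sorted(rels)))


def extract_xfn(html):
    parser = XFNExtractor()
    parser.feed(html)
    return parser.links


# Hypothetical blogroll markup, for illustration only.
SAMPLE = """
<ul>
  <li><a href="http://example.org/alice" rel="friend met">Alice</a></li>
  <li><a href="http://example.org/bob" rel="colleague">Bob</a></li>
  <li><a href="http://example.org/about" rel="nofollow">About</a></li>
</ul>
"""

for href, rels in extract_xfn(SAMPLE):
    print(href, rels)
```

The extracted pairs are edges of a small social graph, which is the starting point for the kind of visualization the article goes on to describe.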

Approximating the Community Structure of the Long Tail

February 18th, 2008, by Akshay Java, posted in Social media, Web 2.0, Web, Machine Learning, Semantic Web

Social networks and Web graphs exhibit certain typical properties. The classic work by Barabási and Albert showed how nodes in such networks link preferentially: popular nodes often gain a disproportionately large share of the links. This is also known in other fields as the 80/20 rule or simply the “rich get richer” phenomenon. Another early work by Steve Borgatti studied social networks and found that they exhibit a core-periphery property: a small set of (popular) nodes forms the core and the rest make up the periphery. To the best of my knowledge, community detection algorithms have often worked independently of such underlying network properties.
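The “rich get richer” effect is easy to reproduce in a few lines. The following is a simplified sketch in the spirit of the Barabási–Albert model, assuming each new node adds exactly one link; the function name `preferential_attachment` and the parameter choices are illustrative, not from the original work.

```python
import random


def preferential_attachment(n_nodes, seed=0):
    """Grow a graph in which each new node links to one existing node,
    chosen with probability proportional to that node's current degree."""
    rng = random.Random(seed)
    # Start from a single edge between nodes 0 and 1.
    endpoints = [0, 1]  # each node appears once per incident edge
    degree = {0: 1, 1: 1}
    for new in range(2, n_nodes):
        # A uniform pick from the endpoint list is a degree-weighted pick.
        target = rng.choice(endpoints)
        endpoints.extend([new, target])
        degree[new] = 1
        degree[target] += 1
    return degree


degree = preferential_attachment(1000)
top = sorted(degree.values(), reverse=True)[:10]
print("degrees of the ten best-connected nodes:", top)
print("share of all link endpoints:", sum(top) / sum(degree.values()))
```

Running it shows a handful of early nodes holding far more links than a typical node, which is the kind of small, dense core the rest of this post relies on.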

I have been exploring an idea that can utilize the core-periphery structure of social networks to approximately compute the communities in the graph. The intuition behind this method is really quite simple. The basic idea boils down to the following:

“The core of the social network typically defines the communities present in it. By looking at the link structure of the core and identifying how the rest of the network connects to the core we can efficiently compute communities in large graphs.”

This idea can be easily explained by considering the following network of email communication (obtained from Dr. Mark Newman’s site). The original adjacency matrix was permuted to order the nodes by degree. The core is then represented by submatrix A, which is quite dense. Submatrix B corresponds to how the rest of the network links to its core. Submatrix C is a very sparse matrix that consists of links between nodes in the long tail. Since C is so sparse, it can be ignored without much degradation of the clustering/community detection results, which saves a significant amount of computation and storage. By using just the core of the social network (matrix A) and how other nodes link to the core (matrix B), we can approximate the community structure of the entire graph much more efficiently.

The rest boils down to the mathematical formulation of the above idea using spectral clustering techniques. You can read more about it in my poster paper that was recently accepted to ICWSM. (A Tech Report version with a more detailed analysis will be available shortly.)
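The partition into submatrices A, B, and C can be sketched without any linear algebra. The code below is only an illustration of the intuition, not the spectral formulation from the poster paper: it orders nodes by degree, measures how dense each block is, and then assigns long-tail nodes to communities by a simple majority vote over their links into the core. It assumes an undirected graph with no self-loops or duplicate edges; the toy edge list, the core labels, and the function names are all hypothetical.

```python
from collections import Counter


def degree_order(edges):
    """Return nodes sorted by decreasing degree (ties broken by node id)."""
    degree = Counter()
    for u, v in edges:
        degree[u] += 1
        degree[v] += 1
    return sorted(degree, key=lambda n: (-degree[n], n))


def block_densities(edges, core_size):
    """Density of A (core-core), B (core-tail), and C (tail-tail) links."""
    order = degree_order(edges)
    core = set(order[:core_size])
    n_core, n_tail = len(core), len(order) - len(core)
    counts = {"A": 0, "B": 0, "C": 0}
    for u, v in edges:
        ends_in_core = (u in core) + (v in core)
        counts[{2: "A", 1: "B", 0: "C"}[ends_in_core]] += 1
    possible = {
        "A": n_core * (n_core - 1) // 2,
        "B": n_core * n_tail,
        "C": n_tail * (n_tail - 1) // 2,
    }
    return {k: counts[k] / possible[k] if possible[k] else 0.0 for k in counts}


def assign_tail(edges, core_labels):
    """Give each tail node the most common label among its core neighbors.
    Tail-tail links (block C) are ignored; tail nodes with no link into
    the core are left unassigned."""
    votes = {}
    for u, v in edges:
        for tail, hub in ((u, v), (v, u)):
            if tail not in core_labels and hub in core_labels:
                votes.setdefault(tail, Counter())[core_labels[hub]] += 1
    return {node: c.most_common(1)[0][0] for node, c in votes.items()}


# Toy graph: a 3-node core, six tail nodes, and one tail-tail link.
EDGES = [(0, 1), (0, 2), (1, 2),
         (0, 3), (0, 4), (1, 5), (1, 6), (2, 7), (2, 8),
         (3, 4)]
# Hypothetical core communities, as if produced by clustering block A.
CORE_LABELS = {0: "x", 1: "x", 2: "y"}

print(block_densities(EDGES, core_size=3))
print(assign_tail(EDGES, CORE_LABELS))
```

On the toy graph, block A is fully connected while block C is nearly empty, so dropping C loses almost nothing, and every tail node still receives a label from its links in B.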

ICWSM early registration extended to 23:59 Monday 2/18

February 18th, 2008, by Tim Finin, posted in Social media, Web 2.0, Blogging, Web

The Second International Conference on Weblogs and Social Media (ICWSM 2008) will be held March 30 - April 2, 2008 at the Hilton in Seattle, Washington. The early registration deadline is Monday February 18. The program includes some great invited speakers: Bernardo Huberman (HP Labs), who will speak on “Social Dynamics in the Age of the Web,” David Sifry (Founder, Technorati, Sputnik, and Linuxcare), and Brad Fitzpatrick (Google, LiveJournal Founder). Two tutorials are planned, including “Subjectivity and Sentiment Analysis” by Jan Wiebe (Univ. of Pittsburgh) and “Graph Mining Techniques for Social Media Analysis” by Mary McGlohon and Christos Faloutsos (CMU). See the web site for details.
