UMBC ebiquity research group Building intelligent systems in open, heterogeneous, dynamic, distributed environments

Archive for the 'Blogging' Category

Twitterment, domain grabbing, and grad students who could have been rich!

July 8th, 2008, by Anupam Joshi, posted in AI, Blogging, Datamining, Social media, Twitter, Web 2.0, cloud computing

Here at Ebiquity, we’ve had a number of great grad students. One of them, Akshay Java, hacked out a search engine for twitter posts around early April last year, and named it twitterment. He blogged about it here first. He did it without the benefit of the XMPP updates, by parsing the public timeline. It got talked about in the blogosphere (including by Scoble), got some press, and there was an article in the MIT Tech Review that used his visualization of some of the twitter links. It even got talked about in Wired’s blog, something we found out only yesterday. We were also told that three days after the post in Wired’s blog, someone somewhere registered the domain twitterment.com (I won’t feed them pagerank by linking!), and set up a page that looks very similar to Akshay’s. It has Google AdSense, and of course just passes the query to Google with a site restriction to twitter. So they’re poaching coffee and cookie money from the students in our lab :-)

So of course we played with Akshay’s hack and hosted it on one of our university boxes for a few months, but didn’t really have the bandwidth or compute (or time) resources to keep up. Startups such as summize appeared later and provided similar functionality. For the last week or two we’ve been moving the code of twitterment to Amazon’s cloud to restart the service. Of course, today comes the news that twitter might buy summize, quasi confirmed by Om Malik. Lesson to you grad students: if you come up with something clever, file an invention disclosure with your university’s tech transfer folks. And don’t listen to your advisors if they think that there isn’t a paper in what you’ve hacked. There may yet be a few million dollars in it :-)

Spammers are using Amazon EC2

July 1st, 2008, by Tim Finin, posted in Social media, Web, splog

The Washington Post’s Security Fix blog has a post, Amazon: Hey Spammers, Get Off My Cloud!, reporting on allegations that spammers are starting to use Amazon’s Elastic Compute Cloud (EC2) servers. It only makes sense: you can sign up easily without committing to a contract of any length, the price is low, and the IP addresses are drawn from a wide range, making it hard to block them all. Besides, if Amazon’s EC2 IP addresses all get put in a spam blacklist, it will be bad for their many legitimate users. It may be tricky for Amazon to police this.

Blog comment spam magnet

July 1st, 2008, by Tim Finin, posted in Social media, splog

A good fraction of the comment spam that makes it through our Akismet filter is from people who are trying to add a comment to one of our posts about spam blogs or comments. Here’s an example from today’s batch: a comment from Cameroon on a two-year-old post, Blog comment spam with plagiarized text: hard to spot, trying to promote the site africapresse.com.

“spam is a real problem in this day not just for .edu but for the entire internet world. Plagiarism is a problem too.”

It’s easy for me to classify this as spam since the comment was made on a very old post, is short, includes a reference to a site that looks commercial, and makes a few general and superficial statements that are not really tied to any of the post’s details.
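Those four signals are easy enough to write down as a toy checklist. The sketch below is only an illustration, not how Akismet or our own filtering works: the function names, the thresholds, and the word-overlap test for "not tied to the post" are all assumptions, and "has a promoted URL" stands in for "links to a site that looks commercial".

```python
import re

# Illustrative thresholds; these are assumptions, not values from any real filter.
OLD_POST_DAYS = 365        # "very old post"
SHORT_COMMENT_WORDS = 30   # "is short"
MIN_SHARED_WORDS = 3       # fewer shared content words than this counts as "generic"


def content_words(text):
    """Lowercased words longer than four letters, as a crude content-word set."""
    return {w for w in re.findall(r"[a-z']+", text.lower()) if len(w) > 4}


def spam_signals(comment, promoted_url, post_text, post_age_days):
    """Return the four hand-picked signals as a dict of booleans."""
    shared = content_words(comment) & content_words(post_text)
    return {
        "old_post": post_age_days > OLD_POST_DAYS,
        "short": len(comment.split()) < SHORT_COMMENT_WORDS,
        # Simplification: any promoted URL counts as a commercial-looking site.
        "promotes_site": bool(promoted_url),
        "generic": len(shared) < MIN_SHARED_WORDS,
    }


def looks_like_spam(signals, needed=3):
    """Flag the comment when at least `needed` signals fire."""
    return sum(signals.values()) >= needed


# The example comment quoted above, on a post roughly two years old.
example = spam_signals(
    comment=("spam is a real problem in this day not just for .edu but for "
             "the entire internet world. Plagiarism is a problem too."),
    promoted_url="africapresse.com",
    post_text="Blog comment spam with plagiarized text: hard to spot",
    post_age_days=730,
)
print(example, looks_like_spam(example))
```

A real filter would weigh far more evidence than this, but the checklist mirrors the reasoning above: no single signal is damning, and it is the combination that gives the comment away.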

I think it’s ironic that so many SEO wannabes try to spam posts about spam. I guess they just have spam on the brain. So, I offer up this post as food for the comment spammers and their search and comment tools.

akismet, anti-spam, antispam, automated, automated, automatic, backlink, backlinks, bad behavior, blacklist, block, blocking, blog, blogging, capcha, comment, comment spam, comments, human, keywords, links, links, nofollow, pagerank, people, plagiarize, plagiarism, rank, search engine optimization, seo, spam, spam blogs, spam comments, spam karma, spamming, splog, splog, splogs, steal, target, trackbacks, traffic, typepad, wordpress.

Splogs and politics

July 1st, 2008, by Tim Finin, posted in Social media, Web, splog

Here’s something I never expected: splogs as a political issue. Actually, it’s allegations of political blogs being splogs, or rather allegations that political blogs are being accused of being splogs in order to get Google to block them. The NYT Bits blog has a post, Google and the Anti-Obama Bloggers, that describes the controversy.

“Did Google use its network of online services to silence critics of Barack Obama? That was the question buzzing on a corner of the blogosphere over the last few days, after several anti-Obama bloggers were unable to update their sites, which are hosted on Google’s Blogger service. … In an article that appeared on Bloggasm.com, the reporter Simon Owens spoke with some of the affected bloggers, who said they believed that Google had fallen prey to a campaign by activists supporting Senator Obama. According to the bloggers, the Obama supporters had clicked on a “flag” on the anti-Obama blogs alerting Google that they were spam.”

Maybe this is a good reason to rely on the judgment of machines, at least until they start running for office.

Feedburner to include AdSense ads starting next week

May 30th, 2008, by Tim Finin, posted in Blogging, Web, Web 2.0

A post on the Feedburner blog, Into the wild: AdSense for feeds, announced that Google will start integrating AdSense ads into feeds next week.

“… publishers already in the FeedBurner Ad Network will continue to see premium CPM ads directly sold onto their content, but with the added bonus of contextually targeted ads that will fill up the remainder of their inventory. … And with AdSense, you’ll know that your back-filled ads are using the strongest contextual ad engine, ensuring the most relevant and profitable ads are delivered to your subscribers. … For publishers who are not yet placing ads in their feeds, any publisher who meets the requirements to join the AdSense program will also be able to use AdSense for feeds. You will be able to manage your feed ad units directly from AdSense Setup tab, and track performance right on the AdSense Report tab. …”

Students: brand yourself with a blog

May 6th, 2008, by Tim Finin, posted in Blogging, Social media

ACM’s TechCareers site offers “career-related resources, news and job postings for IT and engineering professions”. They recommend that IT professionals, and those seeking to become IT professionals, try Branding Yourself With A Blog.

“… Certainly personal branding isn’t a new concept, but the future of personal branding could be in at your fingertips—with a blog. One of the first steps in creating a brand for yourself is to make your blog visible. Post meaningful entries, comment on your industry’s top blogs, or simply gain a regular readership. “Visibility creates opportunities,” says Schawbel, a social media specialist at EMC Corporation. He believes that when you brand yourself, the competition becomes irrelevant. “The goal of personal branding is to be recruited based on your brand, not applying for jobs,” Schawbel says. …”

This is especially good advice for students.

Environmental detection/protection.

April 7th, 2008, by joel, posted in Blogging, Ecoinformatics, GENERAL, Semantic Web, Social media, Web 2.0

EPA is on a web 2.0 kick. They sponsored a 2-day monster mashup exercise last fall, the Puget Sound Information Challenge, and are making plans for further efforts. EPA’s CIO Molly O’Neill talks a little about it here.

They’ve also been tracking and flirting with the semantic web, and are wondering how much effort to expend on a more full-on semantic engagement. I presented our semantic eco-blogging work at EPA headquarters in February, and was surprised at the turnout and enthusiasm. In response to a screen shot of a Fieldmarking post describing beach closings, a person from the Water Office related that he learned of the closing of his favorite Lake Erie swim-spot from a blog post. This made an impression on him, since, by rights, the closing should have been reported at the county level, up to the state level, and, ultimately, to his office in DC. It struck him that EPA should be systematically tapping the blogosphere for citizen sentiment and concern.

If they do this, they will, implicitly, be saying to the citizenry “If you can’t be bothered to fill out the right form in the right office, at least blog about it, and maybe the machinery of the blogosphere will direct your thoughts our way.” I kind of like that. (This particular example – finding information on beach closings in a given area – can probably be done fairly efficiently with Yahoo Pipes.)

EPA will be hosting this week’s meeting of the multilateral ecoinformatics cooperation, and there will be participation from a wide swathe of EPA – I’m curious to learn of their plans.

No spam on Twitter?!

February 25th, 2008, by Tim Finin, posted in Blogging, Social media, Twitter, Web, splog

Can it be true? Russell Beattie posts that on Twitter there are nearly a million users, and no spam or trolls. Spam does exist on Twitter, of course, but it does seem to be less of a problem than on the Blogosphere, Web or email. Maybe it’s because search engines don’t treat tweets like Web pages or blog posts.

Google slow to index blog posts?

February 24th, 2008, by Tim Finin, posted in Blogging, Google, Social media, Web

Last week I noticed that some of our blog posts took a long time to show up in the Google Blog search index. During the past year, Google has been very fast at indexing blog posts, typically taking less than five minutes from the time a post is made to when it shows up in their blog search index. But this week it seemed that our posts, or at least some of them, took more than twelve hours to be indexed.

Yesterday I tried to watch a post I made on the IT job market which I wrote just before 11:00am (GMT-5). It showed up in Google Feed Reader quickly enough but had not yet appeared in Google Blog Search when I finally went to bed 14 hours later. When I checked at 9:00am today, it was there, so it took some time between 14 and 22 hours.

It’s not the case that all posts are being delayed: do a Google Blog search for a popular term (e.g., TV) sorted by date and you’ll see posts made in the past few minutes. Nor do I think it’s related to PageRank, since their blog search ingest is based on pings rather than crawling. Besides, our blog enjoys a reasonable rank. Finally, it can’t be the case that Google’s systems are being overwhelmed by new blogs, because the growth of the Blogosphere has slowed.

So I’m puzzled about what is going on. (goomtitag)

Update 1: Posted at 9:49, in Google Feed Reader at 10:14, indexed by Google Blog Search by ~19:15 and in Google’s main index about the same time. Maybe this is a clue — it used to be the case that a post hit the blog index within a few minutes and showed up in the main index after about twelve hours. This post hit both indexes around the same time — after about ten hours. Maybe there is now just one (logical) index.

Update 2: Hmmm. Another post seems to have made it into Google’s main index before it got into the blog search index. I imagine that Google revisited our blog home page as part of its regular crawl and picked up the new post.

ICWSM early registration extended to 23:59 Monday 2/18

February 18th, 2008, by Tim Finin, posted in Blogging, Social media, Web, Web 2.0

The Second International Conference on Weblogs and Social Media (ICWSM 2008) will be held March 30 – April 2, 2008 at the Hilton in Seattle, Washington. The early registration deadline is Monday February 18. The program includes some great invited speakers: Bernardo Huberman (HP Labs), who will speak on “Social Dynamics in the Age of the Web,” David Sifry (Founder, Technorati, Sputnik, and Linuxcare), and Brad Fitzpatrick (Google, LiveJournal Founder). Two tutorials are planned, including “Subjectivity and Sentiment Analysis” by Jan Wiebe (Univ. of Pittsburgh) and “Graph Mining Techniques for Social Media Analysis” by Mary McGlohon and Christos Faloutsos (CMU). See the web site for details.

Anonymous vs Scientology: sometimes they shouted TL;DR

February 11th, 2008, by Tim Finin, posted in Blogging, Social media, Web

Yesterday had been declared by Anonymous as a day of protest against the Church of Scientology, and a number of street protests were held around the world. People are blogging accounts of some of them, including one from London, where many wore V masks.

“In London, around 200 masked demonstrators gathered outside the Church of Scientology for a peaceful protest near Blackfriars. One unnamed protestor said: “We are here to raise awareness of the blatant exploitation of its members. “They actually scare me.” Onlooker Mark Thompson, 22, added: “You could tell they felt passionate about their cause. “There was a heavy police presence but they were never really used.” (link)

My favorite quote from the blog account was this.

It was the perfect internet anarchist protest. We shouted slogans. People with ghetto blasters played announcements. We shouted at THEM. People with megaphones addressed the crowd. Sometimes we cheered and clapped, sometimes we shouted “TL;DR!” (link)

Google social graph API

February 2nd, 2008, by Tim Finin, posted in Blogging, Semantic Web, Social media, Web, Web 2.0

Late this week Google released the Google social graph API, which provides structured access to information Google has extracted from public FOAF and XFN data on the Web. Google also says it mines the web for “other publicly declared connections”. I wonder what that means? Brad Fitzpatrick gives a three minute explanation in this video. This is exciting and likely to give a push to any number of emerging themes, including data portability, linked data, and the Semantic Web in general. There’s lots of comment from the usual suspects and also on the SWIG IRC.

By the way, Brad Fitzpatrick will give an invited talk at the 2008 International Conference on Weblogs and Social Media at the end of March in Seattle.

Here’s a simple call to the API, starting with the ebiquity blog:

  http://socialgraph.apis.google.com/lookup?q=ebiquity.umbc.edu%2Fblogger%2F&fme=1&pretty=1

You can see from the results that they are returned using JSON. The possible parameters and what they mean are given here.
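If you’d rather build that lookup URL in code than by hand, here is a minimal Python sketch using only the standard library. The helper name build_lookup_url is made up for illustration. The endpoint and the q, fme and pretty parameters are copied from the URL above, and their meanings are left to the parameter documentation. The sketch stops at constructing the URL and does not fetch it, since it makes no assumption about whether the service is reachable.

```python
from urllib.parse import urlencode

# Endpoint copied from the example URL in this post.
LOOKUP_ENDPOINT = "http://socialgraph.apis.google.com/lookup"


def build_lookup_url(node, fme=1, pretty=1):
    """Hypothetical helper: build a lookup URL for one starting node.

    `node` is the site or page to start from; `fme` and `pretty` are passed
    through unchanged, with the same values used in the example URL.
    """
    params = {"q": node, "fme": fme, "pretty": pretty}
    # urlencode percent-escapes the slashes in the node, giving %2F as above.
    return LOOKUP_ENDPOINT + "?" + urlencode(params)


print(build_lookup_url("ebiquity.umbc.edu/blogger/"))
```

Run as is, this prints the same URL shown above, so you can swap in another site for the node and paste the result into a browser.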
