Archive for the 'Blogging' Category
December 16th, 2007, by Tim Finin, posted in Social media, Blogging, Semantic Web
A post on the physics arXiv blog points to an interesting open access article, Maximizing PageRank via Outlinks, on how to structure your own web pages to maximize the PageRank scores they receive. The paper does not consider tactics for getting sites to link to your pages, but instead looks at how you can organize the internal link structure of your site to maximize your PageRank.
Cristobald de Kerchove, Laure Ninove and Paul Van Dooren, Maximizing PageRank via Outlinks, submitted to Linear Algebra and its Applications, 19 November 2007, arXiv:0711.2867v1 [cs.IR].
We analyze linkage strategies for a set I of web pages for which the webmaster wants to maximize the sum of Google’s PageRank scores. The webmaster can only choose the hyperlinks starting from the web pages of I and has no control on the hyperlinks from other web pages. We provide an optimal linkage strategy under some reasonable assumptions.
What is being optimized is the sum of the PageRanks for the pages in your site.
The optimal structure for your site is roughly this: organize your site as a linear chain of pages, each linking to the next in the chain and also back to each of its chain ancestors. The final node in the chain should be the only one that links to any node outside of your site, and it should link to just one outside page.
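To get a feel for why the chain helps, here is a minimal sketch in Python. It is not the paper's method, just a plain power-iteration PageRank on a toy web. The damping factor of 0.85, the two outside pages `x0` and `x1`, and every function name are assumptions made for illustration. It compares the chain structure with a "leaky" variant in which every page also links to the outside page.

```python
def pagerank(links, damping=0.85, iterations=200):
    """Power-iteration PageRank; `links` maps each page to the pages it links to."""
    nodes = sorted(links)
    n = len(nodes)
    rank = {v: 1.0 / n for v in nodes}
    for _ in range(iterations):
        new = {v: (1.0 - damping) / n for v in nodes}
        for v in nodes:
            targets = links[v]
            if targets:
                share = damping * rank[v] / len(targets)
                for t in targets:
                    new[t] += share
            else:
                # Dangling page: spread its rank uniformly over all pages.
                for t in nodes:
                    new[t] += damping * rank[v] / n
        rank = new
    return rank


def chain_site(pages, exit_page):
    """Each page links to the next one and back to all earlier ones;
    only the last page links out, and only to `exit_page`."""
    links = {}
    for i, page in enumerate(pages):
        out = list(pages[:i])                 # back-links to chain ancestors
        if i + 1 < len(pages):
            out.append(pages[i + 1])          # forward link along the chain
        else:
            out.append(exit_page)             # the single outside link
        links[page] = out
    return links


def leaky_site(pages, exit_page):
    """Same chain, but every page also links to the outside page."""
    links = chain_site(pages, exit_page)
    return {p: out if exit_page in out else out + [exit_page]
            for p, out in links.items()}


# Hypothetical toy web: x0 links into the site, x1 links back to x0.
OUTSIDE = {"x0": ["p0", "x1"], "x1": ["x0"]}
PAGES = ["p0", "p1", "p2"]


def site_total(site_links):
    """Sum of PageRank over the site's own pages."""
    ranks = pagerank({**OUTSIDE, **site_links})
    return sum(ranks[p] for p in PAGES)


chain_total = site_total(chain_site(PAGES, "x0"))
leaky_total = site_total(leaky_site(PAGES, "x0"))
print(f"chain: {chain_total:.3f}  leaky: {leaky_total:.3f}")
```

In this toy setting the chain keeps noticeably more of the total PageRank inside the site than the leaky variant does. That matches the intuition behind the paper's result, though a three-page example proves nothing in general.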
November 1st, 2007, by Tim Finin, posted in Social media, Web 2.0, Blogging
The New York Times has incorporated Blogrunner into its Web site. Techcrunch characterizes Blogrunner as a Techmeme killer:
“Last night, the New York Times quietly launched Blogrunner on the technology section of its main site. Blogrunner was one of many Techmeme copycat sites, until the New York Times bought it last year. Like Techmeme, Blogrunner is a service that keeps track of the latest news and blog posts on a range of topics (Politics, Technology, Media, Business, Economy, Law, Health, Movies, Books, Religion, Iraq, Entertainment). Now those links are appearing on the New York Time’s main site, starting with the technology section, in a middle column titled “Technology Headlines from Around the Web.” (link)
Here’s the NYT Bits blog on Blogrunner:
“The biggest change is the feature in the middle column of the technology page titled “Technology Headlines From Around the Web.” It presents a constantly updated list of hot technology stories. Notice what we are not worried about. … Even more interesting to me is how this list gets generated. It is mainly created by an automated algorithm developed by Philippe Lourier, the developer of Blogrunner, a Web site The New York Times Co. bought last year. It has something in common with Digg, the site on which readers vote on what articles they find interesting. But for Blogrunner, votes are links from blogs or other Web sites. This approach, of course, is what powers the PageRank algorithm of Google, and Techmeme, an excellent technology news site. (link)
I wonder what is taught at J Schools about this these days.
October 28th, 2007, by Tim Finin, posted in Social media, Blogging, Web
This week USA Today had an article, ‘Gray Googlers’ strike gold, on older Americans operating websites that make money on Google ads.
“Jerry Alonzy figured he’d be working into his 70s at least. As an independent handyman at the mercy of weather patterns near Hartford, Conn., he’d always made a decent income that rarely grew. Then he found Google, and his life changed. Alonzy, 57, now makes $120,000 a year from the ads Google places on his Natural Handyman website, and he couldn’t be more thrilled. “I put in two, maybe three hours a day on the site, and the checks pour in,” he says. “What’s not to like?”
Of course, this is not a guaranteed get-rich-quick scheme. You need the right niche to attract well-paying ads, and you have to constantly write new quality content and build up your PageRank. Note that Alonzy spends about 20 hours a week tending his site, which is not an insignificant amount of time. The story cites some other examples that are probably more typical of what one can expect.
While the upside of working with AdSense sounds exhilarating, it’s not that way for everybody. Scott says she posted an unsold novel on Google and earns about $5 a month from the AdSense ads on the site. Al Needham, 74, who runs a site about the care of bees (bees-online.com) from his home near Boston, reaps about $250 a month. “Forget about getting rich overnight,” says Alonzy. “It takes time to learn.”
October 4th, 2007, by Tim Finin, posted in Ebiquity, Blogging, Security, Web
Sigh….
At the end of last week we had a catastrophic failure that resulted in our losing most of our posts. We had a security problem where someone had managed to compromise one of our blog accounts with administrative privileges. Some of the files were modified. We noticed it right away and decided to restore the site files and database from our nightly dump.
However … it turned out that when we did a major WordPress update back in February 2006, we created a new database but failed to update our backup script. So, for the past 19 months, it’s been creating a nightly backup of the old database. Restoring the old database not only resulted in losing 19 months’ worth of posts, but also left the database out of sync with the current WordPress version.
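In hindsight, a small sanity check could have caught the mismatch. The sketch below is only an illustration, not what we actually run. It assumes the live database name is the `DB_NAME` constant defined in WordPress's `wp-config.php`, and that the backup script's target database is available as a plain string. The function names and the sample database names `blog_new` and `blog_old` are made up.

```python
import re

# Matches lines such as: define( 'DB_NAME', 'blog_new' );
DB_NAME_PATTERN = re.compile(
    r"""define\(\s*['"]DB_NAME['"]\s*,\s*['"]([^'"]+)['"]\s*\)"""
)


def live_database_name(wp_config_text):
    """Return the database name that the WordPress config points at."""
    match = DB_NAME_PATTERN.search(wp_config_text)
    if match is None:
        raise ValueError("no DB_NAME definition found in config text")
    return match.group(1)


def backup_targets_live_db(wp_config_text, backup_db_name):
    """True if the nightly dump is taken from the database the site uses."""
    return live_database_name(wp_config_text) == backup_db_name


# Hypothetical config fragment and backup targets, for illustration only.
SAMPLE_CONFIG = "define( 'DB_NAME', 'blog_new' );\ndefine( 'DB_USER', 'wp' );\n"

print(backup_targets_live_db(SAMPLE_CONFIG, "blog_new"))
print(backup_targets_live_db(SAMPLE_CONFIG, "blog_old"))
```

Run from the backup script itself, a check like this would turn a silent 19-month drift into an immediate failure.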
One of our former students (thanks Filip!) wrote a script to recover the old posts from Google’s cache and reinsert them into the database. It was a tour de force demonstration of quick programming skill. There are still some problems that we’ll need to attend to: we’ve lost all of the new categories that we’ve added since 2/2006, the ‘related posts’ plugin is no longer working, I think the feed links aren’t all right, etc. But we recovered the posts.
We’ve tightened up our security but continue to see lots of malicious visitors knocking on the door and checking the locks.
It’s a jungle out there.
February 6th, 2006, by Tim Finin, posted in splog, memeta, Blogging
Technorati’s David Sifry has posted another State of the Blogosphere report with lots of interesting statistics. Highlights include
- Technorati tracks 50K posts an hour from 27M blogs.
- The number of blogs doubles every six months.
- Splogs and spings are increasing.
- Tagging is increasingly popular.
February 5th, 2006, by Tim Finin, posted in Blogging, Web
coComment is a free service to help keep track of comment-based conversations on the blogosphere. After registering, you add their bookmarklet to your browser. When making a comment on a blog using any of the most common platforms (e.g., WordPress, Blogger), you first click on the bookmarklet, and then submit your comment. The bookmarklet sends a copy of your comment to coComment, which adds it to their database, along with the context. The result is that you can visit their page and see the comments you’ve made and can also add some code to your own blog(s) to show recent comments. Here’s what it should look like:
One thing that’s missing, IMHO, is the ability to register your comments with several IDs. I’d like to have my personal ID, but also define it as part of a group ebiquity ID. We could put code to link the ebiquity group ID comments on our ebiquity group blog.
Btw — to sign up you need an invitation code. To get an invitation code, just enter your email address to be notified when one is available. You may get it almost immediately in email, like I did.
February 4th, 2006, by Tim Finin, posted in splog, Swoogle, Blogging, Web, Semantic Web
We are using bbclone to generate reports on Swoogle access. Look at today’s top 10 referers as of 3:00pm:
Rank  Referer                 Hits  Share
1     www.legaladvocate.net   246   26.14%
2     www.myjavaserver.com    152   16.15%
3     www.google.com          125   13.28%
4     dannyayers.com           44    4.68%
5     lucky7.to                34    3.61%
6     ebiquity.umbc.edu        25    2.66%
7     www.google.de            18    1.91%
8     planetrdf.com            18    1.91%
9     mail.google.com          18    1.91%
10    groups.google.com        14    1.49%
The first and fifth are clearly spam sites and the second is suspicious, too. The first, for example, appears to be about poker, though the site name is legaladvocate. The site’s text is obviously automatically generated nonsense. All of the links point to subpages in the same domain with a similar structure and content. I assume that once the site achieves a high PageRank, it will be repurposed or sold.
So, it seems like nearly 50% of our hits are due to referer log spamming. I’d guess Swoogle was picked by finding its URL on recent posts found on a blog search engine or a ping server.
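The "nearly 50%" figure comes from adding up the three suspect referers. A quick sketch of the arithmetic in Python, where the total of 941 hits is an assumption inferred from the table (246 hits being 26.14% implies roughly 941 hits overall) and the names are made up for illustration:

```python
# Top referers and hit counts as listed in the bbclone report above.
REFERERS = [
    ("www.legaladvocate.net", 246),
    ("www.myjavaserver.com", 152),
    ("www.google.com", 125),
    ("dannyayers.com", 44),
    ("lucky7.to", 34),
    ("ebiquity.umbc.edu", 25),
    ("www.google.de", 18),
    ("planetrdf.com", 18),
    ("mail.google.com", 18),
    ("groups.google.com", 14),
]

# The two clear spam sites plus the suspicious one.
SUSPECTS = {"www.legaladvocate.net", "www.myjavaserver.com", "lucky7.to"}

# Assumption: total referred hits, inferred from 246 hits == 26.14%.
TOTAL_HITS = 941


def spam_share(referers, suspects, total_hits):
    """Percentage of all referred hits that come from suspect referers."""
    spam_hits = sum(hits for host, hits in referers if host in suspects)
    return 100.0 * spam_hits / total_hits


share = spam_share(REFERERS, SUSPECTS, TOTAL_HITS)
print(f"suspect referers account for {share:.1f}% of hits")
```

That works out to a bit under 46%, counting only the referers that made the top ten.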
January 25th, 2006, by Tim Finin, posted in Ebiquity, Blogging, Web, Semantic Web, GENERAL
Google Scholar, it’s a good thing, as Martha Stewart would say.
We recently added a feature to our ebiquity paper repository that ties papers to their Google Scholar entries. The main motivation was to allow us to track citations.
As I’ve worked through our papers to verify and add their Google Scholar keys, other benefits are becoming apparent. In several cases I’ve discovered errors or omissions in our own metadata. Sometimes our own entries have had the title wrong! In other cases, I’ve found several Google Scholar entries for the same paper. Sometimes this is due to an error by the author of a citing paper, which can propagate.
I suspect that some of the errors originate with us. Here’s one scenario. When a paper is accepted for publication, the author is happy and excited and adds an entry in our database, along with a softcopy of the draft. People download and read the draft and, if it’s good, start citing it. Months later the ultimate copy, which may have a different title and even a different author list, is finalized. Ideally, our site is edited to reflect the final metadata and final softcopy. But sometimes this doesn’t happen, or the final softcopy is not uploaded for copyright reasons. In any case, the old and possibly incorrect metadata and draft may have escaped to roam the Internet.
Lately I’ve started to add a header to draft copies of papers posted to our site that states that they are drafts and also where the final version will appear. I’ve found Acrobat’s ability to add a header to an existing PDF file to be very handy for this. I’ve also used Acrobat to extract the first page of an article for which we don’t hold the copyright, add a header pointing to its source, and post that on our site (as in this example).
Finally, one of the ideas that underlies the current Semantic Web vision is that it’s very useful for things on the web to have good identifiers. The Uniform Resource Identifier (URI) is the Semantic Web’s favorite identifier, but we all recognize that just using URIs is too simple for many objects (e.g., people). OWL’s contribution to this is the notion of an inverse functional property. If my ontology defines SSN as an inverse functional property, then two objects that share the same SSN must be the same. So, along these lines, the googleScholarKey property should be inverse-functional and have domain=publication and range=string.
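To make the inverse functional idea concrete, here is a minimal sketch in plain Python (no RDF library) of what a reasoner would effectively conclude: records sharing a googleScholarKey describe the same publication, so their metadata can be pooled and any disagreement, such as a draft title versus a final title, becomes visible. The record layout, the example.org URIs, the key values and the function name are all hypothetical.

```python
from collections import defaultdict


def merge_by_inverse_functional(records, key="googleScholarKey"):
    """Group records that share a value for an inverse functional property.

    Returns (merged, unkeyed): `merged` maps each key value to a dict of
    field -> set of values seen; `unkeyed` holds records without the key.
    """
    merged = defaultdict(dict)
    unkeyed = []
    for record in records:
        value = record.get(key)
        if value is None:
            unkeyed.append(dict(record))
            continue
        for field, field_value in record.items():
            merged[value].setdefault(field, set()).add(field_value)
    return dict(merged), unkeyed


# Hypothetical entries: the first two are the same paper under two titles.
records = [
    {"uri": "http://example.org/paper/1", "title": "Draft Title",
     "googleScholarKey": "key-A"},
    {"uri": "http://example.org/paper/2", "title": "Final Title",
     "googleScholarKey": "key-A"},
    {"uri": "http://example.org/paper/3", "title": "Another Paper",
     "googleScholarKey": "key-B"},
]

merged, unkeyed = merge_by_inverse_functional(records)

# Any field with more than one value signals conflicting metadata.
for scholar_key, fields in sorted(merged.items()):
    if len(fields["title"]) > 1:
        print(scholar_key, "has conflicting titles:", sorted(fields["title"]))
```

A real OWL reasoner would infer that the two URIs denote the same individual rather than build a dictionary, but the practical payoff is the same: one publication, with its metadata conflicts flagged.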
January 24th, 2006, by Tim Finin, posted in splog, Blogging, Web, GENERAL
Two years ago Bill Gates predicted that the spam problem would be solved by now, as this article in The Register reports.
Hey Bill, why am I still getting spam?
Junk mail outlives MS mortality prediction
By John Leyden, 24 January 2006
Two years ago today Bill Gates predicted that spam email would be eradicated as a problem within 24 months. The Microsoft chairman predicted the death of spam in a speech at the World Economic Forum on 24 February 2004.
Gates outlined a three-stage plan to eradicate spam within two years. Microsoft’s scheme calls for better filters to weed out spam messages and sender authentication via a form of challenge-response system. Secondly, Microsoft wants to see to a form of tar-pitting so that emails coming from unknown senders are slowed down to a point where bulk mail runs become impractical.
Lastly, and most promisingly as far as Gates is concerned, is a digital equivalent of stamps for email, to be paid out only if the recipient considers an email to be spam. Blocking spam email would appear to be a simple problem but in practice is far trickier than Gates, or indeed the industry, first thought.
…
It’s tempting to think that we are close to being able to solve the splog identification problem, which would enable blog search engines to weed the splogs out of their indices. But I’ll bet that splogs will be with us for a long time, as is the case with spam. Of course, we do have to work hard to keep them under control, just as we do with spam. If we don’t, the blogosphere will be quickly overrun and its promise squandered.
January 17th, 2006, by Tim Finin, posted in splog, Ebiquity, Blogging, Web, GENERAL
Baltimore Sun’s Troy McCullough talks about Pranam Kolari’s work on detecting splogs in his column on Sunday, 15 January 2006. The column also has an associated podcast.
Fighting spam sites - latest battle in the blog wars
On Blogs: Troy McCullough, Jan 15, 2006
It seems that everyone has a blog these days - a spot that others can visit to find out what they have to say about something or nothing in particular. Some blogs are widely valued fonts of specialized wisdom, but many are viewed as uninteresting expressions of personal ego. The difficulty of sorting the good blogs from the bad can be a frustrating challenge - one that is seen as a serious threat to what has been viewed as a vital feature of the Internet.
Now, three University of Maryland, Baltimore County researchers have made a far more disturbing conclusion about blogs. After analyzing millions of blog posts, they have determined that the blogosphere is drowning in spam, the pejorative nickname given to unsolicited Internet advertising. Using data collected by weblogs.com, a prominent blog tracking service, doctoral student Pranam Kolari and professors Tim Finin and Anupam Joshi analyzed 40 million blog updates submitted from 14 million blogs.
…
January 13th, 2006, by Pranam Kolari, posted in Blogging, Technology Impact, Technology, Web
Ping-O-Matic, a great tool and arguably the most popular update ping service, is currently down. Matt blogs about a complete revamp. Apparently their current system was accepting pings on just one box! Technorati is helping them out.
Most of us don’t even bother to check which update ping services our blog software notifies automatically. Now, is this a good enough motivation to notify additional update ping services? If yes, who is set to gain? Given the recent valuation of weblogs.com, a short downtime of Ping-O-Matic might well create another multi-million dollar asset.
Related:
Attention Wordpress users!!! from Nick Starr, Ping-o-Matic is offline from Jeff Smith, Pingomatic is gone from Alan Fraser.
January 10th, 2006, by Tim Finin, posted in Blogging, Web, GENERAL
The infrastructure to set up online communities is not all that complicated — the member base is the real asset. I’m not sure that MSM companies will know how to manage them, as the following article suggests.
Get out of MySpace, bloggers rage at Murdoch
Nicholas Wapshott, The Independent, 08 Jan 2006
Angry members of MySpace, the personal file-sharing website for young adults, are accusing Rupert Murdoch’s News Corporation of censoring their postings and blocking their access to rival sites. The 38 million subscribers to MySpace, which News Corp bought for $629m (£355m) last July, discovered that when they wrote to each other about rival video-swapping site YouTube, the words were automatically deleted, and attempts to download video images from YouTube led to blank screens.
…
The protests gathered pace, and when 600 MySpace customers complained and a campaign began to boycott the site and relocate to rival sites such as Friendster, Linkedin, revver.com and Facebook.com, News Corp relented and restored the links. However, MySpace managers promptly shut down the blog forum on which members had complained about the interference. An online notice said the problem was the result of “a simple misunderstanding”.