Archive for September, 2006
If Techmeme was special, Gabe’s approach to monetizing goes one step further. He announced their new business model today, impressing the blogosphere and opening up a new advertising-based engagement (reach-out) model for businesses that use blogs. This reminds me of something Amazon came up with this year, which they call Plogs, where select Amazon editors/sellers engage customers in a marketing loop. Perhaps Gabe’s work with Techmeme will bring Plogs and many other related efforts into real-world use.
Far-reaching implications, I am sure. Smart!
Looking forward, I wonder what plans Gabe has for memeorandum. The audience catered to by memeorandum is just perfect for many political campaigns.
On Del.icio.us I noticed an article from the Economist on the World’s hardest simple sliding-block puzzle. As it happens, I’m teaching our AI class this semester and we’re right in the middle of discussing problem solving as search, so it’s very apropos.
The puzzle was invented by Jim Lewis who set out to develop a very hard sliding block puzzle. Using a program (Computer Assisted Puzzle Analyzer) written in Haskell, Lewis generated and evaluated the search spaces of puzzles with different sized blocks (1×1, 1×2, 2×2) in frames of various sizes where the goal is moving the largest block from one corner to another.
He found that puzzles with 3×4 or 4×4 frames turned out to be too easy, those bigger than 4×5 too hard and puzzles with blocks with a dimension larger than two tended to suffer from gridlock.
From the Economist story:
For each candidate puzzle, Mr Lewis’s program calculated all the possible moves from the initial configuration, and then all possible moves from the resulting positions, and so on. In this way, it constructed a “solution tree” for each puzzle. The taller the tree, the more moves are required to get to the solution. The broader the tree, the more dead ends there are along the way.
Mr Lewis looked at the output of the program, and one puzzle leapt out: its solution tree was both tall and broad. At first, he found the puzzle insoluble, and assumed there was a bug in his program. But the program did work correctly, and the puzzle is indeed soluble.
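The search described in the Economist excerpt can be sketched as a breadth-first search over block placements. This is only an illustration in Python, not Lewis's Haskell program: the function name `solve`, the `(row, col, height, width)` block encoding, and the choice to count each one-cell slide as a move are all assumptions made for the sketch, and no actual Quzzle layout is encoded here.

```python
from collections import deque

def solve(rows, cols, blocks, goal_index, goal_pos):
    """Breadth-first search over block placements (illustrative sketch).

    blocks: tuple of (row, col, height, width) rectangles.
    Returns (one-cell slides needed or None, number of states discovered).
    """
    def cells(block):
        r, c, h, w = block
        return {(r + i, c + j) for i in range(h) for j in range(w)}

    def neighbours(state):
        for i, (r, c, h, w) in enumerate(state):
            # Cells occupied by every block except the one being moved.
            others = set()
            for j, b in enumerate(state):
                if j != i:
                    others |= cells(b)
            for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                moved = (r + dr, c + dc, h, w)
                inside = (0 <= moved[0] and moved[0] + h <= rows
                          and 0 <= moved[1] and moved[1] + w <= cols)
                if inside and not (cells(moved) & others):
                    yield state[:i] + (moved,) + state[i + 1:]

    start = tuple(blocks)
    depth = {start: 0}          # state -> fewest slides needed to reach it
    queue = deque([start])
    while queue:
        state = queue.popleft()
        if state[goal_index][:2] == goal_pos:
            return depth[state], len(depth)
        for nxt in neighbours(state):
            if nxt not in depth:
                depth[nxt] = depth[state] + 1
                queue.append(nxt)
    return None, len(depth)
```

The first returned value corresponds to the height of the tree at the solution and the second gives a rough sense of its breadth. Note that this sketch treats same-sized blocks as distinct, so it counts more states than a search that considers them interchangeable.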
You won’t find Quzzle in your local Wal-Mart just yet, but you can buy it online from Jim Lewis’s company Quirkle, which he describes as “dedicated to taking classic toys and gadgets and re-engineering them to the extreme”. You can also try this Quzzle Java Applet developed by Nick Baxter. P. Aylett and C. Prabhakar have also produced a visualization of the Quzzle search space.
Shorter Pew report on the future of the Internet: kind of like today, only more so.
The Pew Internet and American Life Project released a 115-page report with findings from an online sample of nearly 750 Internet “stakeholders” surveyed between December 2005 and March 2006. The survey asked participants whether they agreed or disagreed with seven scenarios about the future and gave them the opportunity to elaborate on their answers.
“A survey of technology thinkers and stakeholders shows they believe the internet will continue to spread in a “flattening” and improving world. There are many, though, who think major problems will accompany technology advances by 2020.”
I was disappointed to see virtually no mention of the Semantic Web, other ‘semantic’ technologies or web services. There was also little about social networking and not much on user-generated content. I suspect this is due to the seven scenarios that participants were asked to respond to.
Maybe the shorter version should be: kind of like 2001, only more so.
Voting is a basic mechanism for group decision making and the foundation of modern societies. It has also proved surprisingly difficult to do well in practice. UMBC is organizing a Collegiate Voting Systems Competition to engage students in nationally important, state-of-the-art security and privacy research projects and course work. Student teams will design and implement a complete voting system that must have been used in some election, such as one for a student government or organization, by May 2007. Papers describing and analyzing the system are then submitted for the conference and used to select candidates for the final competition. The conference, to be held in Portland in July 2007, will include demonstrations, mock elections, submitted presentations and invited talks. A panel of judges will make awards for the best overall system, best presentation, best attack, and best paper on voting system metrics. VoComp 2007 will be run by Professor Alan Sherman with the generous support of the NSF Cyber Trust program. More information on the competition, its rules, and an example system is available at the VoComp site.
CiteULike is a free service to help academics to share, store, and organise the academic papers they are reading. When you see a paper on the web that interests you, you can click one button and have it added to your personal library. CiteULike automatically extracts the citation details, so there’s no need to type them in yourself.
Imagine a television broadcaster generating advertisement revenues off stolen programs. That’s what Bitacle is getting at, at the scale of the entire Blogosphere. This is just not acceptable.
Well we all love how search engines, aggregators and blog readers organize Web content, eventually directing users to the original source. Bitacle however creates a black hole around copied user content — once you are in, you are in. My concern (and the general debate on the blogosphere) is on their “aggregator” facility, which pulls together user posts and hosts advertisements. To make matters worse they also host new comment threads (gosh!), and this is ours, btw.
It appears that the debate started with Ivan’s post “Are Bitacle blog thieves too?”, as early as March 2006. Interestingly, a Bitacle employee offers explanations in the comments and compares the company with Google and Yahoo, for god’s sake!
The reason it’s that we don’t be only a blog search engine we are a “archive blog search engine” that it’s different concept.
One question: why you don’t ban Goolge, Yahoo or MSN? That search engines cache all your pages.
Bloggers are outraged; the titles speak for themselves:
All your blogs are belong to Bitacle
Bitacle: thieves now open for business in the 8th circle of hell (a good overview of the issue)
Why is bitacle stealing all our blogs??
BITACLE DEBACLE CONTINUES — BLOGGERS OUTRAGED — NO NEWS COVERAGE BY OLD MEDIA?
Bitacle is Heisting My Content
As I write “Bitacle” is the 7th most searched keyword on Technorati today. Bitacle, totally unethical and unprofessional.
UPDATE: I notice their sitemap lists all plagiarized blogs.
We recently upgraded our site’s simple keyword system to be a more fashionable tagging system. We’ve always had a feature by which key words and phrases could be associated with most of the objects in our site’s database, e.g., papers, projects, software, events, etc., but we never did much with it and suspected that visitors rarely used them to search for things of interest.
Several weeks ago, Filip revised it to be more of a tagging system. We have a site-wide tag cloud and can generate pages listing all of the items with a particular set of tags, like splog or pervasive computing and semantic web. These tag related pages have links for associated bookmarks (e.g., run the query again) and RSS feeds (e.g., run the query and deliver results as an RSS feed).
What we have is a group tagging system. That is, objects have tags and any ebiquity member can modify the set of tags associated with any object. One novel feature is a simple tag rewriting system intended to keep our tagging somewhat consistent. Lab members can add rules that will automatically rewrite tags, e.g., replacing BLOGS with BLOG.
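A rewrite step like this can be sketched in a few lines of Python. This is not the code running on our site: the function name `normalize_tags`, the dictionary representation of the rules, and the choice to compare tags case-insensitively in upper case are assumptions made for illustration.

```python
def normalize_tags(tags, rules):
    """Apply rewrite rules to a list of tags, dropping duplicates in order.

    rules maps an upper-case tag to its preferred replacement (hypothetical
    representation). Chains of rules are followed, and a rule cycle simply
    stops the rewriting instead of looping forever.
    """
    result = []
    for tag in tags:
        key = tag.strip().upper()
        visited = set()
        while key in rules and key not in visited:
            visited.add(key)
            key = rules[key]
        if key not in result:
            result.append(key)
    return result


rules = {"BLOGS": "BLOG", "WEBLOG": "BLOGS"}
print(normalize_tags(["blogs", "Weblog", "splog"], rules))
```

Following chains of rules means that a rule added later (WEBLOG to BLOGS) still ends up at the preferred form (BLOG) without anyone having to revisit the earlier rule.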
One problem is that this new tagging system is completely separate from the category-based tagging system used on the ebiquity blog. At some point in the near future, we will probably add a linking table, so that the two will be connected somehow.
Bloginfluence.net computes a score intended to measure the influence of a blog based on the number of blogs, posts and web pages that link to it, the number of Bloglines subscribers it has, and its Google PageRank. Specifically, the formula is given as:
The ebiquity blog’s influence number is 17276.8 as I write this. What does this mean? Clearly bigger is better, but how big is big? Roland Piquepaille has some interesting observations on his ‘Blogs for companies’. To better understand what the score really means, I’d like to see a graph of the distribution of scores for “feeds that matter”. We’ve used the most popular feeds mined from Bloglines subscribers with public profiles to do various kinds of analyses.
We’ve also played with other influence models for the blogosphere, with some of our preliminary work described in
Modeling the Spread of Influence on the Blogosphere, Akshay Java, Pranam Kolari, Tim Finin, and Tim Oates, UMBC TR-CS-06-03, March 2006.
One thing the Bloginfluence formula does not try to capture is that links from other blogs and their posts should be weighted by the importance of the blogs they come from. Using Google PageRank in the formula does this to some degree. Computing a more sophisticated score would, of course, require having a global model of the Blogosphere, which is costly to acquire and maintain. Maybe just using Google PageRank is a good approximation. In our work on Blogosphere influence we found that it was.
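To make the idea of weighting links by the importance of their source concrete, here is a minimal sketch of the standard PageRank power iteration on a toy link graph. It is neither the Bloginfluence formula nor the model from our technical report; the function name `link_rank`, the blog names, and the damping value of 0.85 are illustrative assumptions.

```python
def link_rank(links, damping=0.85, iterations=50):
    """PageRank-style scores for a dict mapping each blog to the blogs it links to."""
    nodes = set(links) | {t for targets in links.values() for t in targets}
    rank = {n: 1.0 / len(nodes) for n in nodes}
    for _ in range(iterations):
        new = {n: (1.0 - damping) / len(nodes) for n in nodes}
        for src in nodes:
            # A blog with no outgoing links spreads its score evenly over all blogs.
            targets = links.get(src) or nodes
            share = damping * rank[src] / len(targets)
            for t in targets:
                new[t] += share
        rank = new
    return rank


# Hypothetical graph: "a" and "b" each have exactly one inbound link,
# but "a" is linked from the widely cited "hub" and "b" from a minor blog.
links = {"fan1": ["hub", "b"], "fan2": ["hub"], "fan3": ["hub"], "hub": ["a"]}
scores = link_rank(links)
print(sorted(scores, key=scores.get, reverse=True))
```

A plain count of inbound links would score "a" and "b" the same, while the iterated score ranks "a" higher because its single link comes from a more important source.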
(spotted on SmartMobs)
BusinessWeek has a long and generally good cover story on click fraud.
Martin Fleischmann put his faith in online advertising. … Now, Fleischmann’s faith has been shaken. Over the past three years, he has noticed a growing number of puzzling clicks coming from such places as Botswana, Mongolia, and Syria. … Fleischmann is a victim of click fraud: a dizzying collection of scams and deceptions that inflate advertising bills for thousands of companies of all sizes. The spreading scourge poses the single biggest threat to the Internet’s advertising gold mine and is the most nettlesome question facing Google and Yahoo, whose digital empires depend on all that gold.
A part of this problem is due to splogs. The article notes that
The trouble arises when the Internet giants boost their profits by recycling ads to millions of other sites, ranging from the familiar, such as cnn.com, to dummy Web addresses like insurance1472.com, which display lists of ads and little if anything else.
One of the easiest ways to set up sites with ads that your “paid to read” gang clicks on is to establish a nest of splogs and automatically populate them with plagiarized content from other blogs. Companies like Google and Yahoo can benefit from better automatic splog detection. It might be possible to test this hypothesis by analyzing the frequency of splogs as a source of clicks for an advertiser. If anyone would like to share their data, we might be able to do such an analysis. Contact us if you are interested.
Monitor110 is a NYC startup planning to launch a system early next year that will allow institutional investors to monitor “chatter” about the market on the Web. A differentiator is offering information that is available before the “historic point of investor visibility”. The graphic on their splash page lists the sources, from earliest on, as “joe bloggers”, employees’ personal blogs, expert blogs, journalists’ personal blogs, drug trial participants’ discussion boards, special interest sites, corporate sites, “top blogs” and regulatory sites. Do you see a trend in this list?
There is not much in-depth information on the site, but they do mention using many relevant technologies, including data mining, text classification, machine learning, natural language processing, sentiment detection and reputation modeling. Of course, it’s very easy to talk about all of these advanced technologies and moderately easy to use most of them in trivial ways. It will be a challenge to find the right way to use these to make a real difference. But mining the Blogosphere for intelligence is a hot topic, and doing it for investments seems like a very natural domain. It might just work.
But what about spam and other attempts to game the system? Here’s where their approach to reputation will be key. False stock tips are a common spam genre that is known to work. The success of systems like Monitor110 might lead to their own demise as spammers quickly adapt to them, creating and populating blogs that provide false signals and information. Detecting these bad sources, and doing it quickly, will be difficult.
Monitor110 announced a $5M Series B round of financing earlier this month. See this Financial Times article for more information.
Ok, maybe it’s not such a good example of pervasive computing, but you must agree it’s a great practical application. CNET news reports that the Intermission bar at London’s University of Westminster student union has six touch-screen tables that let patrons order drinks from a menu on the screen. This speeds up service and encourages drinkers to try new beverages, going beyond the usual pint of beer. Students can also send messages to other tables and electronically buy someone a drink. The system also includes games, and the ability to call for a cab will soon be added.
I wonder how the wait staff feel about it. It reduces work, but maybe also tips.