UMBC ebiquity research group Building intelligent systems in open, heterogeneous, dynamic, distributed environments

Archive for the 'Web' Category

NOSQL: distributed key-value data stores

July 2nd, 2009, by Tim Finin, posted in Database, Semantic Web, Web

ComputerWorld has an article on the “nosql” movement and a recent nosql meetup held in San Francisco, No to SQL? Anti-database movement gains steam. Nosql systems are distributed, non-relational data stores that typically use a simple key-value approach to indexing and retrieving data and use a simple procedural query API rather than a sophisticated declarative query language.
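To make the contrast with a declarative query language concrete, here is a minimal sketch of the kind of procedural get/put interface such stores expose. It is not the API of any system mentioned in this post: the class name `ShardedKeyValueStore`, its methods, and the simple hash-modulo placement of keys on in-memory "nodes" are all assumptions made for illustration.

```python
import hashlib
from typing import Any, Dict, List, Optional


class ShardedKeyValueStore:
    """Toy key-value store that spreads keys over several in-memory 'nodes'.

    Hypothetical illustration only: real systems add replication,
    persistence, and smarter placement such as consistent hashing.
    """

    def __init__(self, num_nodes: int = 3) -> None:
        if num_nodes < 1:
            raise ValueError("num_nodes must be at least 1")
        self._nodes: List[Dict[str, Any]] = [{} for _ in range(num_nodes)]

    def _node_for(self, key: str) -> Dict[str, Any]:
        # Use a stable hash; the built-in hash() is salted per process.
        digest = hashlib.sha256(key.encode("utf-8")).hexdigest()
        return self._nodes[int(digest, 16) % len(self._nodes)]

    def put(self, key: str, value: Any) -> None:
        """Store value under key, overwriting any earlier value."""
        self._node_for(key)[key] = value

    def get(self, key: str, default: Optional[Any] = None) -> Any:
        """Return the value stored under key, or default if absent."""
        return self._node_for(key).get(key, default)

    def delete(self, key: str) -> bool:
        """Remove key; return True if it was present."""
        node = self._node_for(key)
        if key in node:
            del node[key]
            return True
        return False


store = ShardedKeyValueStore(num_nodes=4)
store.put("user:42", {"name": "Ada"})
print(store.get("user:42"))
```

Every lookup goes through a key. There is no way to ask for "all users named Ada" without scanning or maintaining a second index by hand, which is the trade-off against a declarative query language.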

“The inaugural get-together of the burgeoning NoSQL community crammed 150 attendees into a meeting room at CBS Interactive. Like the Patriots, who rebelled against Britain’s heavy taxes, NoSQLers came to share how they had overthrown the tyranny of slow, expensive relational databases in favor of more efficient and cheaper ways of managing data.

“Relational databases give you too much. They force you to twist your object data to fit a RDBMS [relational database management system],” said Jon Travis, principal engineer at Java toolmaker SpringSource, one of the 10 presenters at the NoSQL confab (PDF). NoSQL-based alternatives “just give you what you need,” Travis said.”

There were presentations on nine different ‘nosql’ databases: Voldemort, Cassandra, Dynomite, HBase, Hypertable, CouchDB, VPork and MongoDB, as well as general presentations by Google’s Jonas Karlsson and Cloudera’s Todd Lipcon.

Johan Oskarsson of Last.fm wrote a debriefing post on his blog.

“The relatively young but rapidly growing “nosql” community met last Thursday in San Francisco. The idea was to give attendees a solid introduction to how distributed, non relational databases work as well as an overview of the various projects out there.”

and provides links to the presentation slides and videos. You can also search for NOSQL on Vimeo to get the videos.

I learned of this meeting on Hacker News, where you can find some interesting comments.

Of course there are many popular key-value stores that are not designed to support the highly-scalable distributed needs of many Web applications. I found, for example, that as a persistent RDF store for rdflib, Sleepycat outperformed MySQL.

Changes in FaceBook default privacy policy

July 1st, 2009, by Tim Finin, posted in Privacy, Security, Social, Social media, Web

FaceBook is changing how it manages privacy starting today. After reading last week’s post on the FaceBook blog, More Ways to Share in the Publisher, and a followup note on ReadWriteWeb, A Closer Look at Facebook’s New Privacy Options, I thought I understood: Facebook was sharing more but only for people who have made their profiles public. From the official FaceBook post:

“We’ve received some questions in the comments about default privacy settings for this beta. Nothing has changed with your default privacy settings. The beta is only open to people who already chose to set their profile and status privacy to “Everyone.” For those people, the default for sharing from the Publisher will be the same. If you have your default privacy set to anything else—such as “Friends and Networks” or “Friends Only”—you are not part of this beta.”

But the New York Times has an article, The Day Facebook Changed: Messages to Become Public by Default, that clearly says more is coming (emphasis added):

“By default, all your messages on Facebook will soon be naked visible to the world. The company is starting by rolling out the feature to people who had already set their profiles as public, but it will come to everyone soon. You’ll be able each time you publish a message to change that message’s privacy setting and from that drop down there’s a link to change your default setting.

But most people will not change the setting. Facebook messages are about to be publicly visible. A whole lot of people are going to hate it. When ex-lovers, bosses, moms, stalkers, cops, creeps and others find out what people have been posting on Facebook - the reprimand that “well, you could have changed your default setting” is not going to sit well with people.”

But it will come to everyone soon! That’s a big change if true. There will be blood.

I hope that there is some clarification soon from FaceBook. I, for one, am left confused.

CFP: JWS special issue on Semantic Web and Social Media

June 27th, 2009, by Tim Finin, posted in Blogging, Semantic Web, Social media, Wikipedia
Important dates

  • abstracts: 21 Sept 2009
  • submissions: 01 Oct 2009
  • notification: 15 Dec 2009
  • final copy: 15 Jan 2010
  • publication: April 2010

The Journal of Web Semantics will publish a special issue on Data Mining and Social Network Analysis for integrating Semantic Web and Web 2.0 in the spring of 2010. The special issue will be edited by Bettina Berendt, Andreas Hotho and Gerd Stumme and initial abstracts for papers must be submitted via the Elsevier EES system by September 21, 2009.

The special issue invites contributions that show how synergies between Semantic Web and Web 2.0 techniques can be successfully used. Since both communities work on network-like data structures, analysis methods from different fields of research could form a link between those communities. Techniques can be - but are not limited to - social network analysis, graph analysis, machine learning and data mining methods.

Relevant topics include

  • ontology learning from Web 2.0 data
  • instance extraction from Web 2.0 systems
  • analysis of Blogs
  • discovering social structures and communities
  • predicting trends and user behaviour
  • analysis of dynamic networks
  • using content of the Web for modelling
  • discovering misuse and fraud
  • network analysis of social resource sharing systems
  • analysis of folksonomies and other Web 2.0 data structures
  • analysis of Web 2.0 applications and their data
  • deriving profiles from usage
  • personalized delivery of news and journals
  • Semantic Web personalization
  • Semantic Web technologies for recommender systems
  • ubiquitous data mining in Web (2.0) environment
  • applications

The $1M Netflix Grand Prize taken by BellKor’s Pragmatic Chaos?

June 26th, 2009, by Tim Finin, posted in AI, Machine Learning, Social media

BellKor’s Pragmatic Chaos has broken the 10% barrier, a feat that may have won them the $1M Netflix prize. We’ll know for sure in 30 days.

“June 26, 2009: Today our team submitted our solution to the Netflix Prize, resulting in a score of .8558, which corresponds to an improvement over Netflix Cinematch algorithm of 10.05%. This is the first submission in the competition to break the 10% barrier and sets off a 30 day period where all competitors are invited to submit their best and final solutions.”

The prize is awarded by Netflix in an open competition, begun in October 2006, for the best collaborative filtering algorithm for predicting user ratings of films from a database of previous ratings. Today the BellKor’s Pragmatic Chaos team submitted an entry that improved on the existing algorithm by 10.05%, exceeding the 10% improvement threshold required of a winner. The team is a collaboration between people from Pragmatic Theory, Commendo, Yahoo and AT&T.
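For readers wondering how a score of .8558 turns into 10.05%, here is a small sketch of the arithmetic. The contest score is a root mean squared error (RMSE) between predicted and actual ratings, and the improvement is measured relative to Cinematch's RMSE. The baseline value of 0.9514 used below is my assumption; it is not given in the post or in the team's announcement quoted above. The function names are likewise just for illustration.

```python
import math
from typing import Sequence


def rmse(predicted: Sequence[float], actual: Sequence[float]) -> float:
    """Root mean squared error between predicted and actual ratings."""
    if len(predicted) != len(actual) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    squared_error = sum((p - a) ** 2 for p, a in zip(predicted, actual))
    return math.sqrt(squared_error / len(actual))


def improvement_pct(baseline_rmse: float, new_rmse: float) -> float:
    """Percentage reduction in RMSE relative to the baseline."""
    return 100.0 * (baseline_rmse - new_rmse) / baseline_rmse


# Assumed Cinematch baseline RMSE (not stated in the post).
CINEMATCH_RMSE = 0.9514

print(round(improvement_pct(CINEMATCH_RMSE, 0.8558), 2))  # 10.05
```

Under that assumed baseline, the 10% threshold corresponds to an RMSE just under 0.8563, which is why a change in the fourth decimal place matters so much at this stage of the competition.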

“The Netflix Prize seeks to substantially improve the accuracy of predictions about how much someone is going to love a movie based on their movie preferences. Improve it enough and you win one (or more) Prizes. Winning the Netflix Prize improves our ability to connect people to the movies they love.”

Google is from Mars, Facebook is from Venus

June 23rd, 2009, by Tim Finin, posted in Google, Social media

Wired has an interesting article on Facebook vs. Google, Great Wall of Facebook: The Social Network’s Plan to Dominate the Internet — and Keep Google Out.

“Today, the Google-Facebook rivalry isn’t just going strong, it has evolved into a full-blown battle over the future of the Internet—its structure, design, and utility. For the last decade or so, the Web has been defined by Google’s algorithms—rigorous and efficient equations that parse practically every byte of online activity to build a dispassionate atlas of the online world. Facebook CEO Mark Zuckerberg envisions a more personalized, humanized Web, where our network of friends, colleagues, peers, and family is our primary source of information, just as it is offline. In Zuckerberg’s vision, users will query this “social graph” to find a doctor, the best camera, or someone to hire—rather than tapping the cold mathematics of a Google search. It is a complete rethinking of how we navigate the online world, one that places Facebook right at the center. In other words, right where Google is now.”

This is definitely a David and Goliath match, what with Facebook not having turned a profit yet. The article does a good job of pointing out how their services are different and complement one another.

At the risk of evoking discredited stereotypes, maybe Google is from Mars and Facebook is from Venus.

Etiquette of using your smartphone in a meeting

June 21st, 2009, by Tim Finin, posted in Social media

The New York Times has an article on the etiquette of using your smart phone in meetings, At Meetings, It’s Mind Your BlackBerry or Mind Your Manners.

“As Web-enabled smartphones have become standard on the belts and in the totes of executives, people in meetings are increasingly caving in to temptation to check e-mail, Facebook, Twitter, even (shhh!) ESPN.com. But a spirited debate about etiquette has broken out. Traditionalists say the use of BlackBerrys and iPhones in meetings is as gauche as ordering out for pizza. Techno-evangelists insist that to ignore real-time text messages in a need-it-yesterday world is to invite peril.”

Professors have been dealing with this for several years, since most of our students come to class with their laptops. Maybe they are taking notes. But why is he smiling? Now he’s laughing! Was my comment on hill climbing really that funny?

Of course, the dynamics are different outside the classroom.

“In many professional circles, where connections are power, making a show of reaching out to those connections even as co-workers are presenting a spreadsheet presentation seems to have become a kind of workplace boast. Mr. Brotherton, the consultant, wrote in an e-mail message that it was customary now for professionals to lay BlackBerrys or iPhones on a conference table before a meeting — like gunfighters placing their Colt revolvers on the card tables in a saloon. “It’s a not-so-subtle way of signaling ‘I’m connected. I’m busy. I’m important. And if this meeting doesn’t hold my interest, I’ve got 10 other things I can do instead.’ ”

The Iranian revolution will be Twittered, not televised

June 15th, 2009, by Tim Finin, posted in Social media, Twitter, Web

Social media systems share some aspects of television, but not all. They differ in that their content is created by their users. While the revolution will not be televised, it can be tweeted. It’s been more than 50 years since TV was the thing.

The NYT has an article on the role that social media sites are playing in the conflicts surrounding the Iran election, Social Networks Spread Iranian Defiance Online.

“Iranians are blogging, posting to Facebook and, most visibly, coordinating their protests on Twitter, the messaging service. Their activity has increased, not decreased, since the presidential elections on Friday and ensuing attempts by the government to restrict or censor their online communications.
     On Twitter, reports and links to photos from a peaceful mass march through Tehran on Monday, along with accounts of street fighting and casualties around the country, have become the most popular topic on the service worldwide, according to Twitter’s published statistics.
     A couple of Twitter feeds have become virtual media offices for the supporters of the leading opposition candidate, Mir Hussein Moussavi. One feed, mousavi1388, (1388 is the year in the Persian calendar) is filled with news of protests and exhortations to keep up the fight, in Persian and English. It has more than 7,000 followers. Mr. Moussavi’s fan group on Facebook has swelled to over 50,000 members, a significant increase since election day.”

The article also reports on efforts to encourage cyber attacks on Iranian sites:

“Some Twitter users were also going on the offensive. On Monday morning, an antigovernment activist using the Twitter account “DDOSIran” asked supporters to visit a Web site to participate in an online attack to try to crash government Web sites by overwhelming them with traffic. By Monday afternoon, many of those sites were not accessible, though it was not clear if the attack was responsible — and the Twitter account behind the attack had been removed. A Twitter spokeswoman said the company had no connection to the deletion of the account.”

A PHP script is still available on the web and can be found if you search for it.

Tweets from Iran good source of immediate information on #iranelection

June 15th, 2009, by Tim Finin, posted in Social media, Twitter

The urban areas of Iran are developed and many people there use social media, including Twitter. You can see their reactions to the election results, and to the public unrest that followed, via their tweets. Use this Twitter search query for a sample. This is an important example of how social media is having an impact on news.

Update: Also check out recent Flickr photos tagged with iranelection.

Update 2: Here are tweets geolocated to Tehran.

BlindSearch evaluates Google, Bing and Yahoo search engines

June 7th, 2009, by Tim Finin, posted in Google, Web, sEARCH

Who’s got the best basic web search engine? One way to approach that question is to conduct an experiment in which subjects rank the results returned by several engines without knowing which is which.

BlindSearch is a simple and neat site that collects ‘objective’ opinions on search quality by showing query results from Google, Yahoo and Bing side by side without identifying which is which and inviting you to select the best.

“Type in a search query above, hit search then vote for the column which you believe best matches your query. The columns are randomised with every query.

The goal of this site is simple, we want to see what happens when you remove the branding from search engines. How differently will you perceive the results?”
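The mechanics described in the quote are simple enough to sketch. The code below is not BlindSearch's implementation, which I have not seen; the function names and the data layout are assumptions. It shuffles the engines' result columns for each query, records a vote against whichever engine was hidden behind the chosen column, and reports vote shares as percentages.

```python
import random
from collections import Counter
from typing import Dict, List, Optional, Tuple

# A column pairs the hidden engine name with its result titles.
Column = Tuple[str, List[str]]


def blind_columns(results_by_engine: Dict[str, List[str]],
                  rng: Optional[random.Random] = None) -> List[Column]:
    """Return the engines' result lists in a random column order."""
    rng = rng or random.Random()
    columns = list(results_by_engine.items())
    rng.shuffle(columns)
    return columns


def record_vote(columns: List[Column], chosen_index: int,
                tally: Counter) -> str:
    """Credit the engine behind the chosen column and return its name."""
    engine, _ = columns[chosen_index]
    tally[engine] += 1
    return engine


def vote_shares(tally: Counter) -> Dict[str, float]:
    """Convert raw vote counts into percentages of all votes cast."""
    total = sum(tally.values())
    if total == 0:
        return {}
    return {engine: 100.0 * count / total for engine, count in tally.items()}


# Made-up results for one query; only the result lists would be shown.
results = {"Google": ["g1", "g2"], "Bing": ["b1", "b2"], "Yahoo": ["y1", "y2"]}
tally: Counter = Counter()
columns = blind_columns(results)
record_vote(columns, chosen_index=0, tally=tally)
print(vote_shares(tally))
```

Reshuffling on every query matters: if the column order were fixed, a voter could learn which column is which engine after a few searches, and the branding would creep back in.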



As of this writing there have been 1679 votes for preferred results, with Google getting 39%, Bing 39% and Yahoo 22%.

Update (2:14pm EDT, 6/7): Google 45%, Bing 32%, Yahoo 22%, with 11,130 votes.

Google Chrome for Linux and Mac

June 5th, 2009, by Tim Finin, posted in Google, Web, sEARCH

How’s this for truth in advertising? The Chromium blog announces early developer versions of Google Chrome for Mac OS X and Linux, but warns people not to try them in a post, Danger: Mac and Linux builds available.

“In order to get more feedback from developers, we have early developer channel versions of Google Chrome for Mac OS X and Linux, but whatever you do, please DON’T DOWNLOAD THEM! Unless of course you are a developer or take great pleasure in incomplete, unpredictable, and potentially crashing software. How incomplete? So incomplete that, among other things, you won’t yet be able to view YouTube videos, change your privacy settings, set your default search provider, or even print.”

Of course, they know that this will make trying them irresistible to some of us. If that includes you, go get the Mac or Linux version.

Rising tide lifts all browsers

June 3rd, 2009, by Tim Finin, posted in Web

Mozilla’s Asa Dotzler posted some interesting graphs showing historical browser usage. Looking at the percentage of users, Internet Explorer is slowly losing market share to Firefox and Safari.



Looking at the total number of users, all three are increasing.



Bing vs. Google, side by side comparison

June 1st, 2009, by Tim Finin, posted in Google, Security, Semantic Web, Social media, sEARCH

Microsoft’s new Bing search engine is getting a lot of interest. Glenn McDonald posts about a nice side-by-side Bing vs Google comparator that he developed. It makes it easy to compare how the two services do on a range of different types of searches. Here are the ones that Glenn said he found useful in developing his initial opinion.

I sense from some of these queries that he is probing the systems where an advanced search engine can exploit a little bit of semantic knowledge. For example, recognizing that a user’s query “boston to asheville” matches a common pattern “[place] to [place]”, and she probably is interested in information about how to travel from the first location to the second. It seems like Google has been working on adding more such patterns, at least for the low-hanging fruit.
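As a rough illustration of what exploiting "a little bit of semantic knowledge" might look like, here is a sketch that recognizes travel-style queries. It is only a guess at the general idea, not how Google or Bing actually do it. The tiny place list `KNOWN_PLACES`, the regular expression, and the function name are all hypothetical.

```python
import re
from typing import Optional, Tuple

# Hypothetical mini-gazetteer; a real system would use a large place database.
KNOWN_PLACES = {"boston", "asheville", "baltimore"}

# Matches "<something> to <something>", ignoring case and outer whitespace.
ROUTE_PATTERN = re.compile(
    r"^\s*(?P<origin>.+?)\s+to\s+(?P<destination>.+?)\s*$",
    re.IGNORECASE,
)


def parse_travel_query(query: str) -> Optional[Tuple[str, str]]:
    """Return (origin, destination) if the query looks like a trip request."""
    match = ROUTE_PATTERN.match(query)
    if match is None:
        return None
    origin = match.group("origin").lower()
    destination = match.group("destination").lower()
    # The surface pattern alone is not enough: "how to cook" also fits it.
    # Checking both sides against known places supplies the semantics.
    if origin in KNOWN_PLACES and destination in KNOWN_PLACES:
        return origin, destination
    return None


print(parse_travel_query("boston to asheville"))
print(parse_travel_query("how to cook"))
```

The place check is what separates a trip request from the many other queries that happen to contain the word "to", and it is the part that requires knowledge rather than string matching.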

Of course, if everyone hits this site it may get throttled or blocked by either or both of the search engines. @Glenn, would you be willing to share your code?

(spotted on hacker news)

