Archive for the 'Web' Category
November 20th, 2009, by Tim Finin, posted in Semantic Web, Social media, Twitter
Twitter turned on its API for geotagging tweets yesterday, as announced in a post on their blog, Think Globally, Tweet Locally. Currently, geographic information will only be associated with your tweets if you use an application that adds it and will only be used to display your tweets when viewed with an application that can exploit it. Here’s the way Twitter described it.
“This release is unique in that it’s API-only which means you won’t see any changes on twitter.com, yet. Instead, Twitter applications like Birdfeed, Seesmic Web, Foursquare, Gowalla, Twidroid, Twittelator Pro and others are already supporting this new functionality (go try them out now!) in interesting ways that include geotagging your tweets and displaying the location from where a tweet was posted.”
Examining Twitter’s status update API description shows how one associates a location with a Tweet. Pretty simple.
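To make that concrete, here is a minimal sketch of how a client might build such a request. The `lat` and `long` parameter names follow my reading of Twitter's status update description; the endpoint URL, the helper name and the coordinates are assumptions for illustration. The request is only built, not sent, since sending it would require authentication.

```python
from urllib.parse import urlencode

# Assumed endpoint, shown for illustration only; check Twitter's API docs.
UPDATE_URL = "https://api.twitter.com/1/statuses/update.json"


def build_geotagged_update(status, lat, lon):
    """Return the form-encoded body for a status update carrying a location."""
    if not -90.0 <= lat <= 90.0:
        raise ValueError("latitude must be between -90 and 90")
    if not -180.0 <= lon <= 180.0:
        raise ValueError("longitude must be between -180 and 180")
    # The location rides along as two extra parameters next to the status text.
    params = {"status": status, "lat": f"{lat:.6f}", "long": f"{lon:.6f}"}
    return urlencode(params)


body = build_geotagged_update("Hello from UMBC", 39.2557, -76.7110)
print(body)  # this string would be POSTed to UPDATE_URL
```

Validating the coordinate ranges on the client side is my own addition; a real application would also need the user's explicit opt-in before attaching a location.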
Since disclosing your location raises privacy concerns, Twitter has made geotagging an opt-in service and also allows users to delete all of the location information associated with their tweets. Moreover, their policy, as described here, says
“We require application developers to be upfront and obvious about when they are Geotagging an update. If you ever find that an application is doing it without notifying you, please let us know.”
Twitter has updated its privacy policy to cover location information.
You can read more on ReadWriteWeb and TechCrunch.
November 15th, 2009, by Tim Finin, posted in KR, Semantic Web, Social media
Wikipedia has an interesting RFC on approaches to achieve and maintain better coherence in its infobox templates. This is significant because Wikipedia is becoming the new CYC — a broad, practical KB filled with general purpose background knowledge. The RFC was kicked off by discussions on dbpedia template annotations. The RFC defines the problem as:
“Wikipedia uses hundreds of infobox templates for describing various entity types like NFL teams, schools in Canada, train stations etc. These infoboxes are separated and do not use a common vocabulary. Several different spellings of attributes are used for them, which all stand for the same meaning (e.g. birth_place, birthPlace, origin). This poses limitations to checking consistency within Wikipedia infoboxes, amongst different language editions, and it makes it hard for external tools to reuse the information in infoboxes.”
The goals mentioned in the RFC include (1) establishing the currently missing links between synonymous template attributes, (2) enabling authors to use template annotations to check for factual inconsistencies (e.g., outdated population figures), and (3) providing consensus about which properties should be used in templates and what data they should contain.
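The first two goals can be illustrated with a small sketch: map the variant attribute spellings onto one canonical name, then flag values that disagree once they share a name. The synonym table, the function name and the sample infobox are all hypothetical; the RFC does not prescribe this mechanism.

```python
# Hypothetical synonym table: each variant spelling maps to one canonical name.
CANONICAL = {
    "birth_place": "birthPlace",
    "birthplace": "birthPlace",
    "origin": "birthPlace",
}


def normalize_infobox(infobox):
    """Rewrite attribute names to canonical ones and report conflicting values."""
    normalized, conflicts = {}, []
    for key, value in infobox.items():
        # Try the exact spelling first, then a lower-cased one, else keep the key.
        canon = CANONICAL.get(key, CANONICAL.get(key.lower(), key))
        if canon in normalized and normalized[canon] != value:
            conflicts.append((canon, normalized[canon], value))
        else:
            normalized[canon] = value
    return normalized, conflicts


# Invented infobox in which two synonymous attributes disagree.
sample = {"name": "Example Person", "birth_place": "London", "origin": "Paris"}
normalized, conflicts = normalize_infobox(sample)
print(normalized)
print(conflicts)
```

Once the attributes share a vocabulary, the same comparison could in principle be run across language editions of the same article.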
November 13th, 2009, by Tim Finin, posted in Privacy, Social media, Web
This ought to be fun.
According to an article in the WSJ, Europe Approves New Cookie Law, “the Council of the European Union has approved new legislation that would require Web users to consent to Internet cookies.”
The law could have broad repercussions for online ads. “Almost every site that carries advertising should be seeking its visitors’ consent to the serving of cookies,” wrote Struan Robertson, a lawyer specializing in technology at Pinsent Masons and editor of Out-Law.com. “It also catches sites that count visitors — so if your site uses Google Analytics or WebTrends, you’re caught.”
This hit slashdot (“Breathtakingly Stupid” EU Cookie Law Passes) this morning.
By the way, our ebiquity site uses cookies. Send mail to no-more-ebiquity-cookies at cs.umbc.edu if you want to opt out.
Hmmmm. I wonder how we would implement cookie opt-out. I think setting a cookie to indicate that the user has opted out of your site’s cookies would be a good approach.
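Here is a rough sketch of that idea using Python's standard `http.cookies` module. The cookie name `cookie_opt_out`, the one-year lifetime and the helper names are my own assumptions, not anything our site actually implements.

```python
from http.cookies import SimpleCookie

OPT_OUT_COOKIE = "cookie_opt_out"  # hypothetical cookie name


def has_opted_out(cookie_header):
    """Check the request's Cookie header for the opt-out marker."""
    cookie = SimpleCookie()
    cookie.load(cookie_header or "")
    morsel = cookie.get(OPT_OUT_COOKIE)
    return morsel is not None and morsel.value == "1"


def opt_out_header():
    """Build the Set-Cookie value that records the opt-out for one year."""
    cookie = SimpleCookie()
    cookie[OPT_OUT_COOKIE] = "1"
    cookie[OPT_OUT_COOKIE]["path"] = "/"
    cookie[OPT_OUT_COOKIE]["max-age"] = 365 * 24 * 60 * 60
    return cookie[OPT_OUT_COOKIE].OutputString()


def cookies_to_set(cookie_header, tracking_cookies):
    """Return the cookies the site may set: none once the visitor opted out."""
    return [] if has_opted_out(cookie_header) else list(tracking_cookies)


print(opt_out_header())
print(cookies_to_set("cookie_opt_out=1", ["analytics_id"]))
```

The obvious catch is that the opt-out is itself a cookie, so it is lost whenever the visitor clears their cookies or switches browsers.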
November 11th, 2009, by Tim Finin, posted in AI, Semantic Web, Search
Yong Yu and Rudi Studer are editing a special issue of the Journal of Web Semantics on semantic search that will appear in summer 2010. The special issue will cover interdisciplinary topics between the Semantic Web and search. See the call for papers for a list of relevant topics and details on how to submit papers, which are due by 20 January 2010.
November 11th, 2009, by Tim Finin, posted in AI, Google, NLP, Semantic Web, Search
PCWorld has a story, Google VP Mayer Describes the Perfect Search Engine, with some interesting comments on semantic search from Marissa Mayer, Google’s vice president of Search Products & User Experience.
“IDGNS: What’s the status of semantic search at Google? You have said in the past that through “brute force” — analyzing massive amounts of queries and Web content — Google’s engine can deliver results that make it seem as if it understood things semantically, when it really functions using other algorithmic approaches. Is that still the preferred approach?
Mayer: We believe in building intelligent systems that learn off of data in an automated way, [and then] tuning and refining them. When people talk about semantic search and the semantic Web, they usually mean something that is very manual, with maps of various associations between words and things like that. We think you can get to a much better level of understanding through pattern-matching data, building large-scale systems. That’s how the brain works. That’s why you have all these fuzzy connections, because the brain is constantly processing lots and lots of data all the time.
IDGNS: A couple of years ago or so, some experts were predicting that semantic technology would revolutionize search and blindside Google, but that hasn’t happened. It seems that semantic search efforts have hit a wall, especially because semantic engines are hard to scale.
Mayer: The problem is that language changes. Web pages change. How people express themselves changes. And all those things matter in terms of how well semantic search applies. That’s why it’s better to have an approach that’s based on machine learning and that changes, iterates and responds to the data. That’s a more robust approach. That’s not to say that semantic search has no part in search. It’s just that for us, we really prefer to focus on things that can scale. If we could come up with a semantic search solution that could scale, we would be very excited about that. For now, what we’re seeing is that a lot of our methods approximate the intelligence of semantic search but do it through other means.”
I interpret these comments to mean that Google’s management still views the concept of semantic search (and the Semantic Web) as involving better understanding of the intended meaning of text in documents and queries. The W3C’s web of data model is still not on their radar.
November 10th, 2009, by Tim Finin, posted in High performance computing, Privacy, Security, Semantic Web
The Economist has been running a series of online Oxford Union style debates on topical issues such as CEO pay, healthcare and climate change. The latest one is on cloud computing: This house believes that the cloud can’t be entirely trusted.
In his opening remarks, moderator Ludwig Siegele says
“The participants in this debate, including the three guest speakers, all agree that computing is moving into the cloud. “We are experiencing a disruptive moment in the history of technology, with the expansion of the role of the internet and the advent of cloud-based computing”, says Stephen Elop, president of Microsoft’s business division, which generates about a third of the firm’s revenues ($13 billion) and more than half of its profits ($4.5 billion) in the most recent quarter. Marc Benioff, chief executive of Salesforce.com, the world’s largest SaaS provider with over $1.2 billion in sales in the past 12 months, is no less bullish: ‘Like the shift [from the mainframe to the client/server architecture] that roiled our industry in decades past, the transition to cloud computing is happening now because of major discontinuities in cost, value and function.’”
While the debate’s proposition suggests that security or privacy is its focus, it’s really a broader argument about how software services will be delivered in the future, in which security is just one aspect.
“Whether and to what extent companies and consumers elect to hand their computing over to others, of course, depends on how much they trust the cloud. And customers still have many questions. How reliable are such services? What about privacy? Don’t I lose too much control? What if Salesforce.com, for instance, changes its service in a way I do not like? Are such web-based services really cheaper than traditional software? And how easy is it to get my data if I want to change providers? Are there open technical standards that would make this easier?”
November 9th, 2009, by Tim Finin, posted in Ontologies, Semantic Web, Social media, Web
The Journal of Web Semantics now has a Facebook page and a Twitter account to augment its blog. All three will be used for news and announcements of calls for papers, special issues, availability of new papers, etc. As you might expect, the tweets will be terse items, the Facebook updates longer notes and the blog posts full of details. Those who are interested can follow @journalWebSem on Twitter, become a fan of the JWS on Facebook, and subscribe to the blog’s feed.
November 6th, 2009, by Tim Finin, posted in Semantic Web
UMBC alumnus Joab Jackson has an article in Government Computer News, Tim Berners-Lee: Machine-readable Web still a ways off, reporting on the International Semantic Web Conference held outside of Washington DC at the end of October. The article uses data.gov to illustrate the challenges and opportunities for the Semantic Web. Data.gov is a site whose purpose “is to increase public access to high value, machine readable datasets generated by the Executive Branch of the Federal Government.”
Jackson quotes Tim Berners-Lee
“When you look at putting government data on the Web, one of the concerns is … to not just put it out there on Excel files on Data.gov,” he said. “You should put these things in” the Resource Description Framework.
and later describes a project at RPI, led by another UMBC alumnus, Li Ding, to republish information from data.gov in RDF.
“Our goal is to make the whole thing shareable and replicable for others to re-use,” said project researcher Li Ding. By rendering data into RDF, it can be more easily interposed with other sets of data to create entirely new datasets and visualizations, Ding said. He showed a Google Map-based graphic that interposed RDF-versions of two different data sources from the Environmental Protection Agency, originally rendered in CSV files.
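For readers who have not seen what rendering CSV into RDF involves, here is a minimal sketch that turns each CSV cell into one N-Triples statement. It is not the RPI project's method; the `http://example.org/data/` namespace, the column names and the sample rows are invented for illustration.

```python
import csv
import io
from urllib.parse import quote

BASE = "http://example.org/data/"  # hypothetical namespace


def literal(value):
    """Escape a string as an N-Triples plain literal."""
    escaped = (value.replace("\\", "\\\\").replace('"', '\\"')
               .replace("\n", "\\n").replace("\r", "\\r"))
    return f'"{escaped}"'


def csv_to_ntriples(text, key_column):
    """Yield one triple per non-key cell; the key column names the subject."""
    for row in csv.DictReader(io.StringIO(text)):
        subject = f"<{BASE}record/{quote(row[key_column], safe='')}>"
        for column, value in row.items():
            # Skip the key itself, empty cells and cells without a header.
            if column is None or column == key_column or not value:
                continue
            predicate = f"<{BASE}property/{quote(column, safe='')}>"
            yield f"{subject} {predicate} {literal(value)} ."


# Invented sample data, not taken from any EPA dataset.
sample = "site_id,state,ozone_ppm\nS1,MD,0.061\nS2,VA,0.058\n"
triples = list(csv_to_ntriples(sample, "site_id"))
for triple in triples:
    print(triple)
```

Because every record and column gets a URI, triples produced from two different CSV files can be merged simply by concatenating them, which is what makes this kind of mashup easier than joining spreadsheets by hand.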
November 5th, 2009, by Tim Finin, posted in Google, Privacy, Semantic Web, Social media, Web
Google added a great new service, Dashboard, that summarizes data stored for a Google account — see MY ACCOUNT>PERSONAL SETTINGS>DASHBOARD.
“Designed to be simple and useful, the Dashboard summarizes data for each product that you use (when signed in to your account) and provides you direct links to control your personal settings. Today, the Dashboard covers more than 20 products and services, including Gmail, Calendar, Docs, Web History, Orkut, YouTube, Picasa, Talk, Reader, Alerts, Latitude and many more. The scale and level of detail of the Dashboard is unprecedented, and we’re delighted to be the first Internet company to offer this — and we hope it will become the standard.”
This is a good move on Google’s part. But while there’s a lot of information included, it’s not everything that Google knows about you. It omits, for example, data in cookies, click-through data from search results and information from companies it has acquired, like DoubleClick. Still, it is a big step in a positive direction.
November 4th, 2009, by Tim Finin, posted in Security, Social media
Yesterday was the first time a truly voter-verifiable voting system was used in any binding government election, thanks in part to work being carried out at UMBC’s Cyber Defense Lab under the direction of Alan Sherman.
Takoma Park, MD used the Scantegrity system for its municipal election after testing it in a mock election last April. Technology Review has a story, First Test for Election Cryptography, that quotes Anne Sergeant, the chair of the Takoma Park board of elections
“Before trying Scantegrity in an official election, the city held a mock vote in April to work out kinks in the system. In that test, she says, about 30 percent of participants went home and used the system to verify their votes. Sergeant says that Scantegrity representatives talked extensively with voters and election officials after the April test and have improved their system accordingly. “I hope we can provide an experience where people walk away and say, ‘That was awesome,’” she says. “It’s a goal to which we aspire.”
The Scantegrity system was created by a group of universities, including UMBC. A voter uses a paper ballot marked with invisible ink, which is exposed with a special marker. That marker reveals a code, which the voter can then use to check online whether their vote was tabulated correctly.
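The voter's side of that online check amounts to a lookup and a comparison, which the toy sketch below illustrates. It leaves out the cryptographic commitments and audits that make the published codes trustworthy, and the ballot serial number, contest names and codes are invented.

```python
# Invented data: codes the election authority published for each ballot serial.
PUBLISHED = {
    "0042": {"mayor": "7KX", "council": "Q2M"},
}


def check_ballot(serial, noted_codes, published=PUBLISHED):
    """Return the contests whose published code differs from the voter's note."""
    recorded = published.get(serial)
    if recorded is None:
        raise KeyError(f"no published record for ballot {serial}")
    return sorted(contest for contest, code in noted_codes.items()
                  if recorded.get(contest) != code)


# Codes the voter wrote down at the polling place (hypothetical).
print(check_ballot("0042", {"mayor": "7KX", "council": "Q2M"}))
print(check_ballot("0042", {"mayor": "7KX", "council": "ZZZ"}))
```

An empty list means every code the voter noted matches what was posted; any contest that is listed would be grounds for filing a dispute.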
Ben Adida has been auditing the election and documenting the process on his blog.
See also the ComputerWorld story, E-voting system lets voters verify their ballots are counted, and the audio report on WAMU.
October 30th, 2009, by Tim Finin, posted in Ontologies, RDF, Semantic Web
Like many newspapers, the New York Times links the first mention of well-known entities in its articles to a reference page. For example, a mention of Barack Obama links to a page which is a collection of basic information on President Obama and links to relevant stories and other resources that the Times has created.
Now the Times is also using RDF to publish some of this information as linked open data. Yesterday the Times announced the publication of an LOD collection covering about 5,000 people at http://data.nytimes.com/ under a Creative Commons 3.0 Attribution License and plans to put its full collection of 30K topics online soon.
“Over the last several months we have manually mapped more than 5,000 person name subject headings onto Freebase and DBPedia. And today we are pleased to announce the launch of http://data.nytimes.com and the release of these 5,000 person name subject headings as Linked Open Data.
…
Over the next several months, we plan to expand http://data.nytimes.com to include each of the nearly 30,000 subject headings we use to power Times Topics pages, a collection that includes locations, organizations and descriptors in addition to person names.”