Archive for November, 2009
November 20th, 2009, by Tim Finin, posted in Semantic Web, Social media, Twitter
Twitter turned on its API for geotagging tweets yesterday, as announce in in a post on their blog, Think Globally, Tweet Locally. Currently, geographic information will only be associated with your tweets if you use an application that adds it and will only be used to display your tweets when viewed with an application that can exploit it. Here’s the way Twitter described it.
“This release is unique in that it’s API-only which means you won’t see any changes on twitter.com, yet. Instead, Twitter applications like Birdfeed, Seesmic Web, Foursquare, Gowalla, Twidroid, Twittelator Pro and others are already supporting this new functionality (go try them out now!) in interesting ways that include geotagging your tweets and displaying the location from where a tweet was posted.”
Examining Twitter’s status update API description shows how one associates a location with a Tweet. Pretty simple.
Since disclosing your location raises privacy concerns, Twitter has made geotagging an opt-in service and also allows users to delete all of the location information associated with their tweets. Moreover, their policy, as described here, says
“We require application developers to be upfront and obvious about when they are Geotagging an update. If you ever find that an application is doing it without notifying you, please let us know.”
You can read more on ReadWriteWeb and Techcrunch.
November 18th, 2009, by Tim Finin, posted in Humor
November 15th, 2009, by Tim Finin, posted in KR, Semantic Web, Social media
Wikipedia has an interesting RFC on approaches to achieve and maintain better coherence in its infobox templates. This is significant because Wikipedia is becoming the new CYC — a broad, practical KB filled with general purpose background knowledge. The RFC was kicked off by discussions on dbpedia template annotations. The RFC defines the problem as:
“Wikipedia uses hundreds of infobox templates for describing various entity types like NFL teams, schools in Canada, train stations etc. These infoboxes are separated and do not use a common vocabulary. Several different spellings of attributes are used for them, which all stand for the same meaning (e.g. birth_place, birthPlace, origin). This poses limitations to checking consistency within Wikipedia infoboxes, amongst different language editions, and it makes it hard for external tools to reuse the information in infoboxes.”
The goals mentioned in the RFC include (1) establishing the currently missing links between synonymous template attributes, (2) enabling authors to use template annotations to check for for factual inconsistencies (e.g., outdated population figures), and (3) providing consensus about which properties should be used in templates and what data they should contain.
November 13th, 2009, by Tim Finin, posted in Privacy, Social media, Web
This ought to be fun.
According to an article in the WSJ, Europe Approves New Cookie Law, “the Council of the European Union has approved new legislation that would require Web users to consent to Internet cookies..”
The law could have broad repercussions for online ads. “Almost every site that carries advertising should be seeking its visitors’ consent to the serving of cookies,” wrote Struan Robertson, a lawyer specializing in technology at Pinsent Masons and editor of Out-Law.com. “It also catches sites that count visitors — so if your site uses Google Analytics or WebTrends, you’re caught.”
This hit slashdot (“Breathtakingly Stupid” EU Cookie Law Passes) this morning.
Hmmmm. I wonder how we would implement cookie opt-out. I think setting a cookie to indicate that the user has opted out of your site’s cookies would be a good approach.
November 12th, 2009, by Tim Finin, posted in Humor, Social media
Around here we prefer range voting to approval voting or IRV.
November 12th, 2009, by Tim Finin, posted in Google, Programming
Mark Chu-Carroll is a Google software engineer who’s written a long, detailed and informed review of Google’s new programming language Go. It’s worth a read if you are interested in understanding what it’s like as a programming language. Here’s a few points that I took note of.
“The guys who designed Go were very focused on keeping things as small and simple as possible. When you look at it in contrast to a language like C++, it’s absolutely striking. Go is very small, and very simple. There’s no cruft. No redundancy. Everything has been pared down. But for the most part, they give you what you need. If you want a C-like language with some basic object-oriented features and garbage collection, Go is about as simple as you could realistically hope to get.”
“The most innovative thing about it is its type system. … It ends up giving you something with the flavor of Python-ish duck typing, but with full type-checking from the compiler.”
“Go programs compile really astonishingly quickly. When I first tried it, I thought that I had made a mistake building the compiler. It was just too damned fast. I’d never seen anything quite like it.”
“At the end of the day, what do I think? I like Go, but I don’t love it. If it had generics, it would definitely be my favorite of the C/C++/C#/Java family. It’s got a very elegant simplicity to it which I really like. The interface type system is wonderful. The overall structure of programs and modules is excellent. But it’s got some ugliness. … It’s not going to wipe C++ off the face of the earth. But I think it will establish itself as a solid alternative.”
Go sounds like a language that will help you grow as a computer scientist if you use it. That’s a good enough recommendation for me.
November 11th, 2009, by Tim Finin, posted in AI, sEARCH, Semantic Web
Yong Yu and Rudi Studer are editing a special issue of the Journal of Web Semantics on semantic search that will appear in the summer 2010. The special issue will cover interdisciplinary topics between Semantic Web and search. See the call for papers for a list of relevant topics and details on how to submit papers, which are due by 20 January 2010
November 11th, 2009, by Tim Finin, posted in AI, Google, NLP, sEARCH, Semantic Web
PCWorld has a story, Google VP Mayer Describes the Perfect Search Engine, with some interesting comments on semantic search from Marissa Mayer, Google’s vice president of Search Products & User Experience.
“IDGNS: What’s the status of semantic search at Google? You have said in the past that through “brute force” — analyzing massive amounts of queries and Web content — Google’s engine can deliver results that make it seem as if it understood things semantically, when it really functions using other algorithmic approaches. Is that still the preferred approach?
Mayer: We believe in building intelligent systems that learn off of data in an automated way, [and then] tuning and refining them. When people talk about semantic search and the semantic Web, they usually mean something that is very manual, with maps of various associations between words and things like that. We think you can get to a much better level of understanding through pattern-matching data, building large-scale systems. That’s how the brain works. That’s why you have all these fuzzy connections, because the brain is constantly processing lots and lots of data all the time.
IDGNS: A couple of years ago or so, some experts were predicting that semantic technology would revolutionize search and blindside Google, but that hasn’t happened. It seems that semantic search efforts have hit a wall, especially because semantic engines are hard to scale.
Mayer: The problem is that language changes. Web pages change. How people express themselves changes. And all those things matter in terms of how well semantic search applies. That’s why it’s better to have an approach that’s based on machine learning and that changes, iterates and responds to the data. That’s a more robust approach. That’s not to say that semantic search has no part in search. It’s just that for us, we really prefer to focus on things that can scale. If we could come up with a semantic search solution that could scale, we would be very excited about that. For now, what we’re seeing is that a lot of our methods approximate the intelligence of semantic search but do it through other means.”
I interpret these comments to mean that Google’s management still views the concept of semantic search (and the Semantic Web) as involving better understanding of the intended meaning of text in documents and queries. The W3C’s web of data model is still not on their radar.
November 10th, 2009, by Tim Finin, posted in High performance computing, Privacy, Security, Semantic Web
The Economist has been running a series of online Oxford Union style debates on topical issues — CEO pay, healthcare, climate change, etc. The latest one is on the cloud computing: This house believes that the cloud can’t be entirely trusted.
In his opening remarks, moderator Ludwig Siegele says
“The participants in this debate, including the three guest speakers, all agree that computing is moving into the cloud. “We are experiencing a disruptive moment in the history of technology, with the expansion of the role of the internet and the advent of cloud-based computing”, says Stephen Elop, president of Microsoft’s business division, which generates about a third of the firm’s revenues ($13 billion) and more than half of its profits ($4.5 billion) in the most recent quarter. Marc Benioff, chief executive of Salesforce.com, the world’s largest SaaS provider with over $1.2 billion in sales in the past 12 months, is no less bullish: ‘Like the shift [from the mainframe to the client/server architecture] that roiled our industry in decades past, the transition to cloud computing is happening now because of major discontinuities in cost, value and function.'”
While the debate’s proposition suggests that security or privacy is its focus, it’s really a broader argument about how software services will be delivered in the future in which security is just one aspect.
“Whether and to what extent companies and consumers elect to hand their computing over to others, of course, depends on how much they trust the cloud. And customers still have many questions. How reliable are such services? What about privacy? Don’t I lose too much control? What if Salesforce.com, for instance, changes its service in a way I do not like? Are such web-based services really cheaper than traditional software? And how easy is it to get my data if I want to change providers? Are there open technical standards that would make this easier?”
November 9th, 2009, by Tim Finin, posted in Ontologies, Semantic Web, Social media, Web
The Journal of Web Semantics now has a facebook page and a Twitter account to augment its blog. All three will be used for news and announcements of call for papers, special issues, availability of new papers, etc. As you might expect, the tweets will be terse items, the facebook updates longer notes and the blog posts full of details. Those who are interested can follow @journalWebSem on Twitter, become a fan of the JWS on facebook, and subscribe to the blog’s feed.
November 6th, 2009, by Tim Finin, posted in Semantic Web
UMBC alumnus Joab Jackson has an article in Government Computer News, Tim Berners-Lee: Machine-readable Web still a ways off, reporting on the International Semantic Web Conference help outside of Washington DC at the end of October. The article uses data.gov to illustrate the challenges and opportunities for the Semantic Web. Data.gov is a site whose purpose “is to increase public access to high value, machine readable datasets generated by the Executive Branch of the Federal Government.”
Jackson quotes Tim Berners-Lee
“When you look at putting government data on the Web, one of the concerns is … to not just put it out there on Excel files on Data.gov,” he said. “You should put these things in” the Resource Description Framework.
and later describes a project at RPI to republish information from data.gov in RDF leaded by another UMBC alumnus, Li Ding.
“Our goal is to make the whole thing shareable and replicable for others to re-use,” said project researcher Li Ding. By rendering data into RDF, it can be more easily interposed with other sets of data to create entirely new datasets and visualizations, Ding said. He showed a Google Map-based graphic that interposed RDF-versions of two different data sources from the Environmental Protection Agency, originally rendered in CSV files.
November 5th, 2009, by Tim Finin, posted in CS, GENERAL
This post on the CACM Blog caught my eye and shows that we still have a long way to go before computing is taken seriously in US secondary education, let alone K-12.
AP CS no Longer Counts for High School Graduation in Georgia (for now)
“Up until September, Georgia and Texas were the (only) two states in the US that accepted a computer science course as fulfilling high school graduation requirements. In Texas, the Advanced Placement Computer Science (AP CS) course fulfilled a mathematics requirement. In Georgia, it fulfilled a fourth science course requirement. As of October, however, Georgia has rescinded that decision. … ”
I wonder how other countries treat computing and informatics in primary and secondary education.