UMBC ebiquity
Google

Archive for the 'Google' Category

Google unemployment index estimates and predicts unemployment

August 20th, 2010, by Tim Finin, posted in Google, Social media

The Google Unemployment Index is an economic indicator based on queries sent to Google’s search engine related to unemployment, social security, welfare, and unemployment benefits. Since some of these search terms are probably leading indicators, it can also be used to predict upcoming changes in the actual unemployment rate.


The index is based on queries tracked via Google Insights for Search and is tuned to different countries. You can also focus on particular regions or metropolitan areas and compare the index across several locations. Here’s an example comparing Florida (blue) and Maryland (red).
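One rough way to test the “leading indicator” idea is to correlate the search index at time t with the unemployment rate some months later. The sketch below is only an illustration: the function names are made up, both series are assumed to be equally spaced lists of numbers you supply yourself, and nothing here reflects how the Google Unemployment Index is actually computed.

```python
from math import sqrt


def pearson(xs, ys):
    """Plain Pearson correlation of two equal-length numeric sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def lagged_correlation(search_index, unemployment, lag):
    """Correlate search_index[t] with unemployment[t + lag] (lag >= 0)."""
    if len(search_index) != len(unemployment):
        raise ValueError("series must have the same length")
    if lag < 0 or lag > len(search_index) - 2:
        raise ValueError("lag out of range")
    if lag == 0:
        return pearson(search_index, unemployment)
    return pearson(search_index[:-lag], unemployment[lag:])


def best_lag(search_index, unemployment, max_lag):
    """Return the lag in 0..max_lag with the highest correlation."""
    return max(range(max_lag + 1),
               key=lambda lag: lagged_correlation(search_index, unemployment, lag))
```

If the best lag comes out greater than zero, the searches tend to move before the official rate does, which is what you would expect from a leading indicator. A correlation alone is of course weak evidence.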

Researchers prove Rubik’s Cube solvable in 20 moves or less

August 13th, 2010, by Tim Finin, posted in AI, Games, GENERAL, Google, Social media

Using a combination of mathematical tricks, good programming and 35 CPU-years on Google’s servers, a group of researchers have proved that every position of Rubik’s Cube can be solved in 20 moves or less. The group consists of Kent State mathematician Morley Davidson, Google engineer John Dethridge, math teacher Herbert Kociemba, and programmer Tomas Rokicki.

This is an amazing result and a testament to more than 30 years of work on the problem. The Cube was invented in 1974 and almost immediately became the subject of programs to solve it. In 1981, Morwen Thistlethwaite proved that any configuration could be solved in no more than 52 moves. Periodically, tighter upper bounds for the maximum solution length were found. This result ends the quest: there are some configurations (about 300M) that require 20 moves to solve and there are none that require more than 20 moves.

In their own words, here’s how the group solved all 43,252,003,274,489,856,000 Cube positions:

  • We partitioned the positions into 2,217,093,120 sets of 19,508,428,800 positions each.
  • We reduced the count of sets we needed to solve to 55,882,296 using symmetry and set covering.
  • We did not find optimal solutions to each position, but instead only solutions of length 20 or less.
  • We wrote a program that solved a single set in about 20 seconds.
  • We used about 35 CPU years to find solutions to all of the positions in each of the 55,882,296 sets.
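The numbers in that list hang together, which is easy to check with a few lines of Python. This is just arithmetic on the figures quoted above. The only assumptions are a 365.25-day year and taking “about 20 seconds” as exactly 20.

```python
# Figures quoted by the researchers.
TOTAL_POSITIONS = 43_252_003_274_489_856_000
NUM_SETS = 2_217_093_120
POSITIONS_PER_SET = 19_508_428_800
SETS_TO_SOLVE = 55_882_296
SECONDS_PER_SET = 20  # "about 20 seconds", taken as exact here

# The partition covers every position exactly once.
assert NUM_SETS * POSITIONS_PER_SET == TOTAL_POSITIONS

# Symmetry and set covering cut the work by roughly a factor of 40.
reduction_factor = NUM_SETS / SETS_TO_SOLVE

# Total compute time, assuming a 365.25-day year.
SECONDS_PER_YEAR = 365.25 * 24 * 3600
cpu_years = SETS_TO_SOLVE * SECONDS_PER_SET / SECONDS_PER_YEAR

print(f"reduction factor: {reduction_factor:.1f}")  # about 39.7
print(f"CPU years: {cpu_years:.1f}")                # about 35.4
```

So 20 seconds per set over 55,882,296 sets does come to roughly 35 CPU years, matching the last bullet.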

This reminds me of the first program I wrote for my own enjoyment, which used brute force to find all solutions to Piet Hein’s Soma Cube. In 1969 I had a summer job as the night operator for an IBM 360 and I would turn off the clock to run my program so that the management wouldn’t know how much computer time I was consuming.

See this BBC story for more information on this amazing result.

Google acquires Metaweb and Freebase

July 16th, 2010, by Tim Finin, posted in Database, Google, sEARCH, Semantic Web, Social media, Web

Google announced today that it has acquired Metaweb, the company behind Freebase — a free, semantic database of “over 12 million people, places, and things in the world.” This is from their announcement on the Official Google blog:

“Over time we’ve improved search by deepening our understanding of queries and web pages. The web isn’t merely words — it’s information about things in the real world, and understanding the relationships between real-world entities can help us deliver relevant information more quickly. … With efforts like rich snippets and the search answers feature, we’re just beginning to apply our understanding of the web to make search better. Type [barack obama birthday] in the search box and see the answer right at the top of the page. Or search for [events in San Jose] and see a list of specific events and dates. We can offer this kind of experience because we understand facts about real people and real events out in the world. But what about [colleges on the west coast with tuition under $30,000] or [actors over 40 who have won at least one oscar]? These are hard questions, and we’ve acquired Metaweb because we believe working together we’ll be able to provide better answers.”

In their announcement, Google promises to continue to maintain Freebase “as a free and open database for the world” and invites other web companies to use and contribute to it.

Freebase is a system very much in the linked open data spirit, even though RDF is not its native representation. Its content is available as RDF and there are many links that bind it to the LOD cloud. Moreover, Freebase has a very good wiki-like interface allowing people to upload, extend and edit both its schema and data.

Here’s a video on the concepts behind Metaweb which are, of course, also those underlying the Semantic Web. What’s the difference? I’d say a combination of representational details and a centralized (Metaweb) vs. distributed (Semantic Web) approach.

Search neutrality: Google and Danny Sullivan weigh in

July 16th, 2010, by Tim Finin, posted in Google, Semantic Web, Social media, Web

Web search guru Danny Sullivan has a great response to the NYT editorial on regulating search engine algorithms: The New York Times Algorithm and Why It Needs Government Regulation. Here’s how it starts:

“The New York Times is the number one newspaper web site. Analysts reckon it ranks first in reach among US opinion leaders. When the New York Times editorial staff tweaks its supersecret algorithm behind what to cover and exactly how to cover a story — as it does hundreds of times a day — it can break a business that is pushed down in coverage or not covered at all.”

Google published its own response to the Times piece as a Financial Times op-ed and also posted it to the Google public policy blog: regulating what is “best” in search?

“Search engines use algorithms and equations to produce order and organisation online where manual effort cannot. These algorithms embody rules that decide which information is “best”, and how to measure it. Clearly defining which of any product or service is best is subjective. Yet in our view, the notion of “search neutrality” threatens innovation, competition and, fundamentally, your ability as a user to improve how you find information.”

The penultimate paragraph gives what they say is their strongest argument against mandating “search neutrality”.

“But the strongest arguments against rules for “neutral search” is that they would make the ranking of results on each search engine similar, creating a strong disincentive for each company to find new, innovative ways to seek out the best answers on an increasingly complex web. What if a better answer for your search, say, on the World Cup or “jaguar” were to appear on the web tomorrow? Also, what if a new technology were to be developed as powerful as PageRank that transforms the way search engines work? Neutrality forcing standardised results removes the potential for innovation and turns search into a commodity.”

This assumes, of course, that there is real competition among Internet search engines. Microsoft has been putting a lot of research and development into Bing with good results and it’s been gaining market share. Yahoo is doing very interesting things as well. Consumer choice among a handful of competitors would be the best way to ensure that none abuse their customers.

New York Times editorializes about the Google search ranking algorithm

July 15th, 2010, by Tim Finin, posted in Google, Semantic Web, Social media, Web

In what may be a first, today’s New York Times has an editorial about an algorithm. No, they haven’t waded into the P=NP issue, but commented on Google’s algorithm for ranking search results and accusations that Google unfairly biases it for its own self interest.

“In the past few months, Google has come under investigation by antitrust regulators in Europe. Rivals have accused Google of placing the Web sites of affiliates like Google Maps or YouTube at the top of Internet searches and relegating competitors to obscurity down the list. In the United States, Google said it expects antitrust regulators to scrutinize its $700 million purchase of the flight information software firm ITA, with which it plans to enter the online travel search market occupied by Expedia, Orbitz, Bing and others.”

This issue will become more important as the companies dominating Web search (Google, Microsoft and Yahoo) continue to increase their importance and also broaden their acquisition of companies offering web services.

The NYT’s position is moderate, recommending:

Google provides an incredibly valuable service, and the government must be careful not to stifle its ability to innovate. Forcing it to publish the algorithm or the method it uses to evaluate it would allow every Web site to game the rules in order to climb up the rankings — destroying its value as a search engine. Requiring each algorithm tweak to be approved by regulators could drastically slow down its improvements. Forbidding Google to favor its own services — such as when it offers a Google Map to queries about addresses — might reduce the value of its searches. With these caveats in mind, if Google is to continue to be the main map to the information highway, it concerns us all that it leads us fairly to where we want to go.

Google Open Spot Android app finds parking

July 9th, 2010, by Tim Finin, posted in Google, Mobile Computing, Semantic Web, Social media

Google’s Open Spot Android app lets people leaving parking spots share the information with others searching for parking nearby. Running the app shows you parking spots within 1.5 km. New parking spots are assumed to be gone after 20 minutes and are removed from the system.
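The two rules described here, a 1.5 km radius and a 20-minute lifetime, are simple enough to sketch. The code below is a guess at the logic, not Google’s implementation: the `Spot` class, the function names and the use of the haversine formula for distance are all assumptions made for illustration.

```python
from dataclasses import dataclass
from math import asin, cos, radians, sin, sqrt

EARTH_RADIUS_KM = 6371.0
SEARCH_RADIUS_KM = 1.5      # radius mentioned in the post
SPOT_LIFETIME_S = 20 * 60   # spots expire after 20 minutes


@dataclass(frozen=True)
class Spot:
    """A reported open parking spot (hypothetical record layout)."""
    lat: float
    lon: float
    reported_at: float  # seconds since the epoch


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two points, in kilometres."""
    phi1, phi2 = radians(lat1), radians(lat2)
    dphi = phi2 - phi1
    dlambda = radians(lon2 - lon1)
    a = sin(dphi / 2) ** 2 + cos(phi1) * cos(phi2) * sin(dlambda / 2) ** 2
    return 2 * EARTH_RADIUS_KM * asin(sqrt(a))


def visible_spots(spots, lat, lon, now):
    """Return the spots that are still fresh and within the search radius."""
    return [
        s for s in spots
        if now - s.reported_at <= SPOT_LIFETIME_S
        and haversine_km(lat, lon, s.lat, s.lon) <= SEARCH_RADIUS_KM
    ]
```

A real service would presumably use a spatial index rather than scanning every spot, but a linear filter is enough to show the idea.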

People who announce open spots gain karma points, while those who report false spots, known as griefers, are on notice:

“We’re watching for behavior that looks like a griefer spoofing parking spots. We have a couple of mechanisms available to make sure someone can’t leave a bunch of fake parking spots. If we see this happening we will take steps to fix it.”

This is a simple example of a context-aware mobile app. It could benefit further from knowing that you are driving your car rather than riding in it, and that you are likely looking for a parking spot rather than doing 70 mph on I-95 as it goes through Baltimore. Context could also tell the app that you are probably leaving a public parking spot, so that it could mark the spot automatically. However, such a feature should be smart enough to avoid getting you tagged by Google as a griefer and finding out what punishment Google has in store for you.

Google list of the 1000 most popular Web sites

May 28th, 2010, by Tim Finin, posted in Google, Semantic Web, Social media

Google publishes a list of the 1000 most popular Web sites based on unique visitors to the top-level domain. The list is compiled by their (DoubleClick) Ad Planner group and shows estimates for the monthly number of unique visitors and pageviews. Not surprisingly, Facebook tops the list with 540M visitors and 570B page views per month.

Each site is categorized (e.g., as social network, web portal, search engine, etc.), though some of these are surely wrong: #985, dropbox.com, for example, is listed as “Myth & Folklore”. They say that the list excludes “adult sites, ad networks, domains that don’t have publicly visible content or don’t load properly, and certain Google sites.”

If you want to play with the data, Karl Seguin has downloaded it, added some additional attributes, and made it available in JSON. That would make it easy to run your own analysis on it: category distribution, country distribution, average load time, etc.
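As a starting point, here is a minimal sketch of the category-distribution analysis. I have not checked the layout of the JSON file, so this assumes it is an array of objects with a `category` field. Both that field name and the `sites.json` filename in the usage note are hypothetical and will need adjusting to the real data.

```python
import json
from collections import Counter


def category_distribution(json_text, key="category"):
    """Count how many sites fall into each category.

    Assumes json_text holds a JSON array of objects, and that each
    object stores its category under `key` (a hypothetical field name).
    """
    sites = json.loads(json_text)
    return Counter(site.get(key, "unknown") for site in sites)
```

With the data saved locally, something like `category_distribution(open("sites.json").read()).most_common(10)` would list the ten largest categories.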

Google Crisis Response and Relief

May 25th, 2010, by Tim Finin, posted in Google, Social media

Google’s Crisis Response team has a landing page for the Gulf oil spill featuring overlays for Google Maps/Earth. This joins their pages for other recent natural disasters, such as the earthquakes in Haiti, Chile and China. Some support ‘crowdsourcing’ by allowing people to upload information, data and queries.


Google Crisis Response page for the 2010 Gulf oil spill




Here’s how the Google team describes their work and mission:

“Working with the input of subject matter experts and in conjunction with like-minded organizations and the development community at large, Google Crisis Response facilitates the development and refinement of crisis response technology—with the ultimate goal of helping victims help themselves and helping first responders/relief agencies/governments/citizens help victims.

When a major disaster strikes, the Google Crisis Response team collects fresh high-resolution imagery plus other event-specific data, then publishes this information on a dedicated landing page.

Google Crisis Response Mission

To develop, maintain, and optimize a worldwide, rapid-deployment protocol to speed the dissemination of situational information and increase the efficacy of rescue and humanitarian aid activities in response to quick-onset disasters.

Google Crisis Response will:

  • Coordinate with other platforms, organizations and teams
  • Build tools to surface near-real-time data
  • Support response/relief organizations
  • Respond in times of crisis”

There doesn’t seem to be a list of these pages online, but here are a few:

A review of the Google Go programming language

November 12th, 2009, by Tim Finin, posted in Google, Programming

Mark Chu-Carroll is a Google software engineer who’s written a long, detailed and informed review of Google’s new programming language Go. It’s worth a read if you are interested in understanding what it’s like as a programming language. Here are a few points that I took note of.

    “The guys who designed Go were very focused on keeping things as small and simple as possible. When you look at it in contrast to a language like C++, it’s absolutely striking. Go is very small, and very simple. There’s no cruft. No redundancy. Everything has been pared down. But for the most part, they give you what you need. If you want a C-like language with some basic object-oriented features and garbage collection, Go is about as simple as you could realistically hope to get.”

    “The most innovative thing about it is its type system. … It ends up giving you something with the flavor of Python-ish duck typing, but with full type-checking from the compiler.”

    “Go programs compile really astonishingly quickly. When I first tried it, I thought that I had made a mistake building the compiler. It was just too damned fast. I’d never seen anything quite like it.”

    “At the end of the day, what do I think? I like Go, but I don’t love it. If it had generics, it would definitely be my favorite of the C/C++/C#/Java family. It’s got a very elegant simplicity to it which I really like. The interface type system is wonderful. The overall structure of programs and modules is excellent. But it’s got some ugliness. … It’s not going to wipe C++ off the face of the earth. But I think it will establish itself as a solid alternative.”

Go sounds like a language that will help you grow as a computer scientist if you use it. That’s a good enough recommendation for me.
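The remark about “Python-ish duck typing, but with full type-checking” describes structural typing: a type satisfies an interface just by having the right methods, with no explicit declaration. The sketch below shows the same idea using Python’s `typing.Protocol`. It is Python, not Go, and the `Shape` and `Square` names are invented for the example, so take it as an analogy rather than a picture of Go’s interfaces.

```python
from typing import Protocol, runtime_checkable


@runtime_checkable
class Shape(Protocol):
    """Anything with an area() method counts as a Shape."""

    def area(self) -> float: ...


class Square:
    # Note: Square never mentions Shape. It conforms by structure alone.
    def __init__(self, side: float) -> None:
        self.side = side

    def area(self) -> float:
        return self.side ** 2


def total_area(shapes: list[Shape]) -> float:
    """Sum the areas of any objects that structurally match Shape."""
    return sum(shape.area() for shape in shapes)
```

A static checker such as mypy will accept `total_area([Square(2.0)])` and reject a list of objects that lack `area()`. The difference is that in Go the compiler does this check, while in Python it is an optional extra step.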

Google VP on semantic search and the Semantic Web

November 11th, 2009, by Tim Finin, posted in AI, Google, NLP, sEARCH, Semantic Web

PCWorld has a story, Google VP Mayer Describes the Perfect Search Engine, with some interesting comments on semantic search from Marissa Mayer, Google’s vice president of Search Products & User Experience.

“IDGNS: What’s the status of semantic search at Google? You have said in the past that through “brute force” — analyzing massive amounts of queries and Web content — Google’s engine can deliver results that make it seem as if it understood things semantically, when it really functions using other algorithmic approaches. Is that still the preferred approach?

Mayer: We believe in building intelligent systems that learn off of data in an automated way, [and then] tuning and refining them. When people talk about semantic search and the semantic Web, they usually mean something that is very manual, with maps of various associations between words and things like that. We think you can get to a much better level of understanding through pattern-matching data, building large-scale systems. That’s how the brain works. That’s why you have all these fuzzy connections, because the brain is constantly processing lots and lots of data all the time.

IDGNS: A couple of years ago or so, some experts were predicting that semantic technology would revolutionize search and blindside Google, but that hasn’t happened. It seems that semantic search efforts have hit a wall, especially because semantic engines are hard to scale.

Mayer: The problem is that language changes. Web pages change. How people express themselves changes. And all those things matter in terms of how well semantic search applies. That’s why it’s better to have an approach that’s based on machine learning and that changes, iterates and responds to the data. That’s a more robust approach. That’s not to say that semantic search has no part in search. It’s just that for us, we really prefer to focus on things that can scale. If we could come up with a semantic search solution that could scale, we would be very excited about that. For now, what we’re seeing is that a lot of our methods approximate the intelligence of semantic search but do it through other means.”

I interpret these comments to mean that Google’s management still views the concept of semantic search (and the Semantic Web) as involving better understanding of the intended meaning of text in documents and queries. The W3C’s web of data model is still not on their radar.

Dashboard shows data Google has about you

November 5th, 2009, by Tim Finin, posted in Google, Privacy, Semantic Web, Social media, Web

Google added a great new service, Dashboard, that summarizes data stored for a Google account — see MY ACCOUNT>PERSONAL SETTINGS>DASHBOARD.

“Designed to be simple and useful, the Dashboard summarizes data for each product that you use (when signed in to your account) and provides you direct links to control your personal settings. Today, the Dashboard covers more than 20 products and services, including Gmail, Calendar, Docs, Web History, Orkut, YouTube, Picasa, Talk, Reader, Alerts, Latitude and many more. The scale and level of detail of the Dashboard is unprecedented, and we’re delighted to be the first Internet company to offer this — and we hope it will become the standard.”

This is a good move on Google’s part. But while there’s a lot of information included, it’s not everything that Google knows about you. It omits, e.g., data in cookies, click-through data from search results and information from companies it has acquired, like DoubleClick. Still, it is a big step in a positive direction.

WebFinger: a finger protocol for the Web

August 15th, 2009, by Tim Finin, posted in Google, Semantic Web, Social, Social media, Web

Maybe WebFinger will succeed where others have failed. At what? At providing a simple handle for a person that can be easily used to get basic information that the person wants to make available. The WebFinger proposal is to use an email address as the handle.

WebFinger, aka Personal Web Discovery. i.e. We’re bringing back the finger protocol, but using HTTP this time.

TechCrunch has a post on this with some background: Google Points At WebFinger. Your Gmail Address Could Soon Be Your ID.

There’s some excitement around the web today among a certain group of high profile techies. What are they so excited about? Something called WebFinger, and the fact that Google is apparently getting serious about supporting it. So what is it?

It’s an extension of something called the “finger protocol” that was used in the earlier days of the web to identify people by their email addresses. As the web expanded, the finger protocol faded out, but the idea of needing a unified way to identify yourself has not. That’s why you keep hearing about OpenID and the like all the time.

The current focus of the WebFinger group is on developing the spec for accessing a user’s metadata given their handle. Using RDF and the FOAF vocabulary should be a no-brainer for representing the metadata.
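To make the “email address as handle” idea concrete, here is a small sketch of how a client might turn an address into a lookup URL. It follows the form later standardized in RFC 7033 (a `/.well-known/webfinger` endpoint with an `acct:` resource parameter). The details of the 2009 proposal may well have differed, and the function name is made up for this example.

```python
from urllib.parse import urlencode


def webfinger_url(email: str) -> str:
    """Build a WebFinger lookup URL for an email-style handle.

    Uses the RFC 7033 layout, which postdates this post: the handle's
    host serves /.well-known/webfinger and the handle is passed as an
    acct: URI in the `resource` query parameter.
    """
    user, sep, host = email.rpartition("@")
    if not sep or not user or not host:
        raise ValueError(f"not an email-style handle: {email!r}")
    query = urlencode({"resource": f"acct:{email}"})
    return f"https://{host}/.well-known/webfinger?{query}"
```

For `alice@example.com` this gives `https://example.com/.well-known/webfinger?resource=acct%3Aalice%40example.com`. Fetching that URL would return whatever metadata the person has chosen to publish about themselves.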

