August 24th, 2010, by Tim Finin, posted in Google, sEARCH, Semantic Web, Social media
Microsoft’s Bing team announced on their blog that the Bing search engine is “powering Yahoo!’s search results” in the US and Canada for English queries. Yahoo also has a post on their Yahoo! Search Blog.
The San Jose Mercury News reports:
“Tuesday, nearly 13 months after Yahoo and Microsoft announced plans to collaborate on Internet search in hopes of challenging Google’s market dominance, the two companies announced that the results of all Yahoo English language searches made in the United States and Canada are coming from Microsoft’s Bing search engine. The two companies are still racing to complete the transition of paid search, the text advertising links that run beside and above the standard search results, before the make-or-break holiday period — a much more difficult task.”
Combining the traffic from Microsoft and Yahoo will give Bing a more significant share of the Web search market. That should help both companies by providing them with a larger stream of search-related data that can be exploited to improve search relevance, ad placement and trend spotting. It will also help to foster competition with Google focused on developing better search technology.
Hopefully, Bing will be able to benefit from the good work done at Yahoo! on adding more semantics to Web search.
June 7th, 2009, by Tim Finin, posted in Google, sEARCH, Web
Who’s got the best basic web search engine? One way to approach that question is to conduct an experiment in which subjects rank the results returned by several engines without knowing which is which.
BlindSearch is a simple and neat site that collects ‘objective’ opinions on search quality by showing query results from Google, Yahoo and Bing side by side without identifying which is which and inviting you to select the best.
“Type in a search query above, hit search then vote for the column which you believe best matches your query. The columns are randomised with every query.
The goal of this site is simple, we want to see what happens when you remove the branding from search engines. How differently will you perceive the results?”
As of this writing there have been 1679 votes for preferred results, with Google getting 39%, Bing 39% and Yahoo 22%.
update 2:14pm edt 6/7: Google: 45%, Bing: 32%, Yahoo: 22% | 11,130 votes
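The mechanics of a blind comparison like this are simple enough to sketch. The following is a minimal illustration, not BlindSearch's actual code: the function names are hypothetical, and it assumes the result lists for each engine have already been fetched. It shuffles the columns for each query, records a vote against the hidden engine order, and reports each engine's share of the votes.

```python
import random
from collections import Counter


def blind_columns(results_by_engine, rng=random):
    """Return (columns, key): unlabeled result lists in random order,
    plus the hidden engine order needed to decode a vote later."""
    key = list(results_by_engine)
    rng.shuffle(key)  # new column order for every query
    columns = [results_by_engine[engine] for engine in key]
    return columns, key


def record_vote(tally, key, chosen_column):
    """Credit the engine that was hiding behind the chosen column."""
    tally[key[chosen_column]] += 1


def vote_shares(tally):
    """Convert raw vote counts into whole-number percentages."""
    total = sum(tally.values())
    return {engine: round(100 * votes / total) for engine, votes in tally.items()}


if __name__ == "__main__":
    # Placeholder result lists; a real site would query each engine.
    results = {"Google": ["g1", "g2"], "Yahoo": ["y1", "y2"], "Bing": ["b1", "b2"]}
    tally = Counter()
    columns, key = blind_columns(results)
    record_vote(tally, key, chosen_column=0)  # the user picked the first column
    print(vote_shares(tally))
```

Because each share is rounded independently, the percentages need not sum to exactly 100.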
June 1st, 2009, by Tim Finin, posted in Google, sEARCH, Security, Semantic Web, Social media
Microsoft’s new Bing search engine is getting a lot of interest. Glenn McDonald posts about a nice side-by-side Bing vs Google comparator that he developed. It makes it easy to compare how the two services do on a range of different types of searches. Here are the ones that Glenn said he found useful in developing his initial opinion.
I sense from some of these queries that he is probing the systems where an advanced search engine can exploit a little bit of semantic knowledge. For example, recognizing that a user’s query “boston to asheville” matches a common pattern “X to Y”, where X and Y are places, and that she is probably interested in information about how to travel from the first location to the second. It seems like Google has been working on adding more such patterns, at least for the low-hanging fruit.
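To make the idea concrete, here is a minimal sketch of that kind of pattern, under loud assumptions: the tiny place list and the function name are hypothetical, and I have no idea how Google or Bing actually implement this. It treats a query as a travel request only when both sides of “to” are known places.

```python
import re

# Hypothetical toy gazetteer; a real engine would use a large place database.
KNOWN_PLACES = {"boston", "asheville", "baltimore"}

# "<origin> to <destination>", ignoring case and surrounding whitespace.
ROUTE_PATTERN = re.compile(
    r"^\s*(?P<origin>.+?)\s+to\s+(?P<destination>.+?)\s*$", re.IGNORECASE
)


def parse_travel_query(query):
    """Return (origin, destination) if the query looks like a travel
    request between two known places, otherwise None."""
    match = ROUTE_PATTERN.match(query)
    if not match:
        return None
    origin = match.group("origin").lower()
    destination = match.group("destination").lower()
    # The semantic check: both slots must be filled by places.
    if origin in KNOWN_PLACES and destination in KNOWN_PLACES:
        return origin, destination
    return None


if __name__ == "__main__":
    print(parse_travel_query("boston to asheville"))
    print(parse_travel_query("how to cook rice"))
```

The place check is what separates this from plain string matching: “how to cook rice” fits the surface pattern but is rejected because neither side names a location.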
Of course, if everyone hits on this site it may get throttled or blocked by either or both of the search engines. @Glenn, would you be willing to share your code?
(spotted on hacker news)
August 3rd, 2008, by Tim Finin, posted in cloud computing, Multicore Computation Center, Semantic Web, Social media
Cloud computing is a hot topic this year, with IBM, Microsoft, Google, Yahoo, Intel, HP and Amazon all offering, using or developing high-end computing services typically described as “cloud computing”. We’ve started using it in our lab, like many research groups, via the Hadoop software framework and Amazon’s Elastic Compute Cloud services.
Bill Poser notes in a post (Trademark Insanity) on Language Log that Dell has applied for a trademark on the term “cloud computing”.
It’s bad enough that we have to deal with struggles over the use of trademarks that have become generic terms, like “Xerox” and “Coke”, and trademarks that were already generic terms among specialists, such as “Windows”, but a new low in trademarking has been reached by the joint efforts of Dell and the US Patent and Trademark Office. Cyndy Aleo-Carreira reports that Dell has applied for a trademark on the term “cloud computing”. The opposition period has already passed and a notice of allowance has been issued. That means that it is very likely that the application will soon receive final approval.
It’s clear, at least to me, that ‘cloud computing’ has become a generic term in general use for “data centers and mega-scale computing environments” that make it easy to dynamically focus a large number of computers on a computing task. It would be a shame to have one company claim it as a trademark. On Wikipedia a redirect for the Cloud Computing page was created several weeks before Dell’s USPTO application. A Google search produces many uses of cloud computing in news articles before 2007, although it’s clear that its use didn’t take off until mid-2007.
An examination of a Google Trends map shows that searches for ‘cloud computing’ (blue) began in September 2007 and have increased steadily, eclipsing searches for related terms like Hadoop, ‘map reduce’ and EC2 over the past ten months.
Here’s a document giving the current status of Dell’s trademark application (USPTO #77139082), which was submitted on March 23, 2007. According to the Wikipedia article on cloud computing, Dell
“… must file a ‘Statement of Use’ or ‘Extension Request’ within 6 months (by January 8, 2009) in order to proceed to registration, and thereafter must enforce the trademark to prevent removal for ‘non-use’. This may be used to prevent other vendors (eg Google, HP, IBM, Intel, Yahoo) from offering certain products and services relating to data centers and mega-scale computing environments under the cloud computing moniker.”
July 28th, 2008, by Anupam Joshi, posted in cloud computing, GENERAL, Policy, Privacy, Security
There is an interesting panel to open the Microsoft faculty research summit featuring Rick Rashid, Daniel Reed, Ed Felten, Howard Schmidt, and Elizabeth Lawley. Lots of interesting ideas, but one that was floated was the recent idea that maybe the world does only need five (cloud) computers. If something like this really does happen, then perhaps we’ll need to think even more aggressively about the information sharing issues. Is there some way for me to make sure that I only share with (say) Google’s cloud the things that are absolutely needed? Once I have given some information to Google, can I still retain some control over it? Who owns this information now? If I do, how do I know that Google will honor whatever commitments it makes about how it will use or further share that information? We’ll be exploring some of these questions in our “Assured Information Sharing” research. Some of the auditing work that MIT’s DIG group has done also ties in.
June 26th, 2008, by Tim Finin, posted in AI, NLP, Semantic Web, Web 2.0
Venture Beat reports that Microsoft will acquire Powerset for a price “rumored to be slightly more than $100 million”. Powerset has been developing a Web search system that uses natural language processing technology acquired from PARC to more fully understand users’ queries and the text of the documents it indexes.
“By buying Powerset, Microsoft is hoping to close the perceived quality gap with Google’s search engine. The move comes as Microsoft CEO Steve Ballmer continues to argue that improving search is Microsoft’s most important task. Microsoft’s market share in search has steadily declined, dropping further and further behind first-place Google and second place Yahoo.
Google has generally dismissed Powerset’s semantic, or “natural language” approach as being only marginally interesting, even though Google has hired some semantic specialists to work on that approach in limited fashion. Google’s search results are still based primarily on the individual words you type into its search bar, and its approach does very little to understand the possible meaning created by joining two or more words together.”
If you put the query “Where is Mount Kilimanjaro” into the beta version of Powerset, it answers “Mount Kilimanjaro: Contained by Tanzania” in addition to showing web pages extracted from Wikipedia. That’s a pretty good answer.
Its response to “what is the Serengeti” is a little less precise. It reports seven things it knows about Serengeti: that it replaced “desert, Platinum, twilight and Caribbean Blue”, that it hosted ‘migration’, that it provided ‘draw’, that it gained ‘fame’, that it recorded ‘explorations’, that it rutted ‘season’ and that it boasted ‘Blue Wildebeests’. I’m just glad I don’t have a school report on the Serengeti due tomorrow!
Asking “Who is the president of Zimbabwe” results only in the fallback answer — which appears to be just the set of Wikipedia pages that the query words produce in an IR query. Compare this with the results of the Google query who is the president of zimbabwe site:wikipedia.org.
By the way, the AskWiki system often does a better job on these kinds of questions. Asking “where is the Serengeti” produces the answer “The Serengeti ecosystem is located in north-western Tanzania and extends to south-western Kenya between latitudes 1 and 3 S and longitudes 34 and 36 E. It spans some 30,000 km.” It’s a bit of a hack, though. It seems to work by selecting the sentence or two in Wikipedia that best serves as an answer. See our post on AskWiki from last fall for more examples.
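I don’t know how AskWiki actually picks its sentences, but the general sentence-selection idea can be sketched in a few lines. This is only an illustration under stated assumptions: the stopword list, the function names and the scoring rule (count the content words a sentence shares with the question) are all mine, not AskWiki’s.

```python
import re

# Hypothetical, deliberately tiny stopword list.
STOPWORDS = {"a", "an", "the", "is", "of", "in", "to", "and", "what", "where", "who"}


def content_words(text):
    """Lowercased words of the text with stopwords removed."""
    return {w for w in re.findall(r"[a-z0-9]+", text.lower()) if w not in STOPWORDS}


def best_sentence(question, passage):
    """Return the passage sentence sharing the most content words with
    the question, or None if the passage has no sentences."""
    # Naive split on sentence-ending punctuation followed by whitespace.
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", passage) if s.strip()]
    if not sentences:
        return None
    query = content_words(question)
    # On ties, max() keeps the earliest sentence.
    return max(sentences, key=lambda s: len(query & content_words(s)))


if __name__ == "__main__":
    passage = (
        "Tanzania is a country in East Africa. "
        "The Serengeti ecosystem is located in north-western Tanzania. "
        "Wildebeest migrate every year."
    )
    print(best_sentence("where is the Serengeti", passage))
```

A scheme like this has no notion of question type, so it cannot tell that a “where” question wants a location. That is one reason such an approach is a hack rather than language understanding.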
Still, Powerset is an ambitious system that shows promise. What they are trying to do is important and will eventually be done. They have shown real progress in the past two years, more than I had expected. I hope Microsoft can accelerate the development and find practical ways to improve Web search even if the ultimate goal of full language understanding is many years away.