Archive for the 'Google' Category
April 21st, 2008, by Tim Finin, posted in Google, Social media
Is that a catchy title or what? No, and the story doesn’t involve Philip Marlowe or Sam Spade. See The $25,000,000,000 Eigenvector: The Linear Algebra Behind Google by Kurt Bryan and Tanya Leise. Here’s the abstract.
“Google’s success derives in large part from its PageRank algorithm, which ranks the importance of webpages according to an eigenvector of a weighted link matrix. Analysis of the PageRank formula provides a wonderful applied topic for a linear algebra course. Instructors may assign this article as a project to more advanced students, or spend one or two lectures presenting the material with assigned homework from the exercises. This material also complements the discussion of Markov chains in matrix algebra. Maple and Mathematica files supporting this material can be found at http://www.rose-hulman.edu/~bryan/google.html.”
These techniques, and the mathematics behind them, are important in modeling many kinds of social phenomena.
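For readers who want to poke at the idea, here is a minimal sketch of the eigenvector computation by power iteration. It is not code from the paper: the four-page web, the page names, and the damping value of 0.85 are assumptions chosen for illustration.

```python
def pagerank(links, damping=0.85, tol=1e-10, max_iter=1000):
    """Approximate PageRank scores by power iteration.

    links maps each page name to the list of pages it links to.
    damping is the probability of following a link rather than jumping
    to a random page (0.85 is an assumed, commonly quoted value).
    """
    pages = sorted(links)
    n = len(pages)
    rank = {page: 1.0 / n for page in pages}  # start from a uniform vector

    for _ in range(max_iter):
        # Every page gets a baseline share from random jumps.
        new_rank = {page: (1.0 - damping) / n for page in pages}
        for page, outlinks in links.items():
            if outlinks:
                # A page splits its score evenly among the pages it links to.
                share = damping * rank[page] / len(outlinks)
                for target in outlinks:
                    new_rank[target] += share
            else:
                # A page with no outlinks spreads its score over all pages.
                share = damping * rank[page] / n
                for target in pages:
                    new_rank[target] += share
        change = sum(abs(new_rank[p] - rank[p]) for p in pages)
        rank = new_rank
        if change < tol:
            break
    return rank


# Hypothetical four-page web used only for illustration.
EXAMPLE_WEB = {
    "A": ["B", "C", "D"],
    "B": ["C", "D"],
    "C": ["A"],
    "D": ["A", "C"],
}

if __name__ == "__main__":
    for page, score in sorted(pagerank(EXAMPLE_WEB).items()):
        print(page, round(score, 4))
```

With damping set to 1.0 the result is just the dominant eigenvector of the column-stochastic link matrix, scaled to sum to one, which is the connection to linear algebra that the abstract describes.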
February 24th, 2008, by Tim Finin, posted in Google, Social media, Blogging, Web
Last week I noticed that some of our blog posts took a long time to show up in the Google Blog Search index. During the past year, Google has been very fast at indexing blog posts, typically taking less than five minutes from the time a post is made to when it shows up in their blog search index. But this week it seemed that our posts, or at least some of them, took more than twelve hours to be indexed.
Yesterday I tried to track a post I made on the IT job market, which I wrote just before 11:00am (GMT-5). It showed up in Google Feed Reader quickly enough but had not yet appeared in Google Blog Search when I finally went to bed 14 hours later. When I checked at 9:00am today, it was there, so indexing took somewhere between 14 and 22 hours.
It’s not the case that all posts are being delayed: do a Google Blog search for a popular term (e.g., TV) sorted by date and you’ll see posts made in the past few minutes. Nor do I think it’s related to PageRank, since their blog search ingest is based on pings rather than crawling. Besides, our blog enjoys a reasonable rank. Finally, it can’t be the case that Google’s systems are being overwhelmed by new blogs, since the growth of the Blogosphere has slowed.
So I’m puzzled about what is going on. (goomtitag)
Update 1: Posted at 9:49, in Google Feed Reader at 10:14, indexed by Google Blog Search by ~19:15 and in Google’s main index about the same time. Maybe this is a clue — it used to be the case that a post hit the blog index within a few minutes and showed up in the main index after about twelve hours. This post hit both indexes around the same time — after about ten hours. Maybe there is now just one (logical) index.
Update 2: Hmmm. Another post seems to have made it into Google’s main index before it got into the blog search index. I imagine that Google revisited our blog home page as part of its regular crawl and picked up the new post.
January 9th, 2008, by Tim Finin, posted in Google, Multicore Computation Center, Social media, Semantic Web
The latest CACM has an article by Google Fellows Jeffrey Dean and Sanjay Ghemawat with interesting details on Google’s text processing engines. Niall Kennedy summarized it this way in his blog post, Google processes over 20 petabytes of data per day.
“Google currently processes over 20 petabytes of data per day through an average of 100,000 MapReduce jobs spread across its massive computing clusters. The average MapReduce job ran across approximately 400 machines in September 2007, crunching approximately 11,000 machine years in a single month.”
If big numbers numb your mind, 20 petabytes is 20,000,000,000,000,000 bytes (or 22,517,998,136,852,480 for the obsessive-compulsives among us), enough data to fill up over five million 4GB iPods a day, which, if laid end to end would …
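The arithmetic is easy to check. The sketch below reproduces both byte counts and the iPod figure, and adds one derived number (average data per job) computed from the figures in the quote. Treating a 4GB iPod as 4 × 10^9 bytes is an assumption.

```python
PETABYTE_DECIMAL = 10**15   # SI petabyte
PETABYTE_BINARY = 2**50     # binary petabyte (pebibyte)
GIGABYTE_DECIMAL = 10**9    # assumed capacity unit for a "4GB" iPod

daily_bytes_decimal = 20 * PETABYTE_DECIMAL
daily_bytes_binary = 20 * PETABYTE_BINARY

# Whole 4GB iPods filled per day, using the decimal byte count.
ipods_per_day = daily_bytes_decimal // (4 * GIGABYTE_DECIMAL)

# Average data per job, using the 100,000 jobs per day from the quote.
jobs_per_day = 100_000
gigabytes_per_job = daily_bytes_decimal / jobs_per_day / GIGABYTE_DECIMAL

if __name__ == "__main__":
    print(f"{daily_bytes_decimal:,} bytes (decimal)")
    print(f"{daily_bytes_binary:,} bytes (binary)")
    print(f"{ipods_per_day:,} iPods per day")
    print(f"{gigabytes_per_job:.0f} GB per job on average")
```

The decimal count gives exactly five million iPods. The "over five million" in the text holds if you start from the binary byte count instead.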
Kevin Burton has a copy of the paper on his blog.
Jeffrey Dean and Sanjay Ghemawat, MapReduce: simplified data processing on large clusters, Communications of the ACM, pp 107-113, 51:1, January 2008.
MapReduce is a programming model and an associated implementation for processing and generating large datasets that is amenable to a broad variety of real-world tasks. Users specify the computation in terms of a map and a reduce function, and the underlying runtime system automatically parallelizes the computation across large-scale clusters of machines, handles machine failures, and schedules inter-machine communication to make efficient use of the network and disks. Programmers find the system easy to use: more than ten thousand distinct MapReduce programs have been implemented internally at Google over the past four years, and an average of one hundred thousand MapReduce jobs are executed on Google’s clusters every day, processing a total of more than twenty petabytes of data per day.
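To make the map and reduce idea concrete, here is a toy, single-process sketch of the classic word count. It is not Google's API or implementation: the function names (map_words, reduce_counts, run_mapreduce) are hypothetical, and the parallelization, fault tolerance, and scheduling that the real runtime provides are left out entirely.

```python
from collections import defaultdict


def map_words(doc_id, text):
    """Map step: emit a (word, 1) pair for every word in one document."""
    for word in text.lower().split():
        yield word, 1


def reduce_counts(word, counts):
    """Reduce step: add up all the counts emitted for one word."""
    return word, sum(counts)


def run_mapreduce(inputs, mapper, reducer):
    """Run mapper over every input pair, group by key, then reduce each group.

    A real runtime would spread these phases across many machines;
    this sketch runs them sequentially in one process.
    """
    groups = defaultdict(list)
    for key, value in inputs:
        for out_key, out_value in mapper(key, value):
            groups[out_key].append(out_value)  # the "shuffle" phase
    return dict(reducer(k, vs) for k, vs in sorted(groups.items()))


if __name__ == "__main__":
    documents = [("doc1", "the cat sat"), ("doc2", "the dog sat down")]
    print(run_mapreduce(documents, map_words, reduce_counts))
```

The appeal of the model is that the user writes only the two small functions, and everything inside run_mapreduce is the system's job.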
Dean and Ghemawat conclude their paper by summarizing the key reasons why MapReduce has worked so well for Google.
“First, the model is easy to use, even for programmers without experience with parallel and distributed systems, since it hides the details of parallelization, fault tolerance, locality optimization, and load balancing. Second, a large variety of problems are easily expressible as MapReduce computations. For example, MapReduce is used for the generation of data for Google’s production Web search service, for sorting, data mining, machine learning, and many other systems. Third, we have developed an implementation of MapReduce that scales to large clusters of machines comprising thousands of machines. The implementation makes efficient use of these machine resources and therefore is suitable for use on many of the large computational problems encountered at Google.”
December 18th, 2007, by Anupam Joshi, posted in Google, Ebiquity, Wearable Computing, Pervasive Computing, Mobile Computing
I recently bought a GPS (Garmin Mobile 10) that works with my WM5 Smartphone. In the process of trying to install the Garmin Mobile XT application (which was very problematic and a huge pain, but I digress ….), I ended up uninstalling Google Maps.
When I went to download and reinstall it, though, I noticed that they have a new beta feature (My Location) that shows you where you are. It can use either a GPS or cell tower information. Basically, it sees which cell tower your phone is registered with (and what signals it is seeing from others), and uses this to estimate where you are to within about 1000 meters.
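I don't know how Google actually computes the estimate. One simple way to turn tower observations into a position is a signal-weighted centroid, sketched below. Everything here is an assumption for illustration: the function name, the weights, and the tower coordinates are hypothetical, and averaging latitude and longitude directly is only reasonable for towers a few kilometers apart.

```python
def estimate_position(towers):
    """Estimate a position as the weighted centroid of visible cell towers.

    towers is a list of (latitude, longitude, weight) tuples, where weight
    is any positive number that grows with signal strength. This is an
    illustrative technique, not a description of Google's method.
    """
    total_weight = sum(weight for _, _, weight in towers)
    if total_weight <= 0:
        raise ValueError("need at least one tower with positive weight")
    lat = sum(lat * weight for lat, _, weight in towers) / total_weight
    lon = sum(lon * weight for _, lon, weight in towers) / total_weight
    return lat, lon


if __name__ == "__main__":
    # Hypothetical towers: the serving tower gets the largest weight.
    visible = [
        (39.255, -76.711, 3.0),
        (39.262, -76.700, 1.0),
        (39.249, -76.720, 1.0),
    ]
    print(estimate_position(visible))
```

With a single visible tower the estimate is just that tower's location, which fits the coarse accuracy described above. Either way, the approach needs a database of tower locations.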
This is interesting, because we did it the same way back when there used to be AMPS / CDPD and Palm IIIs and Vs with cellular modems. Our project was called Agents2Go, and we published a paper about this in the MCommerce workshop of Mobicom in 2001. I remember that Muthu et al. from AT&T had a similar paper in MobiDE that year as well.
The problem at that time was that there was no publicly accessible database of all cell tower locations. Also, we heard informally from at least one telco that while doing this for research was OK, if anyone ever tried to make money from it they would want to be a part of the loop. I guess Google has found a way to work with the various telcos? Or maybe in the interim cell tower IDs and locations have been made public knowledge?
Of course Google Maps also works with GPS, except that it refuses to work with my Garmin. I’ve tried all the tricks that a search on Google will reveal (mainly, setting the serial port used by Bluetooth to talk to the GPS), but to no avail.
December 18th, 2007, by Tim Finin, posted in Google, Social media
We use Google Reader clips as a simple way to share links on a number of our web sites. As I browse feeds and see a post that’s relevant to one of our blogs, UMBC GAIM for example, I can tag it with for-gaim and the link will show up in a sidebar on the GAIM site.
Today I noticed that none of this is working. Checking the JavaScript console, I see that the browser is complaining that GRC_p is not defined, so it seems like an error in Google’s JavaScript. I’ve not seen anything on the web about this (yet) except for some old posts from the summer. Does anyone know what’s going on?
October 18th, 2007, by Tim Finin, posted in Google, sEARCH, Web
TechCrunch writes in Cyberwar: China Declares War On Western Search Sites that someone in China is redirecting search engine access to Baidu, China’s top search engine.
“Further to our earlier story on visitors to Google Blogsearch being redirected to Baidu in China, new reports have surfaced that would indicate that China has unilaterally blocked all three major search engines in China and is redirecting all requests to Baidu. Digital Marketing Blog posts that all requests to Yahoo.com and sub-sites are being redirected to Baidu. Google Blogscoped forums indicate that Live.com is also being re-directed to Baidu, as well as confirming the Yahoo story and our earlier Google post. The re-direct would also appear to apply to YouTube.com.”
Can any ebiquity readers in China confirm this? If so, please leave a comment.
December 8th, 2005, by li ding, posted in Google, Social media, Web 2.0, Blogging, Web
The popular XKCD had another Web-related comic yesterday, but it turned out to be self-negating.

As was noted on Slashdot:
“As I noted yesterday (and was joined by many others)… in an offhand observation xkcd has singlehandedly changed a small section of the Internet. Changing the results from a Google search for “Died in a Blogging Accident” from 2 to (at this writing) over 7,170 in a little more than 24 hours.”
The number of results is now up to 13.3K. I guess something like the Heisenberg uncertainty principle applies to the Internet, too.