Models? We don’t need no stinking models!

June 26th, 2008

Wired has an interesting article, “The End of Theory: The Data Deluge Makes the Scientific Method Obsolete,” that discusses the data-driven revolution that computers and the Web have unleashed. Science used to rely on developing models to explain and organize the world and make predictions. Now, the article argues, much of that can be done by correlating large amounts of data. The same shift applies to other disciplines (e.g., linguistics) and to businesses (think Google).

“All models are wrong, but some are useful.” So proclaimed statistician George Box 30 years ago, and he was right. But what choice did we have? Only models, from cosmological equations to theories of human behavior, seemed to be able to consistently, if imperfectly, explain the world around us. Until now. Today companies like Google, which have grown up in an era of massively abundant data, don’t have to settle for wrong models. Indeed, they don’t have to settle for models at all.

Sixty years ago, digital computers made information readable. Twenty years ago, the Internet made it reachable. Ten years ago, the first search engine crawlers made it a single database. Now Google and like-minded companies are sifting through the most measured age in history, treating this massive corpus as a laboratory of the human condition. They are the children of the Petabyte Age.

Update: And then there is this counterpoint: “Why the cloud cannot obscure the scientific method.”