Issues with the world’s largest digital library

While Google seems cleared to become an important scholarly destination due to its efforts to create the world’s largest digital library, Geoffrey Nunberg argues the system has some critical problems:

But to pose those [scholarly] questions, you need reliable metadata about dates and categories, which is why it’s so disappointing that the book search’s metadata are a train wreck: a mishmash wrapped in a muddle wrapped in a mess…

But I have the sense that a lot of the initial problems are due to Google’s slightly clueless fumbling as it tried to master a domain that turned out to be a lot more complex than the company first realized. It’s clear that Google designed the system without giving much thought to the need for reliable metadata. In fact, Google’s great achievement as a Web search engine was to demonstrate how easy it could be to locate useful information without attending to metadata or resorting to Yahoo-like schemes of classification. But books aren’t simply vehicles for communicating information, and managing a vast library collection requires different skills, approaches, and data than those that enabled Google to dominate Web searching.

I’m sure Google is interested in correcting some of these issues; even its famous search algorithm is under constant scrutiny as the company looks for better ways to present information.

Even as these problems are ironed out, it does seem that this kind of digital library could transform scholarly research. Just as I can’t imagine a world where sociology articles are not online (and I can access many of them), years from now we may look back and wonder how people operated without a vast online library of digital books.