ALA TechSource



UC Libraries Join the Google Books Library Project

Submitted by Tom Peters on August 9, 2006 - 5:36pm

Soon after Google announced in late 2004 the collaborative project to scan millions of books (currently called the "Google Books Library Project" and involving the five research libraries of Stanford, Michigan, Harvard, Oxford, and the New York Public Library), the five libraries became known as the "G5 Group."

When the G5 Group was mentioned in conversations among librarians, often there was at least a note of despair or hostility toward the G5 Group for collaborating with the "enemy," or for, perhaps, significantly altering the future course of research and research librarianship without consulting with representative samples of these communities.

Longing to be a member of the G5 Group was not one of the dominant emotions I detected in the conversations I heard about their involvement in Google's massive digitization project. But evidently there is some longing out there.

First, the Library of Congress began to dabble with Google in this project, which is not surprising. Then the University of California Libraries System (which collectively, across its 100 libraries, claims to have the largest research collection in the world) began serious discussions with Google, as evidenced by an article that appeared in the Daily Californian in July. Reports in other publications, such as the Los Angeles Times and The Chronicle of Higher Education, followed, culminating in the official August 9 press release announcing that an agreement had been struck.

The press release emphasizes:

  • the public trust upon which this massive collection has been built;
  • the need for quick and easy means of discoverability;
  • the possibilities for new and accelerated forms of scholarly inquiry; and
  • the archival need for massive digital back-up copies of printed books in collections on or near areas of frequent seismic activity.

The UC Libraries System is already a member of the Open Content Alliance, which has also undertaken a very large-scale book-digitization project. The OCA, however, has intentionally limited its scanning efforts to books clearly in the public domain, which are estimated to make up only twenty percent of the books held by the various libraries across the UC System. By becoming a major player in the Google Project, the UC System could potentially have thirty-four million books scanned, with one master copy going to Google and another staying with the UC System.

The Chronicle article quotes Adam Smith from Google stating that they are in active conversation with other research libraries with interesting special collections. Smith hinted that some of these libraries are located outside the United States, with Oxford University as the precedent. Perhaps before long the majority of the member libraries of the Association of Research Libraries will be Google partners.

There has been much speculation about how Google plans to generate revenue from this massive digitization project. For the libraries involved, avoiding the cost of a massive digitization project seems to be a major motivation. In the Daily Californian article, Dan Greenstein is paraphrased, though not directly quoted, as saying that with Google's top-secret, proprietary scanning process, the cost to scan a book will be only $1 or $2, compared to approximately $30 via the process being used by the Open Content Alliance.

Other monetary advantages may redound to the Google libraries as well. For example, Google recently announced plans to build an advertising office that will employ approximately 1,000 people in Ann Arbor, which coincidentally is the home of the University of Michigan, one of the original G5 Group members. All politics, as they say, are local.

Comments (9)

Ever get those *gut* feelings that persist? When I first heard of it, I thought it was 'Gobble' : ) Then as time has passed, this org has turned into a monster that seems to be 'gobbling' up all the info it can get...why? For what? And that little *gut* feeling still won't go away.

True it's not convenient to read books 'on monitor,' but if there's only a few copies in the world, it's a lot more convenient and cheaper to read at home than to travel hundreds or thousands of miles!

It's not convenient to read books on monitor.

They surprised me by simply scanning all those books. But I don't see how they managed to save half of the expenses.

So Google is heavily underwriting this, which means it's very valuable to them. I hope we bargained well.

Jennifer Colvin from the California Digital Library, part of the Office of the President of the University of California System, phoned me back earlier this afternoon. She said the reporter from the Daily Californian may have misunderstood some cost figures that Dan Greenstein presented to the UC Board of Regents (BOR) during their July meeting. Dan was trying to illustrate to the BOR how UC's costs for participating in the Google book scanning project would compare to its costs for going it alone. According to Jennifer, Dan estimated that participating in the Google project would cost UC $1 to $2 per scanned book for the first five years, followed by an estimated 10 cents per book per year thereafter. I haven't seen the actual signed agreement between UC and Google, but evidently UC would need to contribute some not-insignificant server capacity to the project (if for no other reason than to store and serve their copies of the forthcoming scanned books), plus perhaps some labor and other costs. For comparison purposes, Jennifer said that Dan reported to the BOR a cost of $30 to $40 per scanned book if the UC libraries undertook and financed this project on their own.

That's unbelievable. How could they shave $27 to $29 off the price of scanning each book? Does the OCA use overseas outsourcing or something? They just keep finding ways to amaze me. It is Larry Page that went to U of M. He is the brains behind the infamous "Page Rank" Google hangs their hat on.

This secret, proprietary scanning process is one of the more intriguing aspects of this entire project. I just phoned Dan Greenstein's office to learn if the price differential between OCA scanning ($30 per volume) and Google scanning ($1 to $2) is indeed what Dan said, or if the Daily Californian reporter misunderstood what he was saying. After being shunted around a bit, I left a voice message with someone in his office. If I receive a clarification, I will post another comment. If the cost differential is even close to being accurate, how does Google do it? Have they developed a technology that scans the entire book without having to manually flip the pages?

I think he was referring to Google shaving $27-$29 off the price of a scan. My guess is they will use unemployed librarians. I know we're supposed to be excited about this brave new world, but I can't help feeling cautious about the monopoly commercial digital library hegemony we seem all too happy to help create. I mean, translate to any other format: "Hey, kids, let's help RCA make money!"