October 7 is shaping up to be a big day for bookworms. On that day, a federal judge is expected to rule on a proposed agreement between Google and book publishers and authors that has been called the “most significant book industry development in the modern era” by Berkeley Professor Pamela Samuelson. If the settlement is approved, Google will gain the rights to scan and index tens of millions of books to which it currently lacks access, and upload them to its Google Books site. In exchange, Google will make a lump sum payment to authors and share its future book search revenues with them.
But not everyone is happy about the settlement. Several academics, trade groups, and activists have criticized the deal’s scope, and the Department of Justice is investigating the deal’s effects on competition. User privacy, however, has been perhaps the biggest sticking point of the agreement.
Activist groups including the Electronic Frontier Foundation and the ACLU argue that the settlement will give Google unprecedented access to individuals’ reading habits, endangering privacy and creating a tempting treasure trove of data for unscrupulous government officials and computer hackers. These groups have launched a campaign aimed at persuading Google to keep data collected about book search users to a bare minimum.
Sensible as this proposal may seem, it ignores the critical role that data collection plays in today’s information economy—including cutting-edge industries like online advertising and website design.
As any competent Web developer will tell you, detailed user-level analytics are indispensable in figuring out what works and what doesn’t for a particular site. Just as Google uses search query data to fine-tune its search engine, user data will likely be important to Google’s book search service.
Even if the settlement goes through, Google Books won’t be the only game in town. Bookworms worried about privacy will still be able to visit libraries, bookstores, and online retailers. And, importantly, the proposed settlement is non-exclusive. While Google may end up being the only outlet for some hard-to-find books, other companies will still be able to negotiate similar deals with authors and publishers. Few firms are likely to jump at the prospect of taking on the Google juggernaut, but the threat of market entry from the likes of Microsoft and Amazon will help to discipline Google.
For all the hullabaloo over Google Book Search, let’s not forget that Google already logs users’ search queries, which often are even more sensitive than book histories. Knowing exactly what you’re looking for is arguably more intimate than knowing which books you’re reading.
Despite logging users’ search histories, Google has a solid user privacy track record. This week’s big cybercrime story involved criminals stealing over 130 million credit card numbers. But when was the last time you heard of a hacker breaking into Google’s servers and making off with personally identifiable information?
Like any Web firm that depends on having a positive image among users, Google has a huge incentive to be straightforward about how it stores and uses personal data. Google realizes what is at stake if it fails to protect its users’ privacy. Consequently, Google has implemented robust security measures designed to prevent any privacy breach.
What about authors? Under the Google book deal, authors are set to earn a sizable chunk of the revenues—63 percent—that Google will earn from showing ads next to books. Limiting data collection means fewer relevant ads, which in turn means that authors make less money.
Unfortunately, some politicians have glommed on to this vilification of data collection. Recently, both regulators and members of Congress have floated a number of ill-conceived proposals to limit what kinds of information Web firms can collect. That is the wrong way to go. Data collection deserves a break. As Google itself has illustrated, data mining and robust privacy safeguards can not only coexist, but thrive.