06-01-2012, 10:01 AM   #2
Ninjalawyer
Guru
Posts: 826
Karma: 18573626
Join Date: Jun 2011
Location: Canada
Device: Kobo Touch, Nexus 7 (2013)
One of the commenters on the article had an interesting anecdote:

Quote:
My first job out of college was working IT at Questia, which at the time was in start-up mode. The company was building a digital research library with a launch goal of having 50-60k digitized books and another 100-200k digitized magazines and scholarly journals. The books would be scanned and OCR'd and XML tagged, with the pagination and images preserved, and would be full-text searchable.

The thing is, Questia had about 300 people JUST doing copyright research. There were a large number of public domain books that they included, but they employed a set of professional librarians to do the book curation, and the vast majority of the books to be included were under copyright. Those 300 copyright researchers worked 10-12 hours a day tracking down who held the copyright for each individual book and then attempting to negotiate with the copyright holder. I believe when we actually hit launch, there were about 30k books digitized and ready to go.

Let me say that again: 300 people working 10+ hours a day, for almost two years, managed to only secure the rights to ~30,000 books.

When Google announced their book scanning project, the first thing I thought was that they were entering into a world of pain with the copyright negotiations--every publisher wants its own set of terms and few want the same things. I remembered those poor researchers at Questia, and wondered how Google was going to do it all.

Turns out Google went with the path of least resistance: "Fuck it, we're Google, just start scanning." Blows my mind.