MobileRead Forums
News and Commentary Latest on e-books, e-paper, DRM and related technologies

03-22-2007, 09:31 PM   #1
Alexander Turcic
Fully Converged
Posts: 12,730
Karma: 71589
Join Date: Oct 2002
Location: Switzerland
Device: Sony Portable Reader
Google scanning 27'000 books per day - at least, says Economist

The Economist runs an interesting story according to which Google is scanning the staggering number of 27'000 books on average per day:

Quote:
Google will not divulge exact numbers, but Daniel Clancy, the project's lead engineer, gives enough guidance for an educated guess: Google's contract with one university library, Berkeley's, stipulates that it must digitise 3,000 books a day. The minimum for the other 12 universities involved may be lower, but the rate for participating publishers is higher. So a conservative estimate has Google digitising at least 10m books a year. The total number of titles in existence is estimated to be about 65m.
Given the immense number of digitized books, the author contemplates how people will read books in the future.

Quote:
As books go digital, new questions, both philosophical and commercial, arise. How, physically, will people read books in future? Will technology “unbind” books, as it has unbundled other media, such as music albums? Will reading habits change as a result? What happens when books are interlinked? And what is a book anyway?
At the end of the day, he doesn't believe that e-books will replace print books, but that, like paperbacks and audiobooks, they are a form that is here to stay.
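
A quick check of where the 27'000 figure comes from, as a minimal sketch. It assumes the Economist's conservative estimate of 10m books a year is spread evenly over a 365-day year (the article does not say so), and the function names are purely illustrative.

```python
# Rough sanity check of the headline figure, using only numbers from the quote.
BOOKS_PER_YEAR = 10_000_000        # the Economist's "conservative estimate"
TITLES_IN_EXISTENCE = 65_000_000   # estimated total number of titles
BERKELEY_PER_DAY = 3_000           # contractual minimum for Berkeley's library
DAYS_PER_YEAR = 365                # assumption: scanning runs every day


def books_per_day(books_per_year, days=DAYS_PER_YEAR):
    """Average daily rate implied by a yearly total."""
    return books_per_year / days


def years_to_scan_all(total_titles, books_per_year):
    """Years needed to digitise every title at a constant yearly rate."""
    return total_titles / books_per_year


def share_of_total(per_day, books_per_year, days=DAYS_PER_YEAR):
    """Fraction of the yearly total contributed by one daily rate."""
    return per_day * days / books_per_year


print(round(books_per_day(BOOKS_PER_YEAR)))                    # 27397
print(years_to_scan_all(TITLES_IN_EXISTENCE, BOOKS_PER_YEAR))  # 6.5
print(round(share_of_total(BERKELEY_PER_DAY, BOOKS_PER_YEAR), 2))  # 0.11
```

So the headline number is simply the yearly estimate divided by 365. At that pace the estimated 65m titles would take about six and a half years, and Berkeley's 3,000-a-day minimum alone would account for roughly 11% of the total.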
03-22-2007, 11:19 PM   #2
Steve Jordan
Onuissance Man
Posts: 5,397
Karma: 19608
Join Date: Jan 2006
Location: Germantown, MD USA
Device: HP iPaq 110
"What is a book, anyway?"

Does anybody really know what a book is? Does anybody really care? If so, I can't imagine why... we all have read enough to cry.

Anyway...

Maybe if everyone didn't think about replacing print books, and thought about augmenting Literature instead, we wouldn't be debating the emergence of e-books at all.
__________________

Darwin would read e-books!
SteveJordanBooks.com
03-23-2007, 02:36 AM   #3
Azayzel
Cache Ninja!
Posts: 644
Karma: 2300
Join Date: Jan 2007
Location: Tokyo, Japan
Device: PRS-500, HTC Shift, iPod Touch, iPaq 4150, TC1100, Panasonic WordsGear
That's a staggering number to be digitizing each day. I wonder what their costs are, especially once they get into scanning really old books that have to be handled delicately. Now let's just hope that the "books unbound" remain that way and will be updated to new formats as they evolve; i.e., that they don't get stuck in a format that's unusable/inaccessible in the future.

There are quite a few issues I can think of off the top of my head that will keep books in print long after new formats emerge/evolve. What would be nice is if they printed a set number of "master" copies of a given book on some highly rugged or nigh-indestructible medium that would ensure it will be around for generations to come. I mention this because I'm a mild book collector and I've quite a few in my library that it wouldn't be feasible to leave out for the young 'uns to play with. It would be a shame to see a work disappear due to aging... and back to the thread, Google will at least keep most of this accessible to people with online access and the right credentials.
03-23-2007, 08:58 AM   #4
Steve Jordan
Onuissance Man
Posts: 5,397
Karma: 19608
Join Date: Jan 2006
Location: Germantown, MD USA
Device: HP iPaq 110
I'm guessing that, because they have college libraries (and therefore probably students) doing the work, the cost to Google must be minimal. Still, if one library is digitizing 3,000 books a day, even with the latest and fastest scanning equipment (the slowest part of the process), that has to be multiple workstations and quite a number of students working on that project!

The article suggests they are simply making images of each page ("fingers are visible in the corners of many pages on books.google.com"), and that's a shame. If they're not being text-reco'd, they're missing a great opportunity.

In re-reading the article, I realize again they should have given that story to someone who actually knows something about e-books. In comparing e-books to CDs, the author says "The simplest difference is that transferring one's old music CDs onto iPods is easy, whereas transferring one's old books onto an e-book is impossible."

Really?

Last edited by Steve Jordan; 03-23-2007 at 09:07 AM.
03-23-2007, 10:04 AM   #5
igorsk
Wizard
Posts: 3,111
Karma: 17432
Join Date: Sep 2006
Location: Belgium
Device: PRS-500/505/700, Kindle, Cybook Gen3, Words Gear
Storing images of pages does not prevent them from being OCRed later, while storing just the text loses quite a bit of the complete presentation, not to mention possible OCR errors, which can be hard to correct without checking the originals.
Anyway, if you do a search on books.google.com, you will see the search result highlighted on the image of the page. So I guess they store both the image and the text of it.
03-23-2007, 10:06 AM   #6
NatCh
Gizmologist
Posts: 11,484
Karma: 31590
Join Date: Jan 2006
Location: Republic of Texas Embassy at Jackson, TN
Device: PRS600
Quote:
Originally Posted by Steve Jordan
If they're not being text-reco'd, they're missing a great opportunity.
Doesn't Google's project allow searching the text of the books they've scanned? I was under the impression that it does, which would suggest ....
03-23-2007, 10:53 AM   #7
The GreatGonzo
Old Yeller
Posts: 177
Karma: 67
Join Date: May 2006
Device: Iliad & Kindle - The Best of Both Worlds
...and how many of those books are actually scanned WELL?

I seem to run across quite a lot that have pages missing or so skewed that text is missing; some contain colored pages instead of black and white text; some files won't open or display correctly ... I mean, this is almost fully automated scanning we're talking about, right? Somebody jamming a bunch of pages into a sheet-fed scanner and uploading the result within minutes?

I've all but given up on Google books...
03-25-2007, 09:43 AM   #8
CommanderROR
eink fanatic
Posts: 2,005
Karma: 4900
Join Date: Mar 2006
Location: Germany
Device: STAReBOOK, iRex Iliad, Sony 505, Kindle 2
I hope they really get serious about this project. Scanning all the books, correcting errors, and then OCR and proofreading are a lot of work, and only a really big company like Google could even hope to manage that...
03-26-2007, 11:15 PM   #9
Steve Jordan
Onuissance Man
Posts: 5,397
Karma: 19608
Join Date: Jan 2006
Location: Germantown, MD USA
Device: HP iPaq 110
Actually, I never thought Google was right for this job. Given the apparent inconsistencies of the results, there should be a dedicated organization doing this... a group that will put more effort into properly scanned, reco'd and saved texts.

Don't ask me what organization. It should be the Library of Congress, but I don't think we can expect them to do it.

All times are GMT -4.