03-08-2011, 03:46 PM   #1
CallOfCth'reader
Cultist
Posts: 195
Karma: 8624438
Join Date: Jun 2009
Location: UK
Device: Sony PRS 505, Kobo Mini, Kobo Glo, Kobo Forma, Kindle DX
Archiving books (and magazines) - best way?

I've got a lot of old paper books which I don't want to keep (mainly for space reasons), but I'd love to have them in electronic format. Unfortunately, I've yet to find any of them in digital print. The only option open to me, for the moment at least, is to archive them myself. The ultimate aim, when I get round to it, is to convert the archived books to LRF and/or EPUB format.

What is the best way to archive them initially, so that I can (eventually, maybe) convert them to LRF/EPUB in the future? I'm thinking of scanning the pages and then stitching them into a PDF file. I've read that PDFs aren't good to OCR from though, so what are my options? Save each page as a JPG?

A lot of the books are 20-40 years old, and the magazines about 20+ years old, so the pages are various shades of...um...sepia. When I save the page scans, do I just save them 'as is' and do contrast tweaking when I eventually OCR them?
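One way to picture the "save as is, tweak later" option is below. This is only a minimal sketch, not something from the thread: the function name `stretch_contrast` and the 2%/98% percentile cut-offs are assumptions made up for illustration, and a real page would come from an image file rather than a plain list of grey values. It builds a contrast-stretched copy just before OCR and leaves the archived scan untouched.

```python
def stretch_contrast(pixels, low_pct=2.0, high_pct=98.0):
    """Return a contrast-stretched copy of 8-bit grey values (0-255).

    Hypothetical helper: the grey level at low_pct is mapped to 0 and the
    level at high_pct to 255, so yellowed paper ends up white and faded ink
    ends up black. The input sequence is not modified.
    """
    if not pixels:
        return []
    ordered = sorted(pixels)
    last = len(ordered) - 1
    lo = ordered[int(last * low_pct / 100)]
    hi = ordered[int(last * high_pct / 100)]
    if hi <= lo:
        # Flat page: nothing to stretch, return an unchanged copy.
        return list(pixels)
    scale = 255.0 / (hi - lo)
    return [min(255, max(0, round((p - lo) * scale))) for p in pixels]


# Toy "page": sepia paper at grey level 200, faded ink at grey level 60.
archived_scan = [200] * 90 + [60] * 10
ocr_copy = stretch_contrast(archived_scan)
```

Because the adjustment happens on a copy, the cut-offs can be changed later, or skipped altogether if the OCR software copes with the untouched scan.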
03-08-2011, 04:08 PM   #2
pholy
Booklegger
Posts: 1,801
Karma: 7999816
Join Date: Jun 2009
Location: Toronto, Ontario, Canada
Device: BeBook(1 & 2010), PEZ, PRS-505, Kobo BT, PRS-T1, Playbook, Kobo Touch
My OpticBook 3600 software saves page images in PNG format. I think that or TIFF is recommended for OCR purposes, as JPG is a lossy format that doesn't do well with sharp edges, if I recall correctly. It takes me about an hour to scan a 250-page book, and less than five more minutes to do the OCR. I save both the page images and the OCR text, so that when I come back to proof the text I can look at the page image to figure out the bad conversions. I've found that the cheap edition of ABBYY works best with greyscale images, not b/w, and I've never felt the need to make any manual adjustments to the scanned images.
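To illustrate what "lossless" means for a format like PNG, here is a minimal sketch using only the Python standard library. It is not how the scanner software works; `encode_grey_png` and `decode_grey_png` are hypothetical helper names, and the decoder is deliberately limited to files written by this encoder (8-bit greyscale, no row filtering). It writes a tiny page with a hard black/white edge and checks that every pixel comes back exactly as it went in.

```python
import struct
import zlib

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def _chunk(kind, data):
    """Build one PNG chunk: length, type, data, CRC over type + data."""
    body = kind + data
    crc = zlib.crc32(body) & 0xFFFFFFFF
    return struct.pack(">I", len(data)) + body + struct.pack(">I", crc)


def encode_grey_png(width, height, pixels):
    """Encode row-major 8-bit grey pixels (a bytes object) as a PNG."""
    if len(pixels) != width * height:
        raise ValueError("pixel count does not match width * height")
    # IHDR: bit depth 8, colour type 0 (greyscale), no interlacing.
    ihdr = struct.pack(">IIBBBBB", width, height, 8, 0, 0, 0, 0)
    # Each scanline is prefixed with filter type 0 (no filtering).
    raw = b"".join(
        b"\x00" + pixels[y * width:(y + 1) * width] for y in range(height)
    )
    return (
        PNG_SIGNATURE
        + _chunk(b"IHDR", ihdr)
        + _chunk(b"IDAT", zlib.compress(raw, 9))
        + _chunk(b"IEND", b"")
    )


def decode_grey_png(data):
    """Decode a PNG produced by encode_grey_png (not a general decoder)."""
    if data[:8] != PNG_SIGNATURE:
        raise ValueError("not a PNG file")
    pos, width, height, idat = 8, None, None, b""
    while pos < len(data):
        (length,) = struct.unpack(">I", data[pos:pos + 4])
        kind = data[pos + 4:pos + 8]
        body = data[pos + 8:pos + 8 + length]
        pos += 12 + length  # length + type + data + CRC
        if kind == b"IHDR":
            width, height = struct.unpack(">II", body[:8])
        elif kind == b"IDAT":
            idat += body
        elif kind == b"IEND":
            break
    raw = zlib.decompress(idat)
    stride = width + 1  # one filter byte per scanline
    pixels = b"".join(
        raw[y * stride + 1:(y + 1) * stride] for y in range(height)
    )
    return width, height, pixels


# Toy page: 8x4 pixels with a hard edge between white paper and black ink.
page = (b"\xff" * 4 + b"\x00" * 4) * 4
png_bytes = encode_grey_png(8, 4, page)
restored = decode_grey_png(png_bytes)
```

The restored pixels are byte-for-byte identical to the original, which is the property that keeps letter edges crisp for OCR. A lossy format such as JPG makes no such guarantee.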
All times are GMT -4.


MobileRead.com is a privately owned, operated and funded community.