#1 |
Cultist
Posts: 196
Karma: 8624438
Join Date: Jun 2009
Location: UK
Device: Sony PRS 505, Kobo Mini, Kobo Glo, Kobo Forma, Kindle DX
|
Best way to archive a book - PDF?
I'm at the point where I need to get rid of a number of books, due to space considerations. They are almost all second-hand books, in a condition where they wouldn't grace anyone's bookshelf, and almost every one is not available in e-book format to buy.
I don't mind destroying the books, so that I can get flat scans of the pages (and at least it makes the paper easier to recycle), but what is the best format to archive them to? My immediate thought was archive to PDF, and then convert to EPUB as and when I have the time and inclination to do it. How good is OCRing from PDFs? Or should I just archive the scans themselves, and OCR as and when?
#2 | |
Samurai Lizard
Posts: 14,794
Karma: 69500000
Join Date: Nov 2009
Device: NookColor, Nook Glowlight 4
|
Quote:
As far as PDF goes, I view PDF as a good final destination format since it preserves all of the formatting. However, I don't keep PDFs as my archive copy. Rather, I save my archive copy in other formats (mostly OpenDocument Text) and generate the PDF from that. The main reason for this is that PDFs tend to be good at one size, and not so good at other sizes. But I can take the archive source, adjust the formatting, and make a PDF appropriate for whatever use I need (such as on my computer screen, on my reader, or on paper). I hope this helps.
|
#3 |
Groupie
Posts: 195
Karma: 414
Join Date: Jan 2010
Location: Bend, OR
Device: Sony PRS-600
|
I really suggest RTF or EPUB for archival purposes. PDF will never convert well to EPUB. As Solitaire said, PDFs should be viewed as a final destination, not something you generate other formats from.
Robert
#4 |
frumious Bandersnatch
Posts: 7,543
Karma: 19001583
Join Date: Jan 2008
Location: Spaniard in Sweden
Device: Cybook Orizon, Kobo Aura
|
In addition to the text versions, I would keep a PDF made from the raw (or minimally pre-processed) scanned images, at least until you've read the book and fixed all possible issues. Too often the OCR gets something wrong, and you really need to check the printed book (or the scans) to see what's actually there.
|
#5 |
Addict
Posts: 210
Karma: 1000659
Join Date: Jan 2009
Location: Sunnyvale, CA
Device: Kindle Voyage, Kobo Aura H2O, PRS-650 (black), Kindle 3G
|
In addition to what Jellby has already said, that raw PDF (no OCR performed) can later be run through a program that takes PDF input and does the recognition itself (ABBYY FineReader, for example). You have the archive, so you can trash (err, recycle) the book and move on to another one, then come back for the OCR process at a later point.
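For anyone who would rather keep the page scans themselves than a PDF, the "archive now, OCR later" idea can be sketched roughly as below. This is only an illustration using the Python standard library, not a tool anyone in this thread mentioned. The function name `archive_scans`, the `manifest.json` layout, and the `ocr_done` flag are all made up for the example, and it assumes one image file per page with the page number in the file name.

```python
import hashlib
import json
import re
import zipfile
from pathlib import Path

# Assumption: scans are stored as one image file per page.
IMAGE_SUFFIXES = {".png", ".tif", ".tiff", ".jpg", ".jpeg"}


def natural_key(path):
    """Sort key that orders page2 before page10."""
    parts = re.split(r"(\d+)", path.name.lower())
    return [int(p) if p.isdigit() else p for p in parts]


def archive_scans(scan_dir, out_zip):
    """Bundle raw page scans into one ZIP with a checksum manifest.

    Hypothetical helper: the manifest format is invented for this sketch.
    """
    scan_dir = Path(scan_dir)
    pages = sorted(
        (p for p in scan_dir.iterdir()
         if p.is_file() and p.suffix.lower() in IMAGE_SUFFIXES),
        key=natural_key,
    )
    # ocr_done is a made-up flag to remind you the book still needs OCR.
    manifest = {"ocr_done": False, "pages": []}
    # ZIP_STORED: image formats are usually compressed already.
    with zipfile.ZipFile(out_zip, "w", compression=zipfile.ZIP_STORED) as zf:
        for number, page in enumerate(pages, start=1):
            data = page.read_bytes()
            zf.writestr(page.name, data)
            manifest["pages"].append({
                "page": number,
                "file": page.name,
                "sha256": hashlib.sha256(data).hexdigest(),
            })
        zf.writestr("manifest.json", json.dumps(manifest, indent=2))
    return manifest
```

Calling something like `archive_scans("scans/my_book", "my_book.zip")` (both paths are placeholders) would give one file per book. The checksums let you confirm later that the pages have not been corrupted before you feed them to whatever OCR program you end up using.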
|