|12-05-2008, 09:21 AM||#1|
Join Date: Dec 2008
Device: Sony PRS-700
ebook noob with a workflow strategy question.
I'm a newcomer to your wonderful world of e-books with a question of how to design a workflow for a decent-sized project. I need to recreate the content of four 1200-1400 page reference volumes for access on my Sony-700 (which arrives today). My purpose is to save myself from lugging around 2 or 3 hardcover reference texts (1000-1200 pages each, so 20+ lbs total) on a daily basis. I'm miserable enough to be willing to pay hundreds of dollars for publisher-created e-book editions but they don't exist and won't for at least one to two years by most estimations. I’ve spent much of the last week testing various ideas and software from these forums, but can’t figure out what’s the most realistic approach.
I’ve abandoned my original plan to de-bind, duplex scan, OCR to pdf, and hazard the obvious legal exposure (even though I’d only be format shifting books for which I have receipts and would retain the front covers, etc). I’ve mostly been dissuaded by reading on these boards about the practical difficulties of such an enterprise: i.e., difficulty getting satisfactory OCR and subsequent slow and poor pdf rendering with unmanageable file sizes.
The good news is that the vast majority (90%) of the info is in the public domain in some combination of text-based pdf/html/rtf/txt/etc. Unfortunately, the public-domain and non-public-domain material is sufficiently interspersed across each volume that I cannot just keep a separate binder of that 10%. I need a single format with the ability to search across all pages, on-the-fly bookmarking and highlighting, and support for a deep hyperlinked TOC.
So, Plan B: investing weeks scavenging the 90% from the far reaches of the public domain, and scanning only the 10% remainder (that 10% will have to be searchable, but it need not be perfect). Given my choice of the Sony 700 and my need for on-the-fly bookmarks and highlighting (all reviewable/transferable to my computer), I’m debating between epub and pdf. I toss pdf in there because I’m assuming that imperfect scanning will require reference to some page images (and because, admittedly, it is what I already know best). I don’t think I’ll need to worry about resizing fonts for reflow if I get it right the first time around.
What are my best options on how to attack this? Should I assemble the pieces in sequence in html and run them through eCub (I’d need to learn more about stripping the complexity out of html; eCub turned the html into 100-page epub monstrosities that crashed Adobe DE), OR do I go with RTF via Calibre or Bookcreator and lose some of the formatting elements (including several hundred internal links within the document that I don’t strictly need but would very much prefer to retain; also, I don’t know how badly they’ll slow down the reader, page turning, etc.), OR pursue a hybrid PDF (90% text, extremely simple html, and vector graphics; 10% OCR and artifacts of what doesn’t transpose correctly), OR do I need to lay it out in InDesign (I’d like to avoid doing so, but I have the software…)?
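To make the first option concrete, here is a minimal sketch of what "assembling the pieces in sequence and stripping the complexity out of html" could look like before handing the result to a converter. It uses only the Python standard library. The tag whitelist and the names `Simplifier`, `simplify`, and `assemble` are hypothetical choices for illustration, not part of eCub, Calibre, or any other tool mentioned here, and real source files would likely need more care (images, character encodings, malformed markup).

```python
from html import escape
from html.parser import HTMLParser

# Hypothetical whitelist: structural tags worth keeping in a simple e-book.
KEEP_TAGS = {"h1", "h2", "h3", "p", "a", "ul", "ol", "li",
             "table", "tr", "td", "th", "b", "i", "em", "strong", "br"}
# Attributes kept per tag so internal links and their targets survive.
KEEP_ATTRS = {"a": {"href", "name"}}
# Tags whose entire content is dropped.
SKIP_CONTENT = {"script", "style"}


class Simplifier(HTMLParser):
    """Re-emit only whitelisted tags and attributes; keep all visible text."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.out = []
        self.skip_depth = 0  # > 0 while inside <script> or <style>

    def handle_starttag(self, tag, attrs):
        if tag in SKIP_CONTENT:
            self.skip_depth += 1
            return
        if self.skip_depth or tag not in KEEP_TAGS:
            return  # drop the tag itself; its text still passes through
        if tag == "br":
            self.out.append("<br/>")
            return
        allowed = KEEP_ATTRS.get(tag, set()) | {"id"}  # id = link target
        kept = "".join(
            f' {name}="{escape(value, quote=True)}"'
            for name, value in attrs
            if name in allowed and value is not None
        )
        self.out.append(f"<{tag}{kept}>")

    def handle_endtag(self, tag):
        if tag in SKIP_CONTENT:
            self.skip_depth = max(0, self.skip_depth - 1)
            return
        if self.skip_depth or tag not in KEEP_TAGS or tag == "br":
            return
        self.out.append(f"</{tag}>")

    def handle_data(self, data):
        if not self.skip_depth:
            self.out.append(escape(data, quote=False))


def simplify(html_text):
    """Return html_text reduced to whitelisted tags and attributes."""
    parser = Simplifier()
    parser.feed(html_text)
    parser.close()
    return "".join(parser.out)


def assemble(sections):
    """Simplify each section and join them, in order, into one document."""
    body = "\n".join(simplify(section) for section in sections)
    return f"<html><body>\n{body}\n</body></html>"
```

Calling `assemble` on a list of html strings (one per section, in reading order) yields a single stripped-down file in which `href` and `id` attributes are preserved, so the internal links would not have to be sacrificed. Whether output this plain actually avoids the converter and reader problems described above is something that would have to be tested on a sample chapter.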
I know that each of these approaches has been addressed and that most of you have strong preferences to the exclusion of others, but I haven’t read much comparative analysis. Obviously, this is a big 50+ hour undertaking ahead of me, so I’d like to proceed with confidence, and soon, so that I can start digging for the public domain options in the best formats.
Thank you in advance! This board has already been incredibly helpful for me.
Last edited by Bierkonig; 12-05-2008 at 09:28 AM.