Old 11-12-2014, 09:04 PM   #6
Tex2002ans
Quote:
Originally Posted by MaudlinHaus
[...] We don't create our epubs in house (we send PDFs to conversion house), but since I have a background in HTML/CSS, I have the opportunity to work closely with our vendor to get the kind of output we want, standardize design, etc.

[...]

RE: PDF, I would guess that most presses send PDF for conversion. What format would you send instead?
I would recommend exporting directly to HTML or EPUB from whatever program you are creating the book in. I am assuming you are using InDesign/Quark?

You will then have to either do the HTML/EPUB cleaning in-house, or find a conversion place (or independent contractor) that DOES handle InDesign/EPUB/HTML files directly.

(Note: These conversion houses are most likely going to be more expensive than the dirt-cheap "Indian"/"Chinese" conversion companies, but you will pay slightly more for much higher quality).

The closer you can work to the original source material, the better!

Look, you ALREADY have the exact digital text... leading it through PDF is just going to create a whole host of extra problems. PDF is meant as a final OUTPUT format, it is just about THE WORST format to ever work backwards from.

Workflow A (Correct):
  • InDesign/Quark/Word
    • Output to EPUB/HTML
  • Clean the EPUB
    • Since all of the text matches EXACTLY, all your time is just spent on cleaning up ugly code.
    • Then you just have to spend some time making sure everything was exported correctly
      • Making sure captions went below all the images
      • Footnotes are working and are in the right place, etc. etc.
  • Final EPUB
    • Test on your device, and fix up any minor mistakes you catch.

Workflow B (Horrible, Inefficient, Waste):
  • InDesign/Quark
    • Output to PDF
  • OCR the PDF
    • This is where you run it through a program which takes the image, and tries to "guess" what the character is.
    • This leads to A TON of extra steps... and depending on the complexity of the book, HOURS AND HOURS.
    • For example, here is one of my posts explaining the entire PDF -> EPUB process: https://www.mobileread.com/forums/sho...72&postcount=6
    • Also, the text might not be 100% correct; typos might be introduced.
  • Clean the EPUB
    • Many many hours of work, you have to make sure all the paragraphs are attached, plus all of the same work as Workflow A (captions, footnotes, etc. etc.).
    • Now, since the text might not be 100%, you also have to spend a lot of time spellchecking, and looking for typos, fixing hyphenation, finding weird symbols that might have been introduced due to the OCR.
      • Missing accents on letters
      • Extra apostrophes.
      • Squiggly brackets instead of normal brackets/parentheses.
      • [...]
  • Final EPUB
    • Test on your device, and fix up any minor mistakes you catch.

It may be turning a "few hour" job of just cleaning code into a "many hour" job.
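
To give a rough idea of what the extra cleanup in Workflow B looks like, here is a minimal Python sketch. The function names and the three heuristics are my own assumptions for illustration, not part of any particular tool, and a real OCR cleanup pass would need many more checks (plus a human reading the result).

```python
import re

# Hypothetical heuristics, for illustration only. Each one targets a kind of
# OCR damage listed above: squiggly brackets, extra apostrophes, and words
# left hyphenated at the end of a line.
SUSPICIOUS = {
    "curly bracket": re.compile(r"[{}]"),
    "doubled apostrophe": re.compile(r"'{2,}|’{2,}"),
    "end-of-line hyphen": re.compile(r"\w-$"),
}

def flag_ocr_artifacts(text):
    """Return (line_number, label, line) for every line that trips a heuristic."""
    hits = []
    for number, line in enumerate(text.splitlines(), start=1):
        for label, pattern in SUSPICIOUS.items():
            if pattern.search(line):
                hits.append((number, label, line))
    return hits

def rejoin_paragraph(lines):
    """Join hard-wrapped lines into one paragraph.

    A trailing hyphen is treated as a word split across lines and removed.
    Caveat: this also merges genuinely hyphenated words ("well-" + "known"),
    so the output still needs to be checked by hand.
    """
    joined = ""
    for line in lines:
        line = line.strip()
        if joined.endswith("-"):
            joined = joined[:-1] + line
        elif joined:
            joined += " " + line
        else:
            joined = line
    return joined
```

Something like `flag_ocr_artifacts(chapter_text)` only gives you a list of lines to go look at. It does not fix anything, which is exactly why this workflow eats so many hours.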

Quote:
Originally Posted by MaudlinHaus
One issue we initially had with the conversion house was image resolution. We proof our books on an iPad, which has a higher density display than a standard screen, and the display of images in the books was really poor.
You have the source files... you have the original image files. Just plop those right into the EPUB, and fix up whatever needs to be tweaked (filenames, file size, etc. etc.).

For example, you don't want your original 3-5MB+ cover file in your EPUB. You might want to save that as a lower quality JPG.
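
A quick way to spot files like that is to list the largest images inside the finished EPUB (an EPUB is just a ZIP file). This is only a sketch using Python's standard `zipfile` module; the function name, the extension list, and the 1 MB default threshold are my own assumptions, so pick a limit that suits your books.

```python
import zipfile

# Assumed set of image types to check; extend as needed.
IMAGE_EXTENSIONS = (".jpg", ".jpeg", ".png", ".gif")

def oversized_images(epub_path, limit_bytes=1_000_000):
    """Return (name, uncompressed_size) for images above limit_bytes, largest first."""
    with zipfile.ZipFile(epub_path) as epub:
        big = [
            (info.filename, info.file_size)
            for info in epub.infolist()
            if info.filename.lower().endswith(IMAGE_EXTENSIONS)
            and info.file_size > limit_bytes
        ]
    return sorted(big, key=lambda item: item[1], reverse=True)
```

Anything this reports is a candidate for re-exporting from the original image at a smaller size or lower JPG quality.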

What typically happens is that the method used to pull the image from the PDF -> EPUB degrades it. Again, this is one of the flaws of working with PDF as the INPUT format.

You have the advantage, because you guys already have all the source files.

So what you would typically do, is hand over your original InDesign files, PLUS, hand over a ZIP file of all the original images. (This is how I handle InDesign -> EPUB conversions).

Quote:
Originally Posted by MaudlinHaus
However, the hi-def images look crap in Adobe Digital Editions for desktop PC (1280x1024 screen--I know, not the most modern setup). The article warned that there'd be some downsampling and possible loss of quality on standard def screens, but the difference is really stark.
Hmmm, would it be possible to show any examples? I personally haven't seen things get TOO bad from high resolution -> lower resolution images. Only spot I can think of is if text is in the images.

And as Toxaris said, it is really up to the resizing algorithms of the device.

Quote:
Originally Posted by MaudlinHaus
Some images that have text in them are almost unreadable in ADE. What to do? Have any other posters wrestled with this issue?
Text....... in images... you say? That is one of my biggest pet peeves! What sort of data is being displayed here... are these tables saved as images? Are these charts/graphs?

My personal philosophy is to avoid text in images as much as possible, and try to pull as much of that into the HTML equivalent as possible. Where you HAVE to use it, save as PNG (AVOID JPG IN THAT CASE).

If it is a vector chart/graph, already in InDesign/Illustrator, then it would probably be best to go back to the source material, and generate a "lower resolution" PNG directly from the vector files!

Side Note: I wrote about advantages/disadvantages of HTML/Image Tables here: https://www.mobileread.com/forums/sho...d.php?t=223062
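
If the images turn out to be tables, pulling them into HTML can be as simple as retyping (or exporting) the cell values and generating the markup. Here is a minimal Python sketch; the function name and the sample data are hypothetical, and a real book would also want a caption and CSS classes on the table.

```python
from html import escape

def rows_to_html_table(header, rows):
    """Build a plain HTML table so the text stays selectable and resizable."""
    parts = ["<table>", "  <thead>", "    <tr>"]
    parts += [f"      <th>{escape(str(cell))}</th>" for cell in header]
    parts += ["    </tr>", "  </thead>", "  <tbody>"]
    for row in rows:
        parts.append("    <tr>")
        # escape() protects characters such as & and < inside cell text
        parts += [f"      <td>{escape(str(cell))}</td>" for cell in row]
        parts.append("    </tr>")
    parts += ["  </tbody>", "</table>"]
    return "\n".join(parts)

# Made-up sample data, purely for illustration.
print(rows_to_html_table(["Year", "Sector"], [[1990, "R&D"], [1991, "Retail"]]))
```

The text in a table built this way scales with the reader's font size instead of turning to mush when the device resizes an image.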

Quote:
Originally Posted by MaudlinHaus
I want to future proof the books but I don't want to alienate desktop readers or make them think we have a crap product. Thanks in advance for any help you can offer.
So say we all... you should be releasing high quality stuff... not crap! Which is why you should avoid those dirt-cheap conversion companies... that way will only bring headaches (and you will have a lot more overhead/problems in the long-run). The correct way is to get it done RIGHT the first time. Not, get it done cheaply/crappily, and then pay someone to come around AGAIN, to have to double-check all the work, and clean up the file and get it done right.

The entire reason I got into this in the first place was because of EPUBs that were HORRIBLY converted. Tons of typos, tons of mistakes, horrible code, low-resolution images, etc. etc.

Anyway, I work on non-fiction economics books mostly, and I do a lot of work doing PDF -> EPUB conversions (mostly from scanned books).

When I work on newer books though, where we have the original InDesign files, that is DEFINITELY the way to go. Avoid PDF completely if you can.

Side Note: This isn't getting into the discussion of perhaps having to change the entire "print book" workflow. Typically, companies have a "print book FIRST" mentality, and the ebook is just a dirty afterthought. What has to start happening is a shift to an "HTML/ebook FIRST" mentality, and designing the books in InDesign in ways that make it easier to generate both (consistent usage of styles/classes, etc. etc.).

Last edited by Tex2002ans; 11-12-2014 at 09:35 PM.