Quote:
Originally Posted by jimdays
My friend has a book that he wrote and self published (print) about 12 years ago. He has the pdf file which is accurate, but when I tried various methods (Calibre, on-line converters) to convert to mobi, some systemic problems appeared:
1) In the original pdf, at the top of each page is the name of the chapter. In the mobi, this name gets mixed into the text (instead of appearing at the top of the page). One solution (maybe) is just to delete the chapter name at the top of each page on the pdf.
2) In the original pdf, page number is in margin. In the mobi, this number gets mixed into the text. I suppose the page numbers could just be deleted in the pdf.
3) Pictures have captions. In the mobi, these captions seem to get mixed into the text.
I'm not sure what to do about this because the captions are important.
4) An Android ebook reader (not Kindle) showed the pictures out of order.
I would like different brands of e-book readers to display the pictures in the correct order.
The original pdf is here (51MB):
http://files.videohelp.com/u/61125/Buer.pdf
If someone could take a look at the pdf and give me some ideas how to correct these problems, please let me know.
These are perfectly typical scan/OCR errors. There's nothing new here.
Other folks will say that you can use Calibre, or this, or that, but in my experience, the best method is ABBYY FineReader. It has settings specifically for detecting running headers (your 1) and page numbers (your 2), and it should keep captions with their images (your 3).
This is pretty much exactly why conversion houses charge a good whack to do what you are trying to do. Your choice is simply to bite the bullet and pay to have it done with ABBYY, or to do the manual cleanup of the eBook file yourself, in the HTML.
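If you go the manual route, the header and page-number cleanup can be partly scripted before you ever touch the HTML. Here is a rough sketch in Python, not a finished tool. It assumes you have already extracted the PDF to plain text one page at a time and that you know the chapter names used as running headers; the function name `clean_page` and its parameters are made up for illustration.

```python
import re

# A line that is nothing but 1-4 digits is treated as a page number.
PAGE_NUMBER = re.compile(r"^\d{1,4}$")

def clean_page(page_text, running_headers):
    """Drop the running header and bare page numbers from one page of text.

    page_text: the extracted text of a single PDF page (assumed input).
    running_headers: chapter names that appear at the top of pages.
    """
    headers = {h.strip().lower() for h in running_headers}
    kept = []
    seen_first_line = False
    for line in page_text.splitlines():
        text = line.strip()
        if not text:
            kept.append("")  # keep blank lines as paragraph breaks
            continue
        is_first = not seen_first_line
        seen_first_line = True
        if is_first and text.lower() in headers:
            continue  # running header at the top of the page
        if PAGE_NUMBER.match(text):
            continue  # line consisting only of a page number
        kept.append(line)
    return "\n".join(kept)

page = "Chapter One\nIt was a dark night.\n12\nThe wind howled."
print(clean_page(page, ["Chapter One"]))
```

Be careful with the page-number rule: a line in the real text that is only a number (a year on its own line, say) would be dropped too, so check the output against the PDF.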
You will need to insert the images in the correct places, anchored either a) between paragraphs or b) to specific paragraphs. Then you'll create the captions as a class of element and add them.
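To make the image-and-caption step concrete, here is a minimal sketch in Python of what the markup could look like. The class names `figure` and `caption`, the file path, and both helper functions are my own placeholders, not anything Calibre or ABBYY produces; use whatever names suit your stylesheet.

```python
from html import escape

def figure_html(image_src, caption, alt_text=""):
    """Build an image block with its caption as a classed paragraph."""
    return (
        '<div class="figure">\n'
        f'  <img src="{escape(image_src, quote=True)}" alt="{escape(alt_text, quote=True)}"/>\n'
        f'  <p class="caption">{escape(caption)}</p>\n'
        '</div>'
    )

def insert_after_paragraph(paragraphs, index, block):
    """Return a new list with block placed right after paragraphs[index]."""
    return paragraphs[: index + 1] + [block] + paragraphs[index + 1 :]

paragraphs = ["<p>First paragraph.</p>", "<p>Second paragraph.</p>"]
block = figure_html("images/fig1.jpg", "The old mill", "Photo of the mill")
print("\n".join(insert_after_paragraph(paragraphs, 0, block)))
```

Keeping the image and its caption inside one wrapper element is what keeps them together in reading order, so the caption no longer drifts into the body text. You would then style the `caption` class in the book's CSS.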
What you've taken on isn't simple if you are not already experienced in this, particularly if you are preparing the book for public retail distribution.
Someone around here has mentioned an open-source alternative to ABBYY, but honestly, I don't remember the name. Hopefully someone will post it here.
Good luck. You have a lot of work ahead of you, particularly if you don't have the knowledge or tools to go the ABBYY FineReader route, but it can be done with plenty of elbow grease.
Hitch