01-31-2016, 09:49 PM | #1 |
Member
Posts: 15
Karma: 10
Join Date: Dec 2014
Device: kindle touch WP63GW
|
Specific problem converting pdf to mobi
My friend has a book that he wrote and self published (print) about 12 years ago. He has the pdf file which is accurate, but when I tried various methods (Calibre, on-line converters) to convert to mobi, some systemic problems appeared:
1) In the original pdf, the chapter name appears at the top of each page. In the mobi, this name gets mixed into the text instead of staying at the top of the page. One possible solution is to delete the chapter name from the top of each page in the pdf.
2) In the original pdf, the page number is in the margin. In the mobi, this number gets mixed into the text. I suppose the page numbers could just be deleted from the pdf.
3) The pictures have captions. In the mobi, these captions get mixed into the text. I'm not sure what to do about this because the captions are important.
4) An Android ebook reader (not Kindle) showed the pictures out of order. I would like e-book readers of every brand to display the pictures in the correct order.
The original pdf is here (51MB): http://files.videohelp.com/u/61125/Buer.pdf
If someone could take a look at the pdf and give me some ideas on how to correct these problems, please let me know. |
01-31-2016, 11:06 PM | #2 |
Bookaholic
Posts: 14,391
Karma: 54969924
Join Date: Oct 2007
Location: Minnesota
Device: iPad Mini 4, AuraHD, iPhone XR +
|
I guess my first question is do you have to use the PDF as your source file? Does he not have the original Word (or whatever it was written in) doc for a starting point? PDF is one of the worst source formats you can begin with.
If I was formatting this and my only choice was to use the PDF I'd use Acrobat or some other tool to export the PDF to HTML and use that as a base to edit and build an ePub using either Sigil or the Calibre Editor. Auto converters are generally not going to cut it. Once I had the ePub how I wanted it I'd have a good source format to either upload to stores as is or convert for the Kindle store using KindleGen/Kindle Previewer. The easiest way to avoid those page numbers and headers is to crop them out of the PDF before doing anything else with it. Just be sure it's really cropped as some programs will just hide the cropped area and those things will still show up in a conversion. |
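For anyone who ends up with the page text already extracted rather than a truly cropped PDF, here is a rough Python sketch of the same clean-up idea. Everything in it is an assumption for illustration: the function name `strip_running_lines` is hypothetical, `pages` is assumed to be a list with one string of text per page, the running header is assumed to be the first non-blank line on a page (after any page number), and a page number is assumed to be a line containing only digits. It is not a replacement for cropping, and a genuine line of text that is nothing but digits would be dropped too.

```python
import re

# A line holding only 1-4 digits is treated as a page number (assumption).
PAGE_NUMBER = re.compile(r"^\d{1,4}$")


def strip_running_lines(pages, chapter_titles):
    """Drop running headers and bare page numbers from extracted page text.

    pages: list of strings, one per PDF page (hypothetical input).
    chapter_titles: header strings known to repeat at the top of pages.
    """
    titles = {t.strip().lower() for t in chapter_titles}
    cleaned = []
    for text in pages:
        kept = []
        header_checked = False
        for line in text.splitlines():
            stripped = line.strip()
            if not stripped:
                kept.append(line)  # keep blank lines as paragraph breaks
                continue
            if PAGE_NUMBER.match(stripped):
                continue  # bare page number
            if not header_checked:
                # Only the first real line of a page can be a running header.
                header_checked = True
                if stripped.lower() in titles:
                    continue
            kept.append(line)
        cleaned.append("\n".join(kept).strip("\n"))
    return cleaned
```

Limiting the header check to the first real line of each page keeps a sentence that merely begins with the chapter name from being deleted.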
01-31-2016, 11:30 PM | #3 | |
Bookmaker & Cat Slave
Posts: 11,460
Karma: 158448243
Join Date: Apr 2010
Location: Phoenix, AZ
Device: K2, iPad, KFire, PPW, Voyage, NookColor. 2 Droid, Oasis, Boox Note2
|
Quote:
Other folks will say that you can use Calibre, or this, or that, but in my experience, the best method is ABBYY FineReader. There are settings in ABBYY that specifically look for running headers (your 1) and page numbers (your 2), and captions should be retained with the images (your 3). This, pretty much exactly, is why conversion houses charge a good whack to do what you are trying to do. Your choice is simply to bite down and pay to have it done with ABBYY, or to do the manual cleanup of the eBook file yourself, in the HTML. You will need to insert the images in the correct place, anchored correctly either a) in between paragraphs or b) to specific paragraphs. Then you'll create the captions, as a class of element, and add them. What you've taken on isn't simple if you are not experienced in this already, particularly if you are preparing it for public retail distribution. Someone around here has mentioned an open-source alternative to ABBYY, but honestly, I don't remember the name. Hopefully someone will post it here. Good luck. You have a lot of work in front of you, but it can be done with a lot of elbow grease, particularly if you don't have the knowledge or tools to go the ABBYY FineReader route. Hitch |
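To make the "captions as a class of element" step a little more concrete, here is a minimal Python sketch that builds the markup for one picture and its caption. The class names `figure` and `caption`, the helper name `figure_block`, and the CSS are all hypothetical choices, not something taken from ABBYY, Sigil, or this thread. The `page-break-inside: avoid` rule is only a request to the reader software, and as far as I know readers vary in whether they honour it.

```python
from html import escape

# Hypothetical stylesheet rules to paste into the ePub's CSS file.
CAPTION_CSS = """\
div.figure { text-align: center; page-break-inside: avoid; }
p.caption { font-size: 0.9em; font-style: italic; text-indent: 0; }
"""


def figure_block(image_src, caption, alt=""):
    """Return an XHTML fragment holding one image and its caption."""
    return (
        '<div class="figure">\n'
        f'  <img src="{escape(image_src, quote=True)}" '
        f'alt="{escape(alt, quote=True)}"/>\n'
        f'  <p class="caption">{escape(caption)}</p>\n'
        "</div>"
    )
```

A call such as `figure_block("../Images/pops.jpg", "Pops", alt="Pops")` (the image path is made up) gives a block that can be pasted between two paragraphs in Sigil's code view.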
|
02-01-2016, 12:02 AM | #4 |
Ex-Helpdesk Junkie
Posts: 19,422
Karma: 85397180
Join Date: Nov 2012
Location: The Beaten Path, USA, Roundworld, This Side of Infinity
Device: Kindle Touch fw5.3.7 (Wifi only)
|
Hitch -- definitely not calibre, as calibre doesn't have any OCR capabilities at all.
Google's open-source Tesseract engine (with various frontends) is supposed to be the best freely available OCR and is also cross-platform. But there is a reason why ABBYY FineReader costs so much money: as you said, it really is better and is worth the money. (Gah! PDF conversions... here is calibre's official warning about PDF.) |
02-01-2016, 01:43 AM | #5 | ||
Bookmaker & Cat Slave
Posts: 11,460
Karma: 158448243
Join Date: Apr 2010
Location: Phoenix, AZ
Device: K2, iPad, KFire, PPW, Voyage, NookColor. 2 Droid, Oasis, Boox Note2
|
Quote:
Quote:
Hitch |
||
02-01-2016, 01:57 AM | #6 | |
Ex-Helpdesk Junkie
Posts: 19,422
Karma: 85397180
Join Date: Nov 2012
Location: The Beaten Path, USA, Roundworld, This Side of Infinity
Device: Kindle Touch fw5.3.7 (Wifi only)
|
Quote:
I mean, converting in calibre is better than nothing, and there are some tools to help extract the right text (and do line unwrapping)... but at the end of the day, you probably want professional software. IIRC, ABBYY can actually export a (relatively) good EPUB/HTML/DOCX -- no idea what is best, but lots nicer than PDF as a conversion source. |
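For readers wondering what "line unwrapping" means in practice, here is a deliberately simple Python sketch. It is not calibre's algorithm, just an illustration under stated assumptions: paragraphs in the extracted text are separated by blank lines, and a line ending in a hyphen followed by a line starting with a lowercase letter is one word split across the break. The name `unwrap_lines` is hypothetical.

```python
def unwrap_lines(text):
    """Join hard-wrapped lines into paragraphs using a simple heuristic."""
    paragraphs = []
    current = ""
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            # A blank line closes the paragraph in progress.
            if current:
                paragraphs.append(current)
                current = ""
            continue
        if current.endswith("-") and line[0].islower():
            # Word hyphenated across the line break: drop the hyphen and join.
            current = current[:-1] + line
        elif current:
            current += " " + line
        else:
            current = line
    if current:
        paragraphs.append(current)
    return paragraphs
```

The hyphen rule will wrongly merge genuine compounds that happen to break at the hyphen ("well-" / "known" becomes "wellknown"), which is one small example of why PDF-derived text still needs proofreading.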
|
02-04-2016, 02:30 PM | #7 |
Member
Posts: 15
Karma: 10
Join Date: Dec 2014
Device: kindle touch WP63GW
|
I ended up using Foxit pdf editor to delete the page numbers and the chapter name at the top of each page. Then I used ABBYY FineReader 12 to convert to epub, and then the Sigil epub editor to make some adjustments to the epub. Finally I used an online converter to convert to mobi. The only problem is the captions for the pictures. I ended up just typing in the captions below each picture. The problem with this is that, depending on how the user adjusts the font size on the Kindle, a caption could be separated from its picture, i.e. the caption lands on the next page. Strangely, the ABBYY program put just 3 of the captions IN the picture, at the bottom. (This occurred with the captions: Ted Orange, Pops, Exit tunnel where Lucy came from.) This is good, but the font of those captions was too small and light. When a caption is part of the picture, it cannot be separated from the picture if the user selects a large font. The ebook looks good on my Kindle Touch as long as I choose a normal size font.
The result is here: http://files.videohelp.com/u/61125/Buer.mobi Last edited by jimdays; 02-04-2016 at 03:17 PM. |
02-04-2016, 06:02 PM | #8 | |
Bookmaker & Cat Slave
Posts: 11,460
Karma: 158448243
Join Date: Apr 2010
Location: Phoenix, AZ
Device: K2, iPad, KFire, PPW, Voyage, NookColor. 2 Droid, Oasis, Boox Note2
|
Quote:
Just saying. However, it sounds like you did a GREAT job. Hitch |
|
02-04-2016, 07:25 PM | #9 |
Wizard
Posts: 1,613
Karma: 6718479
Join Date: Dec 2004
Location: Paradise (Key West, FL)
Device: Current:Surface Go & Kindle 3 - Retired: DellV8p, Clie UX50, ...
|
This may not be the result of something ABBYY did but instead something that was done by the application that created the PDF (one of the many, many reasons why PDF is a horrid source for ebook conversion). The app that created the PDF may have reacted to some layout difficulty (text over top of a picture, ...) by rasterizing the text with the picture. This leaves no text, per se, in the PDF, and if ABBYY can't distinguish the text component as separate from the image portion, it simply ends up with a picture.
|
02-05-2016, 02:42 AM | #10 | |
Bookmaker & Cat Slave
Posts: 11,460
Karma: 158448243
Join Date: Apr 2010
Location: Phoenix, AZ
Device: K2, iPad, KFire, PPW, Voyage, NookColor. 2 Droid, Oasis, Boox Note2
|
Quote:
Hitch |
|
02-05-2016, 09:59 AM | #11 |
Member
Posts: 15
Karma: 10
Join Date: Dec 2014
Device: kindle touch WP63GW
|
I noticed something when I look at the ebook file that I made:
http://files.videohelp.com/u/61125/Buer.mobi
When viewed on a Kindle Touch, the double spaces I put in some places are honored (like right after the book cover picture, so that the cover stays separate from any text), but when viewed on an Android device with the app called Cool Reader, the double spaces are not honored. Also, the Kindle Touch automatically puts a double space between paragraphs, while the Android device indents the paragraphs. The double space between paragraphs (and in other places) makes the ebook easier on the eyes.
Can someone take a look at the above mobi file on an ebook reader (Kindle or other kind) and give comments/suggestions on appearance (formatting)?
Also, is there software for a PC that simulates a Kindle, so that the appearance is exactly the same as on a Kindle? The purpose here is not to read the book on a PC, but to know exactly how the book will appear on a Kindle.
Another question: Is there an official way to insert captions for viewing on a Kindle, one that would not allow a caption to get separated from its picture if the user chooses a large font? And what software would do this? (Though I could, I don't really want to put the captions IN the bottom of the picture with a photo editor like Irfanview.)
Lastly, this book has some misspellings and other mistakes. My purpose is not to correct these problems; that will be left to the author to do. In a way, I think the mistakes add to the overall effect of the book. Last edited by jimdays; 02-05-2016 at 10:16 AM. |
02-05-2016, 10:24 AM | #12 | ||
eBook Enthusiast
Posts: 85,544
Karma: 93383043
Join Date: Nov 2006
Location: UK
Device: Kindle Oasis 2, iPad Pro 10.5", iPhone 6
|
Quote:
Quote:
|
||
02-05-2016, 03:31 PM | #13 | ||||||
Bookmaker & Cat Slave
Posts: 11,460
Karma: 158448243
Join Date: Apr 2010
Location: Phoenix, AZ
Device: K2, iPad, KFire, PPW, Voyage, NookColor. 2 Droid, Oasis, Boox Note2
|
Quote:
Quote:
Quote:
Speaking of: your cover is not accessible from the GoTo menu. That will earn you a gig. Quote:
Quote:
Quote:
And, I assume you already know that the author can't live-edit a MOBI. Not that this is the point of this subforum or thread, but you should come up with a process, built around a form the author can fill in, from which you will make the edits. I recommend that you pop this puppy through the KDP Bookshelf upload process, at the least to run the spell-checker. Download/email the results to the author, and get the edits made BEFORE you put this on sale. Just my $.02. Hitch |
||||||
02-05-2016, 04:36 PM | #14 | |
Wizard
Posts: 1,539
Karma: 6613969
Join Date: Mar 2013
Location: Rosario - Santa Fe - Argentina
Device: Kindle 4 NT
|
Quote:
https://www.mobileread.com/forums/sho...d.php?t=270351 and especially these posts: https://www.mobileread.com/forums/sho...04&postcount=7 https://www.mobileread.com/forums/sho...9&postcount=11 But remember, that is for Kindles that support KF8, not for old models (that is, K1, K2 and KDX). By using an SVG wrapper to include the caption, the caption will remain with the image no matter what font size the user chooses. Regards |
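As a rough illustration of the SVG wrapper idea, here is a small Python sketch that builds such a wrapper for one picture. It is not the exact markup from the linked posts: the helper name `svg_figure`, the `caption_band` parameter, and the layout arithmetic are assumptions made for this example. The idea is simply that the image and the caption text live inside one SVG, so the reader scales and places them as a single unit.

```python
from xml.sax.saxutils import escape, quoteattr


def svg_figure(image_href, img_width, img_height, caption, caption_band=60):
    """Return an SVG wrapper holding an image with a caption strip below it.

    img_width / img_height: pixel size of the image file.
    caption_band: extra height (in the same units) reserved for the caption.
    """
    total_height = img_height + caption_band
    text_x = img_width // 2                        # centre the caption
    text_y = img_height + caption_band * 2 // 3    # baseline inside the band
    font_size = caption_band // 2
    return (
        '<svg xmlns="http://www.w3.org/2000/svg" '
        'xmlns:xlink="http://www.w3.org/1999/xlink" version="1.1" '
        f'viewBox="0 0 {img_width} {total_height}" width="100%" '
        'preserveAspectRatio="xMidYMid meet">\n'
        f'  <image width="{img_width}" height="{img_height}" '
        f"xlink:href={quoteattr(image_href)}/>\n"
        f'  <text x="{text_x}" y="{text_y}" text-anchor="middle" '
        f'font-size="{font_size}">{escape(caption)}</text>\n'
        "</svg>"
    )
```

One trade-off worth knowing: because the caption is part of the SVG, it scales with the picture rather than with the user's font setting, so a long caption may need a smaller font size or a taller band to fit.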
|
02-15-2016, 08:57 PM | #15 |
Member
Posts: 15
Karma: 10
Join Date: Dec 2014
Device: kindle touch WP63GW
|
OK, I got the ebook up on the Amazon site. As the author didn't want to spend any time on details, I just took the quick default whenever I was given options. The price for the ebook is $3.49 (the default number). My friend wanted to be the first to buy his own book, so he bought the ebook and it was immediately sent to, and appeared on, his Kindle Fire home screen.
My question is: I noticed that only the Kindle version is available and this is sent wirelessly to a Kindle. What if somebody wants the epub file? When putting the ebook on Amazon, is there an option to have both Kindle and epub available? How would someone put this Kindle book on an Android phone, for example? When I get free public library ebooks from Overdrive, I am given the option to get epub or Kindle version. I always get the epub version because I refuse to register my own Kindle Touch. So I get the epub version, strip the DRM, convert to mobi and then put the file on my Kindle Touch via USB. When I'm done reading the book, I just delete the mobi and index (sdr) file. |
|