Old 05-02-2013, 07:06 AM   #1
bizzybody
Addict
Posts: 286
Karma: 7742186
Join Date: Apr 2007
Location: Idaho, USA
Device: Various PalmOS PDAs, Android Phones, Sharper Image Literati
Converting a PDF to mobi and having it come out right?

I bought "the girl on the dock" PDF and want to convert it to Mobi format.

There are a few issues with that. Every page has a background image that draws a line at the top and bottom, some pages have a separator graphic as part of the background, and some pages are just an illustration with a caption.

I want to keep the illustrations and lose the rest.

I tried running it through Calibre and it looks like every single line in the PDF is a paragraph. The output has everything double-spaced and broken sentences.

Then there are the page numbers to get rid of, plus the author's name and book title alternating at the top of every page. (That header at the page tops has always annoyed me, even with dead tree books. I'm not likely to forget what book I'm reading or who wrote it while I'm reading it.)

Finally, there's no table of contents. That should be the easiest thing to fix, since there are only 5 chapters. I might not even bother with adding one.

Is there an easy way to delete the first and last lines of every page (to remove the page numbers and the author name and book title) then remove all paragraph marks except where there's indents or a line begins with a " mark, which are standalone lines of dialog? Also need to delete all carriage returns except at the ends of each paragraph so the text can flow with different screen or font sizes.
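
The cleanup described above could be sketched roughly like this in Python. This is only a sketch under assumptions, not a tested recipe for this particular book: it assumes the PDF text has already been dumped to plain text with a form feed character between pages (pdftotext from Poppler does this by default, as far as I know), that the header and footer are always the first and last non-blank lines of a page, and that a new paragraph starts with an indent or a quotation mark. The function names are made up for illustration.

```python
# Assumption: pages are separated by form feeds ("\f") in the extracted text.
QUOTE_MARKS = ('"', "\u201c")  # straight and curly opening double quotes


def strip_page_edges(text: str) -> list[str]:
    """Drop the first and last non-blank line of every page."""
    kept: list[str] = []
    for page in text.split("\f"):
        lines = [ln for ln in page.splitlines() if ln.strip()]
        # Pages with two or fewer lines (e.g. an illustration with only
        # a caption) contribute nothing here.
        kept.extend(lines[1:-1])
    return kept


def starts_paragraph(line: str) -> bool:
    """Assumed rule: an indent or a leading quote mark opens a paragraph."""
    return line[:1].isspace() or line.lstrip().startswith(QUOTE_MARKS)


def reflow(lines: list[str]) -> list[str]:
    """Join hard-wrapped lines so each list item is one whole paragraph."""
    paragraphs: list[str] = []
    for line in lines:
        if not paragraphs or starts_paragraph(line):
            paragraphs.append(line.strip())
        else:
            paragraphs[-1] += " " + line.strip()
    return paragraphs


if __name__ == "__main__":
    sample = (
        "Author Name\n    It was a dark\nand stormy night.\n1\n"
        "\fBook Title\n    Nobody answered.\n2\n"
    )
    for paragraph in reflow(strip_page_edges(sample)):
        print(paragraph)
```

The obvious weak spot is a wrapped line in the middle of a paragraph that happens to begin with a quotation mark: it would be split off as its own paragraph, so the output still needs a read-through. Words hyphenated across line breaks are not rejoined either.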

The PDF could be a case study in "How to format a PDF in order to make it as difficult as possible to convert to another format." I suppose it'd work decently on a large tablet or reader but not on a 4.3" Android phone screen.

Last edited by bizzybody; 05-02-2013 at 07:11 AM.
Old 05-02-2013, 09:17 AM   #2
DiapDealer
Grand Sorcerer
Posts: 27,547
Karma: 193191846
Join Date: Jan 2010
Device: Nexus 7, Kindle Fire HD
PDF is a destination format, and usually a final destination. It would be hard to imagine a worse source format for conversion. There is no "poof!" The very nature of PDF precludes being "poofed" to another format. There's really nothing to be done except diving in and cleaning things up by hand. Regex can help if you're capable, but any such search & replace would have to be tailored to each individual document. There is no "regex X will do Y."

Good luck.
Old 05-03-2013, 01:17 AM   #3
Aerys
Connoisseur
Posts: 51
Karma: 29994
Join Date: Nov 2011
Location: Manila, Philippines
Device: iPad 2 & Nexus 7
Quote:
Originally Posted by bizzybody
I bought "the girl on the dock" PDF and want to convert it to Mobi format.

Is there an easy way to delete the first and last lines of every page (to remove the page numbers and the author name and book title) then remove all paragraph marks except where there's indents or a line begins with a " mark, which are standalone lines of dialog? Also need to delete all carriage returns except at the ends of each paragraph so the text can flow with different screen or font sizes.
You can crop the PDF pages to remove the FIRST and LAST lines (if they sit in the same position on every page) before loading it into Calibre for conversion. I usually do this for page numbers and headers/footers that I don't want included in the ebook.

Just a word of advice: I usually export the PDF to an HTML file first, then hand-build the EPUB in Sigil before converting it to MOBI. It makes for cleaner code if you know a bit of HTML/CSS.
Old 05-03-2013, 02:51 AM   #4
bizzybody
What software do you use for the cropping and exporting? There's quite a lot of PDF to HTML converters available.

PDF is the only electronic format in which this book is available, so that's what I'm stuck with for a source.
Old 05-05-2013, 10:53 PM   #5
Aerys
I use Adobe Acrobat.
Old 05-06-2013, 06:09 AM   #6
DiapDealer
Acrobat is just about the only software that will allow you to "really" crop PDFs. The scads of other cropping utilities only keep the cropped portions from displaying, meaning all the info is still part of the PDF and comes back to haunt you when you try to convert.
Old 05-21-2013, 05:20 PM   #7
Marok
Member
Posts: 21
Karma: 12376
Join Date: Oct 2010
Device: kindle G3
The program PDFtoEpub is handy for cropping headers and footers.
Then use Sigil to tidy up the epub, and convert to mobi using Calibre.
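
For the last step, Calibre ships a command-line tool called ebook-convert that picks the input and output formats from the file extensions. A minimal sketch of scripting it, assuming Calibre is installed and ebook-convert is on the PATH (the helper names and file names here are made up for illustration):

```python
import shutil
import subprocess
from pathlib import Path


def build_convert_command(source: Path, target: Path) -> list[str]:
    """Build the ebook-convert call; formats follow the file extensions."""
    return ["ebook-convert", str(source), str(target)]


def convert(source: Path, target: Path) -> None:
    """Run the conversion, failing loudly if Calibre's CLI is missing."""
    if shutil.which("ebook-convert") is None:
        raise FileNotFoundError("ebook-convert not found on PATH; is Calibre installed?")
    subprocess.run(build_convert_command(source, target), check=True)


if __name__ == "__main__":
    # Hypothetical file names; point these at the epub tidied up in Sigil.
    convert(Path("book.epub"), Path("book.mobi"))
```

This only wraps the plain two-argument call. Any extra conversion options would need to be checked against the Calibre documentation for the version you have.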
Old 08-12-2014, 02:20 PM   #8
PHC
Member
Posts: 21
Karma: 15000
Join Date: Feb 2014
Device: iPhone, iPad, Macbook Pro, Mac Pro
I have this same damned PDF and converted it to epub: Converting PDF to epub using Acrobat and Calibre CLI.

It involves a few steps and the result is far from perfect. The PDF pagination is still there and splits the paragraph at the page break. I could further edit it in Sigil but THIS book is not worth the bother.

I converted it to mobi but the images are gone.

All times are GMT -4.

