Old 05-02-2013, 07:06 AM   #1
bizzybody
Groupie
Posts: 173
Karma: 1437622
Join Date: Apr 2007
Location: Idaho, USA
Device: Photon Q, LifeDrive, Tungsten E2
Converting a PDF to mobi and having it come out right?

I bought "the girl on the dock" PDF and want to convert it to Mobi format.

There are a few issues with that. Every page has a background image that draws a line at the top and bottom, some pages have a separator graphic as part of the background, and some pages are just an illustration with a caption.

I want to keep the illustrations and lose the rest.

I tried running it through Calibre and it looks like every single line in the PDF is a paragraph. The output has everything double-spaced and broken sentences.

Then there's the page numbers to get rid of and the author's name and book title alternating at the top of every page. (That at the page tops has always annoyed me, even with dead tree books. I'm not likely to forget what book I'm reading or who wrote it while I'm reading it.)

Finally, there's no table of contents. That should be the easiest thing to fix. There are only 5 chapters, so I might not even bother adding one.

Is there an easy way to delete the first and last lines of every page (to remove the page numbers, author name and book title), then remove all paragraph marks except where a line is indented or begins with a " mark (those are standalone lines of dialog)? I also need to delete all carriage returns except at the ends of paragraphs so the text can reflow at different screen or font sizes.
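
A rough sketch of that cleanup in Python, assuming the PDF text has already been dumped to plain text with a form feed (\f) between pages, which is how pdftotext writes its output. The function names are made up for illustration, and the rules are only the ones described above: drop the first and last non-blank line of each page, then start a new paragraph only at an indented line or a line that opens with a quote mark.

```python
def strip_page_edges(text):
    """Drop the first and last non-blank line of every page.

    Assumes pages are separated by form feeds. Pages with two or
    fewer non-blank lines (e.g. an illustration plus caption) end up
    empty, so those would need separate handling.
    """
    kept = []
    for page in text.split("\f"):
        lines = [ln for ln in page.splitlines() if ln.strip()]
        kept.extend(lines[1:-1])
    return kept


def reflow(lines):
    """Join lines into paragraphs.

    A new paragraph starts at an indented line or at a line that
    begins with a straight or curly opening quote; every other line
    is appended to the paragraph before it.
    """
    paragraphs = []
    for ln in lines:
        indented = ln[:1].isspace()
        dialog = ln.lstrip().startswith(('"', "\u201c"))
        if indented or dialog or not paragraphs:
            paragraphs.append(ln.strip())
        else:
            paragraphs[-1] += " " + ln.strip()
    return paragraphs


if __name__ == "__main__":
    sample = (
        "Author Name\n  It was a dark night and\nthe rain fell.\n"
        '"Who is there?"\n1\n'
        "\fBook Title\nshe asked, and nobody\nanswered.\n  Morning came.\n2\n"
    )
    for para in reflow(strip_page_edges(sample)):
        print(para)
```

Because the lines from all pages are collected before reflowing, a paragraph that runs across a page break is joined back together. Words hyphenated at a line end are not repaired, and a wrapped line that happens to begin with a quote mark would be split off wrongly, so the result still needs a read-through.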

The PDF could be a case study in "How to format a PDF in order to make it as difficult as possible to convert to another format." I suppose it'd work decently on a large tablet or reader but not on a 4.3" Android phone screen.

Last edited by bizzybody; 05-02-2013 at 07:11 AM.
Old 05-02-2013, 09:17 AM   #2
DiapDealer
Grand Sorcerer
Posts: 9,251
Karma: 42123822
Join Date: Jan 2010
Device: Nexus 7, Kindle Fire HD
PDF is a destination format, and usually a final destination. It would be hard to imagine a worse source format for conversion. There is no "poof!" The very nature of PDF precludes being "poofed" to another format. There's really nothing to be done except dive in and clean things up by hand. Regex can help if you're capable, but any such search & replace would have to be tailored to each individual document. There is no "regex X will do Y."
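
To illustrate what "tailored to each individual document" can look like, here is a minimal Python sketch that removes running heads and page numbers by content instead of by position. It assumes the text is already extracted to plain text, that page numbers sit alone on a line, and that the running heads are known strings; the function name and the sample head strings are placeholders, not taken from the actual book.

```python
import re


def strip_running_heads(text, heads):
    """Remove page-number lines and known running-head lines.

    `heads` is a list of literal strings (author name, book title)
    that appear alone on a line at the top of pages.
    """
    # Lines that contain nothing but digits are treated as page numbers.
    text = re.sub(r"^[ \t]*\d+[ \t]*\n", "", text, flags=re.MULTILINE)
    for head in heads:
        # re.escape keeps punctuation in the title from acting as regex syntax.
        pattern = r"^[ \t]*" + re.escape(head) + r"[ \t]*\n"
        text = re.sub(pattern, "", text, flags=re.MULTILINE | re.IGNORECASE)
    return text


if __name__ == "__main__":
    sample = (
        "Jane Author\nThe storm broke at 10\nthat night.\n12\n"
        "The Book Title\nShe ran.\n13\n"
    )
    print(strip_running_heads(sample, ["Jane Author", "The Book Title"]))
```

The digit-only rule would also delete a chapter number that stands alone on its own line, which is exactly the kind of per-document quirk that has to be checked by hand.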

Good luck.
Old 05-03-2013, 01:17 AM   #3
Aerys
Connoisseur
Posts: 51
Karma: 29994
Join Date: Nov 2011
Location: Manila, Philippines
Device: iPad 2 & Nexus 7
Quote:
Originally Posted by bizzybody View Post
I bought "the girl on the dock" PDF and want to convert it to Mobi format.

Is there an easy way to delete the first and last lines of every page (to remove the page numbers and the author name and book title) then remove all paragraph marks except where there's indents or a line begins with a " mark, which are standalone lines of dialog? Also need to delete all carriage returns except at the ends of each paragraph so the text can flow with different screen or font sizes.
You can crop the PDF pages to remove the first and last lines (if they sit in the same position on every page) before loading the file into Calibre for conversion. I usually do this for page numbers and headers/footers that I don't want included in the ebook.

Just a word of advice: I usually export the PDF to an HTML file first, then hand-build the EPUB in Sigil before converting it to MOBI. It makes for cleaner code if you know a bit of HTML/CSS.
Old 05-03-2013, 02:51 AM   #4
bizzybody
Groupie
What software do you use for the cropping and exporting? There's quite a lot of PDF to HTML converters available.

PDF is the only electronic format in which this book is available, so that's what I'm stuck with for a source.
Old 05-05-2013, 10:53 PM   #5
Aerys
Connoisseur
I use Adobe Acrobat.
Old 05-06-2013, 06:09 AM   #6
DiapDealer
Grand Sorcerer
Acrobat is just about the only software that will let you "really" crop PDFs. The scads of other cropping utilities only keep the cropped portions from displaying, meaning all the info is still part of the PDF and comes back to haunt you when you try to convert.
Old 05-21-2013, 05:20 PM   #7
Marok
Member
Marok began at the beginning.
 
Posts: 17
Karma: 10
Join Date: Oct 2010
Device: kindle G3
The program PDFtoEpub is handy for cropping headers and footers.
Then use Sigil to tidy up the epub, and convert to mobi using Calibre.
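
For the last step, Calibre ships a command-line tool, ebook-convert, that takes an input file and an output file and picks the formats from the extensions. A small Python wrapper might look like the sketch below; the function names and the book.epub path are placeholders, and it assumes Calibre is installed with ebook-convert on the PATH.

```python
import shutil
import subprocess
from pathlib import Path


def build_convert_command(epub_path):
    """Return the ebook-convert command for EPUB -> MOBI.

    The output file sits next to the input with a .mobi extension.
    """
    src = Path(epub_path)
    return ["ebook-convert", str(src), str(src.with_suffix(".mobi"))]


def convert(epub_path):
    """Run the conversion, failing loudly if Calibre's CLI is missing."""
    if shutil.which("ebook-convert") is None:
        raise RuntimeError("ebook-convert not found on PATH; is Calibre installed?")
    subprocess.run(build_convert_command(epub_path), check=True)


if __name__ == "__main__":
    # Only prints the command; call convert("book.epub") to actually run it.
    print(" ".join(build_convert_command("book.epub")))
```

Keeping the command builder separate from the call that runs it makes it easy to check what will be executed before touching any files.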
Old 08-12-2014, 02:20 PM   #8
PHC
Junior Member
Posts: 7
Karma: 1000
Join Date: Feb 2014
Device: iPhone, iPad
I have this same damned PDF and converted it to epub: "Converting PDF to epub using Acrobat and Calibre CLI".

It involves a few steps and the result is far from perfect. The PDF pagination is still there and splits the paragraph at the page break. I could further edit it in Sigil but THIS book is not worth the bother.

I converted it to mobi but the images are gone.