Old 05-16-2012, 09:39 AM   #1
pocketsprocket
Junior Member
Posts: 2
Karma: 10
Join Date: May 2012
Device: Kindle
PDF to Mobi with text and images

I am new to ebooks. I got a Kindle 4, which is awesome because I can view my PDF textbooks, search them, and take notes. However, searching and note-taking are very impractical with the soft keyboard, so I obtained a Kindle 2 with the hardware keyboard. Unfortunately, that Kindle does not allow notes on PDFs. So I am looking at converting my PDFs to Mobi, and I need some kind of PDF conversion anyway so I don't have to constantly scroll left and right to read the book.

My first trial of the Calibre PDF to Mobi conversion simply removed all the images from the PDF. I have also tried k2pdfopt with varying degrees of success, but neither of these methods had a table of contents in the resulting output (the original PDF does have a TOC).

Copying the entire contents of the PDF and pasting them into a word processor does not copy the images, which I need because it's a math book.

What can I do to convert a PDF to Mobi, KEEPING the images where they were in the PDF, and KEEPING the TOC? I think I can get k2pdfopt to work enough to reformat the PDF for viewing on the Kindle screen.

If there are guides that describe how to do exactly what I need, please point me to them. I have already done more than a bit of research and haven't found what I need and I am out of time for the moment. Thank you!
Old 05-19-2012, 02:14 AM   #2
Hitch
Bookmaker & Cat Slave
Posts: 11,447
Karma: 157030631
Join Date: Apr 2010
Location: Phoenix, AZ
Device: K2, iPad, KFire, PPW, Voyage, NookColor. 2 Droid, Oasis, Boox Note2
Well, I hate to be the one to tell you this, but there's no magic PDF-->MOBI bullet, no software that does it automatically (no matter what they claim), etc. Whatever you do will require a LOT of hand-fixing.

You have several choices:
  1. You can export the entire PDF to html (or to Word) from within Adobe Acrobat Pro X. This will produce a mess, but it's a mess you can work with, and all the images will export. You'll have to open the result either in Word or a good html editor (I use NoteTabPro) and clean up all the junk that Acrobat will export.
  2. You can export the images, again using AA Pro X, and then copy-and-paste all the text, as you already mentioned, and insert the images where needed in your (I presume) Word file, and then try to convert THAT with Calibre, since that seems to be the weapon of choice here, or,
  3. You can try exporting it as XML (again AA Pro X) and use regex to turn that into usable HTML, and make THAT into a MOBI.
  4. Lastly, you can do what the vast majority of companies actually do, and scan/OCR the entire PDF, output the OCR'd mess into HTML or Word, and convert THAT, after going through and cleaning it up.
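
To make option 3 a bit more concrete, here is a minimal Python sketch of the "regex the XML into HTML" idea. The element names (`Sect`, `H1`, `H2`, `P`, `ImageData`) are assumptions chosen for illustration, not a documented export schema, so check what your own export actually contains and adjust the mapping. Real files will need many more rules than this.

```python
import re

# Hypothetical tag mapping: these element names are assumptions about what
# an exported XML file might contain, not a documented schema.
TAG_MAP = {"Sect": "div", "H1": "h1", "H2": "h2", "P": "p"}


def xml_to_html(xml_text: str) -> str:
    """Rename known tags to HTML equivalents and turn image tags into <img>."""
    html = xml_text
    for src, dst in TAG_MAP.items():
        # Opening tags: drop any attributes the export attached to them.
        html = re.sub(rf"<{src}(\s[^>]*)?>", f"<{dst}>", html)
        # Closing tags: a plain rename.
        html = re.sub(rf"</{src}>", f"</{dst}>", html)
    # Hypothetical image element: <ImageData src="..."/> becomes <img .../>.
    html = re.sub(
        r'<ImageData\s+src="([^"]+)"\s*/>',
        r'<img src="\1" alt=""/>',
        html,
    )
    return html


sample = '<Sect><H1 id="x">Limits</H1><P>Text</P><ImageData src="images/f1.png"/></Sect>'
print(xml_to_html(sample))
```

The images stay where the export put them in the text flow, which is the point of going through XML rather than copy-and-paste. Anything the mapping does not know about is left untouched, so you can see what still needs a rule.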

Yes, I know--none of that sounds like fun, but with any type of PDF that is anything other than plain narrative paragraphical text, those are pretty much your options. We get "former-PDF's" in here ALL the time, from people who used some type of online "converter" or some $19.95 "magic converter" that they bought, and the output is utter garbage. Some files, you can get decent xml and use regex and scripts to turn it into viable html, if you have AA Pro X, but the bottom line is: PDF and ePUB/MOBI are inimical. Night and Day, two completely different TYPES of formats.

You should also be prepared, if you don't have access to something like ABBYY FineReader or Acrobat, to go through and remove all the verso and recto page headers and page footers by hand.
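
If the text has already been extracted page by page, part of that header and footer cleanup can be scripted. The sketch below is one possible approach, not a feature of any of the tools named above: it assumes each page is a list of text lines, treats the first and last line of each page as header/footer candidates, and drops the ones that recur across pages once digits are masked. The function names and the `min_share` threshold are made up for illustration.

```python
import re
from collections import Counter


def normalize(line: str) -> str:
    """Collapse whitespace and mask digits so 'Page 12' and 'Page 13' compare equal."""
    return re.sub(r"\d+", "#", " ".join(line.split()))


def strip_running_lines(pages: list[list[str]], min_share: float = 0.3) -> list[list[str]]:
    """Drop first/last lines that recur on at least min_share of the pages."""
    counts = Counter()
    for page in pages:
        if page:
            # A set keeps a one-line page from being counted twice.
            for edge in {normalize(page[0]), normalize(page[-1])}:
                counts[edge] += 1
    # Verso and recto headers alternate, so each shows up on about half the pages.
    threshold = max(2, int(len(pages) * min_share))
    repeated = {text for text, n in counts.items() if n >= threshold}

    cleaned = []
    for page in pages:
        body = list(page)
        if body and normalize(body[0]) in repeated:
            body.pop(0)
        if body and normalize(body[-1]) in repeated:
            body.pop()
        cleaned.append(body)
    return cleaned


pages = [
    ["CALCULUS", "Limits are...", "12"],
    ["Chapter 1", "More text", "13"],
    ["CALCULUS", "Even more", "14"],
    ["Chapter 1", "Last bit", "15"],
]
print(strip_running_lines(pages))
```

Masking digits is what lets bare page numbers match each other, but it also means a page that happens to begin or end with a numbered line could lose it, so the result still needs proofing.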

Sorry to sound like a downer--but honestly, PDF is the hardest format (okay--maybe LaTeX is harder) to convert to real eBook formats (ePUB, MOBI).

Hope this helped somewhat,
Hitch
Old 05-19-2012, 04:48 AM   #3
pocketsprocket
Junior Member
Posts: 2
Karma: 10
Join Date: May 2012
Device: Kindle
Thanks for your reply. I was afraid of something like this. Some of the programs I've tried have come very close, but scrolling through the book I always find some graph or image that's totally messed up. Because it's a math book, it's very important that everything be in precisely the right place. I can't have question 3 in front of where question 4 goes, and tons of different images have to line up right. It's a 1200+ page textbook and there are probably 10,000 images I'd have to go through and correct, which is utterly impractical. I've OCRd it and tried some of the methods you suggested but I guess this is just not going to happen for me. Thanks again for your time.
Old 05-19-2012, 05:55 AM   #4
HarryT
eBook Enthusiast
Posts: 85,544
Karma: 93383043
Join Date: Nov 2006
Location: UK
Device: Kindle Oasis 2, iPad Pro 10.5", iPhone 6
This is precisely why something like an iPad or an equivalent Android tablet is generally recommended for textbooks: they can display the PDF well, and correctly. It just can't be done on a 6" device.
Old 05-19-2012, 06:47 AM   #5
amgoforth
Groupie
Posts: 195
Karma: 3142469
Join Date: Oct 2007
Location: Odessa, Texas
Device: 2 Kindles, 2 Nooks, 2 Kobos, Ipad.
That is very true, and even on a 10-inch tablet, text PDFs don't display very well; reflowing the text usually doesn't work so well either. Scanned PDFs work really well, so at this point I am trying to find a way to convert text PDFs to image-based PDFs without the file becoming too large. Does anyone here know how to do that?
Old 05-19-2012, 08:40 PM   #6
DiapDealer
Grand Sorcerer
Posts: 27,465
Karma: 192992430
Join Date: Jan 2010
Device: Nexus 7, Kindle Fire HD
I've been having decent luck by:

1) Cropping with Acrobat 10 Pro (other croppers only hide headers and footers so they can come back and bite you in the ass later... Acrobat gets rid of them).

2) PDF->HTML with ABBYY FineReader 11 (manually massaging as much as possible before saving HTML).

3) Regex the poop out of that html file in Notepad++ or EditPad-Lite (it's never clear exactly what kind of reg-exps you'll need until you start wading into each file) to get it as clean as possible. Mark/create chapter headers and chapter-points so Sigil can coast later on (it can get cranky, so if I can spare it a bit of grinding... I always try to). Get a basic external stylesheet going and link it in.

3) Open in Sigil. Finalize/tweak CSS and formatting. Split at predefined chapter-markers. Generate/tweak ToC (ncx).... create an HTML ToC if necessary.

4) Proof my eyes bloody.

5) Convert to MOBI with kindlegen.

6) Apply Visine to eyeballs. See if the sun's up yet. If not, crack a beer—if so, put coffee on brew.
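
As a rough illustration of the regex cleanup and chapter-marking step, here is a minimal Python sketch. It is not DiapDealer's actual rule set (as noted above, the expressions you need only become clear once you are in each file). The patterns, the "Chapter N" heading convention, and the split-marker string are all assumptions; in particular, check which marker your Sigil version treats as a split point before relying on `SPLIT_MARKER`.

```python
import re

# Assumption: this is the marker your Sigil version splits on; verify it first.
SPLIT_MARKER = '<hr class="sigil_split_marker" />'


def clean_html(html: str, marker: str = SPLIT_MARKER) -> str:
    """Strip common export clutter and mark chapter headings for later splitting."""
    # Remove inline style attributes left by the exporter.
    html = re.sub(r'\s+style="[^"]*"', "", html)
    # Unwrap attribute-less spans that contain only text (no nested tags).
    html = re.sub(r"<span>([^<]*)</span>", r"\1", html)
    # Drop empty paragraphs, including ones holding only a non-breaking space.
    html = re.sub(r"<p>\s*(&nbsp;)?\s*</p>\s*", "", html)
    # Promote "Chapter N ..." paragraphs to <h1> and put a split marker in front.
    html = re.sub(
        r"<p>\s*(Chapter\s+\d+[^<]*)</p>",
        lambda m: f"{marker}\n<h1>{m.group(1).strip()}</h1>",
        html,
    )
    return html


raw = (
    '<p style="margin:0"><span style="font-size:9pt">Chapter 3 Derivatives</span></p>'
    '<p> </p><p style="x">Body text</p>'
)
print(clean_html(raw))
```

Moving the styling out of inline attributes is what makes the external stylesheet mentioned above practical, and real headings give Sigil something to build the ToC from. Each rule is deliberately narrow; it is safer to add patterns one at a time and diff the result than to write one greedy expression.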
Old 05-20-2012, 03:59 AM   #7
Hitch
Bookmaker & Cat Slave
Posts: 11,447
Karma: 157030631
Join Date: Apr 2010
Location: Phoenix, AZ
Device: K2, iPad, KFire, PPW, Voyage, NookColor. 2 Droid, Oasis, Boox Note2
As I said to Diap in a Karma, yup--those are pretty much our procedures. Either 1 or 2, and then 3-6, inclusive. That's the only way we have found that is remotely viable (some PDF files really DO export rather loverly in xml, Diap--it's worth a try if you have something relatively simple, FWIW.)

Hitch
Old 05-21-2012, 07:06 AM   #8
delta007
Junior Member
Posts: 2
Karma: 10
Join Date: Mar 2012
Device: kindle touch
I am also trying to follow the procedure suggested above, with some success.
I have to convert my medical books to Mobi from either PDF or CHM format.
I have encountered two problems.

1. The images sometimes become so small that they are difficult to see.
2. Sometimes the problem lies with the text. After conversion to Mobi, the text does not fit the screen well (even after zooming in and out), so some of it disappears.
Can somebody help me? How can I increase the size of images while converting, and how best can I adjust the text? I am using a Kindle Touch.

Thanks.

Last edited by delta007; 05-21-2012 at 07:08 AM.