05-19-2012, 02:14 AM   #2
Hitch
Bookmaker & Cat Slave
Posts: 11,462
Karma: 158448243
Join Date: Apr 2010
Location: Phoenix, AZ
Device: K2, iPad, KFire, PPW, Voyage, NookColor. 2 Droid, Oasis, Boox Note2
Quote:
Originally Posted by pocketsprocket
I am new to ebooks. I got a Kindle 4 and this is awesome because I can view my PDF textbooks, search them, and take notes. However, searching and notes is very impractical with the soft keyboard, and I was able to obtain a Kindle 2 with the hardware keyboard. Unfortunately, this Kindle does not allow notes on PDFs. So I am looking at converting my PDFs to Mobi, and I need some kind of PDF conversion anyway so I don't have to constantly scroll left and right to read the book.

My first trial of the Calibre PDF to Mobi conversion simply removed all the images from the PDF. I have also tried k2pdfopt with varying degrees of success, but neither of these methods had a table of contents in the resulting output (the original PDF does have a TOC).

Copying the entire contents of the PDF and pasting into a Word processor does not copy the images, which I need because it's a math book.

What can I do to convert a PDF to Mobi, KEEPING the images where they were in the PDF, and KEEPING the TOC? I think I can get k2pdfopt to work enough to reformat the PDF for viewing on the Kindle screen.

If there are guides that describe how to do exactly what I need, please point me to them. I have already done more than a bit of research and haven't found what I need and I am out of time for the moment. Thank you!
Well, I hate to be the one to tell you this, but there's no magic PDF-to-MOBI bullet: no software does it automatically, no matter what the vendors claim. Whatever you do will require a LOT of hand-fixing.

You have several choices:
  1. You can export the entire PDF to HTML (or to Word) from within Adobe Acrobat Pro X (AA Pro X). This will produce a mess, but it's a mess you can work with, and all the images will export. You'll have to open the result in either Word or a good HTML editor (I use NoteTabPro) and clean up all the junk that Acrobat exports.
  2. You can export the images, again using AA Pro X, then copy-and-paste all the text, as you already mentioned, and insert the images where needed in your (I presume) Word file. Then try to convert THAT with Calibre, since that seems to be the weapon of choice here.
  3. You can try exporting it as XML (again from AA Pro X), use regex to turn that into usable HTML, and make THAT into a MOBI.
  4. Lastly, you can do what the vast majority of companies actually do: scan/OCR the entire PDF, output the OCR'd mess into HTML or Word, clean it up, and convert THAT.
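
For option 3, the regex pass might look something like this rough Python sketch. The tag names here (Sect, H1, P, Figure, ImageData) and the function name are assumptions for illustration only; look at what your own export actually contains and adjust the rules to match. Expect to add many more rules, and to hand-fix whatever they miss.

```python
import re

# Hypothetical tag names: check them against your own XML export.
# Each rule is (compiled pattern, replacement); rules run in order.
RULES = [
    (re.compile(r"<(/?)Sect\b[^>]*>"), r"<\1div>"),       # sections become divs
    (re.compile(r"<(/?)H([1-6])\b[^>]*>"), r"<\1h\2>"),   # H1..H6 become h1..h6
    (re.compile(r"<(/?)P\b[^>]*>"), r"<\1p>"),            # paragraphs
    (re.compile(r'<ImageData\b[^>]*\bsrc="([^"]+)"[^>]*/>'),
     r'<img src="\1" alt=""/>'),                          # keep the image reference
    (re.compile(r"</?Figure\b[^>]*>"), ""),               # drop the figure wrapper
]

def xml_to_html(xml: str) -> str:
    """Apply every rule in turn and return the rewritten markup."""
    for pattern, replacement in RULES:
        xml = pattern.sub(replacement, xml)
    return xml

sample = ('<Sect><H1 id="a">Limits</H1><P>Text</P>'
          '<Figure><ImageData src="images/fig1.png"/></Figure></Sect>')
print(xml_to_html(sample))
```

Regex is fine for this kind of one-off cleanup where you inspect the output yourself; it is not a general XML parser and will trip over nesting or attributes you didn't anticipate.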

Yes, I know, none of that sounds like fun, but with any PDF that is anything other than plain narrative text in paragraphs, those are pretty much your options. We get "former PDFs" in here ALL the time, from people who used some type of online "converter" or some $19.95 "magic converter" that they bought, and the output is utter garbage. With some files, if you have AA Pro X, you can get decent XML and use regex and scripts to turn it into viable HTML, but the bottom line is: PDF and ePUB/MOBI are inimical. Night and day, two completely different TYPES of formats.

You should also be prepared, if you don't have access to something like ABBYY FineReader or Acrobat, to go through the whole thing and remove all the verso and recto (left- and right-hand) page headers and page footers by hand.
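
If you can get the text out one page at a time, a small script can take some of the sting out of the header/footer cleanup. This is only a sketch under that assumption: the function names and the 30% threshold are made up for illustration, and it looks only at the first and last non-blank line of each page, so check the result before trusting it.

```python
import re
from collections import Counter

def _key(line):
    # Normalise digits so "Page 12" and "Page 13" count as the same line.
    return re.sub(r"\d+", "#", line.strip())

def strip_running_lines(pages, min_share=0.3):
    """pages: list of page texts. Drops a page's first/last line when that
    line (digits aside) opens or closes at least min_share of all pages."""
    split = [[ln for ln in p.splitlines() if ln.strip()] for p in pages]
    counts = Counter()
    for lines in split:
        # Count each edge line once per page.
        counts.update({_key(ln) for ln in lines[:1] + lines[-1:]})
    threshold = max(2, int(len(pages) * min_share))
    repeated = {k for k, n in counts.items() if n >= threshold}
    cleaned = []
    for lines in split:
        if lines and _key(lines[0]) in repeated:
            lines = lines[1:]
        if lines and _key(lines[-1]) in repeated:
            lines = lines[:-1]
        cleaned.append("\n".join(lines))
    return cleaned

pages = [
    "Chapter 1 Limits\nA limit describes behaviour near a point.\n1",
    "Calculus for Beginners\nMore text.\n2",
    "Chapter 1 Limits\nEven more.\n3",
    "Calculus for Beginners\nLast text.\n4",
]
print(strip_running_lines(pages))
```

The threshold is set below one half on purpose: verso and recto headers usually alternate, so each one shows up on only about half the pages.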

Sorry to sound like a downer, but honestly, PDF is the hardest format (okay, maybe LaTeX is harder) to convert to real eBook formats (ePUB, MOBI).

Hope this helped somewhat,
Hitch