01-18-2011, 02:18 PM   #7
heddhunter
Junior Member
Join Date: Jan 2011
Device: Kindle 3
Well, I've come up with a solution that works for my particular situation, but I'm pretty sure it won't work for any random PDF. I wrote a Perl script that takes pdftohtml's XML output and rewrites it as HTML. The XML is fairly easy to clean up, and I use a few simple rules to detect paragraph breaks. I then load the HTML output into Calibre and let Calibre do its normal conversion to get the final book onto my Kindle. It's not a simple drag-and-drop procedure, though. And, as I say, I don't think it will work generically.
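
For illustration only, here is a minimal Python sketch of the kind of cleanup described above. It is not the Perl script from this post, and the paragraph rules in it are guesses: a new paragraph starts when the vertical gap to the previous line is large or when the line is indented relative to the previous one. The function name `xml_to_html` and the thresholds `gap_factor` and `indent_px` are hypothetical, and the input is assumed to come from `pdftohtml -xml`, which emits `<text>` elements with `top`, `left`, and `height` attributes inside each `<page>`.

```python
import html
import xml.etree.ElementTree as ET


def xml_to_html(xml_text, gap_factor=1.5, indent_px=10):
    """Turn pdftohtml -xml output into simple <p>-wrapped HTML.

    The paragraph rules are assumptions made for this sketch, not the
    rules used by the original Perl script.
    """
    root = ET.fromstring(xml_text)
    paragraphs = []
    for page in root.iter("page"):
        current = []  # lines collected for the paragraph in progress
        prev_top = prev_left = None
        for node in page.iter("text"):
            # itertext() also picks up text inside inline tags like <i> or <b>
            line = "".join(node.itertext()).strip()
            if not line:
                continue
            top = int(node.get("top"))
            left = int(node.get("left"))
            height = int(node.get("height"))
            if current:
                # Assumed rule 1: an unusually large vertical gap
                big_gap = top - prev_top > gap_factor * height
                # Assumed rule 2: this line starts further right than the last
                indented = left - prev_left >= indent_px
                if big_gap or indented:
                    paragraphs.append(" ".join(current))
                    current = []
            current.append(line)
            prev_top, prev_left = top, left
        # Simplification: a page break always ends the current paragraph
        if current:
            paragraphs.append(" ".join(current))
    body = "\n".join(f"<p>{html.escape(p)}</p>" for p in paragraphs)
    return f"<html><body>\n{body}\n</body></html>"


# Made-up sample input in the shape pdftohtml -xml produces
SAMPLE = """<pdf2xml>
<page number="1" position="absolute" top="0" left="0" height="1188" width="918">
<text top="100" left="120" width="400" height="15" font="0">First line of the</text>
<text top="118" left="108" width="400" height="15" font="0">first paragraph.</text>
<text top="136" left="120" width="400" height="15" font="0">Second <i>paragraph</i> here.</text>
</page>
</pdf2xml>"""

print(xml_to_html(SAMPLE))
```

The resulting HTML file can then be added to Calibre like any other book. Real PDFs will likely need different thresholds, and this sketch does not handle paragraphs that continue across a page break, words hyphenated at line ends, or multi-column layouts.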

I guess if people have specific PDFs they want me to take a look at, I could do that and see if there's a way to make the conversion procedure somewhat simpler.