Old 10-14-2011, 06:18 PM   #13
Snorkledorf
Blue. Not sad...just blue
Posts: 218
Karma: 1267018
Join Date: Oct 2009
Location: Japan
Device: Ridibooks Paper Pro
My recent tool of choice has been PDFMasher (http://www.hardcoded.net/pdfmasher/), which converts PDF to epub and mobi by way of Markdown and then HTML. Editing the intermediate Markdown file (I use BBEdit) is simple enough that I've actually converted a half-dozen or so books to mobi, instead of procrastinating on them like I had been for ages.

While PDFMasher doesn't seem to retain bookmarks like the OP wanted, it can remove extraneous PDF elements such as headers and footers that would interfere with reflowed text. For example, you can sort all the elements it finds across all the pages at once by how high or low they sit on the page. The highest and lowest elements are likely to be page numbers, so you can select them all, mark them "Ignore", and they're gone from the output text.
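
For anyone who wants to see the idea outside the GUI, here's a rough sketch of the "sort by vertical position, ignore the extremes" trick in Python. This is not PDFMasher's code or API; the Element class, the ignore_page_edges function, the 0-to-1 position scale, and the cutoff values are all made up for illustration, and the sample text is invented.

```python
from dataclasses import dataclass


@dataclass
class Element:
    """A hypothetical text element extracted from a PDF page."""
    page: int
    y: float          # assumed scale: 0.0 = top of page, 1.0 = bottom
    text: str
    ignored: bool = False


def ignore_page_edges(elements, top=0.08, bottom=0.92):
    """Flag elements in the top/bottom bands as ignored.

    Returns the remaining elements in reading order (page, then position).
    The cutoffs are guesses; real documents need them tuned by eye.
    """
    for el in elements:
        if el.y < top or el.y > bottom:
            el.ignored = True
    kept = [el for el in elements if not el.ignored]
    return sorted(kept, key=lambda el: (el.page, el.y))


# Invented sample data: a running header and a page number on each page.
elements = [
    Element(1, 0.03, "Chapter One"),
    Element(1, 0.40, "It was a dark and stormy night."),
    Element(1, 0.97, "1"),
    Element(2, 0.03, "Chapter One"),
    Element(2, 0.50, "The rain fell in torrents."),
    Element(2, 0.97, "2"),
]

body = "\n\n".join(el.text for el in ignore_page_edges(elements))
print(body)
```

In practice a fixed cutoff will catch some real text too (a chapter title set high on the page, say), which is why being able to eyeball the sorted list and pick what to ignore by hand is the useful part.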

It still takes a lot of massaging to get clean output, but it's progress...