I've been looking for an easy way to convert PDFs. Until now I was using a pdf2html program and post-processing its output, with mixed results. For the curious, this is what I used to convert some PDFs so they are nice to read on the Iliad (11cmx15cm, etc.):
pdftohtml ( http://pdftohtml.sourceforge.net ), some ad-hoc scripts, tidy ( http://tidy.sourceforge.net/ ), gnuhtml2latex ( http://packages.debian.org/unstable/text/gnuhtml2latex ) and lyx ( http://www.lyx.org ). The results are acceptable but it's a lengthy process (about an hour for each book, mostly to adapt the ad-hoc scripts so they join lines correctly and detect chapter headings).
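To give an idea of what the "join lines and detect chapter headings" step involves, here is a minimal Python sketch. It is not the actual scripts used above: the function names and the heading pattern ("Chapter" followed by a number) are assumptions made up for illustration, and a real book would almost certainly need the pattern adapted, which is where the hour per book goes.

```python
import re

# Hypothetical heading pattern (an assumption): "Chapter 3", "CHAPTER IV", ...
# Real books differ, so this usually has to be adapted per book.
HEADING_RE = re.compile(r"^chapter\s+(\d+|[ivxlc]+)\b", re.IGNORECASE)


def is_heading(line):
    """Return True if the line looks like a chapter heading."""
    return bool(HEADING_RE.match(line.strip()))


def join_lines(lines):
    """Merge hard-wrapped lines into paragraphs.

    Returns a list of (kind, text) tuples, where kind is "heading" or "para".
    A blank line or a heading ends the current paragraph.
    """
    blocks = []
    current = ""
    for raw in lines:
        line = raw.strip()
        if not line or is_heading(line):
            # Close the paragraph collected so far, if any.
            if current:
                blocks.append(("para", current))
                current = ""
            if line:
                blocks.append(("heading", line))
            continue
        if current.endswith("-"):
            # Rejoin a word hyphenated at the line break. This is naive:
            # it also removes hyphens that really belong ("well-known").
            current = current[:-1] + line
        elif current:
            current += " " + line
        else:
            current = line
    if current:
        blocks.append(("para", current))
    return blocks


if __name__ == "__main__":
    sample = [
        "Chapter 1",
        "This is a line that was bro-",
        "ken across lines by the",
        "converter.",
        "",
        "Second paragraph.",
    ]
    for kind, text in join_lines(sample):
        print(kind, "|", text)
```

The output of join_lines could then be written out as HTML or LaTeX headings and paragraphs before the tidy / gnuhtml2latex steps; how exactly depends on the rest of the pipeline.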
I've found an alternative: a plug-in for Abiword (a lean and portable word processor) that imports PDFs using some heuristics (and the heuristics seem to be well chosen, so as to be generally applicable). It supports styles, multiple columns, etc.
It's incredible. As an example, the author posts some images from before importing (PDF) and after (Abiword); see the attached images.
For a description of what it does:
http://www.abisource.com/twiki/bin/v...luginWithStyle
To download the sources of the pdf import plug-in and try it:
http://jauco.nl/blog/
Caution: I've just found it, so I have not tested it yet. As I have some spare time I'll try it ;-).
Tell me what you think about it ;-).