libprs500 relies on pdftohtml to first convert the PDF to HTML, and then processes the result. I'm afraid that pdftohtml already works as well as Adobe's own PDF to HTML conversion, so I doubt it can be improved much further without considerable effort, but you're welcome to try.
I'd recommend Eclipse + PyDev, since you're coming from the Java world.