View Single Post
Old 05-28-2012, 09:07 AM   #225
kovidgoyal
creator of calibre
kovidgoyal ought to be getting tired of karma fortunes by now.kovidgoyal ought to be getting tired of karma fortunes by now.kovidgoyal ought to be getting tired of karma fortunes by now.kovidgoyal ought to be getting tired of karma fortunes by now.kovidgoyal ought to be getting tired of karma fortunes by now.kovidgoyal ought to be getting tired of karma fortunes by now.kovidgoyal ought to be getting tired of karma fortunes by now.kovidgoyal ought to be getting tired of karma fortunes by now.kovidgoyal ought to be getting tired of karma fortunes by now.kovidgoyal ought to be getting tired of karma fortunes by now.kovidgoyal ought to be getting tired of karma fortunes by now.
 
kovidgoyal's Avatar
 
Posts: 45,455
Karma: 27757438
Join Date: Oct 2006
Location: Mumbai, India
Device: Various
It's not a priority for me, the parts of the poppler api that pdfreflow uses are not stable, they change with pretty much every poppler 0.x release, which makes maintaining them a pain. I am switching the new pdf engine to use pdftohtml -xml which produces the same kind of output as pdfreflow, the upside being that I no longer have to maintain pdfreflow's C++ code. The downside, from your perspective, is that pdftohtml does not support specifying a pdf page range for conversion. You have four choices:

1) Maintain pdfreflow yourself, i'm happy to accept patches.

2) Ask the poppler people to implement page ranges for pdftohtml

3) Use another pdf library (calibre has both podofo and pypdf) to first extract the relevant pages and then run pdftohtml on them.

4) Live with the reduced performance
kovidgoyal is offline   Reply With Quote