Thread: PDF line unwrap
05-26-2010, 05:35 PM   #16
kovidgoyal
creator of calibre
The pdfreflow.so module is a C module that takes a PDF and returns XML. The XML is not quite raw PDF draw commands (the C code does a little bit of cleanup/consolidation first).

The calibre.ebooks.pdf.reflow Python module then takes that XML file and tries to "reflow" it (i.e. do things like line unwrap analysis, identifying structure and so on).

So the best place for you to do hacking is in calibre.ebooks.pdf.reflow.
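
To give a rough idea of what "unwrap analysis" on that kind of XML could look like, here is a minimal standalone sketch. It is not calibre's code: the XML layout (`<page>`, `<text left=... width=...>`), the `unwrap` function and the `fill_ratio` parameter are all made up for illustration, and the real output of pdfreflow.so may look quite different. The heuristic assumed here is simply that a line reaching close to the right margin was soft-wrapped and should be joined to the next one.

```python
import xml.etree.ElementTree as ET

# Hypothetical XML layout, invented for this sketch; the real
# output of pdfreflow.so may use different tags and attributes.
SAMPLE = """
<pdf>
  <page number="1" width="600">
    <text left="50" width="500">The quick brown fox jumps over the la-</text>
    <text left="50" width="498">zy dog and keeps running through the</text>
    <text left="50" width="210">forest until nightfall.</text>
    <text left="50" width="180">A new paragraph starts.</text>
  </page>
</pdf>
"""


def unwrap(xml_text, fill_ratio=0.9):
    """Join lines that look like soft-wrapped continuations.

    A line counts as "full" when its right edge reaches at least
    fill_ratio of the widest right edge on the page. The line that
    follows a full line is treated as a continuation of it.
    """
    root = ET.fromstring(xml_text)
    paragraphs = []
    for page in root.iter("page"):
        # (right edge, text) for every line on the page, in document order
        lines = [
            (float(t.get("left")) + float(t.get("width")), (t.text or "").strip())
            for t in page.iter("text")
        ]
        if not lines:
            continue
        right_margin = max(right for right, _ in lines)
        current = ""
        prev_full = False
        for right, text in lines:
            if current and prev_full:
                if current.endswith("-"):
                    # Naive dehyphenation: drops every trailing hyphen,
                    # including ones that belong to the word.
                    current = current[:-1] + text
                else:
                    current += " " + text
            else:
                if current:
                    paragraphs.append(current)
                current = text
            prev_full = right >= fill_ratio * right_margin
        if current:
            paragraphs.append(current)
    return paragraphs


if __name__ == "__main__":
    for para in unwrap(SAMPLE):
        print(para)
```

On the sample above the first three lines are merged into one paragraph and the last line stays separate. A real implementation would also need to look at things like indentation, font changes and vertical spacing, which this sketch ignores.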