11-20-2007, 12:40 AM   #7
kovidgoyal
creator of calibre
I just meant to give you an idea of how to do it. Basically, pdftohtml preserves line breaks using <br> elements. These need to be removed intelligently (based on line length), and two consecutive <br> elements should become a new paragraph.
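
As a rough illustration of that idea, here is a minimal Python sketch. It is not code from pdftohtml or any other tool; the function name `unwrap_lines` and the `short_ratio` threshold are made up for the example, and it assumes the input is plain text separated by <br> tags with no other markup to worry about. A line much shorter than the typical line is treated as the end of a paragraph, every other single <br> is replaced by a space, and two or more consecutive <br> elements always start a new paragraph.

```python
import re
from statistics import median

# Matches <br>, <br/> and <br /> in any letter case.
BR = re.compile(r"<br\s*/?>", re.IGNORECASE)
# Two or more <br> elements in a row (optionally separated by whitespace).
DOUBLE_BR = re.compile(r"(?:<br\s*/?>\s*){2,}", re.IGNORECASE)


def unwrap_lines(html, short_ratio=0.6):
    """Join hard-wrapped lines into <p> paragraphs (hypothetical helper).

    short_ratio is an assumed tuning knob: a line shorter than
    short_ratio * (median line length) is taken to end a paragraph.
    """
    # Consecutive <br> elements are explicit paragraph boundaries.
    blocks = [b for b in DOUBLE_BR.split(html) if b.strip()]

    # Estimate the typical line length from every non-empty line.
    all_lines = [ln.strip() for b in blocks for ln in BR.split(b) if ln.strip()]
    if not all_lines:
        return ""
    typical = median(len(ln) for ln in all_lines)

    paragraphs = []
    for block in blocks:
        current = []
        for line in (ln.strip() for ln in BR.split(block)):
            if not line:
                continue
            current.append(line)
            # A noticeably short line probably ends its paragraph.
            if len(line) < short_ratio * typical:
                paragraphs.append(" ".join(current))
                current = []
        if current:
            paragraphs.append(" ".join(current))

    return "\n".join(f"<p>{p}</p>" for p in paragraphs)


sample = (
    "The quick brown fox jumps over the<br>"
    "lazy dog and keeps running far<br>"
    "away.<br>"
    "A second paragraph starts right<br>"
    "here and continues on this line<br><br>"
    "Third paragraph."
)
result = unwrap_lines(sample)
print(result)
```

The median is used only because it is less sensitive to a few very short or very long lines than the mean; a real converter would likely need a better heuristic (for example, also checking whether a line ends in sentence punctuation).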