Thread: Reading PDF files
11-20-2007, 12:40 AM
#7
kovidgoyal
creator of calibre
I just meant to give you an idea of how to do it. Basically, pdftohtml preserves line breaks using <br> elements. These need to be removed intelligently (based on line length), and two consecutive <br> elements should become a new paragraph.
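To make the idea concrete, here is a rough sketch in Python. It is only an illustration under my own assumptions, not what any existing tool does: the function name `unwrap_lines`, the `factor` parameter, and the rule "a line shorter than 60% of the median line length ends a paragraph" are all hypothetical choices. It also measures the length of the raw line, so inline tags would count towards it.

```python
import re
from statistics import median

# Matches <br>, <br/>, <br /> in any letter case.
BR = re.compile(r"<br\s*/?>", re.IGNORECASE)
# Two or more consecutive <br> elements: treated as a paragraph boundary.
DOUBLE_BR = re.compile(r"(?:<br\s*/?>\s*){2,}", re.IGNORECASE)


def unwrap_lines(fragment, factor=0.6):
    """Hypothetical helper: rebuild <p> paragraphs from <br>-separated lines.

    Assumption: a line that is clearly shorter than the typical line
    (below factor * median length) was ended on purpose, so the <br>
    after it is kept as a paragraph break. Longer lines are assumed to
    be wrapped text and are joined to the following line.
    """
    blocks = [b for b in DOUBLE_BR.split(fragment) if b.strip()]
    split_blocks = [
        [ln.strip() for ln in BR.split(block) if ln.strip()]
        for block in blocks
    ]
    lengths = [len(ln) for lines in split_blocks for ln in lines]
    if not lengths:
        return ""
    threshold = factor * median(lengths)

    paragraphs = []
    for lines in split_blocks:
        current = []
        for ln in lines:
            current.append(ln)
            if len(ln) < threshold:
                # Short line: close the paragraph here.
                paragraphs.append(" ".join(current))
                current = []
        if current:
            # Block ended on a full-length line; the double <br> closes it.
            paragraphs.append(" ".join(current))
    return "\n".join(f"<p>{p}</p>" for p in paragraphs)


sample = (
    "This is a long line of text that wraps<br>"
    "onto the next line of the paragraph and<br>"
    "ends here.<br><br>"
    "A second paragraph that is also wrapped<br>"
    "over two lines."
)
print(unwrap_lines(sample))
```

The threshold would need tuning for real books, since headings, verse, and tables all produce short lines that should not simply be merged into running text.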