Attached is a page excerpted from the HTML file created in the input directory during debugging.
In my parsed file I found wrong "end of paragraph" breaks on the second line (this one is understandable, as there is an exclamation mark within a spoken sentence), on line 4 (ending with the word "necessità"), and on line 12 (ending with the word "inchinò").
This is just an example: please note that I tried several .pdf files from different sources, and the behaviour is always the same.
Last edited by wildbilly; 01-29-2010 at 09:18 AM.