Thread: PDF line unwrap
05-23-2010, 07:25 PM   #8
DoctorOhh
US Navy, Retired
Quote:
Originally Posted by miquel
Hi again,
Starson17, thanks a lot for your help on this thread. I've submitted a patch with a different approach to line unwrapping, let's see what people think!
Patch's here: http://bugs.calibre-ebook.com/ticket/5597

I have no knowledge in this area, but could the method this fellow took in creating this extension for OpenOffice.org's Writer be applied to cleaning up PDF file conversions?

From his page:

Quote:
Do you have problems with
texts having unwanted
line breaks like
this one?

This happens because there are unwanted paragraph marks throughout the text. If we take the text from a PDF, we inevitably get a paragraph mark at the end of each line.

Now, either you delete them one by one with a lot of patience, or you use the macro MyTXTcleaner, which will do the work for you.
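
To make the idea concrete, here is a rough Python sketch of one way such a cleanup could work. This is not the MyTXTcleaner macro's actual logic, nor the approach in the calibre patch; `unwrap_lines` and `SENTENCE_END` are hypothetical names, and the rule it uses (a line break is only real if the line is blank or ends in sentence-ending punctuation) is an assumption that will wrongly split a paragraph whenever a sentence happens to end exactly at a line end.

```python
import re

# Hypothetical heuristic: a line that ends in . ! or ? (optionally followed
# by a closing quote or bracket) is assumed to end a paragraph.
SENTENCE_END = re.compile(r'[.!?]["\')\]]*$')


def unwrap_lines(text):
    """Join hard-wrapped lines into paragraphs separated by blank lines."""
    paragraphs = []
    current = []  # lines collected for the paragraph being built
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            # A blank line always closes the current paragraph.
            if current:
                paragraphs.append(" ".join(current))
                current = []
            continue
        current.append(line)
        if SENTENCE_END.search(line):
            # Assumed paragraph end: flush what has been collected so far.
            paragraphs.append(" ".join(current))
            current = []
    if current:
        paragraphs.append(" ".join(current))
    return "\n\n".join(paragraphs)


sample = "Do you have problems with\ntexts having unwanted\nline breaks like\nthis one?"
print(unwrap_lines(sample))
```

Run on the example from his page, this should put the four broken lines back onto one line. Real PDF output would likely also need hyphenation and header/footer handling, which this sketch leaves out.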