I'm not sure it's fully possible in calibre. There is the pdf line-unwrap option, but if that isn't doing it for you, there's no way in calibre to manually edit the line breaks.
I did do it once for a problematic pdf by converting it to rtf, then opening it in office.
Using regex in the search and replace box there, you can find all VALID paragraph breaks, temporarily replace them with a placeholder code (12345 or something), then replace all remaining paragraph breaks with a space, then re-replace the placeholder with paragraph breaks.
It's been a while, but if I recall, I decided that valid paragraph breaks were those that were preceded by a period, question mark, exclamation point, or quotation mark.
You may end up with a few mistakes using this method, but by and large, it unwrapped the hard line endings from the pdf.
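If you'd rather script it than use the search and replace box, here is a rough Python sketch of the same placeholder trick. This is not what I actually ran; it assumes you've saved the text as plain text with one line break per wrapped line, and the function name `unwrap_hard_breaks` and the `@@PARA@@` placeholder are just made up for illustration (use anything that doesn't occur in your book).

```python
import re

# Hypothetical placeholder; pick any string that never appears in the text.
PLACEHOLDER = "@@PARA@@"

def unwrap_hard_breaks(text: str) -> str:
    """Join hard-wrapped lines, keeping breaks that look like real paragraph ends."""
    # 1. A break counts as "valid" if it follows a period, question mark,
    #    exclamation point, or (straight or curly) closing quotation mark.
    #    Swap those breaks for the placeholder.
    marked = re.sub(r'([.?!"\u201d])[ \t]*\n', r"\1" + PLACEHOLDER, text)
    # 2. Every break that is left is assumed to be a pdf line wrap: use a space.
    joined = marked.replace("\n", " ")
    # 3. Turn the placeholders back into paragraph breaks.
    return joined.replace(PLACEHOLDER, "\n")

sample = (
    "This is a line\n"
    "that was wrapped.\n"
    "New paragraph here\n"
    "still going!\n"
    '"Quote," she said\n'
    'and left."'
)
print(unwrap_hard_breaks(sample))
```

Like the manual version, this will wrongly keep a break wherever a wrapped line happens to end in a period or quote, so skim the result afterwards.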
Afterwards, just add the rtf to calibre and continue your conversion process.
While you're busy editing the rtf, you can go ahead and make sure chapter headings are using the 'heading' style, too, so calibre will correctly find them.