MobileRead Forums - Removing unnecessary line breaks in books

DoctorOhh · 08-20-2010, 06:10 AM

Let me say up front that I might not understand the problem, but I'm guessing one of my suggestions will work for you.

Quote:

Originally Posted by Wintersdark

I tried converting to text and back, but the way it's formatted I basically get each paragraph followed by a pair of CR/LF's. So, converting directly back to epub doesn't help.

According to this thread, calibre looks for two consecutive CR/LFs to identify a paragraph break. When converting back to ePub, you might have to change the settings in the text input options of the conversion dialog.

Quote:

Originally Posted by Wintersdark

However, as it's not every book, I'm just addressing it on a case by case basis with Notepad++ as I go. If I were still running linux, I'd mass convert them all to text and figure out how to script applying the regex replace to them, but I have no idea of how to go about that in windows.

When I have text that I need to put back into proper word-wrapped paragraphs, so that I get a clean ePub upon conversion, I use OpenOffice.org's Writer with the My Text Cleaner extension installed. I select the whole document and run the extension. It does a good job of reassembling the paragraphs.
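For anyone who would rather script the cleanup than do it by hand, here is a rough Python sketch of the same idea. It is not what My Text Cleaner does internally; it simply assumes that a blank line (two consecutive line breaks) marks a paragraph break and that every other line break is an unwanted hard wrap. The function name `unwrap_paragraphs`, the `cleaned` output folder, and the UTF-8 encoding are all my own assumptions, so adjust them to your files.

```python
import re
import sys
from pathlib import Path


def unwrap_paragraphs(text: str) -> str:
    """Join hard-wrapped lines, keeping blank lines as paragraph breaks."""
    # Normalize Windows (CRLF) and old Mac (CR) line endings to LF.
    text = text.replace("\r\n", "\n").replace("\r", "\n")
    # One or more blank (or whitespace-only) lines separate paragraphs.
    blocks = re.split(r"\n(?:[ \t]*\n)+", text.strip())
    paragraphs = []
    for block in blocks:
        # Join the wrapped lines of one paragraph with single spaces.
        lines = [line.strip() for line in block.split("\n")]
        joined = " ".join(line for line in lines if line)
        if joined:
            paragraphs.append(joined)
    # Emit one line per paragraph, separated by a single blank line.
    return "\n\n".join(paragraphs) + "\n"


def clean_folder(folder: Path) -> None:
    """Write a cleaned copy of every .txt file into a 'cleaned' subfolder."""
    out_dir = folder / "cleaned"  # hypothetical output location
    out_dir.mkdir(exist_ok=True)
    for path in sorted(folder.glob("*.txt")):
        # Assumes the files are UTF-8; change the encoding if yours differ.
        text = path.read_text(encoding="utf-8")
        (out_dir / path.name).write_text(unwrap_paragraphs(text), encoding="utf-8")


if __name__ == "__main__" and len(sys.argv) > 1:
    clean_folder(Path(sys.argv[1]))
```

Saved under a name of your choosing (say `unwrap.py`), it would be run on Windows as `python unwrap.py C:\path\to\textfiles`. The originals are left untouched, so you can compare before converting the cleaned files back to ePub.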

Alternatively, I can often use Sigil's find and replace to fix ePubs, as long as there is something unique to denote paragraph breaks in the output. Often I find a

Quote:

or similar marking between what should be each paragraph. If this exists, then it takes only two or three steps to clean it up.

Good luck.