09-07-2010, 10:53 AM | #1 |
Junior Member
Posts: 2
Karma: 10
Join Date: Jul 2010
Device: Nook
|
Paragraph breaks
Is there a way for the RegEx structure detection to remove any line break that isn't preceded by a chapter title or a period?
The biggest formatting issue I have at the moment is that paragraphs which end on one page and continue on the next (in the paper book) are being split into two separate paragraphs in the digital copy, and sentences are getting forced new lines at the point where they would normally reach the 'edge' of the page.
Last edited by thedevilsjester; 09-07-2010 at 10:59 AM. |
09-07-2010, 11:25 AM | #2 |
Wizard
Posts: 1,337
Karma: 123455
Join Date: Apr 2009
Location: Malaysia
Device: PRS-650, iPhone
|
What's the source format of the file you're trying to convert?
Try enabling the preprocess option under Structure detection. |
09-07-2010, 12:26 PM | #3 |
Evangelist
Posts: 473
Karma: 15000
Join Date: Jul 2008
Device: Various and sundry
|
I have one I use that detects an end of paragraph character that precedes a lower case letter and then replaces the end of paragraph with a space. That catches most of mine. Many of the instances I find end with a quote mark, so just detecting a period won’t do it for me.
Regex search: \p([a-z])
Replace with: " \1" (without the quotes; there's a space at the beginning, and \1 puts back the lower case letter captured by the only group) |
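For anyone who wants to try the same idea outside the conversion dialog, here is a minimal Python sketch of it. It assumes the text is plain text with `\n` as the line or paragraph break (the `\p` above looks like an editor-specific paragraph marker and is not valid in Python's `re` module), and the function name `join_wrapped_lines` is made up for illustration. It only covers the "break followed by a lower case letter" rule, so a break in front of a capital letter or a quote mark is left alone.

```python
import re


def join_wrapped_lines(text: str) -> str:
    """Replace a line break with a single space when the next
    non-blank character is a lower case ASCII letter.

    Assumption: such a break is a page or line wrap in the middle of a
    sentence, not a real paragraph end.
    """
    # \s*\n\s* swallows the break plus any stray spaces around it;
    # ([a-z]) captures the lower case letter so \1 can put it back.
    return re.sub(r"\s*\n\s*([a-z])", r" \1", text)


sample = 'The quick brown\nfox jumped.\nChapter 2\n"Hello," she\nsaid.'
print(join_wrapped_lines(sample))
```

Breaks before "Chapter 2" and before the opening quote survive because neither starts with a lower case letter. The flip side is that a sentence wrapped just before a capitalised name will not be rejoined, so it still needs a quick read-through afterwards.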