#1 |
Junior Member
Posts: 7
Karma: 10
Join Date: Jun 2011
Device: Kindle
|
Closing up line endings that occur in the middle of a sentence
I have been converting a PDF book series, and the only thing left to clean up properly, short of going through it line by line in Sigil, is the line endings that fall in the middle of a sentence. These are usually caused by a page break in the PDF, and they have no punctuation before the break, except sometimes a hyphen. I am looking for an expression that would find them.
An example of this would be: The line ends here but there was a page break or something else that caused the sentence to be split. The expression should ignore line endings preceded by punctuation that makes a natural, or at least natural-looking, line ending (hyphens excepted, of course), and the replacement should be a word space that closes the line up. Any ideas?
Last edited by remltr; 06-22-2011 at 10:42 PM.
#2 |
Wizard
Posts: 1,337
Karma: 123455
Join Date: Apr 2009
Location: Malaysia
Device: PRS-650, iPhone
|
Try reading the sticky at the top of this sub-forum; it covers this and many other points.
PDF conversion already does this for you, but there is a setting called the line unwrap factor in the PDF conversion options. For some books the unwrap factor isn't aggressive enough; just reduce the number a bit.
#3 |
Well trained by Cats
Posts: 30,889
Karma: 59840450
Join Date: Aug 2009
Location: The Central Coast of California
Device: Kobo Libra2,Kobo Aura2v1, K4NT(Fixed: New Bat.), Galaxy Tab A
|
I cheat and just fix problems like that in Sigil, using a REGEX in Code view.
Find:
Code:
([a-z])</p>\s+<p[^>]*>
This matches a lowercase letter just before a closing P tag, followed by white space (newlines included) and an opening P tag. It does not work when the paragraph ends with closing quote marks or closing SPAN or DIV tags.
Replace:
Code:
\1
followed by a single space, which keeps the captured letter and closes the line up with a word space.
Not perfect; you need to tune it to what you see in your Code view.
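If you would rather try the expression outside Sigil first, here is a rough Python sketch of the same find and replace. It is only an illustration: the function name `join_split_paragraphs` and the sample markup are made up for this example, and the pattern uses `[^>]*` inside the opening P tag so the match stops at that tag's first `>`.

```python
import re

# A lowercase letter, a closing </p>, white space (newlines included),
# then an opening <p ...> tag. [^>]* keeps the match inside that one tag.
SPLIT_PARAGRAPH = re.compile(r"([a-z])</p>\s+<p[^>]*>")


def join_split_paragraphs(html: str) -> str:
    """Merge paragraphs that were split in the middle of a sentence.

    Hypothetical helper for illustration; it keeps the captured letter
    and replaces the tag pair with a single word space.
    """
    return SPLIT_PARAGRAPH.sub(r"\1 ", html)


# Made-up sample: the first paragraph ends without punctuation.
sample = (
    '<p>The line ends here but there was a page break</p>\n'
    '<p class="calibre1">that caused the sentence to be split.</p>\n'
    '<p>A new paragraph starts here.</p>'
)

print(join_split_paragraphs(sample))
```

Paragraphs that end in a full stop are left alone, because the pattern only fires after a lowercase letter. As with the Sigil version, closing quote marks and closing SPAN or DIV tags before the `</p>` are not handled, so tune it to your own markup.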