Old 06-20-2014, 12:20 PM   #13
theducks
Well trained by Cats
Posts: 31,153
Karma: 60406498
Join Date: Aug 2009
Location: The Central Coast of California
Device: Kobo Libra2,Kobo Aura2v1, K4NT(Fixed: New Bat.), Galaxy Tab A
Quote:
Originally Posted by R71986 View Post
The longer lines are acceptable now.

I managed to remove all the inbuilt page numbers by repeating the same regex with one less \d each time I ran the replace all. The replacements are shockingly fast.

I don't really understand how to do the other cleaning up of the page I showed you at comment 10 above. What type of regex will distinguish between a single word that should be the only one on the line, such as "Hello", and a single word that should be joined to the next line?

One way would be to join all words to one continuous line until a full stop is found, but is that level of control possible?
(NB: I work in Code View when I use Sigil. Calibre removes the BV temptation.)

I have 5 different 'Join' saved searches. 3 are basic and run as is. The other 2 need to be tweaked case by case, because they have to match the current class= portion to fine-tune greediness.
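As a rough illustration of what a class-specific 'Join' search can look like, here is a minimal Python sketch. It is not one of the saved searches mentioned above (those are not shown in this post). The function name `join_broken_paragraphs` and the class name `calibre1` are made up for the example, and the rule it uses is only a heuristic: join when the previous paragraph does not end in sentence punctuation and the next one starts with a lower-case letter.

```python
import re


def join_broken_paragraphs(html: str, css_class: str) -> str:
    """Join <p class="..."> paragraphs that look like they were split mid-sentence.

    Heuristic only: a break is joined when the text before </p> does not end
    in sentence punctuation AND the next paragraph starts in lower case.
    """
    cls = re.escape(css_class)  # the class name is matched literally
    pattern = re.compile(
        r'(?<![.!?:"”])'            # previous paragraph does not end a sentence
        r'</p>\s*'                  # closing tag plus any whitespace/newlines
        r'<p class="' + cls + r'">'  # opening tag with the class being targeted
        r'(?=[a-z])'                # next paragraph starts with a lower-case letter
    )
    return pattern.sub(" ", html)


sample = (
    '<p class="calibre1">The quick brown</p>\n'
    '<p class="calibre1">fox jumped.</p>\n'
    '<p class="calibre1">Hello</p>\n'
    '<p class="calibre1">Then it left.</p>'
)
print(join_broken_paragraphs(sample, "calibre1"))
```

In this sample "Hello" stays on its own line only because the following paragraph starts with a capital letter. A lone word followed by lower-case text would be joined, which is why checking each match by hand is still the safer route.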

Almost all are run in Replace Next mode (Find to skip a match).
Removing line-ending hyphens is a plague.
It could be a hyphenated word (join with no space), or it could be a pseudo em dash (--), where context is everything in the break/no-break decision.
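Because the right fix depends on context, one option is to list the candidates for review rather than replace them blindly, much like stepping through with Replace Next. The Python sketch below assumes plain text with newline line breaks. The name `hyphen_candidates` and the "hyphen"/"dash" labels are invented for the example.

```python
import re
from typing import Iterator, Tuple

# A word, then "--" or "-", then a line break, then the start of the next line.
# "--" is tried first so a double dash is never mistaken for a single hyphen.
LINE_END_HYPHEN = re.compile(r'(\w+)(--|-)\n(\w*)')


def hyphen_candidates(text: str) -> Iterator[Tuple[str, str]]:
    """Yield (kind, context) for every line that ends in a hyphen or '--'.

    Nothing is replaced here; each hit still needs a human decision.
    """
    for m in LINE_END_HYPHEN.finditer(text):
        before, dash, after = m.groups()
        kind = "dash" if dash == "--" else "hyphen"
        # "|" marks where the line break was
        yield kind, f"{before}{dash}|{after}"


text = "It was a con-\ntext issue. She waited--\nthen left. A well-\nknown case.\n"
for kind, context in hyphen_candidates(text):
    print(kind, context)
```

The sketch cannot tell "con-|text" (drop the hyphen) from "well-|known" (keep it). Both come back as "hyphen", which is the reason these are handled one at a time.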

There are examples in the stickies (and other places) over in the Sigil forum. For the most part they also work in Calibre.