Quote:
Originally Posted by R71986
The longer lines are acceptable now.
I managed to remove all the inbuilt page numbers by repeating the same regex with one less \d each time I ran the replace all. The replacements are shockingly fast.
I don't really understand how to do the other cleaning up of the page I showed you in comment 10 above. What type of regex will distinguish between a single word that should be the only one on the line, such as "Hello", and other cases where the single word should be joined to the next line?
One way would be to join all words to one continuous line until a full stop is found, but is that level of control possible?
(NB: I work in Code View when I use Sigil. Calibre removes the Book View (BV) temptation.)
I have 5 different 'Join' saved searches. 3 are basic and run as is. 2 more need to be tweaked case by case, because they need to match the current class= portion to fine-tune greediness.
Almost all are run in Replace Next mode (Find to skip this one).
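To give a rough idea of what a 'Join' search of this kind can look like, here is a minimal sketch in Python's re module. It is not one of the saved searches mentioned above: the class name calibre1 and the sample text are made up for illustration, and the pattern would need adjusting to the real class= value in the book. It joins a paragraph to the next one only when the first does not end in sentence-closing punctuation.

```python
import re

# Hypothetical class name; it must be changed to match the book's class= value.
PARA_CLASS = "calibre1"

# Match a paragraph break only when the character before </p> is NOT
# sentence-closing punctuation (full stop, !, ?, colon, or a closing quote).
JOIN_RE = re.compile(
    r'([^.!?:"\u201d])</p>\s*<p class="' + re.escape(PARA_CLASS) + r'">'
)


def join_broken_paragraphs(html: str) -> str:
    """Replace each matched break with a single space, keeping the last character."""
    return JOIN_RE.sub(r"\1 ", html)


sample = (
    '<p class="calibre1">The longer lines are</p>\n'
    '<p class="calibre1">acceptable now.</p>\n'
    '<p class="calibre1">Next paragraph.</p>'
)
print(join_broken_paragraphs(sample))
```

A single word such as "Hello" that should stay on its own line also lacks closing punctuation, so a blanket Replace All with a pattern like this would wrongly join it. That is the reason for stepping through with Replace Next and skipping the exceptions.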
Removing hyphens at line endings is a plague.
It could be a hyphenated word (join with no space), or it could be a pseudo em dash (--), where context is everything in the break/no-break decision.
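The two hyphen cases can be told apart roughly like this. Again, this is only a sketch in Python's re module under my own assumptions: the function names, the keep_hyphen option and the sample text are hypothetical, not taken from the stickies. A single hyphen directly after a letter is treated as a split word and joined with no space, while a double hyphen is only located so it can be checked by eye.

```python
import re

# A paragraph break: closing tag, optional whitespace, any opening <p ...> tag.
BREAK = r"</p>\s*<p\b[^>]*>"

# One hyphen directly after a letter, with a lowercase letter starting the next line.
WORD_HYPHEN_RE = re.compile(r"([A-Za-z])-" + BREAK + r"([a-z])")

# Pseudo em dash (--) at the end of a line: flagged, never changed automatically.
DASH_RE = re.compile(r"--" + BREAK)


def join_hyphen_breaks(html: str, keep_hyphen: bool = False) -> str:
    """Join words split at a line-ending hyphen, with no space added."""
    joiner = "-" if keep_hyphen else ""
    return WORD_HYPHEN_RE.sub(lambda m: m.group(1) + joiner + m.group(2), html)


def find_dash_breaks(html: str) -> list:
    """Return the offsets of line-ending double hyphens for manual review."""
    return [m.start() for m in DASH_RE.finditer(html)]


sample = "<p>The replace-</p>\n<p>ments are fast--</p>\n<p>shockingly so.</p>"
print(join_hyphen_breaks(sample))
print(find_dash_breaks(sample))
```

Whether the hyphen should be dropped (a word split across lines) or kept (a genuine compound such as well-known) still depends on the word, so this is also better stepped through one match at a time than run as Replace All.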
There are examples in the stickies (and other places) over in Sigil. For the most part they also work in Calibre