Quote:
Originally Posted by theducks
Among other things, * is a wildcard and will need to be escaped.
|
I think he was just using the asterisks to emphasize the portion he was talking about, not saying that the actual source document had them!
Quote:
Originally Posted by Johann Cat
I have a simple text-block book that has within it, as some gutenberg.org books do, page numbers within the text block (not coded footers, etc.).
|
Mind just linking to the specific Gutenberg example?
If I am understanding correctly, I think a Regex as simple as this might do it:
Search: \s+― [0-9]+ ―\s+
Replace: (insert a single space here)
What this says in English is "look for one or more whitespace characters (spaces, tabs, or line breaks)" + "look for an em dash followed by a space" + "look for one or more digits" + "look for a space followed by an em dash" + "look for one or more whitespace characters". Replace all of that with a single space.
What I would then do is just clean up the file in a Text Editor using the above Regex, and then feed that document through Calibre for conversion.
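If you would rather script the cleanup than do it by hand, here is a rough Python sketch of the same search and replace. The function name `strip_page_markers` and the sample text are made up for illustration, and the dash in the pattern is copied exactly as typed in this post, so it is an assumption that it matches the dash in your book. Check the real file and swap the character if needed.

```python
import re

# Same pattern as the Search field in this post. The dash is copied as typed
# here (an assumption); replace it with whatever dash your book actually uses.
PAGE_MARKER = re.compile(r"\s+― [0-9]+ ―\s+")


def strip_page_markers(text: str) -> str:
    """Replace each inline page-number marker, plus the whitespace
    around it, with a single space."""
    return PAGE_MARKER.sub(" ", text)


# Made-up sample: a page marker sitting on its own line between two paragraphs.
sample = "the end of one page\n\n― 12 ―\n\nand the start of the next"
cleaned = strip_page_markers(sample)
print(cleaned)  # the end of one page and the start of the next
```

Because `\s+` also swallows line breaks, a marker that sits between two paragraphs will join them into one line. If that is not what you want, try it on a copy of the file first.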