MobileRead Forums - View Single Post

pietro99 · 03-05-2011, 04:33 PM

Quote:

Originally Posted by bob_tm

This should be doable using the regular expression replacement feature of Calibre (you can replace 3 expressions - here all of them should be replaced by the empty string). Off the top of my head, and going by the example you have provided, I would guess the 3 expressions would be:

\[B\]\d+

\d+TEXT\.indd \d+

\d+\/\d+\/\d+ \d+:\d+:\d+ PM\[\/B\]

Since this isn't Perl (which is the variation of regexps I usually use), you may not have to put a "\" in front of a "/" as I have done above. Try to experiment with these strings and, if supported by Calibre, put "^" in front of the expressions to denote beginning of line and "\s*$" at the end of the expressions to denote end of line with possible trailing white space. If the date and time strings are the same in all instances of the unwanted strings, you can use the actual numbers rather than "\d+" (which denotes one or more digits).
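For anyone who wants to try the three expressions outside Calibre first, here is a minimal Python sketch. The sample lines are hypothetical (the original example text is not shown in this post), every digit class is written as "\d+", and the "^" and "\s*$" anchors are added as suggested above. Python's re module is used here only as a convenient way to test; Calibre's own behaviour may differ.

```python
import re

# Hypothetical sample lines; the real text from the book is not shown in the thread.
samples = [
    "[B]12",
    "00123TEXT.indd 12",
    "3/5/2011 4:33:10 PM[/B]  ",
    "Chapter 12 begins here",
]

# The three expressions, anchored with ^ and \s*$.
# "/" is left unescaped because Python's re module does not require escaping it.
patterns = [
    r"^\[B\]\d+\s*$",
    r"^\d+TEXT\.indd \d+\s*$",
    r"^\d+/\d+/\d+ \d+:\d+:\d+ PM\[/B\]\s*$",
]

def strip_residue(line):
    """Replace any match of the three patterns with the empty string."""
    for pattern in patterns:
        line = re.sub(pattern, "", line)
    return line

cleaned = [strip_residue(s) for s in samples]
print(cleaned)
```

With these made-up samples, the three unwanted lines come back empty and the ordinary sentence containing a number is left alone, which is the point of anchoring the expressions.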

Experimentation is the key here and you will learn how to do this. Regexps are great stuff, though they look like Greek to the uninitiated (except for the Greek uninitiated).

-- bob_tm

You are spot-on! It certainly looks like Greek when you start but I am starting to see how it works. I managed to get rid of page numbers 1-9 with \d but page numbers 10 onwards were still there. So I tried \ddd but that didn't work. What is the secret for that please?

I am slowly getting through the tutorial; just hope I have the patience.

Edit: just worked it out....\d\d\d for all the page numbers.
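To show why \ddd failed and \d\d\d worked, here is a small Python sketch. It assumes the expression has to match a whole page-number line, and the sample numbers are made up. It also tries the "\d+" form from the quoted post, which denotes one or more digits and so is not tied to a fixed number of them.

```python
import re

# Hypothetical page-number lines; not taken from the actual book.
lines = ["7", "42", "123"]

def fully_matched(pattern):
    """Return the sample lines that the pattern matches in their entirety."""
    return [s for s in lines if re.fullmatch(pattern, s)]

one_digit = fully_matched(r"\d")         # exactly one digit
literal_dd = fully_matched(r"\ddd")      # one digit followed by the letters "dd"
three_digits = fully_matched(r"\d\d\d")  # exactly three digits
any_length = fully_matched(r"\d+")       # one or more digits

print(one_digit, literal_dd, three_digits, any_length)
```

In this sketch \ddd matches nothing because only the first "d" is escaped; the other two are plain letters. \d\d\d matches exactly three digits, while \d+ matches a number of any length.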