03-05-2011, 04:33 PM   #8
pietro99
Connoisseur
pietro99 has learned how to buy an e-book online
 
Posts: 55
Karma: 76
Join Date: Sep 2010
Location: Australia
Device: Kindle 3
Quote:
Originally Posted by bob_tm
This should be doable using Calibre's regular expression replacement feature (you can replace three expressions; here each of them should be replaced by the empty string). Off the top of my head, and going by the example you have provided, I would guess the three expressions would be:

\[B\]\d+

\d+TEXT\.indd \d+

\d+\/\d+\/\d+ \d+:\d+:\d+ PM\[\/B\]

Since this isn't Perl (the flavor of regexps I usually use), you may not have to put a "\" in front of a "/" as I have done above. Experiment with these strings and, if Calibre supports it, put "^" in front of the expressions to denote the beginning of a line and "\s*$" at the end of the expressions to denote the end of a line with possible trailing white space. If the date and time strings are the same in every instance of the unwanted strings, you can use the actual numbers rather than "\d+" (which denotes one or more digits).

Experimentation is the key here, and it is how you will learn to do this. Regexps are great stuff, though they look like Greek to the uninitiated (except for the Greek uninitiated).

-- bob_tm
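
For anyone who wants to try the three quoted expressions outside Calibre first, here is a minimal Python sketch. The sample line is made up (the original example text is not shown in this post), the third expression is written with "\d" for every digit group, and the helper name strip_footer is only illustrative. Python does not need the "\" in front of "/", so it is left out here; whether Calibre's dialog behaves identically is an assumption worth checking by experiment.

```python
import re

# Hypothetical sample of a leftover print-shop footer line; the real text
# from the thread is not shown, so this string is only an assumption.
sample = "[B]12 9780000TEXT.indd 12 3/5/2011 4:33:10 PM[/B]"

# The three expressions from the quote, each replaced by the empty string.
patterns = [
    r"\[B\]\d+",                          # opening [B] tag plus page number
    r"\d+TEXT\.indd \d+",                 # file name and repeated page number
    r"\d+/\d+/\d+ \d+:\d+:\d+ PM\[/B\]",  # date, time and closing [/B] tag
]

def strip_footer(text):
    # Apply each replacement in turn, then trim the leftover spaces.
    for pattern in patterns:
        text = re.sub(pattern, "", text)
    return text.strip()

cleaned = strip_footer(sample)
print(repr(cleaned))
```

Ordinary text that matches none of the three expressions passes through unchanged, which makes it easy to test on a few real lines before running the replacement over a whole book.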
You are spot-on! It certainly looks like Greek when you start, but I am starting to see how it works. I managed to get rid of page numbers 1-9 with \d, but page numbers from 10 onwards were still there. So I tried \ddd, but that didn't work. What is the secret for that, please?

I am slowly getting through the tutorial; just hope I have the patience.

Edit: just worked it out....\d\d\d for all the page numbers.
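
A note on why \ddd fails: only the first "d" is escaped, so it means one digit followed by the two literal letters "dd". Each digit needs its own "\d", and \d\d\d matches exactly three digits, while "\d+" covers any length. A small Python sketch of the difference, assuming some made-up page numbers that each sit alone on a line (the helper name full_matches is only illustrative):

```python
import re

# Hypothetical page numbers, each standing alone on its own line.
lines = ["7", "42", "314", "5dd"]

def full_matches(pattern):
    # Keep only the lines that the pattern matches from start to end.
    return [s for s in lines if re.fullmatch(pattern, s)]

one_digit = full_matches(r"\d")          # exactly one digit
digit_dd = full_matches(r"\ddd")         # one digit, then the letters "dd"
three_digits = full_matches(r"\d\d\d")   # exactly three digits
any_length = full_matches(r"\d+")        # one or more digits

print(one_digit, digit_dd, three_digits, any_length)
```

In a search-and-replace that is not anchored to the whole line, a lone \d removes digits one at a time wherever they appear, so it is worth checking that numbers in the body text are not being eaten as well.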

Last edited by pietro99; 03-05-2011 at 04:36 PM. Reason: update