08-29-2009, 03:26 PM   #7
JvdW
Just to let you know that I may have found something that could help you too with removing headers/footers.
The following is what I copied from the debug output of Calibre (0.6.10) and want removed:

Code:
<br>
5<br>
<hr>
<A name=7></a>
After playing around with the remove footer regexp I came up with the following:
Code:
(?ims)<br>\s*\d{1,3}\s*<br>\s<hr>\s<a name=\d{1,3}></a>
This could probably be improved, but it works for me.
It isn't perfect, because sentences that continue onto the next page aren't always joined back together, but it beats removing page numbers manually ;-)
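
For anyone who wants to try the pattern outside Calibre first, here is a minimal Python sketch. It assumes Python's re module behaves closely enough to what Calibre does during conversion; I haven't verified that for this version. The sample strings and the names FOOTER_RE, RELAXED_RE and strip_footers are made up for illustration. The relaxed variant is only a suggestion, not part of the original pattern: it allows any amount of whitespace around <hr>, which should also cope with Windows line endings.

```python
import re

# The pattern from the post, unchanged.
FOOTER_RE = re.compile(r"(?ims)<br>\s*\d{1,3}\s*<br>\s<hr>\s<a name=\d{1,3}></a>")

# Suggested variant (an assumption, not the original): \s* instead of a
# single \s around <hr>, so "\r\n" or a missing newline still matches.
RELAXED_RE = re.compile(r"(?i)<br>\s*\d{1,3}\s*<br>\s*<hr>\s*<a name=\d{1,3}></a>")

# Hypothetical samples modelled on the debug output shown above.
SAMPLE_UNIX = "end of one page<br>\n5<br>\n<hr>\n<A name=7></a>\nstart of the next page"
SAMPLE_WINDOWS = SAMPLE_UNIX.replace("\n", "\r\n")


def strip_footers(html, pattern=FOOTER_RE):
    """Remove every page-number footer that the given pattern matches."""
    return pattern.sub("", html)


if __name__ == "__main__":
    print(repr(strip_footers(SAMPLE_UNIX)))
    print(repr(strip_footers(SAMPLE_WINDOWS)))
    print(repr(strip_footers(SAMPLE_WINDOWS, RELAXED_RE)))
```

With the original pattern the Unix-style sample is cleaned, but the sample with "\r\n" line endings comes back unchanged, because a single \s cannot cover both characters. The relaxed pattern handles both.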

Googling for some help I found two programs that really helped me, YMMV:
Regex Coach : http://weitz.de/regex-coach/
Kodos : http://kodos.sourceforge.net/
Of the two, I found Regex Coach the better one, with more options and better information about what is happening during a match.

Regards,

Joop