12-17-2010, 04:13 PM   #1
winterminute
Join Date: Dec 2010
Device: NookColor, maybe
My RegEx isn't removing page numbers and a fixed string the way I hoped

The person who built the PDF I'm using used a trial version of an XML formatter, which stamps some text on every page. The text is hidden in the PDF, but it shows up when I convert to ePUB. I figured I could just remove it using a RegEx on the Header/Footer, but no luck.

Code:
String:
<a href="http://www.antennahouse.com">Antenna House XSL Formatter (Evaluation)  http://www.antennahouse.com</a><br>

RegEx:
<a href="http://www.antennahouse.com">Antenna House XSL Formatter (Evaluation)  http://www.antennahouse.com</a><br>
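A minimal sketch of one likely reason the literal pattern fails, assuming the converter uses Python-style regular expressions (the tool is not named in the post, so that is an assumption). In that flavor, parentheses form a group and a dot matches any character, so the string used verbatim as a pattern does not match itself. The surrounding "before"/"after" text is made up for illustration.

```python
import re

# The string from the post (two spaces after "(Evaluation)").
text = ('<a href="http://www.antennahouse.com">Antenna House XSL Formatter '
        '(Evaluation)  http://www.antennahouse.com</a><br>')

# Used verbatim as a pattern, "(Evaluation)" becomes a capture group that
# matches "Evaluation" WITHOUT the parentheses, so the search finds nothing.
naive_match = re.search(text, text)

# re.escape backslash-escapes the metacharacters, giving a literal pattern.
literal_pattern = re.escape(text)

# Hypothetical surrounding content, only to show the removal.
cleaned = re.sub(literal_pattern, '', 'before' + text + 'after')

print(naive_match)  # None
print(cleaned)      # beforeafter
```

Escaping by hand works too: writing `\(Evaluation\)` and `\.` in place of the bare characters gives the same literal match.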
I'd also like to remove page numbers and page titles. Here's an example:

Code:
String:
<A name=13></a><IMG src="index-13_1.jpg"><br>Title <br>11 <br>

RegEx:
<A name=[0-9][0-9][0-9]></a><IMG src="index-[0-9][0-9][0-9]_1.jpg"><br>Title <br>[0-9][0-9][0-9] <br>
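A hedged sketch of why this second pattern may miss, again assuming Python-style regex semantics: each `[0-9][0-9][0-9]` requires exactly three digits, while the sample has two-digit numbers (13 and 11). Using `[0-9]+` accepts any number of digits. "Title" is kept as the placeholder from the post; the real title text would need its own pattern.

```python
import re

# The sample string from the post.
sample = '<A name=13></a><IMG src="index-13_1.jpg"><br>Title <br>11 <br>'

# Pattern from the post: every [0-9][0-9][0-9] needs exactly three digits.
three_digit = (r'<A name=[0-9][0-9][0-9]></a>'
               r'<IMG src="index-[0-9][0-9][0-9]_1.jpg">'
               r'<br>Title <br>[0-9][0-9][0-9] <br>')

# [0-9]+ matches one or more digits; \. matches a literal dot.
flexible = (r'<A name=[0-9]+></a>'
            r'<IMG src="index-[0-9]+_1\.jpg">'
            r'<br>Title <br>[0-9]+ <br>')

strict_match = re.search(three_digit, sample)
remaining = re.sub(flexible, '', sample)

print(strict_match)     # None
print(repr(remaining))  # ''
```

If page numbers never exceed three digits, `[0-9]{1,3}` is a tighter alternative to `[0-9]+`.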

Did I completely misunderstand how regular expressions work?