MobileRead Forums - View Single Post

oddsoxx · 03-09-2010, 11:45 AM

Quote:

Originally Posted by ATimson

Yes, it's probably going to need to be tweaked on a per-book basis.

The originals:

Header:

Code:

(?i)(?<=<hr>)((\s*<a name=\d+></a>((<img.+?>)*<br>\s*)?\d+<br>\s*.*?\s*)|(\s*<a name=\d+></a>((<img.+?>)*<br>\s*)?.*?<br>\s*\d+))(?=<br>)

Footer:

Code:

(?i)(?<=<hr>)((\s*<a name=\d+></a>((<img.+?>)*<br>\s*)?\d+<br>\s*.*?\s*)|(\s*<a name=\d+></a>((<img.+?>)*<br>\s*)?.*?<br>\s*\d+))(?=<br>)

If you want to learn regular expressions... may $deity have mercy on your soul.

Thanks so much for the original code for removing headers and footers. If I can ever figure out what to do with it, I'll be in great shape. Yeah... I've tried the "learn regular expressions" route. Ha! I almost got it to work: I could take out either the title of the book or the date in the header, but not both, and I could get rid of the author's name in the footer but never the page numbers.
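For anyone else trying to see what the quoted header pattern actually grabs, here is a rough sketch in Python. The pattern is copied as-is from the quote; the sample HTML is made up for illustration (an assumption about what the converted page looks like, not taken from a real book), so a real file may well need the per-book tweaking mentioned above.

```python
import re

# Header pattern copied verbatim from the quoted post, split across two
# string literals only for readability.
HEADER = re.compile(
    r"(?i)(?<=<hr>)((\s*<a name=\d+></a>((<img.+?>)*<br>\s*)?\d+<br>\s*.*?\s*)"
    r"|(\s*<a name=\d+></a>((<img.+?>)*<br>\s*)?.*?<br>\s*\d+))(?=<br>)"
)

# Hypothetical sample: a page break (<hr>), a page anchor, the page number,
# then the running title, followed by the real text of the page.
sample = (
    "end of page one.<br>\n"
    "<hr>\n"
    "<a name=2></a>2<br>\n"
    "My Book Title<br>\n"
    "Start of page two.<br>\n"
)

# Show what the pattern matches, then remove it.
match = HEADER.search(sample)
matched_text = match.group(0) if match else None
cleaned = HEADER.sub("", sample)

print(repr(matched_text))
print(cleaned)
```

On this sample the match starts right after the `<hr>` and runs through the page number and the title, stopping just before the `<br>` that follows the title, so the body text is left alone.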

I came up with a plan that involves converting to RTF, opening in Word, searching and replacing a couple of times, saving, opening in Calibre, converting to MOBI, and then emailing the resulting file to my Kindle. It's labor intensive, but doing it one book at a time as I go to read it isn't that big a deal. It's better than having to skip over all the extra stuff in the text, anyway.
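The "search and replace a couple of times" part could also be done as a handful of simple passes in Python, one per thing to remove, which gets around the title-OR-date problem because each pass only has to match one kind of line. This is only a sketch on plain text: the title, date format, author name, and page-number wording below are all hypothetical placeholders, and `strip_running_lines` is just a name made up for this example.

```python
import re

# Hypothetical patterns, one per repeated line; swap in whatever actually
# repeats in the book being cleaned.
REPLACEMENTS = [
    (r"(?m)^My Book Title\n", ""),       # running title in the header
    (r"(?m)^\d{2}-\d{2}-\d{4}\n", ""),   # date in the header
    (r"(?m)^Jane Author\n", ""),         # author name in the footer
    (r"(?m)^Page \d+\n", ""),            # page number in the footer
]

def strip_running_lines(text):
    """Apply each search-and-replace pass in turn and return the result."""
    for pattern, replacement in REPLACEMENTS:
        text = re.sub(pattern, replacement, text)
    return text

# Made-up sample with a header (title, date) and a footer (author, page).
sample = (
    "My Book Title\n"
    "03-09-2010\n"
    "It was a dark night.\n"
    "Jane Author\n"
    "Page 12\n"
    "The rain fell.\n"
)

cleaned = strip_running_lines(sample)
print(cleaned)
```

Each pattern only matches a whole line, so ordinary sentences that happen to contain a number or the author's name are left untouched.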

Thanks again!