08-19-2009, 02:26 PM   #2
DerSchwarzePrinz
Quote:
Originally Posted by irisclara
If this question is answered elsewhere, I apologize for wasting time. I did look.

I have a number of pdf files with footers (and sometimes headers) like

file://quickbrownfox/(65 of 296)

so the page number counts up to the total.

I've looked at Python regular expressions but I can't figure out how to tell Calibre to leave these lines out when I convert to RTF.

Alternatively, does anyone know of a way to remove sequential page numbers in an RTF? Then I could remove the parts of the line that stay the same with the extended replace function in TED Notepad and the page numbers some other way.

Thanks.
Try the following:

file://.+\)

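This should remove every occurrence that starts with "file://", followed by any characters (".+"), up to a closing bracket ("\)"). Note that ".+" is greedy: if there is more than one ")" within reach, the match extends to the last one.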
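
If you want to check the expression before running a conversion, here is a minimal sketch in plain Python, outside of Calibre. The sample text and the function name strip_footers are made up for illustration, and I am assuming the footer sits on its own line as in your example.

```python
import re

# The expression from above: "file://", then any characters, up to a ")".
FOOTER = re.compile(r"file://.+\)")

def strip_footers(text):
    """Remove every footer-like match from the text (hypothetical helper)."""
    return FOOTER.sub("", text)

# Made-up sample modelled on the footer in the question.
sample = "Some body text.\nfile://quickbrownfox/(65 of 296)\nMore body text."
cleaned = strip_footers(sample)
print(repr(cleaned))
```

By default "." does not match a line break in Python, so here the match stays inside the footer line and the body text on the neighbouring lines is left alone.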

Removing numbers is easy with "\d+", but that removes every number in the document, not just the page numbers.
Perhaps someone out there knows a better solution?
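
One possible way around the "every number" problem is to match the digits only when they appear in the "(65 of 296)" shape. This is only a sketch under the assumption that all your footers look like the one quoted; the sample text and the name strip_page_footers are hypothetical, and I have not checked how Calibre itself applies the expression.

```python
import re

# "file://", then as few characters as possible, then "(N of M)" with digits.
PAGE_FOOTER = re.compile(r"file://.*?\(\d+ of \d+\)")

def strip_page_footers(text):
    """Remove only footers that end in a "(N of M)" page counter."""
    return PAGE_FOOTER.sub("", text)

# Made-up sample: the numbers in the body text must survive.
sample = (
    "Chapter 3 began in 1984.\n"
    "file://quickbrownfox/(65 of 296)\n"
    "It had 12 sections."
)
cleaned = strip_page_footers(sample)
print(repr(cleaned))
```

Because the digits are tied to the surrounding "(", " of " and ")", ordinary numbers elsewhere in the document are not touched.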