04-14-2010, 09:00 PM | #1 |
Junior Member
Posts: 6
Karma: 10
Join Date: Apr 2010
Device: nook
|
regex request for specific header removal
Hi,
I'm new to Calibre but reasonably familiar with regex. I've obtained a bunch of PDFs (not DRM'd) whose headers and footers start with file:/// followed by a long directory string. Can someone give me the correct regex to delete any string that starts like that and continues to the end of the line? (Occasionally there is white space in these headers/footers.) Many thanks; I promise to start writing my own once I get some startup help. Carl |
04-14-2010, 09:44 PM | #2 |
creator of calibre
Posts: 43,858
Karma: 22666666
Join Date: Oct 2006
Location: Mumbai, India
Device: Various
|
I don't recall whether calibre compiles regexes with . matching end-of-line characters, but if it does not, this should work:
file:///.* |
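For anyone who wants to try the pattern outside calibre first, here is a minimal sketch using Python's standard re module. It assumes the default flags, where . stops at a newline; whether calibre's header/footer removal compiles the pattern the same way is not confirmed in this thread. The function name and the sample path are made up for illustration.

```python
import re

# The pattern suggested above. With Python's default flags, "." does not
# match a newline, so ".*" runs only to the end of the current line.
HEADER_RE = re.compile(r"file:///.*")


def strip_file_headers(text: str) -> str:
    """Remove every 'file:///...' run up to the end of its line.

    Hypothetical helper for illustration; not part of calibre.
    """
    return HEADER_RE.sub("", text)


# Hypothetical sample text; the path is invented and contains spaces,
# as the original poster said the real headers sometimes do.
sample = (
    "Chapter 1\n"
    "file:///C:/Documents and Settings/books/ch1.html (1 of 9)\n"
    "It was a dark night.\n"
)

cleaned = strip_file_headers(sample)

# For contrast: with re.DOTALL, "." also matches newlines, so the same
# pattern swallows everything after the first header.
greedy = re.sub(r"file:///.*", "", sample, flags=re.DOTALL)
```

The spaces inside the header are not a problem, since . matches any character except a newline. Note that the line break itself is left behind, so an empty line remains where the header was.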
04-15-2010, 02:42 PM | #3 | |
Junior Member
Posts: 6
Karma: 10
Join Date: Apr 2010
Device: nook
|
Quote:
file://.*[0-9]\) or something similar. I did play with a "workaround for dummies": converting the PDF to txt, editing the text file in Word (global search/replace), then converting the txt to EPUB. Love Calibre! Carl |
|
|