Old 04-14-2010, 09:00 PM   #1
cellocgw
regex request for specific header removal

Hi,
I'm new to Calibre but reasonably familiar with regex. I've obtained a bunch of PDFs (not DRM'd) whose headers and footers start with

file:/// followed by a long directory string. Can someone give me the correct regex to delete any string that starts like that and continues until the end of the line? (Occasionally there will be whitespace in these headers/footers.)

Many thanks; I promise to start writing my own once I get some startup help.

Carl
Old 04-14-2010, 09:44 PM   #2
kovidgoyal
creator of calibre
I don't recall whether calibre compiles regexes with . matching end-of-line characters, but if not:

file:///.*
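
A minimal sketch of how that pattern behaves, assuming calibre hands it to Python's `re` module (calibre is written in Python, but the flags it compiles with are not stated in this thread). The sample text and the `strip_file_headers` helper are invented for illustration. Without `re.DOTALL`, `.` stops at the newline, so only the header line is removed; with it, everything after the first match is deleted too.

```python
import re

# Hypothetical sample of text extracted from a PDF page.
SAMPLE = (
    "file:///C:/some/long dir/book.html (1 of 20)\n"
    "Chapter One\n"
    "It was a dark night.\n"
    "file:///C:/some/long dir/book.html (2 of 20)\n"
)

def strip_file_headers(text, flags=0):
    """Delete everything from 'file:///' to wherever '.*' stops matching."""
    return re.sub(r"file:///.*", "", text, flags=flags)

if __name__ == "__main__":
    # Default flags: '.' does not match '\n', so the body lines survive.
    print(repr(strip_file_headers(SAMPLE)))
    # With DOTALL, '.' also matches '\n' and the rest of the text is eaten.
    print(repr(strip_file_headers(SAMPLE, re.DOTALL)))
```

Spaces inside the path are not a problem here, since `.` matches whitespace other than the newline.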
Old 04-15-2010, 02:42 PM   #3
cellocgw
Quote:
Originally Posted by kovidgoyal View Post
I don't recall whether calibre compiles regexes with . matching end-of-line characters, but if not:

file:///.*
Thanks! I'll play with that and a couple of variations, since at least some of the headers end with a numeral and a closing parenthesis, so the regex would be

file://.*[0-9]\) or something similar.
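
To illustrate that variant, here is a hedged sketch using Python's `re` module directly; the sample lines are invented. Because `.*` is greedy, the match runs to the last digit-plus-`)` on the line and leaves any text after it. A line with no such ending is not matched at all, so a header without the page counter would slip through.

```python
import re

# The variant from the post: stop at a digit followed by ')'.
PATTERN = re.compile(r"file://.*[0-9]\)")

# Invented header lines for illustration.
lines = [
    "file:///C:/docs/my book/ch1.html (3 of 12)",
    "file:///C:/docs/my book/ch1.html (3 of 12) trailing text",
    "file:///C:/docs/my book/ch1.html",
]

# Remove whatever the pattern matches on each line.
cleaned = [PATTERN.sub("", line) for line in lines]

if __name__ == "__main__":
    for before, after in zip(lines, cleaned):
        print(repr(before), "->", repr(after))
```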
I also tried a 'workaround for dummies': converting the PDF to TXT, editing the text file in Word (global search/replace), then converting the TXT to EPUB.
Love Calibre!

Carl