01-15-2011, 11:25 PM | #1 |
Member
Posts: 13
Karma: 10
Join Date: Jan 2011
Device: Nook
|
Removing Headers help requested
Ok, I'm kinda dumb when things get really technical. I have a few books with headers that I would like to remove, but even after reading the tutorials I can't make any sense of how to do it. I was wondering if there was some kind soul out there who is good at this sort of thing and wouldn't mind telling me exactly what I need to enter to remove the following headers:
1. Create PDF files without this message by purchasing novaPDF printer (http://www.novapdf.com)
Using the wizard it looks like this: <a href="http://www.novapdf.com" class="calibre3">Create PDF</a> files without this message by purchasing novaPDF printer (<a href="http://www.novapdf.com" class="calibre3">http://www.novapdf.com</a>)
2. Generated by ABC Amber LIT Converter, http://www.processtext.com/abclit.html
I would appreciate it so very much! |
01-16-2011, 05:58 AM | #2 |
Wizard
Posts: 3,130
Karma: 91256
Join Date: Feb 2008
Location: Germany
Device: Cybook Gen3
|
The Amber LIT Converter stuff is tricky: there are many incarnations of that message floating around that vary subtly and make removal a pain. I believe this to be by design.
If every instance of the novaPDF message has the same code as the above, I suggest trying this: Code:
<a href=.*?\s*class="calibre\d+">Create\s*PDF</a>\s*files\s*without\s*this\s*message\s*by\s*purchasing\s*novaPDF\s*printer\s*\(<a href=.*?\s*class="calibre\d+">.*?</a>\) |
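For anyone who wants to check the expression above outside Calibre, here is a minimal sketch in Python. It assumes Calibre's search and replace accepts Python-style regular expressions (Calibre itself is written in Python). The sample HTML is made up from the snippet in the first post, and `NOVA_PATTERN`, `sample`, and `cleaned` are names chosen for illustration only.

```python
import re

# The expression from the post above, copied verbatim and split over two
# string literals only to keep the lines short.
NOVA_PATTERN = re.compile(
    r'<a href=.*?\s*class="calibre\d+">Create\s*PDF</a>\s*files\s*without\s*this\s*'
    r'message\s*by\s*purchasing\s*novaPDF\s*printer\s*\(<a href=.*?\s*class="calibre\d+">.*?</a>\)'
)

# Hypothetical page fragment: the header from post #1 between two paragraphs.
sample = (
    '<p>Some story text.</p>'
    '<a href="http://www.novapdf.com" class="calibre3">Create PDF</a> files without this '
    'message by purchasing novaPDF printer '
    '(<a href="http://www.novapdf.com" class="calibre3">http://www.novapdf.com</a>)'
    '<p>More story text.</p>'
)

# Replacing every match with an empty string removes the header.
cleaned = NOVA_PATTERN.sub('', sample)
print(cleaned)
```

The `calibre\d+` parts are what let the same expression survive if the class turns up as calibre3 in one book and calibre12 in another.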
01-16-2011, 06:08 AM | #3 |
Zealot
Posts: 122
Karma: 164
Join Date: Aug 2010
Location: Old Ynysybwl
Device: Sony PRS-300
|
When converting books with the ABC Lit range of headers, I use an intermediate step. I convert to RTF first, then use search and replace in Word to remove all the text at once, then convert to epub, mobi, etc.
|
01-16-2011, 09:52 AM | #4 |
Member
Posts: 13
Karma: 10
Join Date: Jan 2011
Device: Nook
|
Thank you so much! That worked perfectly for the NOVA header. I did try reading the regex section of the manual, but couldn't make sense of it.
|
01-16-2011, 09:56 AM | #5 |
Wizard
Posts: 3,130
Karma: 91256
Join Date: Feb 2008
Location: Germany
Device: Cybook Gen3
|
You're welcome. As for the regex tutorial, was there any part specifically you had trouble with? If so, maybe there's something there that can be improved.
|
01-16-2011, 01:53 PM | #6 |
Member
Posts: 13
Karma: 10
Join Date: Jan 2011
Device: Nook
|
There wasn't any one thing about it that I didn't understand. It was the entire thing. Trying to figure it out made my brain hurt! I don't understand where the slashes go, what to put at the beginning, etc. I tried pulling the phrase out of the wizard and pasting it into the regex field and then testing it, but nothing happened. Apparently you can't just pull the stuff from the wizard, you have to add all those extra characters and that's what I don't get.
ETA: I actually did manage to remove the name of the book that was listed on every page using the wizard, but that's the only thing I've been successful with so far.
After having success with that, I tried to remove the Page number that is posted at the end of each page. For example, Page #'s look like this: Page 1 Page 2 Page 3 etc.
Ok in the wizard it looks like this: <i class="calibre5">Page 1</i>
Ok so according to the tutorial I should make it look like this to remove the page numbers: <i class="calibre5">Page [0-9]</i> and according to the test, this actually works for the first 9 pages.
Ok so now I want to delete the two digit and beyond pages. So it should look like this: <i class="calibre5">Page [0-9][0-9][0-9]</i> according to the tutorial, but that doesn't work. What exactly am I doing wrong?
I tried <i class="calibre5">Page [0-9]+</i> and this actually worked, but I still don't understand how or why. Lol, sorry for rambling. I'm just trying to learn this.
Last edited by Arainais; 01-16-2011 at 02:11 PM. |
01-16-2011, 03:26 PM | #7 |
Wizard
Posts: 3,130
Karma: 91256
Join Date: Feb 2008
Location: Germany
Device: Cybook Gen3
|
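The '+' is a quantifier telling Calibre that you want to match one or more characters from the set, that is, one or more numerals between 0 and 9. Your [0-9][0-9][0-9] attempt asks for exactly three digits in a row, so it only matches pages 100 through 999.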
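To see the difference side by side, here is a small sketch in Python, again assuming Calibre follows Python's regex rules. The three sample page markers and the names `pages`, `patterns`, and `matched_pages` are made up for illustration.

```python
import re

# Hypothetical page markers in the form shown by the wizard.
pages = [
    '<i class="calibre5">Page 7</i>',
    '<i class="calibre5">Page 42</i>',
    '<i class="calibre5">Page 123</i>',
]

# The three expressions tried in the thread.
patterns = {
    'one digit': r'<i class="calibre5">Page [0-9]</i>',
    'three digits': r'<i class="calibre5">Page [0-9][0-9][0-9]</i>',
    'one or more': r'<i class="calibre5">Page [0-9]+</i>',
}

def matched_pages(pattern):
    """Return the sample markers that the given expression matches."""
    return [p for p in pages if re.search(pattern, p)]

for name, pattern in patterns.items():
    print(name, '->', matched_pages(pattern))
```

Only the `+` version catches all three markers, because each `[0-9]` without a quantifier stands for exactly one digit.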
|
01-16-2011, 04:50 PM | #8 |
Wizard
Posts: 4,552
Karma: 950151
Join Date: Nov 2008
Device: Sony PRS-950, iphone/ipad (Marvin/iBooks/QuickReader)
|
I find it easier to use \d* to match a number of any length, and \s* to match whitespace of any length (this includes tabs and newlines as well as spaces).
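One thing worth knowing about the `*` forms: `*` means "zero or more", so `\d*` also matches when there is no number at all, while `\d+` insists on at least one digit. A quick sketch in Python, assuming Python-style regex behaviour; the sample `text` and the names `star` and `plus` are invented for the example.

```python
import re

# Hypothetical text: "Page" followed by a tab, by a newline plus spaces,
# and finally by nothing at all.
text = 'Page\t12\nPage\n  7\nPage'

# \s* swallows tabs, newlines and spaces; \d* accepts zero or more digits.
star = re.findall(r'Page\s*\d*', text)

# \d+ requires at least one digit, so the bare "Page" is left alone.
plus = re.findall(r'Page\s*\d+', text)

print(star)
print(plus)
```

If the word "Page" can also appear in the book's real text, the `\d+` form is the safer choice for a removal expression.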
|
01-16-2011, 04:53 PM | #9 |
Member
Posts: 13
Karma: 10
Join Date: Jan 2011
Device: Nook
|
I've been playing around with this wizard and I think I may have actually figured most of it out. Thanks for all the help!
|
01-16-2011, 05:03 PM | #10 |
Wizard
Posts: 3,130
Karma: 91256
Join Date: Feb 2008
Location: Germany
Device: Cybook Gen3
|
That's good to hear.
|