01-15-2011, 10:11 AM | #1 | |
Junior Member
Posts: 6
Karma: 10
Join Date: Jan 2011
Device: kindle
|
Regexp and Alternate Page Header/Footer
I am new to ebooks and Calibre, and I am only starting to understand some of the great things it can do; I am still on the simple parts. I do have some Unix background -- but more from the grep era than Python. Clearly I am not used to the message board formatting yet either.
I am stuck on the regexp to remove the alternating headers. I more or less dumped elegance and went with brute force, as I don't fully understand the syntax [I actually have the title and the author manually typed in]. The original PDF has a different header on odd and even pages -- information in the top corners only. Quote:
On the display, what I want to delete is highlighted in yellow when I test. I have the delete header box checked, but when I run the conversion (to MOBI) I effectively get the page numbers inserted in place of the header. Is the string returning some sort of value or match number, and how do I stop it from being inserted? |
|
01-15-2011, 10:51 AM | #2 |
Wizard
Posts: 4,552
Karma: 950151
Join Date: Nov 2008
Device: Sony PRS-950, iphone/ipad (Marvin/iBooks/QuickReader)
|
I am guessing that your regex is incomplete and the headers actually have the page number there as well - so you are deleting all except the page number.
I always use the wizard button next to the regex field so that I can check that I have matched sufficient text. |
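To illustrate that guess, here is a minimal Python sketch (Calibre's regexes follow Python's `re` syntax). The header text "Sample Title" / "Sample Author" and the page layout are made up for illustration; they are not taken from the original PDF, and this is not how Calibre itself applies the pattern.

```python
import re

# Hypothetical extracted text: even pages carry "<page> <title>",
# odd pages carry "<author> <page>".
pages = (
    "2 Sample Title\n"
    "Body text of page two.\n"
    "Sample Author 3\n"
    "Body text of page three.\n"
)

# Incomplete pattern: matches only the title/author words,
# so the page numbers are left behind.
incomplete = re.compile(r"(?im)(sample title|sample author)")

# Fuller pattern: also consumes the page number on either side
# of the header text, plus the line break that follows.
complete = re.compile(
    r"(?im)^[ \t]*(?:\d+[ \t]+sample title|sample author[ \t]+\d+)[ \t]*$\n?"
)

left_incomplete = incomplete.sub("", pages)
left_complete = complete.sub("", pages)

print(repr(left_incomplete))
print(repr(left_complete))
```

With the incomplete pattern the stray "2" and "3" survive on their own lines, which is the kind of leftover described in the first post.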
01-15-2011, 10:53 AM | #3 |
Wizard
Posts: 3,130
Karma: 91256
Join Date: Feb 2008
Location: Germany
Device: Cybook Gen3
|
From what I see, that shouldn't be happening. Did you make sure the page number is highlighted as well?
You could also try to use the header removal for one type of the headers and the footer removal for the other. There may be some funky effect in that regex that I failed to notice. |
01-15-2011, 08:40 PM | #4 | |
Junior Member
Posts: 6
Karma: 10
Join Date: Jan 2011
Device: kindle
|
No, definitely used the wizard -- I am glad I found it. The page numbers are definitely with the yellow background.
If it is of any help: when I go back to the MOBI version, the header has been translated to [quote] <p class="calibre_11">2 </p><p class="calibre_11"> </p><p class="calibre_11"> [/quote] -- where two is the page number -- and [quote]<p class="calibre_11"> </p><p class="calibre_11">3 </p> [/quote] when three is the page number. They are mirror images of each other, but actually look the same in the Kindle reader.
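One way to check a converted file for this kind of leftover is to look for paragraphs that contain nothing but a number. The sketch below is only an assumption about how one might do that in plain Python; the fragment is a trimmed copy of the markup quoted above, and the pattern is illustrative rather than anything Calibre provides.

```python
import re

# Trimmed copy of the markup quoted in the post, with 2 as the page number.
fragment = '<p class="calibre_11">2 </p><p class="calibre_11"> </p>'

# Match a paragraph whose only content is digits plus optional whitespace.
page_number_para = re.compile(r'<p class="[^"]*">\s*(\d+)\s*</p>')

# Collect the page numbers that were left behind by the conversion.
numbers = page_number_para.findall(fragment)
print(numbers)
```

The whitespace-only paragraph is not matched, because the pattern requires at least one digit.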
|
|
01-15-2011, 08:51 PM | #5 |
Junior Member
Posts: 6
Karma: 10
Join Date: Jan 2011
Device: kindle
|
Page numbers are included in the highlighted yellow.
Tried the footer and header thing. I took out the (?im) and then could drop all the brackets. With a little care on the case, everything was highlighted in the wizard (for the appropriate footer and header), but the result was the same. That eliminated most of the rules I was not familiar with, and it is now a pretty straightforward regexp. Shame I cannot use grep. |
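A rough sketch of what that simplified setup might look like in plain Python: one case-sensitive pattern per header type, with no inline flags and no alternation. The header strings are hypothetical stand-ins, and applying the two substitutions in sequence is only an approximation of using the separate header and footer fields, not a description of Calibre's internals.

```python
import re

# Hypothetical extracted text with alternating headers.
text = (
    "2 Sample Title\n"
    "First body line.\n"
    "Sample Author 3\n"
    "Second body line.\n"
)

# One plain pattern per header type; the case must match exactly
# because the (?i) flag is no longer present.
even_header = r"\d+ Sample Title\n"
odd_header = r"Sample Author \d+\n"

# Apply the two removals one after the other.
without_even = re.sub(even_header, "", text)
cleaned = re.sub(odd_header, "", without_even)
print(repr(cleaned))
```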
01-15-2011, 09:03 PM | #6 | |
Junior Member
Posts: 6
Karma: 10
Join Date: Jan 2011
Device: kindle
|
Thank you for the suggestion. I tried it and it did not work -- but you got me thinking. I eventually ended up deleting the old MOBI file, and then it did work.
I looked at some of my other conversions, and it may be that Calibre resets the input file to the old MOBI file if one exists. When I used the wizard, I had a choice and chose the PDF file. Perhaps the conversion was actually running the regexp against the old MOBI file, not the PDF file. Something else I will have to watch for in future efforts. Thank you both for the assistance.
|
|
|