03-18-2012, 07:20 AM | #1 |
Guru
Posts: 718
Karma: 1085610
Join Date: Mar 2009
Location: Bristol, England
Device: PRS-T1, 1825PT, Galaxy Tab, One X, TF700T, Aura HD, Nexus 7
|
RegEx Help
I've got a load of ePubs whose text is littered with tags written as:
Code:
<a id="p2"></a>
I'd like to remove them, but unfortunately I can't figure out the RegEx for this. Previously in Sigil I would have used the wildcard mode, but now that it has been removed, my only recourse is RegEx. Any idea what to use?
Also, looking in the toc.ncx, there is a pagelist section referencing all those tags. If I remove all of those tags, can I delete the pagelist section from the toc.ncx file? |
03-18-2012, 08:31 AM | #2 |
frumious Bandersnatch
Posts: 7,515
Karma: 18512745
Join Date: Jan 2008
Location: Spaniard in Sweden
Device: Cybook Orizon, Kobo Aura
|
Something like "p\d+" for the page numbers (\d = any digit, + = 1 or more times)?
Not only can you remove the pagelist, you must if you have removed all the referenced anchors from the text. |
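For anyone who would rather script the pagelist removal than edit toc.ncx by hand, here is a rough sketch in Python. It assumes the standard NCX layout, where the page list is a `pageList` element directly under the `ncx` root; the sample document and the `strip_page_list` helper are made up for illustration and are not anything Sigil provides.

```python
import xml.etree.ElementTree as ET

# Namespace used by NCX files in EPUB 2 books.
NCX_NS = "http://www.daisy.org/z3986/2005/ncx/"

# Hypothetical, cut-down toc.ncx used only for this illustration.
SAMPLE_NCX = """<ncx xmlns="http://www.daisy.org/z3986/2005/ncx/" version="2005-1">
  <navMap>
    <navPoint id="n1" playOrder="1">
      <navLabel><text>Chapter 1</text></navLabel>
      <content src="ch1.xhtml"/>
    </navPoint>
  </navMap>
  <pageList>
    <pageTarget id="pt2" type="normal" value="2" playOrder="2">
      <navLabel><text>2</text></navLabel>
      <content src="ch1.xhtml#p2"/>
    </pageTarget>
  </pageList>
</ncx>"""


def strip_page_list(ncx_text: str) -> str:
    """Return the NCX text with every top-level pageList element removed."""
    # Keep the NCX namespace as the default one when serialising.
    ET.register_namespace("", NCX_NS)
    root = ET.fromstring(ncx_text)
    for page_list in root.findall(f"{{{NCX_NS}}}pageList"):
        root.remove(page_list)
    # Note: ElementTree does not preserve a DOCTYPE line, so a real file
    # would need that line re-added (or the edit done by hand).
    return ET.tostring(root, encoding="unicode")


cleaned = strip_page_list(SAMPLE_NCX)
print(cleaned)
```

The navMap is left untouched; only the entries that pointed at the removed `#p...` anchors go away.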
03-22-2012, 01:30 AM | #3 |
Connoisseur
Posts: 61
Karma: 12096
Join Date: Sep 2010
Location: Tasmania
Device: Sony PRS 650
|
Find: <a[^>]*>
Repl: (leave blank)
Sigil takes care of the </a> |
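To see what this pattern actually touches, here is a small sketch using Python's `re` module as a stand-in for Sigil's engine (an assumption, though the pattern uses only syntax the two share). The sample markup is invented. Note that `<a[^>]*>` matches any opening tag whose name starts with "a", so real hyperlinks and tags such as `<abbr>` are stripped too, and the closing tags are left behind for a later cleanup step.

```python
import re

# Invented sample: one page anchor, one real hyperlink, one <abbr> tag.
sample = (
    '<p><a id="p2"></a>See <a href="notes.xhtml#n1">note 1</a>, '
    '<abbr>p.</abbr> 3.</p>'
)

# "<a", then anything that is not ">", then ">".
broad = re.compile(r'<a[^>]*>')

result = broad.sub('', sample)
print(result)
# → <p></a>See note 1</a>, p.</abbr> 3.</p>
```

All three opening tags are gone and three orphaned closing tags remain, so this is only safe when the book has no other tags starting with `<a`.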
03-22-2012, 08:08 AM | #4 |
Fanatic
Posts: 580
Karma: 810184
Join Date: Sep 2010
Location: Norway
Device: prs-t1, tablet, Nook Simple, assorted kindles, iPad
|
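Find:
<a id="p\d\+"><\/a>
(Depends a bit on which regex flavour you use; you might have to remove a backslash or two. In PCRE-style engines, for example, \+ matches a literal plus sign, so the quantifier there is a plain +.)
Faster's solution should work as well, but it will remove all anchors. |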
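A quick way to check the backslash question is to try both spellings on a scrap of text. The sketch below uses Python's `re` module, which treats `\+` and `\/` the same way PCRE does; the sample markup and variable names are invented for the test.

```python
import re

# Invented sample: two page anchors, one other anchor, one hyperlink.
sample = (
    '<p><a id="p2"></a>Text <a id="p13"></a>more '
    '<a id="note1"></a><a href="x.xhtml">link</a></p>'
)

# Unescaped "+" is the "one or more" quantifier in PCRE-style engines.
page_anchor = re.compile(r'<a id="p\d+"></a>')
cleaned = page_anchor.sub('', sample)
print(cleaned)
# → <p>Text more <a id="note1"></a><a href="x.xhtml">link</a></p>

# With "\+" the plus is a literal character, so nothing in the sample matches.
escaped_plus = re.compile(r'<a id="p\d\+"><\/a>')
print(escaped_plus.search(sample))              # → None
print(bool(escaped_plus.search('<a id="p2+"></a>')))  # → True
```

Only the empty `p<number>` anchors are removed; the `note1` anchor and the hyperlink survive, which is the difference from the broader `<a[^>]*>` pattern.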
03-22-2012, 09:24 AM | #5 |
Guru
Posts: 718
Karma: 1085610
Join Date: Mar 2009
Location: Bristol, England
Device: PRS-T1, 1825PT, Galaxy Tab, One X, TF700T, Aura HD, Nexus 7
|
Thanks for all the assistance on this, I'll be having a go at it tonight.
BTW, I'll be doing this in Sigil, so the RegEx engine will be PCRE. |