12-05-2011, 02:43 PM | #1 |
Addict
Posts: 320
Karma: 56788
Join Date: Jun 2011
Device: Kindle
|
Reg-ex help...?
Hi, all. I'm back for some additional reg-ex instruction...
I'm trying to remove <div> tags from a document in bulk, but can't seem to figure out what expression I should be using to find them. Here's a sample of the code I'm working on: Spoiler:
Now, the expression I used (with the intent of replacing it with "\1") was: Code:
<div class="calibre1">([^<]*)</div>
Can someone let me know where exactly my brain is letting me down? |
12-05-2011, 03:26 PM | #2 |
Jr. - Junior Member
Posts: 586
Karma: 2000358
Join Date: Aug 2010
Location: Alabama
Device: Archos, Asus, HP, Lenovo, Nexus and Samsung tablets in 7,8 and 10"
|
Find: <div class="calibre1">(.*)</div>
Replace: \1
Should do it.
Regards, John |
12-05-2011, 03:28 PM | #3 |
Evangelist
Posts: 416
Karma: 1045911
Join Date: Sep 2011
Location: Cape Town, South Africa
Device: Kindle 3
|
If you want to get rid of all divs, it's easy enough to just delete all div tags. This is most likely fine for most uses, since paragraphs and such don't need to be contained in a div, but be careful: title/foreword pages can get a bit strange if they rely on lots of silly CSS.
Anyway, just use: Code:
</?div\b[^<>]*> |
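For anyone who wants to try the expression outside the editor, here is a minimal Python sketch of the same delete-all-div-tags replacement. The sample markup is made up (the original sample from the first post was not preserved), and Python's `re` module is only a stand-in for the editor's regex engine, so treat it as an illustration rather than a description of what the editor does internally.

```python
import re

# Hypothetical sample markup; not the original poster's document.
html = '<div class="calibre1"><p>First paragraph.</p></div>\n<div><p>Second.</p></div>'

# Matches an opening or closing div tag, with or without attributes.
DIV_TAG = re.compile(r'</?div\b[^<>]*>')

# Replace every div tag with nothing; the content between the tags is kept.
stripped = DIV_TAG.sub('', html)
print(stripped)
```

Only the tags themselves are removed, so the paragraphs inside each div survive unchanged.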
12-05-2011, 03:42 PM | #4 |
Addict
Posts: 320
Karma: 56788
Join Date: Jun 2011
Device: Kindle
|
Thanks, guys!
@Jabby - Unfortunately, I tried that one, too. It just selects the entire document! Yikes!
@Serpentine - You say it's easy enough to just delete all the div tags. I noticed that your reg-ex will do just that, but did you mean there's an easy way to delete all div tags without writing reg-ex? As always, if I could impose on you to explain part of your code, too, I'd be most grateful. Specifically: "\b[^<>]*". Thanks |
12-05-2011, 04:13 PM | #5 | |
Evangelist
Posts: 416
Karma: 1045911
Join Date: Sep 2011
Location: Cape Town, South Africa
Device: Kindle 3
|
Quote:
\b[^<>]* is just a 'nice' way of dealing with tags and attributes. \b matches at either end of a word, where a word is generally anything that matches \w+. The \b stops matches where there's only a partial tag name. It's a habit from searching for something like a <p> tag, where you need to be careful to avoid <pre> tags.
Code:
<p([^<>]*)> // will match both <p yup="1"> and <pre something="wat">
<p\b[^<>]*> // will match p's but not pre's.
Code:
Using the sample : <p Some text here</p>
</?p\b[^<>]*> : <p Some text here</p>
</?p\b[^>]*> : <p Some text here</p>
Last edited by Serpentine; 12-05-2011 at 04:16 PM. Reason: better example |
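The two comparisons above can be checked with a short Python sketch. Python's `re` module is an assumption here (a stand-in for the editor's engine), and the capture group from the first pattern is dropped so that `findall` returns whole matches. The variable names are illustrative only.

```python
import re

tags = '<p yup="1"> <pre something="wat">'

# Without \b, "<p" also matches the start of "<pre".
without_boundary = re.findall(r'<p[^<>]*>', tags)
# With \b, the tag name must end right after "p".
with_boundary = re.findall(r'<p\b[^<>]*>', tags)

# Malformed sample from the post: the opening tag is missing its ">".
broken = '<p Some text here</p>'

# Excluding "<" from the class stops the match from running into the next tag.
strict = re.findall(r'</?p\b[^<>]*>', broken)
# Allowing "<" lets the match swallow everything up to the first ">".
loose = re.findall(r'</?p\b[^>]*>', broken)

print(without_boundary, with_boundary, strict, loose)
```

On the malformed sample, the stricter class matches only the closing tag, while the looser one matches the whole line, text included.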
|
12-05-2011, 04:28 PM | #6 |
Addict
Posts: 320
Karma: 56788
Join Date: Jun 2011
Device: Kindle
|
@Serpentine - Thank you for your patience. Sincerely. I make a real effort not to ask questions whose answers I can extrapolate from previous answers to previous questions that I've asked. The only way to reliably do that is to understand why those previous answers worked the way they did. When the more experienced users (such as yourself) break down the reg-ex logic, it's truly invaluable to me. So, again: sincerely grateful.
|
12-05-2011, 07:40 PM | #7 |
Guru
Posts: 696
Karma: 150000
Join Date: Feb 2010
Device: none
|
|
12-05-2011, 07:44 PM | #8 |
Addict
Posts: 320
Karma: 56788
Join Date: Jun 2011
Device: Kindle
|
@st_albert - Nope. I'll try that on the next file I'm messing with. I remember looking up what "minimal matching" meant in the Sigil tutorial, but I guess I didn't/don't really understand. Could you explain it to me?
|
12-05-2011, 08:30 PM | #9 | |
Guru
Posts: 696
Karma: 150000
Join Date: Feb 2010
Device: none
|
Quote:
Code:
find: <div>(.*)</div>
Probably Serpentine could explain this more succinctly. |
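A rough way to see the difference between greedy and minimal matching is the Python sketch below. It assumes that the editor's "minimal matching" option behaves like a lazy quantifier (`.*?`), which is how the term is normally used; the sample string is made up.

```python
import re

# Hypothetical one-line sample with two separate divs.
html = '<div>one</div><p>between</p><div>two</div>'

# Greedy: .* grabs as much as possible, so the match runs from the
# first <div> to the LAST </div>.
greedy = re.findall(r'<div>(.*)</div>', html)

# Minimal (lazy): .*? stops at the first </div> it can reach.
minimal = re.findall(r'<div>(.*?)</div>', html)

# Replacing each match with \1 keeps the content and drops the tags.
unwrapped = re.sub(r'<div>(.*?)</div>', r'\1', html)

print(greedy)
print(minimal)
print(unwrapped)
```

The greedy version is why the expression can appear to select the whole document: one match stretches from the first opening tag to the last closing tag. Note that the lazy version still pairs tags incorrectly when divs are nested.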
|
12-05-2011, 08:30 PM | #10 | |
Well trained by Cats
Posts: 29,981
Karma: 56143930
Join Date: Aug 2009
Location: The Central Coast of California
Device: Kobo Libra2,Kobo Aura2v1, K4NT(Fixed: New Bat.), Galaxy Tab A
|
Quote:
I leave it ticked; Case only gets enabled when I want the exact case to match |
|
12-05-2011, 10:29 PM | #11 |
Addict
Posts: 320
Karma: 56788
Join Date: Jun 2011
Device: Kindle
|
@st_albert - That was perfectly clear. Thank you. I had always simply assumed it was selecting the entire document, but your explanation makes total sense. I'll be sure to keep that box checked from now on.
--- For the record, I am also not agree with topicstarter. Topicstarter excessive political opinionation on regular expressions. I am disappoint with topicstarter. Last edited by ElMiko; 12-06-2011 at 10:19 AM. Reason: changed grammar in title for the sake of consistency |
12-06-2011, 12:41 AM | #12 |
Wizard
Posts: 4,520
Karma: 121692313
Join Date: Oct 2009
Location: Heemskerk, NL
Device: PRS-T1, Kobo Touch, Kobo Aura
|
|
|