#1 |
Bookish
Posts: 1,011
Karma: 2003162
Join Date: Jun 2011
Device: PC, t1, t2, t3, Clara BW, Clara HD, Libra 2, Libra Color, Nxtpaper 11
|
I am trying to remove unwanted spans of the form:
Code:
<span class="font1 font2 something">interesting text</span>
Code:
find: <span(.*)>(.*)</span>
replace: \2
It seems that the regex now behaves more greedily than before (while dotall is *not* enabled). What am I missing?
#2 |
Interested in the matter
Posts: 421
Karma: 426094
Join Date: Dec 2011
Location: Spain, south coast
Device: Pocketbook InkPad 3
|
find: <span.*?>(.*?)</span>
replace: \1 |
#3 |
Bookish
Posts: 1,011
Karma: 2003162
Join Date: Jun 2011
Device: PC, t1, t2, t3, Clara BW, Clara HD, Libra 2, Libra Color, Nxtpaper 11
|
@jbacelar: Thanks, but no, that is not the solution. A combination of '*' (zero or more) and '?' (zero or one) together does not make sense here.
I just tried my original regex within calibre 1.31 and it works as intended. However, it does not seem to work in calibre v1.32, which is odd. |
#4 |
Zealot
Posts: 142
Karma: 669192
Join Date: Nov 2013
Device: Kindle 4.1.1 no touch
|
I don't get what you want to express... Could you post some examples?
#5 |
creator of calibre
Posts: 45,268
Karma: 27111060
Join Date: Oct 2006
Location: Mumbai, India
Device: Various
|
* - means match zero or more, matching as many as possible, i.e. be greedy
*? - means match zero or more, matching as few as necessary, i.e. don't be greedy
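To make the difference concrete, here is a minimal Python sketch. It assumes Python-style regular expressions and a made-up sample line; the two patterns are the ones posted earlier in this thread.

```python
import re

# Hypothetical sample line; the class names are invented for illustration.
line = '<span class="a">one</span> and <span class="b">two</span>'

# Greedy: the first (.*) swallows as much of the line as it can, so the
# match runs from the first <span to the last </span>.
greedy = re.sub(r'<span(.*)>(.*)</span>', r'\2', line)

# Lazy: .*? stops at the first possible '>' and the first possible '</span>',
# so each span is matched separately.
lazy = re.sub(r'<span.*?>(.*?)</span>', r'\1', line)

print(greedy)  # two
print(lazy)    # one and two
```

With the greedy pattern the whole line collapses into the contents of the last span, which is the "too greedy" behavior described in the first post.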
#6 |
Zealot
Posts: 142
Karma: 669192
Join Date: Nov 2013
Device: Kindle 4.1.1 no touch
|
If I understood correctly, he's saying that
Code:
<span(.*)>(.*)</span>
Code:
<span>something<p><crlf>
#7 |
Bookish
Posts: 1,011
Karma: 2003162
Join Date: Jun 2011
Device: PC, t1, t2, t3, Clara BW, Clara HD, Libra 2, Libra Color, Nxtpaper 11
|
I checked, and it appears the text is very badly formatted: </span> is sometimes missing, which causes the weird behavior of selecting unexpectedly large sections. That my original regex worked in the past for other texts may have been pure luck.
Oh well, at least I learned a new regex trick and refreshed my knowledge of "lazy quantifiers" versus "greedy quantifiers". Thanks all!
Last edited by DrChiper; 04-16-2014 at 11:11 AM.
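A small Python sketch of how a missing </span> produces an oversized match, even with the lazy pattern. The markup below is invented for illustration and is not taken from the book in question.

```python
import re

# Hypothetical badly formed markup: the first span is never closed.
broken = '<span class="a">one <span class="b">two</span> three'

m = re.search(r'<span.*?>(.*?)</span>', broken)

# The match starts at the unclosed span and only ends at the closing tag
# that really belongs to the second span.
print(m.group(0))  # <span class="a">one <span class="b">two</span>
print(m.group(1))  # one <span class="b">two
```

The regex has no notion of which closing tag belongs to which opening tag, so it simply pairs the first <span it finds with the next </span> it can reach.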
#8 |
Grand Sorcerer
Posts: 28,517
Karma: 204127028
Join Date: Jan 2010
Device: Nexus 7, Kindle Fire HD
|
Just remember: using regex to remove only certain span|div tags will usually blow up in your face wherever spans|divs are nested.
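As an illustration of that warning, here is a minimal Python sketch. The class names "junk" and "keep" are hypothetical, and the pattern is just one plausible way someone might try to remove only the junk spans.

```python
import re

# Hypothetical markup: a span we want to remove wraps a span we want to keep.
nested = '<span class="junk">a <span class="keep">b</span> c</span>'

# Try to strip only the "junk" span, keeping its contents.
result = re.sub(r'<span class="junk">(.*?)</span>', r'\1', nested)

print(result)  # a <span class="keep">b c</span>
```

The opening "junk" tag is removed together with the closing tag of the inner span, so the leftover </span> now closes the "keep" span. The output is still well formed, but the "keep" styling silently extends over " c" as well, which is the kind of damage that is easy to miss.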
#9 |
Ex-Helpdesk Junkie
Posts: 19,421
Karma: 85400180
Join Date: Nov 2012
Location: The Beaten Path, USA, Roundworld, This Side of Infinity
Device: Kindle Touch fw5.3.7 (Wifi only)
|
The modify epub plugin is going in interesting places regarding junk spans. Might be worth a look...
Using a negative lookahead you can avoid nesting issues; there is an example by me in that thread.
Last edited by eschwartz; 04-17-2014 at 03:17 AM.
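The example from that other thread is not reproduced here. As a rough sketch of the general idea only, the following Python snippet uses a negative lookahead so that a match may not contain another span tag; the pattern and the name INNERMOST are my own assumptions, not the pattern posted there.

```python
import re

# Assumed pattern: match a span whose content contains no further
# "<span" or "</span", i.e. only an innermost span.
# Note: [^>]* assumes attribute values never contain a literal '>'.
INNERMOST = re.compile(r'<span\b[^>]*>((?:(?!</?span\b).)*)</span>')

# Hypothetical nested markup.
nested = '<span class="outer">a <span class="inner">b</span> c</span>'

once = INNERMOST.sub(r'\1', nested)
print(once)  # <span class="outer">a b c</span>
```

Because each match is guaranteed to be a complete innermost span, opening and closing tags stay correctly paired. Running the substitution again would then remove the outer span as well.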