|
|
#1 |
|
Bookish
Posts: 1,049
Karma: 2006208
Join Date: Jun 2011
Device: PC, t1, t2, t3, Clara BW, Clara HD, Libra 2, Libra Color, Nxtpaper 11
|
I am trying to remove unwanted spans of the form:
Code:
<span class="font1 font2 something">interesting text</span>
Code:
find: <span(.*)>(.*)</span>
replace: \2
It seems that the regex now behaves more greedily than before (while dotall is *not* enabled). What am I missing? |
|
|
|
|
|
#2 |
|
Interested in the matter
Posts: 421
Karma: 426094
Join Date: Dec 2011
Location: Spain, south coast
Device: Pocketbook InkPad 3
|
find: <span.*?>(.*?)</span>
replace: \1 |
|
|
|
|
|
|
|
#3 |
|
Bookish
Posts: 1,049
Karma: 2006208
Join Date: Jun 2011
Device: PC, t1, t2, t3, Clara BW, Clara HD, Libra 2, Libra Color, Nxtpaper 11
|
@jbacelar: Thanks, but no, that is not the solution. A combination of '*' (zero or more) and '?' (zero or one) together does not make sense here.
I just tried my original regex within calibre 1.31 and it works as intended. However, it does not seem to work in calibre v1.32, which is odd. |
|
|
|
|
|
#4 |
|
Zealot
Posts: 142
Karma: 669192
Join Date: Nov 2013
Device: Kindle 4.1.1 no touch
|
I don't get what you want to express... Could you post some examples?
|
|
|
|
|
|
#5 |
|
creator of calibre
Posts: 45,609
Karma: 28549044
Join Date: Oct 2006
Location: Mumbai, India
Device: Various
|
* means match zero or more, matching as many as possible, i.e. be greedy
*? means match zero or more, matching as few as necessary, i.e. don't be greedy |
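To make the greedy/lazy difference concrete, here is a minimal Python sketch. calibre's search uses Python-style regular expressions, so plain `re` should behave the same way for these patterns; the sample line is made up for illustration and is not taken from the book discussed in this thread.

```python
import re

# Hypothetical sample: two spans on the same line.
html = '<span class="a">one</span> and <span class="b">two</span>'

# Original pattern from post #1: both groups are greedy.
greedy = re.compile(r'<span(.*)>(.*)</span>')
# Pattern from post #2: both quantifiers are lazy.
lazy = re.compile(r'<span.*?>(.*?)</span>')

# Greedy: the first group swallows everything up to the last '>' that
# still lets the rest of the pattern match, so the whole line is one match.
greedy_result = greedy.sub(r'\2', html)

# Lazy: each span is matched separately, stopping at the nearest '</span>'.
lazy_result = lazy.sub(r'\1', html)

print(greedy_result)  # two
print(lazy_result)    # one and two
```

With only one span per line both patterns give the same result, which may be why the greedy version appears to work on some texts.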
|
|
|
|
|
|
|
#6 |
|
Zealot
Posts: 142
Karma: 669192
Join Date: Nov 2013
Device: Kindle 4.1.1 no touch
|
If I understood correctly, he's saying that
Code:
<span(.*)>(.*)</span>
Code:
<span>something<p><crlf> |
|
|
|
|
|
#7 |
|
Bookish
Posts: 1,049
Karma: 2006208
Join Date: Jun 2011
Device: PC, t1, t2, t3, Clara BW, Clara HD, Libra 2, Libra Color, Nxtpaper 11
|
I checked, and it appears the text was very badly formatted: </span> is sometimes missing, which causes the weird behavior of selecting unexpectedly large sections. That my original regex worked in the past for other texts may just have been pure luck.
Oh well, at least I learned a new regex trick and refreshed my knowledge of "lazy quantifiers" versus "greedy quantifiers". Thanks all!
Last edited by DrChiper; 04-16-2014 at 12:11 PM. |
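A small Python sketch of how a missing </span> can lead to an unexpectedly large selection even with lazy quantifiers. The broken markup below is an invented example, assumed to resemble the kind of damage described; it is not the actual text.

```python
import re

lazy = re.compile(r'<span.*?>(.*?)</span>')

# Hypothetical broken line: the first span is never closed.
broken = '<span class="a">one and <p>keep</p> <span class="b">two</span>'

m = lazy.search(broken)
matched = m.group(0)  # the text the search would select
inner = m.group(1)    # what the replacement \1 would keep

# The match starts at the unclosed span and only stops at the first
# '</span>' it can find, which belongs to the second span.
print(matched == broken)  # True
print(inner)              # one and <p>keep</p> <span class="b">two
```

The replacement then leaves a stray opening `<span class="b">` behind, so the markup ends up even less balanced than before.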
|
|
|
|
|
#8 |
|
Grand Sorcerer
Posts: 28,892
Karma: 207182180
Join Date: Jan 2010
Device: Nexus 7, Kindle Fire HD
|
Just remember: using regex to remove only certain span|div tags will usually blow up in your face wherever spans|divs are nested.
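A quick Python illustration of the nesting problem, using invented class names ("junk" and "keep" are placeholders, not classes from any real book):

```python
import re

# Hypothetical markup: an unwanted span wrapping a span we want to keep.
html = '<span class="junk">outer <span class="keep">inner</span> tail</span>'

# Try to unwrap only the "junk" span.
pattern = re.compile(r'<span class="junk">(.*?)</span>')
result = pattern.sub(r'\1', html)

# The lazy group stops at the inner span's closing tag, so the outer
# closing tag is left behind and now closes the "keep" span instead.
print(result)  # outer <span class="keep">inner tail</span>
```

The output is still well-formed, which makes this kind of damage easy to miss: " tail" has silently moved inside the "keep" span.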
|
|
|
|
|
|
#9 |
|
Ex-Helpdesk Junkie
Posts: 19,421
Karma: 85400180
Join Date: Nov 2012
Location: The Beaten Path, USA, Roundworld, This Side of Infinity
Device: Kindle Touch fw5.3.7 (Wifi only)
|
The modify epub plugin is going in interesting places regarding junk spans. Might be worth a look...
Using a negative lookahead you can avoid nesting issues; there is an example by me in that thread.
Last edited by eschwartz; 04-17-2014 at 04:17 AM. |
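For readers who have not seen the technique, here is one way a negative lookahead can be used for this in Python. This is a sketch of the general idea only, not the example from the other thread, and the class names are made up. The lookahead `(?!<span\b)` refuses to step over another opening span, so the pattern only matches "junk" spans that contain no nested span.

```python
import re

# Match a "junk" span only if no other <span opens before its </span>.
pattern = re.compile(r'<span class="junk">((?:(?!<span\b).)*?)</span>')

flat = '<span class="junk">plain</span> <span class="keep">x</span>'
nested = '<span class="junk">outer <span class="keep">inner</span> tail</span>'

flat_result = pattern.sub(r'\1', flat)
nested_result = pattern.sub(r'\1', nested)

print(flat_result)              # plain <span class="keep">x</span>
print(nested_result == nested)  # True: the nested case is left untouched
```

The nested span is skipped rather than fixed, so those cases still need to be handled by hand or with a tool that actually parses the markup.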
|
|
|