Old 01-02-2015, 05:04 PM   #451
amrac
Junior Member
Posts: 2
Karma: 75742
Join Date: Jan 2015
Location: Lees, United Kingdom
Device: none
This will work:
<span class="italics">\w+ [\w+ ,]{1,}</span>

To cover punctuation or any other character a sentence may contain, add those characters to the character class, for example:
<span class="italics">\w+ [\w+ ,\.\?\-]{1,}</span>
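A quick way to check what these two patterns accept is to run them against a few sample spans. The sketch below uses Python's `re` module rather than Sigil's own regex engine, and the sample strings are made up for illustration, so treat it as a rough check only.

```python
import re

# The two patterns from the post above, copied verbatim.
BASIC = re.compile(r'<span class="italics">\w+ [\w+ ,]{1,}</span>')
WITH_PUNCT = re.compile(r'<span class="italics">\w+ [\w+ ,\.\?\-]{1,}</span>')

# Hypothetical sample markup, not taken from any real book.
plain = '<span class="italics">Hello world, again</span>'
question = '<span class="italics">Is this right?</span>'
single = '<span class="italics">Hello</span>'

for name, sample in [("plain", plain), ("question", question), ("single", single)]:
    # Prints the sample name, then whether each pattern matches the whole string.
    print(name, bool(BASIC.fullmatch(sample)), bool(WITH_PUNCT.fullmatch(sample)))
```

Inside the square brackets the `+` is a literal plus sign, not a quantifier. Both patterns also require a space after the first word, so a span holding a single word is not matched by either one.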
Old 01-02-2015, 05:56 PM   #452
DiapDealer
Grand Sorcerer
Posts: 27,546
Karma: 193191846
Join Date: Jan 2010
Device: Nexus 7, Kindle Fire HD
Whose question are you answering?
Old 01-12-2015, 10:34 AM   #453
1v4n0
Groupie
Posts: 171
Karma: 40000
Join Date: Oct 2013
Device: kindle
Removing empty html elements

How about a regex that finds everything that has the structure of

<AAAwhatever></AAA>

i.e. all empty html elements.

Or, even better, all elements that either are empty or that contain just a space.

EDIT Looks like this one is working, though I'm not entirely sure why.

Code:
<[^/>]+>[ \n\r\t]*</[^>]+>

Last edited by 1v4n0; 01-12-2015 at 10:51 AM.
Old 01-12-2015, 10:55 AM   #454
eschwartz
Ex-Helpdesk Junkie
Posts: 19,422
Karma: 85397180
Join Date: Nov 2012
Location: The Beaten Path, USA, Roundworld, This Side of Infinity
Device: Kindle Touch fw5.3.7 (Wifi only)
Code:
<(\w+)( [^<>]+)?>(\s|&nbsp;)*</\1>

Last edited by eschwartz; 01-12-2015 at 11:00 AM.
Old 01-12-2015, 11:06 AM   #455
mzmm
Groupie
Posts: 171
Karma: 86271
Join Date: Feb 2012
Device: iPad, Kindle Touch, Sony PRS-T1
Quote:
Originally Posted by 1v4n0
How about a regex that finds everything that has the structure of

<AAAwhatever></AAA>

i.e. all empty html elements.

Or, even better, all elements that either are empty or that contain just a space.
i use something like this to catch paragraph tags that are empty or contain only whitespace. it also catches nested tags, so it matches things like

<p><i><b><br/></b></i></p>

Code:
(?s)<p[^>]*?>\s*?(?:<\w[^>/]*?>)*?\s*?(?:&nbsp;|*|<br(?:\s|\s/|/)?>)*?\s*?(?:</\w[^>/]*?>)*?\s*?</p>
Quote:
Originally Posted by 1v4n0
EDIT Looks like this one is working, though I'm not entirely sure why.

Code:
<[^/>]+>[ \n\r\t]*</[^>]+>
because it's looking for

Code:
<[^/>]+>
an opening bracket, then one or more of anything except closing brackets or forward slashes, then a closing bracket (i.e. an opening tag)

Code:
[ \n\r\t]*
followed by zero or more spaces, newlines, carriage returns or tabs

Code:
</[^>]+>
followed by an opening bracket and a forward slash, then one or more of anything except closing brackets, then a closing bracket (i.e. a closing tag, though not necessarily one with the same name as the opening tag)
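To see the three pieces working together, here is a minimal sketch using Python's `re` module. That is an assumption on my part: Sigil has its own regex engine, though this pattern uses nothing engine-specific. The sample strings are invented.

```python
import re

# The three pieces of the pattern, named for readability.
OPEN = r"<[^/>]+>"     # an opening tag: no "/" or ">" between the brackets
GAP = r"[ \n\r\t]*"    # zero or more spaces, newlines, carriage returns, tabs
CLOSE = r"</[^>]+>"    # any closing tag, whatever its name
EMPTY = re.compile(OPEN + GAP + CLOSE)

# Hypothetical samples: empty, whitespace only, entity, real text, mismatched tags.
samples = ["<p></p>", "<p> \n\t</p>", "<p>&nbsp;</p>", "<p>text</p>", "<b></i>"]
for s in samples:
    print(repr(s), bool(EMPTY.search(s)))
```

The `&nbsp;` sample is not found because the entity is not one of the whitespace characters listed in the middle piece. The last sample is found even though the tag names differ, because nothing ties the closing tag name to the opening one.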

Last edited by mzmm; 01-12-2015 at 11:09 AM.
Old 01-12-2015, 11:09 AM   #456
JSWolf
Resident Curmudgeon
Posts: 73,897
Karma: 128597114
Join Date: Nov 2006
Location: Roslindale, Massachusetts
Device: Kobo Libra 2, Kobo Aura H2O, PRS-650, PRS-T1, nook STR, PW3
Quote:
Originally Posted by 1v4n0
How about a regex that finds everything that has the structure of

<AAAwhatever></AAA>

i.e. all empty html elements.

Or, even better, all elements that either are empty or that contain just a space.

EDIT Looks like this one is working, though I'm not entirely sure why.

Code:
<[^/>]+>[ \n\r\t]*</[^>]+>
Don't use this. It won't work. Sorry, but you cannot guarantee that it won't mess something up.

Take a look at the following line...

<p><span>This is some text.<span class="smallcaps">THIS IS MORE TEXT</span>. This is yet more text.</span> And finally the last bit of text.</p>

Can you use regex to get rid of the empty span without messing up the span that actually does something? I don't see how you can.
Old 01-12-2015, 11:11 AM   #457
eschwartz
Didn't notice you found a solution.

Note that mine should also find &nbsp; HTML entities.

Additionally, it makes sense to ensure the two tags match, which I have done.
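As a rough illustration of those two differences, the sketch below runs my pattern and the earlier one from this thread side by side. It uses Python's `re` module, which is an assumption (Sigil's engine may differ in details), and the sample strings are made up.

```python
import re

# The earlier pattern from the thread: any opening tag, whitespace, any closing tag.
LOOSE = re.compile(r"<[^/>]+>[ \n\r\t]*</[^>]+>")
# The backreference \1 forces the closing tag name to equal the opening tag name.
MATCHED = re.compile(r"<(\w+)( [^<>]+)?>(\s|&nbsp;)*</\1>")

# Hypothetical samples: entity only, mismatched tags, attribute plus a space.
samples = ["<p>&nbsp;</p>", "<b></i>", '<span class="x"> </span>']
for s in samples:
    print(repr(s), bool(LOOSE.search(s)), bool(MATCHED.search(s)))
```

Only the second pattern finds the `&nbsp;` paragraph, and only the first one finds the mismatched `<b></i>` pair. Both find the whitespace-only span with an attribute.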

Last edited by eschwartz; 01-12-2015 at 11:16 AM.
Old 01-12-2015, 11:14 AM   #458
eschwartz
Quote:
Originally Posted by JSWolf
Don't use this. It won't work. Sorry, but you cannot guarantee that it won't mess something up.

Take a look at the following line...

<p><span>This is some text.<span class="smallcaps">THIS IS MORE TEXT</span>. This is yet more text.</span> And finally the last bit of text.</p>

Can you use regex to get rid of the empty span without messing up the span that actually does something? I don't see how you can.
The reason it won't do anything... is because there is no empty tag set, span or otherwise.
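This is easy to check outside Sigil. The sketch below feeds the quoted line to both patterns from this thread using Python's `re` module (an assumption: Sigil's own engine is not used here, so this is only a sanity check).

```python
import re

LOOSE = re.compile(r"<[^/>]+>[ \n\r\t]*</[^>]+>")
MATCHED = re.compile(r"<(\w+)( [^<>]+)?>(\s|&nbsp;)*</\1>")

# The example line from the quoted post, split across two literals for width.
line = ('<p><span>This is some text.<span class="smallcaps">THIS IS MORE TEXT</span>'
        '. This is yet more text.</span> And finally the last bit of text.</p>')

for name, pattern in [("loose", LOOSE), ("matched", MATCHED)]:
    cleaned, count = pattern.subn("", line)
    # count is the number of replacements; the line is unchanged when it is 0.
    print(name, count, cleaned == line)
```

Neither pattern makes a replacement, since every opening tag in that line is followed directly by text or by another opening tag.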
Old 01-12-2015, 12:58 PM   #459
1v4n0
Quote:
Originally Posted by mzmm
Thanks for the explanation. Your code doesn't work though.
Old 01-12-2015, 01:17 PM   #460
mzmm
had some trouble pasting this in, but anyway, it's fixed

Code:
(?s)<p[^>]*?>\s*?(?:<\w[^>/]*?>)*?\s*?(?:&nbsp;|<br(?:\s|\s/|/)?>)*?\s*?(?:</\w[^>/]*?>)*?\s*?</p>

Last edited by mzmm; 01-12-2015 at 01:21 PM.
Old 01-12-2015, 02:25 PM   #461
1v4n0
hmm still doesn't work. Only finds the tags with &nbsp; inside.

Last edited by 1v4n0; 01-12-2015 at 03:21 PM.
Old 01-12-2015, 03:44 PM   #462
eschwartz
Use mine. It removes matched tag pairs that have no content or that contain only whitespace or an &nbsp; entity.

Regex is not a programming language. The appropriate way to remove multiple nested sets is to repeat Replace All until no matches are left.
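Outside Sigil, the same repeat-until-nothing-is-left idea can be sketched in a few lines of Python. The function name `strip_empty_pairs`, the `max_passes` safety limit and the sample markup are all hypothetical, and Python's `re` module stands in for Sigil's engine.

```python
import re

EMPTY_PAIR = re.compile(r"<(\w+)( [^<>]+)?>(\s|&nbsp;)*</\1>")

def strip_empty_pairs(markup, max_passes=50):
    """Repeat a 'Replace All' with an empty replacement until nothing matches.

    Returns the cleaned markup and the number of passes that removed something.
    max_passes is only a guard against looping forever.
    """
    passes = 0
    while passes < max_passes:
        markup, replaced = EMPTY_PAIR.subn("", markup)
        if replaced == 0:
            break
        passes += 1
    return markup, passes

# Hypothetical input: three nested empty elements next to a real paragraph.
sample = "<div><p><i><b>&nbsp;</b></i></p><p>Keep me</p></div>"
print(strip_empty_pairs(sample))
```

Each pass only removes the innermost empty pair, which is why the nested `<p><i><b>` set needs three passes before the paragraph itself disappears.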
Old 01-12-2015, 04:06 PM   #463
JSWolf
Quote:
Originally Posted by eschwartz
The reason it won't do anything... is because there is no empty tag set, span or otherwise.
There is an empty span set. It's the span that does nothing. The other span does something so it's not empty. But regex will not recognize which </span> is the closing for empty span.
Old 01-12-2015, 04:11 PM   #464
eschwartz
Quote:
Originally Posted by JSWolf
There is an empty span set. It's the span that does nothing. The other span does something so it's not empty. But regex will not recognize which </span> is the closing for empty span.
Perhaps the fundamental point of this particular question has escaped you. The OP is asking for something to find tags which do not enclose rendered text.

It is also worth pointing out that the span tag *can* be styled without attributes. Which would be the only reason to have an attributeless span tag anyway.
Old 01-23-2015, 12:24 PM   #465
mzmm
Quote:
Originally Posted by 1v4n0
Thanks for the explanation. Your code doesn't work though.
works for me in Sigil

it matches

<p></p>
<p><br/></p>
<p><span><br/>&nbsp;</span></p>
<p><i><b><br />&nbsp;</b></i></p>
...

but not when the tags are unevenly distributed, as in
<p><i>&nbsp;</i><span></span></p>

anyway. grave-digging threads here...
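For anyone who wants to reproduce the list above outside Sigil, here is a small sketch that runs the corrected pattern against those same samples. It uses Python's `re` module, which is an assumption: results in Sigil's own engine could differ.

```python
import re

# The corrected pattern from earlier in the thread, split over two literals.
EMPTY_P = re.compile(
    r"(?s)<p[^>]*?>\s*?(?:<\w[^>/]*?>)*?\s*?(?:&nbsp;|<br(?:\s|\s/|/)?>)*?"
    r"\s*?(?:</\w[^>/]*?>)*?\s*?</p>"
)

should_match = [
    "<p></p>",
    "<p><br/></p>",
    "<p><span><br/>&nbsp;</span></p>",
    "<p><i><b><br />&nbsp;</b></i></p>",
]
should_not_match = ["<p><i>&nbsp;</i><span></span></p>"]

for s in should_match + should_not_match:
    print(repr(s), bool(EMPTY_P.fullmatch(s)))
```

The last sample fails because the pattern expects all opening tags first, then the filler, then all closing tags. An opening tag that comes after a closing tag has nowhere to fit.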