Old 08-30-2014, 02:30 PM   #406
ReaderRabbit
Member
ReaderRabbit began at the beginning.
Posts: 24
Karma: 10
Join Date: Mar 2011
Location: Colorado
Device: Cruz Tablet
Quote:
Originally Posted by Steadyhands View Post
You need to be in Regex mode in the search box for this to work.
I am in Mode: Regex and 'All HTML Files'
Old 08-31-2014, 12:56 AM   #407
eschwartz
Ex-Helpdesk Junkie
eschwartz ought to be getting tired of karma fortunes by now.
Posts: 19,422
Karma: 85397180
Join Date: Nov 2012
Location: The Beaten Path, USA, Roundworld, This Side of Infinity
Device: Kindle Touch fw5.3.7 (Wifi only)
Quote:
Originally Posted by Steadyhands View Post
Note you will have to change the xxxx to whatever your paragraph style is, and there is a space after the \1
No, change the xxxx\d+ . (This may be why ReaderRabbit was not successful, as it would certainly fail if the class does not have trailing numbers, as in the example given: "indent".)

It would be better to use a catchall like

Find:
Code:
((Mr|Mrs|Dr|other)\.)</p>\s*<p( [^>]*)?>
Replace:
Code:
\1
(space after the 1)
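
For anyone who wants to try this outside Sigil, here is a minimal sketch in Python. It assumes Python's re module behaves like Sigil's regex engine for this particular pattern, and the sample markup is made up for illustration; "other" is still just a stand-in for whatever further abbreviations you need.

```python
import re

# Hypothetical sample: a paragraph wrongly split after an abbreviation.
html = '<p class="indent">He spoke to Mr.</p>\n<p class="indent">Jones about it.</p>'

# The Find pattern from the post above.
find = re.compile(r'((Mr|Mrs|Dr|other)\.)</p>\s*<p( [^>]*)?>')

# Replace with group 1 followed by a single space.
joined = find.sub(r'\1 ', html)
print(joined)

# The optional group means a bare <p> is handled as well.
bare = find.sub(r'\1 ', 'Dr.</p><p>Who')
print(bare)
```

The class name never appears in the pattern, so it does not matter whether it has trailing numbers.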

Last edited by eschwartz; 08-31-2014 at 01:01 AM.
Old 08-31-2014, 06:48 AM   #408
DiapDealer
Grand Sorcerer
DiapDealer ought to be getting tired of karma fortunes by now.
Posts: 27,441
Karma: 192992430
Join Date: Jan 2010
Device: Nexus 7, Kindle Fire HD
Quote:
Originally Posted by eschwartz View Post
It would be better to use a catchall like

Find:
Code:
((Mr|Mrs|Dr|other)\.)</p>\s*<p( [^>]*)?>
Replace:
Code:
\1
(space after the 1)
Which would exclude paragraphs that had no attributes. Why not:
Code:
((Mr|Mrs|Dr|other)\.)</p>\s*<p[^>]*?>
Replace:
Code:
\1
(space after the 1)
instead?
Old 08-31-2014, 10:51 AM   #409
eschwartz
Ex-Helpdesk Junkie
eschwartz ought to be getting tired of karma fortunes by now.
Posts: 19,422
Karma: 85397180
Join Date: Nov 2012
Location: The Beaten Path, USA, Roundworld, This Side of Infinity
Device: Kindle Touch fw5.3.7 (Wifi only)
Quote:
Originally Posted by DiapDealer View Post
Which would exclude paragraphs that had no attributes. Why not:
Code:
((Mr|Mrs|Dr|other)\.)</p>\s*<p[^>]*?>
Replace:
Code:
\1
(space after the 1)
instead?
Reread my regex. I did think of that.

Nitpick: I used an optional capture group, which includes a space (because as a general idea, I like demanding spaces after the tag name and before the attribute, to avoid matching the wrong tags) -- but your regex does not need a ? because the star already covers that.
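
A quick way to compare the three forms discussed here, sketched in Python. This assumes Python's re module treats these constructs the same way Sigil does, and the sample tags are invented for the test.

```python
import re

optional_group = re.compile(r'<p( [^>]*)?>')  # optional group with a leading space
lazy_star = re.compile(r'<p[^>]*?>')          # lazy star
plain_star = re.compile(r'<p[^>]*>')          # same thing without the "?"

samples = ['<p>', '<p class="indent">', '<pre>']

# For each sample, record whether each pattern matches the whole tag.
results = {
    s: (bool(optional_group.fullmatch(s)),
        bool(lazy_star.fullmatch(s)),
        bool(plain_star.fullmatch(s)))
    for s in samples
}

for tag, matched in results.items():
    print(tag, matched)
```

All three accept a bare <p>, and the lazy and plain star agree on every sample because [^>] can never run past the closing bracket. Only the optional group with its required space rejects <pre>.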

Last edited by eschwartz; 08-31-2014 at 10:57 AM.
Old 08-31-2014, 11:23 AM   #410
ReaderRabbit
Member
ReaderRabbit began at the beginning.
Posts: 24
Karma: 10
Join Date: Mar 2011
Location: Colorado
Device: Cruz Tablet
YEA! That worked :o}

Quote:
Originally Posted by eschwartz View Post
Reread my regex. I did think of that.

Nitpick: I used an optional capture group, which includes a space (because as a general idea, I like demanding spaces after the tag name and before the attribute, to avoid matching the wrong tags) -- but your regex does not need a ? because the star already covers that.
Thank you all for looking into this error for me.
Old 08-31-2014, 11:25 AM   #411
DiapDealer
Grand Sorcerer
DiapDealer ought to be getting tired of karma fortunes by now.
Posts: 27,441
Karma: 192992430
Join Date: Jan 2010
Device: Nexus 7, Kindle Fire HD
Quote:
Originally Posted by eschwartz View Post
Reread my regex. I did think of that.
My bad. I didn't notice the space was inside the optional grouping. Part of the reason I don't like to use a lot of extraneous grouping in my regex.

Quote:
Originally Posted by eschwartz View Post
Nitpick: I used an optional capture group, which includes a space (because as a general idea, I like demanding spaces after the tag name and before the attribute, to avoid matching the wrong tags)
If you're worried about matching the wrong tags (like <image for <i), just make sure there's a word break after the element name.
Code:
<p\b[^>]*>
Or if you're in calibre's editor, even the end-of-word match:
Code:
<p\M[^>]*>
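
A small Python sketch of the word-break version, assuming \b means the same thing in Python's re module as it does in Sigil. The \M form is left out because Python's standard re module does not support it. The sample tags are made up.

```python
import re

# Require a word boundary right after the element name.
boundary = re.compile(r'<p\b[^>]*>')

samples = ['<p>', '<p class="indent">', '<pre>', '<param name="x">']

# Keep only the tags the pattern matches in full.
matches = [s for s in samples if boundary.fullmatch(s)]
print(matches)
```

Only the real <p> tags get through, with or without attributes, because there is no word break between the "p" and the letter that follows it in <pre> or <param>.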
Old 08-31-2014, 12:55 PM   #412
eschwartz
Ex-Helpdesk Junkie
eschwartz ought to be getting tired of karma fortunes by now.
Posts: 19,422
Karma: 85397180
Join Date: Nov 2012
Location: The Beaten Path, USA, Roundworld, This Side of Infinity
Device: Kindle Touch fw5.3.7 (Wifi only)
Quote:
Originally Posted by DiapDealer View Post
My bad. I didn't notice the space was inside the optional grouping. Part of the reason I don't like to use a lot of extraneous grouping in my regex.


If you're worried about matching the wrong tags (like <image for <i), just make sure there's a word break after the element name.
Code:
<p\b[^>]*>
Or if you're in calibre's editor, even the end-of-word match:
Code:
<p\M[^>]*>
That works too.

Po-tay-to po-tah-to.

Doesn't confuse me, I got used to doing it this way (seems in my subjective opinion to make more sense), etc. Never really fell in love with word boundaries.
Old 08-31-2014, 02:03 PM   #413
DiapDealer
Grand Sorcerer
DiapDealer ought to be getting tired of karma fortunes by now.
Posts: 27,441
Karma: 192992430
Join Date: Jan 2010
Device: Nexus 7, Kindle Fire HD
Quote:
Originally Posted by eschwartz View Post
Po-tay-to po-tah-to.

Doesn't confuse me, I got used to doing it this way (seems in my subjective opinion to make more sense), etc. Never really fell in love with word boundaries.
Oh, absolutely. Lots of ways to skin the same cat. I don't love any of it, to tell the truth. I just find word boundaries too useful NOT to use. *shrug*
Old 09-20-2014, 06:09 PM   #414
JimmyG
Zealot
JimmyG solves Fermat’s last theorem while doing the crossword.
Posts: 119
Karma: 28454
Join Date: Apr 2011
Location: Yuma, AZ
Device: Kindle Touch, Voyage
What am I doing wrong?

Okay this regex $(.+)^ finds whole lines with text in them.

But I don't want lines that start with <, so I tried this $([^<].+)^
but that includes any preceding blank line and the next line, whether it starts with < or not, and I don't know why.

I want whole lines (not empty) that don't start with a tag.

Last edited by JimmyG; 09-20-2014 at 06:13 PM. Reason: clarify
Old 09-20-2014, 08:42 PM   #415
mzmm
Groupie
mzmm has not lost his or her sense of wonder.
Posts: 171
Karma: 86271
Join Date: Feb 2012
Device: iPad, Kindle Touch, Sony PRS-T1
Quote:
Originally Posted by JimmyG View Post
Okay this regex $(.+)^ finds whole lines with text in them.

But I don't want lines that start with <, so I tried this $([^<].+)^
but that includes any preceding blank line and the next line, whether it starts with < or not, and I don't know why.

I want whole lines (not empty) that don't start with a tag.
it might help if you post the code you're trying to parse and the output you expect, because the regex you're using doesn't really fit with what you're describing.

if you're trying to find lines in an html document that don't begin with a < (which would be very rare), then you could use something like

Code:
^\s*[^<\s]+
but i don't think that's really what you're after...
Old 09-20-2014, 08:53 PM   #416
DiapDealer
Grand Sorcerer
DiapDealer ought to be getting tired of karma fortunes by now.
Posts: 27,441
Karma: 192992430
Join Date: Jan 2010
Device: Nexus 7, Kindle Fire HD
Quote:
Originally Posted by JimmyG View Post
Okay this regex $(.+)^ finds whole lines with text in them.

But I don't want lines that start with <, so I tried this $([^<].+)^
but that includes any preceding blank line and the next line, whether it starts with < or not, and I don't know why.

I want whole lines (not empty) that don't start with a tag.
You got your anchors flipped: ^ is the beginning of a string (or line) and $ is the end.

But even ^([^<].+)$ isn't going to be very useful in a Sigil formatted file. "Lines" get very hairy in a file. A paragraph is typically on one "line" (meaning no line-break characters) from <p> to </p>. Same with just about any block-level element. And many lines are likely to be indented, so they don't start with "<"; they start with a space. That's probably why they're getting included in your search.

It's including blank lines because blank lines DON'T start with a "<", they start with line-break character(s).

There really shouldn't be any (or very, very few anyway) "lines" that don't begin with a "<" (or an indent before a "<"). Some css styling in the header and the like maybe.

If it's these relatively rare instances you're looking for perhaps something like:
Code:
^\w.+$
might come close?
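
Here is a rough Python sketch of the blank-line behaviour described above. It assumes multi-line mode (re.MULTILINE) and assumes that Python's handling of ^, $ and negated classes is close enough to Sigil's for this purpose. The sample text is invented.

```python
import re

# Hypothetical fragment: an untagged line, a blank line, then a tagged line.
text = 'plain text\n\n<p>two</p>\n'

# [^<] also matches a line break, so on the blank line it consumes the
# newline and the following tagged line is pulled into the match.
naive = re.findall(r'^([^<].+)$', text, flags=re.MULTILINE)

# Requiring a word character at the start of the line avoids that.
word_start = re.findall(r'^\w.+$', text, flags=re.MULTILINE)

print(naive)
print(word_start)
```

The first search returns the untagged line plus a second match made of the blank line's line break followed by the tagged line. The second search returns only the untagged line, because a line break is not a word character.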
Old 09-21-2014, 12:12 PM   #417
JimmyG
Zealot
JimmyG solves Fermat’s last theorem while doing the crossword.
Posts: 119
Karma: 28454
Join Date: Apr 2011
Location: Yuma, AZ
Device: Kindle Touch, Voyage
Quote:
Originally Posted by DiapDealer View Post
You got your anchors flipped: ^ is the beginning of a string (or line) and $ is the end.

But even ^([^<].+)$ isn't going to be very useful in a Sigil formatted file. "Lines" get very hairy in a file. A paragraph is typically on one "line" (meaning no line-break characters) from <p> to </p>. Same with just about any block-level element. And many lines are likely to be indented, so they don't start with "<"; they start with a space. That's probably why they're getting included in your search.

It's including blank lines because blank lines DON'T start with a "<", they start with line-break character(s).

There really shouldn't be any (or very, very few anyway) "lines" that don't begin with a "<" (or an indent before a "<"). Some css styling in the header and the like maybe.

If it's these relatively rare instances you're looking for perhaps something like:
Code:
^\w.+$
might come close?
Yeah, I caught that $^. I was making my problem too complicated by trying to do it in Sigil. The lines without p tags are all contiguous, so I just copy them to EditPadPro (where ^([^<].*)$ does not include the blank line) and then copy them back to Sigil. Actually, doing it that way, ^(.+)$ should work, but the original worked so I just left it.

Last edited by JimmyG; 09-21-2014 at 12:14 PM.
Old 10-03-2014, 02:30 AM   #418
Doitsu
Grand Sorcerer
Doitsu ought to be getting tired of karma fortunes by now.
Posts: 5,582
Karma: 22735033
Join Date: Dec 2010
Device: Kindle PW2
off-topic: I saw this Regex T-Shirt yesterday in the subway and it took me at least 5 minutes to figure it out.

Old 10-03-2014, 04:12 AM   #419
canpolat
Connoisseur
canpolat for a long time would go to bed early.
Posts: 92
Karma: 17950
Join Date: Mar 2013
Device: Xodo
I think it should be:

Code:
(bb|[^2]b)
Currently it reads: two b or two not b.
Old 10-03-2014, 07:12 AM   #420
DiapDealer
Grand Sorcerer
DiapDealer ought to be getting tired of karma fortunes by now.
Posts: 27,441
Karma: 192992430
Join Date: Jan 2010
Device: Nexus 7, Kindle Fire HD
Be be or not be twice?
All times are GMT -4.