Old 02-06-2018, 06:32 PM   #16
DiapDealer
Grand Sorcerer
Posts: 27,549
Karma: 193191846
Join Date: Jan 2010
Device: Nexus 7, Kindle Fire HD
I'm fairly certain it's simply the Minimal Match option being checked.
Old 02-08-2018, 03:48 AM   #17
Wasserpulle
Junior Member
Posts: 6
Karma: 10
Join Date: Dec 2013
Location: Europe
Device: Mediapad M3
Quote:
Originally Posted by DiapDealer View Post
I'm fairly certain it's simply the Minimal Match option being checked.
It was...
I had checked that option unintentionally.

Hope this helps others in the future.
Old 02-08-2018, 09:10 AM   #18
DiapDealer
All's well that ends well.
Old 02-17-2018, 11:06 AM   #19
WS64
Posts: 660
Karma: 506380
Join Date: Aug 2010
Location: Germany
Device: Kobo Aura / PB Lux 2 / Bookeen Frontlight / Kobo Mini / Nook Color
Quote:
Originally Posted by KevinH View Post
Tried the following on a Mac and it worked exactly as expected:

Code:
The regular expression used:
an:\s[a-zA-Z]*
Code:
The file to search in:

<?xml version="1.0" encoding="utf-8"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.1//EN"
  "http://www.w3.org/TR/xhtml11/DTD/xhtml11.dtd">

<html xmlns="http://www.w3.org/1999/xhtml">
<head>
<title>huh</title>
</head>
<body>
<p>&nbsp;There is an: B C 1 2 3</p>
<p>&nbsp;There is an: 1 2 3</p>
</body>
</html>
It first highlighted the "an: B" and then next highlighted the "an: ", since we specified the *, which allows 0 occurrences of the [a-zA-Z] set.

So regular expressions seem to work just fine on a Mac.

Would someone with access to Windows please try this exact example and let me know if it works correctly or not.
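The two highlights described in the quoted example can be checked outside Sigil. The following is only a sketch using Python's re module as a stand-in (Sigil uses a different regex engine, so this shows the pattern's general behaviour, not Sigil's search code); the variable names are made up for illustration.

```python
import re

# The two paragraphs from the quoted test file (&nbsp; left as written there).
html = (
    "<p>&nbsp;There is an: B C 1 2 3</p>\n"
    "<p>&nbsp;There is an: 1 2 3</p>\n"
)

# "an:" followed by one whitespace character, then zero or more ASCII letters.
matches = re.findall(r"an:\s[a-zA-Z]*", html)

for m in matches:
    print(repr(m))
```

The second match ends right after the space, because the * is satisfied by zero letters once "an:" and the whitespace have anchored the match.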
Kevin, and what do you get in your example when you just search for [a-zA-Z]* ?
Old 02-17-2018, 11:27 AM   #20
KevinH
Sigil Developer
Posts: 7,644
Karma: 5433388
Join Date: Nov 2009
Device: many
Nothing. Which again makes sense to me, as that particular regular expression makes little sense on its own: the * means zero or more instances, so in the zero case it matches an empty string at every position, meaning the set a-zA-Z need never be used. To make a reasonable regular expression you need some pattern to anchor the search.
Old 02-17-2018, 12:20 PM   #21
WS64
Sorry, that is not correct. It should find empty strings plus all "words".
The * should always find more than the +.
Something is wrong here.
Old 02-17-2018, 12:32 PM   #22
DiapDealer
Quote:
Originally Posted by WS64 View Post
Sorry, that is not correct. It should find empty strings plus all "words".
The * should always find more than the +.
Something is wrong here.
Only if the character to the immediate right of the cursor when doing the search is a letter. I suspect that Kevin is beginning from the beginning of the html file (or at least the very beginning of a line of code in the file), which will nearly always be "<".

That likely explains why it's returning nothing for him.

You are correct, though, that * should always find more than +. And that's what I get when I use [a-zA-Z]* on text. I suspect Kevin would, too.

Where the OP was running into trouble was that he was reversing the default (un)greediness of * and + by checking the "Minimal Match" box.
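To make the greediness point concrete, here is a rough sketch in Python. Python's re module has no global "minimal match" switch, so the lazy quantifiers `*?` and `+?` are used here as a stand-in, on the assumption that Sigil's Minimal Match option has the same effect. The helper name `match_at_start` is made up for this illustration.

```python
import re

text = "There is an: B C 1 2 3"

def match_at_start(pattern, s):
    """Return the text matched at the very start of s, or None if nothing matches."""
    m = re.match(pattern, s)
    return m.group(0) if m else None

greedy_star = match_at_start(r"[a-zA-Z]*", text)   # as many letters as possible
lazy_star = match_at_start(r"[a-zA-Z]*?", text)    # as few as possible, i.e. zero
greedy_plus = match_at_start(r"[a-zA-Z]+", text)   # at least one, as many as possible
lazy_plus = match_at_start(r"[a-zA-Z]+?", text)    # at least one, then stop

print(repr(greedy_star), repr(lazy_star), repr(greedy_plus), repr(lazy_plus))
```

With the lazy star the match is always empty, which fits the observation that [a-zA-Z]* never finds anything while Minimal Match is checked.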
Old 02-17-2018, 12:55 PM   #23
WS64
I get "No matches found".
Just for the record, 169 matches (50 with minimal matches) when searching for [a-zA-Z]+, all on the example clip.
But nothing for the * search, and that is NOT correct.
Old 02-17-2018, 01:07 PM   #24
DiapDealer
Where is the cursor when you start the search?

If you start the search at the beginning of any of the following lines:
Code:
<?xml version="1.0" encoding="utf-8"?>

<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.1//EN"
  "http://www.w3.org/TR/xhtml11/DTD/xhtml11.dtd">

<html xmlns="http://www.w3.org/1999/xhtml">

45 here's some more letters.

"Heres a string that starts with a quote"
Then [a-zA-Z]* should--and does--result in "No matches found".

If you search at the start of any of the following lines:
Code:
Hello there.

asdkasd asdsakhk

lkj;lkj;lkj
then [a-zA-Z]* will correctly match up until the first non-letter.

Are you saying it doesn't for you?
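The cursor dependence described above can be imitated in Python by trying the pattern only at one offset. This is a sketch, not Sigil's code: it assumes the search looks at what the pattern matches at the cursor and treats an empty result as "No matches found". `match_at_cursor` is a hypothetical helper.

```python
import re

LETTERS = re.compile(r"[a-zA-Z]*")

def match_at_cursor(line, cursor):
    """Hypothetical helper: try the pattern only at the given cursor offset."""
    m = LETTERS.match(line, cursor)
    # [a-zA-Z]* can always match, possibly with zero length, so m is never None.
    return m.group(0)

# Lines from the first list: the character right of the cursor is not a letter.
at_tag = match_at_cursor('<html xmlns="http://www.w3.org/1999/xhtml">', 0)
at_digit = match_at_cursor("45 here's some more letters.", 0)

# Lines from the second list: the cursor sits directly in front of a letter.
at_word = match_at_cursor("Hello there.", 0)
at_mixed = match_at_cursor("lkj;lkj;lkj", 0)

print(repr(at_tag), repr(at_digit), repr(at_word), repr(at_mixed))
```

An empty string here corresponds to the zero-length match that ends up reported as no match.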
Old 02-17-2018, 01:21 PM   #25
KevinH
Directly from the docs (Wiki in this case):
Code:
?	The question mark indicates zero or one occurrences of the preceding element. For example, colou?r matches both "color" and "colour".
*	The asterisk indicates zero or more occurrences of the preceding element. For example, ab*c matches "ac", "abc", "abbc", "abbbc", and so on.
+	The plus sign indicates one or more occurrences of the preceding element. For example, ab+c matches "abc", "abbc", "abbbc", and so on, but not "ac".
So an "*" by definition will match 0 or more instances of the pattern preceding it. Matching 0 cases of the pattern [a-zA-Z] as well as one or more cases of the pattern makes no sense, as it matches everything.
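The three quantifier definitions quoted above can be checked with a short Python sketch. The patterns and sample strings come from the quoted text; `re.fullmatch` is used so that each string has to match in its entirety, and the helper name `full_matches` is made up for this example.

```python
import re

# Patterns and candidate strings taken from the quoted definitions.
cases = {
    r"colou?r": ["color", "colour"],
    r"ab*c": ["ac", "abc", "abbc", "abbbc"],
    r"ab+c": ["ac", "abc", "abbc", "abbbc"],
}

def full_matches(pattern, candidates):
    """Return the candidates that the pattern matches in their entirety."""
    return [s for s in candidates if re.fullmatch(pattern, s)]

results = {pattern: full_matches(pattern, strings) for pattern, strings in cases.items()}

for pattern, matched in results.items():
    print(pattern, "->", matched)
```

The only difference between the last two patterns is "ac", which needs zero occurrences of "b".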

To be clear from this example:

If I run "count all" using this regular expression [a-zA-Z]* on the following line:

Code:
<p> this is a line of text </p>
when the cursor is just before the first '<', I get no matches found. If I then advance the cursor to just before the "t" in "this" and then run "count all", I get 1 match found (the "this") but nothing afterwards.

If I instead change to something that is actually sensible to me:
[a-zA-Z]+

I find all of the (ascii) words in the file (with the cursor on the first line).

That type of re should only be used after a pattern, so that it will actually find specific things and not just everything.

And yes you can create re patterns that make no sense and that will work differently on different implementations of re.

If I wanted to get "words" I would instead use the following regular expression:

\w+

or

[a-zA-Z]+

which does exactly parse things into "words" no matter where the cursor starts.
Old 02-17-2018, 01:43 PM   #26
WS64
It does not matter where the cursor is, I always find nothing with [a-zA-Z]* and I always find something when searching for [a-zA-Z]+.
When I put the same example in my editor (Ultraedit) both * and + find something.
And also for the one liner, it does not matter where the cursor is.

I agree that I hardly ever search for *, but mostly for +, but still Sigil behaves strangely here.

And honestly, I don't understand where the cursor position comes into play here.
But even if that were important, it does not explain why a search for + at exactly the same cursor position does find something when * does not.

Oh, and I usually avoid \w, I never know if it will find German umlauts or not, so I prefer to always write the exact letters I mean!
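Whether \w covers umlauts really does depend on the engine and its settings. As an illustration only, this is how it plays out in Python 3, where \w is Unicode-aware by default for text patterns and the re.ASCII flag restricts it; other engines, Sigil's included, may default differently. The sample string is made up.

```python
import re

sample = "Größe über Straße"

# Default in Python 3 for str patterns: \w also matches letters such as ö, ü, ß.
unicode_words = re.findall(r"\w+", sample)

# With re.ASCII, \w is limited to [a-zA-Z0-9_], so the words get split up.
ascii_words = re.findall(r"\w+", sample, re.ASCII)

# Spelling out the exact letters behaves the same way under either setting.
explicit_words = re.findall(r"[a-zA-ZäöüÄÖÜß]+", sample)

print(unicode_words)
print(ascii_words)
print(explicit_words)
```

Writing out the letter set gives a result that does not depend on how the engine defines \w.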

@DialDiaper, "Are you saying it doesn't for you?". Yes, I am. I can't get Sigil to find anything when searching for [a-zA-Z]*

Last edited by WS64; 02-17-2018 at 01:46 PM.
Old 02-17-2018, 01:54 PM   #27
DiapDealer
Quote:
It does not matter where the cursor is, I always find nothing with [a-zA-Z]* and I always find something when searching for [a-zA-Z]+.
Of course it matters where the cursor is.

If you're saying that [a-zA-Z]* finds nothing when you put the cursor in front of a letter (or sequence of letters), then you're experiencing different behavior than we are. Either that or you have the Minimal Match box checked (and you shouldn't if you expect to ever find anything with [a-zA-Z]*).

Uncheck Minimal Match and [a-zA-Z]* will find words when the cursor is placed immediately in front of a sequence of letters.

I'd also appreciate a better attempt at my username. Your transposition of letters hardly seems accidental (or respectful).

Last edited by DiapDealer; 02-17-2018 at 01:58 PM.
Old 02-17-2018, 01:56 PM   #28
WS64
I tried it with and without "Minimal Match", does not change anything.
And since Doitsu mentioned the same in the 4th post of this thread I am not completely alone I guess...
Old 02-17-2018, 02:02 PM   #29
DiapDealer
Quote:
Originally Posted by WS64 View Post
I tried it with and without "Minimal Match", does not change anything.
Yes. It does. If the cursor is to the left of a letter, [a-zA-Z]* will match something when Minimal Match is unchecked. If Minimal Match is checked, [a-zA-Z]* will never match anything. Ever.


Quote:
Originally Posted by WS64 View Post
And since Doitsu mentioned the same in the 4th post of this thread I am not completely alone I guess...
Because his cursor was likely at the beginning of a line of code that started with '<'. I'm sure he'll be able to verify that [a-zA-Z]* does indeed return matches for him when the cursor is placed in front of a letter before clicking "Find" (and Minimal Match is unchecked).

Last edited by DiapDealer; 02-17-2018 at 02:12 PM.
Old 02-17-2018, 02:06 PM   #30
DiapDealer
@KevinH:

Some regex search engines actually return 0 length matches and allow you to step through them using "find next".

In cases like that using your string:
Code:
<p> this is a line of text </p>
The first "match" would be a zero-length match (just before the '<')
Hitting Find again would match 'p'.
Hitting Find again would be a zero-length match (just before the '>')
Hitting Find again would be a zero-length match (just before the space)
Hitting Find one more time would match 'this'
etc...

Sigil's regex search feature has never "advanced" beyond the first zero-length match to my knowledge. And I don't see any compelling reasons to make it do so. Having to hit "Find" two, three (or more) times before you find a "real" match doesn't seem all that useful or intuitive to me.
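For comparison, Python's re.finditer is one of the engines that does step through zero-length matches. The sketch below lists the first few "Find" steps for the example line; it shows how such an engine behaves and says nothing about Sigil's internals.

```python
import re

line = "<p> this is a line of text </p>"

# Collect (start, end, text) for each successive match, zero-length ones included.
steps = [(m.start(), m.end(), m.group(0)) for m in re.finditer(r"[a-zA-Z]*", line)]

for start, end, text in steps[:5]:
    kind = "zero-length" if start == end else "match"
    print(f"{kind:11} at {start}: {text!r}")

# Dropping the empty matches leaves the same words that [a-zA-Z]+ would find.
words = [text for _, _, text in steps if text]
print(words)
```

An engine that reports every step has to get past three empty matches before reaching "this", which is the usability cost mentioned above.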

Last edited by DiapDealer; 02-17-2018 at 02:10 PM.