Go Back   MobileRead Forums > E-Book Software > Sigil

Old 03-19-2025, 07:10 PM   #781
Pavulon
Member
Posts: 14
Karma: 10
Join Date: Aug 2023
Device: Kobo Forma
Good evening, Haudek,

Got it.
Problem solved.
Thanks for everything.
Old 03-27-2025, 08:58 PM   #782
ElMiko
Fanatic
Posts: 502
Karma: 65460
Join Date: Jun 2011
Device: Kindle
Is there a way to match numbers that are between repeated characters?

For example to match the numbers in:

Code:
<p class="lorem">.........6.........</p>

<p class="lorem">____8___</p>

<p class="lorem">----------3-------</p>
Something like:

Code:
<p.*?>[._-]*?\K[0-9](?=[._-]*?</p>)
Except that would match strings like:

Code:
<p class="lorem">_._--__8..----</p>
and would fail to match strings like:

Code:
<p class="lorem">****8***</p>
when what I'm trying to do is match numbers nested within ANY repeated string of characters.
Old 03-27-2025, 09:30 PM   #783
Karellen
Wizard
Posts: 1,684
Karma: 9500498
Join Date: Sep 2021
Location: Australia
Device: Kobo Libra 2
Quote:
Originally Posted by ElMiko View Post
<p class="lorem">.........6.........</p>
I assume you want to capture the number to reuse.

PHP Code:
\.+(\d+)\.+ 
Then change the \. to match whatever other character you want to find.
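
As a rough illustration of "change the \. to whatever character you want", here is a minimal Python sketch that builds the same pattern for a caller-supplied character. It uses Python's re module rather than Sigil's own engine, and the helper name number_between is an assumption made up for this example.

```python
import re

def number_between(char, text):
    """Return the digits flanked on both sides by runs of `char`, or None."""
    sep = re.escape(char)  # treat '.', '*', '-' and friends as literal characters
    match = re.search(rf"{sep}+(\d+){sep}+", text)
    return match.group(1) if match else None

print(number_between(".", '<p class="lorem">.........6.........</p>'))  # 6
print(number_between("_", '<p class="lorem">____8___</p>'))             # 8
print(number_between("-", '<p class="lorem">----------3-------</p>'))   # 3
```

This still requires knowing the repeated character up front, so it only covers the case where the character is known in advance.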
Old 03-27-2025, 09:42 PM   #784
ElMiko
Fanatic
Posts: 502
Karma: 65460
Join Date: Jun 2011
Device: Kindle
Quote:
Originally Posted by Karellen View Post
I assume you want to capture the number to reuse.

PHP Code:
\.+(\d+)\.+ 
Then change the \. to match whatever other character you want to find.
One of the things I'm trying to solve for is when I don't know what the repeating character is, only that it is repeating. It could be a period or a hyphen or the letter "a"... literally anything (except a number).

Last edited by ElMiko; 03-27-2025 at 09:45 PM.
Old 03-27-2025, 09:49 PM   #785
Karellen
Wizard
Posts: 1,684
Karma: 9500498
Join Date: Sep 2021
Location: Australia
Device: Kobo Libra 2
Quote:
Originally Posted by ElMiko View Post
when what I'm trying to do is match numbers nested within ANY repeated string of characters.
Oops. Missed this last line. Thought it was part of your signature.

Try...

PHP Code:
>\p{P}+(\d+)\p{P}+< 
Old 03-28-2025, 12:30 AM   #786
ElMiko
Fanatic
Posts: 502
Karma: 65460
Join Date: Jun 2011
Device: Kindle
Quote:
Originally Posted by Karellen View Post
Oops. Missed this last line. Thought it was part of your signature.

Try...

PHP Code:
>\p{P}+(\d+)\p{P}+< 
So, this is similar to the second problem I ran into with my sample regex. Namely, it'll match ANY string (repeated or not) with a nested number, e.g.:

Code:
<p class="lorem">_._--__8..----</p>
I'm trying to only find numbers nested between a repeated character, not numbers nested between any characters.
Old 03-28-2025, 03:34 AM   #787
Haudek
Member
Posts: 24
Karma: 111614
Join Date: Mar 2025
Location: Poland
Device: Kindle Voyage
Quote:
Originally Posted by ElMiko View Post
I'm trying to only find numbers nested between a repeated character, not numbers nested between any characters.
I think I understand what you need.

Code:
<p[^>]*>.*?(.).*?\d+.*?\1.*?</p>
IMHO, the key is to use \1 in the search to indicate that you need a part that has already appeared before the number.
Old 03-28-2025, 09:31 AM   #788
ElMiko
Fanatic
Posts: 502
Karma: 65460
Join Date: Jun 2011
Device: Kindle
I almost considered apologizing in advance for the convoluted question, but better late than never: sorry, everybody! I realize this is a little tricky to understand.

Let's say I have the following <p> elements:

Code:
1. <p class="lorem">....1...</p>

2. <p class="lorem">----2---</p>

3. <p class="lorem">___3_____</p>

4. <p class="lorem">_._--__4..----</p>

5. <p class="lorem">aaa5aaaa</p>
I am looking for regex that will match examples 1, 2, 3, and 5, BUT NOT 4.

That is to say, I'm looking to match a <p> element where the number is nested within a string of any single repeating character, and then isolate the number for reuse in a replacement function.

@Haudek - I think we're getting close, although I don't think I see any backreference in the first half of the regex...

EDIT1 — SOLVED:

Thanks, Haudek. Your regex got the ball rolling. I'd forgotten that backreferencing works within the search field (not merely in the replace field).

Code:
<p[^>]*>(.)\1*?([0-9]+)\1*?</p>
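
For anyone who wants to sanity-check this outside Sigil, here is a minimal Python sketch that runs the regex above against the five numbered examples. It uses Python's re module instead of Sigil's engine, and the names PATTERN, STRICT, samples and nested_number are illustrative assumptions. The STRICT variant swaps (.) for (\D), an optional tweak not taken from the thread, so that the "repeated character" can never itself be a digit.

```python
import re

# The solved pattern: group 1 is the repeated character, group 2 the number.
PATTERN = re.compile(r"<p[^>]*>(.)\1*?([0-9]+)\1*?</p>")
# Assumed stricter variant: the repeated character must not be a digit.
STRICT = re.compile(r"<p[^>]*>(\D)\1*?([0-9]+)\1*?</p>")

samples = [
    '<p class="lorem">....1...</p>',
    '<p class="lorem">----2---</p>',
    '<p class="lorem">___3_____</p>',
    '<p class="lorem">_._--__4..----</p>',
    '<p class="lorem">aaa5aaaa</p>',
]

def nested_number(line, pattern=PATTERN):
    """Return the captured number, or None when the line does not match."""
    match = pattern.search(line)
    return match.group(2) if match else None

print([nested_number(s) for s in samples])   # ['1', '2', '3', None, '5']
print(nested_number("<p>123</p>"))           # '23': the '1' is taken as the repeated character
print(nested_number("<p>123</p>", STRICT))   # None
```

In a replacement, the number would then be available as \2 (group 1 holds the repeated character).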

Last edited by ElMiko; 03-28-2025 at 09:45 AM.
Old 07-26-2025, 08:34 AM   #789
ElMiko
Fanatic
Posts: 502
Karma: 65460
Join Date: Jun 2011
Device: Kindle
I'm trying to insert a character into a string. Specifically, I'm trying to insert omitted apostrophes into contractions. I fear this is impossible with regex since I'm not actually matching anything—I'm matching the non-space between two characters.

The search I've been using is:

Code:
(?<=\b[Cc]an|\b[DdWw]on|\b[CcWw]ouldn|\b[Ss]houldn|\b[Dd]idn|\b[Ii]sn|\b[Aa]ren|\b[WwHh]asn)t\b
which matches the ending "t" that can then be replaced by "’t". But I'm hoping there's a way to write the search such that I'm functionally inserting the apostrophe rather than replacing the "t".
Old 07-26-2025, 04:47 PM   #790
KevinH
Sigil Developer
Posts: 9,070
Karma: 6361556
Join Date: Nov 2009
Device: many
Why is a replacement for t being 't an issue? It is just a text substring replacement. Replacements can be anything.

Also why not let a spellchecker catch those cases?

Last edited by KevinH; 07-26-2025 at 05:02 PM.
Old 07-26-2025, 05:28 PM   #791
ElMiko
Fanatic
Posts: 502
Karma: 65460
Join Date: Jun 2011
Device: Kindle
Because, KevinH, "wont" and "cant" are actually words, so relying on spellcheck will fail to catch instances where they are genuinely errors. It also serves as a mental flag that alerts me: if there are these kinds of apostrophe errors, there may be others (and therefore I should run some of my other "missing apostrophe" searches). And as to why I want it to be an insertion rather than an insertion AND a replacement: there are other types of missing apostrophe errors that don't end in "t", and I'd like to combine them into a single search with a universal replacement value. Namely, a single apostrophe.

But here's the thing, it might simply be more helpful to just consider my question conceptually, rather than practically. Regardless of whether you understand or agree with my reasoning for wanting to create this kind of search, the question is fundamentally about what is possible within regex. Specifically matching (and replacing) liminal spaces between characters.

I actually have an idea that I've used in other contexts, but I'm away from the computer. It involves using (|\s).
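
On the conceptual question, one way to picture matching the "liminal space" is a zero-width pattern: a lookbehind for the stem plus a lookahead for the "t", so the match itself is empty and the replacement is just the apostrophe. The sketch below is in Python's re module, not Sigil, and the names STEMS, GAP and insert_apostrophes are assumptions for illustration. Python needs each lookbehind to be fixed-width, hence one lookbehind per stem. In a PCRE-style engine the original single lookbehind followed by (?=t\b) would presumably do the same job, though how Sigil's Find & Replace steps through empty matches is not confirmed here.

```python
import re

# Stems taken from the search above; each becomes its own fixed-width lookbehind.
STEMS = [r"[Cc]an", r"[DdWw]on", r"[CcWw]ouldn", r"[Ss]houldn",
         r"[Dd]idn", r"[Ii]sn", r"[Aa]ren", r"[WwHh]asn"]

lookbehinds = "|".join(rf"(?<=\b{stem})" for stem in STEMS)
# Zero-width match: the position after a stem and before a word-final "t".
GAP = re.compile(rf"(?:{lookbehinds})(?=t\b)")

def insert_apostrophes(text):
    """Insert a right single quote at every matched gap; nothing is consumed."""
    return GAP.sub("\u2019", text)

print(insert_apostrophes("I cant and wont, but the cantor didnt."))
# I can’t and won’t, but the cantor didn’t.
```

Because nothing is consumed, the replacement value stays a bare apostrophe no matter which letter follows, which is what a combined "missing apostrophe" search would need.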

Last edited by ElMiko; 07-26-2025 at 06:08 PM.
Old 07-26-2025, 05:50 PM   #792
KevinH
Sigil Developer
Posts: 9,070
Karma: 6361556
Join Date: Nov 2009
Device: many
Then use Filter Replacements with Replace All, and only check the boxes where the replacement works. Then repeat for a different replacement.

Or

Use a regex to capture all the cases you want, and use a Python function replace to determine when, where, and what to insert. Then use Filter Replacements on it.
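
A minimal sketch of the "let a function decide" idea, written as plain Python with re.sub and a callable. This is not Sigil's function-replace interface, whose exact signature is not given in this thread. The lookup table INSERT_AT and the function name insert_apostrophe are hypothetical and would need to be filled in by hand.

```python
import re

# Hypothetical lookup: apostrophe-less form -> index where the apostrophe goes.
INSERT_AT = {
    "cant": 3, "wont": 3, "dont": 3, "didnt": 4, "isnt": 3,
    "youll": 3, "theyre": 4, "weve": 2,
}

WORD = re.compile(r"\b[A-Za-z]+\b")

def insert_apostrophe(match):
    """Return the word with an apostrophe inserted, or unchanged if unknown."""
    word = match.group(0)
    pos = INSERT_AT.get(word.lower())
    if pos is None:
        return word
    return word[:pos] + "\u2019" + word[pos:]

print(WORD.sub(insert_apostrophe, "They dont think youll mind, and theyre sure weve time."))
# They don’t think you’ll mind, and they’re sure we’ve time.
```

Words such as "wont" and "cant" would still need a human check in context, since they can be legitimate words.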
Old 07-26-2025, 06:25 PM   #793
Doitsu
Grand Sorcerer
Posts: 5,763
Karma: 24088559
Join Date: Dec 2010
Device: Kindle PW2
A quick and dirty solution would be:

Find:\b(I|[yY]ou|[hH]e|[sS]he|[iI]t|[wW]e|[tT]hey|[tT]hat|[tT]here|[hH]ere|[wW]hat|[wW]ho|[wW]here|[sS]hould|[cC]ould|[wW]ould|[mM]ust|[mM]ight|[cC]an|[dD]o|[dD]id|[dD]oes|[hH]ad|[hH]as|[hH]ave|[iI]s|[nN]eed|[oO]ught|[wW]as|[wW]ere)(ll|re|ve|nt|m|d|s)\b

Replace:\1'\2

It's not a perfect solution though, because it'll replace hell with he'll but also here with he're.
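
To see the false positives concretely, here is a small Python sketch that runs the Find/Replace pair above on single words. It uses Python's re module rather than Sigil, and FIND, REPLACE and apply_fix are names assumed for the example. Besides the "here" case, it suggests the "nt" suffix puts the apostrophe before the "n", so results like this are worth reviewing one by one.

```python
import re

# The Find expression from above, split across lines only for readability.
FIND = re.compile(
    r"\b(I|[yY]ou|[hH]e|[sS]he|[iI]t|[wW]e|[tT]hey|[tT]hat|[tT]here|[hH]ere|"
    r"[wW]hat|[wW]ho|[wW]here|[sS]hould|[cC]ould|[wW]ould|[mM]ust|[mM]ight|"
    r"[cC]an|[dD]o|[dD]id|[dD]oes|[hH]ad|[hH]as|[hH]ave|[iI]s|[nN]eed|"
    r"[oO]ught|[wW]as|[wW]ere)(ll|re|ve|nt|m|d|s)\b"
)
REPLACE = r"\1'\2"

def apply_fix(word):
    """Apply the Find/Replace pair to a piece of text."""
    return FIND.sub(REPLACE, word)

for word in ["youll", "hell", "here", "dont", "help"]:
    print(word, "->", apply_fix(word))
# youll -> you'll
# hell -> he'll
# here -> he're
# dont -> do'nt
# help -> help
```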

Last edited by Doitsu; 07-26-2025 at 06:32 PM.
Old 07-26-2025, 06:42 PM   #794
KevinH
Sigil Developer
Posts: 9,070
Karma: 6361556
Join Date: Nov 2009
Device: many
Or use Doitsu's find and then filter the replacements to decide from context which ones to apply and which to skip.

There often is no perfect search and replace but being shown the possible replacement in a table with user controlled context is a good way to make sure no mistakes are made.

I have almost given up on replacing one at a time, or using normal replace all, and instead use filter replacements almost exclusively now.
Old 07-26-2025, 07:00 PM   #795
ElMiko
Fanatic
Posts: 502
Karma: 65460
Join Date: Jun 2011
Device: Kindle
@Doitsu - to be clear, this isn't a Replace All search. Unlike KevinH, I still prefer cycling through search results individually. I guess I've just learned how to recognize typos in the context of a larger selection of text more efficiently than in the more isolated filter version.
Also because selecting and deselecting matches feels less efficient than cycling through matches O.G. style.