![]() |
#766 | |
Groupie
Posts: 179
Karma: 91148
Join Date: Jun 2010
Device: Sony 350
|
Quote:
|
#767 |
Member
Posts: 14
Karma: 10
Join Date: Aug 2023
Device: Kobo Forma
|
Use of the quantifier "?"
In automatic translation:
Good morning, I have two links built identically except for the presence of a class:
<a href="anchor5" id="note5">5</a>
<a class="backlink" href="anchor6" id="noted">6</a>
In Regex 101, I manage to select both links by using the quantifier "?" on the class group:
<a (class="backlink")? href="(.*?)" id="(.*?)">([0-9] {1,4})</a>
In Sigil, it does not work: I only get the link with the class. I also tried the quantifier "*", with no better result. Does anyone know how to do this? Thanks. |
#768 |
Member
Posts: 24
Karma: 111614
Join Date: Mar 2025
Location: Poland
Device: Kindle Voyage
|
Use this:
Code:
<a( class="backlink")* href="(.+?)" id="(.+?)">(\d+)</a>
1. * means zero or more occurrences, so the group may or may not occur.
2. You forgot a space before "class", since you already have one before "href".
3. \d+ is the easiest way to match numbers. |
#769 | |
Grand Sorcerer
Posts: 28,352
Karma: 203720150
Join Date: Jan 2010
Device: Nexus 7, Kindle Fire HD
|
Quote:
I would think you'd need to make that first space optional as well.
<a( class="backlink")? href="(.*?)" id="(.*?)">([0-9] {1,4})</a>
I can't explain why your original would work on Regex 101 though.
There also appears to be a space between your [0-9] character class and the {1,4}, but that could be an artifact of not pasting your regex between code tags. I can confirm that your regex works in Sigil if the first space is made optional (or the second space really--it doesn't matter):
Code:
<a( class="backlink")? href="(.*?)" id="(.*?)">([0-9]{1,4})</a>
Last edited by DiapDealer; 03-17-2025 at 10:31 AM. |
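A minimal sketch of the difference, written in Python on the assumption that its `re` module treats these simple patterns the same way Sigil's regex engine does. The sample links and the names `strict` and `relaxed` are illustrative only.

```python
import re

links = [
    '<a href="anchor5" id="note5">5</a>',
    '<a class="backlink" href="anchor6" id="note6">6</a>',
]

# Space outside the optional group: a link without the class
# would need two consecutive spaces between "<a" and "href".
strict = re.compile(
    r'<a (class="backlink")? href="(.*?)" id="(.*?)">([0-9]{1,4})</a>'
)

# Space inside the optional group: it disappears together with the class.
relaxed = re.compile(
    r'<a( class="backlink")? href="(.*?)" id="(.*?)">([0-9]{1,4})</a>'
)

strict_hits = [bool(strict.fullmatch(link)) for link in links]
relaxed_hits = [bool(relaxed.fullmatch(link)) for link in links]

print(strict_hits)   # [False, True]
print(relaxed_hits)  # [True, True]
```

The quantifier is doing its job in both patterns; only the position of the literal space decides whether the class-less link can match.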
|
#770 |
Guru
Posts: 808
Karma: 2416112
Join Date: Jan 2017
Location: Poland
Device: Various
|
You are right!
The excerpt with * is better in this case:
Code:
href="(.*?)" id="(.*?)"
Code:
href="" id="" |
#771 |
Member
Posts: 14
Karma: 10
Join Date: Aug 2023
Device: Kobo Forma
|
Good morning,
Yes, the spaces come from the repeated copy-paste between the word processor, the translator, and the lack of a code block.
In fact I was wrong: I thought it worked in Regex 101 because it automatically copies into the substitution area, and I had not looked carefully. It actually behaves as in Sigil: it only matches when the string is present, whereas I am asking for with or without.
As far as possible, I would have liked to do the following with a single regex (of course, I could handle it in two passes):
- find all the links, whether or not this first string (class or other) is present;
- be able to modify, for example, group 5 of all the links;
- not carry over the string \1.
The quantifier ? (0 or 1) only finds 1.
The quantifier * (0 or more) does not match.
The quantifier {0,1} only finds 1.
Thanks
Last edited by DiapDealer; 03-18-2025 at 10:20 AM. Reason: Thumbnailed oversized images |
#772 |
Member
Posts: 14
Karma: 10
Join Date: Aug 2023
Device: Kobo Forma
|
Good morning,
In case it helps explain what I'm looking for, I have another example of a string that is only sometimes present. Here the non-breaking space may or may not be present (zero to several times). In this case I therefore use the quantifier *, and here, where the string is not at the beginning, both Sigil and Regex 101 find and handle all 4 links correctly.
Thanks
Last edited by DiapDealer; 03-18-2025 at 10:23 AM. Reason: Thumbnailed oversized images |
#773 |
Member
Posts: 24
Karma: 111614
Join Date: Mar 2025
Location: Poland
Device: Kindle Voyage
|
You can post screenshots as attachments: click "Go Advanced", then the paperclip icon to add an attachment.
Code:
<p><a href="anchor450" id="for450">450</a>Texte</p>
<p><a href="anchor451" id="for451">451</a>&#160;Texte</p>
<p><a href="anchor452" id="for452">452</a>&#160;&#160;Texte</p>
<p><a href="anchor453" id="for453">453</a>&#160;&#160;&#160;Texte</p>
Code:
<p><a href="(.+?)" id="(.+?)">([0-9]{1,4})</a>[(&#160;)+]*(.+?)</p> |
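A side note on the bracket form, sketched in Python and assuming the non-breaking spaces are written as `&#160;` entities in the source. Square brackets build a character class, so `[(&#160;)+]*` consumes any run of the individual characters `(`, `&`, `#`, `1`, `6`, `0`, `;`, `)` and `+`, not only whole entities. A non-capturing group `(?:&#160;)*` repeats the entity as a unit. The variable names are illustrative only.

```python
import re

NBSP = "&#160;"  # the entity as it would appear in the XHTML source

# Character class: matches any ONE of the listed characters, repeated.
char_class = re.compile(r'[(&#160;)+]*')
# Non-capturing group: matches the whole entity, repeated.
grouped = re.compile(r'(?:&#160;)*')

sample = NBSP * 2 + "Texte"
class_on_entities = char_class.match(sample).group(0)
group_on_entities = grouped.match(sample).group(0)

# Text that begins with digits from the class, not with an entity.
class_on_digits = char_class.match("100 Texte").group(0)
group_on_digits = grouped.match("100 Texte").group(0)

print(repr(class_on_entities), repr(group_on_entities))
print(repr(class_on_digits), repr(group_on_digits))
```

On the four sample paragraphs both forms behave the same; they only diverge when the note text itself starts with one of the characters in the class.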
#774 |
Member
Posts: 14
Karma: 10
Join Date: Aug 2023
Device: Kobo Forma
|
Hello Haudek,
Thank you for your answer. I am looking for a solution for the post above. This one was just an illustration, to explain myself clearly and to show that I managed to find an inconsistent string in this example. And my regex:
Code:
<p><a href="(.+?)" id="(.+?)">([0-9]{1,4})</a>(.+?)*(.+?)</p>
Code:
<p><a href="(.+?)" id="(.+?)">([0-9]{1,4})</a>[(&#160;)+]*(.+?)</p>
My problem is in the preceding post, where I cannot manage this famous inconsistent string (class="backlink").
Last edited by Pavulon; 03-18-2025 at 09:22 AM. |
#775 |
Member
Posts: 24
Karma: 111614
Join Date: Mar 2025
Location: Poland
Device: Kindle Voyage
|
Why?
What is its inconsistency?
Code:
<p><a(?: class="backlink")? href="(.+?)" id="(.+?)">([0-9]{1,4})</a>(?:\s)*(.+?)</p>
Replace:
Code:
<p><a href="\1" id="\2">\3</a>\4</p>
Code:
def replace(match, number, file_name, metadata, data):
    if match:
        return "<p><a href=\"" + match.group(1) + "\" id=\"" + match.group(2) + "\">" + match.group(3) + "</a>" + match.group(4) + "</p>"
|
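To try the same replacement outside the editor, here is a minimal sketch using Python's `re.sub` with a callback. It assumes plain `re`, which passes only the match object to the callback, so the extra arguments of the five-argument signature above are dropped. The names `pattern`, `rebuild_link` and `source` are illustrative, not part of any editor API.

```python
import re

pattern = re.compile(
    r'<p><a(?: class="backlink")? href="(.+?)" id="(.+?)">'
    r'([0-9]{1,4})</a>(?:\s)*(.+?)</p>'
)


def rebuild_link(match):
    # Rebuild the paragraph without the class attribute,
    # equivalent to the \1 .. \4 replacement template.
    return ('<p><a href="' + match.group(1) + '" id="' + match.group(2)
            + '">' + match.group(3) + '</a>' + match.group(4) + '</p>')


source = (
    '<p><a href="anchor450" id="for450">450</a>Text1</p>\n'
    '<p><a class="backlink" href="anchor452" id="for452">452</a> Text3</p>'
)

result = pattern.sub(rebuild_link, source)
print(result)
```

Both paragraphs are matched. The first comes back unchanged, and the second loses its class attribute and the whitespace after the link.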
#776 |
Member
Posts: 14
Karma: 10
Join Date: Aug 2023
Device: Kobo Forma
|
Hello Haudek,
Thank you for taking the time to help me. We are, through my fault, running into comprehension problems. I never meant inconsistency, but inconstancy (translation?). I'd like to make it clear that there's no problem with the spacing: it was only because I'd mismanaged (the senility of my 73 years, no doubt...) the copy-paste between the different editors.
Let me start from the beginning. I was showing an example that works, where the links have an inconstant character string (in this case the non-breaking space). "Inconstant" in French means that sometimes it is there (one to several times) and sometimes it isn't, as in:
Code:
<p><a href="anchor450" id="for450">450</a>Text1</p>
<p><a href="anchor451" id="for451">451</a>*Text2</p>
<p><a href="anchor452" id="for452">452</a>**Text3</p>
<p><a href="anchor453" id="for453">453</a>***Text4</p>
Code:
Search: <p><a href="(.*?)" id="(.*?)">([0-9]{1,4})</a>(*)*(.*?)</p>
Replace: <p><a href="\1" id="\2">\3</a>\5 something else</p>
Code:
Result:
<p><a href="anchor450" id="for450">450</a>Text1 something else</p>
<p><a href="anchor451" id="for451">451</a>Text2 something else</p>
<p><a href="anchor452" id="for452">452</a>Text3 something else</p>
<p><a href="anchor453" id="for453">453</a>Text4 something else</p>
Code:
<p><a href="(.+?)" id="(.+?)">([0-9]{1,4})</a>[(*)+]*(.+?)</p>
- we could use a [\d+] notation
- we could use a non-capturing group (?: ...) and shift the \x
So here's the problem. Let's say I have a group of links with a string that may or may not be present, i.e. (0 or 1). Of course, I could deal with this in several passes; I could always manage, that's not the problem. But intellectually, it's still bothering me (I don't know how that's going to be translated... ,-) ). Here it's a class, but this string could be something else (epub:type="noteref"), etc.
Code:
<p><a href="anchor450" id="for450">450</a>Text1</p>
<p><a href="anchor451" id="for451">451</a>Text2</p>
<p><a class="backlink" href="anchor452" id="for452">452</a>Text3</p>
<p><a class="backlink" href="anchor450" id="for450">450</a>Text4</p>
Code:
Search: <p><a (.+?)? href="(.+?)" id="(.+?)">([0-9]{1,4})</a>(.+?)</p>
Replace: <p><a href="\2" id="\3">\4</a>\5 something else</p>
Code:
Result:
<p><a href="anchor450" id="for450">450</a>Text1</p>
<p><a href="anchor451" id="for451">451</a>Text2</p>
<p><a href="anchor452" id="for452">452</a>Text3 something else</p>
<p><a href="anchor450" id="for450">450</a>Text4 something else</p>
Code:
Search: <p><a (.+?){0,1} href="(.+?)" id="(.+?)">([0-9]{1,4})</a>(.+?)</p>
Replace: <p><a href="\2" id="\3">\4</a>\5 something else</p>
Result:
<p><a href="anchor450" id="for450">450</a>Text1</p>
<p><a href="anchor451" id="for451">451</a>Text2</p>
<p><a href="anchor452" id="for452">452</a>Text3 something else</p>
<p><a href="anchor450" id="for450">450</a>Text4 something else</p>
If I try with the quantifier "*" (0 or more), then 1 is possible.
Code:
Search: <p><a (.+?)* href="(.+?)" id="(.+?)">([0-9]{1,4})</a>(.+?)</p>
After your posts, I also tested a non-capturing group, with the same result. Only links containing the string are matched.
Code:
Search: <p><a (?:.+?)? href="(.+?)" id="(.+?)">([0-9]{1,4})</a>(.+?)</p>
Replace: <p><a href="\1" id="\2">\3</a>\4 something else</p>
Result:
<p><a href="anchor450" id="for450">450</a>Text1</p>
<p><a href="anchor451" id="for451">451</a>Text2</p>
<p><a href="anchor452" id="for452">452</a>Text3 something else</p>
<p><a href="anchor450" id="for450">450</a>Text4 something else</p>
As I see "significant" differences between the different translators (Deepl, Google translate, QTranslate, etc.), and I can't judge whether they're "substantial", as I don't speak English at all, I'm attaching my French text, which you can run through several translators if you need to, to clear up any ambiguities. And I'm pasting here the one given by Deepl. It's not surprising that there are sometimes misunderstandings in exchanges.
Thanks again for your attention.
Translated with www.DeepL.com/Translator (free version)
Last edited by Pavulon; 03-19-2025 at 09:49 AM. |
#777 |
Member
Posts: 14
Karma: 10
Join Date: Aug 2023
Device: Kobo Forma
|
I don't know how to do it, it replaces the HTML code for non-breaking spaces with asterisks.
And how did you manage to put the images like that? Did you turn them into attachments? Last edited by Pavulon; 03-19-2025 at 09:41 AM. |
#778 | ||
Member
Posts: 24
Karma: 111614
Join Date: Mar 2025
Location: Poland
Device: Kindle Voyage
|
Newest regex:
Code:
<p><a(?:\s+\w+="[^"]*")*\s+href="(.+?)"\s+id="(.+?)">([0-9]{1,4})</a>(?:\s)*(.+?)</p>
Quote:
Quote:
1. From the toolbar, select the paperclip.
2. Add an attachment.
3. After adding an image attachment, close the "Manage Attachments" window and select the arrow next to the paperclip. Now all you have to do is select the appropriate image and [ATTACH]NUMBER[/ATTACH] will be inserted in the post code. Done.
Last edited by Haudek; 03-20-2025 at 04:36 PM. Reason: A simple way to add attachments. |
||
#779 |
Member
Posts: 14
Karma: 10
Join Date: Aug 2023
Device: Kobo Forma
|
Good evening Haudek,
Perfect, it works very well. I understand the regex you gave me:
- a non-capturing group
- safe handling of spaces (\s+)
- 1 or more characters (\w+) up to the = sign
- between quotation marks, a set (from 0 to several) of characters, excluding the quotation mark itself
- the non-capturing group, present from 0 to several times
but I'd never have thought of something like that. What's more, the quantifier "?" (0 or 1) works with it!
Would you have an explanation, not too learned for me, as to why the formula (.*?)? only works when the string is present, whereas the quantifier says (0 or 1)?
I've made a note of how to do this for images and HTML entities, for another time. Thank you very much. |
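The breakdown above can be written out as a commented pattern. This is a sketch in Python's verbose mode, assuming `re` behaves like the editor's engine for this pattern. The sample paragraphs and the names `link`, `samples` and `results` are illustrative only.

```python
import re

link = re.compile(r'''
    <p><a
    (?:                   # non-capturing group for one attribute
        \s+ \w+           # whitespace, then the attribute name
        = "[^"]*"         # equals sign, then a quoted value
    )*                    # zero or more attributes before href
    \s+ href="(.+?)"      # group 1: link target
    \s+ id="(.+?)"        # group 2: anchor id
    > ([0-9]{1,4}) </a>   # group 3: note number
    (?:\s)*               # optional whitespace after the link
    (.+?) </p>            # group 4: note text
''', re.VERBOSE)

samples = [
    '<p><a href="anchor450" id="for450">450</a>Text1</p>',
    '<p><a class="backlink" href="anchor452" id="for452">452</a>Text3</p>',
]

template = r'<p><a href="\1" id="\2">\3</a>\4 something else</p>'
results = [link.sub(template, sample) for sample in samples]

for line in results:
    print(line)
```

One limit worth knowing: `\w+` does not match a colon, so an attribute such as `epub:type="noteref"` would not be skipped by this group. A class like `[\w:-]+` for the attribute name would be one way to cover it.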
#780 | |
Member
Posts: 24
Karma: 111614
Join Date: Mar 2025
Location: Poland
Device: Kindle Voyage
|
Quote:
Look here. That "empty string" in the first line does not literally mean that something is "present." |
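One way to see this in practice, as a hedged sketch in Python (assuming `re` matches these patterns the way the editor does). In `<a (.+?)? href=`, the optional group sits between two literal spaces. When the group matches nothing, both spaces are still required, so a link with a single space cannot match. The variable names are illustrative only.

```python
import re

pattern = re.compile(r'<a (.+?)? href="(.+?)"')

# One space between "<a" and "href": the two literal spaces cannot both match.
one_space = pattern.search('<a href="anchor450">')

# Two spaces: the group is skipped and each literal space finds its character.
two_spaces = pattern.search('<a  href="anchor450">')

# Attribute present: the group captures it between the two spaces.
with_class = pattern.search('<a class="backlink" href="anchor452">')

print(one_space)            # None
print(two_spaces.group(1))  # None (the group did not take part in the match)
print(with_class.group(1))  # class="backlink"
```

So the quantifier does allow zero occurrences. It is the fixed space outside the group that the class-less links fail on, which is why moving the space inside the group fixes the search.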
|