Old 02-14-2024, 02:36 PM   #766
Mister L
Groupie
Mister L is the 'tall, dark, handsome stranger' all the fortune-tellers are referring to.
 
Posts: 179
Karma: 91148
Join Date: Jun 2010
Device: Sony 350
Quote:
Originally Posted by Doitsu View Post
There's a dedicated tool for this task that you might find helpful. Print Page Approximator for EPUB and EPUB3
(The author is an MR member.)
If you happen to have a Kindle, the Calibre KFX output plugin also supports auto-generated page numbers.
It looks like Page Approximator will do exactly what I need. Thank you again Doitsu!
Old 03-17-2025, 04:09 AM   #767
Pavulon
Member
Pavulon began at the beginning.
 
Posts: 14
Karma: 10
Join Date: Aug 2023
Device: Kobo Forma
Use of the quantifier "?"

In automatic translation:
Good morning,
I have two links with the same structure, except that one of them has a class.

<a href="anchor5" id="note5">5</a>

<a class="backlink" href="anchor6" id="noted">6</a>

On Regex 101, I manage to select both links by using the quantifier "?" on the class group.

<a (class="backlink")? href="(.*?)" id="(.*?)">([0-9] {1,4})</a>

In Sigil it does not work: only the link with the class is matched. I also tried the quantifier "*", with no better result.
Can anyone tell me how to do this?
Thanks.
Old 03-17-2025, 04:49 AM   #768
Haudek
Member
Haudek knows the difference between a duck.
 
Posts: 24
Karma: 111614
Join Date: Mar 2025
Location: Poland
Device: Kindle Voyage
Use this:
Code:
<a( class="backlink")* href="(.+?)" id="(.+?)">(\d+)</a>
Three comments:
1. * means zero or more occurrences, so the group may or may not be present.
2. You forgot to put the space before "class" inside the group; there is already one before "href".
3. \d+ is the easiest way to match numbers.
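For anyone who wants to check this outside Sigil, here is a minimal sketch using Python's re module as a stand-in for Sigil's regex engine (an assumption; for patterns this simple the two should agree). The stray space in [0-9] {1,4} is left out on the assumption that it is a paste artifact.

```python
import re

links = [
    '<a href="anchor5" id="note5">5</a>',
    '<a class="backlink" href="anchor6" id="noted">6</a>',
]

# Space outside the optional group: a link without the class would need
# two consecutive spaces between "<a" and "href".
original = re.compile(
    r'<a (class="backlink")? href="(.*?)" id="(.*?)">([0-9]{1,4})</a>'
)

# Space moved inside the group, as in the suggested regex.
fixed = re.compile(
    r'<a( class="backlink")* href="(.+?)" id="(.+?)">(\d+)</a>'
)

original_hits = [bool(original.fullmatch(link)) for link in links]
fixed_hits = [bool(fixed.fullmatch(link)) for link in links]

print(original_hits)  # [False, True]
print(fixed_hits)     # [True, True]
```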
Old 03-17-2025, 10:18 AM   #769
DiapDealer
Grand Sorcerer
DiapDealer ought to be getting tired of karma fortunes by now.
 
Posts: 28,352
Karma: 203720150
Join Date: Jan 2010
Device: Nexus 7, Kindle Fire HD
Quote:
Originally Posted by Pavulon View Post
<a (class="backlink")? href="(.*?)" id="(.*?)">([0-9] {1,4})</a>

In Sigil it does not work: only the link with the class is matched. I also tried the quantifier "*", with no better result.
Can anyone tell me how to do this?
Thanks.
Would your regex not make two spaces mandatory between <a and href?
I would think you'd need to make that first space optional as well.

<a( class="backlink")? href="(.*?)" id="(.*?)">([0-9] {1,4})</a>

I can't explain why your original would work on Regex 101 though.

There also appears to be a space between your [0-9] character class and the {1,4}, but that could be an artifact of not pasting your regex between code tags.

I can confirm that your regex works in Sigil if the first space is made optional (or the second space really--it doesn't matter):

Code:
<a( class="backlink")? href="(.*?)" id="(.*?)">([0-9]{1,4})</a>
I wouldn't have expected the original to do what you wanted anywhere, to be honest.

Last edited by DiapDealer; 03-17-2025 at 10:31 AM.
Old 03-17-2025, 10:49 AM   #770
BeckyEbook
Guru
BeckyEbook ought to be getting tired of karma fortunes by now.
 
Posts: 808
Karma: 2416112
Join Date: Jan 2017
Location: Poland
Device: Various
You are right!
The variant with * is better in this case:
Code:
href="(.*?)" id="(.*?)"
because it will also match empty attributes:
Code:
href="" id=""
It all depends on the intention of the questioner.
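A small sketch of the difference, again using Python's re module as a stand-in for Sigil's engine (an assumption). The sample strings are made up for illustration.

```python
import re

empty = 'href="" id=""'
filled = 'href="anchor5" id="note5"'

star = re.compile(r'href="(.*?)" id="(.*?)"')  # zero or more characters
plus = re.compile(r'href="(.+?)" id="(.+?)"')  # at least one character

star_empty = star.fullmatch(empty)   # matches, both groups are empty strings
plus_empty = plus.fullmatch(empty)   # None: .+? needs at least one character

star_filled = star.fullmatch(filled).groups()
plus_filled = plus.fullmatch(filled).groups()
```

With non-empty attributes the two variants capture the same values, so the choice only matters when empty attributes can occur.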
Old 03-17-2025, 04:47 PM   #771
Pavulon
Member
Pavulon began at the beginning.
 
Posts: 14
Karma: 10
Join Date: Aug 2023
Device: Kobo Forma
[Attached images: gzwb.jpg (71.1 KB), ejmj.jpg (307.5 KB)]

Good morning,

Yes, the stray spaces come from the repeated copy-pasting between the word processor and the translator, and from not using a code block.

In fact I was wrong. I thought it worked on Regex 101 because it automatically copies the text into the substitution area, and I had not looked carefully. It actually behaves as in Sigil: it only matches when the string occurs, whereas I am asking for a match with or without it.

I would have liked, as far as possible, to do the following with a single regex (of course, I could do it in 2 passes):
- find all the links, whether or not this first string (a class or something else) is present
- be able to modify, for example, group 5 of all the links
- not carry the string \1 over into the replacement

The quantifier ? (0 or 1) only finds 1
The quantifier * (0 or more) does not match
The quantifier {0,1} only finds 1

Thanks

Last edited by DiapDealer; 03-18-2025 at 10:20 AM. Reason: Thumbnailed oversized images
Old 03-18-2025, 01:56 AM   #772
Pavulon
Member
Pavulon began at the beginning.
 
Posts: 14
Karma: 10
Join Date: Aug 2023
Device: Kobo Forma
[Attached images: cu10.jpg (191.3 KB), hw8y.jpg (248.9 KB)]

Good morning,

If it helps to show what I am looking for, here is another example of a string that is not always present. Here the non-breaking space may be absent or present (0 to several times).
In this case I use the quantifier *, and here, where the string is not at the beginning, both Sigil and Regex 101 find and handle all 4 links correctly.

Thanks

Last edited by DiapDealer; 03-18-2025 at 10:23 AM. Reason: Thumbnailed oversized images
Old 03-18-2025, 03:42 AM   #773
Haudek
Member
Haudek knows the difference between a duck.
 
Posts: 24
Karma: 111614
Join Date: Mar 2025
Location: Poland
Device: Kindle Voyage
You can post screenshots as attachments. To do that, choose "Go Advanced" and click the paperclip icon to add an attachment.

Code:
<p><a href="anchor450" id="for450">450</a>Texte</p>
<p><a href="anchor451" id="for451">451</a>&#160;Texte</p>
<p><a href="anchor452" id="for452">452</a>&#160;&#160;Texte</p>
<p><a href="anchor453" id="for453">453</a>&#160;&#160;&#160;Texte</p>
My regex:
Code:
<p><a href="(.+?)" id="(.+?)">([0-9]{1,4})</a>[(&#160;)+]*(.+?)</p>
Is this what you were looking for?
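One caveat, sketched with Python's re module as a stand-in for Sigil's engine (an assumption): square brackets make a character class, so [(&#160;)+] matches any single one of the characters ( & # 1 6 0 ; ) + rather than the entity as a unit. A non-capturing group, (?:&#160;)*, is a possible alternative that leaves the group numbers unchanged. The "tricky" line is a made-up example.

```python
import re

lines = [
    '<p><a href="anchor450" id="for450">450</a>Texte</p>',
    '<p><a href="anchor451" id="for451">451</a>&#160;Texte</p>',
    '<p><a href="anchor452" id="for452">452</a>&#160;&#160;Texte</p>',
    '<p><a href="anchor453" id="for453">453</a>&#160;&#160;&#160;Texte</p>',
]

# Character class version, as posted.
char_class = re.compile(
    r'<p><a href="(.+?)" id="(.+?)">([0-9]{1,4})</a>[(&#160;)+]*(.+?)</p>'
)
# Non-capturing group version: repeats the whole entity.
entity_group = re.compile(
    r'<p><a href="(.+?)" id="(.+?)">([0-9]{1,4})</a>(?:&#160;)*(.+?)</p>'
)

class_texts = [char_class.fullmatch(line).group(4) for line in lines]
group_texts = [entity_group.fullmatch(line).group(4) for line in lines]

# Text that starts with digits shows where the two differ: the character
# class also swallows the leading "100".
tricky = '<p><a href="anchor454" id="for454">454</a>&#160;100 Texte</p>'
class_tricky = char_class.fullmatch(tricky).group(4)    # ' Texte'
group_tricky = entity_group.fullmatch(tricky).group(4)  # '100 Texte'
```

On the four sample lines both versions capture the same text, so the difference only shows up when the text after the spaces begins with one of the characters in the class.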
Old 03-18-2025, 09:17 AM   #774
Pavulon
Member
Pavulon began at the beginning.
 
Posts: 14
Karma: 10
Join Date: Aug 2023
Device: Kobo Forma
Hello Haudek,

Thank you for your answer.

I am looking for a solution to the problem in my earlier post. This one was just an illustration, to show that I managed to match an inconsistent string in this example. My regex:

Code:
<p><a href="(.+?)" id="(.+?)">([0-9]{1,4})</a>(.+?)*(.+?)</p>
Like yours:

Code:
<p><a href="(.+?)" id="(.+?)">([0-9]{1,4})</a>[(&#160;)+]*(.+?)</p>
Both find 4 matches here.

My problem is in the earlier post, where I cannot handle this famous inconsistent string (class="backlink").

Last edited by Pavulon; 03-18-2025 at 09:22 AM.
Old 03-18-2025, 01:25 PM   #775
Haudek
Member
Haudek knows the difference between a duck.
 
Posts: 24
Karma: 111614
Join Date: Mar 2025
Location: Poland
Device: Kindle Voyage
Why?

What is its inconsistency?

Code:
<p><a(?: class="backlink")? href="(.+?)" id="(.+?)">([0-9]{1,4})</a>(?:\s)*(.+?)</p>
Note: A group that starts with ?: is non-capturing and is not included in the group count, so the group numbers stay the same whether or not the class is present.

Replace:
Code:
<p><a href="\1" id="\2">\3</a>\4</p>
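A quick way to try this find/replace pair outside Sigil is Python's re module (a stand-in for Sigil's engine, which is an assumption; the two sample lines are taken from this thread in shortened form):

```python
import re

pattern = re.compile(
    r'<p><a(?: class="backlink")? href="(.+?)" id="(.+?)">'
    r'([0-9]{1,4})</a>(?:\s)*(.+?)</p>'
)
template = r'<p><a href="\1" id="\2">\3</a>\4</p>'

source = (
    '<p><a href="anchor450" id="for450">450</a>Text1</p>\n'
    '<p><a class="backlink" href="anchor452" id="for452">452</a>Text3</p>'
)

# Both lines match; the class is dropped because it is not in the template.
result = pattern.sub(template, source)

# Only the four capturing groups are counted, not the (?: ...) ones.
group_count = pattern.groups
```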
Since a plain find-and-replace in Sigil with (\s)*(.+?) does not remove the spaces, use a replace function, here named "Pavulon" (works in the beta version):

Code:
def replace(match, number, file_name, metadata, data):
	if match:
		return "<p><a href=\"" + match.group(1) + "\" id=\"" + match.group(2) + "\">" + match.group(3) + "</a>" + match.group(4) + "</p>"
I don't quite understand what you're trying to achieve, but I hope this helps.
Old 03-19-2025, 09:24 AM   #776
Pavulon
Member
Pavulon began at the beginning.
 
Posts: 14
Karma: 10
Join Date: Aug 2023
Device: Kobo Forma
Hello Haudek,

Thank you for taking the time to help me.
We are, through my fault, having trouble understanding each other.
I never meant inconsistency, but inconstancy (a translation problem?).
I'd like to make it clear that there is no problem with the spacing; it was only because I mismanaged (the senility of my 73 years, no doubt...) the copy-paste between the different editors.
Let me start from the beginning.
I was showing an example that works, where the links contain an inconstant character string (in this case the non-breaking space). "Inconstant" in French means that sometimes it is there (one to several times) and sometimes it isn't, as in:

Code:
<p><a href="anchor450" id="for450">450</a>Text1</p>
<p><a href="anchor451" id="for451">451</a>*Text2</p>
<p><a href="anchor452" id="for452">452</a>**Text3</p>
<p><a href="anchor453" id="for453">453</a>***Text4</p>
I was saying that with a capturing group and the quantifier "*" - star, asterisk - (0 or more), I could process all links, even the one where the required string (one or more non-breaking spaces) is absent (empty, null).

Code:
Find:
<p><a href="(.*?)" id="(.*?)">([0-9]{1,4})</a>(*)*(.*?)</p>

Replace:
<p><a href="\1" id="\2">\3</a>\5  something else</p>
This handles all 4 links correctly:
Code:
Result:
<p><a href="anchor450" id="for450">450</a>Text1  something else</p>
<p><a href="anchor451" id="for451">451</a>Text2  something else</p>
<p><a href="anchor452" id="for452">452</a>Text3  something else</p>
<p><a href="anchor453" id="for453">453</a>Text4  something else</p>
You suggested another regex, which also works.
Code:
<p><a href="(.+?)" id="(.+?)">([0-9]{1,4})</a>[(*)+]*(.+?)</p>
But I have no problem with this one. The quantifier on the searched group works, and I agree that:
- we could use the \d+ notation
- we could use a non-capturing group (?: ...) and renumber the \x backreferences

So here's the problem.
Let's say I have a group of links with a string that may or may not be present, that is, 0 or 1 times.
Of course, I could deal with this in several passes; I could always manage, that's not the problem. But intellectually, it's still bothering me (I don't know how that will be translated... ;-) ).
Here it's a class, but this string could be something else (epub:type="noteref"), etc.

Code:
<p><a href="anchor450" id="for450">450</a>Text1</p>
<p><a href="anchor451" id="for451">451</a>Text2</p>
<p><a class="backlink" href="anchor452" id="for452">452</a>Text3</p>
<p><a class="backlink" href="anchor450" id="for450">450</a>Text4</p>
So I tried a capturing group for an unknown string at that position, with the quantifier "?" (0 or 1).

Code:
Find:
<p><a (.+?)? href="(.+?)" id="(.+?)">([0-9]{1,4})</a>(.+?)</p>

Replace:
<p><a href="\2" id="\3">\4</a>\5  something else</p>
Without success: unlike the non-breaking-space example above, it only finds the links in which this string is present. The string (class...) has been removed, but the text has clearly been modified only in the links that had this string.

Code:
Result:
<p><a href="anchor450" id="for450">450</a>Text1</p>
<p><a href="anchor451" id="for451">451</a>Text2</p>
<p><a href="anchor452" id="for452">452</a>Text3  something else</p>
<p><a href="anchor450" id="for450">450</a>Text4  something else</p>
If I try the capturing group with quantifier {0,1}.

Code:
Find:
<p><a (.+?){0,1} href="(.+?)" id="(.+?)">([0-9]{1,4})</a>(.+?)</p>

Replace:
<p><a href="\2" id="\3">\4</a>\5  something else</p>

Result:
<p><a href="anchor450" id="for450">450</a>Text1</p>
<p><a href="anchor451" id="for451">451</a>Text2</p>
<p><a href="anchor452" id="for452">452</a>Text3  something else</p>
<p><a href="anchor450" id="for450">450</a>Text4  something else</p>
I get the same result. I only catch links where the string is present.

If I try the quantifier "*" (0 or more), which also allows 1:

Code:
Find:
<p><a (.+?)* href="(.+?)" id="(.+?)">([0-9]{1,4})</a>(.+?)</p>
Nothing is matched at all!

After your posts, I also tested a non-capturing group, with the same result: only the links containing the string are matched.

Code:
Find:
<p><a (?:.+?)? href="(.+?)" id="(.+?)">([0-9]{1,4})</a>(.+?)</p>

Replace:
<p><a href="\1" id="\2">\3</a>\4  something else</p>

Result:
<p><a href="anchor450" id="for450">450</a>Text1</p>
<p><a href="anchor451" id="for451">451</a>Text2</p>
<p><a href="anchor452" id="for452">452</a>Text3  something else</p>
<p><a href="anchor450" id="for450">450</a>Text4  something else</p>
It's quite an oddity! (I wonder how that one's going to get translated again...)

As I see "significant" differences between the different translators (Deepl, Google translate, QTranslate, etc.), and I can't judge whether they're "substantial", as I don't speak English at all, I'm attaching my French text, which you can run through several translators if you need to, to clear up any ambiguities. And I'm pasting here the one given by Deepl.
But it's not surprising that there are sometimes misunderstandings in exchanges.

Thanks again for your attention.

Translated with www.DeepL.com/Translator (free version)
Attached Files
File Type: txt Sigil regex for Haudek in French.txt (5.5 KB, 45 views)

Last edited by Pavulon; 03-19-2025 at 09:49 AM.
Old 03-19-2025, 09:35 AM   #777
Pavulon
Member
Pavulon began at the beginning.
 
Posts: 14
Karma: 10
Join Date: Aug 2023
Device: Kobo Forma
I don't know how to post the HTML code for non-breaking spaces, so asterisks stand in for it in my examples.

And how did you manage to put the images like that? Did you turn them into attachments?

Last edited by Pavulon; 03-19-2025 at 09:41 AM.
Old 03-19-2025, 12:16 PM   #778
Haudek
Member
Haudek knows the difference between a duck.
 
Posts: 24
Karma: 111614
Join Date: Mar 2025
Location: Poland
Device: Kindle Voyage
Newest regex:
Code:
<p><a(?:\s+\w+="[^"]*")*\s+href="(.+?)"\s+id="(.+?)">([0-9]{1,4})</a>(?:\s)*(.+?)</p>
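A minimal check of this regex with Python's re module (a stand-in for Sigil's engine, which is an assumption). One thing it brings out: \w does not match a colon, so an attribute such as epub:type="noteref", mentioned earlier in the thread, is not covered. The "extended" pattern with [\w:-]+ is only a suggestion for that case, not part of the regex above.

```python
import re

newest = re.compile(
    r'<p><a(?:\s+\w+="[^"]*")*\s+href="(.+?)"\s+id="(.+?)">'
    r'([0-9]{1,4})</a>(?:\s)*(.+?)</p>'
)
# Hypothetical variant: attribute names may also contain ":" and "-".
extended = re.compile(
    r'<p><a(?:\s+[\w:-]+="[^"]*")*\s+href="(.+?)"\s+id="(.+?)">'
    r'([0-9]{1,4})</a>(?:\s)*(.+?)</p>'
)

lines = [
    '<p><a href="anchor450" id="for450">450</a>Text1</p>',
    '<p><a class="backlink" href="anchor452" id="for452">452</a>Text3</p>',
    '<p><a epub:type="noteref" href="anchor453" id="for453">453</a>Text4</p>',
]

matched = [bool(newest.fullmatch(line)) for line in lines]
extended_matched = [bool(extended.fullmatch(line)) for line in lines]
class_groups = newest.fullmatch(lines[1]).groups()

print(matched)           # [True, True, False]
print(extended_matched)  # [True, True, True]
```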
Quote:
Originally Posted by Pavulon View Post
I don't know how to do it, it replaces the HTML code for non-breaking spaces with asterisks.
If you want to write &#160;, select the "&" and make that character bold. The parser will then not treat the notation as an entity. It adds a bit of work when writing the post, but there is no problem later when copying. Instead of bold, you can use something else (italics, a different color, etc.).

Quote:
Originally Posted by Pavulon View Post
And how did you manage to put the images like that? Did you turn them into attachments?
I don't know if this is the correct way, but it can be done like this:
1. From the toolbar, you select the paperclip.
2. You add an attachment.
3. After adding an image attachment, you close the "Manage Attachments" window and select the arrow next to the paperclip.
Now all you have to do is select the appropriate image and [ATTACH]NUMBER[/ATTACH] will be inserted in the post code.
Done.

Last edited by Haudek; 03-20-2025 at 04:36 PM. Reason: A simple way to add attachments.
Old 03-19-2025, 06:23 PM   #779
Pavulon
Member
Pavulon began at the beginning.
 
Posts: 14
Karma: 10
Join Date: Aug 2023
Device: Kobo Forma
Good evening Haudek,

Perfect, it works very well.
I understand the regex you gave me:
- a non-capturing group
- robust handling of whitespace (\s+)
- 1 or more word characters (\w+) up to the = sign
- between quotation marks, a set (from 0 to several) of characters, except for a quotation mark
- the non-capturing group, present from 0 to several times

but I'd never have thought of something like that. What's more, the quantifier “?” (0 or 1) works with it!
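The breakdown above can be written into the pattern itself. This is only a sketch using Python's re.VERBOSE flag, which ignores whitespace and allows comments inside the pattern; whether Sigil's search box needs an inline (?x) flag for the same effect is not something I have checked.

```python
import re

annotated = re.compile(r'''
    <p><a
    (?:                  # non-capturing group: one extra attribute
        \s+              #   one or more whitespace characters
        \w+ =            #   attribute name up to the = sign
        "[^"]*"          #   quoted value: zero or more non-quote characters
    )*                   # the whole group may occur zero or more times
    \s+ href="(.+?)"     # group 1: href value
    \s+ id="(.+?)"       # group 2: id value
    > ([0-9]{1,4}) </a>  # group 3: note number
    (?:\s)* (.+?) </p>   # group 4: paragraph text
''', re.VERBOSE)

sample = '<p><a class="backlink" href="anchor452" id="for452">452</a>Text3</p>'
annotated_groups = annotated.fullmatch(sample).groups()
```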

Would you have an explanation, not too learned for me, as to why the formula (.*?)? only works when the string is present, whereas the quantifier says (0 or 1)?

I've made a note of how to do this for images and HTML entities, for another time.

Thank you very much.
Old 03-19-2025, 06:54 PM   #780
Haudek
Member
Haudek knows the difference between a duck.
 
Posts: 24
Karma: 111614
Join Date: Mar 2025
Location: Poland
Device: Kindle Voyage
Quote:
Originally Posted by Pavulon View Post
Would you have an explanation, not too learned for me, as to why the formula (.*?)? only works when the string is present, whereas the quantifier says (0 or 1)?
The expression (.*?)? is actually quite specific and I don't think I've ever used it. It works by matching an optional (0 or 1) group, which in turn tries to match as few characters as possible.

Look here. That "empty string" in the first line does not literally mean that something is "present."
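To make this concrete, here is a small sketch with Python's re module (a stand-in for Sigil's engine, which is an assumption). The optional group is allowed to match nothing, but the literal spaces on both sides of it are still required, so a link with a single space between "<a" and "href" cannot match. The line with two spaces is a made-up example.

```python
import re

without_class = '<p><a href="anchor450" id="for450">450</a>Text1</p>'
with_class = '<p><a class="backlink" href="anchor452" id="for452">452</a>Text3</p>'
two_spaces = '<p><a  href="anchor450" id="for450">450</a>Text1</p>'

# Space before AND after the optional group, as in the earlier attempts.
space_outside = re.compile(
    r'<p><a (.+?)? href="(.+?)" id="(.+?)">([0-9]{1,4})</a>(.+?)</p>'
)
# Leading space moved inside the optional group.
space_inside = re.compile(
    r'<p><a( .+?)? href="(.+?)" id="(.+?)">([0-9]{1,4})</a>(.+?)</p>'
)

outside_plain = space_outside.fullmatch(without_class)   # None
outside_class = space_outside.fullmatch(with_class)      # matches
outside_double = space_outside.fullmatch(two_spaces)     # matches, group 1 unset

inside_plain = space_inside.fullmatch(without_class)     # matches, group 1 unset
inside_class = space_inside.fullmatch(with_class)        # matches
```

So the "0" case of the quantifier does work; it only shows up when the text really has two spaces in a row, or when the space is moved inside the group.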