Old 01-08-2022, 09:43 AM   #691
1v4n0
Is there a regex that finds all the letters and only the letters, including accented ones and, just as an example, š č ć ž đ? Or do I have to add them to the range manually, as in [a-zèòéùàšđčćž]? I always risk leaving some out.

Thanks.
Old 01-08-2022, 09:52 AM   #692
Doitsu
Quote:
Originally Posted by 1v4n0 View Post
Is there a regex that finds all the letters and only the letters, including accented ones and, just as an example, š č ć ž đ?
\p{Ll} will find all lower case Unicode letters.
\p{Lu} will find all upper case Unicode letters.
Old 01-08-2022, 10:55 AM   #693
1v4n0
Quote:
Originally Posted by Doitsu View Post
\p{Ll} will find all lower case Unicode letters.
\p{Lu} will find all upper case Unicode letters.
And \p{L} gives all Unicode letters, regardless of case. Thanks! I can't karma you, but if I could, I would.
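
For anyone who wants to check these classes outside Sigil, here is a minimal Python sketch of what \p{L}, \p{Ll} and \p{Lu} select. Python's built-in re module does not accept \p{...}, so the sketch uses the standard unicodedata module as a stand-in and filters characters by their Unicode general category. The helper name letters and the sample string are made up for illustration.

```python
import unicodedata


def letters(text, category_prefix="L"):
    """Return the characters whose Unicode general category starts with the prefix."""
    # NFC composes "base letter + combining accent" pairs into single letters,
    # so an accented letter is not split into a letter (L*) and a mark (Mn).
    text = unicodedata.normalize("NFC", text)
    return [ch for ch in text if unicodedata.category(ch).startswith(category_prefix)]


sample = "Šđ 42 čćž, è!"

all_letters = letters(sample)        # same idea as \p{L}
lower_case = letters(sample, "Ll")   # same idea as \p{Ll}
upper_case = letters(sample, "Lu")   # same idea as \p{Lu}

print(all_letters, lower_case, upper_case)
```

Digits, spaces and punctuation fall into other categories (Nd, Zs, Po and so on), which is why they drop out without having to be listed.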
Old 02-06-2022, 05:06 AM   #694
stumped
I have been using this wonderful code for years, but I confess I still don't know how it works.
Will some kind person talk me through it, symbol by symbol, please?

Purpose: remove a href (anchor) tags
Find: </?a ?([^>]+)?>
Replace: (leave empty, no replacement needed)
Old 02-06-2022, 06:54 AM   #695
BeckyEbook
@stumped:
If you want to thoroughly understand regular expressions, not just your example, I recommend that you have a look here:
https://regex101.com/r/CXf9WD/1
Old 02-06-2022, 07:29 AM   #696
stumped
I don't use regex often enough to retain a thorough understanding. I know enough to write simple patterns for find and replace within ebooks, but this one is too dense to follow, even via the previous link.

I conceptualise it as needing to find stuff that begins with <a and then delete up to and including the matching </a>.

When I blindly apply it in Sigil, it has a 100% success rate in stripping all the <a tags from entire books automatically, so it may be catering for some tricky edge cases?
Old 02-06-2022, 07:34 AM   #697
BeckyEbook
You think well.
In short, it searches for <(possibly /)a (possibly anything)>
So it will search for all opening anchors with any existing attributes and all closing anchors.
Old 02-06-2022, 09:06 AM   #698
DiapDealer
I can break it down, but it will be a little later.

Old 02-06-2022, 11:43 AM   #699
DiapDealer
</?a ?([^>]+)?>

The question marks are used to mark what comes before as optional.

So </?a is saying that the slash before the 'a' tag is optional. That means it matches both "<a" and "</a".

Then comes the space, which is also made optional, meaning it will match "<a", or "<a ".

The ([^>]+)? is a little trickier, but not terribly so. The parentheses group everything before the last question mark, meaning that the whole of what's inside the parentheses is optional.

"[^>]" is a common character class when trying to parse html tags. It simply means that it will match any character that is not (^) the greater-than character (>). It's used to ensure that the expression does not get greedy and grab content beyond the ending of the current tag (>). The + is for repetition. + is one or more times, and * means 0 or more times.

The use of + in this case is why the grouping parentheses and the question mark (making the whole thing optional) are necessary. In this particular case, the optional space character and the ([^>]+)? could be replaced with simply [^>]* (meaning match all characters (except >) zero or more times, instead of all characters (except >) one or more times... optionally).
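
To see why the [^>] class described above matters, here is a small Python sketch. Python's re module treats these particular constructs the same way PCRE does, as far as I know, and the sample markup is invented. It contrasts a greedy dot with the negated character class.

```python
import re

sample = '<a href="notes.xhtml#n1">note</a> and <b>bold</b>'

# Greedy dot: .* runs to the LAST > on the line, swallowing text and other tags.
greedy = re.search(r'<a.*>', sample).group(0)

# Negated class: [^>]* cannot step over a >, so the match ends with the tag itself.
bounded = re.search(r'<a[^>]*>', sample).group(0)

print(greedy)
print(bounded)
```

The first pattern matches the whole sample line; the second stops at the end of the opening anchor tag.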

Then match the closing > character.

</?a ?([^>]+)?>

should be synonymous with:

</?a[^>]*>

for the stripping of all opening and closing anchor tags (as well as any self-closing anchor tags of the variety: <a id="anchor_tag_1" />)

But no need to change what works. I included the slight simplification for explanatory purposes.
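
A quick way to check the "should be synonymous" claim is to run both patterns over the same text. This is a minimal Python sketch with an invented sample; the third pattern, STRICT, is my own variation and not from the posts above.

```python
import re

ORIGINAL = re.compile(r'</?a ?([^>]+)?>')
SIMPLIFIED = re.compile(r'</?a[^>]*>')
# Assumption for illustration: \b demands a word boundary right after the "a",
# so tag names that merely start with "a" are skipped.
STRICT = re.compile(r'</?a\b[^>]*>')

sample = ('<p>See <a href="ch1.xhtml#n1">note 1</a>'
          '<a id="anchor_tag_1" /> and <abbr title="x">X</abbr>.</p>')

stripped_original = ORIGINAL.sub('', sample)
stripped_simplified = SIMPLIFIED.sub('', sample)
stripped_strict = STRICT.sub('', sample)

print(stripped_original)
print(stripped_simplified)
print(stripped_strict)
```

Both forms give the same result here, including for the self-closing anchor. The sample also shows that both forms match any tag whose name starts with a, such as <abbr>. If a book contains such tags, the word boundary in STRICT leaves them alone.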

Old 02-24-2022, 10:12 AM   #700
BillPearl
Find missing quote marks

Perhaps this may help you on your way.
I recently had to find the missing first " of a pair, and finally came up with this:

Find strings with a missing first quote:
... calibre3">((?:\\"|[^"])*")</p>
(the text at each end matches the tags; the parentheses capture the string)

Search: ...calibre3">([")(?: (?= (\\?))\2.)*?\1</p> -between tags (no space after ?: ) ?

Replace: "\1 <-- (There's a space after the '1') -adds quote to front end of \1 (the captured text)

Did not work in all cases but this got the rest of the mis-matched pairs.
3">((?:\\"|[^"])*")[. , ?] -between tag and punctuation mark(s)

Quote:
Originally Posted by meme View Post
I'd like to see if I can collect Regular Expressions (PCRE format as introduced in Sigil 0.5.0) used for common or difficult issues, and maybe add them to the FAQ, etc. Partly so I can have a list to refer to when needed, but also to collect a lot of what's probably already been mentioned in this forum. And maybe to find out if there isn't a way to do a replacement that's needed.

For instance, is there a regex to do other types of replacement but only inside body tags?

Is there one only for the actual text - words not part of a tag name or attribute? Words that are only part of a tag name or attribute?

If you have any suggestions for the above cases, or any other useful Regex expressions please post them.
Old 02-24-2022, 10:45 AM   #701
BillPearl
Find missing quote marks

Perhaps this may help you on your way.
I recently had to find the missing first " of a pair, and finally came up with this:

Find strings with a missing first quote:
Find: calibre3">((?:\\"|[^"])*")</p>
(calibre3"> and </p> are the surrounding tags; the parentheses capture the string)

Replace: calibre3">"\1</p>
(adds a quote to the front of \1, the captured text)
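
The find and replace pair above can be tried out in Python before running it on a whole book. This is a sketch with invented sample paragraphs, assuming straight quotes and the calibre3 class from the post.

```python
import re

FIND = re.compile(r'calibre3">((?:\\"|[^"])*")</p>')
REPLACE = r'calibre3">"\1</p>'

# Opening quote missing: the text up to the closing quote is captured as \1.
missing = '<p class="calibre3">I am late."</p>'
# Already balanced: the first " would have to sit directly before </p>, so no match.
balanced = '<p class="calibre3">"I am fine."</p>'
# A second quoted stretch in the same paragraph: [^"] cannot cross the inner quotes.
mixed = '<p class="calibre3">I am late," she said. "Very late."</p>'

fixed_missing = FIND.sub(REPLACE, missing)
fixed_balanced = FIND.sub(REPLACE, balanced)
fixed_mixed = FIND.sub(REPLACE, mixed)

print(fixed_missing)
print(fixed_balanced)
print(fixed_mixed)
```

Only the first paragraph changes. The third case is one the pattern does not catch, so paragraphs like it still need a different search.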


Old 04-04-2022, 09:45 AM   #702
Ashjuk
I have a book that contains a lot of complex IDs that I am trying to replace with simple ones via a regex find and replace.

The IDs have strings like this - F8901-6c93446a08e5490e8e6b029bcac88fe9

I know how to use [a-z]+ and [0-9]+ but I don't know how to relate that to these complex strings. Unfortunately they appear to be completely random without a common start or end character.

I have tried a few examples that I found by searching on-line, but nothing seems to work for me.

All help gratefully received.
Old 04-04-2022, 10:05 AM   #703
BeckyEbook
Start with something similar:
Code:
id="[A-Fa-f0-9-]+"
Old 04-04-2022, 10:43 AM   #704
Ashjuk
Quote:
Originally Posted by BeckyEbook View Post
Start with something similar:
Code:
id="[A-Fa-f0-9-]+"
Thank you so much Becky, that worked great.

I did not realise you could combine alphanumeric characters within one set of brackets.

I had been trying [A-z]+[a-z]+[0-9]+ without success.
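
The difference between the two attempts can be sketched in Python. The ID value is the one quoted earlier in the thread; the second ID, the page snippet and the id1, id2 naming scheme are invented for illustration.

```python
import re

value = "F8901-6c93446a08e5490e8e6b029bcac88fe9"

# One class, repeated: each character may be A-F, a-f, 0-9 or a hyphen, in any order.
one_class = re.fullmatch(r'[A-Fa-f0-9-]+', value)

# Three classes in a row: letters, THEN lower-case letters, THEN digits, in that order.
in_sequence = re.fullmatch(r'[A-z]+[a-z]+[0-9]+', value)

# [A-z] also covers the ASCII characters between Z and a: [ \ ] ^ _ `
stray = re.fullmatch(r'[A-z]', '_')

# Renaming sketch: give each distinct id a short sequential name.
page = ('<p id="F8901-6c93446a08e5490e8e6b029bcac88fe9">One</p>'
        '<p id="0a1b-22cc">Two</p>')
mapping = {}


def rename(match):
    old = match.group(1)
    mapping.setdefault(old, f"id{len(mapping) + 1}")
    return f'id="{mapping[old]}"'


renamed = re.sub(r'id="([A-Fa-f0-9-]+)"', rename, page)
print(renamed)
```

The mapping dictionary keeps the old-to-new pairs, because any links pointing at the old IDs would need the same renaming.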
Old 04-04-2022, 08:51 PM   #705
theducks
id=".+?"
would find id= with any ID value between the quotes.
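
The question mark after the + is what keeps this pattern inside one attribute. This is a short Python sketch with invented tags, comparing the lazy form with a greedy one.

```python
import re

tag = '<p id="F8901-6c93" class="calibre3">'

# Lazy: .+? takes as little as possible, so it stops at the first closing quote.
lazy = re.search(r'id=".+?"', tag).group(0)

# Greedy: .+ takes as much as possible, so it runs on to the last quote in the tag.
greedy = re.search(r'id=".+"', tag).group(0)

# The dot accepts any value, so hand-written ids are matched as well.
plain = re.search(r'id=".+?"', '<h1 id="chapter-1">').group(0)

print(lazy)
print(greedy)
print(plain)
```

Because it matches every id, this form suits a book where all IDs should be replaced; the narrower [A-Fa-f0-9-]+ class is the safer choice when hand-written IDs have to survive.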