01-08-2022, 09:43 AM | #691 |
Groupie
Posts: 171
Karma: 40000
Join Date: Oct 2013
Device: kindle
|
Is there a regex that finds all the letters and only the letters, including accented ones and, just as an example, š č ć ž đ? Or do I have to manually add them to the range, as in [a-zèòéùàšđčćž]? I always risk leaving some out.
Thanks. |
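A minimal sketch of one way to approach this, written in Python as an illustration rather than as the answer given in the thread. It assumes a Unicode-aware engine: in Python's built-in re module, \w covers letters, digits and the underscore, so the negated class [^\W\d_] is left with letters only. Engines that support Unicode properties (PCRE, for example) usually offer \p{L} for the same purpose, but the built-in re module does not, so it is not used here. The name LETTERS and the sample text are made up for the example.

```python
import re

# [^\W\d_] = "a word character that is neither a digit nor an underscore",
# which leaves only letters, accented or not (str patterns are Unicode-aware).
LETTERS = re.compile(r"[^\W\d_]+")

sample = "šećer žđ123 abc_è"

# Every run of letters, with digits, underscores and spaces acting as separators.
words = LETTERS.findall(sample)
print(words)
```

Compared with a hand-written range such as [a-zèòéùàšđčćž], nothing has to be listed by hand, so no letter can be left out by accident.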
01-08-2022, 09:52 AM | #692 |
Grand Sorcerer
Posts: 5,584
Karma: 22735033
Join Date: Dec 2010
Device: Kindle PW2
|
|
01-08-2022, 10:55 AM | #693 |
Groupie
Posts: 171
Karma: 40000
Join Date: Oct 2013
Device: kindle
|
|
02-06-2022, 05:06 AM | #694 |
Wizard
Posts: 3,305
Karma: 10259306
Join Date: May 2016
Device: kobo forma, Kobo Libra, Huawei media Tab, fire HD10, PW3 HDX8.9,
|
I have been using this wonderful code for years, but I confess I still don't know how it works.
Will some kind person talk me through it, symbol by symbol, please? It removes a href (no replace needed): </?a ?([^>]+)?> |
02-06-2022, 06:54 AM | #695 |
Guru
Posts: 692
Karma: 2180740
Join Date: Jan 2017
Location: Poland
Device: Misc
|
@stumped:
If you want to thoroughly understand regular expressions, not just your example, I recommend that you have a look here: https://regex101.com/r/CXf9WD/1 |
02-06-2022, 07:29 AM | #696 |
Wizard
Posts: 3,305
Karma: 10259306
Join Date: May 2016
Device: kobo forma, Kobo Libra, Huawei media Tab, fire HD10, PW3 HDX8.9,
|
I don't use regex often enough to retain a thorough understanding. I know enough to write simple formulas for find and replace within ebooks, but this one is too dense to follow, even via the previous link.
I conceptualise it as needing to find stuff which begins with <a and then delete up to and including a matching /a>. When I blindly apply it in Sigil, it has a 100% success rate in stripping all the <a from entire books automatically, so it may be catering for some tricky edge cases? |
02-06-2022, 07:34 AM | #697 |
Guru
Posts: 692
Karma: 2180740
Join Date: Jan 2017
Location: Poland
Device: Misc
|
You think well.
In short, it searches for <(possibly /)a (possibly anything)> So it will search for all opening anchors with any existing attributes and all closing anchors. |
02-06-2022, 09:06 AM | #698 |
Grand Sorcerer
Posts: 27,549
Karma: 193191846
Join Date: Jan 2010
Device: Nexus 7, Kindle Fire HD
|
I can break it down, but it will be a little later.
Last edited by DiapDealer; 02-06-2022 at 11:43 AM. |
02-06-2022, 11:43 AM | #699 |
Grand Sorcerer
Posts: 27,549
Karma: 193191846
Join Date: Jan 2010
Device: Nexus 7, Kindle Fire HD
|
</?a ?([^>]+)?>
The question marks are used to mark what comes before as optional. So </?a is saying that the slash before the 'a' tag is optional. That means it matches both "<a" and "</a". Then comes the space, which is also made optional, meaning it will match "<a", or "<a ".
The ([^>]+)? is a little more tricky, but not terribly so. The parentheses are used to group everything before the last question mark. Meaning the whole of what's inside the parentheses is optional.
"[^>]" is a common character class when trying to parse html tags. It simply means that it will match any character that is not (^) the greater-than character (>). It's used to ensure that the expression does not get greedy and grab content beyond the ending of the current tag (>).
The + is for repetition. + is one or more times, and * means 0 or more times. The use of + in this case is why the grouping parentheses and the question mark to make the whole thing optional is necessary.
In this particular case: the optional space character and the ([^>]+)? could be replaced with simply [^>]* (meaning match all characters (except >) zero or more times, instead of all characters (except >) one or more times... optionally). Then match the closing > character.
</?a ?([^>]+)?> should be synonymous with: </?a[^>]*> for the stripping of all opening and closing anchor tags (as well as any self-closing anchor tags of the variety: <a id="anchor_tag_1" />)
But no need to change what works. I included the slight simplification for explanatory purposes.
Last edited by DiapDealer; 02-06-2022 at 11:49 AM. |
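The breakdown above can be checked with a short Python sketch. This is an illustration only, not how Sigil runs the search, and the sample markup and the names ORIGINAL, SIMPLIFIED, STRICT and strip_tags are made up for the example. It also shows one thing worth checking in your own books: because the space after the 'a' is optional, both forms will also match other tags whose name starts with "a", such as <abbr>. The \b variant is my own assumption for avoiding that, not something proposed in the thread.

```python
import re

ORIGINAL = re.compile(r"</?a ?([^>]+)?>")   # the expression from the thread
SIMPLIFIED = re.compile(r"</?a[^>]*>")      # the simplification from the thread
STRICT = re.compile(r"</?a\b[^>]*>")        # assumption: \b stops the match at the tag name "a"

sample = (
    '<p>See <a href="ch1.xhtml#n1">note 1</a>, '
    '<a id="anchor_tag_1" /> and <abbr title="x">X</abbr>.</p>'
)

def strip_tags(pattern, text):
    """Delete every match of pattern (find with an empty replacement)."""
    return pattern.sub("", text)

print(strip_tags(ORIGINAL, sample))    # anchors removed, but the <abbr> tags are removed too
print(strip_tags(SIMPLIFIED, sample))  # same result as ORIGINAL
print(strip_tags(STRICT, sample))      # anchors removed, <abbr> left alone
```

If a book contains no other tags beginning with "a", all three behave the same, which fits the success rate reported above.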
02-24-2022, 10:12 AM | #700 | |
Junior Member
Posts: 7
Karma: 591908
Join Date: Jun 2011
Device: Kindle
|
Find missing quote marks
Perhaps this may help you on your way.
Recently had to find the missing first " of a pair. Finally came up with this:
Find strings with a missing first quote
Search: ...calibre3">((?:\\"|[^"])*")</p> (the leading ...calibre3"> and the trailing </p> are the tags; the parentheses bracket the string)
Search: ...calibre3">([")(?: (?= (\\?))\2.)*?\1</p> -between tags (no space after ?: )
Replace: "\1 <-- (There's a space after the '1') -adds quote to front end of \1 (the captured text)
Did not work in all cases but this got the rest of the mis-matched pairs.
3">((?:\\"|[^"])*")[. , ?] -between tag and punctuation mark(s)
Quote:
|
|
02-24-2022, 10:45 AM | #701 | |
Junior Member
Posts: 7
Karma: 591908
Join Date: Jun 2011
Device: Kindle
|
Find missing quote marks
Perhaps this may help you on your way.
Recently had to find the missing first " of a pair. Finally came up with this:
Find strings with a missing first quote
Search: calibre3">((?:\\"|[^"])*")</p> (calibre3"> and </p> are the tags; the parentheses bracket the string)
Replace: calibre3">"\1</p> -adds quote to front end of \1 (the captured text)
Quote:
|
|
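A small Python sketch of the search and replace from the post above, assuming paragraphs of the form <p class="calibre3">...</p> as in the poster's book. The sample paragraphs and the name add_missing_quote are made up for illustration. The pattern only matches when the text between the tag and </p> contains exactly one plain " and that quote sits right before </p>, so paragraphs that are already quoted correctly are left alone.

```python
import re

# Search expression from the post: text with no opening quote, ending in a closing quote.
MISSING_FIRST_QUOTE = re.compile(r'calibre3">((?:\\"|[^"])*")</p>')

def add_missing_quote(html):
    """Put a quote in front of the captured text (\\1), as the post's replace string does."""
    return MISSING_FIRST_QUOTE.sub(r'calibre3">"\1</p>', html)

broken = '<p class="calibre3">Hello there.”</p>'.replace("”", '"')
quoted = '<p class="calibre3">"Hello there."</p>'
plain = '<p class="calibre3">No dialogue here.</p>'

print(add_missing_quote(broken))  # gains an opening quote
print(add_missing_quote(quoted))  # unchanged
print(add_missing_quote(plain))   # unchanged
```

As the poster notes, this will not catch every case; dialogue that ends before the closing </p> needs a different ending, such as the punctuation variant mentioned in the earlier post.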
04-04-2022, 09:45 AM | #702 |
Fanatic
Posts: 500
Karma: 3498633
Join Date: May 2011
Location: Surrey, UK
Device: Kobo Aura One, Sony PRS 600/650
|
I have a book that contains a lot of complex IDs that I am trying to replace with simple ones via a regex find and replace.
The IDs have strings like this - F8901-6c93446a08e5490e8e6b029bcac88fe9 I know how to use [a-z]+ and [0-9]+ but I don't know how to relate that to these complex strings. Unfortunately they appear to be completely random without a common start or end character. I have tried a few examples that I found by searching on-line, but nothing seems to work for me. All help gratefully received. |
04-04-2022, 10:05 AM | #703 |
Guru
Posts: 692
Karma: 2180740
Join Date: Jan 2017
Location: Poland
Device: Misc
|
Start with something similar:
Code:
id="[A-Fa-f0-9-]+" |
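A rough Python sketch of how that expression could drive the renaming, outside of Sigil or calibre. The function name simplify_ids, the "id" prefix and the sample markup are assumptions for illustration. Two caveats: the class [A-Fa-f0-9-] also matches short IDs made only of those characters (for example "ad1"), and any links that point at the old IDs would need the same mapping applied to their href values.

```python
import re
from itertools import count

# The suggested pattern, with a capture group added around the ID value.
HEX_ID = re.compile(r'id="([A-Fa-f0-9-]+)"')

def simplify_ids(html, prefix="id"):
    """Replace each distinct hex-style ID with prefix1, prefix2, ... and return the mapping."""
    mapping = {}
    counter = count(1)

    def rename(match):
        old = match.group(1)
        if old not in mapping:
            mapping[old] = f"{prefix}{next(counter)}"
        return f'id="{mapping[old]}"'

    return HEX_ID.sub(rename, html), mapping

sample = (
    '<p id="F8901-6c93446a08e5490e8e6b029bcac88fe9">One</p>'
    '<p id="chapter1">Two</p>'
)
new_html, mapping = simplify_ids(sample)
print(new_html)
print(mapping)
```

The second paragraph keeps its ID because "chapter1" contains letters outside a-f, so the pattern never reaches the closing quote.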
04-04-2022, 10:43 AM | #704 |
Fanatic
Posts: 500
Karma: 3498633
Join Date: May 2011
Location: Surrey, UK
Device: Kobo Aura One, Sony PRS 600/650
|
|
04-04-2022, 08:51 PM | #705 |
Well trained by Cats
Posts: 29,801
Karma: 54830978
Join Date: Aug 2009
Location: The Central Coast of California
Device: Kobo Libra2,Kobo Aura2v1, K4NT(Fixed: New Bat.), Galaxy Tab A
|
id=".+?"
would find ID= with any ID value between quotes |
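A quick Python illustration of why the question mark after .+ matters here. The sample tag is made up, and the negated-class version is an alternative I am adding for comparison, not something from the post. Without the ?, .+ is greedy and runs on to the last quote on the line.

```python
import re

tag = '<p id="F8901-6c93446a08e5490e8e6b029bcac88fe9" class="calibre3">text</p>'

lazy = re.findall(r'id=".+?"', tag)      # stops at the first closing quote
greedy = re.findall(r'id=".+"', tag)     # runs on to the last quote on the line
negated = re.findall(r'id="[^"]+"', tag) # alternative: anything except a quote

print(lazy)
print(greedy)
print(negated)
```

Unlike the hex-only character class, this matches every ID, including ones you may want to keep, so it is worth reviewing matches before doing a replace-all.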
|