I hope this is the correct topic to post to.
In my language we use one-letter prepositions and conjunctions (a, i, o, u, k, s, v, z), which shouldn't appear at the end of a line. Here is an example from a book I am trying to "epubize": "spatřil člun a v tom člunu" (translation: "he saw a boat and in that boat"). What I want is to find the letters "a" and "v" and replace the space after each of them with a no-break space, to connect them to the following word. I have this regex (which I found somewhere):
Code:
\s([aiouksvz])\s
I also tried this example and again it finds only every second letter:
Code:
<p>some words a s i k v some words</p> |
I think you want:
Code:
\b([aiouksvz])\s |
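The difference between the two expressions can be checked outside Sigil. The sketch below uses Python's `re` module rather than Sigil's PCRE engine, so treat it as an illustration only; the variable names and the `\u00a0` (no-break space) replacement are assumptions, not something taken from the thread. `\s([aiouksvz])\s` consumes the space on both sides of the letter, so the space after one match is no longer available to start the next match, and every second letter is skipped. `\b` is zero-width, so nothing in front of the letter is consumed.

```python
import re

sample = "<p>some words a s i k v some words</p>"

# Consumes the whitespace before AND after the letter.
consuming = re.compile(r"\s([aiouksvz])\s")
# \b is a zero-width assertion, so only the trailing whitespace is consumed.
boundary = re.compile(r"\b([aiouksvz])\s")

print(consuming.findall(sample))  # every second letter is missed
print(boundary.findall(sample))   # all five letters are found

# Assumed replacement: keep the letter, swap the space for U+00A0.
joined = boundary.sub("\\1\u00a0", sample)
print(joined)
```

A lookahead such as `(?=\s)` in place of the trailing `\s` would also avoid consuming the second space, but the replacement would then have to be handled differently.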
Thank you, it works partially, but it also matches parts of the HTML code, such as:
Code:
<a href... |
To make \b honor Unicode code points, turn on the Unicode Character Properties flag with (*UCP). So the above:
Code:
\b([aiouksvz])\s
becomes:
Code:
(*UCP)\b([aiouksvz])\s
To make the expression ignore character class matches that immediately follow the opening angle bracket (<) of an (X)HTML tag, you can use a negative lookbehind. Something like:
Code:
(*UCP)(?<!\<)\b([aiouksvz])\s
The (*UCP) flag and the (?<!\<) lookbehind are not capturing groups, despite their appearance. So the replacement you're looking for will still be something like:
Code:
\1 |
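The effect of both additions can be approximated in Python, with two caveats: Python's `re` has no `(*UCP)` verb (its `\b` is already Unicode-aware for `str` patterns), so the sketch uses the `re.ASCII` flag to imitate a `\b` that only knows ASCII word characters. The sample sentence is made up for illustration.

```python
import re

# Hypothetical sample: Czech text with a non-ASCII letter before "i", plus a link.
sample = 'viděl růži a <a href="#n1">v tom</a> člunu'

pattern = r"(?<!<)\b([aiouksvz])\s"

# ASCII-only \b sees a boundary between "ž" and "i", so the "i" inside
# "růži" is matched by mistake.
ascii_hits = re.findall(pattern, sample, flags=re.ASCII)

# Unicode-aware \b treats "ž" as a word character, so "růži" is left alone.
unicode_hits = re.findall(pattern, sample)

# Without the negative lookbehind, the "a" of "<a href" is matched as well.
no_lookbehind_hits = re.findall(r"\b([aiouksvz])\s", sample)

print(ascii_hits, unicode_hits, no_lookbehind_hits)
```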
Apologies... I pasted the wrong full expression. It had an extraneous (and incorrect) negative character class that I was testing out.
This is the one that works for me for all of your examples so far: Code:
(*UCP)(?<!\<)\b([aiouksvz])\s |
Hello,
I need help with a regex. I have lines like these:
Code:
<p>– Wahahahaha!</p>
I am changing "<p>–" to "<p>「", but I am having problems replacing "</p>" when "<p>– " is present at the beginning of the line. I have tried this regex search:
Code:
(?<=<p>– .*)<\/p> |
Try this search (if I understand your problem correctly):
Code:
(?<=<p>– )(.+)</p>
with this replacement:
Code:
\1」</p> |
Sigil's PCRE regex engine certainly supports positive lookbehinds. It just doesn't support variable-length lookbehinds--positive or negative. It's a known limitation of the PCRE engine.
Use \K to simulate a variable-length lookbehind: Code:
<p>–( .*?)\K<\/p>
More on the use of \K here: https://www.regular-expressions.info/keep.html |
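Python's standard `re` module has the same fixed-width restriction on lookbehinds and does not support `\K`, so the sketch below swaps in a capture group instead. It is only an illustration of the idea; `bracket_dialogue` is a hypothetical helper name, and the one-pass variant assumes the opening dash should become 「 at the same time.

```python
import re

line = "<p>– Wahahahaha!</p>"

# A lookbehind containing .* has no fixed width, so `re` refuses to compile it.
try:
    re.compile(r"(?<=<p>– .*)</p>")
    variable_lookbehind_ok = True
except re.error:
    variable_lookbehind_ok = False

# Fixed-width lookbehind plus a capture group, as suggested in the thread.
closed = re.sub(r"(?<=<p>– )(.+)</p>", r"\1」</p>", line)


def bracket_dialogue(text: str) -> str:
    """Hypothetical one-pass variant: rewrite both ends of a dialogue line."""
    return re.sub(r"<p>– (.*?)</p>", r"<p>「\1」</p>", text)


print(variable_lookbehind_ok)
print(closed)
print(bracket_dialogue(line))
```

Lines that do not start with "<p>– " are left untouched by both substitutions.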
Not sure why it looks like there's an extra space in my above expression. It seems to copy and work fine, though. *shrug*
|
Suggestion:
[\s][aiouksvzç][\s]
This handles the '<a ' case because it requires whitespace both before and after the letter. You may want to run this with just one letter at a time using Replace All. |
I don't understand why this isn't working; my search string is:
Code:
<a id="Page_([xvi]+)|([\d]+)" class="x-ebookmaker-pageno" title="\[([xvi]+)|([\d]+)\]"></a>
When the file contains
Code:
<a id="Page_iv" class="x-ebookmaker-pageno" title="[iv]"></a>
and I click on the Find button, it highlights only
Code:
<a id="Page_i
What's wrong with my regexp? |
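One likely explanation, which nobody in the thread confirms: `|` has the lowest precedence of any regex operator, so the two bare `|` characters split the whole search string into three separate alternatives, the first of which is just `<a id="Page_([xvi]+)`. Moving each alternation inside its own group keeps it local. The sketch below checks this with Python's `re` rather than Sigil's PCRE engine; the `grouped` pattern is a suggested rewrite, not something from the original post.

```python
import re

tag = '<a id="Page_iv" class="x-ebookmaker-pageno" title="[iv]"></a>'

# As posted: the top-level | characters divide the pattern into three alternatives.
original = r'<a id="Page_([xvi]+)|([\d]+)" class="x-ebookmaker-pageno" title="\[([xvi]+)|([\d]+)\]"></a>'

# Suggested rewrite: each alternation is confined to its own group.
grouped = r'<a id="Page_([xvi]+|\d+)" class="x-ebookmaker-pageno" title="\[([xvi]+|\d+)\]"></a>'

partial = re.search(original, tag).group(0)  # only the start of the tag
full = re.search(grouped, tag).group(0)      # the whole tag

print(partial)
print(full)
```

With the grouped version, `\1` holds the page number from the id and `\2` the one from the title, whether it is a roman or an arabic numeral.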
MobileRead.com is a privately owned, operated and funded community.