#661
Enthusiast | Posts: 32 | Karma: 10 | Join Date: Sep 2020 | Device: Onyx Poke2

I hope this is the correct topic to post in.

In my language we use one-letter prepositions and conjunctions (a, i, o, u, k, s, v, z), which shouldn't appear at the end of a line. Here is an example from a book I am trying to "epubize": "spatřil člun a v tom člunu" (translation: "he saw a boat and in that boat"). What I want is to find the letters "a" and "v" and replace the space after each of them with a no-break space, to connect them to the following word. I have this regex (which I found somewhere):

Code:
	\s([aiouksvz])\s

I also tried this example, and again it finds only every second letter:

Code: 
	<p>some words a s i k v some words</p> Code: 
	<p>some words a s i k v some words</p>  | 
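A likely reason for the "every second letter" behaviour is that the trailing `\s` is consumed by each match, so the next one-letter word no longer has a leading whitespace character left to match. The sketch below reproduces this with Python's `re` module (an assumption for illustration; Sigil itself uses PCRE) and shows one possible workaround, a fixed-width lookbehind that checks the leading whitespace without consuming it. The names `consuming`, `non_consuming` and `NBSP` are made up for this example.

```python
import re

NBSP = "\u00a0"  # no-break space
sample = "<p>some words a s i k v some words</p>"

# As posted: both whitespace characters are part of the match, so after
# matching " a " the space in front of "s" has already been used up.
consuming = re.compile(r"\s([aiouksvz])\s")
print(consuming.findall(sample))      # ['a', 'i', 'v']

# Workaround sketch: only look at the leading whitespace, do not consume it.
non_consuming = re.compile(r"(?<=\s)([aiouksvz])\s")
print(non_consuming.findall(sample))  # ['a', 's', 'i', 'k', 'v']

# Replace the space after each one-letter word with a no-break space.
fixed = non_consuming.sub(lambda m: m.group(1) + NBSP, sample)
print(fixed)
```

The lookbehind here is exactly one character wide, which is the kind of lookbehind both PCRE and Python accept.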
#662
Grand Sorcerer | Posts: 24,905 | Karma: 47303824 | Join Date: Jul 2011 | Location: Sydney, Australia | Device: Kobo:Touch,Glo, AuraH2O, GloHD,AuraONE, ClaraHD, Libra H2O; tolinoepos

I think you want:

Code:
	\b([aiouksvz])\s
#663
Enthusiast | Posts: 32 | Karma: 10 | Join Date: Sep 2020 | Device: Onyx Poke2

Thank you, it works partially, but it also finds parts of the HTML code, such as:

Code:
	<a href...
#664
Grand Sorcerer | Posts: 28,891 | Karma: 207182180 | Join Date: Jan 2010 | Device: Nexus 7, Kindle Fire HD

To make \b honor Unicode codepoints, turn on the Unicode Character Properties flag with (*UCP). So the above:

Code:
	\b([aiouksvz])\s

becomes:

Code:
	(*UCP)\b([aiouksvz])\s

To make the expression ignore character-class matches that immediately follow the opening angle bracket (<) of an (X)HTML tag, you can use a negative lookbehind. Something like:

Code:
	(*UCP)(?<!\<)\b([aiouksvz])\s

The (*UCP) flag and the (?<!\<) lookbehind are not capturing groups, despite their appearance, so the replacement you're looking for will still be something like this (\1 followed by a no-break space):

Code:
	\1 

Last edited by DiapDealer; 09-17-2020 at 11:04 AM. Reason: Edited to correct the full expression
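For readers who want to try the same idea outside Sigil, here is a minimal sketch in Python. It rests on two assumptions that differ from the PCRE expression above: Python's `re` does not accept the `(*UCP)` verb, but `str` patterns are Unicode-aware by default, so `\b` already treats letters such as "ř" and "č" as word characters; the `re.ASCII` flag is used only to imitate what happens without Unicode word boundaries. The function name `glue_one_letter_words` and the sample strings are invented for the example.

```python
import re

NBSP = "\u00a0"  # no-break space

# Unicode-aware by default in Python 3; (?<!<) skips letters right after "<".
ONE_LETTER = re.compile(r"(?<!<)\b([aiouksvz])\s")

def glue_one_letter_words(xhtml: str) -> str:
    """Replace the whitespace after a one-letter word with a no-break space."""
    return ONE_LETTER.sub(lambda m: m.group(1) + NBSP, xhtml)

sample = '<p>spatřil člun a v tom člunu, <a href="#n1">na</a> <i>do</i></p>'
result = glue_one_letter_words(sample)
print(result)  # "a" and "v" are glued; the <a ...> and <i> tags are untouched

# Without Unicode word boundaries, "ř" counts as a non-word character, so the
# final "i" of "tři" looks like a standalone word and is glued by mistake.
ascii_only = re.compile(r"(?<!<)\b([aiouksvz])\s", re.ASCII)
ascii_result = ascii_only.sub(lambda m: m.group(1) + NBSP, "tři lodě")
print(ascii_result == "tři" + NBSP + "lodě")  # True
```

The second half is the reason the `(*UCP)` flag matters for Czech text: with ASCII-only boundaries, any one-letter match following an accented character is a false positive.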
#665
Enthusiast | Posts: 32 | Karma: 10 | Join Date: Sep 2020 | Device: Onyx Poke2

Thank you for your time and explanation, but unfortunately it only works partially again. It ignores the HTML code (a href, i), which is great, but it doesn't find all the letters I need. For example, in the sentence "spatřil člun a v tom člunu" it should find the letters "a" and "v", but it only finds "a" and ignores "v". It also finds some two-letter words such as "na", "do" or, in English, "as" and "is".
#666
Grand Sorcerer | Posts: 28,891 | Karma: 207182180 | Join Date: Jan 2010 | Device: Nexus 7, Kindle Fire HD

Apologies... I pasted the wrong full expression. It had an extraneous (and incorrect) negative character class that I was testing out.

This is the one that works for me for all of your examples so far:

Code:
	(*UCP)(?<!\<)\b([aiouksvz])\s
#668
Junior Member | Posts: 3 | Karma: 10 | Join Date: Sep 2020 | Device: none

Hello,

I need help with a regex. I have lines like these:

Code:
	<p>– Wahahahaha!</p>
	<p>Grasha got drunk, raged and got on the table.</p>
	<p>– Wahahahaha! This is a celebration party! Drink and sing guys!</p>

I want to change "<p>–" to " <p> 「", but I am having problems replacing "</p>" when "<p>– " is present at the beginning of the line. I have tried this regex search:

Code:
	(?<=<p>– .*)<\/p>
#669
Guru | Posts: 900 | Karma: 3501166 | Join Date: Jan 2017 | Location: Poland | Device: Various

Try (as long as I understand your problem correctly):

Search:

Code:
	(?<=<p>– )(.+)</p>

Replace:

Code:
	\1」</p>
#670
Grand Sorcerer | Posts: 28,891 | Karma: 207182180 | Join Date: Jan 2010 | Device: Nexus 7, Kindle Fire HD

Sigil's PCRE regex engine certainly supports positive lookbehinds. It just doesn't support variable-length lookbehinds, positive or negative. It's a known limitation of the PCRE engine.

Use \K to simulate a variable-length lookbehind:

Code:
	<p>–( .*?)\K<\/p>

More on the use of \K here: https://www.regular-expressions.info/keep.html
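The same limitation and a comparable workaround can be sketched in Python, with the caveat that Python's standard `re` module has no `\K` at all. As a stand-in, the sketch below captures the part that `\K` would discard and writes it back in the replacement; this is a swapped-in technique, not the `\K` expression itself. The names `DIALOGUE` and `close_dialogue` are hypothetical.

```python
import re

lines = [
    "<p>– Wahahahaha!</p>",
    "<p>Grasha got drunk, raged and got on the table.</p>",
    "<p>– Wahahahaha! This is a celebration party! Drink and sing guys!</p>",
]

# A variable-length lookbehind is rejected by Python's re as well.
try:
    re.compile(r"(?<=<p>– .*)</p>")
    lookbehind_ok = True
except re.error:
    lookbehind_ok = False
print(lookbehind_ok)  # False

# Stand-in for \K: capture the prefix and put it back in the replacement.
DIALOGUE = re.compile(r"(<p>– .*?)</p>")

def close_dialogue(line: str) -> str:
    """Insert a closing bracket before </p> on lines that start with '<p>– '."""
    return DIALOGUE.sub(r"\1」</p>", line)

converted = [close_dialogue(line) for line in lines]
for line in converted:
    print(line)
```

Only the two lines that begin with "<p>– " gain the closing bracket; the narration line is left as it was.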
#672
Grand Sorcerer | Posts: 28,891 | Karma: 207182180 | Join Date: Jan 2010 | Device: Nexus 7, Kindle Fire HD

Not sure why it looks like there's an extra space in my above expression. It seems to copy and work fine, though. *shrug*
#674
Junior Member | Posts: 8 | Karma: 591908 | Join Date: Jun 2011 | Device: Kindle

Suggestion

\[\s][a,i,o,u,k,s,v,zç]\[\s]

This will handle the '<a ' case, because it looks for a space both before and after the letter. You may want to run this with just one letter at a time using Replace All.
#675
Running with scissors | Posts: 1,592 | Karma: 14328510 | Join Date: Nov 2019 | Device: none

I don't understand why this isn't working; my search string is:

<a id="Page_([xvi]+)|([\d]+)" class="x-ebookmaker-pageno" title="\[([xvi]+)|([\d]+)\]"></a>

When the file contains

<a id="Page_iv" class="x-ebookmaker-pageno" title="[iv]"></a>

and I click on the Find button, it highlights only

<a id="Page_i

What's wrong with my regexp?
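One possible explanation, offered as a sketch rather than a confirmed diagnosis of Sigil's behaviour: an unparenthesized `|` has the lowest precedence in a regex, so it splits the whole search string into three separate alternatives, the first of which is only `<a id="Page_([xvi]+)`. The Python example below (Python's `re` is an assumption here, used only to illustrate the precedence rule) shows the partial match and a hypothetical rewrite that keeps each alternation inside its own group.

```python
import re

tag = '<a id="Page_iv" class="x-ebookmaker-pageno" title="[iv]"></a>'

# As posted: each top-level | divides the ENTIRE pattern, so the first
# alternative is just  <a id="Page_([xvi]+)  and it matches on its own.
as_posted = re.compile(
    r'<a id="Page_([xvi]+)|([\d]+)" class="x-ebookmaker-pageno" '
    r'title="\[([xvi]+)|([\d]+)\]"></a>'
)
partial = as_posted.search(tag).group(0)
print(partial)  # <a id="Page_iv

# Hypothetical fix: confine each alternation to its own group.
grouped = re.compile(
    r'<a id="Page_([xvi]+|\d+)" class="x-ebookmaker-pageno" '
    r'title="\[([xvi]+|\d+)\]"></a>'
)
m = grouped.fullmatch(tag)
print(m.group(1), m.group(2))  # iv iv
```

With the alternation grouped, the same expression also covers arabic page numbers such as `Page_12` with `title="[12]"`.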