| 
			
			 | 
		#481 | 
| 
			
			
			
			 Groupie 
			
			Posts: 173 
				Karma: 40000 
				Join Date: Oct 2013 
				
				
				
				Device: kindle 
				
				
				 | 
	
	
	
		
		
			
			 
				
				Search only outside tags
			 
			
			
			Is there a way to search for characters or sequences only outside the HTML tags, i.e. only in text that actually "appears" in the book? I have tried searching within the "book view" of calibre, but replace doesn't work there.

	Right now I'm looking to replace "these" quotation marks with “these”.  | 
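A minimal Python sketch of one way to do this, offered as an illustration rather than as the answer given in the thread. It borrows the negative lookahead `(?![^<>]*>)` that appears in post #488 to skip any straight quote sitting inside a tag. The function name `curl_quotes` is hypothetical, and the sketch assumes the straight quotes in the text are balanced, so that they can simply be alternated between opening and closing.

```python
import re

# A straight double quote that is NOT inside a tag: the lookahead fails
# whenever a ">" can be reached without first crossing a "<" or ">".
OUTSIDE_TAG_QUOTE = re.compile(r'"(?![^<>]*>)')

def curl_quotes(html):
    """Alternate straight quotes outside tags between opening and closing curly quotes."""
    count = 0

    def swap(match):
        nonlocal count
        count += 1
        # Odd occurrences open a quotation, even occurrences close it.
        return '“' if count % 2 else '”'

    return OUTSIDE_TAG_QUOTE.sub(swap, html)

sample = '<p class="x">He said "hi" and <i>"bye"</i>.</p>'
print(curl_quotes(sample))
```

The attribute quotes in `class="x"` are left alone because a `>` follows them before any `<`. A stray unbalanced quote would throw the alternation off for the rest of the file, so it is worth checking the result by eye.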
| 
		 | 
	
	
	
		
		
		
		
			 
		
		
		
		
		
		
		
			
		
		
		
	 | 
| 
			
			 | 
		#482 | |
| 
			
			
			
			 Grand Sorcerer 
			
			Posts: 5,763 
				Karma: 24088559 
				Join Date: Dec 2010 
				
				
				
				Device: Kindle PW2 
				
				
				 | 
	
	
	
		
		
		
		
		 Quote: 
	
  | 
|
| 
		 | 
	
	
	
		
		
		
		
			 
		
		
		
		
		
		
		
			
		
		
		
	 | 
| 
			
			 | 
		#483 | |
| 
			
			
			
			 Unicycle Daredevil 
			
			Posts: 13,944 
				Karma: 185432100 
				Join Date: Jan 2011 
				Location: Planet of the Pudding Brains 
				
				
				Device: Aura HD (R.I.P. After six years the USB socket died.) tolino shine 3 
				
				
				 | 
	
	
	
		
		
		
		
		 Quote: 
	
  | 
|
| 
		 | 
	
	
	
		
		
		
		
			 
		
		
		
		
		
		
		
			
		
		
		
	 | 
| 
			
			 | 
		#484 | 
| 
			
			
			
			 Groupie 
			
			Posts: 173 
				Karma: 40000 
				Join Date: Oct 2013 
				
				
				
				Device: kindle 
				
				
				 | 
	
	
	
		
		
		
		
		 
			
			Wow, cool. It worked. Ty  | 
| 
		 | 
	
	
	
		
		
		
		
			 
		
		
		
		
		
		
		
			
		
		
		
	 | 
| 
			
			 | 
		#485 | 
| 
			
			
			
			 Groupie 
			
			Posts: 173 
				Karma: 40000 
				Join Date: Oct 2013 
				
				
				
				Device: kindle 
				
				
				 | 
	
	
	
		
		
			
			 
				
				Unopened quotation marks
			 
			
			
			I've counted all the opening and closing quotation marks (“ ”) in an epub, and there is one more closing mark than there are opening marks. 
		
	
		
		
		
		
		
		
		
		
		
		
	
	How do I find the unopened one?  | 
| 
		 | 
	
	
	
		
		
		
		
			 
		
		
		
		
		
		
		
			
		
		
		
	 | 
| 
			
			 | 
		#486 | 
| 
			
			
			
			 A Hairy Wizard 
			
			Posts: 3,397 
				Karma: 20212733 
				Join Date: Dec 2012 
				Location: Charleston, SC today 
				
				
				Device: iPhone 15/11/X/6/iPad 1,2,Air & Air Pro/Surface Pro/Kindle PW & Fire 
				
				
				 | 
	
	
	
		
		
		
		
		 
			
			try: 
		
	
		
		
		
		
		
		
		
		
		
		
	
	search: ”([^“]*?)”  | 
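To see what this search does, here is a small Python sketch (an illustration only, with a hypothetical helper name `find_unopened`). The pattern matches a closing quote, then any run of characters that contains no opening quote, then another closing quote. The second closing quote of each match is therefore one that was never opened.

```python
import re

# Closing quote, text without any opening quote, closing quote again.
UNOPENED = re.compile(r'”([^“]*?)”')

def find_unopened(text):
    """Return the offset of each closing quote that has no opening quote before it."""
    # m.end() - 1 is the position of the second closing quote in the match.
    return [m.end() - 1 for m in UNOPENED.finditer(text)]

sample = '“one” two” three “four”'
for offset in find_unopened(sample):
    print(offset, repr(sample[max(0, offset - 10):offset + 1]))
```

Swapping the two quote characters in the pattern gives the mirror-image search for an opening quote that is never closed.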
| 
		 | 
	
	
	
		
		
		
		
			 
		
		
		
		
		
		
		
			
		
		
		
	 | 
| 
			
			 | 
		#487 | 
| 
			
			
			
			 Groupie 
			
			Posts: 173 
				Karma: 40000 
				Join Date: Oct 2013 
				
				
				
				Device: kindle 
				
				
				 | 
	
	
	
		
		
		
		
		 
			
			It seems to work. Ty  
		
	
		
		
		
		
		
		
		
		
		
		
	
	 
		 | 
| 
		 | 
	
	
	
		
		
		
		
			 
		
		
		
		
		
		
		
			
		
		
		
	 | 
| 
			
			 | 
		#488 | 
| 
			
			
			
			 Connoisseur 
			
			Posts: 81 
				Karma: 10 
				Join Date: Nov 2013 
				
				
				
				Device: Kobo Aura HD 
				
				
				 | 
	
	
	
		
		
		
		
		 Code: 
	#Fixes ώ in words that are misspelled
CorrectText("ώ fixes",r"(\w+)(ιίι|\(ό|ο\)|ίό|ο>|ο'\)|ο'ι|ιό|οί|ιο|οι|<ο|οϊ)(\w+)(?![^<>]*>)(?!.*<body[^>]*>)", IsFixO)
In the epub tidy plugin I use this code to find a misspelled ώ. It searches for ιίι, (ό, etc., and if the result is a correct word it changes the match to ώ.

As the code is now, it only works inside a word (for example, στιίιμα changes to στρώμα). It doesn't work at the beginning or the end of a word (for example, ιίιστε [the correct word is ώστε] or αντιπαρατεθιίι [the correct word is αντιπαρατεθώ]).

If I change the first (\w+) to (\w+|\ ), I also get matches at the beginning of a word. What can I change to match at the end of a word as well? Thanks.

Last edited by gipsy; 11-18-2015 at 09:22 AM. Reason: Explanations  | 
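One possible approach, sketched in plain Python rather than with the plugin's own `CorrectText` and `IsFixO` helpers (which are not reproduced here): `\w+` demands at least one word character on each side, so a match can never touch the start or end of a word. Replacing both `\w+` with `\w*` allows zero characters on either side. The sketch uses only three of the misread sequences from the original pattern and skips the dictionary check that `IsFixO` presumably performs, so it is an assumption-laden illustration, not a drop-in fix.

```python
import re

# \w* (zero or more) instead of \w+ (one or more) on both sides, so the
# misread sequence may sit at the start, middle, or end of a word.
# The trailing lookahead keeps the match outside HTML tags.
OMEGA = re.compile(r'(\w*)(ιίι|ίό|ιό)(\w*)(?![^<>]*>)')

def fix_omega(text):
    """Replace the misread sequence with ώ, keeping the rest of the word."""
    return OMEGA.sub(r'\1ώ\3', text)

print(fix_omega('στιίιμα ιίιστε αντιπαρατεθιίι'))
```

In the plugin itself the equivalent change would be to the two `(\w+)` groups of the existing pattern, leaving the callback to decide whether the corrected word is valid.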
| 
		 | 
	
	
	
		
		
		
		
			 
		
		
		
		
		
		
		
			
		
		
		
	 | 
| 
			
			 | 
		#489 | 
| 
			
			
			
			 Groupie 
			
			Posts: 173 
				Karma: 40000 
				Join Date: Oct 2013 
				
				
				
				Device: kindle 
				
				
				 | 
	
	
	
		
		
		
		
		 
			
			Just found out that the case-conversion replacement syntax (\L\1\E to make the string lowercase, \U\1\E to make it uppercase) works in Sigil, but not in the calibre editor.
		 
		
	
		
		
		
		
		
		
		
		
		
		
	
	 | 
| 
		 | 
	
	
	
		
		
		
		
			 
		
		
		
		
		
		
		
			
		
		
		
	 | 
| 
			
			 | 
		#490 | 
| 
			
			
			
			 Ex-Helpdesk Junkie 
			
			Posts: 19,421 
				Karma: 85400180 
				Join Date: Nov 2012 
				Location: The Beaten Path, USA, Roundworld, This Side of Infinity 
				
				
				Device: Kindle Touch fw5.3.7 (Wifi only) 
				
				
				 | 
	
	
	
		
		
		
		
		 
			
			calibre doesn't use the PCRE library; it uses Matthew Barnett's Python regex module, which doesn't support the \U, \L and \E case-conversion escapes in replacement text. 
		
	
		
		
		
		
		
		
		
		
		
		
	
	Fortunately, calibre does support function-replace, with pre-supplied functions to uppercase/lowercase text.  | 
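As a rough sketch of what a function-mode replacement looks like: the calibre editor calls a Python function named `replace` for every match and substitutes whatever string it returns. The signature below is the one I understand the calibre manual to document for function mode. The `run_outside_calibre` harness is purely hypothetical, added only so the example can be run as an ordinary script.

```python
import re

def replace(match, number, file_name, metadata, dictionaries, data, functions, *args, **kwargs):
    # Rough equivalent of Sigil's \U\1\E: upper-case the first captured group.
    return match.group(1).upper()

def run_outside_calibre(pattern, text):
    # Hypothetical stand-in for the editor: pass each match to replace(),
    # filling the editor-supplied arguments with placeholders.
    return re.sub(pattern, lambda m: replace(m, 0, None, None, None, None, None), text)

# The lookarounds keep the <h2> tags out of the matched (and replaced) text.
print(run_outside_calibre(r'(?<=<h2>)([^<]+)(?=</h2>)', '<h2>chapter one</h2>'))
```

For plain upper- or lower-casing, the pre-supplied functions mentioned above make a custom function unnecessary. Writing one is only needed when the transformation is more involved.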
| 
		 | 
	
	
	
		
		
		
		
			 
		
		
		
		
		
		
		
			
		
		
		
	 | 
| 
			
			 | 
		#491 | 
| 
			
			
			
			 Grand Sorcerer 
			
			Posts: 28,891 
				Karma: 207182180 
				Join Date: Jan 2010 
				
				
				
				Device: Nexus 7, Kindle Fire HD 
				
				
				 | 
	
	
	
		
		
		
		
		 
			
			Note that Sigil plugins will have the same limitation with regard to regular expressions. Both the standard re module and Barnett's regex module are included with the bundled Python, but only the GUI S&R engine makes use of PCRE's case-conversion switches (as well as the \K switch).
		 
		
	
		
		
		
		
		
		
		
		
		
		
	
	 | 
| 
		 | 
	
	
	
		
		
		
		
			 
		
		
		
		
		
		
		
			
		
		
		
	 | 
| 
			
			 | 
		#492 | |
| 
			
			
			
			 Groupie 
			
			Posts: 173 
				Karma: 40000 
				Join Date: Oct 2013 
				
				
				
				Device: kindle 
				
				
				 | 
	
	
	
		
		
		
		
		 Quote: 
	
 ty Last edited by 1v4n0; 04-14-2016 at 10:30 AM.  | 
|
| 
		 | 
	
	
	
		
		
		
		
			 
		
		
		
		
		
		
		
			
		
		
		
	 | 
| 
			
			 | 
		#493 | 
| 
			
			
			
			 Interested in the matter 
			
			Posts: 421 
				Karma: 426094 
				Join Date: Dec 2011 
				Location: Spain, south coast 
				
				
				Device: Pocketbook InkPad 3 
				
				
				 | 
	
	
	
		
		
		
		
		 
			
			Take a look at: http://manual.calibre-ebook.com/function_mode.html 
		
	
		
		
		
		
		
		
		
		
		
		
		
			Specifically, the section on automatically fixing the case of headings in the document (it uses one of the built-in functions in the editor). Last edited by jbacelar; 04-14-2016 at 08:10 AM.  | 
| 
		 | 
	
	
	
		
		
		
		
			 
		
		
		
		
		
		
		
			
		
		
		
	 | 
| 
			
			 | 
		#494 | 
| 
			
			
			
			 Chief Bohemian Misfit 
			
			Posts: 571 
				Karma: 462964 
				Join Date: May 2013 
				
				
				
				Device: iPad, ADE 
				
				
				 | 
	
	
	
		
		
		
		
		 
			
			Hope it's okay for a veritable regex newbie to post a query in this thread -- I'm only just beginning to learn about this stuff, but with any luck it'll eventually start sinking in.

			I seem to have developed an affinity for doing up electronic versions of "ye olde bookes" -- for example, right now I'm doing up several Shakespeare plays in the original Elizabethan English, endeavouring to give them something of the "look and feel" of early typographic styles, complete with use of the long-ess (i.e. "ſ", the character that looks like an "f" but without the crossbar, and is actually an "s"). As with the unusual use of the "u" and "v" characters in early typography, whether an "ſ" is used instead of an "s" has to do with placement within a word, rather than the "sound" of the character or anything else like that.

			Very often when I find digital transcriptions of these early texts, they've kept the "u" and "v" oddities, but for some reason have changed all the long-esses to just "s" instead -- and so I have to change them back. The rule for when this is supposed to occur is actually fairly simple (although not all early printers/typographers followed it, the vast majority did): virtually every instance of "s" should be changed to "ſ" unless it falls at the end of the word, in which case it remains as "s."

			So to fix my texts up, I've been searching for every instance of "s" and changing it to "ſ" -- which right away means all my HTML code needs to be fixed up first, because things like "css," "class," "span," etc. get screwed up in the process -- and then I do another series of searches, looking for instances of "ſ" (long-ess) followed by a "." or "," or ":" or ";" or "?" or "!" or ")" or "[space]" or "[apostrophe -- curly or otherwise]", plus "<" should there be a closing </i> or </p> tag or something, i.e. wherever it might occur at the end of a word, and changing it back to "s" again. It's not that big a deal, actually, I can "correct" the long-esses in a whole book in, like, 5 or 10 minutes or so, but it would be totally cool to just whiz it off with one single regex search, of course.

			Oh, and it would have to be case-sensitive, of course -- all instances of upper-case "S" remain as "S."

			ALSO... A similar S&R could also be done on the "u" and "v" characters, the early rules for which also had to do with placement -- although as I mentioned before, most digital transcriptions of early texts seem to have retained those. It could come in handy, though, if at some point I encounter a text that has "modernized" the typography (but not the word-spelling) of something. For those characters, lower-case "v" was used for both "u" and "v" at the start of a word, while "u" was used for both "u" and "v" elsewhere in the word -- thus, the word we spell as "uvula" (that thing that dangles at the back of your mouth/throat) would be spelled rather oddly as "vuula." As for upper-case "U" and "V," there was only one character, "V" -- although this is very easy to change with a simple, regular S&R, of course. (Very often the upper-case "W" character -- and occasionally the lower-case "w," too -- would be written as "VV"/"vv," but most often not; it seems to have depended essentially on the font the printer had available and not on any "rule." This is why, however, we call the "w" character "double-u," actually -- in case you ever wondered.)

			Anyway, hope that's not too weird -- or, indeed, too basic -- a regex question for me to ask here. The long-ess part of my query would certainly be really great to have a regex for, though! Thanks so much, in advance! And thanks for bearing with me here, too, of course, with my long question/explanation.

			EDIT/POSTSCRIPT: I forgot about "i" and "j"! In early typography, there was only one character for both -- "i" -- although once again that's easy enough to fix up with a regular S&R, of course. The only time "j" was used was as a ligature. For example, in this Elizabethan Shakespeare text I'm working on, the word "allies" (in modern English) came up, which was spelled at that time as "alliis" -- and, hence, the "ii" became "ij" ("allijs"). If you look at how it looks, then you can see where we got the character "y" from.
		Last edited by Psymon; 07-14-2016 at 06:35 AM.  | 
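A hedged Python sketch of how the long-ess rule described above could be done in one pass, without damaging "css," "class," "span" and the like. It is an illustration with hypothetical names (`long_s`, `PROTECTED`, `NON_FINAL_S`), not something taken from the thread. The idea: split the file into protected pieces (tags, character entities, and whole style/script blocks) and ordinary text, then change a lower-case "s" to "ſ" only in the text and only when another letter follows it.

```python
import re

# Pieces that must never be altered: <style>/<script> blocks, tags, entities.
PROTECTED = re.compile(
    r'(<(?:style|script)\b.*?</(?:style|script)>|<[^>]*>|&[#\w]+;)',
    re.S | re.I,
)
# A lower-case "s" followed by a letter, i.e. not at the end of a word.
# Case-sensitive, so upper-case "S" is never touched.
NON_FINAL_S = re.compile(r's(?=[^\W\d_])')

def long_s(html):
    parts = PROTECTED.split(html)
    # With one capturing group, even indexes are text and odd ones are protected.
    for i in range(0, len(parts), 2):
        parts[i] = NON_FINAL_S.sub('ſ', parts[i])
    return ''.join(parts)

print(long_s('<p class="first">Missed sister&nbsp;says</p>'))
```

An "s" before punctuation, a space, an apostrophe, or a tag is left alone because no letter follows it. One known gap: a word split by a tag in the middle (such as `mis<i>take</i>`) treats the "s" before the tag as word-final, so those cases still need a manual check.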
| 
		 | 
	
	
	
		
		
		
		
			 
		
		
		
		
		
		
		
			
		
		
		
	 | 
| 
			
			 | 
		#495 | 
| 
			
			
			
			 Member 
			
			Posts: 24 
				Karma: 10 
				Join Date: Mar 2011 
				Location: Colorado 
				
				
				Device: Cruz Tablet 
				
				
				 | 
	
	
	
		
		
		
		
		 
			
			OK, here is a simple question for ya. In Sigil (0.7.4), I have a book where there is no separation between sentences. I am using this to find them:  ([a-z])([\.\,\?\!])([A-Z]) 
		
	
		
		
		
		
		
		
		
		
		
		
	
	which works perfectly. But what do I use in the replace field to move the new sentence over one space? There are over 3,500 matches and I don't want to insert a space manually for that many errors. Any suggestions?  | 
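Since the search pattern already captures the three pieces in groups, the usual approach is to put them back with a space added: in Sigil's replace field that would be `\1\2 \3`. The same idea as a small Python sketch (the function name `separate_sentences` is hypothetical, and the escapes inside the character class are dropped because they are not needed there):

```python
import re

# Lower-case letter, sentence punctuation, upper-case letter, with nothing between.
RUN_ON = re.compile(r'([a-z])([.,?!])([A-Z])')

def separate_sentences(text):
    # Re-emit the three captured groups with a space before the new sentence.
    return RUN_ON.sub(r'\1\2 \3', text)

print(separate_sentences('It ended.Then it began,However'))
```

The pattern will also fire on things like file names or abbreviations written in this letter-punctuation-capital shape, so stepping through a handful of matches before using Replace All is a sensible precaution.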
| 
		 | 
	
	
	
		
		
		
		
			 
		
		
		
		
		
		
		
			
		
		
		
	 | 