09-09-2015, 05:01 AM | #481 |
Groupie
Posts: 171
Karma: 40000
Join Date: Oct 2013
Device: kindle
|
Search only outside tags
Is there a way to search for characters or sequences only outside the HTML tags, i.e. only in text that actually "appears" in the book? I have tried searching within the "book view" of calibre, but the replace doesn't work there.
Right now I'm looking to replace "these" quotation marks with “these”. |
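For anyone landing on this question later, here is a minimal Python sketch of one common way to restrict a match to text outside tags: a negative lookahead that rejects any match followed by a ">" with no "<" or ">" in between. The function name curl_quotes is made up for illustration, and the sketch assumes a quoted passage contains no inline tags (a quote that spans an <i>...</i> is skipped).

```python
import re

# A straight-quoted run with no tag characters inside it, and not followed by
# "...>" without an intervening "<" (which would mean we are inside a tag).
QUOTED_OUTSIDE_TAGS = re.compile(r'"([^"<>]*)"(?![^<>]*>)')

def curl_quotes(html):
    """Replace "these" with “these” in visible text, leaving attributes alone."""
    return QUOTED_OUTSIDE_TAGS.sub(r'“\1”', html)

print(curl_quotes('<p class="x">He said "hello" to me.</p>'))
```

The same find expression and a \1-style replacement should carry over to an editor's regex search, but check it on a copy of the book first.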
09-09-2015, 05:15 AM | #482 | |
Grand Sorcerer
Posts: 5,584
Karma: 22735033
Join Date: Dec 2010
Device: Kindle PW2
|
Quote:
|
|
09-09-2015, 06:19 AM | #483 | |
Unicycle Daredevil
Posts: 13,923
Karma: 185041098
Join Date: Jan 2011
Location: Planet of the Pudding Brains
Device: Aura HD (R.I.P. After six years the USB socket died.) tolino shine 3
|
Quote:
|
|
09-09-2015, 09:04 AM | #484 |
Groupie
Posts: 171
Karma: 40000
Join Date: Oct 2013
Device: kindle
|
Wow, cool. It worked
Ty |
10-02-2015, 12:18 PM | #485 |
Groupie
Posts: 171
Karma: 40000
Join Date: Oct 2013
Device: kindle
|
Unopened quotation marks
I've counted all the opening and closing quotation marks (“ ”) in an epub, and there is one more closing mark than there are opening marks.
How do I find the unopened one? |
10-02-2015, 02:11 PM | #486 |
A Hairy Wizard
Posts: 3,094
Karma: 18727053
Join Date: Dec 2012
Location: Charleston, SC today
Device: iPhone 11/X/6/iPad 1,2,Air & Air Pro/Surface Pro/Kindle PW & Fire
|
try:
search: ”([^“]*?)” |
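To see why that expression finds the culprit: it matches a closing mark, then any run of text that contains no opening mark, then another closing mark. Two closing marks with no opening mark between them means the second one was never opened. A small Python sketch of the same search (the helper name find_unopened is just for illustration):

```python
import re

# A closing quote, the shortest possible run without an opening quote,
# then another closing quote.
UNOPENED = re.compile(r'”([^“]*?)”')

def find_unopened(text):
    """Return each stretch of text that runs from one ” to the next with no “ in between."""
    return [m.group(0) for m in UNOPENED.finditer(text)]

print(find_unopened('“One.” Two.” “Three.”'))
```

The opening mark is probably missing somewhere inside each reported stretch.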
10-02-2015, 06:32 PM | #487 |
Groupie
Posts: 171
Karma: 40000
Join Date: Oct 2013
Device: kindle
|
It seems to work. Ty
|
11-17-2015, 01:58 PM | #488 |
Connoisseur
Posts: 81
Karma: 10
Join Date: Nov 2013
Device: Kobo Aura HD
|
Code:
#Fixes ώ in words that are misspelled
CorrectText("ώ fixes",r"(\w+)(ιίι|\(ό|ο\)|ίό|ο>|ο'\)|ο'ι|ιό|οί|ιο|οι|<ο|οϊ)(\w+)(?![^<>]*>)(?!.*<body[^>]*>)", IsFixO)
In the epub tidy plugin I use this code to find a misspelled ώ. It searches for ιίι, (ό, etc., and if the result is correct it changes the sequence to ώ. As written, the code only works inside a word (for example, στιίιμα changes to στρώμα). It doesn't work at the beginning or the end of a word (for example, ιίιστε [the correct word is ώστε] or αντιπαρατεθιίι [the correct word is αντιπαρατεθώ]). If I change the first (\w+) to (\w+|\ ), I also get matches at the beginning of a word. What can I change to match at the end of a word as well? Thanks Last edited by gipsy; 11-18-2015 at 08:22 AM. Reason: Explanations |
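One possible approach, sketched in plain Python rather than through the plugin: use \w* (zero or more word characters) instead of \w+ on both sides, so the garbled sequence may sit at the start, the middle, or the end of a word. This is a simplified sketch. It keeps only the ιίι alternative, replaces unconditionally, and does not reproduce the plugin's CorrectText or IsFixO (which presumably checks whether the result is a real word). The name fix_omega is hypothetical.

```python
import re

# \w* on both sides allows an empty prefix or suffix, so the sequence can be
# word-initial or word-final. The lookahead keeps the match outside tags.
OMEGA_FIX = re.compile(r'(\w*)(ιίι)(\w*)(?![^<>]*>)')

def fix_omega(text):
    """Replace the garbled sequence with ώ wherever it occurs in a word."""
    return OMEGA_FIX.sub(r'\1ώ\3', text)

for word in ('ιίιστε', 'αντιπαρατεθιίι', 'στιίιμα'):
    print(word, '->', fix_omega(word))
```

In the original expression that would mean changing both (\w+) groups to (\w*). Whether the plugin's checking function copes with an empty first or last group is something to test.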
02-28-2016, 12:59 PM | #489 |
Groupie
Posts: 171
Karma: 40000
Join Date: Oct 2013
Device: kindle
|
Just found out that the case-conversion replacement syntax (\L\1\E to make the string lowercase, \U\1\E to make it uppercase) works in Sigil, but not in the calibre editor.
|
02-28-2016, 01:21 PM | #490 |
Ex-Helpdesk Junkie
Posts: 19,422
Karma: 85397180
Join Date: Nov 2012
Location: The Beaten Path, USA, Roundworld, This Side of Infinity
Device: Kindle Touch fw5.3.7 (Wifi only)
|
calibre doesn't use the PCRE library; it uses Matthew Barnett's Python regex module, which doesn't include uppercase/lowercase conversion.
Fortunately, calibre does support function-replace, with pre-supplied functions to uppercase/lowercase text. |
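As a rough illustration of what function-replace looks like: in the calibre editor's function mode the replacement is a Python function named replace that receives the match object and returns the new text. The signature below is the one shown in the calibre manual as far as I recall it, so check it against the manual. The uppercase_headings wrapper and its dummy arguments are only there so the sketch runs outside calibre.

```python
import re

def replace(match, number, file_name, metadata, dictionaries, data, functions, *args, **kwargs):
    # Equivalent of \U...\E: return the whole match in upper case.
    return match.group().upper()

def uppercase_headings(html):
    """Stand-in for the editor: call replace() on the text of every <h1>."""
    pattern = re.compile(r'(?<=<h1>)[^<]+(?=</h1>)')
    # The extra arguments are placeholders; calibre supplies real values.
    return pattern.sub(lambda m: replace(m, 1, 'demo.html', None, None, {}, {}), html)

print(uppercase_headings('<h1>chapter one</h1><p>text</p>'))
```

Swapping .upper() for .lower() gives the \L...\E behaviour.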
02-28-2016, 02:29 PM | #491 |
Grand Sorcerer
Posts: 27,549
Karma: 193191846
Join Date: Jan 2010
Device: Nexus 7, Kindle Fire HD
|
Note that Sigil plugins will have the same limitation with regard to regular expressions. Both the standard re and Barnett's regex module are included with the bundled Python, but only the GUI S&R engine makes use of PCRE's case-conversion escapes (as well as \K).
|
04-14-2016, 03:21 AM | #492 | |
Groupie
Posts: 171
Karma: 40000
Join Date: Oct 2013
Device: kindle
|
Quote:
ty Last edited by 1v4n0; 04-14-2016 at 09:30 AM. |
|
04-14-2016, 07:07 AM | #493 |
Interested in the matter
Posts: 421
Karma: 426094
Join Date: Dec 2011
Location: Spain, south coast
Device: Pocketbook InkPad 3
|
Take a look at: http://manual.calibre-ebook.com/function_mode.html
Specifically: automatically fixing the case of the headings in the document (one of the builtin functions in the editor). Last edited by jbacelar; 04-14-2016 at 07:10 AM. |
07-14-2016, 05:21 AM | #494 |
Chief Bohemian Misfit
Posts: 571
Karma: 462964
Join Date: May 2013
Device: iPad, ADE
|
Hope it's okay for a veritable Regex newbie to post a query in this thread -- I'm only just beginning to learn about this stuff, but with any luck it'll eventually start sinking in.
I seem to have developed an affinity for doing up electronic versions of "ye olde bookes" -- for example, right now I'm doing up several Shakespeare plays in the original Elizabethan English, endeavouring to give them somewhat of the "look and feel" of early typographic styles, complete with use of the long-ess (i.e. "ſ", the character that looks like an "f" but without the crossbar, and is actually an "s"). As with the unusual use of the "u" and "v" characters in early typography, where an "ſ" is used instead of "s" has to do with placement within a word, rather than the "sound" of the character or anything else like that.
Very often when I find digital transcriptions of these early texts, they've kept the "u" and "v" oddities, but for some reason have changed all the long-esses to just "s" instead -- and so I have to change them back. The rule for when this is supposed to occur is actually fairly simple (although not all early printers/typographers followed this, the vast majority did): virtually every instance of "s" should be changed to "ſ" unless it falls at the end of the word, in which case it remains as "s."
So to fix my texts up, I've been searching for every instance of "s" and then changing it to "ſ" -- which right away causes all my HTML code to need to be fixed up first, because things like "css," "class," "span," etc. get screwed up in the process -- and then I do another series of searches, looking for instances of "ſ" (long-ess) plus a "." or "," or ":" or ";" or "?" or "!" or ")" or "[space]" or "[apostrophe -- curly or otherwise]", plus "<" should there be a closing </i> or </p> tag or something, i.e. wherever it might occur at the end of a word, and then changing it back to "s" again. It's not that big a deal, actually, I can "correct" the long-esses in a whole book in, like, 5 or 10 minutes or so, but it would be totally cool to just whiz it off with one single regex search, of course.
Oh, and it would have to be case-sensitive, of course -- all instances of upper-case "S" remain as "S."
ALSO... A similar S&R could also be done on the "u" and "v" characters, the early rules for which also had to do with placement -- although as I mentioned before, most digital transcriptions of early texts seem to have retained those. It could come in handy, though, if at some point I encounter a text that has "modernized" the typography (but not word-spelling) of something. For those characters, lower-case "v" was used for both "u" and "v" at the start of a word, while "u" was used for both "u" and "v" elsewhere in the word -- thus, the word we spell as "uvula" (that thing that dangles at the back of your mouth/throat) would be spelled rather oddly as "vuula." As for upper-case "U" and "V," there was only one character, "V" -- although this is very easy to change with a simple, regular S&R, of course. (Very often the upper-case "W" character -- and occasionally the lower-case "w," too -- would be written as "VV"/"vv," but most often not; it seems to have been essentially dependent on the font the printer had available and not based on any "rule." This is why, however, we call the "w" character "double-u," actually -- in case you ever wondered.)
Anyway, hope that's not too weird -- or, indeed, too basic -- a Regex question for me to ask here. The long-ess part of my query would certainly be really great to have a Regex expression for, though! Thanks so much, in advance! And thanks for bearing with me here, too, of course, with my long question/explanation.
EDIT/POSTSCRIPT: I forgot about "i" and "j"! In early typography, there was only one character for both -- "i" -- although once again that's easy enough to fix up with a regular S&R, of course. The only time "j" was used was as a ligature.
For example, in this Elizabethan Shakespeare text I'm working on, the word "allies" (in modern English) came up, which was spelled at that time as "alliis" -- and, hence, the "ii" became "ij" ("allijs"). If you look at how it looks, then you can see where we got the character "y" from. Last edited by Psymon; 07-14-2016 at 05:35 AM. |
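A possible single-pass version of the long-ess rule as stated above, sketched in Python: match a lower-case "s" only when it is followed by another word character (so it is not the last letter of the word) and only when it is not inside a tag. The name to_long_s is made up. The sketch assumes the text contains no named entities with an "s" in them (something like &nbsp; would be mangled) and it would also touch text in the <head>, such as a <title> or an embedded stylesheet.

```python
import re

# "s" followed by a word character (not word-final), and not followed by
# "...>" without an intervening "<" (i.e. not inside a tag). Case-sensitive,
# so upper-case "S" is left alone.
LONG_S = re.compile(r's(?=\w)(?![^<>]*>)')

def to_long_s(html):
    """Turn every non-final lower-case s in visible text into a long s."""
    return LONG_S.sub('ſ', html)

print(to_long_s('<p class="less">Kisses, so sweet</p>'))
```

The find expression itself, s(?=\w)(?![^<>]*>) with ſ as the replacement, should work as-is in a regex search that supports lookaheads.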
07-27-2016, 09:46 AM | #495 |
Member
Posts: 24
Karma: 10
Join Date: Mar 2011
Location: Colorado
Device: Cruz Tablet
|
OK, here is a simple question for ya. In Sigil (0.7.4), I have a book where there is no space between sentences. I am using this to find them: ([a-z])([\.\,\?\!])([A-Z])
which works perfectly. But what do I use in the replace field to move the new sentence over one space? There are over 3,500 matches and I don't want to insert a space manually for that many errors. Any suggestions? |
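Since the find expression already captures three groups, the usual answer is to put them back with a space before the third, i.e. a replacement of \1\2 \3. A Python sketch of the same substitution (the function name add_sentence_space is just for illustration):

```python
import re

# Same three groups as the find expression above; inside a character class
# the punctuation does not need to be escaped.
MISSING_SPACE = re.compile(r'([a-z])([.,?!])([A-Z])')

def add_sentence_space(text):
    """Re-emit the letter and punctuation, then a space, then the capital."""
    return MISSING_SPACE.sub(r'\1\2 \3', text)

print(add_sentence_space('end.Next one,Then?Yes'))
```

Note that the expression will also hit things like file names or URLs that happen to contain a lower-case letter, a dot, and a capital, so a spot check before Replace All is worthwhile.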
|