07-27-2016, 10:01 AM | #496 | |
Wizard
Posts: 4,520
Karma: 121692313
Join Date: Oct 2009
Location: Heemskerk, NL
Device: PRS-T1, Kobo Touch, Kobo Aura
|
Quote:
|
|
07-27-2016, 10:28 AM | #497 |
Well trained by Cats
Posts: 29,689
Karma: 54369090
Join Date: Aug 2009
Location: The Central Coast of California
Device: Kobo Libra2,Kobo Aura2v1, K4NT(Fixed: New Bat.), Galaxy Tab A
|
that might miss those that start or end with quotes
Code:
([a-z])([\.\,\?\!]["]*)(["]*[A-Z])
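The post this replies to did not survive, so the intended replacement is unknown. As a rough Python sketch, assuming the goal is to insert a missing space after sentence punctuation (the replacement string and the sample text are both assumptions, not from the thread):

```python
import re

# Pattern exactly as posted: lowercase letter, punctuation plus optional
# straight quotes, then optional quotes plus a capital letter.
pattern = r'([a-z])([\.\,\?\!]["]*)(["]*[A-Z])'

# Made-up sample with two missing spaces.
text = 'He said "go."Then he left.It was late.'

# Assumed replacement: keep all three groups and put a space before the third.
fixed = re.sub(pattern, r'\1\2 \3', text)
print(fixed)
```

Because the second group grabs any quote first, a quote always stays with the punctuation. That is right for a closing quote but wrong for an opening one, so results with straight quotes are worth checking by eye.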
|
07-27-2016, 10:40 AM | #498 | ||
Ex-Helpdesk Junkie
Posts: 19,422
Karma: 85397180
Join Date: Nov 2012
Location: The Beaten Path, USA, Roundworld, This Side of Infinity
Device: Kindle Touch fw5.3.7 (Wifi only)
|
Quote:
Certainly we don't expect the people who already know the answer to ask questions...
Quote:
Using the power of lookaround zero-length assertions and word boundary zero-length assertions, the following regex will find a character-that-is-not-at-the-end-of-a-word (in this case "s") that is not inside HTML tags:
Find:
Code:
(?<=>[^<]*)s\B(?=[^>]*<)
Replace:
Code:
ſ
Explanation:
Last edited by eschwartz; 07-27-2016 at 10:53 AM. |
||
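For anyone trying the long-s idea outside the editor: Python's built-in `re` rejects variable-length lookbehind such as `(?<=>[^<]*)`, so this sketch swaps in a different technique. It splits the markup into tags and text and applies `s\B` only to the text pieces. The function name and sample markup are made up for illustration.

```python
import re

# One capturing group, so re.split keeps the tags in the result list.
TAG = re.compile(r'(<[^>]*>)')

def long_s(html: str) -> str:
    """Replace every non-word-final 's' outside of tags with 'ſ'."""
    parts = TAG.split(html)
    # Even indexes are text between tags, odd indexes are the tags themselves.
    return ''.join(
        part if i % 2 else re.sub(r's\B', 'ſ', part)
        for i, part in enumerate(parts)
    )

print(long_s('<p class="first">Congress passes</p>'))
```

Unlike the lookaround version, this treats anything outside a tag as text, including text before the first tag or after the last one.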
07-27-2016, 10:43 AM | #499 |
Member
Posts: 24
Karma: 10
Join Date: Mar 2011
Location: Colorado
Device: Cruz Tablet
|
|
08-20-2016, 10:57 AM | #500 |
Wizard
Posts: 1,022
Karma: 10963125
Join Date: Mar 2013
Location: Guben, Brandenburg, Germany
Device: Kobo Clara 2E, Tolino Shine 3
|
Hi all Regex cracks!
Is there a way to remove, with one expression, all anchor tags in an epub that have the following syntax:
Code:
<a name="pagexx" title="yy" id="pagexx"></a>
Maybe it's not even that difficult, but it's too much for my poor old brain. Thanks in advance!
|
08-20-2016, 11:27 AM | #501 | |
Well trained by Cats
Posts: 29,689
Karma: 54369090
Join Date: Aug 2009
Location: The Central Coast of California
Device: Kobo Libra2,Kobo Aura2v1, K4NT(Fixed: New Bat.), Galaxy Tab A
|
Quote:
Select an example, then press Ctrl-F (this also puts the selection in Find).
Right-click in the Find box: Tokenize.
Replace should be either blank or a space.
You could also do it the other way: replace each SET of the numbers with a \d+ (one or more digits, an integer).
|
08-20-2016, 11:37 AM | #502 |
A Hairy Wizard
Posts: 3,070
Karma: 18727053
Join Date: Dec 2012
Location: Charleston, SC today
Device: iPhone 11/X/6/iPad 1,2,Air & Air Pro/Surface Pro/Kindle PW & Fire
|
^^^ What theducks said.
eg:
find: <a name="page\d+" title="\d+" id="page\d+"></a>
replace: blank
if what you are replacing is JUST numbers
-or-
find: <a name="page(.*?)" title="(.*?)" id="page(.*?)"></a>
replace: blank
if what you are replacing can include letters or symbols.
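To see the difference between those two patterns, here is a small Python sketch. The sample markup, with a Roman-numeral page, is invented for illustration, and Python's `re` stands in for the editor's regex engine.

```python
import re

# Invented sample: one numeric page anchor and one Roman-numeral anchor.
sample = (
    '<p><a name="page12" title="12" id="page12"></a>Text on page 12.</p>\n'
    '<p><a name="pageiv" title="iv" id="pageiv"></a>Front matter.</p>\n'
)

digits_only = r'<a name="page\d+" title="\d+" id="page\d+"></a>'
anything = r'<a name="page(.*?)" title="(.*?)" id="page(.*?)"></a>'

# Replace with "blank", i.e. the empty string.
after_digits = re.sub(digits_only, '', sample)
after_anything = re.sub(anything, '', sample)

print(after_digits)    # the "pageiv" anchor is still there
print(after_anything)  # both anchors are gone
```

The `\d+` version is the stricter of the two: it leaves alone anything that is not purely numeric, which is safer if the book contains other anchors.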
08-20-2016, 11:37 AM | #503 |
Wizard
Posts: 1,022
Karma: 10963125
Join Date: Mar 2013
Location: Guben, Brandenburg, Germany
Device: Kobo Clara 2E, Tolino Shine 3
|
Thank you! It works as long as the issuer names are identical. But in fact, there are several names. It should be possible to catch them all. (Some names are separated by slashes, by the way.)
This refers to theducks' answer.
Last edited by Leonatus; 08-20-2016 at 11:40 AM.
08-20-2016, 11:41 AM | #504 |
A Hairy Wizard
Posts: 3,070
Karma: 18727053
Join Date: Dec 2012
Location: Charleston, SC today
Device: iPhone 11/X/6/iPad 1,2,Air & Air Pro/Surface Pro/Kindle PW & Fire
|
Ooops - ninjad you Leonatus!
|
08-20-2016, 12:22 PM | #505 |
Grand Sorcerer
Posts: 5,582
Karma: 22735033
Join Date: Dec 2010
Device: Kindle PW2
|
@Leonatus:
Use the following quick and dirty regex:
Code:
<a name=".*?" title=".*?" id=".*?"></a>
08-20-2016, 12:28 PM | #506 |
Wizard
Posts: 1,022
Karma: 10963125
Join Date: Mar 2013
Location: Guben, Brandenburg, Germany
Device: Kobo Clara 2E, Tolino Shine 3
|
Thank you all! Works like a charm! Fantastic!
(But I made the mistake of leaving the "dot all" and "minimal match" boxes checked. The result was the loss of big chunks of text. So, whoever wants to make use of this, take care!)
Last edited by Leonatus; 08-20-2016 at 12:40 PM.
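A small Python sketch of how that kind of loss can happen. It assumes (my reading, not something stated in the thread) that "minimal match" inverts greediness, so `.*?` ends up behaving like a greedy `.*`. The sample HTML is made up, and Python's `re` stands in for the editor's engine.

```python
import re

# Made-up sample: two page anchors with ordinary text between them.
html = (
    '<p><a name="page1" title="1" id="page1"></a>First page text.</p>\n'
    '<p>Middle text.</p>\n'
    '<p><a name="page2" title="2" id="page2"></a>Second page text.</p>\n'
)

lazy = r'<a name=".*?" title=".*?" id=".*?"></a>'       # as posted
greedy = r'<a name=".*" title=".*" id=".*"></a>'        # assumed effect of "minimal match"
safe = r'<a name="[^"]*" title="[^"]*" id="[^"]*"></a>'  # cannot cross a quote

lazy_result = re.sub(lazy, '', html)
# Greedy plus dot-all: one match runs from the first anchor to the last.
greedy_dotall_result = re.sub(greedy, '', html, flags=re.DOTALL)
safe_dotall_result = re.sub(safe, '', html, flags=re.DOTALL)

print(greedy_dotall_result)  # everything between the anchors is gone
```

Using `[^"]*` instead of `.*?` for the attribute values is one way to make the pattern indifferent to both checkboxes, since it can never run past the closing quote.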
09-16-2016, 02:26 PM | #507 |
Chief Bohemian Misfit
Posts: 571
Karma: 462964
Join Date: May 2013
Device: iPad, ADE
|
Hey, folks -- I am trying to learn/do this regex stuff on my own (however slowly)! I'm stumped on something that I would think should be fairly easy, though.
In my book, I've got almost 300 paragraphs that start off with a dropcap, with this being an example of how those paragraphs begin...
Code:
<span class="initial">H</span>onourable
...and this being what I'd like to change it to...
Code:
<span class="initial">H</span><span class="smallcaps">ONOURABLE</span>
For my regex search I initially came up with this...
<span class=\"initial\">(.+?)</span>([^>]*)\s
...and for replace this...
<span class="initial">\1</span><span class="smallcaps">\U\2\E</span>
...(and in this latter there's an invisible space there that I suppose you won't "see" in this post -- but it would be there in my S&R, of course).
For the life of me, though, that \s won't stop at the first space, that is, after the first word -- it selects the entire paragraph up to the last space in the paragraph! -- and it's also possible that there might actually be not a space, but a comma (or other punctuation) instead, and I'd like that closing span (for my smallcaps) to come before that.
I've searched around the 'net trying to find the solution to this, but just can't seem to find it -- every "answer" that I find on other sites and try just doesn't seem to work.
Thanks in advance, if anyone can help!
(PS. I'm not sure if my "replace" code is correct either, actually -- although I never got that far with figuring this out!)
Last edited by Psymon; 09-16-2016 at 02:35 PM.
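The overrun can be reproduced with a short Python sketch. The sample paragraph is made up, and Python's `re` is only a stand-in for the editor's engine. `[^>]*` happily matches spaces and is greedy, so it takes as much as it can and then gives back only enough for `\s` to match, which lands on the last space before the next `>`.

```python
import re

# Made-up paragraph using the dropcap markup from the post.
paragraph = (
    '<p><span class="initial">H</span>'
    'onourable members of the house, welcome.</p>'
)

# Search pattern exactly as posted.
original = r'<span class=\"initial\">(.+?)</span>([^>]*)\s'

m = re.search(original, paragraph)
captured = m.group(2)
print(repr(captured))  # far more than the first word
```

A class that also excludes whitespace and punctuation, or simply word characters only, would stop at the end of the first word instead.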
09-16-2016, 03:58 PM | #508 | ||
Grand Sorcerer
Posts: 27,467
Karma: 192992430
Join Date: Jan 2010
Device: Nexus 7, Kindle Fire HD
|
Quote:
Code:
<span class="initial">(.+?)</span>(\w*)
If any Unicode characters can be expected, you may want to make the \w Unicode-aware with the (*UCP) command.
Code:
(*UCP)<span class="initial">(.+?)</span>(\w*)
Code:
(*UCP)<span class="initial">(\w)</span>(\w*)
Code:
(*UCP)<span class="initial">(“?\w)</span>(\w*)
Quote:
To eliminate the issue of one-letter word drop/smallcaps, I'd probably do something like:
FIND:
Code:
(*UCP)<span class="initial">“?\w</span>\K(\w*)
REPLACE:
Code:
<span class="smallcaps">\U\1\E</span>
Last edited by DiapDealer; 09-16-2016 at 04:07 PM.
||
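For anyone scripting the same fix outside the editor, here is a rough Python equivalent. It is a sketch under stated assumptions, not a port: Python's built-in `re` has no `(*UCP)`, `\K`, or `\U...\E`, so it relies on `\w` already being Unicode-aware for `str` patterns in Python 3, captures the dropcap span instead of using `\K`, and uppercases in a replacement function. It also uses `\w+` rather than `\w*`, so a one-letter opening word gets no empty smallcaps span. That is my change, and the function name is made up.

```python
import re

# Group 1: the dropcap span (optionally with an opening curly quote).
# Group 2: the rest of the first word, at least one word character.
DROPCAP = re.compile(r'(<span class="initial">“?\w</span>)(\w+)')

def add_smallcaps(html: str) -> str:
    """Wrap the remainder of each dropcap word in an uppercased smallcaps span."""
    def repl(m: re.Match) -> str:
        return f'{m.group(1)}<span class="smallcaps">{m.group(2).upper()}</span>'
    return DROPCAP.sub(repl, html)

print(add_smallcaps('<p><span class="initial">H</span>onourable members,</p>'))
```

Because group 2 only takes word characters, the closing span lands before a following space, comma, or other punctuation.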
09-16-2016, 04:39 PM | #509 |
Chief Bohemian Misfit
Posts: 571
Karma: 462964
Join Date: May 2013
Device: iPad, ADE
|
<big snip>
Thank you so, so much, DiapDealer! That did indeed seem to do the trick! I know I did have at least one (maybe more) one-letter opening words, but I'll find out eventually if anything went funny there -- once my book is done, I'll be going through the entire thing page-by-page (several times, in different orientations, etc.) to look for any weirdness going on anywhere. In the meantime, though, that does seem to have done the trick! And thank you so much, too, for your detailed explanation of everything -- I'll study that more closely as well, and do my best to learn from it!
09-16-2016, 05:23 PM | #510 |
Grand Sorcerer
Posts: 27,467
Karma: 192992430
Join Date: Jan 2010
Device: Nexus 7, Kindle Fire HD
|
Glad to help. Good luck!
|