03-01-2014, 02:25 AM | #301 | ||
Wizard
Posts: 2,608
Karma: 3000161
Join Date: Jan 2009
Device: Kindle PW3 (wifi)
|
I am sorry, but I am not sure I understand. Look at the screenshot for a big quote. What's missing?
Oh, you mean having this: Quote:
Quote:
If this is what you meant, the problem is that the original book has none of them. Also, present or not, these quotes would not have changed a thing in the regex.
Last edited by roger64; 03-01-2014 at 04:30 AM. |
||
03-02-2014, 02:09 PM | #302 |
Connoisseur
Posts: 81
Karma: 10
Join Date: Nov 2013
Device: Kobo Aura HD
|
Hi,
I use the following regex to remove the hyphenation from some words.
Find: (\p{Greek})-(\p{Greek})
Replace: \1\2
Is there a way to ignore some of the regex results? For example, to ignore πλάι-πλάι, ίσα-ίσα, μισό-μισό and every other hyphen that has the same word on both sides (the same text as \1 and \2).
Thanks |
03-02-2014, 07:36 PM | #303 |
Grand Sorcerer
Posts: 5,583
Karma: 22735033
Join Date: Dec 2010
Device: Kindle PW2
|
Yes, but this requires some serious "Regex Fu" (e.g. negative lookbehinds).
I'd simply search for repeated Greek words with a hyphen between them and replace the hyphens with a substitute character (@):
Find: (\p{Greek}+)-(\1)
Replace: \1@\2
Then you can use your regex and, at the end, globally replace all at signs (@) with hyphens.
@Jellby: Can you optimize this simple regex by creating a regex that will find a Greek word not followed by a hyphen and the same Greek word using backreferences and negative lookbehinds? |
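For anyone who wants to try the three steps outside Sigil, here is a minimal Python sketch of the placeholder approach described above. It is only an illustration under some assumptions: Python's built-in `re` module has no `\p{Greek}`, so explicit code point ranges stand in for it, and the function name `dehyphenate_in_steps` and the `¬` placeholder are made up for this example.

```python
import re

# Assumption: explicit ranges stand in for \p{Greek}, which Python's re lacks.
GREEK = r"[\u0370-\u03FF\u1F00-\u1FFF]"
# Assumption: this character never occurs in the book text.
PLACEHOLDER = "\u00ac"


def dehyphenate_in_steps(text: str) -> str:
    """Hypothetical helper: protect repeated words, remove hyphens, restore."""
    # Step 1: a word, a hyphen, then the same word again -> swap the hyphen
    # for the placeholder so the next step cannot touch it.
    text = re.sub(rf"({GREEK}+)-(\1)", rf"\1{PLACEHOLDER}\2", text)
    # Step 2: the original regex, removing a hyphen between two Greek letters.
    text = re.sub(rf"({GREEK})-({GREEK})", r"\1\2", text)
    # Step 3: turn the placeholders back into hyphens.
    return text.replace(PLACEHOLDER, "-")


print(dehyphenate_in_steps("μισό-μισό, γρα-τζουνιές"))
```

Like the regex in the post, step 1 has no word-boundary check, so a hyphenated word whose second half begins with the same letters that end its first half would also be protected.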
03-03-2014, 03:25 AM | #304 |
frumious Bandersnatch
Posts: 7,515
Karma: 18512745
Join Date: Jan 2008
Location: Spaniard in Sweden
Device: Cybook Orizon, Kobo Aura
|
I could try, but I can only test it in vim, which uses a different regex dialect, so I don't think it would be very useful. Besides, I would rather do as you suggested: first replace repeated Greek words.
|
03-04-2014, 04:14 AM | #305 |
Groupie
Posts: 171
Karma: 86271
Join Date: Feb 2012
Device: iPad, Kindle Touch, Sony PRS-T1
|
you could try this:
Code:
find: (?<=\P{Greek})(\p{Greek}+)-(?!\1)
replace: \1
(\p{Greek}+) that are preceded by anything other than a greek character: (?<=\P{Greek})
then a hyphen: -
that is not followed by the group it matched previously: (?!\1)
replacing it with \1 just removes the hyphen
** edit **
i was trying to get this to work with unicode ranges so that it could be simplified further (no need for the look-behind), but couldn't seem to get it working in sigil, or my other text editor which has PCRE, for that matter. i was trying to match [\u0370-\u03FF] and (?-u)[\u0370-\u03FF] with no success. anyone have tips on this?
** edit 2 **
i was hoping to get rid of the look-behind by starting the expression with a word boundary, but turns out \b is only useful for ASCII characters, i.e. [a-zA-Z0-9_], so looks like the look-behind may be necessary in these cases. here's an updated version based on Doitsu's comment below that includes Greek_Extended in the search pattern:
Code:
(?<![\x{0370}-\x{03FF}\x{1F00}-\x{1FFF}])([\x{0370}-\x{03FF}\x{1F00}-\x{1FFF}]+)-(?!\1)
Last edited by mzmm; 03-04-2014 at 07:34 AM. |
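The single-pass version from edit 2 can be checked outside Sigil with a short Python sketch. This is a rough equivalent, not the same engine: Python's `re` writes the code points as `\uXXXX` instead of `\x{XXXX}`, and the names `GREEK`, `BROKEN_WORD` and `join_hyphenated` are invented for the example.

```python
import re

# Same two blocks as in edit 2, written with Python's \uXXXX escapes.
GREEK = r"[\u0370-\u03FF\u1F00-\u1FFF]"

# A run of Greek letters that does not start mid-word (negative look-behind),
# then a hyphen that is not followed by the same run again (negative look-ahead).
BROKEN_WORD = re.compile(rf"(?<!{GREEK})({GREEK}+)-(?!\1)")


def join_hyphenated(text: str) -> str:
    """Hypothetical helper: replacing the match with \\1 drops only the hyphen."""
    return BROKEN_WORD.sub(r"\1", text)


print(join_hyphenated("μισό-μισό, γκρεμοτσακι-ζόταν"))
```

The look-behind is one character wide, which matters in Python because its look-behinds must have a fixed width.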
03-04-2014, 04:32 AM | #306 |
Groupie
Posts: 171
Karma: 86271
Join Date: Feb 2012
Device: iPad, Kindle Touch, Sony PRS-T1
|
because of how common the @ sign has become, i'd suggest using a more obscure character like ¬ if you're going to do the search replace in steps, though.
|
03-04-2014, 06:06 AM | #307 | |
Grand Sorcerer
Posts: 5,583
Karma: 22735033
Join Date: Dec 2010
Device: Kindle PW2
|
Quote:
However, to be on the safe side, you may want to include the precomposed characters from the "Greek Extended" block (U+1F00 to U+1FFF): [\x{0370}-\x{03FF}\x{1F00}-\x{1FFF}] |
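To see what the extra range buys you, here is a small Python sketch. The sample word is my own choice, not from the book in question: it starts with the precomposed polytonic letter U+1F04, which lies in the "Greek Extended" block and is therefore missed by the basic range alone.

```python
import re

BASIC = r"[\u0370-\u03FF]"                      # "Greek and Coptic" block only
WITH_EXTENDED = r"[\u0370-\u03FF\u1F00-\u1FFF]"  # plus "Greek Extended"

# Hypothetical sample: the first letter is U+1F04, a precomposed polytonic alpha.
word = "\u1F04νθρωπος"

basic_hits = re.findall(rf"{BASIC}+", word)
extended_hits = re.findall(rf"{WITH_EXTENDED}+", word)

print(basic_hits)     # the run found by the basic range starts after the first letter
print(extended_hits)  # the extended class covers the whole word
```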
|
03-04-2014, 07:10 AM | #308 | |
Groupie
Posts: 171
Karma: 86271
Join Date: Feb 2012
Device: iPad, Kindle Touch, Sony PRS-T1
|
Quote:
|
|
03-04-2014, 10:15 AM | #309 |
Connoisseur
Posts: 81
Karma: 10
Join Date: Nov 2013
Device: Kobo Aura HD
|
Thanks Doitsu, thanks mzmm!
It works better now! (the edit #2) |
03-04-2014, 01:31 PM | #310 |
Connoisseur
Posts: 81
Karma: 10
Join Date: Nov 2013
Device: Kobo Aura HD
|
I tried to take this a step further, but I failed :P
To exclude some results (for example the γερο- in γερο-Κομπ) I tried:
Find: (?<![\x{0370}-\x{03FF}\x{1F00}-\x{1FFF}])([^(γερο)\-][\x{0370}-\x{03FF}\x{1F00}-\x{1FFF}]+)-(?!\1)
Replace: \1
but it also excludes words like γκρεμοτσακι-ζόταν and γρα-τζουνιές.
Any thoughts? Thanks again |
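A likely reason for the extra exclusions, sketched in Python with the same assumptions as elsewhere in this thread (explicit ranges, `\uXXXX` escapes, made-up variable names): square brackets always build a character class, so `[^(γερο)\-]` does not mean "not the word γερο". It means "one character that is none of `(`, `γ`, `ε`, `ρ`, `ο`, `)`, `-`", which rejects every word that begins with any of those letters.

```python
import re

GREEK = r"[\u0370-\u03FF\u1F00-\u1FFF]"

# The bracket expression from the attempt, on its own: a negated set of
# single characters, not a negated word.
FIRST_CHAR = re.compile(r"[^(γερο)\-]")

# The full attempt, transcribed for Python's re module.
ATTEMPT = re.compile(rf"(?<!{GREEK})([^(γερο)\-]{GREEK}+)-(?!\1)")

print(FIRST_CHAR.match("γ"))            # None: γ is in the excluded set
print(ATTEMPT.search("γρα-τζουνιές"))   # None: the word starts with γ
print(ATTEMPT.sub(r"\1", "τρα-πέζι"))   # joined: τ is not in the excluded set
```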
03-04-2014, 02:48 PM | #311 |
Groupie
Posts: 171
Karma: 86271
Join Date: Feb 2012
Device: iPad, Kindle Touch, Sony PRS-T1
|
mm, i think you're going to run into problems pretty quickly if you start trying to get regex to understand words (as it sounds like you've already found out). i'd really recommend against trying to write a single regex to rule them all.
that said, i'd probably come at it the other way, so you'd include Κομπ in the look-ahead, rather than γερο in the capturing group:
...<same as before>...(?!Κομπ|\1)
the pipe | separating them means 'or'. i have no idea if it makes syntactical sense to do this in the greek language, but it matches the examples you've provided. |
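The look-ahead alternation suggested above could be sketched in Python like this. The list `KEEP_BEFORE` and the helper `build_pattern` are hypothetical names for the example, and the explicit ranges again stand in for the Greek character class.

```python
import re

GREEK = r"[\u0370-\u03FF\u1F00-\u1FFF]"

# Hypothetical list: second halves whose hyphen must be kept.
KEEP_BEFORE = ["Κομπ"]


def build_pattern(keep_before):
    """Hypothetical helper: join the exceptions and \\1 with | inside the look-ahead."""
    alternatives = "|".join([re.escape(w) for w in keep_before] + [r"\1"])
    return re.compile(rf"(?<!{GREEK})({GREEK}+)-(?!{alternatives})")


pattern = build_pattern(KEEP_BEFORE)

for sample in ["γερο-Κομπ", "μισό-μισό", "γρα-τζουνιές"]:
    print(sample, "->", pattern.sub(r"\1", sample))
```

Every new exception is one more alternative in the look-ahead, so the list can grow without touching the rest of the expression, though it will get unwieldy quickly, as warned above.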
03-08-2014, 07:07 AM | #312 |
Junior Member
Posts: 4
Karma: 10
Join Date: Mar 2014
Device: Android
|
Find and replace text but leaving some text behind
I hope this question hasn't been asked before, but here goes:
I want to make EPUB3 files with notes (epub:type="noteref" and so on). I do know how to make the files, but it isn't automated in Sigil, sadly! But when I make the files and create the links, Sigil makes this:
Code:
<a href="#id1">This text will have a link to a note</a>
<a id="id1">This will be the note </a>
Code:
<a href="#id***">
Code:
<a epub:type="noteref" href="#id***" xmlns:epub="http://www.idpf.org/2007/ops">
Is that possible? |
03-08-2014, 08:40 AM | #313 |
Groupie
Posts: 171
Karma: 86271
Join Date: Feb 2012
Device: iPad, Kindle Touch, Sony PRS-T1
|
you should be able to use
Code:
find: <a href="#id(\d+)"
replace: <a epub:type="noteref" href="#id\1" xmlns:epub="http://www.idpf.org/2007/ops"
|
03-08-2014, 10:28 AM | #314 |
Junior Member
Posts: 4
Karma: 10
Join Date: Mar 2014
Device: Android
|
|
03-08-2014, 10:44 AM | #315 |
Groupie
Posts: 171
Karma: 86271
Join Date: Feb 2012
Device: iPad, Kindle Touch, Sony PRS-T1
|
yep:
find:
<a href="#id
followed by a capturing group `()` that contains one or more digits `\d+`
(\d+)
followed by
"
replace:
<a epub:type="noteref" href="#id
followed by a back-reference to the captured group above
\1
followed by
" xmlns:epub="http://www.idpf.org/2007/ops"
this site is an invaluable reference for anything regex, basic to advanced: http://www.regular-expressions.info
|
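The same find and replace can be tried in a few lines of Python. This is only a sketch for testing the expression; the function name `add_noteref` is made up, and the sample markup is the one from the question.

```python
import re

# Find: the opening of a link to "#id" plus one or more digits (captured).
NOTEREF_FIND = re.compile(r'<a href="#id(\d+)"')
# Replace: the same link with epub:type and the namespace added; \1 puts the digits back.
NOTEREF_REPLACE = (
    r'<a epub:type="noteref" href="#id\1" '
    r'xmlns:epub="http://www.idpf.org/2007/ops"'
)


def add_noteref(xhtml: str) -> str:
    """Hypothetical helper: rewrite every matching link opening in the text."""
    return NOTEREF_FIND.sub(NOTEREF_REPLACE, xhtml)


sample = '<a href="#id1">This text will have a link to a note</a> <a id="id1">This will be the note </a>'
print(add_noteref(sample))
```

Only the link is rewritten; the `<a id="id1">` target does not match because it has no `href="#id…"` part.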
|