Old 04-16-2014, 04:21 PM   #331
Skeeve
Quote:
Originally Posted by gipsy View Post
Thanks for the replies!

I tried the

Find: (<i>)(.*?) (τον)(.*?)(</i>)
Replace: \1\2 του\4\5

and it works in many cases.
Please also try mine.

Find: τον(?=(?:[^<]+|<[^i]|<i[^>])*?<\/i>)
Replace: του
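For anyone who wants to try this outside Sigil, here is a minimal Python sketch of the same find/replace. The sample text and the name `ITALIC_TON` are made up for illustration, and Python's `re` is only a stand-in for Sigil's PCRE, so treat the result as indicative rather than definitive.

```python
import re

# Skeeve's pattern: match "τον" only if a closing </i> can be reached
# without first passing over a bare <i> opening tag.
ITALIC_TON = re.compile(r"τον(?=(?:[^<]+|<[^i]|<i[^>])*?<\/i>)")

# Hypothetical sample: one "τον" before the italics, one inside, one after.
sample = "<p>τον <i>λόγο τον οποίο</i> τον</p>"

result = ITALIC_TON.sub("του", sample)
print(result)
```

Only the occurrence inside the italics is changed. Note that the pattern has no word boundaries, so it would also hit "τον" inside a longer word.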
Old 04-16-2014, 10:38 PM   #332
Alex2110
How can I create a regular expression that will find all occurrences of these three characters: $, # and }?
For example, if the target string is: abc$456#fgh}890
there should be 3 matches, at positions 4, 8 and 12.

I tried the expression [$#}] and it finds only one match (the one for the $ character).
Old 04-16-2014, 11:13 PM   #333
theducks
Quote:
Originally Posted by Alex2110 View Post
How can I create a regular expression that will find all occurrences of these three characters: $, # and }?
For example, if the target string is: abc$456#fgh}890
there should be 3 matches, at positions 4, 8 and 12.

I tried the expression [$#}] and it finds only one match (the one for the $ character).
Remember to escape
([a-z]{3})\$(\d+)\#([a-z]{3})\}(\d+)

4 captures shown
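A quick way to see the four captures is to run the expression against the example string. This is a small Python sketch rather than Sigil itself, and the variable names are just for illustration.

```python
import re

# theducks' expression, with $, # and } escaped outside a character class.
pattern = re.compile(r"([a-z]{3})\$(\d+)\#([a-z]{3})\}(\d+)")

m = pattern.search("abc$456#fgh}890")
groups = m.groups()  # the four captured pieces, in order
print(groups)
```

The three special characters themselves are matched but not captured; only the letters and digits around them end up in the groups.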
Old 04-17-2014, 12:33 PM   #334
Alex2110
Thanks for the reply.
I think your answer will work for the target string I gave ("abc$456#fgh}890"), but I'm looking for a general regular expression that works on any string of letters, digits and other characters: I want to find every occurrence of these three characters in any string.
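In C++:
#include <iostream>
#include <regex>
#include <string>
using namespace std;
int main()
{
    string s("@@@$%^4$88#");
    smatch m;
    bool found = regex_search(s, m, regex("[$#}]"));
    if (found)
    {
        cout << m.str() << " at position " << m.position() + 1 << '\n';
    }
}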
Old 04-17-2014, 02:23 PM   #335
DiapDealer
Quote:
Originally Posted by Alex2110 View Post
I tried the expression [$#}] and it finds only one match (the one for the $ character).
You're doing something wrong then. That expression finds all three in the string when I try it. One at a time, of course. PCRE is never going to match all three at once for you--if that's what you're trying to do.

Find the first occurrence--replace it with something. Find the second occurrence--replace it with something. Logic and/or variable replacement values (other than captures) are pretty much off the table.

Also remember that when using PCRE within Sigil, there really are no individual "strings" per se. There are files with text in them. The whole file is the string. You may be able to narrow things down to one line of text, but that's about it.
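To illustrate the "one at a time" point: a single search call reports one match, and collecting all of them means iterating. The sketch below uses Python's `re.finditer` as a stand-in; the variable names are illustrative, and positions are converted to 1-based to match the numbers in the question.

```python
import re

target = "abc$456#fgh}890"

# Inside a character class, $, # and } need no escaping.
# finditer yields one match object per occurrence, left to right.
positions = [m.start() + 1 for m in re.finditer(r"[$#}]", target)]
print(positions)

# The same expression on the string from the C++ snippet.
cpp_positions = [m.start() + 1 for m in re.finditer(r"[$#}]", "@@@$%^4$88#")]
print(cpp_positions)
```

In C++ the picture is similar: `std::regex_search` reports only the first match per call, while `std::sregex_iterator` walks over every match in the string.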

Last edited by DiapDealer; 04-17-2014 at 02:34 PM.
Old 04-17-2014, 03:33 PM   #336
mrmikel
Kovid has talked of providing a scripting language to do this sort of thing in his calibre editor, but it will likely never happen in Sigil.
Old 04-17-2014, 05:01 PM   #337
Skeeve
Quote:
Originally Posted by gipsy View Post
Hi,
is there any way to find the word "τον" inside italics and replace it with "του"?
Another attempt

Search: τον(?=(?:[^<]*(?!<i>)<)*/i>)
Replace: του
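One caveat worth checking before a Replace All: the lookahead only refuses to cross a bare <i>, so an opening tag with attributes is not treated as a barrier. The Python sketch below uses made-up sample strings and names, with `re` standing in for Sigil's PCRE, to show the difference.

```python
import re

# The second attempt: every "<" on the way to "/i>" must not start "<i>".
SECOND_TRY = re.compile(r"τον(?=(?:[^<]*(?!<i>)<)*/i>)")

plain = "τον <i>τον</i>"                # bare <i>: outer "τον" is protected
with_attr = 'τον <i class="x">τον</i>'  # <i class=...>: it is not

plain_result = SECOND_TRY.sub("του", plain)
attr_result = SECOND_TRY.sub("του", with_attr)
print(plain_result)
print(attr_result)
```

If the book uses plain <i> tags throughout, this does not matter; otherwise stepping through the matches is the safer route.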

Last edited by Skeeve; 04-17-2014 at 05:04 PM.
Old 04-18-2014, 04:30 AM   #338
roger64
Hi

about no-break spaces and superscripts

1. - I have trouble finding the exclusion character on my keyboard: where is ^ ?

2. - I wish to insert a narrow no-break space (\u202F) instead of a normal space (\s) after a superscript, like here:
</sup>\s
but not after a link: </a></sup>\s
and not when the space is followed by "et" or "de": </sup>\s(et\s|de\s)
for example, not here: </sup> et la...

Is this correct? I tried it, it finds nothing...
search=[^</a>]</sup>\s[^(et\s|de\s)]
replace=</sup>\u202F

Last edited by roger64; 04-18-2014 at 04:41 AM.
Old 04-18-2014, 06:41 AM   #339
Skeeve
Quote:
Originally Posted by roger64 View Post
Is this correct? I tried it, it finds nothing...
search=[^</a>]</sup>\s[^(et\s|de\s)]
replace=</sup>\u202F
No, it's wrong. [^</a>] means "match one(!) character if it is not "<" and not "/" and not "a" and not ">"". What you meant is the negative lookbehind:
Code:
(?<!</a>)
Without testing (I can't run Sigil on my computer), I'd assume that your regexp should be:
Code:
search=(?<!</a>)</sup>\s(?!(et|de)\s)
replace=</sup>\u202F
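Since this was posted untested, here is a small Python sketch that exercises the same expression on a made-up sample. Python's `re` is only a stand-in for Sigil's PCRE, and `NNBSP`, `sample` and `fixed` are names invented for the example.

```python
import re

NNBSP = "\u202f"  # U+202F NARROW NO-BREAK SPACE

# Negative lookbehind: </sup> must not be preceded by </a>.
# Negative lookahead: the space must not be followed by "et " or "de ".
pattern = re.compile(r"(?<!</a>)</sup>\s(?!(et|de)\s)")

sample = (
    "1<sup>er</sup> janvier, "
    "<sup><a href='#n1'>1</a></sup> note, "
    "2<sup>e</sup> et la suite"
)

fixed = pattern.sub("</sup>" + NNBSP, sample)
print(repr(fixed))
```

Only the space after the first superscript is replaced; the one after the link and the one before "et" are left alone.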
Old 04-18-2014, 10:29 AM   #340
DiapDealer
Looks about right to me, skeeve (without a computer for testing at the moment). You could also put a \K after the </sup> (to 'forget' what matched before). That way it would match only the space he was looking to replace. Not that critical in this example, but sometimes, \K is very handy for simplifying/shortening the replace expression.
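In PCRE, the \K version would presumably look something like (?<!</a>)</sup>\K\s(?!(et|de)\s) with just \u202F as the replacement. Python's standard `re` module has no \K, so the sketch below swaps in a lookbehind to get the same "match only the space" effect. That substitution is an assumption made for illustration, not what Sigil does internally.

```python
import re

NNBSP = "\u202f"  # U+202F NARROW NO-BREAK SPACE

# Lookbehinds keep </sup> out of the match, so the replacement
# no longer has to repeat it.
space_only = re.compile(r"(?<!</a></sup>)(?<=</sup>)\s(?!(?:et|de)\s)")

sample = (
    "1<sup>er</sup> janvier, "
    "<sup><a href='#n1'>1</a></sup> note, "
    "2<sup>e</sup> et la suite"
)

matched = [m.group(0) for m in space_only.finditer(sample)]
fixed = space_only.sub(NNBSP, sample)
print(matched)
print(repr(fixed))
```

The match is now a single space, which is what makes the shorter replace expression possible.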

Last edited by DiapDealer; 04-18-2014 at 11:47 AM.
Old 04-18-2014, 11:27 AM   #341
Skeeve
Quote:
Originally Posted by DiapDealer View Post
\K is very handy for simplifying/shortening the replace expression.
Never knew about that. Thanks for mentioning.
Old 04-18-2014, 03:46 PM   #342
roger64
Thanks for your kind help and the useful information.

I failed but will try again to learn exactly what the negative lookbehind can do for me.

Last edited by roger64; 04-19-2014 at 01:06 AM.
Old 04-24-2014, 05:38 PM   #343
Steadyhands
I'm looking to fix incorrect apostrophes at the beginning of contracted words, but not match properly closed single smart quotes. If there were always a space after the word it would be easy, but if the contraction is at the end of a sentence then ...

FYI: in the file, speech is in double smart quotes. I'm not going to attempt a global S&R, just step through the matches.

Find this
‘em ‘bout ‘im

Not this
‘foo’ ‘bar’ ‘not this’ ‘or this,’ ‘this,’


I've been playing with variations of this
Quote:
‘([^’]*?)([\.\,\;\:\?\!|\s])
Edit: still playing with ideas.
This works better (negative lookahead):
Quote:
(?!‘([a-z]* [a-z]*|[a-z]*|[a-z]* [a-z]*\p{P}|[a-z]*\p{P})’)‘
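The negative-lookahead idea can be tried outside Sigil with a short Python sketch. Python's `re` has no \p{P}, so this is a simplified variant with an explicit punctuation class, not the exact expression above. The pattern name, the sample line and the choice of ’ as the replacement are all assumptions for illustration.

```python
import re

# Match a left quote unless it opens a short quoted phrase: one or two
# lowercase words, optional punctuation, then the closing right quote.
LEADING_APOSTROPHE = re.compile(r"‘(?![a-z]+(?: [a-z]+)?[.,;:?!]?’)")

sample = "‘em ‘bout ‘im ‘foo’ ‘bar’ ‘not this’ ‘or this,’ ‘this,’"

# 0-based offsets of the left quotes that would be changed.
hits = [m.start() for m in LEADING_APOSTROPHE.finditer(sample)]
fixed = LEADING_APOSTROPHE.sub("’", sample)
print(hits)
print(fixed)
```

Only the three contractions are touched. Quoted phrases longer than two words would still be matched, which is one more reason to step through rather than replace globally.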

Last edited by Steadyhands; 04-25-2014 at 02:16 AM.
Old 04-25-2014, 03:03 AM   #344
Jellby
How about just this?

‘[^’]*[\n‘]

Find a left quote followed by any number of non-right-quote characters, up to a new line or another left quote.

It will catch multi-paragraph quotes too, but those should be easy to ignore.

If you are searching this in code view and your paragraphs are coded with simple <p>...</p>:

‘[^’]*(</p>|‘)
Old 04-25-2014, 03:29 AM   #345
Steadyhands
That ends up a bit too greedy.
In the example below it captures everything up to and including the apostrophe at the start of foo:


‘em the cat sat on the mat ‘bout ‘foo’
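The greedy behaviour is easy to reproduce with a short Python sketch of Jellby's first expression on this line (the variable names are illustrative, and Python's `re` is standing in for Sigil's PCRE):

```python
import re

# Left quote, any run of non-right-quote characters, then a newline
# or another left quote.
jellby = re.compile(r"‘[^’]*[\n‘]")

sample = "‘em the cat sat on the mat ‘bout ‘foo’"

m = jellby.search(sample)
matched = m.group(0)
print(matched)
```

Because [^’]* is allowed to run across the intermediate left quotes, the match only stops at the last left quote before the first right quote.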
All times are GMT -4.