Old 07-28-2013, 01:58 PM   #271
crutledge
eBook FANatic
Posts: 18,301
Karma: 16071131
Join Date: Apr 2008
Location: Alabama, USA
Device: HP ipac RX5915 Wife's Kindle
Quote:
Originally Posted by Doitsu View Post
I'm sure that there's a more elegant solution, but the following simple regex should work:

Find: ([[:upper:]]{1,})([[:lower:]]+)
Replace: \1<small>\U\2\E</small>

Since this simple regex will find title case strings everywhere, you can't use it with Replace All, though.

To replace two consecutive title case words use the following regex:

Find: ([[:upper:]]{1,})([[:lower:]]+) ([[:upper:]]{1,})([[:lower:]]+)
Replace: \1<small>\U\2\E</small> \3<small>\U\4\E</small>
Thank you sir. I will give it a try.
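For anyone who wants to test the idea outside Sigil, here is a minimal Python sketch of the same transformation. It is only an approximation: Python's re module has neither POSIX classes like [[:upper:]] nor the \U...\E case operators, so ASCII ranges and a replacement function stand in for them. The name small_caps is made up for this example.

```python
import re

# Stand-in for ([[:upper:]]{1,})([[:lower:]]+); ASCII only, which is an assumption.
TITLE_CASE = re.compile(r"([A-Z]+)([a-z]+)")


def small_caps(text: str) -> str:
    """Keep the capital(s), then wrap the uppercased lowercase tail in <small>."""
    return TITLE_CASE.sub(
        lambda m: f"{m.group(1)}<small>{m.group(2).upper()}</small>", text
    )


print(small_caps("Chapter One"))
# C<small>HAPTER</small> O<small>NE</small>
```

As in Sigil, this hits every title-case word it sees, so it is only safe on a selection you have checked by eye.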
Old 07-29-2013, 11:04 AM   #272
mzmm
Groupie
Posts: 171
Karma: 86271
Join Date: Feb 2012
Device: iPad, Kindle Touch, Sony PRS-T1
Here's another version, using a lookbehind.

Code:
find:
(?<=[A-Z])([a-z]+)

replace:
<small>\U\1\E</small>
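A rough Python equivalent of the lookbehind version, for experimenting outside Sigil. The lookbehind syntax is the same, but Python has no \U...\E, so the uppercasing is done in a replacement function. The function name is hypothetical.

```python
import re

# (?<=[A-Z]) checks that a capital precedes the match without consuming it,
# so only the lowercase run itself is replaced.
LOWER_AFTER_UPPER = re.compile(r"(?<=[A-Z])([a-z]+)")


def small_caps_lookbehind(text: str) -> str:
    """Uppercase each lowercase run that directly follows a capital letter."""
    return LOWER_AFTER_UPPER.sub(
        lambda m: f"<small>{m.group(1).upper()}</small>", text
    )


print(small_caps_lookbehind("The End"))
# T<small>HE</small> E<small>ND</small>
```

Because the capital is never part of the match, the replacement needs only one group.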

Last edited by mzmm; 07-29-2013 at 02:46 PM.
Old 08-04-2013, 02:03 PM   #273
trebor6691
Junior Member
Posts: 8
Karma: 10
Join Date: Aug 2013
Device: Kindle
Search, but only replace a portion of the search

One of the things I spend the most time editing is bad paragraph breaks. For instance, Tom continued his paragraph

on another line.

The easiest way so far is to regex Search:

</span></p>

<p class="calibre9"><span class="calibre6">[a-z]

then manually press Shift+Left Arrow and hit space. I would love to search for a paragraph that starts with a lowercase letter, leave the letter intact, and replace everything before it with a space, so that I can use Replace All.

Any help would be greatly appreciated.
Old 08-04-2013, 02:13 PM   #274
theducks
Well trained by Cats
Posts: 29,799
Karma: 54830978
Join Date: Aug 2009
Location: The Central Coast of California
Device: Kobo Libra2,Kobo Aura2v1, K4NT(Fixed: New Bat.), Galaxy Tab A
Quote:
Originally Posted by trebor6691 View Post
One of the things I spend the most time editing is bad paragraph breaks. For instance, Tom continued his paragraph

on another line.

The easiest way so far is to regex Search:

</span></p>

<p class="calibre9"><span class="calibre6">[a-z]

then manually press Shift+Left Arrow and hit space. I would love to search for a paragraph that starts with a lowercase letter, leave the letter intact, and replace everything before it with a space, so that I can use Replace All.

Any help would be greatly appreciated.
You were almost there.
Find:
Code:
(?sm)</span></p>\s+<p class="calibre9"><span class="calibre6">([a-z])
Replace:
Code:
(a space here)\1
The \1 (backslash 1) puts back the letter captured by the parentheses in the Find expression.
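The same find and replace can be tried in Python, where \1 in the replacement string works the same way. This is a sketch only: the sample string is invented, and the class names are the ones from the post above.

```python
import re

# The parentheses capture the lowercase letter that starts the stray paragraph.
BROKEN_BREAK = re.compile(
    r'</span></p>\s+<p class="calibre9"><span class="calibre6">([a-z])'
)


def join_paragraphs(html: str) -> str:
    """Replace the closing/opening tags with one space and put the letter back."""
    return BROKEN_BREAK.sub(r" \1", html)


sample = (
    '<p class="calibre9"><span class="calibre6">Tom continued his paragraph</span></p>\n\n'
    '<p class="calibre9"><span class="calibre6">on another line.</span></p>'
)
print(join_paragraphs(sample))
# <p class="calibre9"><span class="calibre6">Tom continued his paragraph on another line.</span></p>
```

Paragraphs that start with a capital letter are left alone, because [a-z] fails to match there.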
Old 08-04-2013, 02:18 PM   #275
trebor6691
Junior Member
Posts: 8
Karma: 10
Join Date: Aug 2013
Device: Kindle
That is awesome. Many more uses for the \1 now. Thanks a bunch.
Old 08-04-2013, 03:09 PM   #276
theducks
Well trained by Cats
Posts: 29,799
Karma: 54830978
Join Date: Aug 2009
Location: The Central Coast of California
Device: Kobo Libra2,Kobo Aura2v1, K4NT(Fixed: New Bat.), Galaxy Tab A
Quote:
Originally Posted by trebor6691 View Post
That is awesome. Many more uses for the \1 now. Thanks a bunch.
\1 is just the notation for the first capture group,
just as \9 would be the ninth (I have never needed more than \4 myself).

The search expression is where the magic happens.

Get a regex cheat sheet and keep it handy.
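To see the numbering with more than one group, here is a small Python sketch that reorders a date such as the ones on this page. Groups are numbered by the position of their opening parenthesis, left to right. The date example and the name to_iso are just illustrations.

```python
import re

# Group 1 = month, group 2 = day, group 3 = year.
DATE = re.compile(r"(\d{2})-(\d{2})-(\d{4})")


def to_iso(text: str) -> str:
    """Rewrite MM-DD-YYYY as YYYY-MM-DD by reusing the captures in a new order."""
    return DATE.sub(r"\3-\1-\2", text)


print(to_iso("Posted 08-04-2013"))
# Posted 2013-08-04
```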
Old 08-14-2013, 12:31 PM   #277
Leonatus
Wizard
Posts: 1,023
Karma: 10963125
Join Date: Mar 2013
Location: Guben, Brandenburg, Germany
Device: Kobo Clara 2E, Tolino Shine 3
Did I overlook a regex that forces an uppercase letter after a period, exclamation mark, or question mark plus whitespace, or, where an opening quotation mark follows, without the whitespace?

Thanks in advance!
Old 08-15-2013, 02:21 AM   #278
Leonatus
Wizard
Posts: 1,023
Karma: 10963125
Join Date: Mar 2013
Location: Guben, Brandenburg, Germany
Device: Kobo Clara 2E, Tolino Shine 3
Well, I checked it out by myself and found that [.\!?] [a-z] will find any lowercase after punctuation marks, with or without whitespace.
Old 08-15-2013, 10:57 AM   #279
theducks
Well trained by Cats
Posts: 29,799
Karma: 54830978
Join Date: Aug 2009
Location: The Central Coast of California
Device: Kobo Libra2,Kobo Aura2v1, K4NT(Fixed: New Bat.), Galaxy Tab A
Quote:
Originally Posted by Leonatus View Post
Well, I checked it out by myself and found that [.\!?] [a-z] will find any lowercase after punctuation marks, with or without whitespace.
Code:
[\.\!?] [a-z]
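If you are looking for a literal period, escape it: outside square brackets an unescaped period is a wildcard. Inside square brackets, as here, the escape is optional but harmless.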

To find them with zero or one space (but not an nbsp) between the punctuation mark and the letter:

[\.\!?]\s?[a-z]
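For the "force uppercase" part of the question, here is a hedged Python sketch. It assumes ASCII letters only and uses a replacement function, since Python has no \U in replacement strings. Note one difference from the description above: in Python 3, \s also matches a non-breaking space. The function name is hypothetical.

```python
import re

# Inside [...] the period is literal, so [.!?] behaves like [\.\!?].
# Group 2 is the optional whitespace; group 3 is the letter to capitalise.
AFTER_PUNCT = re.compile(r"([.!?])(\s?)([a-z])")


def capitalize_sentences(text: str) -> str:
    """Uppercase a lowercase letter that follows . ! or ? with 0 or 1 whitespace."""
    return AFTER_PUNCT.sub(
        lambda m: m.group(1) + m.group(2) + m.group(3).upper(), text
    )


print(capitalize_sentences("He left. she stayed!why?"))
# He left. She stayed!Why?
```

This will also capitalise after abbreviations such as "e.g. this" and inside strings like file names, so stepping through the matches is safer than Replace All.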
Old 08-15-2013, 01:17 PM   #280
DiapDealer
Grand Sorcerer
Posts: 27,548
Karma: 193191846
Join Date: Jan 2010
Device: Nexus 7, Kindle Fire HD
Consider replacing [a-z] with \p{Ll}

That way, lower case unicode characters can be matched as well. You never know when a random "é" or "á" will bite you in the butt (and not just in the above regex).

[a-zA-Z] becomes \p{L}
[a-z] becomes \p{Ll}
[A-Z] becomes \p{Lu}
Old 08-16-2013, 04:18 AM   #281
Leonatus
Wizard
Posts: 1,023
Karma: 10963125
Join Date: Mar 2013
Location: Guben, Brandenburg, Germany
Device: Kobo Clara 2E, Tolino Shine 3
Ah, thanks!
Somewhere I had read that inside square brackets most characters don't need to be escaped, "!" being an exception. I have already worked through the text, but I shall try again with the escaped period; maybe there were no matches with a period and I didn't notice.

The bigger problem was the "Replace" field: I had to replace everything manually, because every regex I tried there was inserted literally (no success).

@DiapDealer: {Ll}: I understand neither the meaning of "L" nor that of the pipe. What is their general function?
Old 08-16-2013, 07:41 AM   #282
DiapDealer
Grand Sorcerer
Posts: 27,548
Karma: 193191846
Join Date: Jan 2010
Device: Nexus 7, Kindle Fire HD
Quote:
Originally Posted by Leonatus View Post
@DiapDealer: {Ll}: I understand neither the meaning of "L" nor that of the pipe. What is their general function?
That's actually a lowercase "L" rather than a pipe.

\p{L} matches any letter character in any language
\p{Ll} matches any lowercase letter character in any language
\p{Lu} matches any uppercase letter character in any language

Even books in English use accented characters that will be overlooked by [a-z].

NOTE: the L or the Ll or the Lu have no special regex meaning outside of the \p{} construct. They simply represent unicode properties/categories. \p{} matches a single character belonging to the specified category, and \P{} matches a single character NOT belonging to the specified category.
http://www.regular-expressions.info/unicode.html
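A quick way to see the difference outside Sigil is the Python sketch below. One caveat: Python's built-in re module does not understand \p{...} (the third-party regex module does, as far as I know), so this sketch checks the same Unicode category, Ll, through the standard unicodedata module instead. The helper name is made up.

```python
import re
import unicodedata


def is_lowercase_letter(ch: str) -> bool:
    """True if the character is in Unicode category Ll, i.e. what \\p{Ll} matches."""
    return unicodedata.category(ch) == "Ll"


word = "caf\u00e9"  # "café" with a precomposed e-acute

ascii_hits = re.findall(r"[a-z]", word)
unicode_hits = [ch for ch in word if is_lowercase_letter(ch)]

print(len(ascii_hits))    # 3: [a-z] misses the accented letter
print(len(unicode_hits))  # 4: category Ll includes it
```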

Last edited by DiapDealer; 08-16-2013 at 08:05 AM.
Old 08-16-2013, 08:27 AM   #283
Leonatus
Wizard
Posts: 1,023
Karma: 10963125
Join Date: Mar 2013
Location: Guben, Brandenburg, Germany
Device: Kobo Clara 2E, Tolino Shine 3
Thank you!

I just noticed that lowercase letters after a period were matched, even without escaping the period.

What I still don't understand is why my expression ignored the space between the punctuation mark and the letter, i.e. it matched both with and without whitespace.

Last edited by Leonatus; 08-16-2013 at 08:32 AM.
Old 09-08-2013, 06:19 AM   #284
BobC
Guru
Posts: 691
Karma: 3026110
Join Date: Dec 2008
Location: Lancashire, U.K.
Device: BeBook 1, BeBook Pure, Kobo Glo, (and HD),Energy Sistem EReader Pro +
Regex Arithmetic

I am looking to do some mass renumbering of IDs in a book in order to insert endnote hyperlinks.

What I want to do is transform something like:

"id012345" .... "id012444"
to
"ref_1" .... "ref_100"

Significant here is that the last digit of the transformed number is not the same as that of the original. (It would actually be the footnote number extracted from the main text where it appears as [1] ... [100] )

Has anyone found a way to do this with Sigil's regex?

BobC
Old 09-08-2013, 06:26 AM   #285
Doitsu
Grand Sorcerer
Posts: 5,584
Karma: 22735033
Join Date: Dec 2010
Device: Kindle PW2
You can search for the footnote number with \[(\d+)\] and the footnote id with id(\d+) and then combine both.
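One way to read "combine both" is a single expression that matches the id and the bracketed number together and reuses the captured number in the new id. No arithmetic is needed, because the number is copied rather than calculated. The Python sketch below assumes markup like <a id="id012345">[1]</a>, which is a guess at the book's structure and not something stated in the thread; the function name is hypothetical too.

```python
import re

# Group 1: the rest of the opening tag up to ">"; group 2: the footnote number.
ID_WITH_NOTE = re.compile(r'id="id\d+"([^>]*>)\[(\d+)\]')


def renumber_ids(html: str) -> str:
    """Replace id="id<digits>" with id="ref_<footnote number>"."""
    return ID_WITH_NOTE.sub(r'id="ref_\g<2>"\g<1>[\g<2>]', html)


sample = '<a id="id012345">[1]</a> text <a id="id012444">[100]</a>'
print(renumber_ids(sample))
# <a id="ref_1">[1]</a> text <a id="ref_100">[100]</a>
```

If the id and the footnote number do not sit next to each other in the real file, the pattern between them would need adjusting.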