Old 07-28-2013, 01:58 PM   #271
crutledge
Quote:
Originally Posted by Doitsu View Post
I'm sure that there's a more elegant solution, but the following simple regex should work:

Find: ([[:upper:]]{1,})([[:lower:]]+)
Replace: \1<small>\U\2\E</small>

Since this simple regex will find title case strings everywhere, you can't use it with Replace All, though.

To replace two consecutive title case words use the following regex:

Find: ([[:upper:]]{1,})([[:lower:]]+) ([[:upper:]]{1,})([[:lower:]]+)
Replace: \1<small>\U\2\E</small> \3<small>\U\4\E</small>
Thank you sir. I will give it a try.
Old 07-29-2013, 11:04 AM   #272
mzmm
Here's another version using a lookbehind.

Code:
find:
(?<=[A-Z])([a-z]+)

replace:
<small>\U\1\E</small>
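
For anyone scripting this outside Sigil, here is a rough Python sketch of the same lookbehind idea. Python's re module has no \U...\E case operators in replacement strings, so the uppercasing is done in a replacement function instead. The function name and the sample string are made up for illustration.

```python
import re

# Lookbehind version: a run of lowercase letters directly after a capital.
TITLE_TAIL = re.compile(r"(?<=[A-Z])([a-z]+)")

def fake_small_caps(text):
    # Wrap the lowercase tail in <small> and uppercase it, like \U\1\E.
    return TITLE_TAIL.sub(
        lambda m: "<small>" + m.group(1).upper() + "</small>", text
    )

print(fake_small_caps("Chapter One"))
# C<small>HAPTER</small> O<small>NE</small>
```

As with the Sigil version, this hits every title-case word it sees, so it is only safe on a selection, not on a whole file.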

Last edited by mzmm; 07-29-2013 at 02:46 PM.
Old 08-04-2013, 02:03 PM   #273
trebor6691
Search, but only replace a portion of the search

One of the things I spend the most time editing is bad paragraph breaks. For instance, Tom continued his paragraph

on another line.

The easiest way so far is to regex Search:

</span></p>

<p class="calibre9"><span class="calibre6">[a-z]

Then I manually press Shift+Left Arrow and hit space. I would love to search for a paragraph starting with a lowercase letter, but leave the letter intact and replace everything before it with a space, so that I can replace all at once.

Any help would be greatly appreciated.
Old 08-04-2013, 02:13 PM   #274
theducks
Quote:
Originally Posted by trebor6691 View Post
One of the things I spend the most time editing is bad paragraph breaks. For instance, Tom continued his paragraph

on another line.

The easiest way so far is to regex Search:

</span></p>

<p class="calibre9"><span class="calibre6">[a-z]

Then I manually press Shift+Left Arrow and hit space. I would love to search for a paragraph starting with a lowercase letter, but leave the letter intact and replace everything before it with a space, so that I can replace all at once.

Any help would be greatly appreciated.
You were almost there.
Find:
Code:
(?sm)</span></p>\s+<p class="calibre9"><span class="calibre6">([a-z])
Replace:
Code:
(a space here)\1
The \1 (backslash 1) puts back the letter captured by the parentheses in the Find expression.
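
For what it's worth, the same search and replace can be sketched in Python, where the backreference works the same way. The function name and the sample markup are assumptions for illustration; the class names are the ones from the question.

```python
import re

# Same Find expression; the (?sm) flags are left out because the pattern
# contains no "." and no "^"/"$" for them to affect.
BROKEN_BREAK = re.compile(
    r'</span></p>\s+<p class="calibre9"><span class="calibre6">([a-z])'
)

def join_broken_paragraphs(html):
    # r" \1" is one space followed by the captured lowercase letter.
    return BROKEN_BREAK.sub(r" \1", html)

sample = (
    '<p class="calibre9"><span class="calibre6">Tom continued his paragraph</span></p>\n\n'
    '<p class="calibre9"><span class="calibre6">on another line.</span></p>'
)
print(join_broken_paragraphs(sample))
```

Only the text outside the parentheses is thrown away; whatever the group matched is written back by \1.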
Old 08-04-2013, 02:18 PM   #275
trebor6691
That is awesome. Many more uses for the \1 now. Thanks a bunch.
Old 08-04-2013, 03:09 PM   #276
theducks
Quote:
Originally Posted by trebor6691 View Post
That is awesome. Many more uses for the \1 now. Thanks a bunch.
\1 is just the notation for the first capture group,
just as \9 would be the ninth (I have never gone past \4 myself).

The search expression is where the magic happens.

Get a regex cheat sheet and keep it handy.
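
To illustrate the numbering, here is a small Python sketch with three capture groups. The pattern and the sample string are invented for the example; groups are counted left to right by their opening parenthesis.

```python
import re

# Group 1 = month, group 2 = day, group 3 = year.
DATE = re.compile(r"(\d{2})-(\d{2})-(\d{4})")

# Reorder MM-DD-YYYY into YYYY-MM-DD by referring to the groups by number.
reordered = DATE.sub(r"\3-\1-\2", "Posted 08-04-2013")
print(reordered)
# Posted 2013-08-04
```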
Old 08-14-2013, 12:31 PM   #277
Leonatus
Did I overlook a regex that forces uppercase after a period, exclamation mark or question mark plus whitespace, and, where an opening quotation mark comes first, directly after that mark without whitespace?

Thanks in advance!
Old 08-15-2013, 02:21 AM   #278
Leonatus
Well, I checked it out by myself and found that [.\!?] [a-z] will find any lowercase after punctuation marks, with or without whitespace.
Old 08-15-2013, 10:57 AM   #279
theducks
Quote:
Originally Posted by Leonatus View Post
Well, I checked it out by myself and found that [.\!?] [a-z] will find any lowercase after punctuation marks, with or without whitespace.
Code:
[\.\!?] [a-z]
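If you are looking for a literal period outside square brackets, you need to escape it or it becomes a wildcard. Inside a character class such as [.!?] it is already literal, so the backslashes here are optional but harmless.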

Finds with zero or one space (but not a nbsp):

[\.\!?]\s?[a-z]
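
A rough Python sketch of the "zero or one space" search, extended to force the capital. Python has no \U...\E in replacement strings, so a function does the uppercasing; the function name and sample sentence are made up. In Sigil the equivalent would presumably be Find ([.!?]\s?)([a-z]) with Replace \1\U\2\E, reusing the \U...\E operator shown earlier in the thread, but that is untested here.

```python
import re

# Group 1: sentence-ending mark plus at most one whitespace character.
# Group 2: the lowercase letter that should have been a capital.
AFTER_STOP = re.compile(r"([.!?]\s?)([a-z])")

def capitalize_after_stop(text):
    # Put group 1 back unchanged and uppercase only the letter.
    return AFTER_STOP.sub(lambda m: m.group(1) + m.group(2).upper(), text)

print(capitalize_after_stop("It rained. we stayed in.then we left!"))
# It rained. We stayed in.Then we left!
```

This will also fire on things like "e.g. the" or a URL, so it is better stepped through than run with Replace All.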
Old 08-15-2013, 01:17 PM   #280
DiapDealer
Consider replacing [a-z] with \p{Ll}

That way, lowercase Unicode characters can be matched as well. You never know when a random "é" or "á" will bite you in the butt (and not just in the above regex).

[a-zA-Z] becomes \p{L}
[a-z] becomes \p{Ll}
[A-Z] becomes \p{Lu}
Old 08-16-2013, 04:18 AM   #281
Leonatus
Ah, thanks!
Somewhere I had read that inside square brackets some characters don't need to be escaped, except "!". I'm already through the text, but I shall try again with the escaped period; maybe there were no matches with a period and I didn't notice.

The major problem was in the "Replace" field: I had to replace everything manually, because all of my regex ideas were inserted literally (no success).

@DiapDealer: {Ll}: I understand neither the meaning of "L" nor that of the pipe. What is their general function?
Old 08-16-2013, 07:41 AM   #282
DiapDealer
Quote:
Originally Posted by Leonatus View Post
@DiapDealer: {Ll}: I understand neither the meaning of "L" nor that of the pipe. What is their general function?
That's actually a lowercase "L" rather than a pipe.

\p{L} matches any letter character in any language
\p{Ll} matches any lowercase letter character in any language
\p{Lu} matches any uppercase letter character in any language

Even books in English use accented characters that will be overlooked by [a-z].

NOTE: L, Ll and Lu have no special regex meaning outside of the \p{} construct. They simply represent Unicode properties/categories. \p{} matches a single character belonging to the specified category, and \P{} matches a single character NOT belonging to the specified category.
http://www.regular-expressions.info/unicode.html
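
A small Python sketch of the difference. Python's built-in re module does not support \p{...} (the third-party regex module does), so this example checks the Unicode category directly with unicodedata; the helper name is made up for illustration.

```python
import re
import unicodedata

def is_ll(ch):
    # True when ch is in Unicode category Ll, which is what \p{Ll} matches.
    return unicodedata.category(ch) == "Ll"

text = "Caf\u00e9 \u00e9t\u00e9"  # "Café été"

ascii_only = re.findall(r"[a-z]", text)          # accented letters are skipped
unicode_aware = [ch for ch in text if is_ll(ch)]  # accented letters included

print(ascii_only)
print(unicode_aware)
```

The first list contains only a, f and t; the second also picks up all three "é" characters.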

Last edited by DiapDealer; 08-16-2013 at 08:05 AM.
Old 08-16-2013, 08:27 AM   #283
Leonatus
Thank you!

I have just seen that lowercase letters after a period were matched, even without escaping the period.

What I still do not understand is why my expression didn't care about the space between punctuation mark and letter, i.e. there were matches both with and without whitespace.

Last edited by Leonatus; 08-16-2013 at 08:32 AM.
Old 09-08-2013, 06:19 AM   #284
BobC
Regex Arithmetic

I am looking to do some mass renumbering of IDs in a book in order to insert endnote hyperlinks.

What I want to do is transform something like:

"id012345" .... "id012444"
to
"ref_1" .... "ref_100"

What matters here is that the last digit of the transformed number is not the same as that of the original. (It would actually be the footnote number extracted from the main text, where it appears as [1] ... [100].)

Has anyone found a way to do this with Sigil's regex?

BobC
Old 09-08-2013, 06:26 AM   #285
Doitsu
You can search for the footnote number with \[(\d+)\] and the footnote id with id(\d+) and then combine both.
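
One way to read "combine both", sketched in Python. The markup is an ASSUMPTION (an anchor whose id is followed closely by the bracketed number, e.g. <a id="id012345">[1]</a>); BobC's actual markup is not shown in the thread, so the middle part of the pattern would need adapting. No arithmetic is needed because the new number is captured from the text rather than computed from the old id.

```python
import re

# ASSUMED layout: "idNNNNNN" ... [N], with no other square bracket between.
# Group 1: whatever sits between the id and the footnote number.
# Group 2: the footnote number itself.
NOTE_REF = re.compile(r'"id\d+"([^\[\]]*)\[(\d+)\]')

def renumber(html):
    # Build the new id from the captured footnote number; keep the rest.
    return NOTE_REF.sub(r'"ref_\2"\1[\2]', html)

sample = 'text<a id="id012345">[1]</a> more<a id="id012346">[2]</a>'
print(renumber(sample))
# text<a id="ref_1">[1]</a> more<a id="ref_2">[2]</a>
```

If the markup really looks like this, the same Find and Replace strings should carry over to Sigil more or less unchanged, but try them on a copy first.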
All times are GMT -4. The time now is 12:00 AM.