MobileRead Forums > E-Book Software > Sigil
Old 08-10-2014, 10:04 AM   #391
mzmm
Groupie
Posts: 162
Karma: 86115
Join Date: Feb 2012
Device: iPad, Kindle Touch, Sony PRS-T1
Quote:
Originally Posted by eschwartz View Post
Yes, I forgot some things, like the part where lookbehinds need to look behind. :o
ha, yes, wondered about that

Quote:
Originally Posted by eschwartz View Post
Find:
Code:
(?<![.!?])(?<=[ ])([A-Z])(?=[a-z]+)
Replace. :D Keep in mind that \E -- end of modifier's action -- is not strictly necessary if the entire replacement is being flagged as lowercase:
Code:
\L\1
never really got the \E; thanks for the info

Quote:
Originally Posted by eschwartz View Post
You lost a space.
no, it's in the first capturing group. uppercase space == lowercase space, but obviously less readable :)

Quote:
Originally Posted by eschwartz View Post
Also, you imitated my mistake of offering an uppercasing solution (for letters that are already uppercase :rolleyes:) instead of lowercasing. I blame my dental surgery, what's your excuse? :D (You can blame it on me, I did trick you. :))
ehm, yes, i did, didn't i. i'll also blame your dental surgery, if you don't mind :)
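
For readers who want to try the find/replace pair quoted above outside Sigil, here is a minimal Python sketch. Python's `re` has no `\L` escape, so a callable replacement stands in for `\L\1`. One observation, at least for Python's engine: both lookbehinds in the posted pattern test the same single character (the one directly before the capital), so a capital after ". " is still matched. `SKIP_SENTENCE_START` is a hypothetical variant, not something posted in the thread, that looks two characters back instead.

```python
import re

# Pattern as posted: both lookbehinds inspect the character immediately
# before the capital letter.
AS_POSTED = re.compile(r"(?<![.!?])(?<=[ ])([A-Z])(?=[a-z]+)")

# Hypothetical variant (an assumption, not from the thread): look two
# characters back so a capital following ". ", "! " or "? " is skipped.
SKIP_SENTENCE_START = re.compile(r"(?<![.!?] )(?<=[ ])([A-Z])(?=[a-z]+)")

def lowercase_capitals(pattern, text):
    # A callable replacement plays the role of Sigil's \L\1.
    return pattern.sub(lambda m: m.group(1).lower(), text)

sample = "He ran. Then He stopped."
print(lowercase_capitals(AS_POSTED, sample))            # He ran. then he stopped.
print(lowercase_capitals(SKIP_SENTENCE_START, sample))  # He ran. Then he stopped.
```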
Old 08-10-2014, 10:09 AM   #392
mzmm
Groupie
Quote:
Originally Posted by DiapDealer View Post
... the (?<=) and (?<!) hokum of lookbehinds was (and still is) always difficult for me to remember on the fly. I find it terribly unintuitive.
+1

it's taken me forever to be able to remember these. i don't even have a mnemonic for them because i can't find one that makes any sense.
Old 08-10-2014, 10:21 AM   #393
eschwartz
Irrational Optimist
Posts: 5,957
Karma: 9376906
Join Date: Nov 2012
Location: The Beaten Path, USA, Roundworld, This Side of Infinity
Device: Kindle Touch (Wifi only)
Quote:
Originally Posted by mzmm View Post
+1

it's taken me forever to be able to remember these. i don't even have a mnemonic for them because i can't find one that makes any sense.
Oh, I thought it was pretty easy.

The less-than sign points the not/equals backward.
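
The mnemonic can be checked with a small Python sketch covering all four lookaround forms. The sample string and variable names are made up for illustration.

```python
import re

text = "ab cb"

# (?=...)   positive lookahead:  what follows must match
# (?!...)   negative lookahead:  what follows must not match
# (?<=...)  positive lookbehind: the "<" points backward
# (?<!...)  negative lookbehind: same, with "!" for "not"
followed_by_b = re.findall(r"\w(?=b)", text)
not_followed_by_b = re.findall(r"\w(?!b)", text)
after_a = re.findall(r"(?<=a)\w", text)
not_after_a = re.findall(r"(?<!a)\w", text)

print(followed_by_b)      # ['a', 'c']
print(not_followed_by_b)  # ['b', 'b']
print(after_a)            # ['b']
print(not_after_a)        # ['a', 'c', 'b']
```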
Old 08-10-2014, 10:25 AM   #394
eschwartz
Irrational Optimist
Quote:
Originally Posted by Leonatus View Post
Because it's the weekend, I don't have the text in question available here to test it, but some additional clarification might be necessary.

Does your proposal not match any uppercase letter in the respective context?

The point is that nouns in German have always been capitalized (at the beginning of the word, of course), still are today, and should remain so. In the older spelling, however, most other words standing for objects or persons, such as pronouns, were also capitalized, and under the current spelling rules they have to be written lowercase. So, in English it would be like this:

The black Panther was meant to attack Him immediately, but He jumped quickly aside beyond the Wall.

Thus, the "Panther" and the "Wall" should remain uppercase, but "He" and "Him" should turn lowercase.

I hope my problem has become clearer.
I agree with mzmm -- matching only selected uppercased letters will quickly get hairy. I only tried to avoid the ones that are immediately obvious as beginning a sentence. So my solution should match all those words (and replace them).

Given a definite exclusion list, you can definitely do it -- but it won't be very readable.
Old 08-11-2014, 02:51 AM   #395
Leonatus
Addict
Posts: 216
Karma: 157352
Join Date: Mar 2013
Location: Berlin, Germany
Device: Kobo Touch
Quote:
Originally Posted by mzmm View Post
there's going to be some issues with a regex that only catches pronouns, for a few reasons i think; one is that the formal Sie/Ihnen should remain uppercase, whereas sie (she) or ihnen (them) should be converted to lower case.

also, if one is referring to God, i'm uncertain as to whether that would constitute an uppercase Du, or lowercase du, so you may have to be aware of the context there.
Yes, with those exceptions you were dead-on, and I wouldn't even have had the courage to ask for a solution to them. Many thanks for your efforts, and, yes, today I'm going to try it. It would help me a great deal if it worked, since the built-in spellcheck of Word (or OO) doesn't consider those problems at all.

And I suppose it's sort of an ironic comment on the German grammar to write everything lowercase in English, isn't it?

Ah, and your example shows a guillemet pointing to the left; that's indeed the French way to use them at the beginning of direct speech (like a wrap). Here, it's more common to use (and I do) a right-pointing guillemet to open direct speech. So I have to replace it, I suppose?

Last edited by Leonatus; 08-11-2014 at 03:14 AM.
Old 08-11-2014, 03:51 AM   #396
Leonatus
Addict
Hm, the replace entry "\1\L\2" (without quotes) shows: "Invalid option".

I used the Search and Replace function in Word, checking the Wildcard-option. Is there something wrong?

Also, if I enter only the "search" expression, it shows "invalid sample comparison" (I translated the German message and don't know if it's correct in English). It seems that regex support in Word is very limited.

Last edited by Leonatus; 08-11-2014 at 03:57 AM.
Old 08-11-2014, 06:11 AM   #397
DiapDealer
Grand Sorcerer
Posts: 9,061
Karma: 40855212
Join Date: Jan 2010
Device: Nexus 7, Kindle Fire HD
The \L \U \E replacement variables aren't going to work in Word. It was probably assumed (given the subforum this is taking place in) that you were using Sigil's regex S&R feature.
Old 08-11-2014, 09:04 AM   #398
eschwartz
Irrational Optimist
Quote:
Originally Posted by DiapDealer View Post
The \L \U \E replacement variables aren't going to work in Word. It was probably assumed (given the subforum this is taking place in) that you were using Sigil's regex S&R feature.
Precisely.

If you're going to be using regex you need to be using something with a proper regex engine, not word processors.
Old 08-11-2014, 09:09 AM   #399
mzmm
Groupie
Quote:
Originally Posted by Leonatus View Post
... And I suppose it's sort of an ironic comment on the German grammar to write everything lowercase in English, isn't it?:rofl:
ha, i suppose so. wasn't this a trend at some point in Germany in the 90s though? to forgo uppercase nouns? maybe you could bring back the trend... :)

Quote:
Originally Posted by Leonatus View Post
Ah, and your example shows a guillemet pointing to the left; that's indeed the French way to use them at the beginning of direct speech (like a wrap). Here, it's more common to use (and I do) a right-pointing guillemet to open direct speech. So I have to replace it, I suppose?
ah ok, yes, just swap it out for »

Quote:
Originally Posted by DiapDealer View Post
It was probably assumed (given the subforum this is taking place in) that you were using Sigil's regex S&R feature.
yep.
Old 08-11-2014, 10:22 AM   #400
Leonatus
Addict
@mzmm,
@DiapDealer,
@eschwartz:

Yes, yes, yes, you are completely right, and I even expected the result to be like this. I had two reasons for trying it anyway:
1. When I receive the texts I prepare in a format other than HTML or EPUB, I like to format them in Word or OpenOffice, simply because spellchecking there is, for my taste, generally more comfortable than in Sigil or Calibre. Having formatted them that way, I convert them to EPUBs using Toxaris's or Luke's tools and do the fine-tuning with Sigil or Calibre.

2. (Ahem) At the place where I was this morning, I had no chance to use any program other than Word for more refined regex testing (I suppose you can guess why). So I just ... tried it.

And once more, many, many thanks for your advice! Now that I'm at home with Sigil available, your trick works like a charm! I think this will save me hours and hours in the future, since I'm trying to specialize in German literature of the 19th century.
Old 08-11-2014, 11:04 AM   #401
Leonatus
Addict
Just to report:
Code:
(?<![.!?])(\s»?)(Du|Dich|Dir|Er|Ihn|Ihm|Ihr|Es|Wir|Uns|Euch|Jeder|Jede|Alle|Alles)\b
matches uppercase pronouns also after sentences that end with closing guillemets, such as:
blah.« Er ging ...

So complete
Code:
(?<![.!?«])(\s»?)(Du|Dich|Dir|Er|Ihn|Ihm|Ihr|Es|Wir|Uns|Euch|Jeder|Jede|Alle|Alles)\b
I suppose ...

no, doesn't work, because there is a whitespace following the closing guillemet.
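
For comparison, here is a minimal Python sketch of the second pattern above, with a callable replacement standing in for the `\1\L\2` replacement mentioned earlier in the thread (Python's `re` has no `\L`). In Python's engine, at least, the negative lookbehind is tested at the position just before the whitespace matched by `\s`, so with « in the class the `blah.« Er` case is left alone; the behavior reported above for Sigil may therefore have another cause. The function name and sample sentences are made up for illustration.

```python
import re

PRONOUNS = "Du|Dich|Dir|Er|Ihn|Ihm|Ihr|Es|Wir|Uns|Euch|Jeder|Jede|Alle|Alles"

# Second pattern from the post, unchanged. The lookbehind is evaluated at
# the position just before the whitespace matched by \s.
PATTERN = re.compile(r"(?<![.!?«])(\s»?)(" + PRONOUNS + r")\b")

def lowercase_pronouns(text):
    # Keep group 1 (whitespace plus optional ») and lowercase group 2.
    return PATTERN.sub(lambda m: m.group(1) + m.group(2).lower(), text)

print(lowercase_pronouns("Da sah Er den Panther."))  # Da sah er den Panther.
print(lowercase_pronouns("blah.« Er ging"))          # blah.« Er ging
print(lowercase_pronouns("Ende. Er ging"))           # Ende. Er ging
```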

Last edited by Leonatus; 08-11-2014 at 11:14 AM. Reason: correction
Old 08-30-2014, 01:48 PM   #402
ReaderRabbit
Member
Posts: 22
Karma: 10
Join Date: Mar 2011
Location: Colorado
Device: Cruz Tablet
I am reading a book that for some reason has separated (for example) Mr. Smith into two different paragraphs. I want to find these instances and correct them.

Example:

... Mr.</p>

<p class="indent">Smith

The same happens with Mrs., Ms., Miss, and Dr.

What regex would find all examples?

Sorry if this has already been answered.
Old 08-30-2014, 01:57 PM   #403
Steadyhands
Enthusiast
Posts: 30
Karma: 10
Join Date: Dec 2011
Location: Brisbane, Oz
Device: iPad2
Quote:
Find
(Mr\.|Mrs\.|Miss|Ms\.|Dr\.|St\.)</p>\s+<p class="xxxx\d+">

Replace
\1
Note that you will have to change the xxxx to whatever your paragraph style is, and that there is a space after the \1
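
A minimal Python sketch of the same find/replace, assuming the paragraph class is `indent` as in the question. One thing worth checking: `\d+` in the posted pattern requires at least one digit after the class name, so a class like `indent` with no trailing digits needs `\d*` (or no digit part at all). The function and variable names here are made up for illustration.

```python
import re

# "indent" stands in for the xxxx placeholder. \d* (rather than \d+) also
# accepts class names without trailing digits, such as class="indent".
SPLIT_TITLE = re.compile(
    r'(Mr\.|Mrs\.|Miss|Ms\.|Dr\.|St\.)</p>\s+<p class="indent\d*">'
)

def rejoin_titles(html):
    # The replacement is group 1 followed by a single space.
    return SPLIT_TITLE.sub(r"\1 ", html)

broken = '<p class="indent">... Mr.</p>\n\n<p class="indent">Smith</p>'
print(rejoin_titles(broken))  # <p class="indent">... Mr. Smith</p>
```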
Old 08-30-2014, 02:15 PM   #404
ReaderRabbit
Member
Quote:
Originally Posted by Steadyhands View Post
Note that you will have to change the xxxx to whatever your paragraph style is, and that there is a space after the \1
Thank you. I tried this command. I even put in a test Mr. at the end of a paragraph and it did not find anything. I am using Sigil v. 0.7.4
Old 08-30-2014, 02:20 PM   #405
Steadyhands
Enthusiast
You need to be in Regex mode in the search box for this to work.
All times are GMT -4.