MobileRead Forums > E-Book Software > Sigil
Old 01-11-2011, 06:25 AM   #1
cybmole
help with regex for fixing misspellings please

I have a book in which every word containing "ll" is corrupt. The double l has been mis-scanned as a single l plus a space, e.g. "follow" has become "fol ow".

I tried to construct a regex to use in Find & Replace so that I could choose between Replace and Find Next on every hit.

The find is easy: \wl \w
But I can't figure out the replace syntax that takes the found expression, reuses the characters on either side, and substitutes ll for l + space. It's reusing those matched characters that is baffling me.

I should probably also make use of the fact that ll occurs only after a vowel (I think) in correctly spelled English, so would something like this be better for the find expression?
[AEIOUaeiou]l [a-z]
I would still need to reuse the "found" characters.
Old 01-11-2011, 06:47 AM   #2
theducks
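Put parentheses around the groups you want to keep, then use a backslash plus a digit (\1, \2, ...) in the replacement to reuse what each group found.

Match case:
S: ([AEIOUaeiou]l) ([a-z])
R: \1l\2

But STOP

"Gil will not swill beer"
will turn into a pile: the pattern also matches a legitimate l + space at the end of a word, so "Gil will" becomes "Gillwill".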
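
To see both the fix and the trap outside Sigil, here is a minimal sketch using Python's re module. It assumes the search pattern ([AEIOUaeiou]l) ([a-z]) and the replacement \1l\2 (group 1, an extra l, group 2, no space); the function name join_split_ll is made up for illustration.

```python
import re

# Group 1: a vowel followed by "l". Group 2: the first letter after the space.
# Assumed pattern; the space between the groups is what gets dropped.
PATTERN = re.compile(r"([AEIOUaeiou]l) ([a-z])")


def join_split_ll(text: str) -> str:
    """Replace 'vowel + l + space + letter' with 'vowel + ll + letter'."""
    return PATTERN.sub(r"\1l\2", text)


print(join_split_ll("fol ow"))                   # follow
print(join_split_ll("Gil will not swill beer"))  # Gillwill not swill beer
```

The second line shows why a blind Replace All is risky: any word that legitimately ends in a vowel plus a single l gets glued to the next word, so stepping through hits one at a time is the safer route.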
Old 01-11-2011, 08:02 AM   #3
cybmole
Thanks. I looked up some syntax help and realised I had not used ( ), which are needed for the \1, \2 etc. operations.

I've now fixed most instances by doing Replace All on the common messed-up 3 and 4 letter words one at a time, like fil, wil, fel, tel, til, tal, and then all of the "l y " endings.

It made me think quite hard about how to define valid instances of ll, and also made me appreciate the occasional inconsistency, like the example you gave, and odd words like "belief" that break the double-l-after-a-vowel "rule".
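
For anyone wanting to script the word-by-word approach, a rough sketch in Python follows. It is not what was actually run in Sigil: the FIXES table is a hypothetical mapping built from the fragments listed above, the function name fix_common_words is invented, and it assumes each broken word appears as a standalone lowercase token.

```python
import re

# Hypothetical mapping from the truncated fragments to the intended words.
FIXES = {
    "fil": "fill", "wil": "will", "fel": "fell",
    "tel": "tell", "til": "till", "tal": "tall",
}

# Whole-word match only, so "Gil" or "until" are left alone.
WORD_RE = re.compile(r"\b(" + "|".join(FIXES) + r")\b")

# A word ending in "l", a space, then a lone "y": "real y" -> "really".
LY_RE = re.compile(r"(\w)l y\b")


def fix_common_words(text: str) -> str:
    # Fix the "l y" endings first so that "tal y" becomes "tally"
    # before the word table could turn "tal" into "tall".
    text = LY_RE.sub(r"\1lly", text)
    return WORD_RE.sub(lambda m: FIXES[m.group(1)], text)


print(fix_common_words("I wil tel you it is real y tal"))
```

Capitalised forms such as "Wil" at the start of a sentence are not covered, and a split word like "wil ow" would still be mis-repaired, so the results need the same eyeballing as the regex route.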