MobileRead Forums - View Single Post

Goshzilla · 10-11-2007, 12:22 AM

I've noticed some bizarre hyphenations being used in the program. For instance, words like "y-our" sometimes get hyphenated; the same goes for "s-mall" and "u-sual", and in some instances plural forms come out as "word-s" or "travel-s". When I use the hyphenation Python script on individual words like these, it doesn't return erroneous results, but when I take an entire line and use

a = f.readline()
b = a.split()
i = 0
while i < len(b):
    # b[i] is passed to re.sub as a regex pattern, not as literal text
    a = re.sub(b[i], '^'.join(hyphenate(b[i])), a)
    i += 1

the line hyphenates words incorrectly. I think this has something to do with the way .sub works by finding a "pattern" in the string: once the line has been altered, sub doesn't quite work as well, because I'm now searching a new string.
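A rough sketch of one way the pattern-based replacement can go wrong, and of a word-by-word alternative. The hyphenate_stub function and its tiny lookup table are hypothetical stand-ins for the real hyphenate(), used only so the example runs on its own.

```python
import re

def hyphenate_stub(word):
    # Hypothetical stand-in for the real hyphenate(): a tiny lookup table.
    table = {"even": ["e", "ven"], "seven": ["sev", "en"]}
    return table.get(word, [word])

def mark_line_with_sub(line):
    # Same idea as the loop above: every word is used as a regex pattern,
    # so it also matches inside longer words elsewhere in the line.
    for word in line.split():
        line = re.sub(word, '^'.join(hyphenate_stub(word)), line)
    return line

def mark_line_by_word(line):
    # Alternative: hyphenate each word exactly once and rejoin the pieces,
    # so no pattern matching against the whole line is involved.
    return ' '.join('^'.join(hyphenate_stub(w)) for w in line.split())

print(mark_line_with_sub("even seven"))  # "even" is also found inside "seven"
print(mark_line_by_word("even seven"))
```

Note that the word-by-word version collapses runs of whitespace into single spaces, which may or may not matter for the text being processed.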

Update: It's the dang punctuation that messes it up. If I don't remove the punctuation marks before calling the hyphenation method, the word gets hyphenated incorrectly. I could either alter the simple script I wrote or alter the hyphenate code.
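If the calling script is changed rather than the hyphenate code, one possible approach is to peel the punctuation off each token, hyphenate only the bare word, and then put the punctuation back. This is a minimal sketch under that assumption; hyphenate_stub and hyphenate_token are hypothetical names, and the stub's table stands in for the real hyphenate().

```python
import re

def hyphenate_stub(word):
    # Hypothetical stand-in for the real hyphenate(): a tiny lookup table.
    table = {"usual": ["usu", "al"]}
    return table.get(word, [word])

def hyphenate_token(token, hyphenate=hyphenate_stub):
    # Split the token into leading non-word characters, the core word,
    # and trailing non-word characters, e.g. '"usual."' -> '"', 'usual', '."'.
    lead, core, trail = re.match(r"^(\W*)(.*?)(\W*)$", token).groups()
    # Only the core is hyphenated; the punctuation is reattached unchanged.
    return lead + '^'.join(hyphenate(core)) + trail

print(hyphenate_token("usual,"))
```

Apostrophes inside a word (as in "Washington's") stay in the core with this pattern, so they would still need separate handling.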

I could use a line like a=re.sub("[,;:.!]", '', a) to remove the punctuation marks, but when there is an apostrophe I would like to keep only the part of the word preceding it, so that something like "Washington's" becomes "Washington".
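A small sketch of that cleanup step, assuming it is applied to one word at a time; clean_word is a hypothetical helper name, not part of the hyphenation script.

```python
import re

def clean_word(word):
    # Keep only the part before the first apostrophe:
    # "Washington's" -> "Washington".
    word = word.split("'", 1)[0]
    # Remove the same punctuation marks as the character class in the post.
    return re.sub(r"[,;:.!]", "", word)

print(clean_word("Washington's"))
```

This only handles the straight apostrophe ('), and it also truncates contractions, so "don't" would become "don"; whether that is acceptable depends on the text.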

Post #19 · Last edited by Goshzilla; 10-11-2007 at 12:33 AM.