MobileRead Forums - View Single Post

kovidgoyal · 10-11-2007, 01:28 PM

If you do modify hyphenate, send me the modified code.

Quote:

Originally Posted by Goshzilla

I've noticed some bizarre hyphenations being used in the program. For instance, words like "y-our" sometimes get hyphenated; the same goes for "s-mall" and "u-sual", and in some instances plural forms such as "word-s" or "travel-s". When I use the hyphenation Python script on individual words like these, it doesn't return erroneous results, but when I take an entire line and use

a = f.readline()
b = a.split()
i = 0
while i < len(b):
    # b[i] is used as a regex pattern here, and a is rewritten on every pass
    a = re.sub(b[i], '^'.join(hyphenate(b[i])), a)
    i += 1

the line hyphenates words incorrectly. I think this has something to do with the way .sub works by finding a "pattern" in the string: once a line has been altered, sub doesn't quite work as well because I'm looking at a new string.
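One way to sidestep the "new string" problem is to let re.sub walk the line once and hand each word to a replacement function, so nothing is ever searched for in an already altered line. This is only a rough sketch: hyphenate_line and fake_hyphenate are made-up names, and fake_hyphenate is a crude stand-in for the real hyphenate function, which is assumed to take a word and return a list of pieces.

```python
import re

def hyphenate_line(line, hyphenate, sep="^"):
    """Hyphenate every alphabetic word in `line` in a single pass.

    `hyphenate` is assumed to be a callable that takes one word and
    returns a list of its pieces (as the script in the post does).
    """
    def repl(match):
        # match.group(0) is the word exactly as it appears in the line
        return sep.join(hyphenate(match.group(0)))

    # [A-Za-z]+ only matches runs of letters, so punctuation is left alone
    return re.sub(r"[A-Za-z]+", repl, line)


def fake_hyphenate(word):
    # Stand-in for illustration only: split long words after 3 letters.
    if len(word) > 5:
        return [word[:3], word[3:]]
    return [word]


print(hyphenate_line("Your usual travels, small words.", fake_hyphenate))
```

Because each word is passed as plain text rather than as a pattern, characters such as "." never get interpreted as regex syntax, and a short word cannot accidentally match inside a longer one.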

Update: it's the dang punctuation that messes it up. If I don't remove the punctuation marks before calling the hyphenation method, the word gets hyphenated incorrectly. I could either alter the simple script I wrote or alter the hyphenate code.

I could use a line like a=re.sub("[,;:.!]", '', a) to remove the punctuation marks, but when there is an apostrophe I would like to cut the word off at the apostrophe and keep only the part preceding it, so that something like "Washington's" becomes "Washington".
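A possible sketch of that cleanup, done per word before it is handed to the hyphenation code. clean_word is a made-up helper name, and it only handles the plain ASCII apostrophe; the punctuation set is the one from the re.sub line above.

```python
import re

def clean_word(token):
    """Return `token` cut at the first apostrophe, with punctuation removed."""
    # Keep only the part before the first apostrophe ("Washington's" -> "Washington")
    base = token.partition("'")[0]
    # Strip the same punctuation marks as the re.sub line in the post
    return re.sub(r"[,;:.!]", "", base)


for token in "Washington's travels, usual!".split():
    print(clean_word(token))
```

Note that this also truncates contractions ("don't" becomes "don"), which may or may not be what you want for hyphenation.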