View Single Post
Old 10-11-2007, 12:28 PM   #22
kovidgoyal
creator of calibre
kovidgoyal ought to be getting tired of karma fortunes by now.kovidgoyal ought to be getting tired of karma fortunes by now.kovidgoyal ought to be getting tired of karma fortunes by now.kovidgoyal ought to be getting tired of karma fortunes by now.kovidgoyal ought to be getting tired of karma fortunes by now.kovidgoyal ought to be getting tired of karma fortunes by now.kovidgoyal ought to be getting tired of karma fortunes by now.kovidgoyal ought to be getting tired of karma fortunes by now.kovidgoyal ought to be getting tired of karma fortunes by now.kovidgoyal ought to be getting tired of karma fortunes by now.kovidgoyal ought to be getting tired of karma fortunes by now.
 
kovidgoyal's Avatar
 
Posts: 45,423
Karma: 27757236
Join Date: Oct 2006
Location: Mumbai, India
Device: Various
If you do modify hyphenate send me the modified code.

Quote:
Originally Posted by Goshzilla View Post
I've noticed some bizar hyphenations being used in the program. For instance sometimes words like "y-our" get hyphenated, same goes for "s-mall" and "u-sual" in some instances plural forms "word-s" or "travel-s" when I use the hyphenation python script on individiual words like that it doesn't return erroneous results, but when I take an entire line, use

a=f.readline()
b=a.split()
(while loop that goes through the length of b)
a=re.sub(b[i],'^'.join(hyphenate(b[i])), a)


the line hyphenates words incorrectly. I think this has something to do with the way .sub works by finding a "pattern" in the string, so once a line has been altered, sub doesn't quit work as well because I'm looking at a new string.


update: It's the dang punctuation that messes it up, if I don't remove the punctuation marks before calling the hyphenation method, the word gets hyphenated incorrectly. I could either just alter the simple script I wrote, or alter the hyphenate code.

I could use a line like a=re.sub("[,;:.!]", '', a) to remove the punctuation marks, but when there is an apostrophe I would like to cut out the word preceeding the apostrophe mark, so something like "Washington's" becomes "Washington"
kovidgoyal is offline   Reply With Quote