I never did modify the existing hyphenation algorithm. I only know basic coding, basic objects, I am still learning about writing my own search trees and hash tables, and I still haven't figured all the little nuances in Python just yet.
So here is what I wrote:
import re
from hyphenate import hyphenate_word as hyphenate
def process(y):
if (y=='\n'):
return y
b=re.sub("[,()*;:!?.]", '', y)
b=re.sub('"','',b)
b=re.sub('[\[\]{}<>]','',b)
k=b.split()
i=0
while(i<len(k)):
y=re.sub(k[i],'^'.join(hyphenate(k[i])),y)
i=i+1
return y
import re
from hyphenate import hyphenate_word as hyphenate
from process import process as process
f=open('gltrv10.txt', 'rb')
g=open('r23.txt', 'w')
a=f.readline()
while(a!=''):
g.write(process(a))
a=f.readline()
|