07-25-2007, 06:22 PM   #21
Junior Member
Bokeh began at the beginning.
Posts: 5
Karma: 10
Join Date: Jul 2007
Device: Sony PRS-500
Unicode characters problem

I have been having a problem with many of the txt files I try to run through txt2lrf: somewhere in the file is a Unicode character the program cannot process.

I usually get a message like this:
UnicodeEncodeError: 'ascii' codec can't encode character u'\xc6' in position 25014: ordinal not in range(128)

I guess this is a Python-related error? (I know very little about programming.)

I can sometimes track down the character if I can work out which character the Unicode code refers to: I look it up in Character Map, then find and replace it in Word. But often with an error like this I don't know how to locate the character.

Does anyone have any tips, either on a macro I can run in Word to remove characters txt2lrf cannot process, or on how to use the error message to locate the offending character?
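
As far as I understand it, the "position" in the error message is a zero-based character offset into the text as Python has read it, so it may not match what Word reports (line endings can be counted differently). A minimal Python 3 sketch of how the offending characters could be listed with their line and column follows. The function name find_non_ascii and the sample text are made up for illustration; this is not part of txt2lrf.

```python
import unicodedata

def find_non_ascii(text):
    """Return (offset, line, column, char, name) for every non-ASCII character.

    offset is zero-based, like the position in the error message;
    line and column are one-based, as most editors show them.
    """
    hits = []
    line, col = 1, 1
    for offset, ch in enumerate(text):
        if ord(ch) > 127:
            # Fall back to "UNKNOWN" for characters that have no Unicode name.
            name = unicodedata.name(ch, "UNKNOWN")
            hits.append((offset, line, col, ch, name))
        if ch == "\n":
            line += 1
            col = 1
        else:
            col += 1
    return hits

# Hypothetical sample text; \u00c6 is the character from the error message.
sample = "Plain line\nThe word \u00c6sop has one odd character\n"
for offset, line, col, ch, name in find_non_ascii(sample):
    print("offset %d, line %d, column %d: %r (%s)" % (offset, line, col, ch, name))
```

To run this on a real file, the text would first have to be read with the right encoding, for example open("book.txt", encoding="cp1252").read(), where both the file name and the encoding are guesses that depend on how the file was saved.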

Are there any plans to "pre-screen" text files in txt2lrf to help remove those characters?
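
A pre-screening step of that kind might look roughly like the Python 3 sketch below. It is only an illustration of the idea, not something txt2lrf is known to do: the names REPLACEMENTS and to_ascii are hypothetical, the replacement table is a small hand-picked sample, and the conversion is lossy (accents are dropped and unknown characters become a placeholder).

```python
import unicodedata

# Hypothetical replacement table for characters that have no accent-free form.
REPLACEMENTS = {
    "\u00c6": "AE",   # capital AE ligature, the character in the error message
    "\u00e6": "ae",
    "\u2018": "'",    # curly single quotes
    "\u2019": "'",
    "\u201c": '"',    # curly double quotes
    "\u201d": '"',
    "\u2013": "-",    # en dash
    "\u2014": "--",   # em dash
    "\u2026": "...",  # ellipsis
}

def to_ascii(text, placeholder="?"):
    """Return a plain-ASCII approximation of text."""
    out = []
    for ch in text:
        if ord(ch) < 128:
            out.append(ch)
        elif ch in REPLACEMENTS:
            out.append(REPLACEMENTS[ch])
        else:
            # Decompose accented letters and keep only the ASCII base letter.
            base = unicodedata.normalize("NFKD", ch).encode("ascii", "ignore").decode("ascii")
            # Characters with no ASCII base become the placeholder.
            out.append(base or placeholder)
    return "".join(out)

print(to_ascii("\u00c6sop\u2019s caf\u00e9"))
```

Using a visible placeholder rather than silently deleting unknown characters makes it easier to search the cleaned file afterwards and fix those spots by hand.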

So far I have been finding them manually: I delete half the text, see if the error remains, then keep halving until I find the character, which is super slow.

Any help would be appreciated!