MobileRead Forums - View Single Post - txt2lrf

Bokeh · 07-25-2007, 05:22 PM

I have been having a problem with many txt files I try to run through txt2lrf: somewhere in the file there is a Unicode character the program cannot process.

I usually get a message like this:
UnicodeEncodeError: 'ascii' codec can't encode character u'\xc6' in position 25014: ordinal not in range(128)

which I guess is a Python-related error? (I know very little about programming.)

I can sometimes track down the character if I can work out which character the Unicode code refers to; I then use Character Map to find it and replace it in Word. But often with an error like this I don't know how to locate the character.

Does anyone have any tips for me, either on a macro I can run in Word to remove characters txt2lrf cannot process, or on how to use the error message to locate the offending character?

Are there any plans to "pre-screen" text files in txt2lrf to help remove those characters?
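
One way such a pre-screening step could look, as a rough Python 3 sketch only: it is not part of txt2lrf, and the function name `to_ascii` and the `?` placeholder are made up for illustration. It reduces accented letters to their base letter where Unicode defines a decomposition and replaces everything else that is not plain ASCII.

```python
import unicodedata


def to_ascii(text, placeholder="?"):
    """Return an ASCII-only copy of text (hypothetical helper).

    Characters with a Unicode decomposition are reduced to their ASCII
    parts (for example an accented 'e' becomes 'e'). Any other non-ASCII
    character is replaced by `placeholder`.
    """
    out = []
    for ch in text:
        if ord(ch) < 128:
            out.append(ch)  # already plain ASCII, keep as-is
            continue
        # NFKD splits a character into base letter + combining marks
        decomposed = unicodedata.normalize("NFKD", ch)
        kept = "".join(c for c in decomposed if ord(c) < 128)
        out.append(kept if kept else placeholder)
    return "".join(out)
```

Characters such as the one in the error above (u'\xc6', the ligature "Æ") have no decomposition, so in this sketch they come out as the placeholder and would still need a manual replacement if the exact text matters.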

So far I have been finding them manually: I delete half the text, see if the error remains, and keep halving until I find the character, which is super slow.
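
Instead of halving the file by hand, a short script can list every non-ASCII character together with where it sits. The following is a minimal Python 3 sketch, not anything txt2lrf provides; the names `find_non_ascii` and `report_file` are hypothetical, and the `cp1252` encoding is an assumption (it is the usual encoding for a "plain text" file saved from Word on Windows). For reference, u'\xc6' in the error is the character "Æ", and "position 25014" is the index into the text Python was trying to encode, which should be close to the character's offset in the file but may not match exactly if the program alters the text first.

```python
def find_non_ascii(text):
    """Return a list of (offset, line, column, char) for non-ASCII characters.

    offset is 0-based (like the position in the error message);
    line and column are 1-based, as shown in most text editors.
    """
    hits = []
    line, col = 1, 1
    for offset, ch in enumerate(text):
        if ord(ch) > 127:
            hits.append((offset, line, col, ch))
        if ch == "\n":
            line, col = line + 1, 1
        else:
            col += 1
    return hits


def report_file(path, encoding="cp1252"):
    """Print one line per non-ASCII character found in the file.

    The default encoding is an assumption; change it if the file was
    saved differently (for example "utf-8").
    """
    with open(path, encoding=encoding, errors="replace") as f:
        text = f.read()
    for offset, line, col, ch in find_non_ascii(text):
        print(f"offset {offset}, line {line}, col {col}: {ch!r} (U+{ord(ch):04X})")
```

Calling `report_file("book.txt")` (with a file name of your own) prints the line and column of each character to fix, which can then be looked up directly in Word or a text editor.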

Any help would be appreciated!
