Old 04-05-2008, 10:22 AM   #2
nrapallo
GuteBook/Mobi2IMP Creator
nrapallo ought to be getting tired of karma fortunes by now.
Posts: 2,958
Karma: 2530691
Join Date: Dec 2007
Location: Toronto, Canada
Device: REB1200 EBW1150 T1 NSTG iLiad_v2 NC Asus_TF Next1 WPDN
Quote:
Originally Posted by zelda_pinwheel View Post
...many special characters are not displayed ("Legal unicode value encountered for a glyph we do not support. Replacing with '?' ")...
That's a current limitation you are up against! It recognizes the glyph but doesn't render (support) it. Making images to replace the common unsupported glyphs is a practical solution; just search and replace
Code:
"& # 601"

with

<img src="601image.gif" align="absbottom">

Quote:
Question 1 : is there some way (other than making images...) to get this stupid thing to display the proper glyphs ? (if not, maybe i will make images, but it's not a really good solution especially since i fear there will be dozens or even hundreds of them)
I think this is your only choice. Just make ONE image for each unsupported legal glyph and search and replace as indicated above. Even if "& # 601" occurs a hundred times, you will only need one 601 image!
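
If there really are dozens of different glyphs, the search and replace could be scripted rather than done by hand. Here is a rough Python sketch, not a tested part of any existing tool. It assumes the entities in your HTML are written in the usual decimal form (`&#601;`, without the spaces used above), that every image follows the `601image.gif` naming pattern, and that you fill in the set of unsupported code points yourself. The name `replace_unsupported` and the `UNSUPPORTED` set are made up for illustration.

```python
import re

# Hypothetical set of code points the reader cannot render (fill in your own).
UNSUPPORTED = {601, 654}

# Matches decimal numeric character references such as &#601;
ENTITY_RE = re.compile(r"&#(\d+);")

def replace_unsupported(html, unsupported=UNSUPPORTED):
    """Swap each unsupported numeric entity for an <img> tag showing its glyph image."""
    def repl(match):
        code = int(match.group(1))
        if code in unsupported:
            # Same naming scheme as in the post: 601 -> 601image.gif
            return '<img src="%dimage.gif" align="absbottom">' % code
        return match.group(0)  # leave supported entities untouched
    return ENTITY_RE.sub(repl, html)

sample = "sch&#601;wa and caf&#233;"
print(replace_unsupported(sample))
```

Entities that are not in the set pass through unchanged, so it should be safe to run over a whole file; you still only need one image per glyph.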

Quote:
Question 2 : is this a problem which is known, like the UTF-8 problem ? is there an easy solution to it ??
Usually 'Tidy' (used by 'LIT2SB') handles the conversions quite well, but it failed miserably for me recently on a test case of converting Esperanto. Maybe your HTML is similar.

p.s. I also got terrible results from Tidy when converting from UTF-8 (see post#44 here); however, I think your situation is different.

EDIT: I will get you started with some sample character images in the attached .zip!
Attached Files
File Type: zip char_images.zip (8.1 KB, 498 views)
File Type: txt char_images.txt (2.2 KB, 440 views)

Last edited by nrapallo; 04-05-2008 at 10:53 AM. Reason: added sample images