View Full Version : Problem, cannot use big5 document name


dwf_lou
01-17-2005, 12:01 AM
First, I want to say sorry to everyone,
because my English is not very good.

I think SUNRISE is really great, but I have a problem now.
Every time I convert a web site with SUNRISE,
the file name is the correct Big5 file name,
but the document name becomes [????].
Does anyone know how to solve this problem?
Please help me. I would really appreciate it.
THANKS

FILE NAME
http://photos1.flickr.com/3444031_25a117887a.jpg

DB NAME
http://photos3.flickr.com/3444032_03ee8193b1.jpg

Laurens
01-17-2005, 02:02 AM
This happens because the output encoding is not applied to the PDB name. Will look into this for the next bugfix release. For now, you should use only ASCII characters for the document name.
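
As a rough illustration of how question marks can end up in a name (a sketch only: how Sunrise actually builds the PDB name is not shown in this thread, and `to_name_bytes` is a hypothetical helper), Python's codecs substitute `?` for every character the target encoding cannot represent:

```python
def to_name_bytes(title: str, encoding: str = "ascii") -> bytes:
    """Encode a document title, replacing unencodable characters with '?'."""
    return title.encode(encoding, errors="replace")


title = "\u4e2d\u6587\u6587\u4ef6"  # four Chinese characters

ascii_name = to_name_bytes(title)         # no character fits in ASCII
big5_name = to_name_bytes(title, "big5")  # every character fits in Big5

print(ascii_name)      # one '?' per replaced character
print(len(big5_name))  # byte length of the Big5 form
```

An ASCII-only name passes through this kind of conversion unchanged, which is why it is a safe workaround.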

TadW
01-17-2005, 05:16 AM
dwf_lou, welcome aboard!

In future, please don't use the caps lock key when typing your post. It makes it much harder to read ;)

Tad

Gatton
01-17-2005, 03:53 PM
In future, please don't use the caps lock key when typing your post. It makes it much harder to read ;)

Yes use Chinese instead. That'll show em :D

I keed I keed. Yea I'm in a wacky mood today. Must be because I had the day off ;)

dwf_lou
01-23-2005, 09:12 PM
DEAR LAURENS:
Sorry to bother you again with the same problem.
Last time, you said I can't use a Big5-encoded document name,
and that this problem would be solved in the next bugfix release.
So yesterday, when I saw that Sunrise 0.41 had been released, I was so excited
for a few moments. :D
But when I convert the web site, the problem is still there :(
If I try to use a document name in Big5, I get ???? .

Can you help me?
What's the problem, and how can I solve it? :blink:
Thank you very, very, very much :D

Laurens
01-24-2005, 01:50 AM
I did look into this and came to the conclusion that it cannot be fixed because Big5 is a variable-length encoding. The PDB format allows only for a zero-terminated string of 8-bit characters. You cannot fit a variable-length encoding into this.

In short: this cannot be fixed. You should use characters in the Windows-1252 range only.
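
For reference, the name field in a PDB header is a fixed 32 bytes, including the terminating zero byte. A minimal sketch of filling such a field, assuming the title is limited to Windows-1252 as suggested above (`pack_pdb_name` is a hypothetical helper, not Sunrise code):

```python
PDB_NAME_FIELD = 32  # size of the name field in the PDB header, in bytes


def pack_pdb_name(title: str, encoding: str = "cp1252") -> bytes:
    """Encode a title and fit it into a zero-padded 32-byte name field."""
    raw = title.encode(encoding, errors="replace")
    raw = raw[:PDB_NAME_FIELD - 1]             # keep room for the terminator
    return raw.ljust(PDB_NAME_FIELD, b"\x00")  # pad the rest with zero bytes


field = pack_pdb_name("Caf\u00e9 news")
print(len(field))              # always the full field size
print(field.rstrip(b"\x00"))   # the encoded name without padding
```

With one byte per character, cutting at 31 bytes never splits a character; with a two-byte encoding the same cut could land in the middle of one.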

dwf_lou
01-24-2005, 02:33 AM
OK, I get it.
Now I will try to find another way to avoid this problem.
Thanks very much for your explanation. :D
I wish you a nice day.

lambone
02-04-2005, 01:22 AM
Does it mean that the file name cannot have the same encoding as the content? Say the output file uses an encoding other than Windows-1252: can it not write the correct file name into the PDB file? In Windows Explorer it displays correctly, but when I open the PDB with UltraEdit, it shows as "????".

I can successfully change the file name using FileZ.

markwu
08-09-2005, 11:39 AM
I did look into this and came to the conclusion that it cannot be fixed because Big5 is a variable-length encoding. The PDB format allows only for a zero-terminated string of 8-bit characters. You cannot fit a variable-length encoding into this.

In short: this cannot be fixed. You should use characters in the Windows-1252 range only.

Hi Laurens:

If the DB name does not allow Big5 encoding, how come I can change the DB name in Plucker to Chinese correctly? The problem is weird.

And, as far as I know, Big5 is a fixed-length encoding: every Chinese character is composed of two bytes. So what do you mean by "variable-length" encoding? I am a little bit confused here ... :blink:
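
A quick way to check the byte widths (a small sketch using Python's built-in `big5` codec, which may differ in details from the code page Windows uses): Chinese characters take two bytes each, while ASCII characters in the same string still take one, so mixed text does not have a fixed width per character.

```python
samples = ["A", "\u4e2d", "A\u4e2d\u6587"]  # ASCII, one Chinese character, mixed

for text in samples:
    encoded = text.encode("big5")
    print(len(text), "characters ->", len(encoded), "bytes")

# Byte width of each character in the mixed sample
widths = [len(ch.encode("big5")) for ch in "A\u4e2d\u6587"]
print(widths)
```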

My guess is this: Microsoft Traditional Chinese Windows XP/2000 uses "big5" as the default locale, so when Sunrise runs, it converts the "document name" again ....

For example:

"XXYY" is the document name, and it is encoded in Big5. My channel config is as follows:

Input: UTF-8 (or others)
Output: Big5

So, when Sunrise builds the PDB file, it converts "XXYY" from UTF-8 to Big5 again and sets it as the document name..... that's why the document name is "??"

And actually, the encoding of "XXYY" here is Big5, not UTF-8.

Mark

markwu
08-09-2005, 12:04 PM
Forget about this. My guess was wrong.

I just tried to sync a site with Big5 encoding, so:

The input encoding is: Big5
Output is Big5 too ...

And I still get ???

Mark