Old 06-09-2009, 07:39 AM   #773
nrapallo
GuteBook/Mobi2IMP Creator
Quote:
Originally Posted by tompe
When I open the file I get
Code:
  href="book.html%23toc"
Why do I get %23? Why do I not get a #? Is # encoded in a special way in UTF-8?
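The %23 in href="book.html%23toc" has nothing to do with UTF-8. It is the result of URL encoding (percent-encoding), which replaces special punctuation characters in a URL with a % followed by the character's two-digit hexadecimal code. The # character is 0x23 in ASCII, so it becomes %23.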
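
To see where the %23 comes from, here is a minimal sketch in plain Perl. The helper name percent_encode_char is made up for illustration and is not part of any module; it just prints a character's code in the %XX form that URL encoding uses.

```perl
use strict;
use warnings;

# Hypothetical helper (illustration only): return the percent-encoded
# form of a single ASCII character, i.e. '%' plus its two-digit hex code.
sub percent_encode_char {
    my ($char) = @_;
    return sprintf("%%%02X", ord($char));
}

print percent_encode_char('#'), "\n";   # prints %23
```

The same rule turns a space (0x20) into %20, which is the escape most people have already seen in URLs.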

In Perl, the uri_unescape() function from the URI::Escape module takes a URL-encoded string and converts it back to its normal representation.

If you would rather not deal with this encoding, just post-process any link URL that contains a % with uri_unescape().
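
As a rough sketch of that post-processing step, something like the following might do. It assumes the links appear as double-quoted href="..." attributes, and the helper name unescape_hrefs is made up for this example; only uri_unescape() comes from URI::Escape.

```perl
use strict;
use warnings;
use URI::Escape qw(uri_unescape);

# Hypothetical helper (illustration only): decode percent-escapes inside
# double-quoted href values that contain at least one '%'.
sub unescape_hrefs {
    my ($html) = @_;
    # $1 = 'href="', $2 = the encoded URL, $3 = the closing quote
    $html =~ s{(href=")([^"]*%[^"]*)(")}{$1 . uri_unescape($2) . $3}ge;
    return $html;
}

print unescape_hrefs('href="book.html%23toc"'), "\n";   # prints href="book.html#toc"
```

One caveat: this decodes every escape in the link, so a URL that legitimately contains something like %22 (a double quote) would end up breaking the attribute. If that is a concern, replacing only %23 may be the safer choice.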