MobileRead Forums > E-Book Software > Calibre

Old 09-19-2010, 05:06 PM   #1
lippy
Junior Member
Posts: 4
Karma: 10
Join Date: Jul 2010
Device: Kindle dx
Conversion of HTML to UTF-8

I'm trying to convert a few CHM files that are encoded as ISO-8859-1. Some characters are being converted incorrectly. For example:

instead of '-' I get 'Â'

I've attached a text file with a few examples of the characters.

At first I thought it was something to do with the chm conversion, but I extracted the underlying HTML files and I got similar results.

I've tried explicitly setting the "input character encoding" to iso-8859-1 and it didn't help. I also tried setting it on the HTML to ZIP plugin, to no avail.

Any ideas?
Attached Files
File Type: txt Incorrectly converted chars.txt (56 Bytes, 209 views)
Old 09-19-2010, 05:35 PM   #2
kovidgoyal
creator of calibre
Posts: 43,857
Karma: 22666666
Join Date: Oct 2006
Location: Mumbai, India
Device: Various
Convert the HTML directly and specify the correct character encoding.
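
As an illustration of what "specify the correct character encoding" amounts to, here is a minimal Python sketch that decodes bytes with a stated source encoding and re-encodes them as UTF-8. The function name `reencode_to_utf8` is hypothetical, and the default of ISO-8859-1 is only an assumption taken from the first post. This is not how calibre does it internally. If the stated encoding is wrong, the output will be wrong too.

```python
def reencode_to_utf8(data: bytes, source_encoding: str = "iso-8859-1") -> bytes:
    """Decode bytes with the stated source encoding, then encode as UTF-8.

    The source encoding must be the encoding the file was really saved in;
    this function trusts the caller and does no detection.
    """
    text = data.decode(source_encoding)
    return text.encode("utf-8")


# 0xE9 is 'é' in ISO-8859-1; in UTF-8 the same character is 0xC3 0xA9.
sample = b"caf\xe9"
converted = reencode_to_utf8(sample)
print(converted)
```

Note that any charset declaration inside the HTML itself (for example a meta tag) is left untouched by this sketch and would need to be updated separately so it does not contradict the new encoding.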
Old 09-20-2010, 04:41 PM   #3
lippy
Sorry if I wasn't clear in my original post, but that's what I tried: I extracted the HTML, converted it directly, and specified the correct character encoding. I still get characters like Â littered throughout.

Thanks
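
For readers hitting the same symptom: a stray Â is the usual sign of UTF-8 bytes being read as ISO-8859-1. Characters in the range U+0080 to U+00BF are stored in UTF-8 as the byte 0xC2 followed by a second byte, and 0xC2 read as ISO-8859-1 is Â. The sketch below reproduces the effect in Python. The helper name `misdecode` is hypothetical, and the non-breaking space used as input is only an assumed example, since the attached file's contents are not shown in the thread.

```python
def misdecode(text: str) -> str:
    """Encode text as UTF-8, then (wrongly) decode the bytes as ISO-8859-1."""
    return text.encode("utf-8").decode("iso-8859-1")


# U+00A0 (non-breaking space) becomes the bytes 0xC2 0xA0 in UTF-8.
# Read back as ISO-8859-1, 0xC2 shows up as 'Â' in front of the space.
garbled = misdecode("a\u00a0b")
print(repr(garbled))
```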
Old 09-20-2010, 04:46 PM   #4
kovidgoyal
In that case your HTML isn't UTF-8.
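
One way to check what a file actually contains, rather than what it is labelled as, is to attempt a strict UTF-8 decode of its bytes. The following is a minimal sketch in Python; the function name `is_valid_utf8` is hypothetical. A successful decode does not prove the file was meant to be UTF-8 (pure ASCII passes too), but a failure does show that it is not UTF-8.

```python
def is_valid_utf8(data: bytes) -> bool:
    """Return True if the bytes decode cleanly as UTF-8, False otherwise."""
    try:
        data.decode("utf-8")  # strict error handling is the default
    except UnicodeDecodeError:
        return False
    return True


# 'é' saved as ISO-8859-1 is the single byte 0xE9, which is not valid UTF-8.
print(is_valid_utf8(b"caf\xe9"))
# The same character saved as UTF-8 is 0xC3 0xA9, which decodes cleanly.
print(is_valid_utf8(b"caf\xc3\xa9"))
```

To test a real file, read it in binary mode (for example `open(path, "rb").read()`) and pass the bytes to the function.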
All times are GMT -4.