12-28-2009, 06:46 AM | #1 |
Groupie
Posts: 155
Karma: 112134
Join Date: May 2009
Location: Kuala Lumpur
Device: iPad, K3, K4, T1
|
Converting HTML from Runeberg.org
When I try to convert HTML from Runeberg.org, the Swedish character 'å' comes out as 'l'. Everything else seems to work fine, including the other special characters 'ä' and 'ö'.
I've tried converting to both Mobi and ePUB and get the same error in both, so the problem seems to lie with the source rather than the output format. I've attached the files I'm trying to convert and would greatly appreciate some advice. |
12-28-2009, 07:17 AM | #2 |
Icanhasdonuts?
Posts: 2,837
Karma: 532407
Join Date: Aug 2008
Location: Mölnbo, Sweden
Device: Kobo Aura 2nd edition, Kobo Clara HD
|
Could it be due to the TOC stating "charset=windows-1252"?
Check whether your Calibre is set to use CP1252 as the input encoding. As the calibre documentation puts it:
"1. Knowing the encoding of the source file: calibre tries to guess what character encoding your source files use, but often, this is impossible, so you need to tell it what encoding to use. This can be done in the GUI via the Input character encoding field in the Look & Feel section. The command-line tools all have an --input-encoding option.
2. When adding HTML files to calibre, you may need to tell calibre what encoding the files are in. To do this go to Preferences->Plugins->File Type plugins and customize the HTML2Zip plugin, telling it what encoding your HTML files are in. Now when you add HTML files to calibre they will be correctly processed. HTML files from different sources often have different encodings, so you may have to change this setting repeatedly. A common encoding for many files from the web is cp1252 and I would suggest you try that first.
3. Embedding fonts: If you are generating an LRF file to read on your SONY Reader, you are limited by the fact that the Reader only supports a few non-English characters in the fonts it comes pre-loaded with. You can work around this problem by embedding a Unicode-aware font that supports the character set your file uses into the LRF file. You should embed at least a serif and a sans-serif font. Be aware that embedding fonts significantly slows down page-turn speed on the reader." |
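For anyone wondering why only 'å' breaks: here is a small Python sketch of one plausible cause. It is a guess, not something confirmed in this thread. If the windows-1252 bytes are decoded as a Central European code page such as cp1250, the byte for 'å' maps to 'ĺ' (l with acute), while 'ä' and 'ö' happen to sit at the same positions in both code pages and survive. The helper names `decode_as` and `to_utf8` are made up for illustration and are not part of calibre.

```python
# Bytes for "å ä ö" as they would be stored in a windows-1252 (cp1252) file.
raw = b"\xe5 \xe4 \xf6"


def decode_as(data: bytes, encoding: str) -> str:
    """Decode raw bytes using the named encoding (hypothetical helper)."""
    return data.decode(encoding)


# Decoding with the encoding the page declares gives the expected text.
right = decode_as(raw, "cp1252")

# Decoding with cp1250 (an assumed wrong guess) changes only the first letter.
wrong = decode_as(raw, "cp1250")


def to_utf8(data: bytes, source_encoding: str = "cp1252") -> bytes:
    """Transcode bytes from a known source encoding to UTF-8 (hypothetical helper)."""
    return data.decode(source_encoding).encode("utf-8")


print(right)  # the three Swedish letters
print(wrong)  # first letter becomes l-with-acute, the other two are unchanged
```

Telling calibre the encoding (the HTML2Zip plugin setting or --input-encoding, as quoted above) does the same job as the `cp1252` decode here. Transcoding the files to UTF-8 beforehand, as in `to_utf8`, would be an alternative, though the charset declaration inside the HTML would then need updating too.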
12-28-2009, 07:52 AM | #3 |
Groupie
Posts: 155
Karma: 112134
Join Date: May 2009
Location: Kuala Lumpur
Device: iPad, K3, K4, T1
|
Thank you so very much, Slite. Customizing the HTML2Zip plugin worked a treat.
|
12-28-2009, 08:19 AM | #4 |
Icanhasdonuts?
Posts: 2,837
Karma: 532407
Join Date: Aug 2008
Location: Mölnbo, Sweden
Device: Kobo Aura 2nd edition, Kobo Clara HD
|
Well, glad to be able to help out.
Good to know as I was thinking about starting to convert some stuff from Runeberg myself |
|