#331
Fanatic
Posts: 527
Karma: 470
Join Date: Sep 2007
Location: The Netherlands
Device: Kindle Oasis
I just tested it and it makes no difference at all. Junk in, junk out.
I created a new HTML file with Compozer. It generates the header with the UTF-8 declaration already in place. I pasted in the text from the original file (cut and paste via Notepad++) and it stays the same. I did the same with Word and with 'WebPage' from Trellian.
#332
Wizard
Posts: 1,154
Karma: 3252017
Join Date: Jan 2008
Location: Germany
Device: Pocketbook Touch Lux (623)
:-( I'm quickly running out of ideas.
Could you post the mobi file? I'll be home in less than two hours, and I should then be able to take a closer look at what happens.
#333
Fanatic
Posts: 527
Karma: 470
Join Date: Sep 2007
Location: The Netherlands
Device: Kindle Oasis
Quote:
Quote:
#334
New York Editor
Posts: 6,384
Karma: 16540415
Join Date: Aug 2007
Device: PalmTX, Pocket eDGe, Alcatel Fierce 4, RCA Viking Pro 10, Nexus 7
Quote:
______
Dennis
#335
creator of calibre
Posts: 45,418
Karma: 27757236
Join Date: Oct 2006
Location: Mumbai, India
Device: Various
MOBI files specify their encoding in the header. I'm not sure whether mobi2html uses that information. Try mobi2oeb.
#336
Fanatic
Posts: 527
Karma: 470
Join Date: Sep 2007
Location: The Netherlands
Device: Kindle Oasis
I guess the 'problem' lies in the MOBI file. I tried mobi2oeb and got the following output:
Quote:
#337
Grand Sorcerer
Posts: 7,452
Karma: 7185064
Join Date: Oct 2007
Location: Linköping, Sweden
Device: Kindle Voyage, Nexus 5, Kindle PW
Quote:
I am reluctant to add support for 1252, so I will probably just assume that the input HTML file is UTF-8.
#338
Grand Sorcerer
Posts: 7,452
Karma: 7185064
Join Date: Oct 2007
Location: Linköping, Sweden
Device: Kindle Voyage, Nexus 5, Kindle PW
Quote:
#339
creator of calibre
Posts: 45,418
Karma: 27757236
Join Date: Oct 2006
Location: Mumbai, India
Device: Various
The codepage is encoded in bytes 24-28 of the header. It is 1252 for windows-1252 and 65001 for UTF-8.
See https://libprs500.kovidgoyal.net/bro.../reader.py#L97
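For anyone who wants to check a file by hand, here is a minimal Python sketch of the lookup described above. It assumes "bytes 24-28" means the four bytes at offsets 24 through 27, read as a big-endian unsigned integer, and that `header` starts wherever those offsets are counted from; the offset is a parameter so it can be adjusted if the header is taken to start elsewhere. The function names are made up for illustration and are not taken from any of the tools in this thread.

```python
import struct

# Codepage numbers quoted in the post, mapped to Python codec names.
CODEPAGES = {1252: "cp1252", 65001: "utf-8"}


def codepage_from_header(header: bytes, offset: int = 24) -> int:
    """Read a 4-byte big-endian codepage number starting at `offset`.

    The default offset follows the post; it is an assumption, not a spec.
    """
    (codepage,) = struct.unpack_from(">L", header, offset)
    return codepage


def codec_for(codepage: int) -> str:
    """Translate a codepage number into a codec name; unknown values raise."""
    if codepage not in CODEPAGES:
        raise ValueError(f"unexpected codepage {codepage}")
    return CODEPAGES[codepage]


if __name__ == "__main__":
    # Synthetic header: 24 filler bytes, then the codepage field, then padding.
    fake_header = bytes(24) + struct.pack(">L", 65001) + bytes(4)
    print(codec_for(codepage_from_header(fake_header)))  # utf-8
```

Raising on an unknown value rather than guessing keeps a damaged or unusual header from being silently decoded with the wrong codec.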
#340
Wizard
Posts: 1,154
Karma: 3252017
Join Date: Jan 2008
Location: Germany
Device: Pocketbook Touch Lux (623)
#341
Grand Sorcerer
Posts: 7,452
Karma: 7185064
Join Date: Oct 2007
Location: Linköping, Sweden
Device: Kindle Voyage, Nexus 5, Kindle PW
Quote:
#342
Fanatic
Posts: 527
Karma: 470
Join Date: Sep 2007
Location: The Netherlands
Device: Kindle Oasis
Perhaps you can make it an option. If no parameter is given, use the internal codepage. If somebody is not happy with that, give them the option to force a codepage.
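The fallback rule suggested here could look roughly like the Python sketch below. It only illustrates the decision logic: `resolve_codec` and its parameter names are hypothetical, and the two codepage numbers are the ones mentioned earlier in the thread (1252 and 65001).

```python
from typing import Optional

# Codepage numbers mentioned in the thread, mapped to Python codec names.
CODEPAGES = {1252: "cp1252", 65001: "utf-8"}


def resolve_codec(internal_codepage: int,
                  forced_codepage: Optional[int] = None) -> str:
    """Use the forced codepage if one was given, else the file's own."""
    chosen = internal_codepage if forced_codepage is None else forced_codepage
    if chosen not in CODEPAGES:
        raise ValueError(f"unsupported codepage {chosen}")
    return CODEPAGES[chosen]


if __name__ == "__main__":
    print(resolve_codec(1252))                         # cp1252
    print(resolve_codec(1252, forced_codepage=65001))  # utf-8
```

Keeping the override as an optional argument means the default behaviour stays "trust the file", and forcing a codepage is an explicit choice by the user.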
#343
Grand Sorcerer
Posts: 7,452
Karma: 7185064
Join Date: Oct 2007
Location: Linköping, Sweden
Device: Kindle Voyage, Nexus 5, Kindle PW
Quote:
There is a bug in MobiPerl that affects your problem. The links will not work because the UTF-8 characters are not handled correctly, and they are translated to the wrong HTML entities. If you use --rawhtml to get what is in the MobiPocket file and add the meta tag for UTF-8 to it, then it will probably work better in a browser. Non-breaking space did work, but I got some characters that did not. I have to read up on how to handle UTF-8 in Perl, so I cannot do a fast fix...
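As a rough illustration of the workaround (adding a UTF-8 meta tag to the raw HTML), here is a small Python sketch. It is not part of MobiPerl; `add_utf8_meta` is a hypothetical helper, and it assumes the raw HTML has already been saved and read in as text.

```python
import re

META_UTF8 = '<meta http-equiv="Content-Type" content="text/html; charset=utf-8">'


def add_utf8_meta(html: str) -> str:
    """Insert a UTF-8 meta tag after the opening <head> tag.

    If a charset declaration is already present the text is returned
    unchanged; if there is no <head> tag the meta tag is put at the start.
    """
    if re.search(r"charset\s*=", html, flags=re.IGNORECASE):
        return html
    head = re.search(r"<head(\s[^>]*)?>", html, flags=re.IGNORECASE)
    if head is None:
        return META_UTF8 + html
    return html[:head.end()] + META_UTF8 + html[head.end():]


if __name__ == "__main__":
    raw = "<html><head><title>Book</title></head><body><p>Text</p></body></html>"
    print(add_utf8_meta(raw))
```

The check for an existing `charset=` keeps the helper from adding a second, possibly conflicting declaration when it is run twice on the same file.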
#344
Fanatic
Posts: 527
Karma: 470
Join Date: Sep 2007
Location: The Netherlands
Device: Kindle Oasis
Quote:
Quote:
OK, my mistake: I made a typo when inserting the string. It works.
Last edited by Ortep; 02-26-2008 at 04:24 PM. Reason: Typo
#345
Grand Sorcerer
Posts: 7,452
Karma: 7185064
Join Date: Oct 2007
Location: Linköping, Sweden
Device: Kindle Voyage, Nexus 5, Kindle PW
Quote:
Does anybody know where I can find correctly encoded MobiPocket files that use UTF-8, have a table of contents, and use UTF-8 character sequences like "0xe2 0x80 0x99" (’) or "0xc2 0xa0" (nbsp)? I wonder if mobigen will give me such a file. I will test...
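To make the two byte sequences concrete, the Python sketch below decodes each of them once as UTF-8 and once as windows-1252 and spells the result as HTML entities. It only illustrates how a wrong codepage can lead to wrong entities; it does not reproduce MobiPerl's actual conversion, and the helper name `entities` is made up.

```python
from html.entities import codepoint2name

RIGHT_QUOTE = bytes([0xE2, 0x80, 0x99])  # UTF-8 encoding of U+2019
NBSP = bytes([0xC2, 0xA0])               # UTF-8 encoding of U+00A0


def entities(raw: bytes, codec: str) -> str:
    """Decode `raw` with `codec` and write every character as an HTML entity."""
    parts = []
    for ch in raw.decode(codec):
        name = codepoint2name.get(ord(ch))
        # Fall back to a numeric reference when no named entity exists.
        parts.append(f"&{name};" if name else f"&#{ord(ch)};")
    return "".join(parts)


if __name__ == "__main__":
    print(entities(RIGHT_QUOTE, "utf-8"))   # &rsquo;
    print(entities(RIGHT_QUOTE, "cp1252"))  # &acirc;&euro;&trade;
    print(entities(NBSP, "utf-8"))          # &nbsp;
    print(entities(NBSP, "cp1252"))         # &Acirc;&nbsp;
```

Read as windows-1252, the three-byte quote turns into three unrelated characters, while the two-byte non-breaking space still ends with a real non-breaking space preceded by a stray character.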
Tags
mobi2mobi, mobils