07-05-2011, 06:46 AM | #1 |
Connoisseur
Posts: 73
Karma: 44
Join Date: Sep 2010
Device: kindle 3
|
Problem converting document with diacritics
Hello,
I have several documents with Romanian diacritics (ăîâșțĂÎÂȘȚ) that I want to convert to MOBI format, but Calibre fails at this. The converted documents have characters like ? in place of the diacritics. I found a workaround: save the document as HTML (Web Page, Filtered in MS Word) or OpenDocument Text (ODT), and then convert it with Calibre. Is this a bug in Calibre? I have attached a document for testing. Thanks |
07-05-2011, 06:51 AM | #2 |
Wizard
Posts: 4,552
Karma: 950151
Join Date: Nov 2008
Device: Sony PRS-950, iphone/ipad (Marvin/iBooks/QuickReader)
|
If you have non-ASCII characters in your document then you have to make sure that you set the code page correctly in the conversion settings. The '?' characters in the output represent characters that are unknown in the code page/font combination that has been set for the output document. Having said that, it is always possible that there are limitations in what Calibre can handle in RTF format!
The reason it probably works when you save as HTML is that the HTML document will have the correct code page set. |
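To make the '?' effect concrete, here is a minimal Python sketch (not Calibre's code; the helper name `round_trip` is made up for illustration). It pushes the lowercase Romanian letters through a few code pages and replaces anything the code page cannot represent with '?', which is the same symptom described above.

```python
def round_trip(text, codec):
    """Encode text into the given code page and decode it back.

    Characters that the code page cannot represent are replaced by '?'.
    """
    return text.encode(codec, errors="replace").decode(codec)


sample = "ăîâșț"

print(round_trip(sample, "utf-8"))   # ăîâșț  (everything survives)
print(round_trip(sample, "cp1250"))  # ăîâ??  (no comma-below ș/ț in cp1250)
print(round_trip(sample, "cp1252"))  # ?îâ??  (cp1252 also lacks ă)
print(round_trip(sample, "ascii"))   # ?????
```

Note that even cp1250 (Central European) loses ș and ț when they are the comma-below forms, because that code page only has the older cedilla variants (ş, ţ).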
07-05-2011, 09:26 AM | #3 |
Connoisseur
Posts: 73
Karma: 44
Join Date: Sep 2010
Device: kindle 3
|
Do you know what encoding I have to specify in Calibre for this document?
|
07-05-2011, 01:34 PM | #4 |
Wizard
Posts: 4,004
Karma: 177841
Join Date: Dec 2009
Device: WinMo: IPAQ; Android: HTC HD2, Archos 7o; Java:Gravity T
|
|
07-06-2011, 02:59 AM | #5 |
Connoisseur
Posts: 73
Karma: 44
Join Date: Sep 2010
Device: kindle 3
|
I believe a checkbox in Calibre to force conversion to HTML before creating the MOBI file would be useful. Right?
|
07-06-2011, 06:49 AM | #6 |
Wizard
Posts: 4,552
Karma: 950151
Join Date: Nov 2008
Device: Sony PRS-950, iphone/ipad (Marvin/iBooks/QuickReader)
|
That would achieve nothing!
In fact Calibre DOES transform everything to HTML as part of the input processing of any file prior to creating the output format. In your case it appears to guess the code page wrong for this intermediate HTML stage. However, if YOU do the conversion and the result displays correctly in a browser, then the code page will be declared in the header of the HTML file, so Calibre no longer needs to guess. |
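As a rough illustration of "declared versus guessed", the Python sketch below looks for a charset declaration in the HTML header and only falls back to a guess when there is none. This is an assumption about the general idea, not Calibre's actual detection logic; `declared_charset`, `decode_html` and the cp1252 fallback are hypothetical choices for the example.

```python
import re

# Matches both <meta charset="..."> and the http-equiv form
# content="text/html; charset=..." that word processors typically write.
_META_CHARSET = re.compile(rb'<meta[^>]+charset\s*=\s*["\']?([\w-]+)', re.IGNORECASE)


def declared_charset(html_bytes):
    """Return the charset named in a <meta> tag, or None if there is none."""
    match = _META_CHARSET.search(html_bytes)
    return match.group(1).decode("ascii") if match else None


def decode_html(html_bytes, fallback="cp1252"):
    """Decode using the declared charset; guess `fallback` when none is declared."""
    charset = declared_charset(html_bytes) or fallback
    return html_bytes.decode(charset, errors="replace")


body = "<body>ăîâ</body>"
with_meta = (
    '<meta http-equiv="Content-Type" content="text/html; charset=windows-1250">'
    + body
).encode("cp1250")
without_meta = body.encode("cp1250")

print(decode_html(with_meta))     # diacritics intact, charset was declared
print(decode_html(without_meta))  # wrong guess: ă comes out as ã
```

With the declaration present the bytes are decoded as cp1250 and the text is intact; without it, the fallback guess silently produces the wrong letters.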
07-06-2011, 04:06 PM | #7 |
Connoisseur
Posts: 73
Karma: 44
Join Date: Sep 2010
Device: kindle 3
|
I don't know how Calibre transforms .rtf documents into HTML, but it doesn't do it right for documents with diacritics.
The only way to convert a document with these characters is to transform it to HTML using MS Word's Save As -> Web Page, Filtered (not Web Page or Single Web Page). |
|
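For anyone who wants to check what the RTF file itself declares, here is a small Python sketch. It relies only on two standard RTF control words: `\ansicpgN`, which names the ANSI code page in the header, and `\uN`, which carries a Unicode code point as a decimal number. The function name `rtf_encoding_hints` is made up for this example, and it is a quick inspection aid, not a real RTF parser.

```python
import re


def rtf_encoding_hints(rtf_bytes):
    """Return (declared code page or None, list of characters stored as \\uN)."""
    match = re.search(rb"\\ansicpg(\d+)", rtf_bytes)
    code_page = "cp" + match.group(1).decode("ascii") if match else None

    unicode_chars = []
    for number in re.findall(rb"\\u(-?\d+)", rtf_bytes):
        value = int(number)
        if value < 0:
            # RTF writes \uN as a signed 16-bit value
            value += 65536
        unicode_chars.append(chr(value))
    return code_page, unicode_chars


# Tiny hand-written sample, not the attachment from this thread.
sample = rb"{\rtf1\ansi\ansicpg1250\deff0 \'e3\'ee\'e2 \u537?\u539?}"
print(rtf_encoding_hints(sample))  # ('cp1250', ['ș', 'ț'])
```

If the file reports a code page such as cp1250, that is a reasonable first value to try in the input encoding setting.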