07-05-2011, 06:46 AM   #1
sorin
Connoisseur
sorin began at the beginning.
 
Posts: 73
Karma: 44
Join Date: Sep 2010
Device: kindle 3
Problem converting document with diacritics

Hello,
I have several documents with Romanian diacritics (ăîâșțĂÎÂȘȚ) which I want to convert to MOBI format, but Calibre fails at this. The converted document has characters like ? in place of the diacritics.

I found a workaround: saving the document as HTML (Web Page, Filtered, from MS Word) or as OpenDocument Text (ODT) and then converting that file with Calibre.

Is this a bug in Calibre?

I have attached a document for testing.

Thanks
Attached Files
File Type: rtf Calibre - test diacritics.rtf (40.0 KB, 667 views)
07-05-2011, 06:51 AM   #2
itimpi
Wizard
itimpi ought to be getting tired of karma fortunes by now.
 
Posts: 4,553
Karma: 950151
Join Date: Nov 2008
Device: Sony PRS-950, iphone/ipad (Marvin/iBooks/QuickReader)
If you have non-ASCII characters in your document, then you have to make sure that you set the code page correctly in the conversion settings. The '?' characters in the output represent characters that are unknown in the code page/font combination that has been set for the output document. Having said that, it is always possible that there are limitations in what Calibre can handle in RTF format!

The reason it is probably working when you save as HTML is that the HTML document will have the correct code page set.
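
To illustrate where '?' characters can come from, here is a minimal Python sketch. It is not Calibre's code. It encodes the lowercase diacritics from the first post into a code page and substitutes '?' for anything that code page cannot represent. The helper name `round_trip` and the choice of cp1252 and cp1250 are assumptions made for the illustration.

```python
def round_trip(text: str, codepage: str) -> str:
    """Encode text into codepage, replacing unrepresentable characters with '?'."""
    return text.encode(codepage, errors="replace").decode(codepage)


# a-breve, i-circumflex, a-circumflex, s-comma-below, t-comma-below
sample = "\u0103\u00ee\u00e2\u0219\u021b"

for codepage in ("cp1252", "cp1250", "utf-8"):
    print(codepage, round_trip(sample, codepage))
```

With this sample, cp1252 keeps only î and â, cp1250 also keeps ă but still lacks the comma-below ș and ț, and UTF-8 keeps all five characters.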
07-05-2011, 09:26 AM   #3
sorin
Do you know what encoding I have to specify in Calibre for this document?
07-05-2011, 01:34 PM   #4
Starson17
Wizard
Starson17 can program the VCR without an owner's manual.
 
Posts: 4,004
Karma: 177841
Join Date: Dec 2009
Device: WinMo: IPAQ; Android: HTC HD2, Archos 7o; Java:Gravity T
Quote:
Originally Posted by sorin View Post
Do you know what encoding I have to specify in Calibre for this document?
He probably won't know, but if it's working in the HTML, you can find the encoding there.
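
One way to look it up is to search the start of the saved HTML file for its charset declaration. The Python sketch below does that with a regular expression. The function name `declared_charset` and the sample header are hypothetical, and the charset in a real file may differ from the one shown here.

```python
import re
from typing import Optional

# Matches charset=... in a <meta> tag, with or without quotes around the value.
_CHARSET_RE = re.compile(rb'charset\s*=\s*["\']?\s*([A-Za-z0-9._:-]+)', re.IGNORECASE)


def declared_charset(html_bytes: bytes) -> Optional[str]:
    """Return the charset named in the first 4096 bytes of the HTML, or None."""
    match = _CHARSET_RE.search(html_bytes[:4096])
    return match.group(1).decode("ascii").lower() if match else None


# Hypothetical header, similar in shape to what a word processor might write.
example = (
    b"<html><head><meta http-equiv=Content-Type "
    b'content="text/html; charset=windows-1250"></head>'
)
print(declared_charset(example))
```

For a file on disk, the same function can be called on `open(path, "rb").read()`.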
07-06-2011, 02:59 AM   #5
sorin
I believe a checkbox in Calibre to force conversion to HTML before creating the MOBI file would be useful. Right?
07-06-2011, 06:49 AM   #6
itimpi
That would achieve nothing!

In fact, Calibre DOES transform everything to HTML as part of the input processing of any file, prior to creating the output format. In your case, it appears to guess the code page wrong for this intermediate HTML stage. However, if YOU do it and it displays correctly in a browser, then the code page will be declared in the header of the HTML file, so Calibre no longer needs to guess.
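
The difference between guessing and knowing the code page can be seen in a few lines of Python. This is only an illustration of the general effect, not of Calibre's internals. The sample word and the two code pages (cp1250 as the real one, cp1252 as the wrong guess) are assumptions.

```python
# Bytes as they might sit in a file written with the Central European code page.
raw = "\u0163ar\u0103".encode("cp1250")  # t-cedilla, a, r, a-breve

wrong_guess = raw.decode("cp1252")  # reader guesses Western European
declared = raw.decode("cp1250")     # reader is told the right code page

print(wrong_guess)
print(declared)
```

The same four bytes decode to different letters depending on the code page assumed, which is why a declared charset removes the need to guess.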
07-06-2011, 04:06 PM   #7
sorin
I don't know how Calibre transforms .rtf documents into HTML, but it doesn't do it right for documents with diacritics.
The only way to convert a document with these characters is to transform it to HTML using MS Word's Save As -> Web Page, Filtered (not Web Page or Single Web Page).
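
For anyone who wants to script the second half of this workaround, the following Python sketch builds a call to Calibre's `ebook-convert` command-line tool for the HTML file saved from Word. The file names are hypothetical. The optional `--input-encoding` option is an assumption about what might help when the encoding is known; this thread does not confirm that it fixes the RTF case.

```python
import shutil
import subprocess
from typing import List, Optional


def build_command(src: str, dst: str, input_encoding: Optional[str] = None) -> List[str]:
    """Build the ebook-convert argument list: input file, output file, options."""
    cmd = ["ebook-convert", src, dst]
    if input_encoding:
        cmd += ["--input-encoding", input_encoding]
    return cmd


def convert(src: str, dst: str, input_encoding: Optional[str] = None) -> None:
    """Run the conversion, failing early if Calibre's tool is not on PATH."""
    if shutil.which("ebook-convert") is None:
        raise RuntimeError("ebook-convert not found on PATH")
    subprocess.run(build_command(src, dst, input_encoding), check=True)


# Hypothetical file names; only the command is printed here, nothing is converted.
print(build_command("test-diacritics.htm", "test-diacritics.mobi"))
```

Calling `convert(...)` with the same arguments would run the conversion on a machine where Calibre is installed.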
All times are GMT -4.