05-14-2010, 08:23 AM   #1
mimesys
Help please: converting non-English characters

I am a new user of calibre and I love it.
I am mainly interested in adding Indian ebooks to my library.
The problem is converting non-English characters into EPUB format.
I read about the conversion process on the site and learned that I need to specify
the input character encoding option.
I do not know what my file's encoding is.
I am attaching the file with non-English characters. Can anybody help me convert it to EPUB format?

Thank you,
Mimesys
Attached Files
File Type: pdf vasma orta.pdf (880.7 KB, 330 views)

Last edited by mimesys; 05-14-2010 at 08:31 AM.
05-14-2010, 10:00 AM   #2
Dave_S
This extract from your PDF file shows that it uses "WinAnsiEncoding":

Code:
/Encoding /WinAnsiEncoding 
/FontDescriptor 7 0 R 
>> 
endobj
7 0 obj
<< 
/Type /FontDescriptor 
/FontName /MGUNJAN
I am no EPUB expert, but I think you would need to embed the MGUNJAN font in your EPUB document, since text in this ANSI encoding would render very differently with English-language fonts.
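
For anyone who wants to check their own PDF the same way, here is a rough sketch of how such an extract could be found. It just searches the raw bytes of the file for `/Encoding` and `/FontName` entries. The function name `find_font_info` is made up for this post, and the approach assumes the font dictionaries are stored uncompressed, as they are in the extract above; PDFs that pack their objects into compressed streams would show nothing.

```python
import re


def find_font_info(pdf_bytes: bytes) -> dict:
    """Collect the names that follow /Encoding and /FontName in raw PDF bytes."""
    # A PDF name starts with "/" and ends at whitespace or a delimiter.
    name = rb"/([^\s/<>\[\]()]+)"
    encodings = re.findall(rb"/Encoding\s*" + name, pdf_bytes)
    font_names = re.findall(rb"/FontName\s*" + name, pdf_bytes)
    return {
        "encodings": sorted({e.decode("latin-1") for e in encodings}),
        "font_names": sorted({f.decode("latin-1") for f in font_names}),
    }


# The bytes below reproduce the extract quoted in this post.
sample = (
    b"/Encoding /WinAnsiEncoding \n/FontDescriptor 7 0 R \n>> \nendobj\n"
    b"7 0 obj\n<< \n/Type /FontDescriptor \n/FontName /MGUNJAN\n"
)
info = find_font_info(sample)
print(info)
```

For a real file you would pass in the file contents read in binary mode, e.g. `find_font_info(open("vasma orta.pdf", "rb").read())`.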
05-14-2010, 01:48 PM   #3
troymc
Dave_S is correct.

Your PDF uses ANSI encoding with a non-Unicode font.

The short-term solution is to embed the font in the EPUB. Long-term, you should look at converting the text to Unicode.

I've attached an EPUB version of your PDF with the font embedded. The formatting is not very good, but you can see how the text looks when the font is embedded. You would need to experiment with calibre's conversion options to get better-formatted output, or go back and fix the formatting manually.

I used:
  • calibre to convert the PDF to EPUB (using defaults)
  • FontForge to extract the font from the PDF
  • Sigil to embed the font in the resulting EPUB
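
The embedding step comes down to shipping the font file inside the EPUB and pointing a stylesheet at it with an `@font-face` rule. A minimal sketch of generating such a stylesheet follows. The helper `font_face_css`, the file name `MGUNJAN.ttf`, and the `../Fonts/` path are assumptions for illustration, not what the attached EPUB necessarily contains. The font file also has to be listed in the EPUB's manifest, which an editor such as Sigil normally takes care of when the file is added.

```python
def font_face_css(family: str, font_href: str) -> str:
    """Return an @font-face rule plus a body rule that applies the font."""
    return (
        "@font-face {\n"
        f'  font-family: "{family}";\n'
        f'  src: url("{font_href}");\n'
        "}\n"
        "body {\n"
        f'  font-family: "{family}";\n'
        "}\n"
    )


# Hypothetical family name and path; adjust to match the extracted font file.
css = font_face_css("MGUNJAN", "../Fonts/MGUNJAN.ttf")
print(css)
```

The resulting text can be pasted into the EPUB's stylesheet; the family name only has to match between the two rules.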

EDIT: That PDF appears to be in Gujarati. If you are interested in pursuing Unicode conversion, you could try contacting the people here: http://service.gurjardesh.com/FontConversion.aspx. They have an applet which converts a dozen non-Unicode Gujarati fonts into Unicode text. Unfortunately MGUNJAN is not one of them, but maybe they can be talked into adding it.
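
To give an idea of what a Unicode conversion involves: a converter for a legacy font is essentially a lookup table from that font's character codes to Unicode text. Below is a minimal sketch of the table-lookup part only. It is not how the applet linked above works (I don't know its internals), and the table entries are invented for illustration; they are NOT the real MGUNJAN mapping, which would have to be built by inspecting the font. Real converters usually also need reordering rules (for example for vowel signs that are typed before the consonant in legacy fonts), which this sketch leaves out.

```python
def legacy_to_unicode(text: str, mapping: dict) -> str:
    """Convert legacy-font text using longest-match-first table lookup."""
    # Try longer keys first so multi-character sequences win over single ones.
    keys = sorted((k for k in mapping if k), key=len, reverse=True)
    out = []
    i = 0
    while i < len(text):
        for key in keys:
            if text.startswith(key, i):
                out.append(mapping[key])
                i += len(key)
                break
        else:
            out.append(text[i])  # no table entry: keep the character unchanged
            i += 1
    return "".join(out)


# HYPOTHETICAL table: made-up key assignments, not taken from MGUNJAN.
HYPOTHETICAL_MAPPING = {
    "k": "\u0a95",               # GUJARATI LETTER KA
    "m": "\u0aae",               # GUJARATI LETTER MA
    "kS": "\u0a95\u0acd\u0ab7",  # conjunct: KA + VIRAMA + SSA
}

converted = legacy_to_unicode("kmkS!", HYPOTHETICAL_MAPPING)
print(converted)
```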


Troy
Attached Files
File Type: epub vasma orta final - Owner.epub (266.6 KB, 189 views)

Last edited by troymc; 05-14-2010 at 05:24 PM.