#1 |
Junior Member
Posts: 7
Karma: 10
Join Date: Sep 2010
Device: itouch
|
Converting Sanskrit PDF to epub
I have used Calibre to convert a Sanskrit PDF document to epub format. When I open it in the Stanza app on my itouch, all the Sanskrit characters are rendered as gibberish. I hope I have just done something wrong. If any of you have done this successfully, I would appreciate your guidance. I can read the Sanskrit PDF on my itouch using various reader apps, but scrolling, resizing, etc. are a bit annoying.
|
#2 |
Wizard
Posts: 1,337
Karma: 123455
Join Date: Apr 2009
Location: Malaysia
Device: PRS-650, iPhone
|
Have you checked it in the Calibre reader? My suspicion is that the PDFs you are using have the fonts embedded. Calibre is probably creating the correct output, but without an embedded font. Stanza probably doesn't have access to those characters, so you don't see anything. You would need to use Sigil to embed a font, and even then I'm not certain Stanza is compliant enough to use it.
Then again, you could be having encoding problems. It all depends on what you really mean by 'gibberish'. |
#3 |
Wizard
Posts: 4,004
Karma: 177841
Join Date: Dec 2009
Device: WinMo: IPAQ; Android: HTC HD2, Archos 7o; Java:Gravity T
|
I suspect it's just images of the pages of Sanskrit, not an actual Sanskrit font. If so, this is doubly hard to fix. Images of pages have to be OCR'd, and that's hard enough even for a normal font. I have serious doubts that any OCR program will do Sanskrit. Now if it had been Linear A or Proto-Elamite, it would have been easy.
#4 |
Junior Member
Posts: 7
Karma: 10
Join Date: Sep 2010
Device: itouch
|
Thank you very much. You are right about the font being embedded in the PDF. What I meant by gibberish was that the text comes up as Roman characters with diacritical marks. I will try your suggestions: checking it in the Calibre reader and trying Sigil.
|
#5 |
Junior Member
Posts: 7
Karma: 10
Join Date: Sep 2010
Device: itouch
|
Here is how it appears in the Calibre reader:
´ÉÏSÒaÉÉïxÉmiÉzÉiÉÏ
Sorry, I cannot paste the equivalent in Sanskrit here. |
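Output like this usually means the text is stored as Latin code points that only look like Devanagari when drawn with the original font. A rough way to check extracted text for that is to count which scripts its characters belong to. The sketch below is only an illustration: the function names `script_summary` and `looks_like_legacy_font_text` are made up for this example, and the test assumes the basic Devanagari Unicode block (U+0900 to U+097F) is what proper Sanskrit text would use.

```python
import unicodedata


def script_summary(text):
    """Count non-space characters as Devanagari, Latin, or other."""
    counts = {"devanagari": 0, "latin": 0, "other": 0}
    for ch in text:
        if ch.isspace():
            continue
        if "\u0900" <= ch <= "\u097f":
            # Basic Devanagari Unicode block
            counts["devanagari"] += 1
        elif "LATIN" in unicodedata.name(ch, ""):
            # Plain or accented Roman letters
            counts["latin"] += 1
        else:
            counts["other"] += 1
    return counts


def looks_like_legacy_font_text(text):
    """True if the text has Latin letters but no Devanagari code points."""
    counts = script_summary(text)
    return counts["devanagari"] == 0 and counts["latin"] > 0


# The sample pasted above versus a word typed in Unicode Devanagari
print(looks_like_legacy_font_text("´ÉÏSÒaÉÉïxÉmiÉzÉiÉÏ"))
print(looks_like_legacy_font_text("संस्कृतम्"))
```

If the check comes back true for the book body, the characters only make sense when the matching font is available to the reader app.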
#6 |
Wizard
Posts: 4,004
Karma: 177841
Join Date: Dec 2009
Device: WinMo: IPAQ; Android: HTC HD2, Archos 7o; Java:Gravity T
|
Quote:
If none of that made sense, feel free to post it here, or PM a copy to me, and I'll take a look. |
|
#7 |
Junior Member
Posts: 7
Karma: 10
Join Date: Sep 2010
Device: itouch
|
I would like to attach my PDF file to the post. Can you please let me know how to do it?
|
#8 |
Junior Member
Posts: 7
Karma: 10
Join Date: Sep 2010
Device: itouch
|
Here is the PDF file I am using as the input.
|
#9 |
quantum mechanic
Posts: 705
Karma: 483827
Join Date: Aug 2010
Location: NorCal
Device: Nook1, Samsung Transform, Nook2
|
There are subsets of Baraha fonts (BRH Devanagari Extra) embedded in the PDF. Luckily, these are free fonts. If the embedded subsets had come from proprietary fonts, there wouldn't have been much you could do.
Just search for these fonts and embed them using Sigil and you should be good to go (assuming, as Idolse wrote, that Sigil is compliant enough to use embedded fonts). I think it's worth a shot. If you have trouble finding the fonts separately, just install Baraha (it's a free Devanagari word processor that I've used for Marathi in the past).
Note: just to be extra careful, open the pdf on your PC (or Mac |
#10 |
Wizard
Posts: 1,337
Karma: 123455
Join Date: Apr 2009
Location: Malaysia
Device: PRS-650, iPhone
|
Sigil will definitely let you embed fonts. There are some threads over in the Sigil sub-forum, and plenty of other discussions on MobileRead, on how to do it. Look for "Three Men and a Boat" to see some examples. What I'm less sure of is Stanza's support for embedded fonts. Google searches seem to show there is some level of support, but it's problematic. That said, I would expect Apple iBooks to be better in this respect.
|
#11 |
quantum mechanic
Posts: 705
Karma: 483827
Join Date: Aug 2010
Location: NorCal
Device: Nook1, Samsung Transform, Nook2
|
Oh Lord, my mind's going on vacation.
Sorry 'bout that |
#12 |
Junior Member
Posts: 7
Karma: 10
Join Date: Sep 2010
Device: itouch
|
I appreciate all the responses. Here is what I have done so far.
1. I generated the epub version of my PDF using Calibre.
2. When I read it in the Calibre reader, the generated cover page keeps the fonts and I can read the text. The body of the book is mostly gibberish.
3. In Sigil I can also read the cover page, and the rest is the same as in #2.
4. The iBooks app on my itouch behaves like #2 and #3.
I have not embedded the fonts yet. My question is: why does Calibre generate the cover page with the fonts intact while the body of the book loses them? |
#13 |
Wizard
Posts: 1,337
Karma: 123455
Join Date: Apr 2009
Location: Malaysia
Device: PRS-650, iPhone
|
The cover page is produced by a PDF library that renders the front page as an image, which is why it looks right. That approach isn't really acceptable for the actual book contents.
Embedding the fonts is key. The body text won't display correctly until you do that. |
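For anyone wondering what embedding amounts to inside the epub: roughly, the font file is added to the package, listed in the OPF manifest, and referenced from the stylesheet with an `@font-face` rule. The sketch below just generates those two text fragments. It is an illustration under assumptions, not what Sigil does internally: the helper names, the `Fonts/BRHDevanagariExtra.ttf` path, and the `application/vnd.ms-opentype` media type are all placeholders chosen for the example, and only the family name "BRH Devanagari Extra" comes from this thread.

```python
def font_face_css(family, href):
    """Return an @font-face rule plus a body rule that applies the font."""
    return (
        "@font-face {\n"
        f'  font-family: "{family}";\n'
        f'  src: url("{href}");\n'
        "}\n"
        "body {\n"
        f'  font-family: "{family}", serif;\n'
        "}\n"
    )


def manifest_item(item_id, href, media_type="application/vnd.ms-opentype"):
    """Return an OPF manifest entry for the font file.

    The default media type is an assumption; check what your tool writes.
    """
    return f'<item id="{item_id}" href="{href}" media-type="{media_type}"/>'


# Hypothetical file name and location inside the epub
css = font_face_css("BRH Devanagari Extra", "../Fonts/BRHDevanagariExtra.ttf")
item = manifest_item("brh-font", "Fonts/BRHDevanagariExtra.ttf")
print(css)
print(item)
```

Whether a given reader app honours the `@font-face` rule is a separate question from whether the epub contains it.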
#14 |
Junior Member
Posts: 7
Karma: 10
Join Date: Sep 2010
Device: itouch
|
I was getting a bit frustrated trying to embed the fonts, so I used Atlantis to generate the epub with embedded fonts. The generated ebook rendered very well in both the Calibre reader and Sigil. The Stanza and iBooks apps were unable to handle the embedded font (the Sanskrit characters were all gibberish). I got in touch with the Atlantis support team, and they said they were able to read the generated ebook in 'Adobe Digital Editions' and on a 'Sony reader'. Please let me know if you have any other ideas.
|
#15 |
quantum mechanic
Posts: 705
Karma: 483827
Join Date: Aug 2010
Location: NorCal
Device: Nook1, Samsung Transform, Nook2
|
It looks like an Apple problem at this point (not being able to read embedded fonts in epub, or something like that). Can you check with them (or the Stanza/iBooks devs) whether Stanza and iBooks support embedded fonts? There is nothing you can do if they don't. Also, see if you can find any other epub readers for your itouch. I'm not an Apple user, so I have no idea if this is an OS problem (no fonts other than system fonts allowed) or if it's at the level of the app (i.e. Stanza and iBooks are just deficient in that regard).
Essentially, since the Atlantis team has confirmed that the fonts are embedded properly and the Sony reader can read the file, the issue is no longer creating the epub but being able to read it. |
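If you want to double-check the file itself, an epub is a zip archive, so you can list what is inside and look for font files. Here is a minimal sketch using Python's standard `zipfile` module. The function name `embedded_fonts`, the extension list, and the demo file names are assumptions for illustration, and a font that shows up in the listing only proves it is packaged, not that a given reader app will use it.

```python
import io
import zipfile

# Extensions treated as fonts here; this list is an assumption, extend as needed
FONT_EXTENSIONS = (".ttf", ".otf", ".woff")


def embedded_fonts(epub):
    """Return the names of font-like files in an epub (path or file object)."""
    with zipfile.ZipFile(epub) as zf:
        return [n for n in zf.namelist() if n.lower().endswith(FONT_EXTENSIONS)]


def demo_epub():
    """Build a tiny in-memory zip with hypothetical epub-style entries."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        zf.writestr("OEBPS/content.opf", "<package/>")
        zf.writestr("OEBPS/Fonts/Example.ttf", b"not a real font")
    buf.seek(0)
    return buf


print(embedded_fonts(demo_epub()))
```

For a real book you would pass the path to the epub file instead of the demo archive.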