Old 06-26-2010, 07:03 PM   #1
Pancho Harrera
Junior Member
Posts: 1
Karma: 10
Join Date: Jun 2010
Device: iphone
No double LL's-PDF to EPUB

Every time I convert a PDF to EPUB, the EPUB skips the first L and you only see the last L. For example, "actua ly". What do I do? I have checked the "keep ligatures" box, but it still doesn't work.
Old 06-26-2010, 11:34 PM   #2
AprilHare
Wizard
Posts: 2,981
Karma: 11862367
Join Date: Apr 2008
Device: Sony Reader PRS-T2
What application are you using to generate EPUB files from PDFs?
I assume you're testing with an iPhone - what app?
Details, details..
Old 08-09-2010, 03:43 PM   #3
tomsem
Grand Sorcerer
Posts: 6,478
Karma: 26425959
Join Date: Apr 2009
Location: USA
Device: iPhone 15PM, Kindle Scribe, iPad mini 6, PocketBook InkPad Color 3
I gather you are using calibre for the conversion.

For reasons I don't understand, some double 'l' sequences in some PDFs are converted successfully, and some are not. In each case where I've investigated the PDF source, they are not encoded as ligatures, and nevertheless many 'll' sequences are converted to 'l ' in the HTML. So it is not about ligatures, and I'm not surprised that option has no effect.

I haven't seen a problem with other letter combinations, but it's entirely possible that such problems exist.

I can only think that it's a bug in the (open source?) PDF converter code that calibre uses: if I export a 'problem' PDF to HTML from Acrobat Professional, the HTML has the correct sequence of characters, but it is deficient in other respects.

My attempts to reproduce a problem by creating my own PDFs and converting with calibre have been unsuccessful.

It is weird and frustrating, and I'm not sure there are better one-click PDF conversion options available at any reasonable price. You can always clean up the results, of course, but that often involves more time than I'm willing to invest.
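
As a rough illustration of the kind of cleanup mentioned above, here is a minimal Python sketch that rejoins words split by a dropped 'l' (both the "actua ly" and the "actual y" shape). It is not something calibre provides: the function name `repair_dropped_l` and the tiny vocabulary are hypothetical, and a real cleanup would need a full word list plus a manual review of every change.

```python
import string


def repair_dropped_l(text, vocabulary):
    """Rejoin fragments such as 'actua ly' or 'actual y' into 'actually'.

    vocabulary is a set of lowercase words assumed to be spelled correctly.
    """
    words = text.split(" ")
    repaired = []
    i = 0
    while i < len(words):
        if i + 1 < len(words):
            left, right = words[i], words[i + 1]
            # Separate trailing punctuation so "ly." is looked up as "ly".
            core = right.rstrip(string.punctuation)
            tail = right[len(core):]
            candidate = left + "l" + core
            # Only consider gaps that sit directly next to an 'l'.
            next_to_l = left.endswith("l") or core.startswith("l")
            # Leave the pair alone if both halves are already real words.
            both_real = left.lower() in vocabulary and core.lower() in vocabulary
            if next_to_l and candidate.lower() in vocabulary and not both_real:
                repaired.append(candidate + tail)
                i += 2
                continue
        repaired.append(words[i])
        i += 1
    return " ".join(repaired)


# Hypothetical mini word list, for illustration only.
sample_vocabulary = {"for", "example", "actually", "actual", "a", "small", "lake"}
print(repair_dropped_l("For example actua ly.", sample_vocabulary))
```

The check against the vocabulary is deliberately conservative: a pair is only merged when the joined word is known and at least one fragment is not, so ordinary neighbours like "small lake" are left alone.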
Old 08-10-2010, 08:01 AM   #4
ardeegee
Maratus speciosus butt
Posts: 3,292
Karma: 1162698
Join Date: Sep 2009
Device: PRS-350
I was just about to ask a similar question but I searched for posts with "ligatures" first. I'm having the same problem with a PDF. Text looks fine in the PDF, I export it to a text file and each of the "bad" characters is exported as a period. When Calibre is used to convert it (and when one on-line converter I found is used) the following problems happen (there might be more combinations, but this is what I've found):

If "ll", second one missing. If "ff", both missing. If "tt", second one missing. If "fl", both missing.
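
The pattern above can be written down as a small lookup, which makes it easier to search converted text for damaged words. This is a minimal Python sketch, assuming the letters are simply dropped as described in this post; `simulate_damage` and `build_repair_table` are hypothetical names, not part of calibre or any converter.

```python
import re

# Damage per letter pair as reported in this post:
# "ll" and "tt" lose their second letter, "ff" and "fl" lose both.
OBSERVED_DAMAGE = {"ll": "l", "ff": "", "tt": "t", "fl": ""}
PAIR_PATTERN = re.compile("|".join(OBSERVED_DAMAGE))


def simulate_damage(word):
    """Return the word as it would look after the reported conversion damage."""
    return PAIR_PATTERN.sub(lambda m: OBSERVED_DAMAGE[m.group(0)], word)


def build_repair_table(vocabulary):
    """Map each damaged spelling to the set of correct words that produce it."""
    table = {}
    for word in vocabulary:
        damaged = simulate_damage(word)
        if damaged != word:
            table.setdefault(damaged, set()).add(word)
    return table


# Hypothetical word list, for illustration only.
print(build_repair_table({"actually", "different", "better", "reflect", "cat"}))
```

A damaged spelling that maps to more than one correct word cannot be fixed automatically, which is why the table stores a set rather than a single replacement.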
Old 08-10-2010, 04:14 PM   #5
ardeegee
Maratus speciosus butt
Posts: 3,292
Karma: 1162698
Join Date: Sep 2009
Device: PRS-350
I think I know why my PDF, at least, has conversion problems. It has a font called "Charis SIL" embedded, which is something called a "smart font":
Charis SIL is a TrueType font with “smart font” capabilities added using the Graphite, OpenType®, and AAT font technologies. This means that complex typographic issues such as the placement of diacritics or the formation of ligatures are handled by the font, provided you are running an application that provides an adequate level of support for one of these smart font technologies. With the old font (and its derivatives), diacritic placement was handled using non-standard character encodings that incorporated multiple versions of a diacritic as distinctly-encoded characters.
http://scripts.sil.org/cms/scripts/p...=charissilfont
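
For cases where converted text does contain Unicode ligature code points (for example U+FB02 for "fl") rather than dropped letters, the following is a minimal Python sketch of expanding them with the standard library's `unicodedata.normalize`. Whether any particular converter emits these code points for a PDF like this one is an assumption, and `expand_ligatures` is a hypothetical name.

```python
import unicodedata


def expand_ligatures(text):
    """Expand Latin ligature code points U+FB00..U+FB06 into plain letters.

    Only that block is normalized, so accents and other characters
    are passed through unchanged.
    """
    return "".join(
        unicodedata.normalize("NFKC", ch) if "\ufb00" <= ch <= "\ufb06" else ch
        for ch in text
    )


print(expand_ligatures("o\ufb03ce \ufb02ow"))
```

That block covers ff, fi, fl, ffi, ffl and the two "st" forms. It does not cover "ll" or "tt", so this would not help with those pairs.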
Old 08-13-2010, 06:17 PM   #6
tomsem
Grand Sorcerer
Posts: 6,478
Karma: 26425959
Join Date: Apr 2009
Location: USA
Device: iPhone 15PM, Kindle Scribe, iPad mini 6, PocketBook InkPad Color 3
Quote:
Originally Posted by ardeegee View Post
I think I know why my PDF, at least, has conversion problems. It has a font called "Charis SIL" embedded, which is something called a "smart font":
Charis SIL is a TrueType font with “smart font” capabilities added using the Graphite, OpenType®, and AAT font technologies. This means that complex typographic issues such as the placement of diacritics or the formation of ligatures are handled by the font, provided you are running an application that provides an adequate level of support for one of these smart font technologies. With the old font (and its derivatives), diacritic placement was handled using non-standard character encodings that incorporated multiple versions of a diacritic as distinctly-encoded characters.
http://scripts.sil.org/cms/scripts/p...=charissilfont
That is interesting. I checked my 'problem' PDF to see if it matched the pattern you've identified. Sure enough, my conversion problem involves Adobe Garamond Pro, an OpenType font and therefore (I assume) one with smart font capabilities. The same PDF has a small sample of text with some of the same letter combinations, using a different font (TimesNewRoman Italic, I believe, which is a TrueType font and therefore without smart font capabilities), and this does not show a conversion problem.

But when I tried to create a new PDF using the problematic letter combinations and using these two fonts, the conversion went ok. So I'm still only able to see a problem with this one PDF file, and never with anything I create myself.

I still think it points to a problem with the (open source?) PDF library calibre is using, because it cannot convert this particular PDF file to any other format without corruption, whereas Acrobat, and Adobe's free PDF2TXT and PDF2HTML services, don't have this issue.
Old 08-13-2010, 10:22 PM   #7
Freeshadow
temp. out of service
Posts: 2,792
Karma: 24285242
Join Date: May 2010
Location: Duisburg (DE)
Device: PB 623
that's why PDF is for printing
Quote:
complex typographic issues
Old 08-13-2010, 10:28 PM   #8
JSWolf
Resident Curmudgeon
Posts: 73,970
Karma: 128903378
Join Date: Nov 2006
Location: Roslindale, Massachusetts
Device: Kobo Libra 2, Kobo Aura H2O, PRS-650, PRS-T1, nook STR, PW3
Quote:
Originally Posted by ardeegee View Post
I think I know why my PDF, at least, has conversion problems. It has a font called "Charis SIL" embedded, which is something called a "smart font":
Charis SIL is a TrueType font with “smart font” capabilities added using the Graphite, OpenType®, and AAT font technologies. This means that complex typographic issues such as the placement of diacritics or the formation of ligatures are handled by the font, provided you are running an application that provides an adequate level of support for one of these smart font technologies. With the old font (and its derivatives), diacritic placement was handled using non-standard character encodings that incorporated multiple versions of a diacritic as distinctly-encoded characters.
http://scripts.sil.org/cms/scripts/p...=charissilfont
That font family is used in a lot of commercial ePub files, since it does look better than the default ADE serif font.
All times are GMT -4.