Old 07-27-2010, 08:14 PM   #1
jjansen
J
Posts: 188
Karma: 12590
Join Date: Mar 2009
Location: Canada
Device: SONY PRS 505/300, iRex DR800SG, Nook
double l's

Like everyone else in this forum I have to say that Calibre is GREAT. I have referred every friend I have that owns an ereader and told them that they need to help keep Calibre in business. Thanks for the great product.

I have two quick how-to questions that I haven't figured out yet. I should probably post two threads but they are short questions.

1) I just converted a PDF to epub and nearly all my double l's (lowercase LL) get converted to a single l and a space. I have seen this before and I always just assumed that it was a problem with the PDF. I checked the PDF today and it looked fine. Any thoughts?

2) I have an epub file with some images in it. I would like to remove those images (for space, not decency, reasons!!). I have that option when I convert from PDF to epub. Is there any way to do this when my starting point is epub?

Thanks to everyone who makes this a great forum............Jackie
Old 07-27-2010, 08:47 PM   #2
speakingtohe
Wizard
Posts: 4,812
Karma: 26912940
Join Date: Apr 2010
Device: sony PRS-T1 and T3, Kobo Mini and Aura HD, Tablet
Litagures under look and feel for the double l's?
Old 07-28-2010, 11:31 PM   #3
jjansen
Litagures??? According to Google this word is used when discussing Egyptian mathematics, child porn, rectal wash and/or artificial teeth. In the context of double l's I am stuck.
Old 07-28-2010, 11:50 PM   #4
DoctorOhh
US Navy, Retired
Posts: 9,864
Karma: 13806776
Join Date: Feb 2009
Location: North Carolina
Device: Icarus Illumina XL HD, Nexus 7
Quote:
Originally Posted by jjansen View Post
Litagures??? According to Google this word is used when discussing Egyptian mathematics, child porn, rectal wash and/or artificial teeth. In the context of double l's I am stuck.
Go to Preferences - Conversion - Look & Feel and check Keep Ligatures

You didn't check any dictionaries, did you?
Quote:
lig·a·ture n.
3. A character, letter, or type, such as æ, combining two or more letters.
Old 07-29-2010, 12:02 AM   #5
guyanonymous
Guru
Posts: 692
Karma: 27532
Join Date: Dec 2007
Device: Ebookwise 1150 / 1200
A minor typo...I think he meant "Ligature".

http://en.wikipedia.org/wiki/Typographic_ligature
Old 07-29-2010, 12:56 AM   #6
DoctorOhh
Quote:
Originally Posted by guyanonymous View Post
A minor typo...I think he meant "Ligature".

http://en.wikipedia.org/wiki/Typographic_ligature
I didn't even notice the original typo.
Old 07-31-2010, 09:19 AM   #7
jjansen
Well if nothing else I learned a new word. Ligatures didn't fix the problem. I found out that my document had been written in another country. I tried removing the metadata from the pdf and lost half my letters. I assume that it is some sort of code page problem but I don't know for sure. I converted to RTF and fixed some of the really obvious ones myself and left the rest. Thanks for pointing that feature out to me...........Jackie
Old 01-24-2011, 12:54 PM   #8
cybmole
Wizard
Posts: 3,720
Karma: 1759970
Join Date: Sep 2010
Device: none
Coming here in frustration from my other thread. I cannot find any Google references to double-L being a ligature, only ff, fi and some other combinations.

My bug report was thrown out as a duplicate issue, and others are for sure having problems with ll becoming l+space in PDF conversion.

I have tried converting to txt - the problem still exists,
examined the pdftohtml intermediate output via the calibre S&R wizard (damage already done at that stage),
tried ligatures on/off - no difference,
examined all PDF glyphs in the source (with another program) - no special ll characters,

searched the forum (hence arriving here), though I cannot have l or ll as search terms, which does not help.

This must be a calibre bug, but I cannot find the original bug ticket, which presumably is still open and being worked on by someone?

And yes, I know this is an old thread, but it's still open, and it's old thread week, I believe...

it could be an OCR issue, so if someone could kindly tell me whether calibre is using OCR in PDF conversion that would be progress, of sorts
Old 01-24-2011, 12:57 PM   #9
kovidgoyal
creator of calibre
Posts: 43,826
Karma: 22666666
Join Date: Oct 2006
Location: Mumbai, India
Device: Various
http://calibre-ebook.com/user_manual...-pdf-documents
Old 01-24-2011, 01:21 PM   #10
Starson17
Wizard
Posts: 4,004
Karma: 177841
Join Date: Dec 2009
Device: WinMo: IPAQ; Android: HTC HD2, Archos 7o; Java:Gravity T
Quote:
Originally Posted by cybmole View Post
coming here in frustration ...
it could be an OCR issue, so if someone could kindly tell me whether calibre is using OCR in PDF conversion that would be progress, of sorts
Calibre does not use OCR. Your PDF may have OCR text behind images of words. Calibre converts text, not images of text, so if the text is wrong, Calibre's conversion is wrong. The text may have a single l, while the image of the text has a double l. Or you may have ligatures. The engine Calibre uses can't handle all ligatures. Start by selecting the double-l word in your PDF, copy it, and paste it into Notepad to see whether that word pastes in correctly with a double or single l.
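
As a rough way to script the copy-and-paste check, the sketch below scans a string for the Latin ligature code points that Unicode defines (U+FB00 to U+FB06: ff, fi, fl, ffi, ffl, ſt, st) and expands them to plain letters. There is no ll code point in that block, so a dropped double l would not show up as one of these. The function names are made up for this example and are not part of calibre, and it assumes the text pasted from the PDF is already in a Python string.

```python
import unicodedata

# Latin ligatures in Unicode's Alphabetic Presentation Forms block.
LIGATURES = [chr(cp) for cp in range(0xFB00, 0xFB07)]

def find_ligatures(text):
    """Return (index, character, Unicode name) for each ligature found."""
    return [(i, ch, unicodedata.name(ch))
            for i, ch in enumerate(text) if ch in LIGATURES]

def expand_ligatures(text):
    """Replace each ligature with its plain-letter expansion."""
    table = {ord(ch): unicodedata.normalize("NFKD", ch) for ch in LIGATURES}
    return text.translate(table)

# Hypothetical text pasted from a PDF (contains the fi and ffi ligatures).
pasted = "a \ufb01nal o\ufb03ce stairwell"
for index, char, name in find_ligatures(pasted):
    print(index, name)
print(expand_ligatures(pasted))
```

If the pasted word contains no ligature code points and still shows both l's, the text layer itself is probably fine and the loss is happening later, in the conversion.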
Old 01-24-2011, 01:21 PM   #11
cybmole
Quote:
Originally Posted by kovidgoyal View Post
OK - it says: Some PDFs use special glyphs to represent ll or ff or fi, etc.

(Google, OTOH, has nothing relevant to say about "ll glyph".)

BUT I have examined a list of all source document glyphs with another tool and see nothing resembling this case, so is there another possible cause?

I have also converted with other software and all LL converted OK, but the other overall conversion was inferior to calibre's in other ways (like losing images, wrapping stuff in div tags, not scaling fonts well).
Old 01-24-2011, 01:31 PM   #12
cybmole
Quote:
Originally Posted by Starson17 View Post
Calibre does not use OCR. Your PDF may have OCR text behind images of words. Calibre converts text, not images of text, so if the text is wrong, Calibre's conversion is wrong. The text may have a single l, while the image of the text has a double l. Or you may have ligatures. The engine Calibre uses can't handle all ligatures. Start by selecting the double-l word in your PDF, copy it, and paste it into Notepad to see whether that word pastes in correctly with a double or single l.
OK - thanks - my OCR hypothesis bites the dust - progress. Moving on...

Using Acrobat, not my default pdfexch viewer, I can search for and extract text. Here is the paragraph that I used at the start of the (mysterious case of disappearing Ls) thread, pasted from the source PDF. The double Ls all look fine, yet see the posted epub conversion below.

Safely on the other side of the stairwell housing, Ruth tilted her head up and let
the cataract wash over her cataracts. She’d been scheduled to have
phacoemulsification the week after martial law was declared. Now she was stuck
with cloudy vision of a cloudy sky. She pulled some matted strands of hair away
from her eyes, her fingers straying up her forehead, which seemed to go all the
way to the back of her head.


I've added the italics here, for thread clarity. The above text looks fine if I paste it into Notepad.

calibre converts it to this

Safely on the other side of the stairwel housing, Ruth tilted her head up and let the cataract wash over her cataracts. She’d been scheduled to have phacoemulsification the week after martial law was declared. Now she was stuck with cloudy vision of a cloudy sky. She pul ed some matted strands of hair away from her eyes, her fingers straying up her forehead, which seemed to go al the way to the back of her head. Maybe it was better she couldn’t see that wel . In her mind she could stil picture herself as she was. Abe, too.

so, all ----al
pulled ----pul ed
still -----stil
stairwell ----stairwel
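
The pattern in that list (ll turning into l plus a space) is regular enough that a clean-up pass over the converted text could be sketched as below. This is only a heuristic workaround, not a calibre feature and not a fix for the underlying bug: the function name and the tiny vocabulary are invented for the example, a real word list would be needed in practice, and the function collapses runs of whitespace as a side effect.

```python
def repair_double_l(text, vocabulary):
    """Heuristically restore 'll' where a conversion left 'l' plus a space.

    vocabulary is a set of lowercase words known to be spelled correctly.
    """
    tokens = text.split()
    out = []
    i = 0
    while i < len(tokens):
        tok = tokens[i]
        if tok.endswith("l") and tok.lower() not in vocabulary:
            nxt = tokens[i + 1] if i + 1 < len(tokens) else ""
            core = nxt.rstrip(".,;:!?")  # next fragment without trailing punctuation
            # Case 1: the 'll' was inside a word ("pul ed" -> "pulled").
            if core and (tok + "l" + core).lower() in vocabulary:
                out.append(tok + "l" + nxt)
                i += 2
                continue
            # Case 2: the 'll' ended the word ("stil" -> "still").
            if (tok + "l").lower() in vocabulary:
                if nxt and not core:
                    # Next token is bare punctuation ("wel ." -> "well.").
                    out.append(tok + "l" + nxt)
                    i += 2
                else:
                    out.append(tok + "l")
                    i += 1
                continue
        out.append(tok)
        i += 1
    return " ".join(out)

words = {"pulled", "all", "well", "still", "stairwell"}  # hypothetical word list
sample = "She pul ed some hair; it went al the way. She could not see that wel ."
print(repair_double_l(sample, words))
```

Words that legitimately end in a single l (such as "martial") are left alone, because neither repaired form appears in the vocabulary.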

Last edited by cybmole; 01-24-2011 at 01:36 PM.
Old 01-24-2011, 01:33 PM   #13
Starson17
You might want to cut out a page from your pdf with your double ll and paste it here.
Old 01-24-2011, 01:36 PM   #14
kovidgoyal
There's no point. This is not going to be fixed until the new calibre pdf engine is ready. It is a bug in the poppler pdftohtml program calibre uses. The new pdf engine has a replacement for pdftohtml that I wrote that fixes this issue, but until it is ready, there is no fix.
Old 01-24-2011, 01:38 PM   #15
Starson17
Quote:
Originally Posted by cybmole View Post
OK - using acrobat,
so, all ----al
pulled ----pul ed
still -----stil
stairwell ----stairwel
PDF has some tricky oddball ways of doing things. Essentially, it is a bunch of text characters with rules for where to put those characters on the page. It's based on a printer language, where that makes some sense, but it's not easy for a conversion engine, as there's no way to even be sure you've found the end of a sentence. You have to guess from where the text is put on the page and from the text itself. It's possible that the double l's are saved as a single l with a rule for putting in two of them. Calibre's conversion engine (not written by Kovid) does not know all the rules. Acrobat does know them all.
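
To make the "characters plus placement rules" idea concrete, here is a toy model of how an extractor might rebuild a line from positioned glyphs, guessing spaces from the horizontal gaps. It is an illustration only and does not claim to show how pdftohtml or calibre work internally; the glyph tuples, widths and threshold are all invented for the example.

```python
from typing import List, Tuple

Glyph = Tuple[str, float, float]  # (character, x position, advance width)

def glyphs_to_text(glyphs: List[Glyph], gap_threshold: float = 1.0) -> str:
    """Rebuild a line of text, inserting a space wherever the gap is large."""
    ordered = sorted(glyphs, key=lambda g: g[1])
    pieces = []
    prev_end = None
    for char, x, width in ordered:
        if prev_end is not None and x - prev_end > gap_threshold:
            pieces.append(" ")  # guessed word break
        pieces.append(char)
        prev_end = x + width
    return "".join(pieces)

# "pulled" with every glyph present and touching its neighbour.
complete = [("p", 0, 5), ("u", 5, 5), ("l", 10, 2),
            ("l", 12, 2), ("e", 14, 5), ("d", 19, 5)]
# Same layout, but the extractor fails to pick up the second l.
missing = [g for i, g in enumerate(complete) if i != 3]

print(glyphs_to_text(complete))  # pulled
print(glyphs_to_text(missing))   # pul ed
```

In this model, losing one glyph leaves a hole that the spacing guess then turns into a space, which matches the "pul ed" style of damage reported in this thread.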
All times are GMT -4.