Old 09-04-2010, 04:19 PM   #1
llcooljayce
Member
Posts: 10
Karma: 10
Join Date: Sep 2010
Device: Kindle 3
PDF to .mobi - Problems with Conversion

Hey guys, I have Malcolm Gladwell's book 'Blink' in PDF and I want to read it on my Kindle 3. I have tried sending it directly as a PDF but the font was too small. I tried converting it to a .mobi using Calibre but some words were missing or garbled ... does anyone have any suggestions on perhaps another format to try? Thanks
Old 09-05-2010, 10:53 AM   #2
llcooljayce
Member
Posts: 10
Karma: 10
Join Date: Sep 2010
Device: Kindle 3
Anyone?
Old 09-05-2010, 11:08 AM   #3
ldolse
Wizard
Posts: 1,337
Karma: 123455
Join Date: Apr 2009
Location: Malaysia
Device: PRS-650, iPhone
You shouldn't get missing or garbled words unless the original PDF was messed up. The main problems with PDF conversion are that you lose some formatting and see the occasional hard line break.

If the text contains accented characters, that may give the impression that some of it was garbled. Use the 'Keep Ligatures' option under Look & Feel to fix that.

Edit - Your PDF might consist of page images with an OCR text layer underneath. The PDF would then look fine on screen, but only because you can't see the bad OCR text underneath, and that hidden text is what Calibre converts. Find some of the garbled text in Calibre's output, then copy the same passage from the PDF and paste it into a text editor. If the pasted text shows the same garbling as Calibre's output, you can be sure that's the problem.

Last edited by ldolse; 09-05-2010 at 11:21 AM.
Old 09-05-2010, 12:59 PM   #4
Starson17
Wizard
Posts: 4,004
Karma: 177841
Join Date: Dec 2009
Device: WinMo: IPAQ; Android: HTC HD2, Archos 7o; Java:Gravity T
Quote:
Originally Posted by ldolse
Find some of the garbled text in Calibre, then go and copy the same text from the pdf, and copy/paste it into a text editor. If the text editor mirrors Calibre then you can be sure that's the problem.
This procedure also works:

1. Find some of the garbled text in Calibre.
2. Go to the PDF and search for that text.
3. If the search finds a match in the PDF (even if the text shown at that spot looks fine), you can be sure that's the problem. (PDF searches match against the OCR text but show you the image of the page.)
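
The check described above can be sketched in a few lines of Python. This is only an illustration, not anything Calibre provides: it assumes you have already obtained the PDF's hidden text some other way (for example by copy/paste into a text editor), and the function names are made up for this example.

```python
import re


def normalize(text: str) -> str:
    """Collapse whitespace runs and lowercase, so line wrapping does not matter."""
    return re.sub(r"\s+", " ", text).strip().lower()


def garbled_text_comes_from_pdf(garbled_snippet: str, pdf_text: str) -> bool:
    """Return True if the garbled snippet from the converted book also occurs
    in the text layer copied out of the PDF (hypothetical helper)."""
    return normalize(garbled_snippet) in normalize(pdf_text)


# Made-up sample data: text pasted from the PDF, wrapped across two lines.
pdf_text = "The  st4tue th@t\ndidn't look right"
print(garbled_text_comes_from_pdf("st4tue th@t didn't", pdf_text))
```

If the function returns True for a snippet that looks fine on the PDF page, the garbling is already in the PDF's text layer and not something the conversion introduced.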
Old 09-05-2010, 06:15 PM   #5
adeling
Junior Member
Posts: 1
Karma: 10
Join Date: Sep 2010
Device: Amazon Kindle 3
Quote:
Originally Posted by llcooljayce
Hey guys, I have Malcolm Gladwell's book 'Blink' in PDF and I want to read it on my Kindle 3. I have tried sending it directly as a PDF but the font was too small. I tried converting it to a .mobi using Calibre but some words were missing or garbled ... does anyone have any suggestions on perhaps another format to try? Thanks
When you say some of the words were garbled, was it really only some? The reason I ask is because I did some conversion and all the words were messed up except on the 'cover'. See this page for more information: http://www.buzzle.com/articles/under...rotection.html
Old 09-06-2010, 12:52 AM   #6
llcooljayce
Member
Posts: 10
Karma: 10
Join Date: Sep 2010
Device: Kindle 3
Looks like what is happening is that Calibre is taking the top of each page and OCR'ing it. For example the first chapter's title is 'The Statue That Didn't Look Right' and it has larger spacing than normal text so it shows up as 'T h e S t a t u e t h a t d i d n ' t l o o k r i g h t'

Which is why I thought other text was garbled. It also has the title of the book over and over again ...

Is there any way to ignore that stuff when converting?
Old 09-06-2010, 01:38 AM   #7
ldolse
Wizard
Posts: 1,337
Karma: 123455
Join Date: Apr 2009
Location: Malaysia
Device: PRS-650, iPhone
Calibre didn't do any OCR. This is a relatively common problem when converting PDF titles: the original PDF used extra spacing between the letters for formatting. Those gaps aren't proper space characters, but each letter is basically a separate draw instruction in the PDF, so the letters get converted as separate characters instead of as a word. There is no easy way to get rid of the extra spaces except by hand editing afterward.
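
To make the hand editing a bit less tedious, a small script can at least locate the letter-spaced runs. The sketch below is an assumption-laden illustration, not a Calibre feature: the pattern and function names are invented here, and the threshold of four single-character tokens in a row is an arbitrary choice. Note that collapsing a run also loses the real word boundaries, so the spaces between words still have to be restored by hand.

```python
import re

# Four or more single-character tokens in a row, separated by single spaces.
# The threshold is a guess meant to avoid flagging ordinary text like "I a m".
LETTER_SPACED = re.compile(r"(?<!\S)(?:\S ){3,}\S(?!\S)")


def find_letter_spaced_runs(text: str) -> list[str]:
    """Return every run of text that looks like a letter-spaced heading."""
    return [match.group(0) for match in LETTER_SPACED.finditer(text)]


def collapse(run: str) -> str:
    """Remove all spaces from a run; word breaks must be re-added manually."""
    return run.replace(" ", "")


title = "T h e S t a t u e t h a t d i d n ' t l o o k r i g h t"
for run in find_letter_spaced_runs(title):
    print(collapse(run))
```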

As for the title of the book appearing over and over, that's because the PDF has a header or footer that is being converted as well. You can use the header/footer removal option under Structure Detection to remove it. You need to write a regular expression pattern for this. It's probably something like "\s*<p>\s*Blink\s*</p>", but you'll need to use the test function to tweak the pattern.
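
To get a feel for what that pattern matches before trying it in Calibre's test function, it can be run against a sample in plain Python. The sample markup below is invented for illustration; the intermediate HTML that Calibre actually produces from a given PDF may look different, which is why the pattern usually needs tweaking.

```python
import re

# The pattern suggested above: a paragraph containing only the running title.
HEADER_PATTERN = re.compile(r"\s*<p>\s*Blink\s*</p>")


def strip_running_header(html: str) -> str:
    """Delete every paragraph that consists solely of the book title."""
    return HEADER_PATTERN.sub("", html)


# Hypothetical sample of converted text with a repeated page header.
sample = (
    "<p>Some body text.</p>\n"
    "<p> Blink </p>\n"
    "<p>Blink was not the only word here.</p>"
)
print(strip_running_header(sample))
```

Paragraphs that merely contain the word "Blink" alongside other text are left alone, because the pattern requires the closing tag right after the title.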
Old 09-06-2010, 01:35 PM   #8
llcooljayce
Member
Posts: 10
Karma: 10
Join Date: Sep 2010
Device: Kindle 3
I just tried the 'Remove Header' and 'Remove Footer' preference and it worked like a charm! Thanks for the suggestion.
All times are GMT -4.