Old 12-12-2009, 05:44 PM   #1
tapar
Connoisseur
Posts: 62
Karma: 1420
Join Date: Dec 2008
Device: Kindle Keyboard 3g, Kindle Paperwhite v1
PDF->Mobi extra spaces inserted?

Howdy,

I have been really loving Calibre and am very pleasantly surprised by how powerful it is. It has been invaluable in organizing my collection. It is going to be a godsend for teaching my mother how to manage her Kindle after XMas.

The only hurdle I am still having trouble with is the beast of beasts: converting PDF->Mobi for the Kindle. The defaults seem to produce output that looks as if, after every two and a quarter lines, someone hit Enter a couple of times to add a bit of white space. The text is whole and everything is there, but the blank space is somewhat distracting. I am thinking there is a setting somewhere that I am unaware of.

Anyone know if there is a solution? If I can get Calibre to do the PDF conversion and not have to explain how to run MobiPocket Creator to my less than technically proficient mother, I will be eternally in your debt. When I was looking for the answer I saw a post from early November where Kovid mentioned that a new PDF conversion engine was in development, so perhaps I just need to be patient. lol.
Old 12-12-2009, 08:00 PM   #2
JMikeD
Evangelist
Posts: 452
Karma: 15000
Join Date: Jul 2008
Device: Various and sundry
I've been trying to find something to satisfactorily convert PDF to anything else. Even Adobe's own Acrobat program can’t seem to do it. Exporting a PDF it has just created to text results in extraneous spaces, run-together words, and extra paragraph breaks and page breaks inserted seemingly at random. If Adobe can’t get it right, what chance does anyone else have?

Edit: I've gone back and looked at my notes and find that I misremembered. Most of the spurious artifacts I found were introduced in the scanning process of turning printed material into a PDF. Somehow it looked OK on the display, but when the text was extracted weird things came to light.

Last edited by JMikeD; 12-12-2009 at 09:05 PM.
Old 12-12-2009, 08:39 PM   #3
tapar
Connoisseur
Posts: 62
Karma: 1420
Join Date: Dec 2008
Device: Kindle Keyboard 3g, Kindle Paperwhite v1
Yeah, it does seem to be "pick the artifacts that bother you the least" with PDF conversion. I wish it weren't such a popular format; hopefully a friendlier format will start to take over as the popularity of e-readers continues to grow.

I just got some insight into my issue as I was writing my reply. I was going to paste in a bit of sample text from a PDF, then convert it and paste in some results. I think my problem is a "carriage return line feed" type of issue. I noticed the text was formatted in such a way that wherever the text jumped to the next line in the PDF, the converted result had a full blank line inserted. I remember an option about adding a line at paragraph breaks; perhaps I am triggering that inadvertently? I will experiment some more.

The problem I am having with Calibre's conversion is that it takes input like this small blurb from Brandon Sanderson's Elantris:
"ELANTRIS was beautiful, once. It was called the city of the gods: a place of
power. radiance, and magic. Visitors say that the very stones glowed with an
inner light, and that the city contained wondrous arcane marvels. At night,
Elantris shone like a great silvery fire, visible even from a great distance.
Yet, as magnificent as Elantris was, its inhabitants were more so. Their hair
a brilliant white, their skin an almost metallic silver. "

The same text from the converted Mobi file ends up looking like this on the Kindle:
"ELANTRIS was beautiful, once. It was called the city of the gods: a place of

power. radiance, and magic. Visitors say that the very stones glowed with an

inner light, and that the city contained wondrous arcane marvels. At night, Elantris

shone like a great silvery fire, visible even from a great distance.

Yet, as magnificent as Elantris was, its inhabitants were more so. Their hair

a brilliant white, their skin an almost metallic silver."

Actually, it still looks too neat there; here is a small JPG on Flickr that shows the effect a bit better: http://www.flickr.com/photos/45598342@N08/
Old 12-12-2009, 10:42 PM   #4
Sabardeyn
Guru
Posts: 629
Karma: 1242364
Join Date: May 2009
Location: The Right Coast
Device: PC (Calibre), Nexus 7 2013 (Moon+ Pro), HTC HD2/Leo (Freda)
That Flickr image makes me think there are two problems. One is the CR/LF problem you've mentioned. The other has to do with the page margins and/or the number of characters per line, which causes the text to be word-wrapped.

Fix both and the problem should be resolved.
Old 12-14-2009, 02:51 AM   #5
tapar
Connoisseur
Posts: 62
Karma: 1420
Join Date: Dec 2008
Device: Kindle Keyboard 3g, Kindle Paperwhite v1
I just made an unexpected discovery that fixed my problem. I don't understand WHY it fixed things or what it really means, though. I tried playing with all the available settings and found that going to the "Preferences->Conversion->PDF Input" tab and changing the "Line Un-Wrapping Factor" from 0 to .25 totally resolved my problem. I did not find much information on this setting and don't really understand what it affects.
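
For anyone who would rather script this than click through the GUI, here is a rough sketch of what the same setting might look like from the command line. It only builds the command and does not run it. It assumes calibre's command-line converter is called ebook-convert and that the PDF input option is spelled --unwrap-factor (check the help output of your own install), and the file names are made up for illustration.

```python
import shlex


def build_convert_command(pdf_path, mobi_path, unwrap_factor=0.25):
    """Build (but do not run) an ebook-convert command line.

    Assumption: calibre's command-line converter is named ebook-convert
    and accepts --unwrap-factor for PDF input. Verify this against the
    help output of your own calibre install before relying on it.
    """
    if not 0 <= unwrap_factor <= 1:
        raise ValueError("unwrap_factor should be between 0 and 1")
    return [
        "ebook-convert", pdf_path, mobi_path,
        "--unwrap-factor", str(unwrap_factor),
    ]


# Hypothetical file names, for illustration only.
cmd = build_convert_command("elantris.pdf", "elantris.mobi")
print(shlex.join(cmd))
# To actually convert: import subprocess; subprocess.run(cmd, check=True)
```

Keeping the command as a list rather than a single string avoids quoting problems if a file name contains spaces.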

If anyone has some info about that setting I would appreciate it. I am wondering if it is PDF specific or display device specific. I don't know if I will have to adjust it for various PDFs or if it is a "find the setting that works for you then forget it" kind of deal.

If I don't have to adjust it all the time then this will bump Calibre right up to "ultimate conversion tool" status and make things significantly easier on me.

Last edited by tapar; 12-14-2009 at 03:04 AM.
Old 12-14-2009, 06:47 AM   #6
user_none
Sigil & calibre developer
Posts: 2,433
Karma: 950001
Join Date: Jan 2009
Location: Florida, USA
Device: Nook STR
The unwrap factor is a sliding scale from 1 character to the end of the line. The factor determines how long a line has to be to be added into the paragraph with the lines above it.

For example:

This is a full line of text.
Line two.

This is a full line of text
that should not end.

An unwrap factor of 0.5 (the default if 0 is set) will add any line longer than half the page to the line above it. The first example will not be unwrapped while the second one will.
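
The rule described above can be sketched in a few lines of Python. This is only a toy illustration of the idea as worded in this post, not calibre's actual code: the function name unwrap_lines is made up, and the longest line in the input is used as a stand-in for the page width, which is an assumption.

```python
def unwrap_lines(lines, factor=0.5):
    """Join each sufficiently long line onto the line above it.

    Toy model of the rule described in the post, NOT calibre's real
    implementation. Assumption: the longest non-empty line stands in
    for the full page width.
    """
    stripped = [ln.strip() for ln in lines if ln.strip()]
    if not stripped:
        return []
    threshold = factor * max(len(ln) for ln in stripped)

    paragraphs = []
    for text in stripped:
        if paragraphs and len(text) > threshold:
            # Long line: treat it as a continuation of the previous one.
            paragraphs[-1] += " " + text
        else:
            # First line, or a short line: keep it on its own.
            paragraphs.append(text)
    return paragraphs


# The two examples from the post above.
first = unwrap_lines(["This is a full line of text.", "Line two."])
second = unwrap_lines(["This is a full line of text", "that should not end."])
print(first)
print(second)
```

With the default factor of 0.5, "Line two." is shorter than half of the line above it and stays separate, while "that should not end." is longer than half and gets joined. Lowering the factor makes the rule join more lines.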
Old 12-15-2009, 05:10 AM   #7
tapar
Connoisseur
Posts: 62
Karma: 1420
Join Date: Dec 2008
Device: Kindle Keyboard 3g, Kindle Paperwhite v1
Thanks so much for the explanation. It makes sense to me now why it made my results so much better.

I am really pleased this works so well. I am more impressed by Calibre the more I use it! It is very nice to have a one-stop shop that does such a great job.
Old 01-29-2011, 08:12 PM   #8
Geekdad72
Junior Member
Posts: 3
Karma: 10
Join Date: Jan 2011
Device: Kindle 3 wifi
Thanks for the help!!

I have a Kindle 3 and hate the PDF format on the Kindle. I prefer to convert it, and never had much luck with calibre (though I think that EVERYONE who owns a Kindle needs to use calibre), so I would use Mobipocket Creator and then import the result into my library. This thread fixed my conversion problem with a quick and easy fix, and now I can do it all in one program... THANKS SOOOO MUCH!!
Old 01-29-2011, 08:33 PM   #9
DoctorOhh
US Navy, Retired
Posts: 8,801
Karma: 12534285
Join Date: Feb 2009
Location: North Carolina
Device: Nexus 7
Quote:
Originally Posted by Geekdad72
I have a kindle 3 and hate the pdf format on the kindle. I prefer to convert it, and never had much luck with calibre. (I think that EVERYONE that owns a kindle needs to use calibre).
First, let me welcome you to Mobileread. I am glad that you took the time to post and share your input with others.

That said, Please Do Not, if at all possible, reopen ancient threads. I know the Old thread Warning is below the reply area and it is possible you didn't see it.

Quote:
Old Thread Warning
This thread is quite old and may contain outdated information. We recommend that you begin a new discussion if you like to talk about a similar topic. Of course, you may also proceed and reply here.
The reason I ask you to refrain from reopening old threads is that calibre is updated about once a week, which means year-old posts are usually talking about a version of calibre some 50 versions out of date.

PDF, though, is a difficult area for most conversion utilities. For anyone else viewing this thread, please read the user manual statement on PDFs and the "Read this before Posting PDF Questions" sticky post for further information.

Last edited by DoctorOhh; 01-29-2011 at 08:36 PM.
All times are GMT -4.