01-08-2011, 09:03 AM   #1
Cid
Enthusiast
Posts: 33
Karma: 10
Join Date: Jul 2010
Device: Sony PRS-600
convert PDF input issue

Hi there. First off, sorry if this has already been discussed; I was unable to check because my searches returned too many results.

I couldn't find the solution in the calibre manual either.

I have a PDF with justified text. In some full lines the spacing between words is so large that the PDF input module seems to interpret each gap as an "end of line". As a result, each word ends up on its own line.

Example:
PDF:
This _ would _ be _ the _ justified _ line.

Output:
This
would
be
the
justified
line.

I've tried several different input and output settings as well as different output formats, to no avail. The problem is the same whether the output is EPUB, RTF or TXT, which is why I suspect the PDF input to be the cause. I also changed the unwrap factor, both in PDF input and in structure detection; the results show the settings taking effect, but they do not help with this issue.

Can anyone enlighten me?

Last edited by Cid; 01-08-2011 at 09:28 AM.
01-08-2011, 09:06 AM   #2
kovidgoyal
creator of calibre
Posts: 43,844
Karma: 22666666
Join Date: Oct 2006
Location: Mumbai, India
Device: Various
PDF input tries to detect line endings based on the spacing between characters. There's no way around that because of the nature of PDF. It will fail for some PDF files and succeed for others. I'd suggest you try copying and pasting the text from your PDF, or using Acrobat Professional to convert it to HTML. Either approach may use different parameters when interpreting the spaces and so might succeed.
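
To illustrate why wide justified spacing can trip up this kind of detection, here is a minimal sketch of a gap-threshold heuristic. It is not calibre's actual code: the `Word` class, the `split_on_gaps` and `layout` helpers, and all the numbers are hypothetical, chosen only to show how a gap wider than some cutoff can be mistaken for a line ending.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Word:
    text: str
    x_start: float  # left edge of the word (hypothetical units)
    x_end: float    # right edge of the word


def split_on_gaps(words: List[Word], max_gap: float) -> List[str]:
    """Group words into lines; a gap wider than max_gap starts a new line."""
    lines: List[List[str]] = []
    prev_end = None
    for w in words:
        if prev_end is None or (w.x_start - prev_end) > max_gap:
            lines.append([w.text])      # gap too wide: treat as a line ending
        else:
            lines[-1].append(w.text)    # normal word spacing: same line
        prev_end = w.x_end
    return [" ".join(line) for line in lines]


def layout(texts: List[str], gap: float, char_width: float = 5.0) -> List[Word]:
    """Hypothetical helper: place words left to right with a fixed gap."""
    x = 0.0
    placed = []
    for t in texts:
        end = x + char_width * len(t)
        placed.append(Word(t, x, end))
        x = end + gap
    return placed


texts = ["This", "would", "be", "the", "justified", "line."]
normal = layout(texts, gap=3.0)      # ordinary word spacing
stretched = layout(texts, gap=12.0)  # justified line with stretched spacing

print(split_on_gaps(normal, max_gap=8.0))
print(split_on_gaps(stretched, max_gap=8.0))
```

With the assumed cutoff of 8.0, the normally spaced line stays in one piece, while the stretched line is broken after every word, which matches the symptom described in the first post. A different tool that applies a different cutoff to the same PDF could keep the line intact, which is why trying another extraction route may help.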
01-08-2011, 09:30 AM   #3
Cid
Ah, thank you. I was afraid that was the case. Anyway, thanks for the help! Calibre is still awesome!
All times are GMT -4.