Problem with PDF conversion


Bombatomica
02-07-2010, 06:29 PM
Hello everybody, this is my first post on this fantastic forum. I'm Italian, so please excuse my bad English. I hope this is the right place to post my problem.
I have a Sony PRS-300 Pocket Edition, and I think it's a fantastic e-reader. The problem is that I have hundreds of books as DRM-free PDFs, and on this reader I have trouble viewing them: very often the font is either too small or too big, even if I use the zoom, and sometimes the page layout is broken, with pages cut in half, blank pages and so on. So I tried converting the PDFs to various formats, especially EPUB and Word. I use Calibre to do this, but I always get the same result: the converted text contains a lot of extra white space. To help you visualize, if the original is:

phrasephrasephrasephrase
phrasephrasephrasephrase
phrasephrasephrasephrase
phrasephrasephrasephrase
phrasephrasephrasephrase

the converted text is always:

phrasephrasephrasephrase
phrasephrasephrasephrase

phrasephrase

phrasephrasephrasephrase

phrasephrasephrase

and so on, in a completely random way. The page is always fragmented, no matter which format I convert to. Besides, I have many books in .lit format, and when I convert them to EPUB, for example, I get the same fragmentation problem, though it is less severe: converting from PDF also causes other problems, such as words cut in half at random, while with .lit I only get extra spaces everywhere.
Is there a way to maintain the original text formatting in the conversion? Is there some option I have to select in Calibre, or some other, more effective program? Any suggestion will be highly appreciated :D
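
For readers who want to see what cleaning up this kind of fragmentation involves, here is a minimal Python sketch. It is not what Calibre does internally. It is a simple heuristic, assumed here purely for illustration: a blank line counts as a real paragraph break only when the text before it ends a sentence, and a word split by a hyphen at the end of a line is rejoined. The function names `ends_sentence` and `unwrap_lines` are hypothetical.

```python
def ends_sentence(text):
    """True if text ends with sentence punctuation, ignoring closing quotes/brackets."""
    return text.rstrip("\"')]").endswith((".", "!", "?"))


def unwrap_lines(text):
    """Join fragmented lines back into paragraphs (heuristic, not exact)."""
    paragraphs = []
    current = ""
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            # A blank line is kept as a paragraph break only if the
            # text collected so far already ends a sentence.
            if current and ends_sentence(current):
                paragraphs.append(current)
                current = ""
            continue
        if not current:
            current = line
        elif current.endswith("-"):
            # Rejoin a word that was cut in half at the end of a line.
            current = current[:-1] + line
        else:
            current += " " + line
    if current:
        paragraphs.append(current)
    return "\n\n".join(paragraphs)


sample = "The page is always\nfrag-\n\nmented after\n\nconversion.\n\nNext paragraph here."
print(unwrap_lines(sample))
```

A heuristic like this will wrongly merge headings into the following text and will strip the hyphen from genuine compounds that happen to break across lines, so the result still needs proofreading.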

frabjous
02-07-2010, 07:07 PM
PDFs are not really made to be converted. PDF was designed as an output format, not an input format.

Also, you seem to want to have your cake and eat it too. If the reason you don't want to just read the PDFs as they are is that the layout isn't right, you can't both preserve that layout AND convert to something that doesn't have it.

However, my best advice would be to keep the file in PDF format and process it with soPDF (http://www.mobileread.com/forums/showthread.php?t=32066) first. The results won't be perfect, but generally this makes things much easier to read on a small screen while preserving the look of the PDF exactly.

DDHarriman
02-07-2010, 07:24 PM
Hi

Concerning your main question: problem with PDF conversion.

If you cannot manage to read your PDFs on your reader using the two ways it offers to help with that (enlarging the font size, which reflows the text, or rotating the screen and reading that way, which preserves all the PDF formatting), then there is no simple solution.
For the latter you can also use “soPDF”, as frabjous so rightly advises, and get a better, sometimes good enough, reading experience from a large-format PDF file.

Now the hard solution:
With PDFs that do not fit your screen, the only solution is to treat them as paper pages, meaning that the only way to get better results is, if they are not DRM-protected, to run them through an OCR program.
For this, today’s best ones are OmniPage Pro 17 or FineReader Pro 10 (the latter is very well regarded by most users of this forum, in this or older versions).
After that you will have a digital document, which you will need to proofread and then save in a reflowable format that your reader can display easily.

Best regards,

Bombatomica
02-10-2010, 08:03 PM
Thank you both for the help :-) However, I tried many books directly as PDF on my Sony. Some are good, some are simply unreadable, because the formatting is totally broken, with words cut or mixed with others, etc. I took a look at soPDF, but it's a little too complicated for me. The best solution I've found so far is to convert PDF to EPUB with Calibre. Very often the result is good, so generally I try this.
Yes, like frabjous said, PDF is an output format. It's really, really hard to convert, so I'll try to use it as it is; otherwise I'll convert to EPUB and cross my fingers :-) Thanks again!

Solitaire1
02-11-2010, 12:07 AM
To follow what DDHarriman said, when it comes to PDFs I format them as if I was going to print them on paper and use my reader's screen size as the paper size. When I do that, they look on my reader at the smallest size exactly as I intended.
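
The page size for this approach can be worked out from the screen's resolution, since PDF page dimensions are expressed in points (1/72 inch). A small sketch, in which the function name and the 600 x 800 pixel, 200 dpi figures are example assumptions rather than values from this thread; substitute your own device's specifications:

```python
POINTS_PER_INCH = 72  # PDF/PostScript points per inch


def page_size_points(width_px, height_px, dpi):
    """Return the (width, height) in points of a page that fills the screen."""
    return (width_px * POINTS_PER_INCH / dpi,
            height_px * POINTS_PER_INCH / dpi)


# Hypothetical example screen: 600 x 800 pixels at 200 dpi.
width_pt, height_pt = page_size_points(600, 800, 200)
print(f"{width_pt:.0f} x {height_pt:.0f} pt")  # prints "216 x 288 pt"
```

Setting a custom paper size of that width and height in the word processor before generating the PDF gives one PDF page per screen, with no scaling needed on the reader.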

What I don't expect to be able to do with a PDF formatted this way is increase the size of the document on my reader. Although reflow is available, I've found that it provides less than satisfactory results. What I've found works best, if I want to increase the size of the text, is to go back to the original source document, increase the text size there, and then regenerate the PDF.

Despite the loss of the ability to increase the size of the text, the tradeoff is that I have complete control over the way my ebook looks. One other thing I've noticed is that with PDFs my reader doesn't reformat them like it does with the other formats.

While PDF offers complete control over an ebook's format, other formats don't offer that level of control. As an example, one of the complaints I've heard about the epub format is that it doesn't allow for full justification. With RTF, a problem I've had is font control where I can't predict which font my ebooks will display in.

I hope this helps.

frabjous
02-11-2010, 01:57 AM
Bombatomica wrote: "I took a look at soPDF, but it's a little too complicated for me."

Did you happen to try my GUI for soPDF (http://www.mobileread.com/forums/showpost.php?p=375895&postcount=63) or Nathan Campos's GUI (http://www.mobileread.com/forums/showthread.php?t=67739)? (GUI = Graphical user interface.) These should make it easier to use if you're not a command-line user.

Bombatomica
02-13-2010, 09:27 AM
I tried your GUI, and it is really noob-proof. I tried some conversions; sometimes the result is very good, sometimes not, but it is a very good alternative, thank you :-)