06-15-2012, 12:56 PM   #5
daniel3ub
Junior Member
Posts: 5
Karma: 10
Join Date: Jun 2012
Device: Kindle 4NT
Thanks for your answer. I've read the sticky many times in search of a solution. Every PDF I've tried to convert exhibits this problem, or I wouldn't be asking for help.

I think you can reproduce the problem. Here are the steps I followed just now, to be sure:

1. Download a PDF book from Project Gutenberg (I used http://www.gutenberg.org/ebooks/1342 ). It has no headers or footers. You can also get the .txt version and make a PDF from it; the results are the same.
2. Convert it to MOBI with Calibre, with all the Heuristics options checked, and with "Remove spacing between paragraphs" under "Look and Feel" checked too.

In the resulting MOBI (in the internal viewer or on the Kindle itself) you can see blank lines corresponding to every page break in the PDF file. Playing around a bit, I've just found that these blank lines are soft scene breaks inserted by Calibre (if I use the "replace soft scene breaks" option, it becomes obvious).

However, if a paragraph is split across two pages in the PDF, no soft scene break is inserted; instead, a new paragraph begins at the point of the page break.

I can certainly use a regex to fix these paragraph breaks, but I think Calibre could handle this "PDF page break -> MOBI soft scene break" problem itself.
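
For reference, here is a rough sketch of the kind of regex fix I mean. It assumes the converted text is HTML with plain <p> elements, and it uses a heuristic of my own choosing: a paragraph that ends without sentence-final punctuation, followed by one that starts with a lowercase letter, was probably cut by a page break. The function name and the pattern are only illustrative, not anything Calibre ships.

```python
import re

# Assumed heuristic (not Calibre's own logic): the previous paragraph ends
# without closing punctuation and the next one starts with a lowercase letter.
_SPLIT = re.compile(r"""([^.!?:;"')\s])\s*</p>\s*<p[^>]*>\s*(?=[a-z])""")


def rejoin_split_paragraphs(html: str) -> str:
    """Merge paragraphs that look like they were cut at a page break."""
    # Keep the last character of the first paragraph and replace the
    # closing/opening tag pair with a single space.
    return _SPLIT.sub(r"\1 ", html)


if __name__ == "__main__":
    sample = "<p>It is a truth universally</p>\n<p>acknowledged, that a single man.</p>"
    print(rejoin_split_paragraphs(sample))
```

This would miss paragraphs split right after a full stop, and it does not rejoin words hyphenated across the page break, so it is a workaround at best.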

As you can imagine, the problem becomes annoying in a text that already contains real soft scene breaks, because the resulting MOBI will have a lot of fake ones mixed in.

Cheers!