Old 10-19-2010, 05:20 AM   #7
ldolse
You're having the spacing problem because the spacing doesn't exist in the PDF document. PDFs are basically just a series of draw commands, which say 'Start drawing xyz at these coordinates'. There are no spaces involved, just coordinates. When Calibre converts PDF to HTML it doesn't look at where on the page the draw command started; it just converts that line of text to a paragraph (there is no margin information).
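
To make that concrete, here's a rough Python sketch. It is not Calibre's actual code; the `DrawText` class, the sample coordinates and both functions are made up for illustration. It just shows how the indent only exists as an x coordinate, and how it disappears if the converter ignores coordinates.

```python
from dataclasses import dataclass


@dataclass
class DrawText:
    """Hypothetical stand-in for one PDF 'draw this text here' command."""
    x: float   # horizontal start position, in points
    y: float   # vertical start position, in points
    text: str


# Made-up page content: the second line starts 36pt further right.
commands = [
    DrawText(72.0, 700.0, "Chapter 1"),
    DrawText(108.0, 680.0, "An indented first line"),
    DrawText(72.0, 666.0, "continues flush left."),
]


def naive_to_html(cmds):
    """Ignore the coordinates: each draw command becomes a plain paragraph."""
    return "\n".join(f"<p>{c.text}</p>" for c in cmds)


def indent_aware_to_html(cmds):
    """Treat the smallest x as the left margin and turn any extra offset
    into a CSS text-indent, so the 'spacing' survives the conversion."""
    left_margin = min(c.x for c in cmds)
    parts = []
    for c in cmds:
        indent = c.x - left_margin
        style = f' style="text-indent: {indent:g}pt"' if indent > 0 else ""
        parts.append(f"<p{style}>{c.text}</p>")
    return "\n".join(parts)


print(naive_to_html(commands))
print(indent_aware_to_html(commands))
```

The first function loses the indent entirely because it never reads `x`; the second recovers it only by comparing coordinates, which is the extra work a converter has to do to keep the layout.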

There are other PDF-to-HTML converters out there which do retain some of this information - one I saw used a combination of divs and CSS to preserve where everything starts and ends. That type of conversion goes to the opposite extreme though: the document retains hard line breaks, isn't very compatible across ebook formats, etc. Search for 'pdf to html' and try the different online converters to see if one gets you closer to what you want; you could then use its output as the source instead of the PDF.
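
For comparison, this is roughly what the divs-and-CSS style of output looks like. Again this is only a sketch in Python: the `positioned_html` function and the sample lines are hypothetical, not taken from any particular converter.

```python
import html


def positioned_html(lines):
    """lines: iterable of (x, y, text) tuples, measured in points from the
    top-left corner of the page. Each line gets its own absolutely
    positioned div, so the layout is kept but the text cannot reflow."""
    divs = [
        f'<div style="position: absolute; left: {x:g}pt; top: {y:g}pt">'
        f"{html.escape(text)}</div>"
        for x, y, text in lines
    ]
    return "\n".join(divs)


# Made-up sample: every visual line of the page is pinned in place.
sample = [
    (72, 92, "Chapter 1"),
    (108, 112, "An indented first line"),
    (72, 126, "continues flush left."),
]
print(positioned_html(sample))
```

Every visual line is pinned to a fixed spot, which is why this kind of output keeps its hard line breaks and doesn't adapt well to different screen sizes or ebook formats.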