Old 06-24-2010, 01:50 AM   #5
Nirf
Device: PRS-505
OK, so I followed the suggestions. Running pdftohtml on hello.pdf worked and produced a bunch of files: hello.html, hellos.html, hello_ind.html, and a zillion .png files for all the pages. However, I couldn't find any way to add the HTML file to calibre as a usable book. When I choose hello.html, the book shows up in the library as a zip file, and there's no way to preview it. Very odd behavior.

Also, I let ebook-convert run for a long time this time, and here's what I eventually got (after a memory leak so bad that everything slowed down and I had to end it the hard way):


1% Converting input to HTML...
InputFormatPlugin: PDF Input running
on /home/nir/Documents/Calibre Library/Malcolm Gladwell/Outliers_ The Story of Success (Little, Brown & Co; 2008) (3)/hello.pdf
pdftohtml log:

Parsing all content...
Initial parse failed:
Parsing file 'index.html' as HTML
Forcing index.html into XHTML namespace
Generating default TOC from spine...
34% Running transforms on ebook...
Merging user specified metadata...
Detecting structure...
Auto generated TOC with 0 entries.
Flattening CSS and remapping font sizes...
Source base font size is 12.00000pt
Cleaning up manifest...
Trimming unused files from manifest...
Creating LRF Output...
67% Creating LRF Output
Processing u'index.html'
Parsing HTML...
Converting to BBeB...
Terminated


These conversions are also taking huge amounts of time. The pdftohtml conversion took a few minutes, and ebook-convert takes even longer to get to this point. I don't remember it taking anywhere near this long before. What's going on?
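
In case it helps anyone reproduce this without their machine grinding to a halt, here's a rough sketch of how the conversion could be run with a memory cap and a time limit, so it gets stopped automatically instead of being killed by hand. It's plain Python on Linux (POSIX only), nothing calibre-specific; the helper name run_limited and the limits are just my assumptions for illustration.

```python
import resource
import subprocess
import time


def run_limited(cmd, mem_bytes, timeout_s):
    """Run cmd with a cap on address space and on wall-clock time.

    Returns (returncode, elapsed_seconds, combined_output). The returncode
    is None if the time limit expired and the process was killed.
    """
    def cap_memory():
        # Runs in the child just before exec. Only the soft limit is
        # changed, and it is never raised above the existing hard limit.
        soft, hard = resource.getrlimit(resource.RLIMIT_AS)
        if hard == resource.RLIM_INFINITY:
            new_soft = mem_bytes
        else:
            new_soft = min(mem_bytes, hard)
        resource.setrlimit(resource.RLIMIT_AS, (new_soft, hard))

    start = time.monotonic()
    try:
        proc = subprocess.run(
            cmd,
            stdout=subprocess.PIPE,
            stderr=subprocess.STDOUT,  # keep the log in one stream
            text=True,
            timeout=timeout_s,
            preexec_fn=cap_memory,
        )
        code, output = proc.returncode, proc.stdout
    except subprocess.TimeoutExpired as exc:
        # Partial output may come back as bytes or as None.
        partial = exc.stdout or ""
        if isinstance(partial, bytes):
            partial = partial.decode(errors="replace")
        code, output = None, partial
    return code, time.monotonic() - start, output
```

Something like run_limited(["ebook-convert", "hello.pdf", "hello.lrf"], 1 << 30, 600) would then give the exit code, the elapsed time, and the log. The output name hello.lrf, the 1 GiB cap, and the 10 minute limit are made-up values here, not what I actually ran.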