09-19-2010, 11:50 PM   #4
ldolse
I know exactly what you're talking about with Microsoft-inserted cruft. However, for PDF this wouldn't be a problem; the markup returned by pdftohtml is really simplistic.

I guess this partially depends on the type of PDF. At 20 megs I'm assuming it's images with text underneath, and that should process quickly. If it actually is 20 megs of text, then it may indeed take Calibre a while to process it...