Old 08-07-2013, 07:46 AM   #2
mrmikel
If you work with Sigil, you can start with a blank document, add all the HTML documents, arrange them in the desired order, and then copy and paste in the text documents from any text editor.

For the PDFs, you will need to convert them to HTML, or copy and paste the text from a PDF viewer, using either its text display function or the normal display mode. This will likely leave extra spaces, line breaks, or carriage returns that will need to be cleaned up.
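If there is a lot of pasted text, the cleanup can be partly automated before the text goes into Sigil. Here is a rough Python sketch of one way to do it. The function names are made up for illustration, and it assumes that a blank line in the pasted text marks a real paragraph break and that a hyphen at the end of a line is a word split by the PDF layout. Neither assumption holds for every PDF, so check the result by eye.

```python
import re
from html import escape


def clean_pdf_text(raw):
    """Turn text pasted from a PDF viewer into a list of clean paragraphs."""
    # Normalize carriage returns (Windows or old Mac line endings) to newlines.
    text = raw.replace("\r\n", "\n").replace("\r", "\n")
    # Rejoin words split across a line break ("conver-\nsion" -> "conversion").
    # Assumption: this also merges genuinely hyphenated words that happen to
    # break at a line end, so review the output.
    text = re.sub(r"(\w)-\n(\w)", r"\1\2", text)
    paragraphs = []
    # Assumption: one or more blank lines separate paragraphs.
    for block in re.split(r"\n\s*\n", text):
        # Collapse stray line breaks and runs of spaces inside a paragraph.
        para = re.sub(r"\s+", " ", block).strip()
        if para:
            paragraphs.append(para)
    return paragraphs


def to_html_paragraphs(paragraphs):
    """Wrap each paragraph in <p> tags, escaping &, < and > for HTML."""
    return "\n".join("<p>{}</p>".format(escape(p)) for p in paragraphs)


if __name__ == "__main__":
    sample = "This is  a line\r\nbroken by the PDF con-\r\nversion.\r\n\r\nSecond   paragraph."
    print(to_html_paragraphs(clean_pdf_text(sample)))
```

The printed paragraphs can then be pasted into a Sigil document in Code View.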

If you are unlucky and the PDFs are image PDFs containing no text, you will have to process them with an Optical Character Recognition (OCR) program, whose output will also need to be cleaned up. Even an error rate of only 2% means errors on every page.
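To put the 2% figure in perspective, here is a back-of-the-envelope calculation in Python. The 300 words per page is only an illustrative guess, the rate is treated as a per-word error rate, and errors are assumed to be independent of one another, so treat the result as a rough estimate.

```python
def expected_errors_per_page(words_per_page, error_rate):
    """Average number of misread words on one page."""
    return words_per_page * error_rate


def chance_page_is_clean(words_per_page, error_rate):
    """Probability that a page has no errors, assuming independent errors."""
    return (1 - error_rate) ** words_per_page


if __name__ == "__main__":
    words = 300   # assumed page length, not a measured value
    rate = 0.02   # the 2% error rate mentioned above
    print("Expected errors per page:", expected_errors_per_page(words, rate))
    print("Chance of an error-free page:", chance_page_is_clean(words, rate))
```

Under these assumptions a page averages about six errors, and the chance of a page coming through with none at all is well under 1%.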

This procedure avoids having calibre add its own hard-to-understand tags and CSS.