05-16-2020, 06:38 AM   #1
Shohreh
Recommended clean-up before HTML → EPUB?

Hello,

I'd like to concatenate a bunch of web pages into a single EPUB to read on my e-reader.

I tried pandoc, but it's very slow and pretty much freezes my computer, so I tried Calibre, which at least kept my computer responsive:

Code:
rem Join all the pages into one file
copy /b *.html full.html

rem First attempt: pandoc
pandoc -o full.epub full.html

rem Second attempt: Calibre
"C:\Program Files\Calibre2\ebook-convert.exe" full.html full.epub
Regardless, how do you clean up HTML files before joining them into a single file? Any good practices?
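
To make the question concrete: "copy /b" glues complete documents together, so full.html ends up with one <html>/<head>/<body> set per page. Below is a minimal Python sketch of the kind of clean-up I have in mind, using only the standard library. It keeps what is inside each page's <body>, drops <script> and <style>, and wraps everything in a single document. The names BodyExtractor, body_of, merge and merge_files are mine, not from any tool, and a page without a <body> tag would come out empty, so treat it as a starting point rather than a tested solution.

```python
from html.parser import HTMLParser
from pathlib import Path

SKIP = {"script", "style"}  # assumption: these are never wanted in the EPUB


class BodyExtractor(HTMLParser):
    """Collect the markup found inside <body>, minus script/style elements."""

    def __init__(self):
        # Keep entities such as &amp; exactly as written in the source.
        super().__init__(convert_charrefs=False)
        self.parts = []
        self.in_body = False
        self.skip_depth = 0

    def _keep(self):
        return self.in_body and self.skip_depth == 0

    def handle_starttag(self, tag, attrs):
        if tag == "body":
            self.in_body = True
        elif tag in SKIP:
            self.skip_depth += 1
        elif self._keep():
            self.parts.append(self.get_starttag_text())

    def handle_startendtag(self, tag, attrs):
        # Self-closing tags such as <br/> are copied through unchanged.
        if tag not in SKIP and self._keep():
            self.parts.append(self.get_starttag_text())

    def handle_endtag(self, tag):
        if tag == "body":
            self.in_body = False
        elif tag in SKIP:
            self.skip_depth = max(0, self.skip_depth - 1)
        elif self._keep():
            self.parts.append(f"</{tag}>")

    def handle_data(self, data):
        if self._keep():
            self.parts.append(data)

    def handle_entityref(self, name):
        if self._keep():
            self.parts.append(f"&{name};")

    def handle_charref(self, name):
        if self._keep():
            self.parts.append(f"&#{name};")


def body_of(html_text):
    """Return the cleaned inner markup of one page's <body>."""
    parser = BodyExtractor()
    parser.feed(html_text)
    parser.close()
    return "".join(parser.parts).strip()


def merge(pages, title="Merged pages"):
    """Wrap the cleaned bodies of several pages in one HTML document."""
    sections = "\n".join(f"<section>\n{body_of(p)}\n</section>" for p in pages)
    return (
        "<!DOCTYPE html>\n<html>\n"
        f'<head><meta charset="utf-8"><title>{title}</title></head>\n'
        f"<body>\n{sections}\n</body>\n</html>\n"
    )


def merge_files(directory=".", out_name="full.html"):
    """Merge every *.html file in a directory, skipping the output file itself."""
    folder = Path(directory)
    files = sorted(p for p in folder.glob("*.html") if p.name != out_name)
    pages = [f.read_text(encoding="utf-8", errors="replace") for f in files]
    (folder / out_name).write_text(merge(pages), encoding="utf-8")
```

Calling merge_files() in the folder holding the pages would write a full.html that can then be passed to pandoc or ebook-convert as in the commands above. It assumes the pages are UTF-8, and pages are joined in file-name order.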

"-h" returns a bewildering number of options.

Alternatively, what about first converting the HTML files into a simpler format (Markdown?) before joining them into a single file and calling an HTML to EPUB converter?
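
Here is roughly what I mean by the Markdown route, as a small Python sketch that drives pandoc one page at a time. As far as I know, "-f html -t markdown" and passing several input files to one pandoc call are standard pandoc usage; the function names markdown_command, epub_command and convert are just mine. I haven't measured whether this is any lighter on the machine than converting one big file.

```python
import subprocess
from pathlib import Path


def markdown_command(src):
    """Build the pandoc call that turns one HTML page into Markdown."""
    src = Path(src)
    return ["pandoc", "-f", "html", "-t", "markdown",
            "-o", str(src.with_suffix(".md")), str(src)]


def epub_command(md_files, out="full.epub"):
    """Build the pandoc call that joins the Markdown files into one EPUB."""
    return ["pandoc", "-o", out, *(str(f) for f in md_files)]


def convert(directory="."):
    """Convert each page separately, then build the EPUB (needs pandoc on PATH)."""
    pages = sorted(Path(directory).glob("*.html"))
    for page in pages:
        subprocess.run(markdown_command(page), check=True)
    markdown_files = [p.with_suffix(".md") for p in pages]
    subprocess.run(epub_command(markdown_files), check=True)
```

The intermediate .md files stay on disk, so they can be inspected or hand-edited before the final EPUB step.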

Thank you.

Last edited by Shohreh; 05-16-2020 at 06:48 AM.