06-16-2019, 05:30 PM   #2
skb
Evangelist
Device: Ipod G4, MacOS 10.12, Calibre, Pocketbook Touch HD 3
I've done this to a lesser degree.

In theory, you could download HTML files from a site (using something like SiteSucker).

However, depending on the site, there's usually LOTS going on - ads etc. And a lot of (most?) sites these days aren't static HTML but are generated by a CMS (like this site).

Anyway, if the HTML is vanilla enough, you could download the site. Then I would create a new library* (especially if there are gazillions of pages) and import the files into that blank library. Then convert them into EPUBs. Then, using the EpubMerge plugin, merge them. Once you've got a merged ebook, you can move it into your "normal" library (if you wish).

Having said all that, there is lots that can go wrong. I would convert one HTML file first, view it, and check that it's actually readable.
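
For what it's worth, the convert step could probably be scripted with Calibre's command-line tool ebook-convert rather than done by hand. A rough sketch in Python, assuming ebook-convert is on your PATH and the downloaded site sits in one folder; the function names and folder names are made up for illustration, and it only covers conversion, not the import or merge steps:

```python
import shutil
import subprocess
from pathlib import Path


def build_convert_commands(src_dir, out_dir):
    """Return one ebook-convert command (as an argv list) per HTML file under src_dir."""
    src_dir, out_dir = Path(src_dir), Path(out_dir)
    commands = []
    for html in sorted(src_dir.rglob("*.htm*")):
        # Fold the relative path into the file name so that same-named
        # pages in different folders don't overwrite each other.
        rel = html.relative_to(src_dir).with_suffix(".epub")
        target = out_dir / "_".join(rel.parts)
        commands.append(["ebook-convert", str(html), str(target)])
    return commands


def convert_all(src_dir, out_dir, limit=None):
    """Run the conversions. Pass limit=1 to try a single page first."""
    if shutil.which("ebook-convert") is None:
        raise RuntimeError("ebook-convert not found on PATH")
    Path(out_dir).mkdir(parents=True, exist_ok=True)
    for cmd in build_convert_commands(src_dir, out_dir)[:limit]:
        subprocess.run(cmd, check=True)
```

Something like convert_all("downloaded_site", "epubs", limit=1) would convert just one page, so you can open the result and check it's readable before doing the whole lot.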

To be honest, I post-process any HTML I import into Calibre: to remove menus, ads, images, formatting, styles etc etc. So, I try not to do it often.
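
That kind of clean-up can be roughed out in Python with the standard library's html.parser. This is only a sketch of the idea, not what I actually do: the lists of tags and attributes to drop are guesses and will need adjusting per site, and it throws away comments too.

```python
from html import escape
from html.parser import HTMLParser

# Assumed lists; adjust for the site in question.
SKIP_TAGS = {"script", "style", "nav", "aside", "iframe", "noscript"}  # dropped with contents
DROP_TAGS = {"img", "link", "meta"}  # dropped on their own
DROP_ATTRS = {"style", "class", "id"}


class Cleaner(HTMLParser):
    def __init__(self):
        super().__init__()  # convert_charrefs=True: text arrives unescaped
        self.out = []
        self.skip_depth = 0  # > 0 while inside a SKIP_TAGS element

    def _attrs(self, attrs):
        # Keep attributes except styling hooks and on* event handlers.
        return "".join(
            f" {k}" if v is None else f' {k}="{escape(v, quote=True)}"'
            for k, v in attrs
            if k not in DROP_ATTRS and not k.startswith("on")
        )

    def handle_starttag(self, tag, attrs):
        if tag in SKIP_TAGS:
            self.skip_depth += 1
        elif not self.skip_depth and tag not in DROP_TAGS:
            self.out.append(f"<{tag}{self._attrs(attrs)}>")

    def handle_startendtag(self, tag, attrs):
        # Self-closing form such as <br/>; never opens a skipped region.
        if not self.skip_depth and tag not in SKIP_TAGS | DROP_TAGS:
            self.out.append(f"<{tag}{self._attrs(attrs)} />")

    def handle_endtag(self, tag):
        if tag in SKIP_TAGS:
            self.skip_depth = max(0, self.skip_depth - 1)
        elif not self.skip_depth and tag not in DROP_TAGS:
            self.out.append(f"</{tag}>")

    def handle_data(self, data):
        if not self.skip_depth:
            self.out.append(escape(data, quote=False))

    def handle_decl(self, decl):
        self.out.append(f"<!{decl}>")  # keep the doctype


def clean_html(text):
    parser = Cleaner()
    parser.feed(text)
    parser.close()  # flush any buffered text
    return "".join(parser.out)
```

One thing to watch: if a page has an unclosed nav or aside tag, everything after it gets dropped, so eyeball the output before converting.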

That's how I'd do it - there may well be a scripted or easier way but my programming skills are waaaaay out of date.

Good luck!

* I create a new/temp library because I don't want to miss/overlook a file etc and it's a way of quarantining. I usually delete my temp library after this sort of thing. Your mileage may vary.