MobileRead Forums - View Single Post - Quick and easy way to turn a website into a book?

skb · 06-16-2019, 06:30 PM

I've done this to a lesser degree.

In theory, you could download HTML files from a site (using something like SiteSucker).

However, depending on the site, there's usually LOTS going on - ads etc. And a lot of (most?) sites these days aren't static HTML but rather generated with a CMS (like this site).

Anyway, if the HTML is vanilla enough, you could download the site. Then, I would create a new library* (especially if there's gazillions of pages) and import the files into the blank library. Then convert them into epubs. Then, using the EpubMerge plugin, merge them. Once you've got a merged ebook, you can move it into your "normal" library (if you wish).

Having said all that, there is a lot that can go wrong. I would convert one HTML file first, view it, and check that it's actually readable.
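
For anyone who would rather script that single-file test, here is a minimal sketch in Python. It assumes Calibre's ebook-convert command-line tool is installed and on your PATH (on macOS it ships inside the Calibre app bundle, so it may need adding). The function names and the idea of writing the EPUB next to the HTML file are my own choices for illustration, not anything Calibre requires.

```python
import shutil
import subprocess
from pathlib import Path


def build_convert_command(html_path):
    """Return the ebook-convert command line for one HTML file."""
    src = Path(html_path)
    dest = src.with_suffix(".epub")  # e.g. page.html -> page.epub, same folder
    return ["ebook-convert", str(src), str(dest)]


def convert_one(html_path):
    """Convert a single HTML file to EPUB and return the output path."""
    if shutil.which("ebook-convert") is None:
        raise RuntimeError("ebook-convert not found; is Calibre installed and on PATH?")
    cmd = build_convert_command(html_path)
    subprocess.run(cmd, check=True)  # raises CalledProcessError if conversion fails
    return Path(cmd[2])
```

Calling convert_one("downloaded-site/index.html") (a made-up path) should leave an index.epub beside the original, which you can then open in Calibre's viewer to see whether the result is readable before committing to the whole site.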

To be honest, I post-process any HTML I import into Calibre to remove menus, ads, images, formatting, styles etc. It's tedious, so I try not to do it often.
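
That kind of clean-up can be partly automated. The following is a rough sketch using only Python's standard library, not a description of how the clean-up above is actually done. The lists of tags and attributes to throw away are guesses that will need adjusting per site, and the names (ClutterStripper, strip_clutter) are hypothetical.

```python
from html import escape
from html.parser import HTMLParser

# Assumed clutter lists; tune these for the site you downloaded.
SKIP_TAGS = {"script", "style", "nav", "aside", "footer", "iframe", "form"}
DROP_VOID = {"img", "link", "meta"}      # dropped outright, they have no content
DROP_ATTRS = {"style", "class", "id"}    # inline styling and styling hooks


class ClutterStripper(HTMLParser):
    """Re-emit HTML while leaving out clutter tags and styling attributes."""

    def __init__(self):
        super().__init__()  # convert_charrefs=True: text arrives unescaped
        self.out = []
        self.skip_depth = 0  # > 0 while inside a SKIP_TAGS element

    def _emit_open(self, tag, attrs, closer=""):
        kept = ""
        for name, value in attrs:
            if name in DROP_ATTRS:
                continue
            kept += f" {name}" if value is None else f' {name}="{escape(value)}"'
        self.out.append(f"<{tag}{kept}{closer}>")

    def handle_starttag(self, tag, attrs):
        if tag in DROP_VOID:
            return
        if tag in SKIP_TAGS:
            self.skip_depth += 1
        elif not self.skip_depth:
            self._emit_open(tag, attrs)

    def handle_startendtag(self, tag, attrs):
        # Self-closing form such as <br/>; keep it unless it is clutter.
        if tag not in DROP_VOID and tag not in SKIP_TAGS and not self.skip_depth:
            self._emit_open(tag, attrs, closer="/")

    def handle_endtag(self, tag):
        if tag in DROP_VOID:
            return
        if tag in SKIP_TAGS:
            self.skip_depth = max(0, self.skip_depth - 1)
        elif not self.skip_depth:
            self.out.append(f"</{tag}>")

    def handle_data(self, data):
        if not self.skip_depth:
            self.out.append(escape(data, quote=False))  # re-escape &, <, >


def strip_clutter(html_text):
    """Return html_text without scripts, styles, menus, images and styling."""
    parser = ClutterStripper()
    parser.feed(html_text)
    parser.close()  # flush any buffered text
    return "".join(parser.out)
```

Comments and the doctype are dropped as a side effect, and a clutter tag that is never closed will swallow everything after it, so it is still worth eyeballing one cleaned page before running this over a whole download.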

That's how I'd do it - there may well be a scripted or easier way but my programming skills are waaaaay out of date.

Good luck!

* I create a new/temp library because I don't want to miss/overlook a file etc., and it's a way of quarantining. I usually delete my temp library after this sort of thing. Your mileage may vary.
