Old 10-11-2024, 10:22 AM   #14
Shohreh
Thanks much. rePub in Chrome is perfect.

FWIW, the following wget command comes pretty close to downloading a web page and its resources, but the URLs still need to be post-edited to remove the query string appended to picture filenames (e.g. .jpeg becomes .jpeg?blah, causing errors):

Code:
wget --restrict-file-names=ascii,windows --convert-links  --random-wait -U mozilla -e robots=off --span-hosts --domains=acme.com,cdn.acme.com --page-requisites --no-parent --directory-prefix=.\mydir https://acme.com/2024/09/22/blah.html
---
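The post-editing mentioned above could probably be scripted. Below is a minimal Python sketch, not something I've run against the actual site: it renames downloaded pictures so the filename ends at the extension, and strips the same suffix from references in the saved HTML. The helper names (strip_suffix, clean_tree) and the list of extensions are my own assumptions. It also assumes that, with --restrict-file-names=windows, wget writes "@" instead of "?" in local filenames, so both forms (and their percent-encoded versions) are handled. Note that it also trims the query string from any picture URL still pointing to the remote site.

```python
import re
from pathlib import Path

# Assumed list of picture extensions; extend as needed.
IMAGE_EXTS = ("jpg", "jpeg", "png", "gif", "webp", "svg")

# A picture extension followed by "?" or "@" (raw or percent-encoded)
# and whatever follows, up to whitespace, a quote, an angle bracket or ")".
_SUFFIX = re.compile(
    r"(\.(?:" + "|".join(IMAGE_EXTS) + r"))(?:\?|@|%3F|%40)[^\s<>)\x22\x27]*",
    re.IGNORECASE,
)


def strip_suffix(text):
    """Return text with '.jpeg?blah' style suffixes cut back to '.jpeg'."""
    return _SUFFIX.sub(r"\1", text)


def clean_tree(root):
    """Rename suffixed picture files under root and fix references in HTML.

    Returns the list of new paths for the files that were renamed.
    """
    root = Path(root)
    renamed = []
    # Take a snapshot of the files first, since we rename while looping.
    for path in sorted(p for p in root.rglob("*") if p.is_file()):
        new_name = strip_suffix(path.name)
        if new_name != path.name:
            target = path.with_name(new_name)
            if not target.exists():  # never overwrite an existing file
                path.rename(target)
                renamed.append(target)
    pages = list(root.rglob("*.html")) + list(root.rglob("*.htm"))
    for page in pages:
        text = page.read_text(encoding="utf-8", errors="replace")
        fixed = strip_suffix(text)
        if fixed != text:
            page.write_text(fixed, encoding="utf-8")
    return renamed
```

Usage would be something like clean_tree("mydir") after wget finishes. Work on a copy of the folder first, since files are renamed in place.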
Edit: I also noticed that the URLs of some pictures were not converted to point to local files, so those pictures won't be displayed in the EPUB. Also, SumatraPDF didn't like a useless <div> section in the EPUB created by Pandoc ("Couldn't render the page"; I didn't check whether it worked on the e-reader). Bottom line: try one of the browser extensions first, before trying pandoc (or wget + Sigil/Calibre).
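To spot the pictures whose URLs were not converted before building the EPUB, a quick check along these lines might help. It is only a sketch using Python's standard html.parser; the remote_images name is made up for illustration, and it only looks at the src attribute of <img> tags (not srcset or CSS backgrounds).

```python
from html.parser import HTMLParser


class _ImgCollector(HTMLParser):
    """Collect the src attribute of every <img> tag."""

    def __init__(self):
        super().__init__()
        self.sources = []

    def handle_starttag(self, tag, attrs):
        if tag == "img":
            for name, value in attrs:
                if name == "src" and value:
                    self.sources.append(value)


def remote_images(html_text):
    """Return the <img> sources that still point to a remote host."""
    parser = _ImgCollector()
    parser.feed(html_text)
    remote_prefixes = ("http://", "https://", "//")
    return [src for src in parser.sources if src.startswith(remote_prefixes)]
```

Feeding it the text of the saved page returns the list of picture URLs that would still be fetched from the web; an empty list means every <img> points to a local file.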

Last edited by Shohreh; 10-12-2024 at 03:16 AM.