05-31-2022, 04:47 AM   #1
Shohreh
[pandoc] How to locate HTML file from wget?

Hello,

I like to read long web pages on my e-reader.

I use the following commands to create an EPUB file:

Code:
# Download the page plus everything needed to read it offline
wget -E -H -k -K -p -e robots=off https://www.acme.com/blah.html

# Optional: convert to UTF-8 if the page is encoded as ISO-8859-1
iconv -f iso-8859-1 -t utf-8 blah.html > blah.UTF8.html

# Build the EPUB (use blah.html instead if the iconv step was skipped)
pandoc -t epub2 -o blah.epub blah.UTF8.html

The problem is finding where wget saves the main HTML file and what it names it: the file ends up somewhere among the directories that hold the other files needed for offline reading. The goal is to automate the whole process with a batch file.
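
As a rough sketch of one possible starting point: the local path can be guessed from the URL itself. This assumes wget uses its default host/path directory layout, that the server does not redirect to a different URL, and that the URL has no query string. The function name local_path_for_url is hypothetical, not part of wget or pandoc.

```shell
#!/bin/sh
# Hypothetical helper: guess where wget stored the main page for a URL.
# Assumptions: default host/path layout, no redirect, no query string.
local_path_for_url() {
    path=${1#*://}            # drop the scheme, e.g. "https://"
    path=${path%%"#"*}        # drop any #fragment
    case $path in
        */)  path="${path}index.html" ;;    # directory URL: saved as index.html
        */*) ;;                             # host plus file name: keep as-is
        *)   path="${path}/index.html" ;;   # bare host name
    esac
    case $path in
        *.[Hh][Tt][Mm]|*.[Hh][Tt][Mm][Ll]) ;;   # already has an HTML suffix
        *) path="${path}.html" ;;               # -E appends .html to HTML pages
    esac
    printf '%s\n' "$path"
}

# Possible use in a script (left commented out; untested against real sites):
#   url=https://www.acme.com/blah.html
#   wget -E -H -k -K -p -e robots=off "$url"
#   pandoc -t epub2 -o blah.epub "$(local_path_for_url "$url")"
```

If the guess proves unreliable, for example because of redirects, the file name would have to be recovered some other way, such as from wget's own log output.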

Do you know of a solution?

Thank you.