06-30-2025, 08:46 AM   #1
Shohreh
Turning a web site into a PDF/EPUB file?

Hello,

There's a public wiki on the web, i.e. a bunch of web pages hyperlinked from index.html, that I'd like to turn into a PDF/EPUB file, with bookmarks as the icing on the cake.

Does anyone know of a desktop solution, for either Windows or Linux?

Some combination of wget and pandoc/Calibre?

Thank you.

---
Edit: One way is to get the list of URLs from the source page, reorder them if needed, then run a second script that loops through the list, downloads each page, and appends it to a single HTML page, which can then be turned into a PDF/EPUB file.

Code:
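import requests
from urllib.parse import urljoin
from bs4 import BeautifulSoup

url = 'https://wiki.acme.com'
response = requests.get(url, timeout=30)
response.raise_for_status()  # stop early on HTTP errors
soup = BeautifulSoup(response.text, 'html.parser')

# Print each link as an absolute URL; skip <a> tags that have no href
for link in soup.find_all('a', href=True):
    print(urljoin(url, link['href']))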
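
The second step described above (loop through the list, download each page, append it to a single HTML page) might look roughly like the sketch below. The names fetch_text, body_of and combine are made up for illustration, pages are assumed to be UTF-8, and a crude regex pulls out the body to keep the sketch dependency-free; BeautifulSoup would be more robust. Each page gets an h1 heading, on the assumption that the converter can use headings to build bookmarks.

```python
import html
import re
from urllib.request import urlopen

# Crude: grabs whatever sits between <body ...> and </body>
BODY = re.compile(r"<body\b[^>]*>(.*?)</body\s*>", re.I | re.S)


def fetch_text(url):
    """Download one page and decode it (UTF-8 is an assumption)."""
    with urlopen(url) as resp:
        return resp.read().decode("utf-8", errors="replace")


def body_of(page):
    """Return the content of <body>, or the whole page if there is none."""
    match = BODY.search(page)
    return match.group(1) if match else page


def combine(urls, fetch=fetch_text):
    """Append every page, in list order, into one HTML document."""
    parts = ["<html><head><meta charset='utf-8'></head><body>"]
    for url in urls:
        # One heading per page; the URL stands in for a real title
        parts.append(f"<h1>{html.escape(url)}</h1>")
        parts.append(body_of(fetch(url)))
    parts.append("</body></html>")
    return "\n".join(parts)
```

Writing the result of combine(urls) to input.html (opened with encoding="utf-8") would give a single file to feed to the converter. The fetch parameter is only there so the function can be tried without network access.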
---
Edit: Since Calibre seems unable to download a web page, wget is required, with some interim editing to 1) remove useless stuff and 2) include pictures.

Code:
wget -c -O input.html https://wiki.acme.com/somepage.html
"C:\Program Files\Calibre2\ebook-convert.exe" "input.html" "output.pdf" --enable-heuristics --authors "My author" --title "My title"
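
The interim editing could be scripted too. Below is a minimal sketch, not a tested recipe: clean_page and fetch_bytes are hypothetical names, the choice of script/style/nav as "useless stuff" is a guess, and the regexes are crude. It drops those blocks, saves each picture into a local folder, and rewrites the img tags to point at the local copies, assuming the converter picks up pictures referenced relative to input.html.

```python
import os
import re
from urllib.parse import urljoin, urlparse
from urllib.request import urlopen

# Crude regexes, good enough for a sketch; a real HTML parser is more robust
UNWANTED = re.compile(r"<(script|style|nav)\b[^>]*>.*?</\1\s*>", re.I | re.S)
IMG_SRC = re.compile(r"(<img\b[^>]*?\bsrc\s*=\s*)([\"'])(.*?)\2", re.I | re.S)


def fetch_bytes(url):
    """Download one resource as raw bytes."""
    with urlopen(url) as resp:
        return resp.read()


def clean_page(page, base_url, img_dir="images", fetch=fetch_bytes):
    """Strip unwanted blocks and make every picture a local file."""
    page = UNWANTED.sub("", page)
    os.makedirs(img_dir, exist_ok=True)
    counter = 0

    def localize(match):
        nonlocal counter
        counter += 1
        # Resolve relative picture paths against the page's own URL
        absolute = urljoin(base_url, match.group(3))
        base = os.path.basename(urlparse(absolute).path) or "image"
        name = f"{counter:03d}_{base}"  # numeric prefix avoids name clashes
        with open(os.path.join(img_dir, name), "wb") as handle:
            handle.write(fetch(absolute))
        quote = match.group(2)
        return f"{match.group(1)}{quote}{img_dir}/{name}{quote}"

    return IMG_SRC.sub(localize, page)
```

The idea would be to read input.html, pass its text and the original page URL to clean_page, and write the result back before running ebook-convert from the same folder, so that the relative images/ paths resolve.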

Last edited by Shohreh; 07-01-2025 at 12:58 AM.