#1 |
Addict
Posts: 206
Karma: 304158
Join Date: Jan 2016
Location: France
Device: none
|
Hello,
There's a public wiki on the web, i.e. a bunch of web pages hyperlinked from index.html, that I'd like to turn into a PDF/EPUB file, with bookmarks as the cherry on top. Does anyone know of a desktop solution, either Windows or Linux? Some combination of wget and pandoc/Calibre? Thank you.

---

Edit: One way is to get the list of URLs within the source page, order them if needed, and run a second script to loop through that list, download each page, and append it to a single HTML page before turning it into a PDF/EPUB file.

Code:
    import requests
    from bs4 import BeautifulSoup

    # Fetch the wiki's index page
    url = 'https://wiki.acme.com'
    response = requests.get(url)
    soup = BeautifulSoup(response.text, 'html.parser')

    # Print the target of every hyperlink found on that page
    links = soup.find_all('a')
    for link in links:
        print(link.get('href'))

Edit: Since Calibre seems unable to download a web page, wget is required… with some interim editing to 1) remove useless stuff and 2) include pictures.

Code:
    wget -c -O input.html https://wiki.acme.com/somepage.html
    "C:\Program Files\Calibre2\ebook-convert.exe" "input.html" "output.pdf" --enable-heuristics --authors "My author" --title "My title"

Last edited by Shohreh; 07-01-2025 at 12:58 AM. |
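
The two-step idea from the first edit (collect the links from the index page, then download each page and append it to a single HTML file) might look roughly like the sketch below. It is only an illustration under a few assumptions: it uses the Python standard library instead of requests/BeautifulSoup so it runs without extra packages, the function names (`extract_links`, `merge_pages`, `build_single_page`) are made up for this example, `https://wiki.acme.com` is the placeholder address from the post, and pulling out the `<body>` with a regular expression is a shortcut that may not cope with every page. It does not rewrite image links, so the "include pictures" step still has to be handled separately.

```python
import re
from html import escape
from html.parser import HTMLParser
from urllib.parse import urldefrag, urljoin, urlparse
from urllib.request import urlopen


class LinkCollector(HTMLParser):
    """Collects the href of every <a> tag, in document order."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.hrefs.append(href)


def extract_links(index_html, base_url):
    """Return absolute, de-duplicated, same-host URLs found in index_html."""
    collector = LinkCollector()
    collector.feed(index_html)
    host = urlparse(base_url).netloc
    seen, urls = set(), []
    for href in collector.hrefs:
        # Resolve relative links and drop any "#fragment" part
        absolute, _fragment = urldefrag(urljoin(base_url, href))
        if urlparse(absolute).netloc == host and absolute not in seen:
            seen.add(absolute)
            urls.append(absolute)
    return urls


def body_of(page_html):
    """Return what sits between <body> and </body>, or the whole text if absent."""
    match = re.search(r"<body[^>]*>(.*?)</body>", page_html, re.S | re.I)
    return match.group(1) if match else page_html


def merge_pages(pages):
    """pages: list of (title, html) pairs. Returns one combined HTML document."""
    parts = ["<html><head><meta charset='utf-8'></head><body>"]
    for title, page_html in pages:
        # One heading per source page, so each page has a visible starting point
        parts.append(f"<h1>{escape(title)}</h1>")
        parts.append(body_of(page_html))
    parts.append("</body></html>")
    return "\n".join(parts)


def fetch(url):
    """Download a page and decode it as UTF-8 (an assumption about the wiki)."""
    with urlopen(url, timeout=30) as response:
        return response.read().decode("utf-8", errors="replace")


def build_single_page(index_url, out_path="input.html"):
    """Download every page linked from index_url and write them as one file."""
    urls = extract_links(fetch(index_url), index_url)
    pages = [(url, fetch(url)) for url in urls]  # the URL doubles as the title
    with open(out_path, "w", encoding="utf-8") as f:
        f.write(merge_pages(pages))
```

Calling `build_single_page("https://wiki.acme.com/index.html")` would write `input.html`, which could then be passed to the `ebook-convert` command shown above. Reordering or trimming the list returned by `extract_links` before downloading is where the "order them if needed" step would fit.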
#2 |
Grand Sorcerer
Posts: 5,725
Karma: 24031401
Join Date: Dec 2010
Device: Kindle PW2
|
|
#3 |
Grand Sorcerer
Posts: 13,471
Karma: 78880114
Join Date: Nov 2007
Location: Toronto
Device: Libra H2O, Libra Colour
|
There is a wiki plugin for calibre. See https://www.mobileread.com/forums/sh...d.php?t=183333
|
#4 |
Bibliophagist
Posts: 45,928
Karma: 168959602
Join Date: Jul 2010
Location: Vancouver
Device: Kobo Sage, Libra Colour, Lenovo M8 FHD, Paperwhite 4, Tolino epos
|
There is also the WebToEpub extension. For my uses, I find that it does a better job than dotEPUB or save-as-ebook. The extension is available for Firefox and Chrome/Edge.
|