03-12-2018, 03:57 PM | #1 |
Connoisseur
Posts: 78
Karma: 1332336
Join Date: Mar 2011
Location: montana
Device: none
|
Converting web site to epub
Websites generated dynamically from a database (WordPress or any similar system) can be made to emit a series of HTML fragment files, one per page. Each fragment lacks the <HTML>, <HEAD>, and <BODY> elements but usually retains all other HTML markup.
For each fragment file, a hacker can use bash, sed, perl, awk, or python to do custom things to selected markup, or to do trickier things such as converting every single newline to a space while leaving every run of two consecutive newlines in place. At that point you have a text file that can be manually cut and pasted into Sigil, or copied into OEBPS/Text. If the files are copied into OEBPS/Text, a manual zip -r my.epub . makes a file that can be loaded into Sigil. Now you have transferred a website into a first draft of an ebook inside Sigil. There will still be a lot of work to do once it is there, but a LOT of the work has already been done semi-automatically.
I know this can be done because I have just done it. But my work is clunky, hard-coded in too many places, and a bit error-prone. Are there any well-written utilities out there already for doing this, ones that might be more flexible and perhaps less buggy than my quick take? |
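For anyone trying the same route, here is a minimal Python sketch of the steps described above. It is not my actual script. The function names, the XHTML skeleton, and the directory layout are assumptions made up for illustration. It also assumes Unix line endings, and that the book directory already contains a mimetype file, META-INF/container.xml, and an OPF file, which an EPUB needs in addition to the pages in OEBPS/Text. Unlike a plain zip -r, the packing step stores mimetype first and uncompressed, which is what the EPUB container format asks for.

```python
import html
import re
import zipfile
from pathlib import Path

# Hypothetical wrapper: fragments have no <html>/<head>/<body>, so add them.
XHTML_TEMPLATE = """<?xml version="1.0" encoding="utf-8"?>
<html xmlns="http://www.w3.org/1999/xhtml">
<head><title>{title}</title></head>
<body>
{body}
</body>
</html>
"""


def collapse_single_newlines(text):
    """Turn each lone newline into a space; leave runs of two or more alone."""
    return re.sub(r"(?<!\n)\n(?!\n)", " ", text)


def wrap_fragment(fragment, title):
    """Wrap an HTML fragment in a minimal XHTML page (title is escaped)."""
    return XHTML_TEMPLATE.format(title=html.escape(title), body=fragment)


def pack_epub(book_dir, epub_path):
    """Zip book_dir into epub_path with 'mimetype' first and uncompressed."""
    book_dir = Path(book_dir)
    epub_path = Path(epub_path)
    with zipfile.ZipFile(epub_path, "w") as zf:
        # The mimetype entry must come first and must not be compressed.
        zf.write(book_dir / "mimetype", "mimetype",
                 compress_type=zipfile.ZIP_STORED)
        for path in sorted(book_dir.rglob("*")):
            if not path.is_file():
                continue
            rel = path.relative_to(book_dir).as_posix()
            # Skip the mimetype already written and the archive itself.
            if rel == "mimetype" or path.resolve() == epub_path.resolve():
                continue
            zf.write(path, rel, compress_type=zipfile.ZIP_DEFLATED)
```

Typical use would be to read each fragment, pass it through collapse_single_newlines and wrap_fragment, write the result to OEBPS/Text, and then call pack_epub on the book directory. The fragments are not checked for well-formed XHTML, so Sigil may still complain about some pages.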
03-12-2018, 04:04 PM | #2 |
Connoisseur
Posts: 78
Karma: 1332336
Join Date: Mar 2011
Location: montana
Device: none
|
Ah. Calibre.
I'll look into Calibre. |