02-26-2014, 09:33 AM | #1 |
Junior Member
Posts: 2
Karma: 10
Join Date: May 2013
Device: Kobo Touch
|
How to script conversion of HTML, not RSS?
Hi all -- apologies if this is answered somewhere else, but I haven't been able to find anything that seems to do what I want.
I'm interested in using Calibre recipes to convert HTML to epub, ideally from the command line (I'm a Linux weenie from way back). From what I've seen by digging around in the API documents and the recipes, this seems quite different from the usual approach of pointing Calibre at an RSS feed. Often I come across an article on a website that I'd like to read later on my Kobo, so I'd like to have some way of saying "Go fetch this URL". So far I've been scripting this using wget for the downloading, then some truly awful sed scripts to get the relevant bits of HTML, and finally passing the result to "ebook-convert". Of course, this would be a whole lot easier with a recipe, which would handle parsing, removing cruft, and all of that. Like I said, it seems that Calibre is (as far as news feeds go) quite oriented around RSS. Is there a way around this, or something I may have missed? Any pointers would be gratefully received.
02-26-2014, 10:15 AM | #2 |
creator of calibre
Posts: 44,333
Karma: 23661992
Join Date: Oct 2006
Location: Mumbai, India
Device: Various
|
Many calibre recipes parse HTML files to get a list of articles, rather than using RSS. See the parse_index() method. Implement it to return a list of articles that contains only a single article, pointing to your HTML page.
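A minimal sketch of that approach, for the "go fetch this URL" use case: a small Python script that writes a one-article recipe to a temporary file and hands it to ebook-convert. The names SinglePage, build_recipe, fetch and single_page.recipe are made up for illustration, and the script assumes ebook-convert is on your PATH. parse_index() returns a list of (section title, list of articles) tuples, where each article is a dict with at least 'title' and 'url'. auto_cleanup is optional and may not strip every page well, so treat this as a starting point rather than a finished tool.

```python
import subprocess
import tempfile
from pathlib import Path

# Source of a one-article recipe. The class name SinglePage is hypothetical;
# the doubled braces are literal braces escaped for str.format().
RECIPE_TEMPLATE = '''\
from calibre.web.feeds.news import BasicNewsRecipe


class SinglePage(BasicNewsRecipe):
    title = {title!r}
    auto_cleanup = True  # ask calibre to try stripping navigation and other cruft

    def parse_index(self):
        # One section holding one article that points at the HTML page.
        article = {{'title': {title!r}, 'url': {url!r}}}
        return [({title!r}, [article])]
'''


def build_recipe(url, title):
    """Return the recipe source for a single HTML page."""
    return RECIPE_TEMPLATE.format(url=url, title=title)


def fetch(url, title, output):
    """Write a temporary .recipe file and convert it with ebook-convert."""
    with tempfile.TemporaryDirectory() as tmp:
        recipe = Path(tmp) / 'single_page.recipe'
        recipe.write_text(build_recipe(url, title), encoding='utf-8')
        # Assumes the ebook-convert executable is installed and on PATH.
        subprocess.run(['ebook-convert', str(recipe), output], check=True)
```

Calling fetch('https://example.com/article.html', 'Some article', 'article.epub') would then download and convert that one page; the URL and file names here are placeholders. If auto_cleanup leaves too much cruft for a particular site, a site-specific recipe with its own cleanup rules is probably the better route.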
|
02-28-2014, 09:54 AM | #3 |
Junior Member
Posts: 2
Karma: 10
Join Date: May 2013
Device: Kobo Touch
|
I'll check that out. Many thanks!
|