#1 |
Zealot
Posts: 131
Karma: 150390
Join Date: Nov 2011
Location: Pacific NorthWest
Device: Kindle Fire
|
Combine HTML and RSS?
The website behind one of my recipes has moved to more active content... but it also provides RSS for some of that content. I'd like my recipe to use my HTML parsing (going through the soup for the relevant bits) for some sections, and use feeds for others.
I have both the retrieved feeds and my parse_index(), which returns an article list of URLs, not content. How can I get the content from the RSS while keeping the scraping portion of the parse_index() process? Thanks. |
#2 |
creator of calibre
Posts: 45,195
Karma: 27110894
Join Date: Oct 2006
Location: Mumbai, India
Device: Various
|
The simplest way is to split your recipe into two recipes. A more complex option would be to override the build_index() method from BasicNewsRecipe to do what you want.
|
#3 |
Zealot
Posts: 131
Karma: 150390
Join Date: Nov 2011
Location: Pacific NorthWest
Device: Kindle Fire
|
Thanks. Yes, I thought of splitting it into two, but it really is a cohesive site, apart from their opaque web code. I'll look at build_index(). Thanks.
|
#4 |
Zealot
Posts: 131
Karma: 150390
Join Date: Nov 2011
Location: Pacific NorthWest
Device: Kindle Fire
|
Yeppers, overriding build_index() did it. "Simple" is clearly a matter of perspective though!
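The thread does not include the final recipe, so the following is only a minimal sketch of one piece such an override might need: merging scraped articles and feed articles into a single index. It assumes both sources have already been brought into the `(section_title, [article_dict, ...])` shape that parse_index() returns. `merge_sections` and the example URLs are hypothetical names for illustration, not part of calibre; converting calibre's parsed feeds into this shape and wiring the result into build_index() are not shown here.

```python
from collections import OrderedDict


def merge_sections(scraped, from_feeds):
    """Merge two lists of (section_title, [article_dict, ...]) tuples.

    Sections that share a title are combined, keeping first-seen order.
    Within a section, an article is skipped if its 'url' was already added.
    """
    merged = OrderedDict()   # section title -> list of article dicts
    seen_urls = {}           # section title -> set of URLs already added

    for title, articles in list(scraped) + list(from_feeds):
        bucket = merged.setdefault(title, [])
        urls = seen_urls.setdefault(title, set())
        for article in articles:
            url = article.get('url')
            # Only de-duplicate articles that actually carry a URL.
            if url is not None:
                if url in urls:
                    continue
                urls.add(url)
            bucket.append(article)

    return list(merged.items())


if __name__ == '__main__':
    # Hypothetical data: one scraped section and two feed-derived sections.
    scraped = [('News', [{'title': 'A', 'url': 'http://example.com/a'}])]
    from_feeds = [
        ('News', [{'title': 'B', 'url': 'http://example.com/b'}]),
        ('Blog', [{'title': 'C', 'url': 'http://example.com/c'}]),
    ]
    for section, articles in merge_sections(scraped, from_feeds):
        print(section, [a['title'] for a in articles])
```

Scraped entries come first here, so when the same URL appears in both sources the scraped version wins; swapping the argument order would favour the feed version instead.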