#226 |
Guru
Posts: 800
Karma: 194644
Join Date: Dec 2007
Location: Argentina
Device: Kindle Voyage
|
This is my final version of the recipe; it looks OK in the ebook viewer:
Code:
from calibre.web.feeds.news import BasicNewsRecipe

class AdvancedUserRecipe1234144423(BasicNewsRecipe):
    title = u'Cincinnati Enquirer'
    oldest_article = 7
    language = _('English')
    __author__ = 'Joseph Kitzmiller'
    max_articles_per_feed = 100
    no_stylesheets = True
    use_embedded_content = False
    remove_javascript = True
    encoding = 'cp1252'
    extra_css = ' p {font-size: medium; font-weight: normal;} '

    # Keep only the div that holds the article itself
    keep_only_tags = [dict(name='div', attrs={'class':'padding'})]

    # Drop embedded media, tables, comments and the flex container
    remove_tags = [
        dict(name=['object','link','table','embed'])
       ,dict(name='div',attrs={'id':'pluckcomments'})
       ,dict(name='div',attrs={'class':'articleflex-container'})
    ]

    feeds = [(u'Cincinnati Enquirer', u'http://rss.cincinnati.com/apps/pbcs.dll/section?category=rssenq01&mime=xml')]

    def preprocess_html(self, soup):
        # Strip inline style and face attributes so extra_css controls the look
        for item in soup.findAll(style=True):
            del item['style']
        for item in soup.findAll(face=True):
            del item['face']
        return soup
#227 |
Member
Posts: 13
Karma: 10
Join Date: Feb 2009
Device: PRS-505
|
I am sure as well. I am happy with what I got working; it is not really a big deal to go through the Sony library. You have been a tremendous help, kiklop! Perhaps if I had hundreds of feeds it would be a pain, but luckily I am only having issues with the one feed.
|
#229 |
Ebook Addict
Posts: 225
Karma: 2136
Join Date: Jul 2003
Location: Appleton, Wisconsin, USA
Device: Onyx BOOX Note Air 4C, Palma
|
I'm really close on a custom recipe for my local paper. Here is the code I currently have:
Code:
from calibre.web.feeds.news import BasicNewsRecipe

class AdvancedUserRecipe1234841996(BasicNewsRecipe):
    title = u'Appleton Post Crescent'
    oldest_article = 7
    max_articles_per_feed = 100
    remove_javascript = True
    html2lrf_options = ['--ignore-tables']
    html2epub_options = 'linearize_tables = True'

    # Drop the article toolbar, keep only the headline and body text
    remove_tags = [dict(name='div', attrs={'class':'article-tools'})]
    keep_only_tags = [dict(name='div', attrs={'class':['article-headline', 'article-bodytext']})]

    feeds = [
        (u'Latest Headlines', u'http://www.postcrescent.com/apps/pbcs.dll/misc?URL=/templates/RSSlatest.pbs&mime=xml'),
        (u'Local News', u'http://www.postcrescent.com/apps/pbcs.dll/misc?URL=/templates/RSSlocal.pbs&mime=xml'),
        (u'Sports', u'http://www.postcrescent.com/apps/pbcs.dll/misc?URL=/templates/RSSsports.pbs&mime=xml')
    ]
Code:
<!--- OAS MACRO --->
Can anyone help with a brief bit of code that I can add to my recipe to remove this stubborn comment? Thanks
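One possible approach to the comment question, offered as a sketch and not as the answer given in this thread: a recipe's `preprocess_regexps` attribute takes (pattern, function) pairs, so a pattern that matches HTML comments could be used to blank them out. The block below runs outside calibre; `apply_regexps` and `sample` are hypothetical stand-ins for illustration only, and the sample markup is an assumption about what the page looks like.

```python
import re

# Matches any HTML comment, including the '<!--- OAS MACRO --->' form with
# extra dashes. DOTALL lets a comment span several lines.
HTML_COMMENT = re.compile(r'<!--.*?-->', re.DOTALL)

# Same (pattern, replacement function) shape that a recipe's
# preprocess_regexps attribute takes.
preprocess_regexps = [
    (HTML_COMMENT, lambda match: ''),
]


def apply_regexps(html, rules):
    """Hypothetical helper: apply each (pattern, function) pair in order,
    standing in for what happens to a downloaded page before parsing."""
    for pattern, func in rules:
        html = pattern.sub(func, html)
    return html


# Assumed sample markup, not taken from the actual site.
sample = '<div class="article-bodytext"><!--- OAS MACRO ---><p>Story text</p></div>'
cleaned = apply_regexps(sample, preprocess_regexps)
print(cleaned)
```

In a recipe, only the `import re` line and the `preprocess_regexps` list (placed inside the class) would be needed. Note that this pattern removes every comment on the page, not just the OAS one.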
#230 | |
creator of calibre
Posts: 45,396
Karma: 27756918
Join Date: Oct 2006
Location: Mumbai, India
Device: Various
|
Quote:
|
|
#231 | |
Connoisseur
Posts: 51
Karma: 10
Join Date: Dec 2008
Location: Germany
Device: SONY PRS-500
|
Commenting A Recipe
Quote:
Thanks for the Ars Technica recipe. Big request: would you mind commenting each segment of the source code so that I know what each one is doing? I think that would help me figure out how to solve similar problems in other recipes I've experimented with. I need to know which lines of code in your revised Ars Technica recipe fetch the rest of an article that is spread across two or more Web pages. Thanks in advance...
Xanthan Gum
|
#232 |
Ebook Addict
Posts: 225
Karma: 2136
Join Date: Jul 2003
Location: Appleton, Wisconsin, USA
Device: Onyx BOOX Note Air 4C, Palma
|
|
#233 | |
Guru
Posts: 800
Karma: 194644
Join Date: Dec 2007
Location: Argentina
Device: Kindle Voyage
|
Quote:
Would you mind pointing me to a specific story link that goes on two pages?
|
#234 | |
Connoisseur
Posts: 51
Karma: 10
Join Date: Dec 2008
Location: Germany
Device: SONY PRS-500
|
Ars Technica Article in two parts
Quote:
Here's an example. It's the second article that appears on the Ars Technica home page tonight: http://arstechnica.com/gaming/news/2...a-bad-idea.ars
Xanthan Gum
|
#235 | |
Connoisseur
Posts: 51
Karma: 10
Join Date: Dec 2008
Location: Germany
Device: SONY PRS-500
|
Ars Technica Not Fetching Entire Article
Quote:
It seems that the revised Ars Technica recipe is not fetching the second half of the article.
Xanthan
|
#236 |
Connoisseur
Posts: 51
Karma: 10
Join Date: Dec 2008
Location: Germany
Device: SONY PRS-500
|
|
#237 |
Guru
Posts: 800
Karma: 194644
Join Date: Dec 2007
Location: Argentina
Device: Kindle Voyage
|
As I already stated above, the recipe was never designed to fetch multi-page articles.
|
#238 |
Hyperreader
Posts: 130
Karma: 28678
Join Date: Feb 2009
Device: Current: Boox Leaf2 (broken) Past: H2O, Kindle PW1, DXG;Pocketbook 360
|
Physicsworld recipes
Code:
import re
from calibre.web.feeds.news import BasicNewsRecipe

class AdvancedUserRecipe1234495609(BasicNewsRecipe):
    title = u'Physicsworld'
    oldest_article = 7
    max_articles_per_feed = 100
    no_stylesheets = True
    use_embedded_content = False
    remove_javascript = True

    # Drop everything before the first <h1> and everything after the shareThis div
    remove_tags_before = dict(name='h1')
    remove_tags_after = [dict(name='div', attrs={'id':'shareThis'})]

    # Cut the raw HTML from the shareThis div through the end of the body
    preprocess_regexps = [
        (re.compile(r'<div id="shareThis">.*</body>', re.DOTALL|re.IGNORECASE),
         lambda match: '</body>'),
    ]

    feeds = [
        (u'Headlines News', u'http://feeds.feedburner.com/PhysicsWorldNews')
    ]
#239 | |
creator of calibre
Posts: 45,396
Karma: 27756918
Join Date: Oct 2006
Location: Mumbai, India
Device: Various
|
Quote:
|
#240 |
Guru
Posts: 800
Karma: 194644
Join Date: Dec 2007
Location: Argentina
Device: Kindle Voyage
|
Updated Ars Technica recipe with multi-page news support
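The recipe attached to this post is not reproduced in the thread, so the following is only a general sketch of how multi-page support tends to work, not the actual Ars Technica code: fetch a page, look for a "next page" link, and keep following it until there is none. Everything here is hypothetical and runs outside calibre: `collect_pages`, `find_next`, the `NEXT_LINK` markup pattern, and the `fake_site` dictionary are all assumptions for illustration. In a real recipe the fetching would more likely go through `self.index_to_soup(url)`, with the extra pages appended to the article inside `preprocess_html`.

```python
import re

# Assumed markup for a "next page" link; the real site's HTML may differ.
NEXT_LINK = re.compile(r'<a class="next" href="([^"]+)">')


def find_next(text):
    """Return the URL of the next page, or None on the last page."""
    match = NEXT_LINK.search(text)
    return match.group(1) if match else None


def collect_pages(first_url, fetch, find_next, max_pages=10):
    """Follow next-page links from first_url and return page texts in order.

    fetch(url) returns the page text; find_next(text) returns the next URL
    or None. The seen set and max_pages guard against link loops.
    """
    pages = []
    seen = set()
    url = first_url
    while url and url not in seen and len(pages) < max_pages:
        seen.add(url)
        text = fetch(url)
        pages.append(text)
        url = find_next(text)
    return pages


# Stand-in for the network: a two-page article held in a dictionary.
fake_site = {
    'page1': '<p>First half</p><a class="next" href="page2">Next</a>',
    'page2': '<p>Second half</p>',
}

pages = collect_pages('page1', fake_site.__getitem__, find_next)
article = '\n'.join(pages)
print(article)
```

Passing `fetch` and `find_next` in as arguments keeps the loop independent of any one site's markup, so only `find_next` would need to change per recipe.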
|