#766 | |
Guru
Posts: 800
Karma: 194644
Join Date: Dec 2007
Location: Argentina
Device: Kindle Voyage
|
Quote:
|
|
#767 |
Junior Member
Posts: 4
Karma: 10
Join Date: Sep 2009
Device: kindle
|
|
#768 |
Member
Posts: 13
Karma: 10
Join Date: Sep 2009
Device: amazonkindle
|
Custom Recipes
Hi Kiklop74
Thanks once again. When I run the recipe, I notice that every article heading is repeated, as in the samples below.
Sample 1:
Ron James's big tent of comedy
TheStar.com - Television - Ron James's big tent of comedy
Sample 2:
Skates iced for love of dance
TheStar.com - Television - Skates iced for love of dance
#769 | |
Guru
Posts: 800
Karma: 194644
Join Date: Dec 2007
Location: Argentina
Device: Kindle Voyage
|
Quote:
|
|
#770 | |
Guru
Posts: 800
Karma: 194644
Join Date: Dec 2007
Location: Argentina
Device: Kindle Voyage
|
Quote:
|
|
#771 |
Member
Posts: 13
Karma: 10
Join Date: Sep 2009
Device: amazonkindle
|
New Request - Custom Recipe
Can you create one for The Toronto Sun?
I did try, but the downloaded file is really large and takes about 9 to 12 minutes to download, so I must be doing something wrong. Can you create a new recipe for The Toronto Sun? Thanks
#772 |
Enthusiast
Posts: 30
Karma: 16
Join Date: Sep 2009
Device: sony prs-505/600
|
Hi again,
I was wondering if anyone had any suggestions for my issue with the New York Times Magazine? I tried the recipe but it didn't return any of the sub-articles, only their headers.
Also, I've heard discussion about using Firefox to determine the sections to remove from returned feeds. Can someone please elaborate a little more about how to do this?
Thanks again for your help!
- Mike
#773 |
Guru
Posts: 800
Karma: 194644
Join Date: Dec 2007
Location: Argentina
Device: Kindle Voyage
|
That is because of the famous anti-scraping protection they employ. Everything related to NYT is a pain.
|
#774 |
Junior Member
Posts: 4
Karma: 10
Join Date: Sep 2009
Device: kindle
|
Thanks so much!
Could you attempt this one for me, please? http://feeds.feedburner.com/entrepreneur/latest
#775 |
Enthusiast
Posts: 30
Karma: 16
Join Date: Sep 2009
Device: sony prs-505/600
|
|
#776 |
Enthusiast
Posts: 43
Karma: 50
Join Date: May 2009
Device: Kindle3
|
I figured out the smart-quotes problem: it was the encoding. Now I am trying to work out how to replace text in the feed that is simply wrong. In several places the RSS feed contains 'and #8216;' where a single quote should be. preprocess_regexps seems to replace everything between x and y with z, and it is the only text-replacement tool I know of, but the command below had no effect. Is this the right command? Is my syntax wrong? I just want to replace the whole string; do I tell it to replace everything from 'and #8217' through the semicolon with "'" (a single quote inside double quotes)?
preprocess_regexps = [(re.compile(r'and #8216.?;', re.DOTALL|re.IGNORECASE), lambda match: '"')]
I am also trying to convert '<STRONG>' to '<b>', which doesn't seem to work either. The command I am using is:
preprocess_regexps = [(re.compile(r'<strong.?>', re.DOTALL|re.IGNORECASE), lambda match: '<b>')]
(with a similar command for the end tag). What am I doing wrong?
Last edited by olaf; 09-26-2009 at 11:17 AM.
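For what it's worth, here is a minimal sketch of how rules of this shape behave when run with plain `re` outside calibre. It assumes that each (pattern, function) pair is applied in turn with `pattern.sub`, and that the feed really contains the literal text 'and #8216;' as quoted in the post; the `apply_rules` helper and the sample headline are made up for illustration. It shows two things: `.?` allows at most one extra character, so `<strong.?>` matches `<STRONG>` but would miss a tag that carries attributes, and the replacement has to be "'" rather than '"' if a single quote is what is wanted.

```python
import re

# The pattern from the post: '.?' matches zero or one character only.
old = re.compile(r'<strong.?>', re.DOTALL | re.IGNORECASE)
print(bool(old.search('<STRONG>')))            # True
print(bool(old.search('<strong class="x">')))  # False

# A more tolerant set of rules, kept together in one list.
preprocess_regexps = [
    # Opening tag, with or without attributes.
    (re.compile(r'<strong\b[^>]*>', re.IGNORECASE), lambda match: '<b>'),
    # Closing tag.
    (re.compile(r'</strong\s*>', re.IGNORECASE), lambda match: '</b>'),
    # Literal 'and #8216;' / 'and #8217;' -> a plain single quote.
    (re.compile(r'and #821[67];', re.IGNORECASE), lambda match: "'"),
]

def apply_rules(html, rules):
    """Hypothetical helper: apply each (pattern, function) pair in order."""
    for pattern, replacement in rules:
        html = pattern.sub(replacement, html)
    return html

sample = '<STRONG>Mayor and #8216;pleasedand #8217; with vote</STRONG>'
result = apply_rules(sample, preprocess_regexps)
print(result)  # <b>Mayor 'pleased' with vote</b>
```

Trying the patterns on a saved copy of one article this way is usually quicker than re-running the whole download.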
#777 |
Enthusiast
Posts: 43
Karma: 50
Join Date: May 2009
Device: Kindle3
|
And the next question: is there a way to get rid of the top image in this feed? I've cut most of the feeds out of this example, but each article is preceded by the ad images, starting with "Share" and "Larger Text". Nothing I have tried has worked so far.
Here's the recipe:
import string, re

class AdvancedUserRecipe1252944207(BasicNewsRecipe):
    title = u'Worcester Telegram test'
    oldest_article = 1
    max_articles_per_feed = 50
    timefmt = ''
    no_stylesheets = True
    preprocess_regexps = [(re.compile(r'<strong.?>', re.DOTALL|re.IGNORECASE), lambda match: '<b>')]
    preprocess_regexps = [(re.compile(r'</strong.?>', re.DOTALL|re.IGNORECASE), lambda match: '</b>')]
    preprocess_regexps = [(re.compile(r'and #8217.?;', re.DOTALL|re.IGNORECASE), lambda match: '"')]
    preprocess_regexps = [(re.compile(r'and #8216.?;', re.DOTALL|re.IGNORECASE), lambda match: '"')]
    keep_only_tags = [dict(id=['frontpage_section', 'articleWell', 'headline', 'subheadline', 'SuperHeading', 'byline', 'articleBody', 'zoom1'])]
    remove_tags = [dict(id=['factBoxes'])]
    preprocess_regexps = [(re.compile(r'<!-- This code displays columnist headshots: -->.*?<p>', re.DOTALL|re.IGNORECASE), lambda match: '')]
    preprocess_regexps = [(re.compile(r'<div class="verdana11">.*?<!-- END ARTICLE COMMENTS -->', re.DOTALL|re.IGNORECASE), lambda match: '')]
    encoding = 'cp1252'
    remove_tags_after = [dict(id='leaderboardBot')]
    feeds = [(u'Local News', u' http://www.telegram.com/apps/pbcs.dl...le=1101')]
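One thing that stands out in this recipe: `preprocess_regexps` is assigned six times. In a Python class body each assignment rebinds the name, so only the last rule is left once the class is built. Here is a small sketch of the effect, using plain classes rather than a real recipe (the class names `Overwritten` and `Combined` are invented for the example):

```python
import re

class Overwritten:
    # Each assignment rebinds the same name; the first list is discarded.
    preprocess_regexps = [(re.compile(r'<strong.?>', re.DOTALL | re.IGNORECASE), lambda match: '<b>')]
    preprocess_regexps = [(re.compile(r'</strong.?>', re.DOTALL | re.IGNORECASE), lambda match: '</b>')]

class Combined:
    # One list holding every rule, so none of them is lost.
    preprocess_regexps = [
        (re.compile(r'<strong.?>', re.DOTALL | re.IGNORECASE), lambda match: '<b>'),
        (re.compile(r'</strong.?>', re.DOTALL | re.IGNORECASE), lambda match: '</b>'),
    ]

print(len(Overwritten.preprocess_regexps))           # 1
print(Overwritten.preprocess_regexps[0][0].pattern)  # </strong.?>
print(len(Combined.preprocess_regexps))              # 2
```

If that is what is happening here, merging all the rules into a single list should let the earlier replacements run as well.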
#778 | |
Junior Member
Posts: 5
Karma: 10
Join Date: Sep 2009
Device: none
|
I still can't get it to work... can you give me a bit more help?
I can go to the 2nd page and get the picture ... Quote:
|
|
#779 | |
Guru
Posts: 800
Karma: 194644
Join Date: Dec 2007
Location: Argentina
Device: Kindle Voyage
|
Quote:
This is how it should look:
Code:
from calibre.web.feeds.recipes import BasicNewsRecipe

class Telegram(BasicNewsRecipe):
    title = 'Telegram'
    oldest_article = 2
    max_articles_per_feed = 100
    no_stylesheets = True
    encoding = 'cp1252'
    use_embedded_content = False
    language = 'en'
    extra_css = ' .headline{font-size: x-large} '

    keep_only_tags = [dict(name='div', attrs={'class':['headline','subHeadline','byline','articleBody']})]
    remove_tags = [
        dict(name=['object','link','embed']),
        dict(name='div', attrs={'class':['relatedContent','verdana11']}),
    ]
    remove_tags_after = dict(name='div', attrs={'class':'verdana11'})

    feeds = [(u'Frontpage News', u'http://www.telegram.com/apps/pbcs.dll/section?Category=RSS03&MIME=xml')]
|
#780 |
Enthusiast
Posts: 49
Karma: 10
Join Date: Aug 2009
Device: none
|
Can anyone help me with a recipe for Businessworld? I have been using its RSS feeds, but I want the whole magazine, just like The Economist.
http://www.businessworld.in/bw/Magazine_Current_Issue |