![]() |
#1666 |
Connoisseur
![]() Posts: 98
Karma: 22
Join Date: Mar 2010
Device: IRiver Story, Ipod Touch, Android SmartPhone
|
Does anyone have a recipe for this RSS feed?
http://feeds.punto-informatico.it/c/...8866/index.rss Thanks in advance. |
![]() |
![]() |
#1667 |
Connoisseur
![]() Posts: 83
Karma: 10
Join Date: Aug 2009
Device: iphone, Irex iliad, sony prs950, kindle Dx, Ipad
|
Is it possible to make a recipe for Business & Economy magazine? It does not have an RSS feed.
Thanks. http://www.businessandeconomy.org/04032010/default.asp |
![]() |
![]() |
#1668 |
Enthusiast
![]() Posts: 43
Karma: 50
Join Date: May 2009
Device: Kindle3
|
I cannot for the life of me figure out how to remove an image file at the top of each article of this newspaper. The image has "Share - Larger Text - Smaller Text - Print" at the top of each article, which pushes the main picture onto the next page and leaves the current page mostly blank. Any advice on how to get rid of that image? It seems to be embedded in code I can't get at.
import re

from calibre.web.feeds.news import BasicNewsRecipe


class AdvancedUserRecipe1252944207(BasicNewsRecipe):
    title = u'Telegram & Gazette'
    oldest_article = 1
    max_articles_per_feed = 50
    timefmt = ''
    no_stylesheets = True
    keep_only_tags = [dict(id=['frontpage_section', 'articleWell', 'headline', 'subheadline', 'SuperHeading', 'byline', 'articleBody', 'zoom1'])]
    remove_tags = [dict(id=['factBoxes'])]
    # Both patterns live in one list; a second assignment to
    # preprocess_regexps would replace the first one.
    preprocess_regexps = [
        (re.compile(r'<!-- This code displays columnist headshots: -->.*?<p>', re.DOTALL|re.IGNORECASE), lambda match: ''),
        (re.compile(r'<div class="verdana11">.*?<!-- END ARTICLE COMMENTS -->', re.DOTALL|re.IGNORECASE), lambda match: ''),
    ]
    encoding = 'cp1252'
    remove_tags_after = [dict(id='leaderboardBot')]
    feeds = [
        (u'Front Page News', u'http://www.telegram.com/apps/pbcs.dll/section?Category=RSS03&MIME=xml'),
        (u'World & Regional', u'http://www.telegram.com/apps/pbcs.dll/section?Category=rss01&MIME=xml&profile=1052'),
        (u'Living', u' http://www.telegram.com/apps/pbcs.dl...l&profile=1011'),
        (u'Local News', u' http://www.telegram.com/apps/pbcs.dl...l&profile=1101'),
        (u'Business', u'http://www.telegram.com/apps/pbcs.dll/section?Category=rss01&MIME=xml&profile=1002'),
        (u'Opinion', u'http://www.telegram.com/apps/pbcs.dll/section?Category=rss01&MIME=xml&profile=1017'),
        (u'Deaths', u'http://www.telegram.com/apps/pbcs.dll/section?Category=rss01&MIME=xml&profile=1001'),
        (u'As I See It', u'http://www.telegram.com/apps/pbcs.dll/section?Category=rss01&MIME=xml&profile=1054'),
    ]
|
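For anyone wondering how a preprocess_regexps list behaves, here is a rough sketch in plain Python. The two patterns are the ones from the recipe in this post; `apply_regexps` and the sample HTML are made up for illustration and are not calibre's own code. Each entry is a (compiled pattern, function) pair, and the pairs are applied one after another, so every pattern you want has to sit in the same list.

```python
import re

# Each entry pairs a compiled pattern with a function returning the replacement.
preprocess_regexps = [
    (re.compile(r'<!-- This code displays columnist headshots: -->.*?<p>',
                re.DOTALL | re.IGNORECASE), lambda match: ''),
    (re.compile(r'<div class="verdana11">.*?<!-- END ARTICLE COMMENTS -->',
                re.DOTALL | re.IGNORECASE), lambda match: ''),
]


def apply_regexps(html, rules):
    """Hypothetical helper: run every (pattern, function) pair over the HTML in order."""
    for pattern, func in rules:
        html = pattern.sub(func, html)
    return html


# Made-up sample markup, only to show both rules firing.
sample = ('<!-- This code displays columnist headshots: --><img src="x.gif"><p>'
          'Body text'
          '<div class="verdana11">comments<!-- END ARTICLE COMMENTS -->')
cleaned = apply_regexps(sample, preprocess_regexps)
```

Note that the first pattern also swallows the `<p>` that ends the match, so only the article text is left in `cleaned`.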
![]() |
![]() |
#1669 | |
Wizard
![]() ![]() ![]() ![]() ![]() ![]() ![]() ![]() ![]() ![]() ![]() Posts: 4,004
Karma: 177841
Join Date: Dec 2009
Device: WinMo: IPAQ; Android: HTC HD2, Archos 7o; Java:Gravity T
|
Quote:
remove_tags = [dict(name='div', attrs={'id':'article_tools'})] |
|
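To illustrate what a remove_tags entry like the one above asks for, here is a small standard-library sketch that drops a `<div>` with a given id, together with everything nested inside it. This is not how calibre implements remove_tags; the `DropDivById` class and the sample markup are assumptions for illustration only.

```python
from html.parser import HTMLParser


class DropDivById(HTMLParser):
    """Re-emit HTML while skipping any <div> whose id equals target_id."""

    def __init__(self, target_id):
        super().__init__()
        self.target_id = target_id
        self.skip_depth = 0  # greater than 0 while inside the dropped div
        self.out = []

    def handle_starttag(self, tag, attrs):
        if self.skip_depth:
            if tag == 'div':
                self.skip_depth += 1  # nested div inside the dropped one
            return
        if tag == 'div' and dict(attrs).get('id') == self.target_id:
            self.skip_depth = 1
            return
        self.out.append(self.get_starttag_text())

    def handle_endtag(self, tag):
        if self.skip_depth:
            if tag == 'div':
                self.skip_depth -= 1
            return
        self.out.append('</%s>' % tag)

    def handle_data(self, data):
        if not self.skip_depth:
            self.out.append(data)


def drop_div(html, target_id):
    parser = DropDivById(target_id)
    parser.feed(html)
    parser.close()  # flush any buffered text
    return ''.join(parser.out)


# Made-up article markup with a toolbar div to remove.
page = ('<div id="articleWell"><div id="article_tools">'
        '<img src="tools.gif"></div><p>Story</p></div>')
result = drop_div(page, 'article_tools')
```

The toolbar div and its image disappear from `result`, while the surrounding article container and the paragraph are kept.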
![]() |
![]() |
#1670 |
Junior Member
![]() Posts: 5
Karma: 10
Join Date: Mar 2010
Device: Kindle DX
|
THANK YOU!
|
![]() |
![]() |
#1671 |
Guru
![]() ![]() ![]() ![]() ![]() ![]() ![]() ![]() ![]() ![]() ![]() Posts: 800
Karma: 194644
Join Date: Dec 2007
Location: Argentina
Device: Kindle Voyage
|
Your recipe is too complicated. This is a simplified and cleaned-up version (add more feeds; this is just an example):
Code:
from calibre.web.feeds.news import BasicNewsRecipe


class Telegram(BasicNewsRecipe):
    title = 'Telegram'
    oldest_article = 2
    max_articles_per_feed = 100
    no_stylesheets = False
    use_embedded_content = False
    encoding = 'cp1252'
    publication_type = 'newspaper'
    remove_empty_feeds = True
    extra_css = '''
        body{font-family: Verdana,sans-serif}
        .headline{font-size: xx-large; font-weight: bold}
        .mainPhotoCaption{font-size: x-small}
    '''

    # Keep only the article container, then trim what comes before the
    # headline and after the 'zoom1' element.
    keep_only_tags = [dict(name='div', attrs={'id':'articleWell'})]
    remove_tags_before = dict(attrs={'class':'headline'})
    remove_tags_after = dict(attrs={'id':'zoom1'})
    remove_tags = [
        dict(name='div', attrs={'class':'relatedContent'}),
        dict(name=['object','link','iframe']),
    ]

    feeds = [
        (u'Front page', u'http://www.telegram.com/apps/pbcs.dll/section?Category=RSS03&MIME=xml'),
    ]

    def preprocess_html(self, soup):
        return self.adeify_images(soup)
|
![]() |
![]() |
#1672 |
Connoisseur
![]() Posts: 98
Karma: 22
Join Date: Mar 2010
Device: IRiver Story, Ipod Touch, Android SmartPhone
|
My first recipe: The Apple Lounge, an Italian Apple blog.
Any suggestions?
from calibre.ebooks.BeautifulSoup import BeautifulSoup
from calibre.web.feeds.news import BasicNewsRecipe


class Informatica(BasicNewsRecipe):
    title = u'Informatica'
    __author__ = 'Gabriele Marini'
    oldest_article = 15
    max_articles_per_feed = 100
    use_embedded_content = False
    remove_tags_after = dict(name='div', attrs={'id':'greet_block'})
    no_stylesheets = True
    feeds = [(u'The Apple Lounge', u'http://feeds.feedburner.com/Theapplelounge?format=xml')]

    def print_version(self, url):
        # Fetch the article page and look for its "print this article" link.
        raw = self.browser.open(url).read()
        soup = BeautifulSoup(raw.decode('utf8', 'replace'))
        print_link = soup.find('a', {'title':'Stampa questo articolo'})
        if print_link is None:
            # No print link found: fall back to the normal article URL.
            return url
        return print_link['href']
|
![]() |
![]() |
#1673 |
Guru
![]() ![]() ![]() ![]() ![]() ![]() ![]() ![]() ![]() ![]() ![]() Posts: 800
Karma: 194644
Join Date: Dec 2007
Location: Argentina
Device: Kindle Voyage
|
You are overcomplicating this. Calibre already extracts the appropriate link from the feed (feedburner:origLink). You just need to append the part for the print version, which is 'print/'. So the correct code would be:
Code:
def print_version(self, url):
    return url + 'print/'
|
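A quick way to check the URL rewriting outside calibre is to try it as a plain function. The sketch below is only an illustration: the trailing-slash handling is an extra assumption (the one-liner above expects the article URL to already end in '/'), and the example URLs are invented. Inside a recipe the function would be a method taking `self` as its first argument.

```python
def print_version(url):
    """Return the printer-friendly URL by appending 'print/' to the article URL."""
    # Assumption: some article URLs may lack the trailing slash, so add it first.
    if not url.endswith('/'):
        url += '/'
    return url + 'print/'


# Invented example URLs, with and without the trailing slash.
with_slash = print_version('http://example.com/2010/03/29/some-article/')
without_slash = print_version('http://example.com/2010/03/29/some-article')
```

Both calls give the same result. URLs carrying a query string are not handled by this sketch.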
![]() |
![]() |
#1674 |
Wizard
![]() ![]() ![]() ![]() ![]() ![]() ![]() ![]() ![]() ![]() ![]() Posts: 4,004
Karma: 177841
Join Date: Dec 2009
Device: WinMo: IPAQ; Android: HTC HD2, Archos 7o; Java:Gravity T
|
Over the weekend I ran all comics of the GoComics.com recipe at size 1200 and 4 strips from each. I have the 200+ comics available broken up into four groups (four recipes) A-F, G-M, N-Z and Editorial comics. They all ran fine. However, I ran them at 8 hour intervals, not in sequence, and I set the delay option to 2 and the simultaneous connections option to 1 to minimize server load. I have seen occasional failures in the past that may be related to server load or anti-scraping tools on their server.
|
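For reference, throttling of this kind can also be written into a recipe itself. The sketch below assumes that the "delay" and "simultaneous connections" options mentioned above correspond to the `delay` and `simultaneous_downloads` attributes of BasicNewsRecipe; the class name and title are hypothetical, and the fallback base class is only there so the snippet can be run outside calibre.

```python
try:
    from calibre.web.feeds.news import BasicNewsRecipe
except ImportError:
    # Stand-in base class so the sketch can be run outside calibre.
    class BasicNewsRecipe(object):
        pass


class GoComicsAtoF(BasicNewsRecipe):  # hypothetical recipe name
    title = 'GoComics A-F'            # hypothetical title
    delay = 2                         # seconds to wait between consecutive downloads
    simultaneous_downloads = 1        # fetch one page at a time
```

Keeping these two values low trades download speed for a lighter load on the server.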
![]() |
![]() |
#1675 |
Wizard
![]() ![]() ![]() ![]() ![]() ![]() ![]() ![]() ![]() ![]() ![]() Posts: 4,004
Karma: 177841
Join Date: Dec 2009
Device: WinMo: IPAQ; Android: HTC HD2, Archos 7o; Java:Gravity T
|
|
![]() |
![]() |
#1676 |
Enthusiast
![]() Posts: 43
Karma: 50
Join Date: May 2009
Device: Kindle3
|
(message sent in error)
Last edited by olaf; 03-29-2010 at 11:10 AM. |
![]() |
![]() |
#1677 | |
Enthusiast
![]() Posts: 43
Karma: 50
Join Date: May 2009
Device: Kindle3
|
Quote:
Kiklop - that did the trick - thank you! |
|
![]() |
![]() |
#1678 |
Enthusiast
![]() Posts: 43
Karma: 50
Join Date: May 2009
Device: Kindle3
|
|
![]() |
![]() |
#1679 |
Enthusiast
![]() Posts: 43
Karma: 50
Join Date: May 2009
Device: Kindle3
|
Question regarding the Calibre online Help page. Is the 'Edit Metadata' page blank, or is my browser missing something?
|
![]() |
![]() |
#1680 |
Wizard
![]() ![]() ![]() ![]() ![]() ![]() ![]() ![]() ![]() ![]() ![]() Posts: 4,004
Karma: 177841
Join Date: Dec 2009
Device: WinMo: IPAQ; Android: HTC HD2, Archos 7o; Java:Gravity T
|
|
![]() |
![]() |