Old 03-27-2010, 12:51 PM   #1666
gambarini
Connoisseur
Posts: 98
Karma: 22
Join Date: Mar 2010
Device: IRiver Story, Ipod Touch, Android SmartPhone
Does anyone have a recipe for this RSS feed?

http://feeds.punto-informatico.it/c/...8866/index.rss

Thanks in advance.
Old 03-27-2010, 01:14 PM   #1667
dhiru
Connoisseur
Posts: 83
Karma: 10
Join Date: Aug 2009
Device: iphone, Irex iliad, sony prs950, kindle Dx, Ipad
Is it possible to make a recipe for Business & Economy magazine? It does not have an RSS feed.
Thanks.
http://www.businessandeconomy.org/04032010/default.asp
Old 03-28-2010, 11:37 AM   #1668
olaf
Enthusiast
Posts: 43
Karma: 50
Join Date: May 2009
Device: Kindle3
I cannot for the life of me figure out how to remove an image at the top of each article in this newspaper. The image reads "Share - Larger Text - Smaller Text - Print" and sits at the top of each article, pushing the main picture onto the next page and leaving the current page mostly blank. Any advice on how to get rid of it? It seems to be embedded in code I can't get at.

import re

from calibre.web.feeds.news import BasicNewsRecipe

class AdvancedUserRecipe1252944207(BasicNewsRecipe):
    title = u'Telegram & Gazette'
    oldest_article = 1
    max_articles_per_feed = 50
    timefmt = ''
    no_stylesheets = True
    encoding = 'cp1252'

    keep_only_tags = [dict(id=['frontpage_section', 'articleWell', 'headline', 'subheadline', 'SuperHeading', 'byline', 'articleBody', 'zoom1'])]
    remove_tags = [dict(id=['factBoxes'])]
    remove_tags_after = [dict(id='leaderboardBot')]
    # Both patterns live in one list; assigning preprocess_regexps twice keeps only the last one.
    preprocess_regexps = [
        (re.compile(r'<!-- This code displays columnist headshots: -->.*?<p>', re.DOTALL|re.IGNORECASE), lambda match: ''),
        (re.compile(r'<div class="verdana11">.*?<!-- END ARTICLE COMMENTS -->', re.DOTALL|re.IGNORECASE), lambda match: ''),
    ]

    feeds = [
        (u'Front Page News', u'http://www.telegram.com/apps/pbcs.dll/section?Category=RSS03&MIME=xml'),
        (u'World & Regional', u'http://www.telegram.com/apps/pbcs.dll/section?Category=rss01&MIME=xml&profile=1052'),
        (u'Living', u' http://www.telegram.com/apps/pbcs.dl...l&profile=1011'),
        (u'Local News', u' http://www.telegram.com/apps/pbcs.dl...l&profile=1101'),
        (u'Business', u'http://www.telegram.com/apps/pbcs.dll/section?Category=rss01&MIME=xml&profile=1002'),
        (u'Opinion', u'http://www.telegram.com/apps/pbcs.dll/section?Category=rss01&MIME=xml&profile=1017'),
        (u'Deaths', u'http://www.telegram.com/apps/pbcs.dll/section?Category=rss01&MIME=xml&profile=1001'),
        (u'As I See It', u'http://www.telegram.com/apps/pbcs.dll/section?Category=rss01&MIME=xml&profile=1054'),
    ]
Old 03-28-2010, 12:32 PM   #1669
Starson17
Wizard
Posts: 4,004
Karma: 177841
Join Date: Dec 2009
Device: WinMo: IPAQ; Android: HTC HD2, Archos 7o; Java:Gravity T
Quote:
Originally Posted by olaf View Post
I can not for the life of me figure out how to remove an image file at the top of each article of this newspaper. The image file has "Share - Larger Text - Smaller Text - Print" at the top of each article, pushing the main picture off to the next page and leaving the current page mostly blank.
Try this:
remove_tags = [dict(name='div', attrs={'id':'article_tools'})]
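
If the recipe already defines remove_tags (the one posted above does, for factBoxes), pasting this line in as a second assignment would replace the earlier rule rather than add to it. A minimal sketch of one combined list follows. The class name is made up for illustration, and the fallback base class is only an assumption so the snippet can be read outside calibre:

```python
try:
    from calibre.web.feeds.news import BasicNewsRecipe
except ImportError:
    # Stand-in base class (assumption) so the sketch can be checked outside calibre.
    class BasicNewsRecipe(object):
        pass


class TelegramGazette(BasicNewsRecipe):  # hypothetical class name
    title = u'Telegram & Gazette'

    # One list holds every removal rule; a second "remove_tags = ..."
    # line in the class body would keep only the last assignment.
    remove_tags = [
        dict(id=['factBoxes']),                           # rule from the posted recipe
        dict(name='div', attrs={'id': 'article_tools'}),  # rule suggested in this reply
    ]
```

Whether both rules are still needed depends on the site's current markup, which is not checked here.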
Old 03-28-2010, 12:53 PM   #1670
Semonski
Junior Member
Posts: 5
Karma: 10
Join Date: Mar 2010
Device: Kindle DX
THANK YOU!

Thank you so much..... I'm trying it out now.....

Kos

Quote:
Originally Posted by kiklop74 View Post
New recipe for New York Post:
Old 03-28-2010, 03:26 PM   #1671
kiklop74
Guru
Posts: 800
Karma: 194644
Join Date: Dec 2007
Location: Argentina
Device: Kindle Voyage
Quote:
Originally Posted by Semonski View Post
Thank you so much..... I'm trying it out now.....

Kos
Your recipe is too complicated. Here is a simplified, cleaned-up version (this is just an example, so add more feeds):

Code:
class Telegram(BasicNewsRecipe):
    title                 = 'Telegram'
    oldest_article        = 2
    max_articles_per_feed = 100
    no_stylesheets        = False
    use_embedded_content  = False
    encoding              = 'cp1252'
    publication_type      = 'newspaper'
    remove_empty_feeds    = True
    extra_css             = ' body{font-family: Verdana,sans-serif} .headline{font-size: xx-large; font-weight: bold} .mainPhotoCaption{font-size: x-small} '

    keep_only_tags     = [dict(name='div', attrs={'id':'articleWell'})]
    remove_tags_before = dict(attrs={'class':'headline'})
    remove_tags_after  = dict(attrs={'id':'zoom1'})
    remove_tags = [
                     dict(name='div', attrs={'class':'relatedContent'})
                    ,dict(name=['object','link','iframe'])
                  ]

    feeds          = [ 
                        (u'Front page' , u'http://www.telegram.com/apps/pbcs.dll/section?Category=RSS03&MIME=xml')
                     ]

    def preprocess_html(self, soup):
        return self.adeify_images(soup)
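
To add more feeds, extend the feeds list with further (title, URL) pairs. The sketch below reuses the untruncated feed URLs from the recipe posted earlier in the thread; whether they still resolve is not verified. BASE and the class name are hypothetical, and the fallback base class is only there so the snippet can be read outside calibre:

```python
try:
    from calibre.web.feeds.news import BasicNewsRecipe
except ImportError:
    # Stand-in base class (assumption) so the sketch can be checked outside calibre.
    class BasicNewsRecipe(object):
        pass

# Hypothetical helper constant: the part shared by every feed URL in the thread.
BASE = u'http://www.telegram.com/apps/pbcs.dll/section?Category='


class TelegramFeeds(BasicNewsRecipe):  # hypothetical class name
    title = 'Telegram'

    # Each entry is a (section title, feed URL) pair.
    feeds = [
        (u'Front page', BASE + u'RSS03&MIME=xml'),
        (u'World & Regional', BASE + u'rss01&MIME=xml&profile=1052'),
        (u'Business', BASE + u'rss01&MIME=xml&profile=1002'),
        (u'Opinion', BASE + u'rss01&MIME=xml&profile=1017'),
    ]
```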
Old 03-28-2010, 06:49 PM   #1672
gambarini
Connoisseur
Posts: 98
Karma: 22
Join Date: Mar 2010
Device: IRiver Story, Ipod Touch, Android SmartPhone
My first recipe

The Apple Lounge, an Italian Apple blog.

Any suggestions?

from calibre.ebooks.BeautifulSoup import BeautifulSoup
from calibre.web.feeds.news import BasicNewsRecipe

class Informatica(BasicNewsRecipe):
    title = u'Informatica'
    __author__ = 'Gabriele Marini'
    oldest_article = 15
    max_articles_per_feed = 100
    use_embedded_content = False
    remove_tags_after = dict(name='div', attrs={'id':'greet_block'})
    no_stylesheets = True
    feeds = [(u'The Apple Lounge', u'http://feeds.feedburner.com/Theapplelounge?format=xml')]

    def print_version(self, url):
        # Fetch the article page and look for its "print this article" link
        raw = self.browser.open(url).read()
        soup = BeautifulSoup(raw.decode('utf8', 'replace'))
        print_link = soup.find('a', {'title':'Stampa questo articolo'})
        # Fall back to the normal article URL when no print link is found
        if print_link is None:
            return url
        return print_link['href']
Old 03-28-2010, 08:25 PM   #1673
kiklop74
Guru
Posts: 800
Karma: 194644
Join Date: Dec 2007
Location: Argentina
Device: Kindle Voyage
Quote:
Originally Posted by gambarini View Post
The Apple Lounge an italian apple blog.

Any suggestion?
You are overcomplicating this. Calibre already extracts the appropriate link from the feed (feedburner:origLink). You just need to append the print suffix, which is 'print/'. So the correct code would be:

Code:
def print_version(self, url):
    return url + 'print/'
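
The one-liner relies on the article URLs from the feed already ending in '/'. A slightly more defensive sketch is below. The helper name to_print_url and the trailing-slash guard are assumptions added for illustration, not something the site or calibre requires:

```python
def to_print_url(url):
    """Return the print-view URL for an article URL (hypothetical helper)."""
    # Assumption: the print view lives at <article URL>/print/.
    # Add the slash only when the feed URL does not already end with one.
    if not url.endswith('/'):
        url += '/'
    return url + 'print/'


# Inside the recipe class it would be used like this:
#     def print_version(self, url):
#         return to_print_url(url)
```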
Old 03-29-2010, 09:43 AM   #1674
Starson17
Wizard
Posts: 4,004
Karma: 177841
Join Date: Dec 2009
Device: WinMo: IPAQ; Android: HTC HD2, Archos 7o; Java:Gravity T
Quote:
Originally Posted by MichaelMSeattle View Post
I've tried running the GoComics reversed recipe for only about 5 comics/7 days. When I run it, it first seems to hang
Over the weekend I ran all comics of the GoComics.com recipe at size 1200 and 4 strips from each. I have the 200+ comics available broken up into four groups (four recipes) A-F, G-M, N-Z and Editorial comics. They all ran fine. However, I ran them at 8 hour intervals, not in sequence, and I set the delay option to 2 and the simultaneous connections option to 1 to minimize server load. I have seen occasional failures in the past that may be related to server load or anti-scraping tools on their server.
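
In recipe terms, those two settings might look like the sketch below. It assumes the "delay" and "simultaneous connections" options correspond to the BasicNewsRecipe attributes delay and simultaneous_downloads. The class name and title are made up, and the fallback base class is only there so the snippet can be read outside calibre:

```python
try:
    from calibre.web.feeds.news import BasicNewsRecipe
except ImportError:
    # Stand-in base class (assumption) so the sketch can be checked outside calibre.
    class BasicNewsRecipe(object):
        pass


class GoComicsGroup(BasicNewsRecipe):  # hypothetical class name
    title = u'GoComics A-F'            # hypothetical title for one of the four groups

    delay = 2                   # seconds to wait between consecutive downloads
    simultaneous_downloads = 1  # fetch one page at a time to go easy on the server
```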
Old 03-29-2010, 09:45 AM   #1675
Starson17
Wizard
Posts: 4,004
Karma: 177841
Join Date: Dec 2009
Device: WinMo: IPAQ; Android: HTC HD2, Archos 7o; Java:Gravity T
Quote:
Originally Posted by gambarini View Post
can you give me an example of the print statement?
Sorry, I missed this post.

Code:
print 'The contents of the variable site_url is: ', site_url
One of my favorites is to print soup variables.
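
A minimal sketch of printing a soup variable from inside a recipe is below, written with the print() function used by Python 3. The class name and title are hypothetical, and the fallback base class is only an assumption so the snippet can be read outside calibre:

```python
try:
    from calibre.web.feeds.news import BasicNewsRecipe
except ImportError:
    # Stand-in base class (assumption) so the sketch can be checked outside calibre.
    class BasicNewsRecipe(object):
        pass


class DebugRecipe(BasicNewsRecipe):  # hypothetical class name
    title = u'Debug example'

    def preprocess_html(self, soup):
        # Dump the parsed page so its tag names, ids and classes can be
        # inspected when deciding what to keep or remove.
        print('The contents of the variable soup is:')
        print(soup)
        # Return the soup unchanged so the rest of the download proceeds as usual.
        return soup
```

Remove the print calls once the recipe works, since a full page dump for every article makes the output hard to read.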
Old 03-29-2010, 10:48 AM   #1676
olaf
Enthusiast
Posts: 43
Karma: 50
Join Date: May 2009
Device: Kindle3
(message sent in error)

Last edited by olaf; 03-29-2010 at 11:10 AM.
Old 03-29-2010, 11:09 AM   #1677
olaf
Enthusiast
Posts: 43
Karma: 50
Join Date: May 2009
Device: Kindle3
Quote:
Originally Posted by kiklop74 View Post
Your recipe is too complicated. Here is a simplified, cleaned-up version (this is just an example, so add more feeds):

Kiklop - that did the trick - thank you!
Old 03-29-2010, 11:33 AM   #1678
olaf
Enthusiast
Posts: 43
Karma: 50
Join Date: May 2009
Device: Kindle3
Quote:
Originally Posted by Starson17 View Post
Try this:
remove_tags = [dict(name='div', attrs={'id':'article_tools'})]

Starson - this worked - thank you!
Old 03-29-2010, 11:34 AM   #1679
olaf
Enthusiast
Posts: 43
Karma: 50
Join Date: May 2009
Device: Kindle3
Question regarding the Calibre online Help page. Is the 'Edit Metadata' page blank, or is my browser missing something?
Old 03-29-2010, 12:10 PM   #1680
Starson17
Wizard
Posts: 4,004
Karma: 177841
Join Date: Dec 2009
Device: WinMo: IPAQ; Android: HTC HD2, Archos 7o; Java:Gravity T
Quote:
Originally Posted by olaf View Post
Starson - this worked - thank you!
I'm glad to hear it. I didn't test your recipe. I just popped open Firebug, found your problem content and gave you the necessary line for that single problem.