Old 09-04-2010, 12:49 PM   #2623
TonytheBookworm
Quote:
Originally Posted by poloman
Hi! I'd like to learn how to do some feeds. I've read the tutorial and the site I'm after doesn't quite work. Are there any tips/examples for FeedBurner-based feeds?

Ideally, I'd like to create a recipe for The Daily Mash : http://feeds.feedburner.com/thedailymash

Thanks for any help you can give!
Poloman, my tip is this, and I don't mean to come across as rude by saying it: do what I'm doing and jump in head first. Even though I have programmed in C# for years, Python scripting is different for me. Nonetheless, take a look at the recipes that are already provided. On a Windows-based system they are in /program files/calibre2/resources/recipes (or along that path).

First, once you get it pulling the feed, you will say, "Hey, that's not how I want it to look." So then you do what I did and ask, "Hmm, how do I remove this stuff?" I searched the recipes for "remove" and came across remove_tags, remove_tags_after and so on, and then also keep_only_tags. I tried those out. If they worked, I patted myself on the back, and if they didn't, I posted segments of my code (or in some cases the whole thing) in spoiler and code tags. The good folks on this site will generally help you out in a timely manner, provided you put in the effort. I know Starson17 has helped me big time, along with a few others.
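To give a rough idea of what those settings look like, here is a minimal sketch. The tag names and class values are made-up placeholders, not taken from The Daily Mash's actual HTML, so you would need to look at the page source and swap in the real ones.

```python
# Hypothetical selectors: 'article-body' and 'share-links' are placeholder
# class names for illustration only. Check the real page source for yours.

# Keep only the element that wraps the article text; the rest is thrown away.
keep_only_tags = [dict(name='div', attrs={'class': 'article-body'})]

# Delete specific elements from what is left.
remove_tags = [
    dict(name='div', attrs={'class': 'share-links'}),
    dict(name='script'),
]

# Delete everything that comes after the matched element.
remove_tags_after = dict(name='div', attrs={'class': 'article-body'})
```

In a recipe these lines go inside the class body, at the same indentation as title and feeds.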

Bottom line: yes, it is complicated to learn (heck, I'm still figuring it out), but once you get the basics you develop an arsenal to attack almost any feed you are faced with.

I for one feel defeated when I work on something for hours and then someone comes along and simply does it instead of explaining what they did. Yes, I'm grateful that they do that, yet by the same token I feel let down because I haven't learned anything.

So give it a try and let us know where we can help.

Here, take a look at this to give you an idea. This should work for you, but read the comments in it so you can get an understanding of how I went about it. The only thing I can't figure out is how to remove the style tags to get rid of the Digg links and so forth at the bottom.

Spoiler:

Code:
from calibre.web.feeds.news import BasicNewsRecipe

class AdvancedUserRecipe1282101454(BasicNewsRecipe):
    title = 'Daily Mash'
    language = 'en'
    __author__ = 'TonytheBookworm'
    description = 'The Daily Mash'
    publisher = 'Tony Stegall'
    category = ''
    oldest_article = 7
    conversion_options = {'linearize_tables' : True}
    max_articles_per_feed = 100
    no_stylesheets = True
    
    masthead_url = 'http://www.thedailymash.co.uk/images/mashlogo5.gif'

    feeds = [
        ('Daily Mash', 'http://feeds.feedburner.com/thedailymash'),
    ]

    def print_version(self, url):
        # I split the url at the first ? so I can reuse everything after it
        parts = url.split('?', 1)
        if len(parts) < 2:
            return url  # no ? in the url, so there is nothing to convert
        query = parts[1]  # the second part of the url is what I need (the index is 0-based)
        print('THE SPLIT IS: ' + query)  # used to test what the result of the split is

        #-----------------------------------------------------------------------------------------------
        #- This is how the original url comes in and how it needs to be converted to get a print version
        #-----------------------------------------------------------------------------------------------

        # Example of link to convert
        # Original link: http://www.thedailymash.co.uk/index.php?option=com_content&task=view&id=3060&Itemid=74
        # Print version: http://www.thedailymash.co.uk/index2.php?option=com_content&task=view&id=3060&pop=1&page=0&Itemid=74

        # Now that I have my split I piece it together:
        # 1) I start with a constant url of http://www.thedailymash.co.uk/index2.php?
        # 2) I append my split to the end of it
        # 3) I add &page=0&pop=1 to the end
        # 4) That gives me the url I need in print format

        print_url = 'http://www.thedailymash.co.uk/index2.php?' + query + '&page=0&pop=1'
        print('print_url is: ' + print_url)
        return print_url
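
If you want to check the URL conversion without running a whole download, something like the following sketch can be run on its own in Python 3. The function name to_print_url and the PRINT_BASE constant are names I made up for this test; they are not part of calibre. It uses urlsplit from the standard library instead of a manual split, which is a swapped-in approach and not what the recipe itself does.

```python
from urllib.parse import urlsplit

# Base of the print page, taken from the example links in the recipe comments.
PRINT_BASE = 'http://www.thedailymash.co.uk/index2.php'


def to_print_url(url):
    """Rebuild an article URL as its print-version URL (hypothetical helper)."""
    query = urlsplit(url).query  # everything between '?' and any '#'
    if not query:
        # Nothing to reuse, so hand back the original URL unchanged.
        return url
    return PRINT_BASE + '?' + query + '&page=0&pop=1'


original = ('http://www.thedailymash.co.uk/index.php'
            '?option=com_content&task=view&id=3060&Itemid=74')
print(to_print_url(original))
```

Paste a few real article links from the feed into original and compare the output with the print link on the site before trusting it in the recipe.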

Last edited by TonytheBookworm; 09-04-2010 at 03:24 PM. Reason: added Recipe