Old 09-24-2010, 07:58 AM   #2838
Starson17
Wizard
Quote:
Originally Posted by TonytheBookworm
Starson17,
I need your help on this one if you've got a minute. ...Alright, I got it working, but I'm confused about this. In previous feeds I have done, I enter the feed address, and it gets the link and uses it as the title, then parses part of the content listed under it and uses that as the description. In this feed, the content is all on the feed page, so it doesn't go to the actual link. In the code above, I was assuming that it went to the links one by one inside the feed. I was trying to strip the content that the link showed.
So my question to you is: what determines whether it uses the feed's main page content (the one that has all the links on it) or navigates to each link? I hope you understand what I'm asking; if not, I will try to explain myself better.
This code here works because, for whatever reason, the links on the feed page are not followed. But in other basic feeds I have simply done nothing more than add the feed, and it follows the link.
Spoiler:

Code:
from calibre.web.feeds.news import BasicNewsRecipe
from calibre.ebooks.BeautifulSoup import BeautifulSoup, re
class AdvancedUserRecipe1282101454(BasicNewsRecipe):
    title = 'How To Geek'
    language = 'en'
    __author__ = 'TonytheBookworm'
    description = 'Daily Computer Tips and Tricks'
    publisher = 'Howtogeek'
    category = 'PC,tips,tricks'
    oldest_article = 2
    max_articles_per_feed = 100
    linearize_tables = True
    no_stylesheets = True
    remove_javascript = True

    # Strip new-window links, the articleTable table, and the feedflare div.
    remove_tags = [
        dict(name='a', attrs={'target': ['_blank']}),
        dict(name='table', attrs={'id': ['articleTable']}),
        dict(name='div', attrs={'class': ['feedflare']}),
    ]

    feeds = [
        ('Tips', 'http://feeds.howtogeek.com/howtogeek'),
    ]
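If use_embedded_content is set to False in your recipe, it will not use the content on the feed page; it will follow each link and fetch the article from there. If it is set to True, the feed content is always used and the links are not followed. If it's not specifically set, I believe it will decide based on the length of the content on that page.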
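To make the three cases concrete, here is a rough standalone sketch of that decision. It is not calibre's actual code: the function name uses_embedded_content, the article_lengths argument, and the 2000-character threshold are all assumptions made up for illustration.

```python
def uses_embedded_content(setting, article_lengths, threshold=2000):
    """Hypothetical helper, not part of calibre.

    Returns True if the feed's own content would be used as the article,
    False if each article link would be followed instead.
    """
    if setting is not None:
        # An explicit True/False in the recipe always wins.
        return bool(setting)
    if not article_lengths:
        # Nothing embedded in the feed, so the links have to be followed.
        return False
    # Not set: guess from how long the feed entries are on average.
    # The threshold value is an assumption for this sketch.
    return sum(article_lengths) > threshold * len(article_lengths)
```

So for a full-text feed like the one above, leaving the setting alone would most likely pick the embedded content, which would explain why the links are not followed. Adding use_embedded_content = False as a class attribute in the recipe should force it to follow each link.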