Old 08-07-2010, 09:21 AM   #2397
Starson17
Quote:
Originally Posted by Flexicat
Can I have some help in refining my recipe?
Sure.

Quote:
I've created this one with help from this forum, but I do not think that the print version switch is working.
It works great when it runs. The problem is that it isn't running.

Quote:
I have been able to trim down the content I do not want to see to manageable levels, but my script is only collecting the first page of each article.
If you run the print version, you get different content than if you don't run it. The first-page-only problem and the removal of excess junk are separate issues that aren't related to the print version switch.

Quote:
Code:
http://www.tthfanfic.org/Story-22821/Snag+Strangely+Literal.htm
with the print-version address format like this:

Code:
http://www.tthfanfic.org/wholestory.php?no=22821&format=print
I am pretty sure the code is correct, but it does not seem to be switching to the print version.
Here is the code:
Spoiler:

Code:
class AdvancedUserRecipe1280965027(BasicNewsRecipe):
    title          = u'TTH'
    oldest_article = 7
    max_articles_per_feed = 30
    no_stylesheets        = True
    encoding              = 'UTF-8'
    remove_javascript     = True
    use_embedded_content  = False

    keep_only_tags = [
                    dict(attrs={'class':'storysummary formbody defaultcolors'})
                   ,dict(attrs={'class':'storybody defaultcolors'})
                  ]

    feeds          = [(u'Latest Stories', u'http://www.tthfanfic.org/rss.php')]


def print_version(self, url):
    split1 = url.split("/")
    xxx = split1[3]
    split2 = xxx.split("-")
    artid =  split2[1]
    print 'artid is: ', artid
    return 'http://www.tthfanfic.org/wholestory.php?no=' + artid + '&format=print'
Here are your problems:
1) You need to indent the def print_version line and every line under it by 4 more spaces. Until you do that, it's a module-level function, not a method of the class, so calibre never calls it.
2) Once you indent it, it will run and you'll get no results. Your recipe's keep_only_tags match classes that don't exist on the print page. Since you've told it to keep only things that aren't there, you get nothing. Remove the keep_only_tags.
3) You can remove the print 'artid is: ', artid line after testing. (It worked for me.)
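
Putting those three changes together, the recipe might look roughly like the sketch below. Treat it as a starting point, not a tested final recipe: the try/except stand-in for BasicNewsRecipe is my own addition so the URL conversion can be checked outside calibre, and the intermediate variable names are only tidied versions of yours.

```python
try:
    from calibre.web.feeds.news import BasicNewsRecipe
except ImportError:
    # Assumption: stand-in base class so this sketch runs outside calibre.
    # Inside calibre the real BasicNewsRecipe is used instead.
    class BasicNewsRecipe(object):
        pass


class AdvancedUserRecipe1280965027(BasicNewsRecipe):
    title                 = u'TTH'
    oldest_article        = 7
    max_articles_per_feed = 30
    no_stylesheets        = True
    encoding              = 'UTF-8'
    remove_javascript     = True
    use_embedded_content  = False

    # keep_only_tags removed: those classes are not on the print page.

    feeds = [(u'Latest Stories', u'http://www.tthfanfic.org/rss.php')]

    # Indented by 4 spaces, so it is now a method of the class.
    def print_version(self, url):
        # 'http://www.tthfanfic.org/Story-22821/...' -> 'Story-22821'
        story_part = url.split("/")[3]
        # 'Story-22821' -> '22821'
        artid = story_part.split("-")[1]
        return ('http://www.tthfanfic.org/wholestory.php?no='
                + artid + '&format=print')
```

With the story address from your post, print_version returns the wholestory.php?no=22821&format=print address you quoted. If the print page still carries junk you don't want, that would need its own remove_tags or keep_only_tags built from the print page's markup, which I haven't worked out here.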

Last edited by Starson17; 08-07-2010 at 09:24 AM.