Old 08-06-2010, 12:51 PM   #2396
Flexicat
Device: Kobo
Hello all,

Can I have some help in refining my recipe?

I've created this one with help from this forum, but I do not think that the print version switch is working.

I have been able to trim down the content I do not want to see to manageable levels, but my script is only collecting the first page of each article.

The article address format is like this:

Code:
http://www.tthfanfic.org/Story-22821/Snag+Strangely+Literal.htm
with the print-version address format like this:

Code:
http://www.tthfanfic.org/wholestory.php?no=22821&format=print
I am pretty sure the code is correct, but it does not seem to be switching to the print version. At the end of each article in my epub, Calibre has a line that says "article downloaded from" and the article address instead of the print version address.
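
To make the intended mapping concrete, here is a minimal sketch of the conversion in plain Python, outside of any recipe. The helper name to_print_url and the regular expression are assumptions for illustration only; the two address formats are the ones shown above.

```python
import re

# Assumed pattern: the story number follows "/Story-" in the article address.
STORY_ID = re.compile(r'/Story-(\d+)/')


def to_print_url(url):
    """Return the print-version address for a story URL.

    If the URL does not look like a story address, it is returned unchanged.
    """
    match = STORY_ID.search(url)
    if match is None:
        return url
    return ('http://www.tthfanfic.org/wholestory.php?no='
            + match.group(1) + '&format=print')


print(to_print_url(
    'http://www.tthfanfic.org/Story-22821/Snag+Strangely+Literal.htm'))
```

Matching on the "/Story-<number>/" part rather than on a fixed position in the split URL should be a little more forgiving if the address ever gains or loses a path segment.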

I have tried turning JavaScript on and off, with exactly the same results.

Here is the code:
Code:
class AdvancedUserRecipe1280965027(BasicNewsRecipe):
    title          = u'TTH'
    oldest_article = 7
    max_articles_per_feed = 30
    no_stylesheets        = True
    encoding              = 'UTF-8'
    remove_javascript     = True
    use_embedded_content  = False

    keep_only_tags = [
                    dict(attrs={'class':'storysummary formbody defaultcolors'})
                   ,dict(attrs={'class':'storybody defaultcolors'})
                  ]

    feeds          = [(u'Latest Stories', u'http://www.tthfanfic.org/rss.php')]


def print_version(self, url):
    split1 = url.split("/")
    xxx = split1[3]
    split2 = xxx.split("-")
    artid =  split2[1]
    print 'artid is: ', artid
    return 'http://www.tthfanfic.org/wholestory.php?no=' + artid + '&format=print'
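
One thing that may be worth checking is indentation: as pasted, print_version starts at column 0, so Python treats it as a module-level function rather than a method of the recipe class. The sketch below illustrates the difference without calibre. RecipeBase is a hypothetical stand-in for BasicNewsRecipe whose default simply returns the URL unchanged (the real base class may behave differently), and the class names Misplaced and Indented are made up for the example.

```python
class RecipeBase(object):
    """Hypothetical stand-in for BasicNewsRecipe; default keeps the URL."""

    def print_version(self, url):
        return url


class Misplaced(RecipeBase):
    title = u'TTH'


# Defined at column 0: a module-level function, NOT a method of Misplaced.
def print_version(self, url):
    artid = url.split('/')[3].split('-')[1]
    return ('http://www.tthfanfic.org/wholestory.php?no='
            + artid + '&format=print')


class Indented(RecipeBase):
    title = u'TTH'

    # Indented under the class: this overrides the base-class method.
    def print_version(self, url):
        artid = url.split('/')[3].split('-')[1]
        return ('http://www.tthfanfic.org/wholestory.php?no='
                + artid + '&format=print')


url = 'http://www.tthfanfic.org/Story-22821/Snag+Strangely+Literal.htm'
print(Misplaced().print_version(url))  # falls back to the base-class method
print(Indented().print_version(url))   # uses the override
```

If that is the cause here, indenting the whole def block by four spaces so it sits inside the class would be the change to try.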


Thank you