01-20-2013, 05:25 PM   #1
RandomCake
xkcd recipe

Hi,
I've tried modifying the following line of the built-in XKCD recipe:
Code:
oldest_article = 60
and set it to:
Code:
oldest_article = 500
But it only brings in 100 items (this is a one-time request to let me catch up, since I haven't read the comic for ages).
Is there anything obvious I'm doing wrong?
The recipe in full is:
Code:
__license__   = 'GPL v3'
__copyright__ = '2008, Kovid Goyal <kovid at kovidgoyal.net>'
'''
Changelog:
2012-04-06
Fixed empty articles, added masthead img (NiLuJe)
2011-09-24
Changed cover (drMerry)
'''
'''
Fetch xkcd.
'''

import time, re
from calibre.web.feeds.news import BasicNewsRecipe

class XkcdCom(BasicNewsRecipe):
    cover_url = 'http://imgs.xkcd.com/static/terrible_small_logo.png'
    masthead_url = 'http://imgs.xkcd.com/static/terrible_small_logo.png'
    title = 'xkcd'
    description = 'A webcomic of romance and math humor.'
    __author__ = 'Martin Pitt updated by DrMerry.'
    language = 'en'

    use_embedded_content   = False
    oldest_article = 500
    #keep_only_tags = [dict(id='middleContainer')]
    #remove_tags = [dict(name='ul'), dict(name='h3'), dict(name='br')]
    keep_only_tags = [dict(id='comic')]
    no_stylesheets = True
    # turn image bubblehelp into a paragraph, and put alt in a heading
    preprocess_regexps = [
        (re.compile(r'(<img.*title=")([^"]+)(".alt=")([^"]+)(".*>)'),
         lambda m: '<h1>%s</h1>%s%s%s<p>%s</p>' % (m.group(4), m.group(1), m.group(3), m.group(5), m.group(2)))
    ]

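    def parse_index(self):
        # The archive page lists every comic as a link whose title attribute holds its date
        INDEX = 'http://xkcd.com/archive/'

        soup = self.index_to_soup(INDEX)
        articles = []
        for item in soup.findAll('a', title=True):
            articles.append({
                'date': item['title'],
                # Publication time as a Unix timestamp (plus one second), parsed from the link's title
                'timestamp': time.mktime(time.strptime(item['title'], '%Y-%m-%d'))+1,
                'url': 'http://xkcd.com' + item['href'],
                'title': self.tag_to_string(item),
                'description': '',
                'content': '',
            })

        # A single feed named 'xkcd' containing every comic found in the archive
        return [('xkcd', articles)]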
01-20-2013, 07:41 PM   #2
RandomCake
Never mind... I've read all the comics online now. What a geek I am.
I'll just be updating my version to fetch only 5 strips at a time now...
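For anyone landing here later: the 100-item ceiling reported in the first post matches the default `max_articles_per_feed = 100` in calibre's `BasicNewsRecipe`, so that attribute is the likely culprit rather than `oldest_article`. Fetching only 5 strips would presumably be a matter of setting it to 5 in the recipe class. The following is only a plain-Python model of how an age cutoff and a per-feed cap interact, not calibre's actual code; the function name `select_articles` and the one-strip-per-day archive are made up for illustration.

```python
import time

def select_articles(articles, oldest_article_days, max_articles_per_feed, now=None):
    """Simplified model: drop entries older than the cutoff, then cap the count."""
    now = time.time() if now is None else now
    cutoff = now - oldest_article_days * 86400
    recent = [a for a in articles if a['timestamp'] >= cutoff]
    return recent[:max_articles_per_feed]

# Hypothetical archive, newest first, one entry per day (not real xkcd data)
now = 1358700000
archive = [{'title': 'strip %d' % i, 'timestamp': now - i * 86400}
           for i in range(600)]

default_cap = select_articles(archive, 500, 100, now)   # cap left at 100
catch_up = select_articles(archive, 500, 1000, now)     # cap raised for a one-off catch-up
latest_five = select_articles(archive, 500, 5, now)     # only the 5 newest strips

print(len(default_cap), len(catch_up), len(latest_five))
```

In this model, raising `oldest_article` alone never yields more than the cap, which is the behaviour described in the first post.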
02-21-2013, 01:31 AM   #3
freedumb2000
There appears to be a hard-coded limit on how many days back you can download. I modified the script so that all comics appear to come from today, then downloaded them one year at a time. That works.
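The modified script was not posted, so the following is only a guess at what such a workaround could look like. It is a standalone helper (the name `articles_for_year` and its `entries` argument are hypothetical) that keeps the comics from one year and stamps each with the current time so that an age filter would not discard them. It builds the same article dictionaries as `parse_index` in the recipe above.

```python
import time

def articles_for_year(entries, year, now=None):
    """Build article dicts for one year, all dated as if published now.

    entries: iterable of (title, 'YYYY-MM-DD', href) tuples, e.g. collected
    from the archive page links (hypothetical input format).
    """
    now = time.time() if now is None else now
    articles = []
    for title, date, href in entries:
        published = time.strptime(date, '%Y-%m-%d')
        if published.tm_year != year:
            continue  # skip comics from other years
        articles.append({
            'date': date,
            'timestamp': now,  # pretend the comic is from today
            'url': 'http://xkcd.com' + href,
            'title': title,
            'description': '',
            'content': '',
        })
    return articles
```

Inside the recipe, the loop over `soup.findAll('a', title=True)` would supply the tuples as `(self.tag_to_string(item), item['title'], item['href'])`, and `parse_index` would return `[('xkcd', articles_for_year(entries, 2012))]` for a given year. Whether a per-feed article cap still applies to a full year of comics is something to check when trying it.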