Old 02-20-2012, 10:47 PM   #1
rjgrigaitis
Device: Sony PRS-350
Won't accept my date

I'm trying to set the "date" for each article dictionary in parse_index(). Beyond a certain amount of processing, Calibre seems to stop accepting the value I derive from the source data.

This is the code I think should work:
Code:
i = div.find('i')

m = {'Jan': 1, 'Feb': 2, 'Mar': 3, 'Apr': 4, 'May': 5, 'Jun': 6,
     'Jul': 7, 'Aug': 8, 'Sep': 9, 'Oct': 10, 'Nov': 11, 'Dec': 12}
dateGroup = re.match(r"(?P<month>\w+) (?P<date>\d+), (?P<year>\d+)",
                     ''.join(i.findAll(text=True, recursive=False)).strip())
artDate = date(int(dateGroup.group('year')),
               m[dateGroup.group('month')],
               int(dateGroup.group('date')))
pubdate = artDate.strftime('%a, %d %b')
When I run the recipe, the article date always shows today's date. However, when I assign the same "pubdate" to the "description" in the article dictionary, the description shows the correct value.
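To check the parsing and formatting outside Calibre, here is a minimal standalone sketch of the same steps. The sample text 'Feb 17, 2012' and the helper name format_pubdate are assumptions for illustration only; the text on the real page may differ. Note that the resulting string carries no year.

```python
import re
from datetime import date

# Abbreviated month names mapped to month numbers, as in the recipe.
MONTHS = {'Jan': 1, 'Feb': 2, 'Mar': 3, 'Apr': 4, 'May': 5, 'Jun': 6,
          'Jul': 7, 'Aug': 8, 'Sep': 9, 'Oct': 10, 'Nov': 11, 'Dec': 12}

# Same pattern as in the recipe: "<month> <day>, <year>".
DATE_RE = re.compile(r"(?P<month>\w+) (?P<date>\d+), (?P<year>\d+)")


def format_pubdate(text):
    """Hypothetical helper: parse text such as 'Feb 17, 2012' and
    return it formatted as '%a, %d %b', or None if it cannot be parsed."""
    match = DATE_RE.match(text.strip())
    if match is None or match.group('month') not in MONTHS:
        return None
    art_date = date(int(match.group('year')),
                    MONTHS[match.group('month')],
                    int(match.group('date')))
    return art_date.strftime('%a, %d %b')


# Assumed sample input; prints "Fri, 17 Feb" in the default C locale.
print(format_pubdate('Feb 17, 2012'))
```

If this prints the expected string, the parsing itself is fine and the difference lies in how the value is handled after parse_index() returns.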

Curiously, the following three all work as expected:

Code:
pubdate = strftime('%a, %d %b')
Code:
pubdate = ''.join(i.findAll(text=True, recursive=False)).strip()
Code:
pubdate = dateGroup.group('year') + '{0}'.format(m[dateGroup.group('month')]) + dateGroup.group('date')
This is my complete recipe:

Code:
import re
from datetime import date, timedelta

from calibre import strftime  # only used by the commented-out alternative below
from calibre.web.feeds.recipes import BasicNewsRecipe

class AdvancedUserRecipe1328808344(BasicNewsRecipe):
    title          = u'C-Fam Friday Fax'
    oldest_article = 10.66
    max_articles_per_feed = 100
    auto_cleanup = True

    def parse_index(self):
        soup = self.index_to_soup('http://www.c-fam.org/fridayfax/')
        articles = []
        feeds = []
        seenArticles = []

        for div in soup.findAll('div'):
            a = div.find('a', href=True, attrs={'class':'ffArchiveLink'})
            if not a:
                continue

            if a['href'] in seenArticles:
                continue
            seenArticles.append(a['href'])

            i = div.find('i')
            if not i:
                continue

            # Map abbreviated month names to month numbers.
            m = {'Jan': 1, 'Feb': 2, 'Mar': 3, 'Apr': 4, 'May': 5, 'Jun': 6,
                 'Jul': 7, 'Aug': 8, 'Sep': 9, 'Oct': 10, 'Nov': 11, 'Dec': 12}
            dateGroup = re.match(r"(?P<month>\w+) (?P<date>\d+), (?P<year>\d+)",
                                 ''.join(i.findAll(text=True, recursive=False)).strip())
            # Skip entries whose text does not look like "Mon DD, YYYY".
            if not dateGroup:
                continue
            artDate = date(int(dateGroup.group('year')),
                           m[dateGroup.group('month')],
                           int(dateGroup.group('date')))
            # Skip articles older than oldest_article days.
            if artDate <= date.today() - timedelta(days=self.oldest_article):
                continue

            pubdate = artDate.strftime('%a, %d %b')
#            pubdate = strftime('%a, %d %b')
#            pubdate = ''.join(i.findAll(text=True, recursive=False)).strip()
#            pubdate = dateGroup.group('year') + '{0}'.format(m[dateGroup.group('month')]) + dateGroup.group('date')
            url = 'http://www.c-fam.org/' + a['href']
            title = ''.join(a.findAll(text=True, recursive=False)).strip()
            description = ''
            articles.append({'title': title,
                             'url': url,
                             'date': pubdate,
                             'description': pubdate})
#                            'description': description})

        if (len(articles) > 0):
            feeds.append((self.title, articles))
        else:
            raise ValueError('No articles found, aborting')

        return feeds
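
A side note on the age filter in the recipe: subtracting a timedelta from a date ignores the timedelta's sub-day part (documented behaviour of datetime.date), so oldest_article = 10.66 acts like 10 in that comparison. A small sketch, with a hypothetical helper is_too_old and a fixed "today" so the result is reproducible:

```python
from datetime import date, timedelta


def is_too_old(art_date, today, oldest_article):
    """Hypothetical helper mirroring the recipe's check:
    True when art_date falls on or before the cutoff date."""
    # date - timedelta drops the seconds/microseconds of the timedelta,
    # so only the whole days of oldest_article take effect here.
    cutoff = today - timedelta(days=oldest_article)
    return art_date <= cutoff


today = date(2012, 2, 20)  # fixed date, assumed for the example
print(today - timedelta(days=10.66))                 # 2012-02-10
print(is_too_old(date(2012, 2, 10), today, 10.66))   # True
print(is_too_old(date(2012, 2, 11), today, 10.66))   # False
```

This does not explain the date display problem, but it is worth knowing when choosing a fractional oldest_article.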


What's going on here? I don't understand why the date from artDate.strftime() is ignored when the other three assignments work.