Register Guidelines E-Books Search Today's Posts Mark Forums Read

Go Back   MobileRead Forums > E-Book Software > Calibre > Recipes

Notices

Reply
 
Thread Tools Search this Thread
Old 02-20-2012, 10:47 PM   #1
rjgrigaitis
Junior Member
rjgrigaitis began at the beginning.
 
Posts: 5
Karma: 10
Join Date: Feb 2012
Device: Sony PRS-350
Won't accept my date

I'm trying to set the "date" for the article dictionary in parse_index(). It seems that at a certain point, Calibre stops accepting my manipulation of the source data.

This is the code I think should work:
Code:
i = div.find('i')

m = {'Jan': 1, 'Feb': 2, 'Mar': 3, 'Apr': 4, 'May': 5, 'Jun': 6,
        'Jul': 7, 'Aug': 8, 'Sep': 9, 'Oct': 10, 'Nov': 11, 'Dec': 12}
dateGroup = re.match(r"(?P<month>\w+) (?P<date>\d+), (?P<year>\d+)",
                               ''.join(i.findAll(text=True, recursive=False)).strip())
artDate = date(int(dateGroup.group('year')),
                     m[dateGroup.group('month')],
                     int(dateGroup.group('date')))
pubdate = artDate.strftime('%a, %d %b')
When executed, the date is always today's date. However, when I assign "pubdate" to the "description" in the article dictionary, it is the correct value.

Curiously, the following three all work as expected:

Code:
pubdate = strftime('%a, %d %b')
Code:
pubdate = ''.join(i.findAll(text=True, recursive=False)).strip()
Code:
pubdate = dateGroup.group('year') + '{0}'.format(m[dateGroup.group('month')]) + dateGroup.group('date')
This is my complete recipe:

Spoiler:
Code:
import string, re, time
from calibre import strftime
from calibre.web.feeds.recipes import BasicNewsRecipe
from calibre.ebooks.BeautifulSoup import BeautifulSoup
from datetime import date
from datetime import timedelta

class AdvancedUserRecipe1328808344(BasicNewsRecipe):
    title          = u'C-Fam Friday Fax'
    oldest_article = 10.66
    max_articles_per_feed = 100
    auto_cleanup = True

    def parse_index(self):
        soup = self.index_to_soup('http://www.c-fam.org/fridayfax/')
        articles = []
        feeds = []
        seenArticles = []

        for div in soup.findAll('div'):
            a = div.find('a', href=True, attrs={'class':'ffArchiveLink'})
            if not a:
                continue

            if a['href'] in seenArticles:
                continue
            seenArticles.append(a['href'])

            i = div.find('i')
            if not i:
                continue

            m = {'Jan': 1, 'Feb': 2, 'Mar': 3, 'Apr': 4, 'May': 5, 'Jun': 6,
                    'Jul': 7, 'Aug': 8, 'Sep': 9, 'Oct': 10, 'Nov': 11, 'Dec': 12}
            dateGroup = re.match(r"(?P<month>\w+) (?P<date>\d+), (?P<year>\d+)",
                                                  ''.join(i.findAll(text=True, recursive=False)).strip())
            artDate = date(int(dateGroup.group('year')), 
                                    m[dateGroup.group('month')], 
                                    int(dateGroup.group('date')))
            if (artDate <= (date.today() - timedelta(days=self.oldest_article))):
                continue

            pubdate = artDate.strftime('%a, %d %b')
#            pubdate = strftime('%a, %d %b')
#            pubdate = ''.join(i.findAll(text=True, recursive=False)).strip()
#            pubdate = dateGroup.group('year') + '{0}'.format(m[dateGroup.group('month')]) + dateGroup.group('date')
            url = 'http://www.c-fam.org/' + a['href']
            title = ''.join(a.findAll(text=True, recursive=False)).strip()
            description = ''
            articles.append({'title' : title,
                                       'url' : url,
                                       'date' : pubdate,
                                       'description' : pubdate})
#                                       'description' : description})

        if (len(articles) > 0):
            feeds.append((self.title, articles))
        else:
            raise ValueError('No articles found, aborting')

        return feeds


What's going on here? I don't understand why it won't work.
rjgrigaitis is offline   Reply With Quote
Reply

Thread Tools Search this Thread
Search this Thread:

Advanced Search

Forum Jump

Similar Threads
Thread Thread Starter Forum Replies Last Post
Calibre will not accept new books. Rusherman Calibre 5 06-10-2011 02:14 PM
Bulk Changing Published Date To Date hmf Calibre 4 10-19-2010 10:19 PM
Up-to-date candy teacher (date being 1921) kacir Deals and Resources (No Self-Promotion or Affiliate Links) 0 06-16-2010 04:18 PM
new official shipping date / US invitation date R2D2 iRex 18 07-06-2006 02:32 PM


All times are GMT -4. The time now is 09:44 PM.


MobileRead.com is a privately owned, operated and funded community.