Old 04-15-2012, 11:18 AM   #1
jumafl
Orlando Sentinel standard recipe not getting most news feeds

The Orlando Sentinel has added a timer box to most of its feeds, and it is keeping the standard recipe from working. The Sentinel is owned by the same company as the Chicago Tribune, so this is probably the same issue that Kovid fixed for the Tribune a month ago. You can see the timer box by following this link and then selecting one of the feeds: http://feeds.feedburner.com/orlandosentinel/business

Once the box has timed out, all feeds are accessible for a period of time, even after closing and reopening the browser.

The standard recipe and my log file are attached. Hopefully someone can figure out how to fix this recipe. I looked at the code Kovid used to fix the Chicago Tribune but couldn’t figure out what changes were needed to fix the Sentinel.

Orlando Sentinel standard recipe.txt

Orlando Sentinel log.txt
Old 04-15-2012, 10:49 PM   #2
kovidgoyal
creator of calibre
Here you go:

Code:
import urllib, re
from calibre.web.feeds.news import BasicNewsRecipe

class AdvancedUserRecipe1279258912(BasicNewsRecipe):
    title          = u'Orlando Sentinel'
    oldest_article = 3
    max_articles_per_feed = 100

    feeds          = [
        (u'News', u'http://feeds.feedburner.com/orlandosentinel/news'),
        (u'Opinion', u'http://feeds.feedburner.com/orlandosentinel/news/opinion'),
        (u'Business', u'http://feeds.feedburner.com/orlandosentinel/business'),
        (u'Technology', u'http://feeds.feedburner.com/orlandosentinel/technology'),
        (u'Space and Science', u'http://feeds.feedburner.com/orlandosentinel/news/space'),
        (u'Entertainment', u'http://feeds.feedburner.com/orlandosentinel/entertainment'),
        (u'Life and Family', u'http://feeds.feedburner.com/orlandosentinel/features/lifestyle'),
    ]
    __author__ = 'rty'
    publisher = 'OrlandoSentinel.com'
    description           = 'Orlando, Florida, Newspaper'
    category              = 'News, Orlando, Florida'


    remove_javascript = True
    use_embedded_content   = False
    no_stylesheets = True
    language = 'en'
    encoding               = 'utf-8'
    conversion_options = {'linearize_tables':True}
    masthead_url = 'http://www.orlandosentinel.com/media/graphic/2009-07/46844851.gif'

    auto_cleanup = True

    def get_article_url(self, article):
        ans = None
        try:
            # Prefer the real article URL embedded in the summary's bookmark link
            s = article.summary
            ans = urllib.unquote(
                re.search(r'href=".+?bookmark.cfm.+?link=(.+?)"', s).group(1))
        except Exception:
            # No summary or no bookmark link: fall back to the feedburner link
            pass
        if ans is None:
            link = article.get('feedburner_origlink', None)
            if link and link.split('/')[-1]=="story01.htm":
                # The original URL is encoded in the second-to-last path segment
                link=link.split('/')[-2]
                encoding = {'0B': '.', '0C': '/', '0A': '0', '0F': '=', '0G': '&',
                        '0D': '?', '0E': '-', '0N': '.com', '0L': 'http:',
                        '0S':'//'}
                for k, v in encoding.iteritems():
                    link = link.replace(k, v)
                ans = link
            elif link:
                ans = link
        if ans is not None:
            # Drop the RSS tracking parameter
            return ans.replace('?track=rss', '')
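
For anyone who wants to check the URL handling outside calibre, here is a minimal standalone sketch of the same two steps as get_article_url: pulling the link out of the summary's bookmark anchor, and decoding a feedburner "story01.htm" link. It assumes Python 3, so it uses urllib.parse.unquote and dict.items() in place of the Python 2 calls in the recipe as posted. The helper names and the sample URLs in the usage note are made up for illustration; they are not part of calibre or of the Sentinel's feeds.

```python
import re
from urllib.parse import unquote

# Same substitution table as in the recipe's get_article_url.
FEEDBURNER_CODES = {
    '0B': '.', '0C': '/', '0A': '0', '0F': '=', '0G': '&',
    '0D': '?', '0E': '-', '0N': '.com', '0L': 'http:', '0S': '//',
}


def url_from_summary(summary):
    """Hypothetical helper: return the URL-decoded link= value from the
    bookmark anchor in an article summary, or None if there is none."""
    m = re.search(r'href=".+?bookmark.cfm.+?link=(.+?)"', summary)
    return unquote(m.group(1)) if m else None


def decode_origlink(link):
    """Hypothetical helper: undo the two-character encoding used in
    .../<encoded-url>/story01.htm links; other links pass through."""
    parts = link.split('/')
    if parts[-1] != 'story01.htm':
        return link
    encoded = parts[-2]
    for code, plain in FEEDBURNER_CODES.items():
        encoded = encoded.replace(code, plain)
    return encoded


def article_url(summary, origlink):
    """Mirror the recipe's order: summary link first, then origlink."""
    ans = url_from_summary(summary or '')
    if ans is None and origlink:
        ans = decode_origlink(origlink)
    if ans is not None:
        # Drop the RSS tracking parameter, as the recipe does.
        return ans.replace('?track=rss', '')
    return None
```

With an invented summary such as `<a href="http://example.invalid/bookmark.cfm?title=x&link=http%3A%2F%2Fexample.com%2Fa%3Ftrack%3Drss">Share</a>`, article_url returns `http://example.com/a`. With an empty summary and the invented origlink `http://example.invalid/x/0L0Sexample0N0Cnews0Cstory0Bhtml/story01.htm`, it returns `http://example.com/news/story.html`.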
Old 04-16-2012, 12:29 AM   #3
jumafl
Thank you, Kovid. The updated version you posted worked perfectly.