Old 02-17-2014, 04:28 PM   #1
Al Watts
Location: Los Angeles County, CA
Device: Kindle Fire 7 HDX
Orange County Register broken?

I'm new to Calibre, so it may be me. I have successfully downloaded a copy of the Irish Times, but when I try to download the Orange County Register I can read the indexes of the stories but can't read any of the stories themselves. To see whether the problem is me or Calibre, I also tried to download Field and Stream, but it just takes forever and I don't know if it will ever finish. I have a very fast internet connection.
Old 08-10-2014, 11:19 PM   #2
rrrrrrrrrrrryan
Device: Kindle Paperwhite
Indeed, it appears that the OC Register changed their URLs a while back.

I was able to fix it.
Here's the updated recipe in case anyone stumbles across this thread:

Code:
#!/usr/bin/env python
__license__   = 'GPL v3'
__author__    = 'Lorenzo Vigentini (updated by rrrrrrrrrrrryan at gmail.com)'
__copyright__ = '2009, Lorenzo Vigentini <l.vigentini at gmail.com>'
description   = 'News from Orange county - v1.02 (10, August 2014)'

'''
http://www.ocregister.com/
'''

from calibre.web.feeds.news import BasicNewsRecipe

class ocRegister(BasicNewsRecipe):
    author        = 'Lorenzo Vigentini'
    description   = 'News from the Orange county'

    cover_url      = 'http://images.onset.freedom.com/ocregister/logo.gif'
    title          = u'Orange County Register'
    publisher      = 'Orange County Register Communication'
    category       = 'News, finance, economy, politics'

    language       = 'en'
    timefmt        = '[%a, %d %b, %Y]'

    oldest_article = 1
    max_articles_per_feed = 25
    use_embedded_content  = False
    recursion             = 10

    # remove_javascript     = True
    no_stylesheets        = True

    needs_subscription = "optional"

    use_javascript_to_login = True

    def javascript_login(self, browser, username, password):
        browser.visit('http://www.ocregister.com/sections/login')
        form = browser.select_form(nr=1) # Select the second form on the page
        form['username'] = username
        form['password_temp'] = password
        browser.submit(timeout=120) # Submit the form and wait at most two minutes for loading to complete

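    def print_version(self, url):
        # Build the printer-friendly URL from the article id in the feed URL.
        # Assumes the fifth '/'-separated segment looks like
        # '<word>-<id>-<rest>.html'; other layouts may raise IndexError
        # or pick up the wrong token as the id.
        printUrl = 'http://www.ocregister.com/common/printer/view.php?db=ocregister&id='
        segments = url.split('/')
        subSegments = segments[4].split('.')
        subSubSegments = subSegments[0].split('-')
        myArticle = subSubSegments[1]
        return printUrl + myArticle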

    keep_only_tags     = [
                            dict(name='div', attrs={'id':'ArticleContentWrap'})
                        ]

    remove_tags = [
                     dict(name='div', attrs={'class':'hideForPrint'}),
                     dict(name='div', attrs={'id':'ContentFooter'})
                  ]

    feeds          = [
                       (u'News', u'http://www.ocregister.com/common/rss/rss.php?catID=18800'),
                       (u'Top Stories', u'http://www.ocregister.com/common/rss/rss.php?catID=23541'),
                       (u'Business', u'http://www.ocregister.com/common/rss/rss.php?catID=18909'),
                       (u'Cars', u'http://www.ocregister.com/common/rss/rss.php?catID=20128'),
                       (u'Entertainment', u'http://www.ocregister.com/common/rss/rss.php?catID=18926'),
                       (u'Home', u'http://www.ocregister.com/common/rss/rss.php?catID=19142'),
                       (u'Life', u'http://www.ocregister.com/common/rss/rss.php?catID=18936'),
                       (u'Opinion', u'http://www.ocregister.com/common/rss/rss.php?catID=18963'),
                       (u'Sports', u'http://www.ocregister.com/common/rss/rss.php?catID=18901'),
                       (u'Travel', u'http://www.ocregister.com/common/rss/rss.php?catID=18959')
                     ]

    extra_css = '''
                h1 {color:#ff6600;font-family:Arial,Helvetica,sans-serif; font-size:20px; font-size-adjust:none; font-stretch:normal; font-style:normal; font-variant:normal; font-weight:bold; line-height:20px;}
                h2 {color:#4D4D4D;font-family:Arial,Helvetica,sans-serif; font-size:16px; font-size-adjust:none; font-stretch:normal; font-style:normal; font-variant:normal; font-weight:bold; line-height:16px; }
                h3 {color:#4D4D4D;font-family:Arial,Helvetica,sans-serif; font-size:15px; font-size-adjust:none; font-stretch:normal; font-style:normal; font-variant:normal; font-weight:bold; line-height:15px;}
                h4 {color:#333333; font-family:Arial,Helvetica,sans-serif;font-size:13px; font-size-adjust:none; font-stretch:normal; font-style:normal; font-variant:normal; font-weight:bold; line-height:13px; }
                h5 {color:#333333; font-family:Arial,Helvetica,sans-serif; font-size:11px; font-size-adjust:none; font-stretch:normal; font-style:normal; font-variant:normal; font-weight:bold; line-height:11px; text-transform:uppercase;}
                #articledate {color:#333333;font-family:Arial,Helvetica,sans-serif;font-size:10px; font-size-adjust:none; font-stretch:normal; font-style:italic; font-variant:normal; font-weight:bold; line-height:10px; text-decoration:none;}
                #articlebyline {color:#4D4D4D;font-family:Arial,Helvetica,sans-serif;font-size:10px; font-size-adjust:none; font-stretch:normal; font-style:normal; font-variant:normal; font-weight:bold; line-height:10px; text-decoration:none;}
                img {float:left;}
                #topstoryhead {color:#ff6600;font-family:Arial,Helvetica,sans-serif; font-size:22px; font-size-adjust:none; font-stretch:normal; font-style:normal; font-variant:normal; font-weight:bold; line-height:20px;}
                '''
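
In case it helps anyone adapting this later: the part most likely to break when the site changes its URLs again is print_version. Below is a minimal standalone sketch of the same idea that can be tried outside Calibre. It is not part of the recipe above. It assumes article URLs end in something like '<word>-<id>-<rest>.html' (the example URL in the usage note is made up), it looks at the last path component rather than the fifth segment, and it falls back to the original URL instead of raising when the layout is unexpected.

```python
from urllib.parse import urlsplit

# Printer-friendly base URL, copied from the recipe above.
PRINT_URL = 'http://www.ocregister.com/common/printer/view.php?db=ocregister&id='


def print_version(url):
    """Return the printer-friendly URL, or the original URL if no id is found."""
    # Last path component, e.g. 'word-123456-rest.html' (assumed layout).
    filename = urlsplit(url).path.rsplit('/', 1)[-1]
    parts = filename.split('.')[0].split('-')
    # Like the recipe, treat the second dash-separated token as the article id,
    # but only if it is actually numeric.
    if len(parts) > 1 and parts[1].isdigit():
        return PRINT_URL + parts[1]
    # Unexpected layout: hand back the original URL rather than crash.
    return url
```

For example, print_version('http://www.ocregister.com/articles/example-123456-story.html') returns the printer URL ending in id=123456, while a URL with no numeric second token comes back unchanged. Whether falling back to the regular page gives usable output depends on keep_only_tags matching that page, which I have not checked.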
All times are GMT -4. The time now is 02:46 AM.

