Old 10-26-2010, 08:33 AM   #1
scottsan
Connoisseur
Posts: 98
Karma: 10
Join Date: Apr 2008
Device: sony prs 505
NY Times problem

I am using calibre v.0.7.24 and Mac OS X 10.6.4.

I tried to fetch the NY Times today and received the following error message:
Failed: Fetch news from The New York Times.

Here is the report:


ERROR: Conversion Error: <b>Failed</b>: Fetch news from The New York Times

Fetch news from The New York Times
Resolved conversion options
calibre version: 0.7.24
{'asciiize': False,
'author_sort': None,
'authors': None,
'base_font_size': 0,
'book_producer': None,
'change_justification': 'original',
'chapter': None,
'chapter_mark': 'pagebreak',
'comments': None,
'cover': None,
'debug_pipeline': None,
'disable_font_rescaling': False,
'dont_download_recipe': False,
'dont_split_on_page_breaks': True,
'extra_css': None,
'extract_to': None,
'flow_size': 260,
'font_size_mapping': None,
'footer_regex': '(?i)(?<=<hr>)((\\s*<a name=\\d+></a>((<img.+?>)*<br>\\s*)?\\d+<br>\\s*.*?\\s*)|(\\s*<a name=\\d+></a>((<img.+?>)*<br>\\s*)?.*?<br>\\s*\\d+))(?=<br>)',
'header_regex': '(?i)(?<=<hr>)((\\s*<a name=\\d+></a>((<img.+?>)*<br>\\s*)?\\d+<br>\\s*.*?\\s*)|(\\s*<a name=\\d+></a>((<img.+?>)*<br>\\s*)?.*?<br>\\s*\\d+))(?=<br>)',
'html_unwrap_factor': 0.40000000000000002,
'input_encoding': None,
'input_profile': <calibre.customize.profiles.InputProfile object at 0x690de90>,
'insert_blank_line': False,
'insert_metadata': False,
'isbn': None,
'keep_ligatures': False,
'language': None,
'level1_toc': None,
'level2_toc': None,
'level3_toc': None,
'line_height': 0,
'linearize_tables': False,
'lrf': False,
'margin_bottom': 5.0,
'margin_left': 5.0,
'margin_right': 5.0,
'margin_top': 5.0,
'max_toc_links': 50,
'no_chapters_in_toc': False,
'no_default_epub_cover': False,
'no_inline_navbars': False,
'no_svg_cover': False,
'output_profile': <calibre.customize.profiles.SonyReaderOutput object at 0x6916270>,
'page_breaks_before': None,
'password': 'scottsan',
'prefer_metadata_cover': False,
'preprocess_html': False,
'preserve_cover_aspect_ratio': False,
'pretty_print': True,
'pubdate': None,
'publisher': None,
'rating': None,
'read_metadata_from_opf': None,
'remove_first_image': False,
'remove_footer': False,
'remove_header': False,
'remove_paragraph_spacing': False,
'remove_paragraph_spacing_indent_size': 1.5,
'series': None,
'series_index': None,
'smarten_punctuation': False,
'tags': None,
'test': False,
'timestamp': None,
'title': None,
'title_sort': None,
'toc_filter': None,
'toc_threshold': 6,
'use_auto_toc': False,
'username': 'ankaraaikikai@mac.com',
'verbose': 2}
Python function terminated unexpectedly: list index out of range
InputFormatPlugin: Recipe Input running
Queued 0 articles
Traceback (most recent call last):
File "/Applications/Ebook Software/calibre.app/Contents/Resources/Python/lib/python2.6/site.py", line 147, in main
return run_entry_point()
File "/Applications/Ebook Software/calibre.app/Contents/Resources/Python/lib/python2.6/site.py", line 116, in run_entry_point
return getattr(pmod, func)()
File "site-packages/calibre/utils/ipc/worker.py", line 107, in main
File "site-packages/calibre/gui2/convert/gui_conversion.py", line 24, in gui_convert
File "site-packages/calibre/ebooks/conversion/plumber.py", line 832, in run
File "site-packages/calibre/customize/conversion.py", line 216, in __call__
File "site-packages/calibre/web/feeds/input.py", line 105, in convert
File "site-packages/calibre/web/feeds/news.py", line 713, in download
File "site-packages/calibre/web/feeds/news.py", line 876, in build_index
IndexError: list index out of range


What's wrong??

Thanks
Old 10-26-2010, 09:03 AM   #2
Starson17
Wizard
Posts: 4,004
Karma: 177841
Join Date: Dec 2009
Device: WinMo: IPAQ; Android: HTC HD2, Archos 7o; Java:Gravity T
Quote:
Originally Posted by scottsan View Post
What's wrong??
I see it's been reported as a bug (#7304) and that more than one person sees the error. That's GRiker's recipe. I'm sure it will get assigned to him, or Kovid will tackle it.
Old 10-26-2010, 10:46 AM   #3
scottsan
Connoisseur
Posts: 98
Karma: 10
Join Date: Apr 2008
Device: sony prs 505
NY Times problem

I'm curious to know why this recipe worked with the previous version of Calibre, but now doesn't. When Calibre comes out with a new version are the recipes changed or modified in some way? And, if a recipe is working just fine, why would it be modified?
Old 10-26-2010, 11:07 AM   #4
Starson17
Wizard
Posts: 4,004
Karma: 177841
Join Date: Dec 2009
Device: WinMo: IPAQ; Android: HTC HD2, Archos 7o; Java:Gravity T
Quote:
Originally Posted by scottsan View Post
I'm curious to know why this recipe worked with the previous version of Calibre, but now doesn't.
It's probably because the New York Times changed the layout of their website, not because Calibre changed anything.

Quote:
When Calibre comes out with a new version are the recipes changed or modified in some way?
They may be changed even if Calibre does not release a new version. Calibre downloads the current recipe from its online source before using it. That source is updated whenever an updated version of a recipe is accepted.

Quote:
And, if a recipe is working just fine, why would it be modified?
To make it work better. I modify my recipes when I see something that I'd like improved, even if it's already working. You seem to think that something changed in Calibre to cause this. That's very unlikely. It's extremely common for a recipe to stop working, but I've never seen that happen as a result of a change by Calibre. It's possible, but not likely. The author always tests before changing a recipe. Occasionally, the internals of Calibre will change, and that could have an effect on an unchanged recipe, but again, it's unlikely that's what happened here. Site redesigns are almost always at the root of problems like this.
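
To illustrate the point about site redesigns, here is a minimal standalone sketch in plain Python 3. It is not calibre code: the class name HeadlineParser, the helper find_links, and both sample pages are hypothetical. It shows how a scraper keyed to a CSS class name finds nothing once the site renames that class, and how indexing the resulting empty list raises the same kind of IndexError seen in the report above (which also shows "Queued 0 articles"). Calibre's internals may well differ in detail.

```python
from html.parser import HTMLParser


class HeadlineParser(HTMLParser):
    """Collect hrefs of links nested inside elements with one exact class value."""

    def __init__(self, wanted_class):
        super().__init__()
        self.wanted_class = wanted_class
        self.depth = 0      # > 0 while we are inside a matching element
        self.links = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if self.depth:
            self.depth += 1
        elif attrs.get('class') == self.wanted_class:
            self.depth = 1
        if self.depth and tag == 'a' and 'href' in attrs:
            self.links.append(attrs['href'])

    def handle_endtag(self, tag):
        if self.depth:
            self.depth -= 1


def find_links(html, wanted_class):
    parser = HeadlineParser(wanted_class)
    parser.feed(html)
    parser.close()
    return parser.links


# Hypothetical markup before and after a redesign (only the class name changed).
OLD_PAGE = '<div class="story"><a href="http://example.com/a.html">A</a></div>'
NEW_PAGE = '<div class="storyBlock"><a href="http://example.com/a.html">A</a></div>'

old_links = find_links(OLD_PAGE, 'story')   # one link found
new_links = find_links(NEW_PAGE, 'story')   # nothing matches any more

try:
    first = new_links[0]
except IndexError as err:
    print('IndexError:', err)
```

The recipe itself did not change between the two runs; only the page did. That is why the usual fix is an updated list of class names in the recipe rather than a change to Calibre.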
Old 10-26-2010, 11:42 AM   #5
nickredding
onlinenewsreader.net
Posts: 324
Karma: 10143
Join Date: Dec 2009
Location: Phoenix, AZ & Victoria, BC
Device: Kindle 3, Kindle Fire, IPad3, iPhone4, Playbook, HTC Inspire
NYT recipe update

Format changes on the NYT web site. Here is an updated recipe:
Code:
#!/usr/bin/env  python

__license__   = 'GPL v3'
__copyright__ = '2008, Kovid Goyal <kovid at kovidgoyal.net>'
'''
nytimes.com
'''
import string, re, time
from calibre import strftime
from calibre.web.feeds.recipes import BasicNewsRecipe
from calibre.ebooks.BeautifulSoup import BeautifulSoup

def decode(self, src):
    enc = 'utf-8'
    if 'iso-8859-1' in src:
        enc = 'cp1252'
    return src.decode(enc, 'ignore')

class NYTimes(BasicNewsRecipe):

    title       = u'New York Times'
    __author__  = 'Kovid Goyal/Nick Redding'
    language = 'en'
    requires_version = (0, 6, 36)

    description = 'Daily news from the New York Times (subscription version)'
    timefmt = ' [%b %d]'
    needs_subscription = True
    remove_tags_before = dict(id='article')
    remove_tags_after  = dict(id='article')
    remove_tags = [dict(attrs={'class':['articleTools', 'post-tools', 'side_tool','nextArticleLink',
                                        'nextArticleLink clearfix','columnGroup doubleRule','doubleRule','entry-meta',
                                        'icon enlargeThis','columnGroup  last','relatedSearchesModule']}),
                   dict({'class':re.compile('^subNavigation')}),
                   dict({'class':re.compile('^leaderboard')}),
                   dict({'class':re.compile('^module')}),
                   dict({'class':'metaFootnote'}),
                   dict(id=['inlineBox','footer', 'toolsRight', 'articleInline','login','masthead',
                            'navigation', 'archive', 'side_search', 'blog_sidebar','cCol','portfolioInline',
                            'side_tool', 'side_index','header','readerReviewsCount','readerReviews',
                            'relatedArticles', 'relatedTopics', 'adxSponLink']),
                   dict(name=['script', 'noscript', 'style','form','hr'])]
    encoding = decode
    no_stylesheets = True
    extra_css = '''
                .articleHeadline { margin-top:0.5em; margin-bottom:0.25em; }
                .credit { font-size: small; font-style:italic; line-height:1em; margin-top:5px; margin-left:0; margin-right:0; margin-bottom: 0; }
                .byline { font-size: small; font-style:italic; line-height:1em; margin-top:10px; margin-left:0; margin-right:0; margin-bottom: 0; }
                .dateline { font-size: small; line-height:1em;margin-top:5px; margin-left:0; margin-right:0; margin-bottom: 0; }
                .kicker { font-size: small; line-height:1em;margin-top:5px; margin-left:0; margin-right:0; margin-bottom: 0; }
                .timestamp { font-size: small; }
                .caption { font-size: small; line-height:1em; margin-top:5px; margin-left:0; margin-right:0; margin-bottom: 0; }
                a:link {text-decoration: none; }'''

    def get_browser(self):
        br = BasicNewsRecipe.get_browser()
        if self.username is not None and self.password is not None:
            br.open('http://www.nytimes.com/auth/login')
            br.select_form(name='login')
            br['USERID']   = self.username
            br['PASSWORD'] = self.password
            raw = br.submit().read()
            if 'Sorry, we could not find the combination you entered. Please try again.' in raw:
                raise Exception('Your username and password are incorrect')
            #open('/t/log.html', 'wb').write(raw)
        return br

    def get_masthead_url(self):
        masthead = 'http://graphics8.nytimes.com/images/misc/nytlogo379x64.gif'
        #masthead = 'http://members.cox.net/nickredding/nytlogo.gif'
        br = BasicNewsRecipe.get_browser()
        try:
            br.open(masthead)
        except:
            self.log("\nMasthead unavailable")
            masthead = None
        return masthead


    def get_cover_url(self):
        cover = None
        st = time.localtime()
        year = str(st.tm_year)
        month = "%.2d" % st.tm_mon
        day = "%.2d" % st.tm_mday
        cover = 'http://graphics8.nytimes.com/images/' + year + '/' +  month +'/' + day +'/nytfrontpage/scan.jpg'
        br = BasicNewsRecipe.get_browser()
        try:
            br.open(cover)
        except:
            self.log("\nCover unavailable")
            cover = None
        return cover

    def short_title(self):
        return 'New York Times'

    def parse_index(self):
        self.encoding = 'cp1252'
        soup = self.index_to_soup('http://www.nytimes.com/pages/todayspaper/index.html')
        self.encoding = decode

        def feed_title(div):
            return ''.join(div.findAll(text=True, recursive=True)).strip()

        articles = {}
        key = None
        ans = []
        url_list = []

        def handle_article(div):
            a = div.find('a', href=True)
            if not a:
                return
            url = re.sub(r'\?.*', '', a['href'])
            if not url.startswith("http"):
                return
            if not url.endswith(".html"):
                return
            if 'podcast' in url:
                return
            url += '?pagewanted=all'
            if url in url_list:
                return
            url_list.append(url)
            title = self.tag_to_string(a, use_alt=True).strip()
            #self.log("Title: %s" % title)
            description = ''
            pubdate = strftime('%a, %d %b')
            summary = div.find(True, attrs={'class':'summary'})
            if summary:
                description = self.tag_to_string(summary, use_alt=False)
            author = ''
            authorAttribution = div.find(True, attrs={'class':'byline'})
            if authorAttribution:
                author = self.tag_to_string(authorAttribution, use_alt=False)
            feed = key if key is not None else 'Uncategorized'
            if not articles.has_key(feed):
                articles[feed] = []
            articles[feed].append(
                            dict(title=title, url=url, date=pubdate,
                                description=description, author=author,
                                content=''))
            


        # Find each section header, story, and headlines-only list by its class attribute
        for div in soup.findAll(True,
            attrs={'class':['section-headline', 'story', 'story headline','sectionHeader','headlinesOnly multiline flush']}):

            if div['class'] in ['section-headline','sectionHeader']:
                key = string.capwords(feed_title(div))
                articles[key] = []
                ans.append(key)
                #self.log('Section: %s' % key)                

            elif div['class'] in ['story', 'story headline'] :
                handle_article(div)
            elif div['class'] == 'headlinesOnly multiline flush':
                for lidiv in div.findAll('li'):
                    handle_article(lidiv)
                    
#        ans = self.sort_index_by(ans, {'The Front Page':-1,
#                                      'Dining In, Dining Out':1,
#                                     'Obituaries':2})
        ans = [(key, articles[key]) for key in ans if articles.has_key(key)]

        return ans

    def preprocess_html(self, soup):
        kicker_tag = soup.find(attrs={'class':'kicker'})
        if kicker_tag:
            tagline = self.tag_to_string(kicker_tag)
            #self.log("FOUND KICKER %s" % tagline)
            if tagline=='Op-Ed Columnist':
                img_div = soup.find('div','inlineImage module')
                #self.log("Searching for photo")
                if img_div:
                    img_div.extract()
                    #self.log("Photo deleted")
        refresh = soup.find('meta', {'http-equiv':'refresh'})
        if refresh is None:
            return soup
        content = refresh.get('content').partition('=')[2]
        raw = self.browser.open_novisit('http://www.nytimes.com'+content).read()
        return BeautifulSoup(raw.decode('cp1252', 'replace'))
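
For anyone adapting the recipe, the link filtering done inside handle_article can be tried on its own. The following is a standalone Python 3 sketch of that step, not part of the recipe: the function name normalize_article_url and the sample URLs are made up for illustration, and it uses a set for the already-seen URLs where the recipe uses a list.

```python
import re


def normalize_article_url(href, seen):
    """Return the single-page article URL, or None if the link should be skipped."""
    url = re.sub(r'\?.*', '', href)     # drop any query string
    if not url.startswith('http'):      # relative links are ignored
        return None
    if not url.endswith('.html'):       # only article pages
        return None
    if 'podcast' in url:                # podcasts have no article text
        return None
    url += '?pagewanted=all'            # ask for the whole article on one page
    if url in seen:                     # the same story can be linked twice
        return None
    seen.add(url)
    return url


seen = set()
# Hypothetical links as they might appear on the index page.
print(normalize_article_url('http://www.nytimes.com/2010/10/26/world/26sample.html?hp', seen))
print(normalize_article_url('http://www.nytimes.com/2010/10/26/world/26sample.html?ref=world', seen))
print(normalize_article_url('/pages/todayspaper/index.html', seen))
```

The first call returns the cleaned URL with ?pagewanted=all appended; the second returns None because it is the same article under a different query string; the third returns None because the link is relative.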
Old 10-26-2010, 12:41 PM   #6
scottsan
Connoisseur
Posts: 98
Karma: 10
Join Date: Apr 2008
Device: sony prs 505
NY Times problem

thanks,
it worked like a charm
Old 10-30-2010, 03:25 AM   #7
bevdeforges
Addict
Posts: 288
Karma: 1094000
Join Date: Mar 2010
Location: Essonne, France
Device: Kobo Forma; Sony PRS600B; Sony 350; Sony T-2
Quote:
Originally Posted by nickredding View Post
Format changes on the NYT web site. Here is an updated recipe:
Say, many thanks for that. I had noticed the change in format in the e-mail "Top Stories" I get, but since I only download the Times a couple times a week hadn't gotten around to "engaging with" the issue. You saved me lots of time and effort!
Old 10-31-2010, 06:26 AM   #8
tylau0
Connoisseur
Posts: 82
Karma: 10
Join Date: Oct 2010
Device: Kindle
Any hope of having the "Top Stories" recipe updated as well?

Quote:
Originally Posted by nickredding View Post
Format changes on the NYT web site. Here is an updated recipe:
Old 10-31-2010, 09:24 PM   #9
nickredding
onlinenewsreader.net
Posts: 324
Karma: 10143
Join Date: Dec 2009
Location: Phoenix, AZ & Victoria, BC
Device: Kindle 3, Kindle Fire, IPad3, iPhone4, Playbook, HTC Inspire
I'm working on it--check back on Tuesday!
Old 11-28-2010, 08:02 PM   #10
prouss
Member
Posts: 11
Karma: 10
Join Date: Nov 2010
Device: Kindle
Hi, I'm new to Calibre and still find it a little confusing. I've been having difficulty getting it to download from the NYT too, so this seems to be the solution. Do I just copy and paste the entire recipe to replace the existing one, click on Add Recipe and save? Thanks
Old 11-28-2010, 11:42 PM   #11
nickredding
onlinenewsreader.net
Posts: 324
Karma: 10143
Join Date: Dec 2009
Location: Phoenix, AZ & Victoria, BC
Device: Kindle 3, Kindle Fire, IPad3, iPhone4, Playbook, HTC Inspire
Kovid has updated the standard recipes--all you have to do is select "New York Times" or "New York Times Headlines" from the English recipes. Note that "New York Times" requires a (free) login which you can get from www.nytimes.com.
Old 11-29-2010, 03:37 PM   #12
prouss
Member
Posts: 11
Karma: 10
Join Date: Nov 2010
Device: Kindle
Thanks. It seemed to be having difficulties with my existing NYT account and login, so I created a new one just for Calibre and it worked perfectly.
Old 12-02-2010, 09:21 PM   #13
Phoul
Dances with penguins
Posts: 54
Karma: 10
Join Date: Oct 2010
Device: Sony PRS-350
Quote:
Originally Posted by nickredding View Post
Kovid has updated the standard recipes--all you have to do is select "New York Times" or "New York Times Headlines" from the English recipes. Note that "New York Times" requires a (free) login which you can get from www.nytimes.com.
Hrm, I'm still unable to fetch the times through the normal means.
Old 12-03-2010, 10:56 AM   #14
nickredding
onlinenewsreader.net
Posts: 324
Karma: 10143
Join Date: Dec 2009
Location: Phoenix, AZ & Victoria, BC
Device: Kindle 3, Kindle Fire, IPad3, iPhone4, Playbook, HTC Inspire
Quote:
Originally Posted by Phoul View Post
Hrm, I'm still unable to fetch the times through the normal means.
Can you be more specific?
Old 12-03-2010, 10:57 AM   #15
nickredding
onlinenewsreader.net
Posts: 324
Karma: 10143
Join Date: Dec 2009
Location: Phoenix, AZ & Victoria, BC
Device: Kindle 3, Kindle Fire, IPad3, iPhone4, Playbook, HTC Inspire
Phoul - perhaps you are having the login problem that surfaced today due to NYT format changes -- see https://www.mobileread.com/forums/sho...d.php?t=109611 for the solution.