Old 12-10-2018, 11:12 AM   #1
NSILMike
Guru
Posts: 735
Karma: 35936
Join Date: Apr 2011
Location: Shrewsury, MA
Device: Lenovo Android Tablet
ESPN recipe fails

(the one by Kovid and Raman)

Trying to get latest version of recipe: espn
Python function terminated unexpectedly
HTTP Error 401: Unauthorized (Error Code: 1)
Traceback (most recent call last):
File "site.py", line 101, in main
File "site.py", line 78, in run_entry_point
File "site-packages\calibre\utils\ipc\worker.py", line 199, in main
File "site-packages\calibre\gui2\convert\gui_conversion.py", line 35, in gui_convert_recipe
File "site-packages\calibre\gui2\convert\gui_conversion.py", line 27, in gui_convert
File "site-packages\calibre\ebooks\conversion\plumber.py", line 1106, in run
File "site-packages\calibre\customize\conversion.py", line 244, in __call__
File "site-packages\calibre\ebooks\conversion\plugins\recipe_input.py", line 135, in convert
File "site-packages\calibre\web\feeds\news.py", line 901, in __init__
File "<string>", line 82, in get_browser
File "site-packages\mechanize\_mechanize.py", line 254, in open
File "site-packages\mechanize\_mechanize.py", line 310, in _mech_open
mechanize._response.httperror_seek_wrapper: HTTP Error 401: Unauthorized
Old 12-11-2018, 12:18 AM   #2
kovidgoyal
creator of calibre
Posts: 43,778
Karma: 22666666
Join Date: Oct 2006
Location: Mumbai, India
Device: Various
I need ESPN account credentials to look at that.
Old 12-11-2018, 07:32 AM   #3
NSILMike
You can create a login, or log in with Facebook. And if you don't have a login, how did you originally create the recipe?

Last edited by NSILMike; 12-13-2018 at 10:05 AM.
Old 12-20-2018, 11:06 AM   #4
NSILMike
Quote:
Originally Posted by kovidgoyal View Post
I need ESPN account credentials to look at that.
See my prior reply.
Old 12-21-2018, 09:49 AM   #5
NSILMike
Just downloaded Calibre 3.36, which says the ESPN recipe is improved. Now it doesn't fail, but it downloads only links...
Old 12-21-2018, 12:14 PM   #6
kovidgoyal
Yeah, I looked at it briefly. ESPN uses a complicated JavaScript-based mechanism to log in, which I don't have the time/interest to reverse engineer.
Old 12-21-2018, 12:18 PM   #7
NSILMike
Quote:
Originally Posted by kovidgoyal View Post
Yeah, I looked at it briefly. ESPN uses a complicated JavaScript-based mechanism to log in, which I don't have the time/interest to reverse engineer.
Thanks, I appreciate your efforts.
Old 08-06-2020, 10:45 PM   #8
biffhero
Junior Member
Posts: 8
Karma: 10
Join Date: Aug 2020
Device: kobo libre h20
Quote:
Originally Posted by kovidgoyal View Post
Yeah, I looked at it briefly. ESPN uses a complicated JavaScript-based mechanism to log in, which I don't have the time/interest to reverse engineer.
I think we can get a little closer if we don't try to log in to the site.

I don't know anything about calibre, but I did find some information that I think might be of help.

In the file ./resources/builtin_recipes.zip I found a file called espn.recipe.

Looking at it, and looking at the web site, I tried this web page:

http://sports.espn.go.com/espn/rss/nfl/news

which gave me a bunch of stuff, including a URL that looked like this:

https://www.espn.com/nfl/story/_/id/...king-full-list

I saw something that looked interesting on line 109 of that file, so I tried to go to this page to get the story.

http://sports.espn.go.com/espn/print?id=29533526 which seems to work pretty well.

Looking at line 115, I saw that this sort of URL was an interesting idea.

https://www.espn.com/espn/print?id=29533526&type=story

And that one works as well.

Maybe this is all that needs to be changed?
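
To make the idea concrete, here is a minimal sketch (not the recipe's actual code) that turns an article ID into the two print-page URL forms tried above. The helper name print_urls is made up for illustration and is not part of calibre or espn.recipe.

```python
def print_urls(article_id):
    """Return the two print-page URL forms tried above for one article ID.

    Hypothetical helper for illustration only.
    """
    article_id = str(article_id)
    if not article_id.isdigit():
        raise ValueError('expected a numeric article ID, got %r' % article_id)
    return [
        # Older host, as tried first in this post.
        'http://sports.espn.go.com/espn/print?id=' + article_id,
        # Newer host, with the extra type=story parameter.
        'https://www.espn.com/espn/print?id=' + article_id + '&type=story',
    ]


for u in print_urls(29533526):
    print(u)
```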

Thanks,
Rob
Old 08-06-2020, 11:00 PM   #9
biffhero
Well, I can't tell whether it is working or not. It is not downloading because of age issues that I don't understand.

I am getting this message a lot:

Skipping article Bubbles are working for other sports. Why did the NFL decide against one? (Tue, 28 Jul, 2020 11:04) from feed www.espn.com - NFL as it is too old.


I'll keep poking around for a cache somewhere.

Thanks,
Rob
Old 08-06-2020, 11:12 PM   #10
kovidgoyal
Set oldest_article in the recipe to control that. And you don't need to look in builtin_recipes.zip to edit recipes; calibre has a UI for that: https://manual.calibre-ebook.com/news.html
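
As a rough illustration of what oldest_article controls, the sketch below mimics the age check with plain datetime arithmetic. It is not calibre's implementation: is_too_old is a hypothetical helper and the value of now is an assumed download time. The article timestamp is the one from the "too old" log message quoted earlier in the thread.

```python
from datetime import datetime, timedelta


def is_too_old(published, oldest_article, now):
    """Return True if an article is more than `oldest_article` days old.

    Hypothetical helper for illustration; calibre performs its own check.
    """
    return now - published > timedelta(days=oldest_article)


# Timestamp from the log message; `now` is an assumed download time.
published = datetime(2020, 7, 28, 11, 4)
now = datetime(2020, 8, 6, 23, 0)

print(is_too_old(published, 7, now))   # checked against a 7-day limit
print(is_too_old(published, 14, now))  # checked against a 14-day limit
```

In a recipe the setting is just a class attribute measured in days, for example oldest_article = 14.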
Old 08-07-2020, 02:10 AM   #11
biffhero
Thank you!

That was exactly where I needed to start. I copied some things from the other ESPN script, but left out the parts I don't understand well enough yet. Here's my script for now, in case anyone else wants to use it.

Code:
#!/usr/bin/env python2
# vim:fileencoding=utf-8
from __future__ import unicode_literals, division, absolute_import, print_function
from calibre.web.feeds.news import BasicNewsRecipe

class AdvancedUserRecipe1596778396(BasicNewsRecipe):
    title = 'espn_modified'
    description = 'Sports news'
    __author__ = 'Rob Walker'
    language = 'en'
    no_stylesheets = True
    use_embedded_content = False
    remove_javascript = True
    encoding = 'ISO-8859-1'
    oldest_article = 7  # in days; older articles are skipped as "too old"
    max_articles_per_feed = 100
    auto_cleanup = True

    remove_tags_before = dict(name='font', attrs={'class': 'date'})
    remove_tags = [
        dict(name='font', attrs={'class': 'footer'}),
        dict(name='hr', noshade='noshade'),
        dict(name='img', src='/winnercomm/horseracing/DRF.jpg'),
    ]

    extra_css = '''
                body{font-family:Verdana,Arial,Helvetica,sans-serif; font-size:x-small; font-weight:normal;}
                .subhead{color:#666666;font-family:Verdana,sans-serif; font-size:x-small; font-weight:bold;}
                .clearfix{font-family:Verdana,sans-serif; font-size:xx-small; }
                .date{ font-family:Verdana,Arial,Helvetica,sans-serif ; font-size:xx-small;color:#7A7A7A;}
                .byline{ font-family:Verdana,Arial,Helvetica,sans-serif ; font-size:xx-small;color:#666666;}
                .headline{font-family:Verdana,Arial,Helvetica,sans-serif ; font-size:large; font-weight:bold;}
                '''
    
    feeds = [
        ('Top Headlines', 'https://www.espn.com/espn/rss/news'),
        ('NFL', 'https://www.espn.com/espn/rss/nfl/news'),
        ('NBA', 'https://www.espn.com/espn/rss/nba/news'),
        ('MLB', 'https://www.espn.com/espn/rss/mlb/news'),
        ('NHL', 'https://www.espn.com/espn/rss/nhl/news'),
        ('Golf', 'https://www.espn.com/espn/rss/golf/news'),
        ('RPM', 'https://www.espn.com/espn/rss/rpm/news'),
        ('Boxing', 'https://www.espn.com/espn/rss/boxing/news'),
        ('Soccer', 'https://www.espn.com/espn/rss/soccer/news'),
        ('NCB', 'https://www.espn.com/espn/rss/ncb/news'),
        ('NCF', 'https://www.espn.com/espn/rss/ncf/news'),
        ('NCAA', 'https://www.espn.com/espn/rss/ncaa/news'),
        ('Olympics', 'https://www.espn.com/espn/rss/oly/news'),
        ('Equestrian', 'https://www.espn.com/espn/rss/horse/news'),
    ]
    
    def preprocess_html(self, soup):
        # Clear the inline style of any div whose style contains pixel values.
        for div in soup.findAll('div', style=True):
            if 'px' in div['style']:
                div['style'] = ''

        return soup

    def postprocess_html(self, soup, first_fetch):
        # Replace 'center' with 'left' in the inline styles of divs.
        for div in soup.findAll('div', style=True):
            div['style'] = div['style'].replace('center', 'left')

        return soup

Last edited by kovidgoyal; 08-07-2020 at 02:49 AM.
Old 08-07-2020, 02:02 PM   #12
biffhero
OK, I'm starting to understand how this stuff works. I think I'm making progress, but I'm not sure.

The base URL has changed.

feeds = [
    ('Top Headlines', 'http://sports.espn.go.com/espn/rss/news'),
    'http://sports.espn.go.com/espn/rss/nfl/news',
    'http://sports.espn.go.com/espn/rss/nba/news',
    'http://sports.espn.go.com/espn/rss/mlb/news',
    'http://sports.espn.go.com/espn/rss/nhl/news',
    'http://sports.espn.go.com/espn/rss/golf/news',
    'http://sports.espn.go.com/espn/rss/rpm/news',
    'http://sports.espn.go.com/espn/rss/tennis/news',
    'http://sports.espn.go.com/espn/rss/boxing/news',
    'http://soccernet.espn.go.com/rss/news',
    'http://sports.espn.go.com/espn/rss/ncb/news',
    'http://sports.espn.go.com/espn/rss/ncf/news',
    'http://sports.espn.go.com/espn/rss/ncaa/news',
    'http://sports.espn.go.com/espn/rss/outdoors/news',
    # 'http://sports.espn.go.com/espn/rss/bassmaster/news',
    'http://sports.espn.go.com/espn/rss/oly/news',
    'http://sports.espn.go.com/espn/rss/horse/news'
]


Therefore, in print_version() we need

return 'http://sports.espn.go.com/espn/print?' + match.group(1) + '&type=story'


However, where I'm getting confused is where "match" gets set up.

When we land inside print_version, the variable "url" is holding only the number. For instance, this is a good URL: https://www.espn.com/espn/print?id=29581539&type=story. But the 'url' variable is coming in as '29581539', and the 'match' variable is completely empty.

My current attempt has this in print_version(), which isn't working.

def print_version(self, url):
    if 'eticket' in url:
        return url.partition('&')[0].replace('story?', 'print?')
    match = re.search(r'story\?(id=\d+)', url)
    self.log.debug('url: %s' % (url))
    self.log.debug('match: %s' % (match.group(1)))
    match = 1
    articleId = url
    if match and 'soccernet' not in url and 'bassmaster' not in url:
        # return 'http://sports.espn.go.com/espn/print?' + match.group(1) + '&type=story'

        self.log.debug('i: %s' % (match.group(1)))

        # https://www.espn.com/espn/print?id=29581539&type=story
        # return 'http://www.espn.com/espn/print?' + match.group(1) + '&type=story'



I'll keep applying head to wall, but if this helps someone else get closer, that's good.
Old 08-07-2020, 02:15 PM   #13
biffhero
Sigh, that was completely wrong. Here is the correct information.

-------------



OK, I'm starting to understand how this stuff works. I think I'm making progress, but I'm not sure.

The base URL has changed.

feeds = [
    ('Top Headlines', 'https://www.espn.com/espn/rss/news'),
    'https://www.espn.com/espn/rss/nfl/news',
    'https://www.espn.com/espn/rss/nba/news',
    'https://www.espn.com/espn/rss/mlb/news',
    'https://www.espn.com/espn/rss/nhl/news',
    'https://www.espn.com/espn/rss/golf/news',
    'https://www.espn.com/espn/rss/rpm/news',
    'https://www.espn.com/espn/rss/tennis/news',
    'https://www.espn.com/espn/rss/boxing/news',
    'https://www.espn.com/espn/rss/soccer/news',
    # 'http://soccernet.espn.go.com/rss/news',
    'https://www.espn.com/espn/rss/ncb/news',
    'https://www.espn.com/espn/rss/ncf/news',
    'https://www.espn.com/espn/rss/ncaa/news',
    # 'https://www.espn.com/espn/rss/outdoors/news',
    # 'http://sports.espn.go.com/espn/rss/bassmaster/news',
    'https://www.espn.com/espn/rss/oly/news',
    'https://www.espn.com/espn/rss/horse/news'
]
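
The change of base URL can be expressed as a simple rewrite. The sketch below is only an illustration: update_feed_url and update_feeds are made-up helpers, and the mapping (old sports.espn.go.com and soccernet.espn.go.com hosts, new https://www.espn.com host) is inferred from the two feed lists posted in this thread.

```python
OLD_BASE = 'http://sports.espn.go.com'
NEW_BASE = 'https://www.espn.com'

# The soccer feed moved to a different path, so it needs its own entry.
SPECIAL_CASES = {
    'http://soccernet.espn.go.com/rss/news': NEW_BASE + '/espn/rss/soccer/news',
}


def update_feed_url(url):
    """Map an old-style ESPN feed URL to the new base; leave others alone."""
    if url in SPECIAL_CASES:
        return SPECIAL_CASES[url]
    if url.startswith(OLD_BASE):
        return NEW_BASE + url[len(OLD_BASE):]
    return url


def update_feeds(feeds):
    """Rewrite a feeds list whose entries are URLs or (title, URL) pairs."""
    updated = []
    for entry in feeds:
        if isinstance(entry, str):
            updated.append(update_feed_url(entry))
        else:
            title, url = entry
            updated.append((title, update_feed_url(url)))
    return updated


old_feeds = [
    ('Top Headlines', 'http://sports.espn.go.com/espn/rss/news'),
    'http://sports.espn.go.com/espn/rss/nfl/news',
    'http://soccernet.espn.go.com/rss/news',
]
for entry in update_feeds(old_feeds):
    print(entry)
```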

Therefore, in print_version() we need

return 'http://www.espn.com/espn/print?id=' + articleId + '&type=story'


However, where I'm getting confused is where "match" gets set up.

When we land inside print_version, the variable "url" is holding only the number. For instance, this is a good URL: https://www.espn.com/espn/print?id=29581539&type=story. But the 'url' variable is coming in as '29581539', and the 'match' variable is completely empty.

My current attempt has this in print_version(), which isn't working.

def print_version(self, url):
    if 'eticket' in url:
        return url.partition('&')[0].replace('story?', 'print?')
    match = re.search(r'story\?(id=\d+)', url)
    self.log.debug('url: %s' % (url))
    self.log.debug('match: %s' % (match.group(1)))
    match = 1
    articleId = url
    if match and 'soccernet' not in url and 'bassmaster' not in url:
        # return 'http://sports.espn.go.com/espn/print?' + match.group(1) + '&type=story'

        self.log.debug('i: %s' % (match.group(1)))

        # https://www.espn.com/espn/print?id=29581539&type=story
        # return 'http://www.espn.com/espn/print?' + match.group(1) + '&type=story'
        return 'http://www.espn.com/espn/print?id=' + articleId + '&type=story'
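
One reason an attempt like this fails: re.search returns None when the pattern is not found, so match.group(1) raises AttributeError for a bare ID such as '29581539'. After match = 1 the name refers to an integer, which has no group method either. The sketch below is one possible way around that, not necessarily what the final recipe does. The helper names, the accepted input forms, and the choice to return the input unchanged when no ID is found are all assumptions made for illustration.

```python
import re

# Accept 'story?id=123', '/id/123/...' or a bare numeric ID (assumed forms).
ID_PATTERNS = (
    re.compile(r'story\?id=(\d+)'),
    re.compile(r'/id/(\d+)'),
    re.compile(r'^(\d+)$'),
)


def extract_article_id(url):
    """Return the numeric article ID found in `url`, or None if there is none."""
    for pattern in ID_PATTERNS:
        match = pattern.search(url)
        if match is not None:  # guard before calling .group()
            return match.group(1)
    return None


def print_version(url):
    """Return the print URL for `url`, or `url` unchanged if no ID is found."""
    article_id = extract_article_id(url)
    if article_id is None:
        return url
    return 'http://www.espn.com/espn/print?id=' + article_id + '&type=story'


print(print_version('29581539'))
```

Inside a recipe class the function would take self as its first argument, as in the attempt above.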



I'll keep applying head to wall, but if this helps someone else get closer, that's good.
Old 08-08-2020, 06:33 AM   #14
kovidgoyal
There you go: https://github.com/kovidgoyal/calibr...1b1879f8d2d13f
Old 08-20-2020, 08:48 PM   #15
biffhero
I was out of town for a week, and I'm just getting back to this.

This works perfectly, thank you! I have imported it to ESPN_master, and it works great.

Thank you again,
Rob