Old 03-21-2008, 11:28 PM   #241
Deputy-Dawg
Groupie
Posts: 153
Karma: 799
Join Date: Dec 2007
Device: sony prs505
Kovid,
Thanks for the fixed recipe for USAToday. Looks much better to these tired eyes. Also thanks for the tip about cron. I did not realize such a utility was available on the Mac. Maybe it's time to take a look under the hood.

Searching the web, I found a GUI for cron called CronniX (version 3.0.2). When you run it, it lets you create a custom crontab file.

When I run the following command from the bash terminal:

feeds2lrf --output=/users/billc/desktop/news.lrf desktop/books/nwa2.py

I get an output file called news.lrf on my desktop. I then deleted the file, put the same command into CronniX, and used the 'Run Now' command (under the 'Task' drop-down menu). All I got was:

Running command
feeds2lrf --output=/users/billc/desktop/news.lrf desktop/books/nwa2.py
The output will appear below when the command has finished executing
Fetching feeds...

Then the program goes off into la-la land and produces no output. Clearly something is wrong! Is there one of those cryptic commands like sh that should precede the main command? Or what?
Old 03-22-2008, 01:07 AM   #242
kovidgoyal
creator of calibre
Posts: 43,866
Karma: 22666666
Join Date: Oct 2006
Location: Mumbai, India
Device: Various
Quote:
Originally Posted by ddavtian View Post
"import time" is already there and it works with web2lrf.
Attach it here.
Old 03-22-2008, 01:08 AM   #243
kovidgoyal
creator of calibre
Quote:
Originally Posted by Deputy-Dawg View Post
feeds2lrf --output=/users/billc/desktop/news.lrf desktop/books/nwa2.py

[...] I put the same command into CronniX and used the 'Run Now' command [...] the program goes off into la-la land and produces no output.
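Use an absolute path to nwa2.py. A relative path like desktop/books/nwa2.py is resolved against the working directory the command is started in, and that is not necessarily your home directory when CronniX or cron runs it.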
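
To illustrate the point about relative paths, here is a minimal Python sketch. It is not part of feeds2lrf, and the directory names (/Users/example and /) are made up for the illustration. It only shows that the same relative string names different files depending on where the command is launched, while an absolute path does not.

```python
import posixpath  # POSIX path rules, as used on Mac OS X


def resolve(path, cwd):
    """Return the file that `path` names when the process runs in `cwd`."""
    return posixpath.normpath(posixpath.join(cwd, path))


recipe = "desktop/books/nwa2.py"

# Hypothetical working directories: a terminal opened in the home
# directory versus a scheduler that starts the command elsewhere.
from_terminal = resolve(recipe, "/Users/example")
from_scheduler = resolve(recipe, "/")

print(from_terminal)   # the recipe file you meant
print(from_scheduler)  # a different, probably nonexistent, file

# An absolute path names the same file from any working directory.
absolute = "/Users/example/desktop/books/nwa2.py"
same_everywhere = resolve(absolute, "/") == resolve(absolute, "/Users/example")
print(same_everywhere)
```

So in a scheduled job it is safest to spell out both the --output path and the recipe path in full.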
Old 03-22-2008, 01:55 AM   #244
ddavtian
Addict
Posts: 271
Karma: 332
Join Date: Nov 2003
Location: San Francisco, USA
Device: Sage, Elipsa, Oasis, Galaxy Tab 8U, S22U
Quote:
Originally Posted by kovidgoyal View Post
Attach it here.
Attached, as a txt file.

Thanks in advance.
Attached Files
File Type: txt wsj2.txt (2.6 KB, 332 views)
Old 03-22-2008, 05:19 AM   #245
kovidgoyal
creator of calibre
Move the import statements to just above where the imported modules are used. A proper fix will be in the next release. Why aren't you using the built-in Wall Street Journal?
Old 03-22-2008, 12:03 PM   #246
ddavtian
Addict
Thanks Kovid.

It helped, now it runs. But it didn't get any articles (jumped from "0% Starting download" to "100% Feeds downloaded"). I'll try to fix it myself.

The built-in WSJ recipe is good, but it is missing many articles from the paper edition. This one was getting all the articles from the paper.

David
Old 03-22-2008, 12:11 PM   #247
kovidgoyal
creator of calibre
You can still run it using web2lrf instead of feeds2lrf
Old 03-22-2008, 02:26 PM   #248
Deputy-Dawg
Groupie
Boy you talk about being invincibly ignorant. I knew enough to use the absolute path to the saved file but it never occurred to me that you should use the absolute path to the recipe file. All of which is to say it works! Thanks.

Do you have any idea what the publication date is for the current edition of the Atlantic Monthly? I would like to set up a crontab entry to capture it each month.
Old 03-22-2008, 04:01 PM   #249
Deputy-Dawg
Groupie
Kovid,
I downloaded the Atlantic Monthly recipe from your website with the intention of modifying it to capture their daily feed. I modified the recipe as follows:

Code:
#!/usr/bin/env  python

##    Copyright (C) 2008 Kovid Goyal kovid@kovidgoyal.net
##    This program is free software; you can redistribute it and/or modify
##    it under the terms of the GNU General Public License as published by
##    the Free Software Foundation; either version 2 of the License, or
##    (at your option) any later version.
##
##    This program is distributed in the hope that it will be useful,
##    but WITHOUT ANY WARRANTY; without even the implied warranty of
##    MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
##    GNU General Public License for more details.
##
##    You should have received a copy of the GNU General Public License along
##    with this program; if not, write to the Free Software Foundation, Inc.,
##    51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA.
'''
thecurrent.theatlantic.com
'''

from libprs500.web.feeds.news import BasicNewsRecipe
from libprs500.ebooks.BeautifulSoup import BeautifulSoup

class TheAtlantic(BasicNewsRecipe):
    
    title = 'TheCurrent.The Atlantic'
    INDEX = 'http://thecurrent.theatlantic.com/'
    
    remove_tags_before = dict(name='div', id='storytop')
    remove_tags        = [dict(name='div', id='seealso')]
    extra_css          = '#bodytext {line-height: 1}'
    
    def parse_index(self):
        articles = []
        
        src = self.browser.open(self.INDEX).read()
        soup = BeautifulSoup(src, convertEntities=BeautifulSoup.HTML_ENTITIES)
        
        issue = soup.find('span', attrs={'class':'issue'})
        if issue:
            self.timefmt = ' [%s]'%self.tag_to_string(issue).rpartition('|')[-1].strip().replace('/', '-')
        
        for item in soup.findAll('div', attrs={'class':'item'}):
            a = item.find('a')
            if a and a.has_key('href'):
                url = a['href']
                url = 'http://www.theatlantic.com/'+url.replace('/doc', 'doc/print')
                title = self.tag_to_string(a)
                byline = item.find(attrs={'class':'byline'})
                date = self.tag_to_string(byline) if byline else ''
                description = ''
                articles.append({
                                 'title':title,
                                 'date':date,
                                 'url':url,
                                 'description':description
                                })
                
        
        return {'Daily Issue' : articles }
When I run it I get:

Macintosh-3:books billc$ feeds2lrf atlantic-1.py
Fetching feeds...
0% [----------------------------------------------------------------------]
Fetching feeds... Traceback (most recent call last):
File "/Users/billc/Downloads/libprs500-1.app/Contents/Resources/feeds2lrf.py", line 9, in <module>
main()
File "libprs500/ebooks/lrf/feeds/convert_from.pyo", line 52, in main
File "libprs500/web/feeds/main.pyo", line 141, in run_recipe
File "libprs500/web/feeds/news.pyo", line 411, in download
File "libprs500/web/feeds/news.pyo", line 514, in build_index
File "<string>", line 37, in parse_index
NameError: global name 'BeautifulSoup' is not defined
Macintosh-3:books billc$


But it seems to me that 'BeautifulSoup' is defined on line 22, i.e.:

Code:
from libprs500.ebooks.BeautifulSoup import BeautifulSoup
What have I done wrong?

I went back and ran the unmodified recipe in terminal mode and got the same result.

Last edited by Deputy-Dawg; 03-22-2008 at 04:56 PM. Reason: added info
Old 03-22-2008, 05:42 PM   #250
kovidgoyal
creator of calibre
Quote:
Originally Posted by Deputy-Dawg View Post
Do you have any idea what the publication date is for the current edition of the Atlantic Monthly? I would like to set up a command in Crontab to capture it each month.
Use the special schedule string @monthly in cron and it will be downloaded once a month, at midnight on the first day of the month.
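
As a rough sketch of what such a crontab entry could look like, the Python snippet below just assembles the line. It assumes your cron accepts the @monthly shorthand (Vixie-style crons do), and the two file paths are hypothetical placeholders, not paths from this thread. The helper name monthly_entry is made up for the example.

```python
import shlex


def monthly_entry(output_path, recipe_path):
    """Build a crontab line that runs feeds2lrf once a month.

    Both paths must be absolute, because a scheduled job does not
    necessarily start in the directory you expect.
    """
    for path in (output_path, recipe_path):
        if not path.startswith("/"):
            raise ValueError("use an absolute path: %r" % path)
    command = "feeds2lrf --output=%s %s" % (
        shlex.quote(output_path),
        shlex.quote(recipe_path),
    )
    return "@monthly " + command


# Hypothetical locations, for illustration only.
entry = monthly_entry(
    "/Users/example/Desktop/atlantic.lrf",
    "/Users/example/books/atlantic.py",
)
print(entry)
```

Cron usually runs with a minimal PATH, so it may also be necessary to give the full path to feeds2lrf itself.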
Old 03-22-2008, 05:43 PM   #251
kovidgoyal
creator of calibre
There's a bug that causes problems with custom recipes. Just copy the import statement to the line just above where it is used and you should be fine.
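
For anyone curious why moving the import helps: the sketch below shows one common way to get this kind of NameError in Python. It is only a guess at the mechanism, not the actual libprs500 code. The assumption is that the recipe source is run with exec() using separate global and local namespaces; json stands in for BeautifulSoup so the example is self-contained.

```python
# Stand-in for a custom recipe file: one function relies on the
# module-level import, the other imports inside the function body.
recipe_source = '''
import json

def uses_module_level_import():
    return json.dumps([1, 2])

def uses_local_import():
    import json
    return json.dumps([1, 2])
'''

# Assumed loading mechanism: exec() with two different dictionaries.
# Top-level names (including the import) land in local_ns, but the
# functions look up free names in global_ns, where json is missing.
global_ns, local_ns = {}, {}
exec(recipe_source, global_ns, local_ns)

try:
    local_ns['uses_module_level_import']()
    outcome = 'ok'
except NameError:
    outcome = 'NameError'

print(outcome)
print(local_ns['uses_local_import']())
```

An import placed inside the function runs at call time and binds the name locally, which is why the workaround is effective regardless of how the file was loaded.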
Old 03-22-2008, 06:58 PM   #252
Deputy-Dawg
Groupie
Kovid,
I modified the code as follows:

Code:
#!/usr/bin/env  python

##    Copyright (C) 2008 Kovid Goyal kovid@kovidgoyal.net
##    This program is free software; you can redistribute it and/or modify
##    it under the terms of the GNU General Public License as published by
##    the Free Software Foundation; either version 2 of the License, or
##    (at your option) any later version.
##
##    This program is distributed in the hope that it will be useful,
##    but WITHOUT ANY WARRANTY; without even the implied warranty of
##    MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
##    GNU General Public License for more details.
##
##    You should have received a copy of the GNU General Public License along
##    with this program; if not, write to the Free Software Foundation, Inc.,
##    51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA.
'''
theatlantic.com
'''
import re
from libprs500.web.feeds.news import BasicNewsRecipe

class TheAtlantic(BasicNewsRecipe):
    
    title = 'The Atlantic'
    INDEX = 'http://www.theatlantic.com/doc/current'
    
    remove_tags_before = dict(name='div', id='storytop')
    remove_tags        = [dict(name='div', id='seealso')]
    extra_css          = '#bodytext {line-height: 1}'
    
    def parse_index(self):
        articles = []
        
        src = self.browser.open(self.INDEX).read()
        from libprs500.ebooks.BeautifulSoup import BeautifulSoup
        soup = BeautifulSoup(src, convertEntities=BeautifulSoup.HTML_ENTITIES)

        issue = soup.find('span', attrs={'class':'issue'})
        if issue:
            self.timefmt = ' [%s]'%self.tag_to_string(issue).rpartition('|')[-1].strip().replace('/', '-')
        
        for item in soup.findAll('div', attrs={'class':'item'}):
            a = item.find('a')
            if a and a.has_key('href'):
                url = a['href']
                url = 'http://www.theatlantic.com/'+url.replace('/doc', 'doc/print')
                title = self.tag_to_string(a)
                byline = item.find(attrs={'class':'byline'})
                date = self.tag_to_string(byline) if byline else ''
                description = ''
                articles.append({
                                 'title':title,
                                 'date':date,
                                 'url':url,
                                 'description':description
                                })
                
        
        return {'Current Issue' : articles }
and now I get:

Macintosh-3:books billc$ feeds2lrf atlantic-2.py
Fetching feeds...
0% [----------------------------------------------------------------------]
Fetching feeds... Traceback (most recent call last):
File "/Users/billc/Downloads/libprs500.app/Contents/Resources/feeds2lrf.py", line 9, in <module>
main()
File "libprs500/ebooks/lrf/feeds/convert_from.pyo", line 52, in main
File "libprs500/web/feeds/main.pyo", line 141, in run_recipe
File "libprs500/web/feeds/news.pyo", line 411, in download
File "libprs500/web/feeds/news.pyo", line 515, in build_index
File "libprs500/web/feeds/__init__.pyo", line 193, in feeds_from_index
ValueError: too many values to unpack
Macintosh-3:books billc$
Old 03-22-2008, 07:17 PM   #253
kovidgoyal
creator of calibre
The return statement should be
Code:
return [('Current Issue', articles)]
You should probably look at the latest Atlantic profile in svn, as there were some changes.
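
To show why the dictionary fails with "too many values to unpack", here is a small stand-in. The function feeds_from_index_sketch is hypothetical and is not the real libprs500 code; it only assumes that the caller loops over the result of parse_index and unpacks each element into a title and a list of articles.

```python
def feeds_from_index_sketch(index):
    """Hypothetical consumer: expects an iterable of (title, articles) pairs."""
    feeds = []
    for title, articles in index:
        feeds.append((title, len(articles)))
    return feeds


articles = [{'title': 'Example', 'date': '',
             'url': 'http://example.com/', 'description': ''}]

# A list of tuples unpacks cleanly into (title, articles).
good = feeds_from_index_sketch([('Current Issue', articles)])
print(good)

# Iterating over a dict yields only its keys, so the loop tries to
# unpack the 13-character string 'Current Issue' into two names.
try:
    feeds_from_index_sketch({'Current Issue': articles})
    error = None
except ValueError as err:
    error = str(err)
print(error)
```

A list of tuples also keeps the sections in a fixed order, which a plain dict in the Python of that era did not guarantee.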

Last edited by kovidgoyal; 03-22-2008 at 07:24 PM.
Old 03-25-2008, 06:34 PM   #254
ddavtian
Addict
feeds2disk

Kovid, I tried to use "feeds2disk" for Newsweek (built-in profile gets very few articles from the latest issue) and got an error message:


C:\Misc\News\Newsweek>feeds2disk --feeds="['http://feeds.newsweek.com/newsweek/NationalNews','http://feeds.newsweek.com/headlines/business','http://feeds.newsweek.com/newsweek/WorldNews']"
Fetching feeds...
Traceback (most recent call last):
File "main.py", line 158, in <module>
File "main.py", line 153, in main
File "main.py", line 134, in run_recipe
UnboundLocalError: local variable 'is_profile' referenced before assignment


feeds2disk works fine with built-in profiles, but I always get this error when I specify the feed addresses with --feeds.

David
Old 03-25-2008, 06:47 PM   #255
kovidgoyal
creator of calibre
kovidgoyal ought to be getting tired of karma fortunes by now.kovidgoyal ought to be getting tired of karma fortunes by now.kovidgoyal ought to be getting tired of karma fortunes by now.kovidgoyal ought to be getting tired of karma fortunes by now.kovidgoyal ought to be getting tired of karma fortunes by now.kovidgoyal ought to be getting tired of karma fortunes by now.kovidgoyal ought to be getting tired of karma fortunes by now.kovidgoyal ought to be getting tired of karma fortunes by now.kovidgoyal ought to be getting tired of karma fortunes by now.kovidgoyal ought to be getting tired of karma fortunes by now.kovidgoyal ought to be getting tired of karma fortunes by now.
 
kovidgoyal's Avatar
 
Posts: 43,866
Karma: 22666666
Join Date: Oct 2006
Location: Mumbai, India
Device: Various
Will be fixed in the next release.
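
For reference, an UnboundLocalError like the one above typically means a local variable is assigned on only some code paths. The sketch below is a guess at the general pattern, not the real main.py. The function names and the '.profile' check are invented for illustration; only the variable name is_profile comes from the traceback.

```python
def run_recipe_buggy(recipe=None, feeds=None):
    """is_profile is assigned only when a recipe name is given."""
    if recipe is not None:
        is_profile = recipe.endswith('.profile')  # hypothetical check
    # With only feeds supplied, is_profile was never assigned.
    return 'profile' if is_profile else 'recipe'


def run_recipe_fixed(recipe=None, feeds=None):
    """Give the variable a default before any branching."""
    is_profile = False
    if recipe is not None:
        is_profile = recipe.endswith('.profile')  # hypothetical check
    return 'profile' if is_profile else 'recipe'


feeds = ['http://feeds.newsweek.com/newsweek/WorldNews']

try:
    run_recipe_buggy(feeds=feeds)
    buggy_result = 'ok'
except UnboundLocalError:
    buggy_result = 'UnboundLocalError'

print(buggy_result)
print(run_recipe_fixed(feeds=feeds))
```

This would explain why built-in profiles work while the --feeds form fails, although only the actual source can confirm it.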