Old 09-13-2010, 09:03 PM   #2716
TonytheBookworm
Addict
Posts: 264
Karma: 62
Join Date: May 2010
Device: kindle 2, kindle 3, Kindle fire
Quote:
Originally Posted by Starson17:
I'm getting a pretty clean version. I also run Adblock, but that only affects FireFox, not Calibre.
Interesting, because look:

Picture one shows what it looks like when I let Calibre fetch the feed.

The second picture shows what it looks like if I build the EPUB with
ebook-convert test.recipe myrecipe.epub --test
Attached Thumbnails: junk cap in kindle for pc.JPG; clean when using test.JPG
Old 09-13-2010, 09:18 PM   #2717
bhandarisaurabh
Enthusiast
Posts: 49
Karma: 10
Join Date: Aug 2009
Device: none
There is already a recipe for Foreign Policy, but it only covers the RSS feeds. Can anyone make a recipe for the print edition?
http://www.foreignpolicy.com/issues/current
Thanks in advance.
Old 09-13-2010, 09:26 PM   #2718
bhandarisaurabh
Enthusiast
Posts: 49
Karma: 10
Join Date: Aug 2009
Device: none
Quote:
Originally Posted by TonytheBookworm:
Here you go. I only did 2010. Each year appears to have different formatting, but a year's worth of content should be enough for now.
The recipe just gave the recent articles for 13 Sep, not the entire print magazine.
Old 09-13-2010, 11:28 PM   #2719
TonytheBookworm
Addict
Posts: 264
Karma: 62
Join Date: May 2010
Device: kindle 2, kindle 3, Kindle fire
Quote:
Originally Posted by bhandarisaurabh:
The recipe just gave the recent articles for 13 Sep, not the entire print magazine.
I'll look into it. Sorry about that.
Old 09-14-2010, 12:36 AM   #2720
sdow1
Connoisseur
Posts: 55
Karma: 10
Join Date: Apr 2010
Location: new york city
Device: nook, ipad
Slate has no content

Not sure if this is the right thread, but since it's where most of the discussion of the Slate recipe seems to be, I thought I'd try here first.

For the past few days, every time I download the Slate feed in Calibre, I just get the cover and the table of contents (such as it is, since it just says "all articles"), with no content whatsoever. I thought I'd wait through the weekend in case Slate itself is relatively "dead" on weekends, but today it was the same thing. I've also tried downloading at different times of day, in case that was the problem, but I get the same result: cover, but no content.

Help!!
Old 09-14-2010, 01:13 AM   #2721
TonytheBookworm
Addict
Posts: 264
Karma: 62
Join Date: May 2010
Device: kindle 2, kindle 3, Kindle fire
Quote:
Originally Posted by bhandarisaurabh:
The recipe just gave the recent articles for 13 Sep, not the entire print magazine.
Alright, here is the thing: that site has TONS of articles. The code below should work now. What it does is go through the links from the top down. I have the max articles set to 50, so you will get at most 50 articles and then it will stop. If you want 3000, then put in 3000 and hope for the best.

There may well be a more effective way of doing this; I personally do not know it. Secondly, someone with more knowledge than I have might know how to group the articles by their actual dates. I tested the current code on my end and received 50 unique articles, the earliest being from 9-15-2010.

I have pretty much done all I know how to do on this recipe at this point, and consider it "working but hobbling along" if anyone else cares to take a stab at it. If you get it working 100 percent, please share so I can learn from it.

Spoiler:

Code:
from calibre.web.feeds.news import BasicNewsRecipe
from calibre.ebooks.BeautifulSoup import BeautifulSoup, re

class FIELDSTREAM(BasicNewsRecipe):
    title       = 'Down To Earth Archive'
    __author__  = 'Tonythebookworm'
    description = ''
    language    = 'en'
    publisher   = 'Tonythebookworm'
    category    = ''
    use_embedded_content = False
    no_stylesheets       = True
    oldest_article       = 365
    remove_javascript    = True
    remove_empty_feeds   = True
    masthead_url         = 'http://downtoearth.org.in/themes/DTE/images/DownToEarth-Logo.gif'

    max_articles_per_feed = 50  # only gets the first 50 articles
    INDEX = 'http://downtoearth.org.in'

    # I HAVE LEFT THE PRINT STATEMENTS IN HERE FOR DEBUGGING PURPOSES.
    # Feel free to remove them.
    # This only parses the 2010 archives. The other years can be added and SHOULD work.

    def parse_index(self):
        feeds = []
        for title, url in [
                (u"2010 Archives", u"http://downtoearth.org.in/archives/2010"),
                ]:
            articles = self.make_links(url)
            if articles:
                feeds.append((title, articles))
        return feeds

    def make_links(self, url):
        current_articles = []
        soup = self.index_to_soup(url)
        # Each archive entry links to one issue; the issue date is the fourth
        # path segment of the link.
        for item in soup.findAll('div', attrs={'class': 'views-field-nothing-2'}):
            link = item.find('a')
            if link is None:
                continue
            date = link['href'].split('/')[3]
            print 'DATE IS :', date
            print 'the link is: ', link
            issue_soup = self.index_to_soup(self.INDEX + link['href'])
            # Collect every article (/node) link on the issue page, skipping
            # the Next Issue / Previous Issue navigation links.
            for items in issue_soup.findAll('div', attrs={'id': 'PageContent'}):
                for nodes in items.findAll('a', href=re.compile('/node')):
                    if not re.search('Next Issue', str(nodes)) and not re.search('Previous Issue', str(nodes)):
                        url = nodes['href']
                        title = self.tag_to_string(nodes)
                        print 'the title is: ', title
                        print 'the url is: ', url
                        current_articles.append({'title': date + '--' + title, 'url': url, 'description': '', 'date': ''})
        return current_articles

    def print_version(self, url):
        # Article links look like /node/<id>; the print version lives at /print/<id>.
        split1 = url.split('/')
        print 'THE SPLIT IS: ', split1
        print_url = 'http://downtoearth.org.in/print/' + split1[2]
        print 'THIS URL WILL PRINT: ', print_url
        return print_url

Last edited by TonytheBookworm; 09-14-2010 at 01:57 AM. Reason: typo and fixed indent in post and added date to title
Old 09-14-2010, 07:55 AM   #2722
Starson17
Wizard
Posts: 4,004
Karma: 177841
Join Date: Dec 2009
Device: WinMo: IPAQ; Android: HTC HD2, Archos 7o; Java:Gravity T
Quote:
Originally Posted by TonytheBookworm:
Interesting, because look:
The recipe I tested produces nothing like that. I went back to check your post, and either the recipe you posted has changed or I copied the recipe from one of your other posts before testing it (most likely). If I get a chance, I'll go back and test the recipe in your original post.
Old 09-14-2010, 03:58 PM   #2723
Starson17
Wizard
Posts: 4,004
Karma: 177841
Join Date: Dec 2009
Device: WinMo: IPAQ; Android: HTC HD2, Archos 7o; Java:Gravity T
Quote:
Originally Posted by TonytheBookworm:
When I run this recipe at the console with
ebook-convert test.recipe output_dir --test -vv > myrecipe.txt
I end up getting a nicely formatted article with no junk. Then, when I import it into Calibre to fully test it, I get junk.
I can (now) confirm I see it also. The weird thing is that you don't get the "junk" using ebook-convert, even when the recipe is stripped down to the absolute bare minimum of a feed and nothing more: the junk on the right side disappears, and the comments at the bottom disappear.
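A common workaround while that discrepancy gets sorted out is to strip the junk explicitly in the recipe. A minimal sketch, assuming hypothetical div names for the sidebar and comment blocks (check the page source for the real ones):

Code:
from calibre.web.feeds.news import BasicNewsRecipe

class StrippedFeed(BasicNewsRecipe):
    # Hypothetical minimal recipe; the feed URL is a placeholder.
    title = 'Stripped Feed'
    feeds = [('Feed', 'http://example.com/rss')]

    # Remove the right-side junk and the comment block even when the
    # fetch pipeline leaves them in. 'sidebar' and 'comments' are
    # hypothetical names, not taken from the site in question.
    remove_tags = [
        dict(name='div', attrs={'class': 'sidebar'}),
        dict(name='div', attrs={'id': 'comments'}),
    ]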
Old 09-14-2010, 04:51 PM   #2724
TonytheBookworm
Addict
Posts: 264
Karma: 62
Join Date: May 2010
Device: kindle 2, kindle 3, Kindle fire
Quote:
Originally Posted by Starson17:
I can (now) confirm I see it also. The weird thing is that you don't get the "junk" using ebook-convert, even when the recipe is stripped down to the absolute bare minimum of a feed and nothing more: the junk on the right side disappears, and the comments at the bottom disappear.
Yeah, I thought I was going crazy there for a second. I filed a bug report on this.

Last edited by TonytheBookworm; 09-14-2010 at 05:12 PM.
Old 09-14-2010, 07:53 PM   #2725
bhandarisaurabh
Enthusiast
Posts: 49
Karma: 10
Join Date: Aug 2009
Device: none

Quote:
Originally Posted by TonytheBookworm:
Alright, here is the thing: that site has TONS of articles. The code below should work now. [...]
[quoted recipe snipped; see post #2721 above]
Thanks, it worked like a charm. It just fetched 2 extra articles from the past issue; the rest was fine.
Old 09-16-2010, 06:35 AM   #2726
marbs
Zealot
Posts: 122
Karma: 10
Join Date: Jul 2010
Device: nook
So I need a little help

Some of my articles are not being downloaded. It tells me that it can't download them and to run with -vv to see why. How do you run with -vv? Or can anyone help me with my recipe?
Spoiler:

Code:
from calibre.web.feeds.news import BasicNewsRecipe

class AdvancedUserRecipe1283848012(BasicNewsRecipe):
    description = 'TheMarker'
    cover_url   = 'http://static.ispot.co.il/wp-content/upload/2009/09/themarker.jpg'
    title       = u'The Marker1'
    language    = 'he'
    simultaneous_downloads = 1
    delay                  = 4
    remove_javascript      = True
    timefmt        = '[%a, %d %b, %Y]'
    oldest_article = 1
    max_articles_per_feed = 1000
    extra_css = 'body{direction: rtl;} .article_description{direction: rtl; } a.article{direction: rtl; } .calibre_feed_description{direction: rtl; }'
    feeds = [
        (u'Head Lines', u'http://www.themarker.com/tmc/content/xml/rss/hpfeed.xml'),
        (u'TA Market', u'http://www.themarker.com/tmc/content/xml/rss/sections/marketfeed.xml'),
        (u'Real Estate', u'http://www.themarker.com/tmc/content/xml/rss/sections/realEstaterfeed.xml'),
        (u'Wall Street & Global', u'http://www.themarker.com/tmc/content/xml/rss/sections/wallsfeed.xml'),
        (u'Law', u'http://www.themarker.com/tmc/content/xml/rss/sections/lawfeed.xml'),
        (u'Media', u'http://www.themarker.com/tmc/content/xml/rss/sections/mediafeed.xml'),
        (u'Consumer', u'http://www.themarker.com/tmc/content/xml/rss/sections/consumerfeed.xml'),
        (u'Career', u'http://www.themarker.com/tmc/content/xml/rss/sections/careerfeed.xml'),
        (u'Car', u'http://www.themarker.com/tmc/content/xml/rss/sections/carfeed.xml'),
        (u'High Tech', u'http://www.themarker.com/tmc/content/xml/rss/sections/hightechfeed.xml'),
        (u'Investor Guide', u'http://www.themarker.com/tmc/content/xml/rss/sections/investorGuidefeed.xml'),
        ]

    def print_version(self, url):
        # Rewrite the article URL to its print-friendly form, then fetch the
        # .xml version of that page.
        baseURL = url.replace('tmc/article.jhtml?ElementId=',
                              'ibo/misc/printFriendly.jhtml?ElementId=%2Fibo%2Frepositories%2Fstories%2Fm1_2000%2F')
        return baseURL + '.xml'

What am I doing wrong?
Attached Files: my recipe.txt (1.8 KB)

Last edited by marbs; 09-16-2010 at 03:54 PM. Reason: I have no idea why it doesn't indent. The recipe is attached too.
Old 09-16-2010, 11:01 AM   #2727
Starson17
Wizard
Posts: 4,004
Karma: 177841
Join Date: Dec 2009
Device: WinMo: IPAQ; Android: HTC HD2, Archos 7o; Java:Gravity T
Quote:
Originally Posted by marbs:
Some of my articles are not being downloaded. It tells me that it can't download them and to run with -vv to see why. How do you run with -vv? Or can anyone help me with my recipe?
[quoted recipe snipped; it was re-pasted without code tags, so the indentation was lost]
Search this thread for "ebook-convert" to see how to use -vv. If you want help on your recipe, post it inside code tags to preserve indents, which are required. (Thanks for using spoiler tags, but they aren't enough to preserve indents. Just edit your post, add code tags inside the spoilers and repaste your indented recipe.)
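For example, the invocation used earlier in this thread writes the verbose log to a text file you can read:

Code:
ebook-convert test.recipe output_dir --test -vv > myrecipe.txt

The -vv flag turns on very verbose output, which should show why individual article downloads fail.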
Old 09-16-2010, 02:41 PM   #2728
dred
Junior Member
Posts: 1
Karma: 10
Join Date: Sep 2010
Device: Kindle
BMJ recipe??

Can anyone help me out with a recipe for the British Medical Journal?

The RSS page is at http://www.bmj.com/rss/

Unfortunately it's a fairly basic feed, and doesn't tell you much inside a dedicated news reader. Is it possible to download the linked articles, and not just the headlines, into Calibre?

Thanks
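A minimal starting point, sketched under assumptions: the URL below is just the RSS index page given in the post, not a specific feed, so substitute a concrete feed URL from that page; whether full articles come through depends on how the feed's links resolve.

Code:
from calibre.web.feeds.news import BasicNewsRecipe

class BMJ(BasicNewsRecipe):
    # Sketch only: the settings are generic defaults, not tuned to the site.
    title          = 'British Medical Journal'
    oldest_article = 7
    max_articles_per_feed = 50
    no_stylesheets    = True
    remove_javascript = True
    # False tells calibre to follow each item's link and download the
    # full article page instead of using only the feed summary.
    use_embedded_content = False

    # Placeholder: replace with a concrete feed chosen from
    # http://www.bmj.com/rss/
    feeds = [('BMJ', 'http://www.bmj.com/rss/')]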
Old 09-16-2010, 03:35 PM   #2729
TonytheBookworm
Addict
Posts: 264
Karma: 62
Join Date: May 2010
Device: kindle 2, kindle 3, Kindle fire
Starson17,
Hey, sorry to ask this question yet again. I'm simply not understanding it, even after reading the documentation and some of the code you have posted. Basically, I'm wondering why this will not work...
Spoiler:

Code:
def preprocess_html(self, soup):
    for credit_tag in soup.findAll('span', attrs={'class':['imageCredit rightFloat']}):
        p = Tag(soup, 'p')
        span.replaceWith(p)
        p.insert(0, span)

    return soup


What I'm trying to do is search for all the span tags whose class contains imageCredit... and then wrap each span in a <p> tag so it formats better.
As a result, though, I get no soup and the article is blank.
Here is the full code. I was just trying to clean up the AJC recipe a little bit.
Spoiler:

Code:
class AdvancedUserRecipe1282101454(BasicNewsRecipe):
    title       = 'The AJC'
    __author__  = 'TonytheBookworm'
    description = 'News from Atlanta and USA'
    publisher   = 'The Atlanta Journal'
    category    = 'news, politics, USA'
    oldest_article = 1
    max_articles_per_feed = 100
    no_stylesheets = True

    masthead_url = 'http://gawand.org/wp-content/uploads/2010/06/ajc-logo.gif'
    extra_css = '''
        h1{font-family:Arial,Helvetica,sans-serif; font-weight:bold;font-size:large;}
        h2{font-family:Arial,Helvetica,sans-serif; font-weight:normal;font-size:small;}
        p{font-family:Arial,Helvetica,sans-serif;font-size:small;}
        body{font-family:Helvetica,Arial,sans-serif;font-size:small;}
        '''

    keep_only_tags = [
        dict(name='div', attrs={'class':['cxArticleHeader']}),
        dict(attrs={'id':['cxArticleText']}),
        ]

    remove_tags = [
        dict(name='div', attrs={'class':'cxArticleList'}),
        dict(name='div', attrs={'class':'cxFeedTease'}),
        dict(name='div', attrs={'class':'cxElementEnlarge'}),
        dict(name='div', attrs={'id':'cxArticleTools'}),
        ]

    feeds = [
        ('Breaking News', 'http://www.ajc.com/genericList-rss.do?source=61499'),
        # -------------------------------------------------------------------
        # Here are the different area feeds. Choose whichever ones you wish
        # to read by simply removing the pound sign. I currently have it set
        # to only get the Cobb area.
        # -------------------------------------------------------------------
        #('Atlanta & Fulton', 'http://www.ajc.com/section-rss.do?source=atlanta'),
        #('Clayton', 'http://www.ajc.com/section-rss.do?source=clayton'),
        #('DeKalb', 'http://www.ajc.com/section-rss.do?source=dekalb'),
        #('Gwinnett', 'http://www.ajc.com/section-rss.do?source=gwinnett'),
        #('North Fulton', 'http://www.ajc.com/section-rss.do?source=north-fulton'),
        #('Metro', 'http://www.ajc.com/section-rss.do?source=news'),
        #('Cherokee', 'http://www.ajc.com/section-rss.do?source=cherokee'),
        ('Cobb', 'http://www.ajc.com/section-rss.do?source=cobb'),
        #('Fayette', 'http://www.ajc.com/section-rss.do?source=fayette'),
        #('Henry', 'http://www.ajc.com/section-rss.do?source=henry'),
        #('Q & A', 'http://www.ajc.com/genericList-rss.do?source=77197'),
        ('Opinions', 'http://www.ajc.com/section-rss.do?source=opinion'),
        ('Ga Politics', 'http://www.ajc.com/section-rss.do?source=georgia-politics-elections'),
        # -------------------------------------------------------------------
        # Here are the different sports feeds. I only follow the Falcons and
        # high school, but you can enable whichever team you like by removing
        # the pound sign.
        # -------------------------------------------------------------------
        #('Sports News', 'http://www.ajc.com/genericList-rss.do?source=61510'),
        #('Braves', 'http://www.ajc.com/genericList-rss.do?source=61457'),
        ('Falcons', 'http://www.ajc.com/genericList-rss.do?source=61458'),
        #('Hawks', 'http://www.ajc.com/genericList-rss.do?source=61522'),
        #('Dawgs', 'http://www.ajc.com/genericList-rss.do?source=61492'),
        #('Yellowjackets', 'http://www.ajc.com/genericList-rss.do?source=61523'),
        ('Highschool', 'http://www.ajc.com/section-rss.do?source=high-school'),
        ('Events', 'http://www.accessatlanta.com/section-rss.do?source=events'),
        ('Music', 'http://www.accessatlanta.com/section-rss.do?source=music'),
        ]

    def preprocess_html(self, soup):
        for credit_tag in soup.findAll('span', attrs={'class':['imageCredit rightFloat']}):
            p = Tag(soup, 'p')
            span.replaceWith(p)
            p.insert(0, span)

        return soup

    #def print_version(self, url):
    #    return url.partition('?')[0] + '?printArticle=y'
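For what it's worth, here is a minimal sketch of the span-to-p swap with the loop variable used consistently: the snippet above binds each match to credit_tag but then refers to an undefined name span, which raises a NameError as soon as a matching span is found, and Tag is never imported. The class wrapper below is hypothetical scaffolding; only preprocess_html is the point.

Code:
from calibre.web.feeds.news import BasicNewsRecipe
from calibre.ebooks.BeautifulSoup import Tag

class SpanToParagraph(BasicNewsRecipe):
    # Hypothetical minimal recipe, just to make the sketch self-contained.
    title = 'Span to Paragraph Sketch'

    def preprocess_html(self, soup):
        # Wrap each image-credit span in a new <p> so it sits on its own line.
        for credit_tag in soup.findAll('span', attrs={'class': ['imageCredit rightFloat']}):
            p = Tag(soup, 'p')
            credit_tag.replaceWith(p)  # put the new <p> where the span was
            p.insert(0, credit_tag)    # then move the span inside it
        return soup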
Old 09-16-2010, 04:06 PM   #2730
marbs
Zealot
Posts: 122
Karma: 10
Join Date: Jul 2010
Device: nook
Quote:
Originally Posted by Starson17:
Search this thread for "ebook-convert" to see how to use -vv. If you want help on your recipe, post it inside code tags to preserve indents, which are required. (Thanks for using spoiler tags, but they aren't enough to preserve indents. Just edit your post, add code tags inside the spoilers and repaste your indented recipe.)
Thanks for the quick reply. So I fixed the code (it is indented now). I was able to run the test and found the output folder. What do I look at now?
Thanks again.

Last edited by marbs; 09-16-2010 at 04:35 PM.