02-09-2011, 04:11 PM   #1
bleavett
Recipe for The World Today (Chatham House)

Chatham House is one of the better-known foreign policy think tanks in London. Its monthly publication is called The World Today.

I didn't have to do much to get the recipe working, but I had real problems trying to format the text so that it reads better (e.g. splitting it into paragraphs). Perhaps someone could help? My attempt is still in the attached recipe, but commented out.

Code:
__license__   = 'GPL v3'

from calibre.web.feeds.news import BasicNewsRecipe

class ChathamHouseTheWorldToday(BasicNewsRecipe):
    title          = u'Chatham House: The World Today'
    oldest_article = 40
    max_articles_per_feed = 100

    publisher = u'Chatham House'
    __author__ = u'Ben Leavett'
    comments = u'Calibre recipe by Ben Leavett'

    feeds          = [(u'The World Today', u'http://www.chathamhouse.org.uk/rss/16/')]

    # full content is in the RSS feed
    use_embedded_content = True

    page_with_cover_img = u'http://www.chathamhouse.org.uk/publications/twt/'

    def preprocess_html(self, soup):
        # Insert some line breaks into the HTML.
        #
        # BJL: this is intended to add in some line breaks where it finds
        # '\n' characters. It successfully builds 'newspan', but the final
        # call to 'replaceWith' only results in clearing the contents of
        # 'it'; it doesn't then do the insert part of the replace.
        #
        # for it in soup.findAll('span'):
        #     # If we find at least one '\n' character in this span
        #     if it.string.find('\n') > -1:
        #         lines = it.string.split('\n')
        #         newspan = Tag(soup, 'span')
        #
        #         i = 0
        #         for line in lines:
        #             p = Tag(soup, 'p')
        #             p.insert(0, NavigableString(line))
        #             newspan.insert(i, p)
        #             i += 1
        #
        #         it.replaceWith(newspan)

        return soup

    def postprocess_html(self, soup, first_fetch):
        return soup

    def get_cover_url(self):
        soup = self.index_to_soup(self.page_with_cover_img)
        node = soup.find('div', {'id' : 'contentInner_subpage'}).h2.img

        self.log('Found cover URL: ' + node['src'])
        return node['src']

    def get_masthead_url(self):
        return u'http://www.chathamhouse.org.uk/images/main_logo.gif'
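
For anyone who wants to experiment with the paragraph splitting outside the recipe, here is a minimal, stdlib-only sketch of just the splitting step. It works on a plain string rather than on the soup, and `split_into_paragraphs` is a hypothetical helper name, not part of calibre or BeautifulSoup. How best to wire it into the recipe is left open. One thing that may be worth checking in the commented-out attempt: in BeautifulSoup, `it.string` is None for a span with more than one child node, so calling `.find` on it would fail for those spans.

```python
from html import escape


def split_into_paragraphs(text):
    """Wrap each non-empty line of `text` in its own <p> element.

    Hypothetical helper for illustration only: it takes the text of one
    span as a plain string and returns an HTML fragment as a string.
    """
    # Split on newlines and trim surrounding whitespace from each line.
    lines = [line.strip() for line in text.split('\n')]
    # Skip blank lines; escape the rest so '&', '<' and '>' stay valid HTML.
    return ''.join('<p>%s</p>' % escape(line) for line in lines if line)


print(split_into_paragraphs('First paragraph.\nSecond paragraph.'))
# prints: <p>First paragraph.</p><p>Second paragraph.</p>
```

Working on strings keeps the logic easy to test by hand before trying the same idea with Tag objects.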
Attached Files
File Type: zip world_today.zip (956 Bytes, 143 views)

Last edited by bleavett; 02-10-2011 at 05:01 PM.