Old 02-04-2010, 08:13 AM   #1351
kiklop74
Guru
Posts: 800
Karma: 194644
Join Date: Dec 2007
Location: Argentina
Device: Kindle Voyage
New recipe for Digital Spy UK:
Attached Files
File Type: zip digitalspy.zip (1.8 KB, 169 views)
Old 02-04-2010, 08:21 AM   #1352
Denny_
Member
Posts: 12
Karma: 42
Join Date: Jan 2010
Device: Kindle
keep_only_tags = [dict(attrs={'class':['print-title','print-subtitle','print-author','print-date-issue','print-content']})]

I put this in the recipe and it worked very nicely. However, the author and date are not coming through. Do I need to add something else?

Denny
Old 02-04-2010, 08:26 AM   #1353
kiklop74
Guru
OK try this one:

Code:
keep_only_tags = [dict(attrs={'class':['print-title','print-subtitle','print-author','author','print-date','print-date-issue','print-content']})]
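As a rough illustration of why adding 'author' and 'print-date' to the list helps: an element is only kept if its class is one of the listed names. The sketch below uses only the Python standard library and is not calibre's own implementation (calibre does the matching through its bundled BeautifulSoup). The helper names `ClassCollector` and `kept_classes` are made up for this example, and comparing the whole class attribute string is an assumption.

```python
from html.parser import HTMLParser

# Class names taken from the recipe line above.
KEEP_CLASSES = ['print-title', 'print-subtitle', 'print-author', 'author',
                'print-date', 'print-date-issue', 'print-content']


class ClassCollector(HTMLParser):
    """Record the class attribute of every start tag (illustration only)."""

    def __init__(self):
        super().__init__()
        self.seen = []

    def handle_starttag(self, tag, attrs):
        value = dict(attrs).get('class')
        if value:
            self.seen.append(value)


def kept_classes(html, keep=KEEP_CLASSES):
    """Return the class values found in html that appear in the keep list."""
    parser = ClassCollector()
    parser.feed(html)
    return [c for c in parser.seen if c in keep]


sample = ('<div class="print-title">Headline</div>'
          '<div class="author">A. Writer</div>'
          '<div class="sidebar">Ads</div>')
print(kept_classes(sample))
```

With the shorter list from the earlier post, the "author" element in this sample would have been dropped, which matches the symptom of the author and date not coming through.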
Old 02-04-2010, 08:52 AM   #1354
Denny_
Member
Brilliant. That worked. Thank you.

BTW, what's the best method to capture the cover image when the URL changes each time? In this case the URL includes the volume number, issue number, and the date.

Denny
Old 02-04-2010, 09:07 AM   #1355
DoctorOhh
US Navy, Retired
Posts: 9,897
Karma: 13806776
Join Date: Feb 2009
Location: North Carolina
Device: Icarus Illumina XL HD, Kindle PaperWhite SE 11th Gen
Quote:
Originally Posted by Denny_
Brilliant. That worked. Thank you.

BTW, what's the best method to capture the cover image when the URL changes each time? In this case the URL includes the volume number, issue number, and the date.

Denny

This isn't a cover, but I think this will give you a nice masthead for your Kindle.

Code:
    masthead_url = 'http://www.weeklystandard.com/sites/all/themes/weeklystandard/images/logo_red.png'
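For the changing cover URL itself, one possible approach is to look the image up on the current issue page rather than hard-coding it. The sketch below is only an assumption about how that could be done: `find_cover_url` is a hypothetical helper, the sample paths are invented, and the idea that the cover's path contains the word "cover" is a guess, not something confirmed for this site. In a recipe, something like this could be called from an overridden `get_cover_url`, which calibre's `BasicNewsRecipe` provides for this purpose.

```python
import re


def find_cover_url(html, marker='cover'):
    """Return the src of the first <img> whose path contains marker, else None."""
    # Collect every img src in document order, then pick the first match.
    for src in re.findall(r'<img[^>]+src=["\']([^"\']+)["\']', html):
        if marker in src:
            return src
    return None


# Invented sample markup: a logo followed by a dated cover image.
sample = ('<img src="/images/logo.png">'
          '<img src="/images/Vol15/No20/cover_2010-02-08.jpg">')
print(find_cover_url(sample))
```

Because the function searches the page each time, the volume number, issue number, and date in the path never need to be known in advance.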
Old 02-04-2010, 10:10 AM   #1356
Denny_
Member
I had included "print-logo" in the recipe, which shows the logo at the beginning of each article, but that's a nice way to include it only at the beginning on the Kindle.

Thanks,

Denny
Old 02-04-2010, 10:22 AM   #1357
DoctorOhh
US Navy, Retired
Quote:
Originally Posted by Denny_
I had included "print-logo" in the recipe, which shows the logo at the beginning of each article, but that's a nice way to include it only at the beginning on the Kindle.

Thanks,

Denny

When you zip it up to send to this forum, include the icon in the zip. I've attached it for you.
Attached Files
File Type: zip weeklystandardicon.zip (1.6 KB, 158 views)
Old 02-04-2010, 01:21 PM   #1358
gianfri
Connoisseur
Posts: 59
Karma: 4212
Join Date: Feb 2010
Device: Sony
Topeka Capital Journal recipe

Hello,

I am totally new to the ebook world and am trying to learn. I would like to have a recipe for the Topeka Capital Journal (http://cjonline.com/). I tried the "easy" way, but all I get is garbage. Thank you for any help you can provide!

Gianfranco
Old 02-04-2010, 01:46 PM   #1359
kiklop74
Guru
New recipe for Topeka Journal:
Attached Files
File Type: zip cjonline.zip (1.7 KB, 176 views)
Old 02-04-2010, 02:37 PM   #1360
Denny_
Member
Walt,

1. Why include the icon?
2. I'm having trouble copying my recipe from calibre to Notepad. The indents change and the recipe won't work when it's copied back to calibre.

Denny
Old 02-04-2010, 04:25 PM   #1361
nickredding
onlinenewsreader.net
Posts: 327
Karma: 10143
Join Date: Dec 2009
Location: Phoenix, AZ & Victoria, BC
Device: Kindle 3, Kindle Fire, IPad3, iPhone4, Playbook, HTC Inspire
The Register (biting the hand that feeds IT)

Recipe for The Register -- a UK Information Technology news site.

Code:
#!/usr/bin/env  python
__license__   = 'GPL v3'
__copyright__ = '2010, Nick Redding'
'''
www.theregister.co.uk
'''
import re
from calibre.web.feeds.recipes import BasicNewsRecipe
from datetime import timedelta, date


class TheRegister(BasicNewsRecipe):
    title = u'The Register'
    language = 'en_GB'
    __author__ = 'Nick Redding'
    oldest_article = 2
    timefmt = '' # '[%b %d]'
    needs_subscription = False
    keep_only_tags = [dict(name='div', attrs={'id':'article'})]
    #remove_tags_before = []
    remove_tags = [
        {'id':['related-stories','ad-mpu1-spot'] },
        {'class':['orig-url','article-nav','wptl btm','wptl top']}
    ]
    #remove_tags_after = []

    no_stylesheets = True
    extra_css = '''
                h2 {font-size: x-large; }
                h3 {font-size: large; font-weight: bold; }
                .byline {font-size: x-small; }
                .dateline {font-size: x-small; }
                '''
    def get_browser(self):
        br = BasicNewsRecipe.get_browser(self)
        return br

    def get_masthead_url(self):
        masthead = 'http://www.theregister.co.uk/Design/graphics/std/logo_414_80.png'
        br = BasicNewsRecipe.get_browser(self)
        try:
            # fall back to no masthead if the logo cannot be fetched
            br.open(masthead)
        except Exception:
            self.log("\nMasthead unavailable")
            masthead = None
        return masthead

    def preprocess_html(self,soup):
        # remove the explicit URL printed in parentheses after each link
        for span_tag in soup.findAll('span','URL'):
            span_tag.previous.replaceWith(re.sub(r" \($","",self.tag_to_string(span_tag.previous)))
            span_tag.next.next.replaceWith(re.sub(r"^\)","",self.tag_to_string(span_tag.next.next)))
            span_tag.extract()
        return soup
                                   

    def parse_index(self):

        def decode_date(datestr):
            udate = datestr.strip().lower().split()
            m = ['jan','feb','mar','apr','may','jun','jul','aug','sep','oct','nov','dec'].index(udate[1])+1
            d = int(udate[0])
            y = date.today().year
            return date(y,m,d)


        articles = {}
        key = None
        ans = []

        def parse_index_page(page_name,page_title):

            def article_title(tag):
                atag = tag.find('a',href=True)
                return ''.join(atag.findAll(text=True, recursive=False)).strip()

            def article_date(tag):
                t = tag.find(True, {'class' : 'date'})
                if t:
                    return ''.join(t.findAll(text=True, recursive=False)).strip()
                return ''

            def article_summary(tag):
                t = tag.find(True, {'class' : 'standfirst'})
                if t:
                    return ''.join(t.findAll(text=True, recursive=False)).strip()
                return ''

            def article_url(tag):
                atag = tag.find('a',href=True)
                url = atag['href']
                return url

            mainurl = 'http://www.theregister.co.uk'
            soup = self.index_to_soup(mainurl+page_name)
            # Find each div whose class starts with "story-ref"
            for div in soup.findAll('div',attrs={'class':re.compile('^story-ref')}):
                # div contains all article data

                # check if article is too old
                datetag = div.find('span','date')
                if datetag:
                    dateline_string = self.tag_to_string(datetag,False)
                    a_date = decode_date(dateline_string)
                    earliest_date = date.today() - timedelta(days=self.oldest_article)
                    if a_date < earliest_date:
                        self.log("Skipping article dated %s" % dateline_string)
                        continue


                url = article_url(div)
                if 'http' in url:
                    continue
                url = mainurl + url + 'print.html'
                self.log("URL %s" % url)
                title = article_title(div)
                self.log("Title %s" % title)
                pubdate = article_date(div)
                self.log("Date %s" % pubdate)
                description = article_summary(div)
                self.log("Description %s" % description)
                author = ''
                if page_title not in articles:
                    articles[page_title] = []
                articles[page_title].append(dict(title=title,url=url,date=pubdate,description=description,author=author,content=''))


        # (URL path, section title) pairs, in the order they should appear
        sections = [
            ('', 'Front Page'),
            ('/hardware', 'Hardware'),
            ('/software', 'Software'),
            ('/music_media', 'Music & Media'),
            ('/networks', 'Networks'),
            ('/security', 'Security'),
            ('/public_sector', 'Public Sector'),
            ('/business', 'Business'),
            ('/science', 'Science'),
            ('/odds', 'Odds & Sods'),
        ]
        for page_name, page_title in sections:
            parse_index_page(page_name, page_title)
            ans.append(page_title)
        # keep only the sections that actually produced articles
        ans = [(key, articles[key]) for key in ans if key in articles]
        return ans
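
One thing to watch in decode_date above: it always assumes the current year, so in early January a dateline such as "30 Dec" would be read as a date in the future and never count as too old. The following is a minimal standalone sketch of one way to handle that, not a change the recipe author made. The `today` parameter and the truncation of the month name to three letters are assumptions added for illustration and testing.

```python
from datetime import date

MONTHS = ['jan', 'feb', 'mar', 'apr', 'may', 'jun',
          'jul', 'aug', 'sep', 'oct', 'nov', 'dec']


def decode_date(datestr, today=None):
    """Parse 'day month' text as the most recent such date not after today."""
    today = today or date.today()
    day_text, month_text = datestr.strip().lower().split()[:2]
    # Assumption: accept both short and full month names by truncating.
    month = MONTHS.index(month_text[:3]) + 1
    # Try the current year first, then fall back to the previous year.
    for year in (today.year, today.year - 1):
        try:
            parsed = date(year, month, int(day_text))
        except ValueError:
            continue  # e.g. 29 Feb in a non-leap year
        if parsed <= today:
            return parsed
    raise ValueError('cannot place %r on or before %s' % (datestr, today))


print(decode_date('30 Dec', today=date(2010, 1, 2)))
```

Passing `today` explicitly keeps the function easy to check; inside the recipe it could simply be left at its default.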
Old 02-04-2010, 04:26 PM   #1362
gianfri
Connoisseur
Wow! Thanks!

Quote:
Originally Posted by kiklop74
New recipe for Topeka Journal:

Just amazing, for a newbie like me. Thanks!!
Old 02-04-2010, 07:20 PM   #1363
DoctorOhh
US Navy, Retired
Quote:
Originally Posted by Denny_
Walt,

1. Why include the icon?
2. I'm having trouble copying my recipe from calibre to Notepad. The indents change and the recipe won't work when it's copied back to calibre.

Denny

I use Notepad++; it's free and will keep the spaces, although sometimes it puts in a tab instead of spaces.

You can just paste the code in a post and wrap it in code tags (the #).
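
If an editor has mixed tabs into the indentation, a small script can put the spaces back before the recipe is pasted into calibre again. This is only a sketch: `normalize_indent` is a made-up helper, and the tab width of 4 is an assumption that should match whatever the recipe was originally written with.

```python
def normalize_indent(source, tabsize=4):
    """Expand tabs in leading whitespace to spaces; leave line bodies alone."""
    fixed = []
    for line in source.splitlines():
        stripped = line.lstrip(' \t')
        indent = line[:len(line) - len(stripped)]
        # Only the indentation is expanded, so tabs inside strings survive.
        fixed.append(indent.expandtabs(tabsize) + stripped)
    return '\n'.join(fixed)


broken = "class Demo:\n\ttitle = u'Demo'\n  \toldest_article = 2"
print(normalize_indent(broken))
```

Both indented lines in the sample end up with four leading spaces, so Python sees them as belonging to the same block.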
Old 02-04-2010, 07:22 PM   #1364
JIGACE
Member
Posts: 21
Karma: 10
Join Date: Jul 2008
Device: EZ Reader Pocket Pro
Thanks

Quote:
Originally Posted by kiklop74
New recipe for Read It Later website:

Thanks for the recipe; I was looking for one for this site. I tried to do it myself, but I don't know anything about programming. Just two questions: how do I change the default image, and is there a way to show the pictures of the snips saved on Read It Later? (The recipe retrieves only text.) Thank you.

Old 02-04-2010, 09:18 PM   #1365
srvean
Junior Member
Posts: 3
Karma: 10
Join Date: Jan 2010
Device: none
Quote:
Originally Posted by kiklop74
You can accomplish that task by using instapaper.com. Calibre has a recipe for that site. Go to the website, register and start adding articles you want to read. Once you are ready, download them using calibre's Instapaper recipe. No coding involved at all.

Thanks for the tip; it works 70% of the time. The problem is with RSS feeds. Occasionally I want to use an RSS feed from a blog or a discussion board, and my fetch may not repeat more than once. The Instapaper solution will not work for RSS feeds, as I cannot ask calibre to do a recursive get from the Instapaper recipe.