Old 09-06-2010, 09:01 AM   #2656
Starson17
Wizard
Starson17 can program the VCR without an owner's manual.
 
Posts: 4,004
Karma: 177841
Join Date: Dec 2009
Device: WinMo: IPAQ; Android: HTC HD2, Archos 7o; Java:Gravity T
Quote:
Originally Posted by poloman View Post
TonytheBookworm - sorry for sounding stupid, but in the code for TheDailyMash where you have print statements - where does it print to?
If you run ebook-convert recipename.recipe foldername --test -vv > recipename.txt, the print output ends up in the .txt file. Alternatively, start the GUI with calibre-debug -g and the output will appear there.
Old 09-06-2010, 10:55 AM   #2657
TonytheBookworm
Addict
TonytheBookworm is on a distinguished road
 
Posts: 264
Karma: 62
Join Date: May 2010
Device: kindle 2, kindle 3, Kindle fire
Quote:
Originally Posted by poloman View Post
TonytheBookworm - sorry for sounding stupid, but in the code for TheDailyMash where you have print statements - where does it print to? I can't see an output anywhere when Calibre is running - is it piped to a file, or does it flash by in the current job status screen?


ps - i used this to get rid of the links at the end of the articles - can probably bin the 'object' and other tags, but it works and I'm (slowly) learning!

remove_tags = [
dict(name=['object','link','script','span','iframe','hr'])
,dict(name='a', attrs={'alt':['Digg!','StumbleUpon!','Reddit!','Facebook!']})
,dict(name='a', attrs={'title':['Digg!','StumbleUpon!','Reddit!','Facebook!']})
]
I'm just now learning this stuff myself too. As for the print statements, that is a neat little trick Starson17 introduced me to. When building your recipes, you use the following command line:
c:\Program Files\Calibre2>ebook-convert recipenamehere.recipe output_dir --test -vv > myrecipe.txt

So when you run that and your recipe has, for instance,
print 'HERE IS WHAT WILL PRINT: ', print_url

it writes the value of print_url to the file myrecipe.txt, and you can find it easily in the log by searching for HERE IS WHAT WILL PRINT. It keeps you from having to guess.
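As a rough sketch of the same idea (written in Python 3, whereas recipes of this era used the Python 2 print statement): tag each debug line with a marker, then filter the redirected log for it. The marker text, the helper names debug and marked_lines, and the sample log are made up for illustration. Only the redirection of ebook-convert output to a text file comes from the posts above.

```python
MARKER = 'HERE IS WHAT WILL PRINT:'  # hypothetical marker, pick anything unique


def debug(value):
    """Print a value prefixed with the marker so it is easy to find in the log."""
    line = '%s %s' % (MARKER, value)
    print(line)
    return line


def marked_lines(log_text, marker=MARKER):
    """Return only the log lines that carry the marker."""
    return [ln for ln in log_text.splitlines() if marker in ln]


# Stand-in for the contents of myrecipe.txt after a test run.
sample_log = '\n'.join([
    'Fetching feeds...',
    debug('http://example.com/article?print=1'),
    'Downloaded article 1',
])

found = marked_lines(sample_log)
```

Searching the log file for the marker in a text editor does the same job. The function only shows why a unique marker makes the line easy to pick out.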
Old 09-06-2010, 12:08 PM   #2658
poloman
Enthusiast
poloman began at the beginning.
 
Posts: 25
Karma: 10
Join Date: Nov 2008
Device: PRS505, Kindle 3G
Oh, I see - I didn't realise there was a command line option - I'd been running them from the GUI! I'll give your tips a try - thanks!
Old 09-06-2010, 12:27 PM   #2659
Starson17
Wizard
Starson17 can program the VCR without an owner's manual.
 
Posts: 4,004
Karma: 177841
Join Date: Dec 2009
Device: WinMo: IPAQ; Android: HTC HD2, Archos 7o; Java:Gravity T
Quote:
Originally Posted by poloman View Post
oh I see - I didn't realise there was the command line option - I'd been running them from the GUI! I'll give your tips a try - thanks!
My favorite sites for recipe info:

Read this and this and this and this.
Old 09-06-2010, 09:34 PM   #2660
bhandarisaurabh
Enthusiast
bhandarisaurabh began at the beginning.
 
Posts: 49
Karma: 10
Join Date: Aug 2009
Device: none
A recipe for Fast Company has already been made, but can anyone make a recipe for the print edition using this link?
http://www.fastcompany.com/magazine/148
Thanks in advance.
Old 09-06-2010, 10:05 PM   #2661
TonytheBookworm
Addict
TonytheBookworm is on a distinguished road
 
Posts: 264
Karma: 62
Join Date: May 2010
Device: kindle 2, kindle 3, Kindle fire
Quote:
Originally Posted by bhandarisaurabh View Post
recipe for fast company has been already made but can anyone make the recipe for print edition using this link
http://www.fastcompany.com/magazine/148
thanks in advance
Print edition? As in subscribed? Or as in what's on the page as you see it? Or the RSS link over on the right-hand side?

If by "print edition" you mean what you currently see on the screen when you go to that link, I don't see the point in doing it, because each month or week that the issue changes you would have to change the feed reference from 148 to the new issue number. Or am I missing your question completely?
Old 09-07-2010, 03:54 AM   #2662
somedayson
Member
somedayson began at the beginning.
 
Posts: 13
Karma: 10
Join Date: Sep 2010
Device: K3
Thanks to both cynvision and TonytheBookworm for their helpful recipes and posts. My wife laughed uncontrollably when I said I'm trying to learn computer programming.

cynvision--the rss feeds are hidden. If you click on the red tab across the top, they show up in each section.

I really appreciate your help and have been able to use your work to customize some other things I'm interested in.

Grateful for the 178 pages here that I read through, and each of you helpful people,

Thanks so much!
Matt
Old 09-07-2010, 09:06 AM   #2663
poloman
Enthusiast
poloman began at the beginning.
 
Posts: 25
Karma: 10
Join Date: Nov 2008
Device: PRS505, Kindle 3G
Starson17 - those links have rapidly been bookmarked - very useful! Thanks for posting them.
Old 09-07-2010, 10:08 AM   #2664
Starson17
Wizard
Starson17 can program the VCR without an owner's manual.
 
Posts: 4,004
Karma: 177841
Join Date: Dec 2009
Device: WinMo: IPAQ; Android: HTC HD2, Archos 7o; Java:Gravity T
Quote:
Originally Posted by poloman View Post
Starson17 - those links have rapidly been bookmarked - very useful! Thanks for posting them.
You're welcome. I go back to them repeatedly, especially the recipe API and Beautiful Soup. Whenever I want to learn how to actually use something I see in the API or BS, I search for it in the built-in recipes in resources/recipes.
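One way to do that kind of search without an editor is a few lines of Python. This is only a sketch: recipes_mentioning is a made-up helper, and it assumes the built-in recipes sit as plain .recipe text files in one folder such as resources/recipes, which may not hold for every calibre version. The demo runs against a throwaway folder rather than a real install.

```python
import os
import tempfile


def recipes_mentioning(recipes_dir, needle):
    """Return sorted names of .recipe files whose text contains needle."""
    hits = []
    for name in sorted(os.listdir(recipes_dir)):
        if not name.endswith('.recipe'):
            continue
        path = os.path.join(recipes_dir, name)
        with open(path, encoding='utf-8', errors='replace') as f:
            if needle in f.read():
                hits.append(name)
    return hits


# Demo on a temporary folder standing in for resources/recipes.
demo_dir = tempfile.mkdtemp()
samples = {
    'alpha.recipe': 'def parse_index(self):\n    return []\n',
    'beta.recipe': "remove_tags = [dict(name='script')]\n",
    'notes.txt': 'parse_index is mentioned here, but this is not a recipe\n',
}
for name, text in samples.items():
    with open(os.path.join(demo_dir, name), 'w', encoding='utf-8') as f:
        f.write(text)

demo_hits = recipes_mentioning(demo_dir, 'parse_index')
```

Pointing recipes_dir at the real recipes folder and searching for a method name such as preprocess_html would list the recipes worth opening for a worked example.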
Old 09-07-2010, 10:28 AM   #2665
JvdW
Zealot
JvdW doesn't litter
 
Posts: 115
Karma: 150
Join Date: Jul 2008
Location: Netherlands Veenendaal
Device: Palm T5, Sony PRS-505, Nook Color
Quote:
Originally Posted by kovidgoyal View Post
@JvdW: nrcnext uses the parse index function to get a list of articles and the website has changed, so it fails. Unfortunately, as I don't read Dutch, it's hard for me to fix.
Thanks for the parse index function pointer. After reading up a bit and pasting part of the recipe into the editor of PortablePython I managed to find the problematic line:
Code:
for post in soup.findAll(True, attrs={'class' : 'post'}) :
should be:
Code:
for post in soup.findAll(True, attrs={'class' : 'post '}) :
Note the space after post!
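A small sketch of why the trailing space matters, assuming (as the fix suggests) that the class attribute is compared as one whole string. The helper names are made up for illustration. Beautiful Soup's documentation describes passing a compiled regular expression as an attribute value, which would be the tolerant alternative, but check that against the version bundled with your calibre before relying on it.

```python
import re


def exact_match(class_attr, wanted):
    """Whole-string comparison: 'post' and 'post ' are different values."""
    return class_attr == wanted


def token_match(class_attr, wanted):
    """Compare against the whitespace-separated class names instead."""
    return wanted in class_attr.split()


# Regex equivalent of token_match for the single class name 'post'.
POST_CLASS = re.compile(r'(?:^|\s)post(?:\s|$)')

site_value = 'post '  # the attribute value the changed site serves, per the fix above

exact_old = exact_match(site_value, 'post')    # the old recipe line
exact_new = exact_match(site_value, 'post ')   # the corrected recipe line
tolerant = token_match(site_value, 'post')
regex_ok = bool(POST_CLASS.search(site_value))
no_false_hit = POST_CLASS.search('postscript') is None
```

The exact comparison breaks again as soon as the site drops or adds whitespace, while the token and regex forms keep matching. The regex deliberately does not match longer names such as postscript.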

This fixes the recipe, but one problem remains now that it generates an EPUB: how to get rid of the comments section. I can't remember having those a couple of weeks ago, so that part of the site must have changed as well, but adding things like:
remove_tags.append(dict(name = 'h3', attrs = {'class' : 'reacties'}))
doesn't seem to help much. After looking a bit further and experimenting, I found the ideal mix. The complete recipe is as follows:
Spoiler:

Code:
from calibre.web.feeds.news import BasicNewsRecipe
from calibre.ebooks.BeautifulSoup import BeautifulSoup, Tag

class NrcNextRecipe(BasicNewsRecipe):
    __license__  = 'GPL v3'
    __author__ = 'kwetal'
    language = 'nl'
    country = 'NL'
    version = 2

    title = u'nrcnext'
    publisher = u'NRC Media'
    category = u'News, Opinion, the Netherlands'
    description = u'Dutch newsblog from the Dutch daily newspaper nrcnext.'

    conversion_options = {'comments': description, 'language': language, 'publisher': publisher}

    no_stylesheets = True
    remove_javascript = True

    keep_only_tags = [dict(name='div', attrs={'id' : 'main'})]

    remove_tags = []
    remove_tags.append(dict(name = 'div', attrs = {'class' : 'meta'}))
    remove_tags.append(dict(name = 'p', attrs = {'class' : 'meta'}))
    remove_tags.append(dict(name = 'div', attrs = {'class' : 'datumlabel'}))
    remove_tags.append(dict(name = 'div', attrs = {'class' : 'sharing-is-caring'}))
    remove_tags.append(dict(name = 'div', attrs = {'class' : 'navigation'}))
    remove_tags.append(dict(name = 'div', attrs = {'class' : 'reageer'}))
    remove_tags.append(dict(name = 'div', attrs = {'class' : 'comment odd alt thread-odd thread-alt depth-1 reactie '}))
    remove_tags.append(dict(name = 'div', attrs = {'class' : 'comment even thread-even depth-1 reactie '}))
    remove_tags.append(dict(name = 'ul', attrs = {'class' : 'cats single'}))
    remove_tags.append(dict(name = 'ul', attrs = {'class' : 'cats onderwerpen'}))
    remove_tags.append(dict(name = 'ul', attrs = {'class' : 'cats rubrieken'}))
    remove_tags.append(dict(name = 'h3', attrs = {'class' : 'reacties'}))

	

    extra_css = '''
                body {font-family: verdana, arial, helvetica, geneva, sans-serif; text-align: left;}
                p.wp-caption-text {font-size: x-small; color: #666666;}
                h2.sub_title {font-size: medium; color: #696969;}
                h2.vlag {font-size: small; font-weight: bold;}
                '''

    def parse_index(self) :
        # Use the website as an index. Their RSS feeds can be out of date.
        feeds = {}
        feeds[u'columnisten'] = u'http://www.nrcnext.nl/columnisten/'
        feeds[u'koken'] = u'http://www.nrcnext.nl/koken/'
        feeds[u'geld & werk'] = u'http://www.nrcnext.nl/geld-en-werk/'
        feeds[u'vandaag'] = u'http://www.nrcnext.nl'
        # feeds[u'city life in afrika']  = u'http://www.nrcnext.nl/city-life-in-afrika/'
        answer = []
        articles = {}
        indices = []

        for index, feed in feeds.items() :
            soup = self.index_to_soup(feed)
            for post in soup.findAll(True, attrs={'class' : 'post '}) :
                # Find the links to the actual articles and remember the location they're pointing to and the title
                a = post.find('a', attrs={'rel' : 'bookmark'})
                href = a['href']
                title = self.tag_to_string(a)
                if index == 'columnisten' :
                    # In this feed/page articles can be written by more than one author.
                    # It is nice to see their names in the titles.
                    flag = post.find('h2', attrs = {'class' : 'vlag'})
                    author = flag.contents[0].renderContents()
                    completeTitle = u''.join([author, u': ', title])
                else :
                    completeTitle = title

                # Add the article to a temporary list
                article = {'title' : completeTitle, 'date' : u'', 'url'  : href, 'description' : '<p>&nbsp;</p>'}
                if index not in articles:
                    articles[index] = []
                articles[index].append(article)

            # Add the index title to a temporary list
            indices.append(index)

        # Now, sort the temporary list of feeds in the order they appear on the website
        # indices = self.sort_index_by(indices, {u'columnisten' : 1, u'koken' : 3, u'geld & werk' : 2, u'vandaag' : 0, u'city life in afrika' : 4})
        indices = self.sort_index_by(indices, {u'columnisten' : 1, u'koken' : 3, u'geld & werk' : 2, u'vandaag' : 0})
        # Apply this sort order to the actual list of feeds and articles
        answer = [(key, articles[key]) for key in indices if key in articles]

        return answer

    def preprocess_html(self, soup) :
        if soup.find('div', attrs = {'id' : 'main', 'class' : 'single'}):
            tag = soup.find('div', attrs = {'class' : 'post'})
            if tag:
                h2 = tag.find('h2', 'vlag')
                if h2:
                    new_h2 = Tag(soup, 'h2', attrs = [('class', 'vlag')])
                    new_h2.append(self.tag_to_string(h2))
                    h2.replaceWith(new_h2)
                else:
                    h2 = tag.find('h2')
                    if h2:
                        new_h2 = Tag(soup, 'h2', attrs = [('class', 'sub_title')])
                        new_h2.append(self.tag_to_string(h2))
                        h2.replaceWith(new_h2)

                h1 = tag.find('h1')
                if h1:
                    new_h1 = Tag(soup, 'h1')
                    new_h1.append(self.tag_to_string(h1))
                    h1.replaceWith(new_h1)

                # Slows down my reader.
                for movie in tag.findAll('span', attrs = {'class' : 'vvqbox vvqvimeo'}):
                    movie.extract()
                for movie in tag.findAll('span', attrs = {'class' : 'vvqbox vvqyoutube'}):
                    movie.extract()
                for iframe in tag.findAll('iframe') :
                    iframe.extract()

                fresh_soup = self.getFreshSoup(soup)
                fresh_soup.body.append(tag)

                return fresh_soup
            else:
                # This should never happen and other famous last words...
                return soup
        # Not a single-article page: hand the soup back unchanged instead of returning None.
        return soup

    def getFreshSoup(self, oldSoup):
        freshSoup = BeautifulSoup('<html><head><title></title></head><body></body></html>')
        if oldSoup.head.title:
            freshSoup.head.title.append(self.tag_to_string(oldSoup.head.title))
        return freshSoup
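For readers new to parse_index, here is a stand-alone sketch of the shape the recipe above returns: a list of (section title, list of article dicts) pairs, ordered by a weight table. The helper order_sections is hypothetical and only stands in for the recipe's self.sort_index_by call; it is not calibre's implementation, and the sample articles and URLs are placeholders.

```python
def order_sections(articles_by_section, weights):
    """Build [(section title, [article dicts])], lowest weight first.

    Sections missing from weights get weight 0 in this sketch, and
    sections without articles are dropped.
    """
    titles = sorted(articles_by_section,
                    key=lambda t: (weights.get(t, 0), t))
    return [(t, articles_by_section[t]) for t in titles
            if articles_by_section[t]]


def make_article(title, url):
    """Article dict with the same keys the recipe above fills in."""
    return {'title': title, 'date': '', 'url': url, 'description': ''}


# Made-up sample data.
articles = {}
articles.setdefault('koken', []).append(
    make_article('Soep', 'http://example.com/1'))
articles.setdefault('vandaag', []).append(
    make_article('Nieuws', 'http://example.com/2'))
articles.setdefault('columnisten', []).append(
    make_article('Jansen: Column', 'http://example.com/3'))

weights = {'vandaag': 0, 'columnisten': 1, 'geld & werk': 2, 'koken': 3}
answer = order_sections(articles, weights)
section_order = [title for title, _ in answer]
```

Using setdefault collects articles per section without a separate membership test, and the weight table keeps the sections in the order they appear on the website even though a plain dict of feeds does not guarantee one.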


Kovid, will you please update the builtin recipe with this updated code?

Thanks in advance,

Joop
Old 09-07-2010, 10:50 AM   #2666
TonytheBookworm
Addict
TonytheBookworm is on a distinguished road
 
Posts: 264
Karma: 62
Join Date: May 2010
Device: kindle 2, kindle 3, Kindle fire
Quote:
Originally Posted by Starson17 View Post
You're welcome. I go back to them repeatedly, especially the recipe API and Beautiful Soup. Whenever I want to learn how to actually use something I see in the API or BS, I search for it in the built-in recipes in resources/recipes.
Yeah, I have been doing what you describe, using UltraEdit. I read the API or Beautiful Soup docs, then use the find feature (I love that thing, by the way) to see how you, Kovid, and others did it. Then I scratch my head in some cases and try to come up with an educated question as a result. Amazing stuff, I've got to say.

And I just discovered the content server in calibre. For a while I thought it only listed what I had, as in: go to that page and see the books I have. I thought to myself, okay, I guess someone wants to see what others are reading. But lo and behold, you can actually download the book, blog, or whatever directly from your PC. Now that is killer. I just schedule the news to download, then use the Kindle wherever I am and pull it from my calibre at home.
Old 09-07-2010, 11:20 AM   #2667
kovidgoyal
creator of calibre
kovidgoyal ought to be getting tired of karma fortunes by now.
 
Posts: 45,210
Karma: 27110894
Join Date: Oct 2006
Location: Mumbai, India
Device: Various
@JvdW: Done
Old 09-07-2010, 11:40 AM   #2668
poloman
Enthusiast
poloman began at the beginning.
 
Posts: 25
Karma: 10
Join Date: Nov 2008
Device: PRS505, Kindle 3G
Tony - I hope I didn't misunderstand your post - but did you know that you can set up an email address in Calibre and have it automatically send any news items to your free Kindle address, so the Kindle will automatically download the sent news whenever you're in a wifi zone? It saves leaving the content server running too.

I have 15 feeds set up to run at 6am, then, as I'm getting ready to commute, without taking the Kindle out of my bag, I flick the power switch - it downloads everything so that when I sit down on the train, I have all the latest news at my fingertips - gotta love Calibre!

edit: doh - you have a kindle 2 - in that case, ignore me!
Old 09-07-2010, 11:54 AM   #2669
TonytheBookworm
Addict
TonytheBookworm is on a distinguished road
 
Posts: 264
Karma: 62
Join Date: May 2010
Device: kindle 2, kindle 3, Kindle fire
Quote:
Originally Posted by poloman View Post
Tony - I hope I didnt misunderstand your post - but did you know that you can set up an email address in Calibre and have it automatically send any news items to your free kindle address, so it will automatically download the sent news whenever you're in a wifi zone? Saves leaving the content server running too.

I have 15 feeds set up to run at 6am, then, as I'm getting ready to commute, without taking the Kindle out of my bag, I flick the power switch - it downloads everything so that when I sit down on the train, I have all the latest news at my fingertips - gotta love Calibre!

edit: doh - you have a kindle 2 - in that case, ignore me!
I have both. But I thought that is just what the free Kindle address does: it takes the file and emails it to the Kindle for free. So what are you doing, going to, say, me@gmail.com on your Kindle and downloading it that way? I've never used the email feature because I didn't want to be charged a data fee and wasn't sure whether you could download content from an email address. If I can, that would be fine. Fill me in on what you do exactly, since you say you do this all the time. And I assume it doesn't cost you anything because you're on wifi.
Old 09-07-2010, 12:02 PM   #2670
Starson17
Wizard
Starson17 can program the VCR without an owner's manual.
 
Posts: 4,004
Karma: 177841
Join Date: Dec 2009
Device: WinMo: IPAQ; Android: HTC HD2, Archos 7o; Java:Gravity T
Quote:
Originally Posted by TonytheBookworm View Post
Fill me in on what you do exactly since you say you do this all the time.
Everyone is so friendly here, I know it's a temptation to stray off topic, but you probably should go to another thread with this. The recipe thread is tough to wade through as it is with all the lengthy recipes. <GRIN>