Old 05-05-2010, 09:29 AM   #1891
Tumaini
Junior Member
Tumaini began at the beginning.
 
Posts: 8
Karma: 10
Join Date: May 2010
Device: Bebook One (Hanlin v3)
Arbetaren (Swedish socialist newspaper, works great!)

Code:
from calibre.web.feeds.news import BasicNewsRecipe

class Arbetaren_SE(BasicNewsRecipe):
    title          = u'Arbetaren'
    __author__            = 'Joakim Lindskog'
    description           = 'Nyheter från Arbetaren'
    publisher             = 'Arbetaren'
    category              = 'news, politics, socialism, Sweden'
    oldest_article        = 7
    delay                 = 1
    max_articles_per_feed = 100
    no_stylesheets        = True
    use_embedded_content  = False
    encoding              = 'utf-8'
    language              = 'sv'

    conversion_options = {
                          'comment'   : description
                        , 'tags'      : category
                        , 'publisher' : publisher
                        , 'language'  : language
                        }

    keep_only_tags = [dict(name='div', attrs={'id':'article'})]
    remove_tags_before = dict(name='div', attrs={'id':'article'})
    remove_tags_after = dict(name='p',attrs={'id':'byline'})
    remove_tags = [
                     dict(name=['object','link','base']),
                     dict(name='p', attrs={'class':'print'}),
                     dict(name='a', attrs={'class':'addthis_button_compact'}),
                     dict(name='script')
                  ]

    feeds          = [(u'Nyheter', u'http://www.arbetaren.se/rss/arbetaren.rss?rev=123')]
Old 05-05-2010, 10:38 AM   #1892
kiklop74
Guru
kiklop74 can program the VCR without an owner's manual.
 
Posts: 800
Karma: 194644
Join Date: Dec 2007
Location: Argentina
Device: Kindle Voyage
Quote:
Originally Posted by mobilewilier
Hi Kiklop

Would you be so kind as to start me off with a recipe for the South China Morning Post?

www.scmp.com
This site has a very complicated logon procedure. I have no time to work on that, so here is a starting point for you. You only need to resolve the logon. The rest is done.
Attached Files
File Type: zip scmp.com.zip (3.8 KB, 174 views)
Old 05-05-2010, 09:27 PM   #1893
mobilewilier
Connoisseur
mobilewilier ought to be getting tired of karma fortunes by now.
 
Posts: 53
Karma: 496648
Join Date: May 2010
Device: Sony PRS-600
Quote:
Originally Posted by kiklop74
This site has a very complicated logon procedure. I have no time to work on that, so here is a starting point for you. You only need to resolve the logon. The rest is done.
Thanks so much... it works! Many, many thanks.

WL

Last edited by mobilewilier; 05-05-2010 at 10:57 PM.
Old 05-05-2010, 11:33 PM   #1894
Krittika Goyal
Vox calibre
Krittika Goyal ought to be getting tired of karma fortunes by now.
 
Posts: 412
Karma: 1175230
Join Date: Jan 2009
Device: Sony reader prs700, kobo
I had this request on Facebook. Could someone take it on? I am a little busy right now.
The East Bay Express: http://www.eastbayexpress.com/ebx/Home

Thanks
Old 05-07-2010, 09:11 AM   #1895
smargo
Member
smargo began at the beginning.
 
Posts: 14
Karma: 10
Join Date: Aug 2007
Location: Switzerland
Device: Kindle Voyage, Kobo
Problem with my recipe for "Kommersant" Russian daily

Hi, I am trying to make a simple recipe for the best Russian-language newspaper, Kommersant.
Here is the recipe:

Code:
from calibre.web.feeds.news import BasicNewsRecipe
class AdvancedUserRecipe1272297716(BasicNewsRecipe):
    title          = u'Kommersant'
    oldest_article = 7
    max_articles_per_feed = 100

    feeds          = [(u'Kommersant', u'http://feeds.kommersant.ru/RSS_Export/RU/daily.xml')]


    



def print_version(self,url):

        segments = url.split('=')
        article_id = segments[1]
        newurl = 'http://www.kommersant.ru/doc-rss.aspx?DocsID=' + article_id + '&print=true'

        return newurl
but it fails with the error log below.

What am I doing wrong? Thanks!

Code:
ERROR: Conversion Error: <b>Failed</b>: Fetch news from Kommersant
It seems to download the articles fine:
Code:
Fetching http://www.kommersant.ru/doc-rss.aspx?DocsID=1365119
Downloaded article: Tencent разложила DST на активы // Mail.ru, "Вконтакте" и "Одноклассникам" прописали мультипликаторы from http://www.kommersant.ru
but then fails:

Code:
lxml.etree.XMLSyntaxError: Failed to parse QName 'font-size:', line 33, column 3710
Old 05-07-2010, 10:14 AM   #1896
olaf
Enthusiast
olaf is on a distinguished road
 
Posts: 43
Karma: 50
Join Date: May 2009
Device: Kindle3
When running a job to create a Kindle file from a recipe, I often look at the job details to see what progress is being made. Is there any way to save the column widths of the Job Details screen? Each time I go in, I need to expand the columns to see the detail I'm looking at. It would be nice to customize the column widths and have them stay fixed after that. (The same goes for the overall size of that panel.)
Old 05-07-2010, 12:03 PM   #1897
smargo
Member
smargo began at the beginning.
 
Posts: 14
Karma: 10
Join Date: Aug 2007
Location: Switzerland
Device: Kindle Voyage, Kobo
Kommersant

OK, now it's generally working,
Code:
from calibre.web.feeds.news import BasicNewsRecipe
class KommersantRecipe(BasicNewsRecipe):
    title          = u'Kommersant'
    oldest_article = 7
    max_articles_per_feed = 100
    feeds          = [(u'Kommersant', u'http://feeds.kommersant.ru/RSS_Export/RU/daily.xml')]

    def print_version(self, url):
        segments = url.split('=')
        article_id = segments[1]
        newurl = 'http://www.kommersant.ru/doc.aspx?DocsID=' + article_id + '&print=true'
        return newurl
but when I read it on the Kindle, pagination does not work. When I am on the first page and press "Next Page", the article skips to the last page. What can be the problem?
Thanks all!
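
For readers adapting the recipe above: `url.split('=')[1]` only works while `DocsID` is the sole query parameter. A minimal sketch of a more defensive rewrite follows, assuming Python 3 and the same print-view URL pattern used in the recipe. The helper name `kommersant_print_url` and the constant `PRINT_URL` are made up for illustration and are not part of calibre.

```python
from urllib.parse import urlparse, parse_qs

# Print-view pattern taken from the recipe above.
PRINT_URL = 'http://www.kommersant.ru/doc.aspx?DocsID={doc_id}&print=true'


def kommersant_print_url(url):
    """Return the print-view URL for a Kommersant article URL.

    Falls back to the original URL when no DocsID parameter is present.
    """
    query = parse_qs(urlparse(url).query)
    doc_ids = query.get('DocsID')
    if not doc_ids:
        return url
    return PRINT_URL.format(doc_id=doc_ids[0])
```

Inside a recipe, `print_version(self, url)` could simply return `kommersant_print_url(url)`.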
Old 05-07-2010, 12:15 PM   #1898
kovidgoyal
creator of calibre
kovidgoyal ought to be getting tired of karma fortunes by now.
 
Posts: 43,839
Karma: 22666666
Join Date: Oct 2006
Location: Mumbai, India
Device: Various
@smargo: Use

conversion_options = {'linearize_tables':True}
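
As a rough sketch of where that line goes, here is smargo's recipe from the previous post with the option added as a class attribute. The `linearize_tables` option asks the conversion to pull content out of layout tables and present it linearly. The `try`/`except` stand-in class is only an assumption so the snippet can be loaded outside calibre; inside calibre the real `BasicNewsRecipe` is used.

```python
try:
    from calibre.web.feeds.news import BasicNewsRecipe
except ImportError:
    # Stand-in so the sketch can be loaded outside calibre.
    # Illustration only; this class is not part of calibre.
    class BasicNewsRecipe(object):
        pass


class KommersantRecipe(BasicNewsRecipe):
    title = u'Kommersant'
    oldest_article = 7
    max_articles_per_feed = 100
    feeds = [(u'Kommersant', u'http://feeds.kommersant.ru/RSS_Export/RU/daily.xml')]

    # Flatten table-based page layouts during conversion.
    conversion_options = {'linearize_tables': True}

    def print_version(self, url):
        # Rewrite the feed URL to the print view of the same article.
        article_id = url.split('=')[1]
        return 'http://www.kommersant.ru/doc.aspx?DocsID=' + article_id + '&print=true'
```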
Old 05-07-2010, 12:45 PM   #1899
smargo
Member
smargo began at the beginning.
 
Posts: 14
Karma: 10
Join Date: Aug 2007
Location: Switzerland
Device: Kindle Voyage, Kobo
Kommersant

@kovidgoyal

Thanks! It works!

Some cosmetic work remains to be done, but I am happy.
Old 05-07-2010, 12:53 PM   #1900
Raoul O'Malley
Junior Member
Raoul O'Malley began at the beginning.
 
Posts: 2
Karma: 10
Join Date: Feb 2010
Device: kindle
International Herald Tribune - Euro edition

Is there a way to get a recipe for the IHT - Euro edition?

Thanks so much.
Old 05-08-2010, 04:01 AM   #1901
PaxtonReader
Member
PaxtonReader began at the beginning.
 
Posts: 15
Karma: 10
Join Date: Apr 2010
Device: Kindle 2 Global
When I send a book back to my Kindle, is there a way to keep the original file name, rather than a shortened version with the author's name embedded?
Old 05-08-2010, 12:49 PM   #1902
Starson17
Wizard
Starson17 can program the VCR without an owner's manual.
 
Posts: 4,004
Karma: 177841
Join Date: Dec 2009
Device: WinMo: IPAQ; Android: HTC HD2, Archos 7o; Java:Gravity T
Discover Magazine recipe

Multipage implemented for multiple page articles,
New feeds,
Miscellaneous advertising and junk removed.
Code:
#!/usr/bin/env  python
__license__   = 'GPL v3'
__copyright__ = '2008, Kovid Goyal kovid@kovidgoyal.net'
__docformat__ = 'restructuredtext en'

'''
discovermagazine.com
'''

import re
from calibre.web.feeds.news import BasicNewsRecipe
from calibre.ebooks.BeautifulSoup import BeautifulSoup, Tag

class DiscoverMagazine(BasicNewsRecipe):

    title = u'Discover Magazine'
    description = u'Science, Technology and the Future' 
    __author__ = 'Starson17' 
    language = 'en'

    oldest_article = 33
    max_articles_per_feed = 20
    no_stylesheets = True
    remove_javascript = True
    use_embedded_content  = False
    encoding = 'utf-8'
    extra_css = '.headline {font-size: x-large;} \n .fact {padding-top: 10pt}'
    
    remove_tags = [
                   dict(name='div', attrs={'id':['searchModule', 'mainMenu', 'tool-box']}),
                   dict(name='div', attrs={'id':['footer','teaser','already-subscriber','teaser-suite','related-articles']}),
                   dict(name='div', attrs={'class':['column']}),
                   dict(name='img', attrs={'src':'http://discovermagazine.com/onebyone.gif'})]

    remove_tags_after = [dict(name='div', attrs={'class':'listingBar'})]
   
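    def append_page(self, soup, appendtag, position):
        # Follow the "next" pager link and splice each later page's body
        # into the first page, recursing until no further page is found.
        pager = soup.find('span',attrs={'class':'next'})
        if pager and pager.a:
            nexturl = pager.a['href']
            soup2 = self.index_to_soup(nexturl)
            texttag = soup2.find('div', attrs={'class':'articlebody'})
            if texttag is None:
                # The next page has no article body; nothing to append.
                return
            newpos = len(texttag.contents)
            self.append_page(soup2,texttag,newpos)
            texttag.extract()
            appendtag.insert(position,texttag)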
    
    def preprocess_html(self, soup):
        mtag = '<meta http-equiv="Content-Language" content="en-US"/>\n<meta http-equiv="Content-Type" content="text/html; charset=utf-8"/>'
        soup.head.insert(0,mtag)    
        self.append_page(soup, soup.body, 3)
        pager = soup.find('div',attrs={'class':'listingBar'})
        if pager:
           pager.extract()        
        return soup
        
    def postprocess_html(self, soup, first_fetch):
        for tag in soup.findAll(text=re.compile('^This article is a sample')):
            tag.parent.extract()
        for tag in soup.findAll(['table', 'tr', 'td']):
            tag.name = 'div'
        for tag in soup.findAll('div', attrs={'class':'discreet advert'}):
            tag.extract()
        for tag in soup.findAll('hr', attrs={'size':'1'}):
            tag.extract()
        for tag in soup.findAll('br'):
            tag.extract()
        return soup        
 
    feeds = [
             (u'Technology', u'http://discovermagazine.com/topics/technology/rss.xml'), 
             (u'Health - Medicine', u'http://discovermagazine.com/topics/health-medicine/rss.xml'), 
             (u'Mind Brain', u'http://discovermagazine.com/topics/mind-brain/rss.xml'), 
             (u'Space', u'http://discovermagazine.com/topics/space/rss.xml'), 
             (u'Human Origins', u'http://discovermagazine.com/topics/human-origins/rss.xml'), 
             (u'Living World', u'http://discovermagazine.com/topics/living-world/rss.xml'), 
             (u'Environment', u'http://discovermagazine.com/topics/environment/rss.xml'), 
             (u'Physics & Math', u'http://discovermagazine.com/topics/physics-math/rss.xml'), 
             (u"20 Things you didn't know about...", u'http://discovermagazine.com/columns/20-things-you-didnt-know/rss.xml'), 
             (u'Fuzzy Math', u'http://discovermagazine.com/columns/fuzzy-math/rss.xml'), 
             (u'The Brain', u'http://discovermagazine.com/columns/the-brain/rss.xml'), 
             (u'What is This', u'http://discovermagazine.com/columns/what-is-this/rss.xml'),
             (u'Vital Signs', u'http://discovermagazine.com/columns/vital-signs/rss.xml'), 
             (u'Think Tech', u'http://discovermagazine.com/columns/think-tech/rss.xml'),
             (u'Future Tech', u'http://discovermagazine.com/columns/future-tech/rss.xml'),
             (u'Discover Interview', u'http://discovermagazine.com/columns/discover-interview/rss.xml'),
            ]
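
The multipage idea in `append_page` (keep following the "next" link and gather each page's body) can also be sketched in plain Python, outside calibre. This is an iterative variation for illustration, not the recipe's own method. The `fetch` callable and the `max_pages` limit are assumptions: `fetch(url)` is expected to return the page body and the next URL (or `None`), and the visited set guards against a pager that loops back on itself.

```python
def collect_pages(first_url, fetch, max_pages=20):
    """Follow 'next' links from first_url and return the page bodies in order.

    fetch(url) must return a (body, next_url_or_None) tuple.
    """
    bodies = []
    seen = set()
    url = first_url
    while url and url not in seen and len(bodies) < max_pages:
        seen.add(url)
        body, url = fetch(url)
        bodies.append(body)
    return bodies
```

In a recipe, `fetch` would typically wrap `index_to_soup` plus the lookups for the article body and the pager link.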
Old 05-10-2010, 09:32 AM   #1903
kiklop74
Guru
kiklop74 can program the VCR without an owner's manual.
 
Posts: 800
Karma: 194644
Join Date: Dec 2007
Location: Argentina
Device: Kindle Voyage
Russian news pack:
Kommersant
Izvestia
Ria Novosti
Argumenti & fakti
Attached Files
File Type: zip russia_news.zip (5.6 KB, 157 views)
Old 05-10-2010, 12:30 PM   #1904
smargo
Member
smargo began at the beginning.
 
Posts: 14
Karma: 10
Join Date: Aug 2007
Location: Switzerland
Device: Kindle Voyage, Kobo
@kiklop74

Your Kommersant recipe is great, thanks for this Russian pack!

Small wish: at the bottom of certain articles in Kommersant there are links to additional pages (of the same issue; they are not available as links from the RSS feed). For example, the page http://www.kommersant.ru/doc-rss.aspx?DocsID=1366511 links to page "2" (http://www.kommersant.ru/doc.aspx?DocsID=1366459) and page "3" (http://www.kommersant.ru/doc.aspx?DocsID=1366462). Is there any way to include these additional pages?
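
One possible starting point, sketched with the Python 3 standard library only: collect every link on the article page that carries a `DocsID` parameter. The names `DocLinkCollector` and `extra_page_urls` are hypothetical, and the actual Kommersant markup is not known here, so a real recipe would probably need to narrow the search to the pager block to avoid picking up unrelated articles.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse, parse_qs


class DocLinkCollector(HTMLParser):
    """Collect absolute URLs of links that carry a DocsID query parameter."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != 'a':
            return
        href = dict(attrs).get('href')
        if not href:
            return
        absolute = urljoin(self.base_url, href)
        has_doc_id = 'DocsID' in parse_qs(urlparse(absolute).query)
        if has_doc_id and absolute not in self.links:
            self.links.append(absolute)


def extra_page_urls(html, base_url):
    """Return DocsID links found in html, excluding the page itself."""
    collector = DocLinkCollector(base_url)
    collector.feed(html)
    return [u for u in collector.links if u != base_url]
```

The returned URLs could then be fetched and appended in the same way the Discover Magazine recipe in this thread appends its follow-on pages.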
Old 05-10-2010, 12:52 PM   #1905
kiklop74
Guru
kiklop74 can program the VCR without an owner's manual.
 
Posts: 800
Karma: 194644
Join Date: Dec 2007
Location: Argentina
Device: Kindle Voyage
There is always a way. But right now I have no spare time nor will I have it in the foreseeable future. You are on your own on this one.