Old 07-09-2010, 09:51 AM   #2281
jasonfedelem
Zealot
jasonfedelem ought to be getting tired of karma fortunes by now.
Posts: 118
Karma: 202232
Join Date: Jun 2010
Location: Texas
Device: Kindle Paperwhite Gen2
Quote:
Originally Posted by jasonfedelem View Post
That's weird. I'm looking under Fetch News under both "Austin" and "The Austin" but it doesn't show up...

Is there some extended lib of recipes? I'm running v0.7.7. and have 237 recipes listed under "English"
Never mind! I'm blind.... it was listed under just "Statesman".

I see that you are the author of it. Have you considered changing the listed name to "Austin American Statesmen"? I think that's what most people would look for...

Last edited by jasonfedelem; 07-09-2010 at 09:53 AM.
Old 07-09-2010, 10:34 AM   #2282
einstuerzende
Junior Member
einstuerzende began at the beginning.
 
Posts: 7
Karma: 10
Join Date: Jul 2010
Device: Kindle
Quote:
Originally Posted by rty View Post
Either I am getting senile today, or cn.wsj.com has some kind of protection in place that stops Calibre's basic scraping method from working on the RSS page. We'll have to wait for the experts to help.
Yeah, I think the dates on the RSS feed were wonky. Nothing got picked up the first time I put it through; only once it was allowed to pull older articles did it get anything. But even then it was pretty ugly (my Python kung-fu is hella weak).

Experts?
Old 07-09-2010, 09:25 PM   #2283
bhandarisaurabh
Enthusiast
bhandarisaurabh began at the beginning.
 
Posts: 49
Karma: 10
Join Date: Aug 2009
Device: none
Can someone make a recipe for India Knowledge@Wharton?
http://knowledge.wharton.upenn.edu/india/rss/

And one for the Financial Express print edition, without using feeds, using this link:
http://www.financialexpress.com/print/
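
For a site that has to be read from an index page rather than from feeds, a recipe can override parse_index and return the article list itself, as a list of (section title, list of article dicts) pairs. The sketch below only shows how such a structure could be built from the links on a page, using the standard library. The sample markup, the LinkCollector class and the build_index helper are made up for illustration; the real print-edition page will need its own link filtering.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkCollector(HTMLParser):
    """Collect (text, href) pairs for every <a> tag in a page."""

    def __init__(self):
        super().__init__()
        self.links = []
        self._href = None
        self._text = []

    def handle_starttag(self, tag, attrs):
        if tag == 'a':
            self._href = dict(attrs).get('href')
            self._text = []

    def handle_data(self, data):
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == 'a' and self._href is not None:
            # Collapse runs of whitespace inside the link text.
            title = ' '.join(''.join(self._text).split())
            if title:
                self.links.append((title, self._href))
            self._href = None


def build_index(html, base_url, section='Print Edition'):
    """Return data shaped like parse_index output: [(section, [article, ...])]."""
    parser = LinkCollector()
    parser.feed(html)
    articles = [
        {'title': title, 'url': urljoin(base_url, href),
         'description': '', 'date': ''}
        for title, href in parser.links
    ]
    return [(section, articles)]


# Hypothetical markup standing in for the real index page.
SAMPLE = ('<ul><li><a href="/news/story-one/1/">Story one</a></li>'
          '<li><a href="/news/story-two/2/">Story  two</a></li></ul>')
index = build_index(SAMPLE, 'http://www.financialexpress.com/print/')
print(index)
```

Inside a recipe, the page would be fetched first and a structure like this returned from parse_index in place of a feeds list.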
Old 07-10-2010, 02:01 AM   #2284
rty
Zealot
rty got an A in P-Chem.
 
Posts: 108
Karma: 6066
Join Date: Apr 2010
Location: Singapore
Device: iPad Air, Kindle DXG, Kindle Paperwhite
Quote:
Originally Posted by einstuerzende View Post
Yeah, I think the dates on the RSS feed were wonky.
Bingo! You got it! The wonky dates!

Just insert this magic line in your recipe and it should work!

timefmt = ' [%Y %b %d]'

I'll make the recipe for you.
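
For context, here is a small sketch of what that format string produces. As far as I know, timefmt only controls how the date is written in the generated title, while oldest_article (in days) is the setting that decides whether older feed entries are kept, which fits the observation that articles showed up only once older ones were allowed. The helper names and sample dates below are made up for illustration; this is plain Python, not calibre's own code.

```python
from datetime import datetime, timedelta

TIMEFMT = ' [%Y %b %d]'   # the format string from the post above


def title_suffix(when, fmt=TIMEFMT):
    """Render a date the way a strftime-style format string would."""
    return when.strftime(fmt)


def is_recent_enough(published, now, oldest_article_days):
    """Illustrative age check: keep an entry only if it is inside the window."""
    return now - published <= timedelta(days=oldest_article_days)


now = datetime(2010, 7, 10, 2, 0)
published = datetime(2010, 7, 1, 9, 30)   # hypothetical feed entry date

print(title_suffix(now))                      # date text appended to the title
print(is_recent_enough(published, now, 7))    # entry falls outside a 7-day window
print(is_recent_enough(published, now, 14))   # entry falls inside a 14-day window
```

So if a feed carries stale or odd dates, widening oldest_article is the change most likely to matter; timefmt is cosmetic.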
Old 07-10-2010, 03:19 AM   #2285
capidamonte
Not who you think I am...
capidamonte can even cheer up an android equipped with a defective Genuine Personality Prototype.
Posts: 374
Karma: 30283
Join Date: Jan 2010
Location: Honolulu
Device: PocketBook 360 -- Ivory
Could I request a recipe for the Calibre User Manual?

Perhaps it could test for changes so the server doesn't get bombed...

Thanks!
Old 07-10-2010, 04:08 AM   #2286
DoctorOhh
US Navy, Retired
DoctorOhh ought to be getting tired of karma fortunes by now.
Posts: 9,897
Karma: 13806776
Join Date: Feb 2009
Location: North Carolina
Device: Icarus Illumina XL HD, Kindle PaperWhite SE 11th Gen
Quote:
Originally Posted by capidamonte View Post
Could I request a recipe for the Calibre User Manual?

Perhaps it could test for changes so the server doesn't get bombed...

Thanks!
This isn't a recipe, but the online user manual says the following.

Quote:
An e-book version of this User Manual is available in EPUB format. Because the User Manual uses advanced formatting, it is only suitable for use with the calibre e-book viewer.
Old 07-10-2010, 04:41 AM   #2287
capidamonte
Not who you think I am...
capidamonte can even cheer up an android equipped with a defective Genuine Personality Prototype.
Posts: 374
Karma: 30283
Join Date: Jan 2010
Location: Honolulu
Device: PocketBook 360 -- Ivory
Yeah, I just went looking for that, and couldn't see it. I'm sure it was there, b/c you found it, but for me it was invisible. Even though I'd seen it before.

Assuming it's updated when the webpage is updated, maybe I could write a script to download it and import it into Calibre using the command-line tools.

Still, it'd be elegant if Calibre provided its own updates via its News system, wouldn't it?
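
A rough sketch of such a script, assuming the server supports conditional requests (ETag or Last-Modified), so an unchanged file costs only a short 304 reply. MANUAL_URL and both file names are placeholders, not the real download location. calibredb add is calibre's command-line way of adding a book to the library.

```python
import json
import subprocess
import urllib.error
import urllib.request
from pathlib import Path

# Hypothetical values: replace with the real download link and local paths.
MANUAL_URL = "https://example.com/calibre-manual.epub"
STATE_FILE = Path("manual_state.json")
EPUB_FILE = Path("calibre-manual.epub")


def conditional_headers(state):
    """Build request headers that let the server answer 304 Not Modified."""
    headers = {}
    if state.get("etag"):
        headers["If-None-Match"] = state["etag"]
    if state.get("last_modified"):
        headers["If-Modified-Since"] = state["last_modified"]
    return headers


def fetch_if_changed(url=MANUAL_URL):
    """Download the EPUB only if the server reports a newer copy.

    Returns True when a new file was written, False when nothing changed.
    """
    state = json.loads(STATE_FILE.read_text()) if STATE_FILE.exists() else {}
    request = urllib.request.Request(url, headers=conditional_headers(state))
    try:
        with urllib.request.urlopen(request) as response:
            EPUB_FILE.write_bytes(response.read())
            new_state = {
                "etag": response.headers.get("ETag"),
                "last_modified": response.headers.get("Last-Modified"),
            }
    except urllib.error.HTTPError as err:
        if err.code == 304:   # unchanged since the last run
            return False
        raise
    STATE_FILE.write_text(json.dumps(new_state))
    return True


def main():
    """Fetch the manual and hand it to calibre only when it has changed."""
    if fetch_if_changed():
        subprocess.run(["calibredb", "add", str(EPUB_FILE)], check=True)
```

Calling main() from a daily scheduled job would be the intended use. One caveat, as far as I know: calibredb add skips books it considers duplicates unless told otherwise, so replacing an older copy of the manual may need extra handling.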
Old 07-10-2010, 06:42 AM   #2288
Dereks
Connoisseur
Dereks began at the beginning.
 
Posts: 57
Karma: 10
Join Date: Feb 2010
Device: Kindle Paperwhite 1
I saw there was a discussion about problems with the Google Reader recipe a while ago. Was it solved?
Old 07-10-2010, 07:07 AM   #2289
rty
Zealot
rty got an A in P-Chem.
 
Posts: 108
Karma: 6066
Join Date: Apr 2010
Location: Singapore
Device: iPad Air, Kindle DXG, Kindle Paperwhite
Quote:
Originally Posted by einstuerzende View Post
rty,

I've been fumbling around with making a recipe for cn.wsj.com without an awful lot of success. If you have time and are taking any requests, I'd appreciate whatever help you could give. I'm trying to get the Traditional character edition, which I think means throwing "big5" in front of everything (ex: http://cn.wsj.com/big5/20100708/FRX003561.asp)
http://chinese.wsj.com/gb/rss01.xml

Would you like to take a look at the recipe code below? It pulls all the correct articles, but for some reason 'remove_tags_after' doesn't work on this particular site. Basically, you want to remove everything after the div with id='toolbar_tb'.

Spoiler:

Code:
from calibre.web.feeds.news import BasicNewsRecipe

class AdvancedUserRecipe1278740771(BasicNewsRecipe):
    title          = u'WSJ 华尔街日报'
    __author__ = 'x'
    oldest_article = 14
    max_articles_per_feed = 100
    timefmt = ' [%Y %b %d]'
    # Only one feed is enabled; uncomment the others to fetch more sections.
    feeds          = [
        #(u'要闻', u'http://chinese.wsj.com/gb/rss01.xml'),
        #(u'特写', u'http://chinese.wsj.com/gb/rss02.xml'),
        (u'国际财经', u'http://chinese.wsj.com/gb/rssglobal.xml'),
        #(u'能源与汽车', u'http://chinese.wsj.com/gb/rssautoene.xml')
    ]
    language = 'zh-cn'
    publisher  = 'Dow Jones & Company, Inc.'
    description           = 'Wall Street Journal - Chinese edition'
    category              = 'News, Business'
    remove_javascript = True
    use_embedded_content   = False
    no_stylesheets = True
    encoding               = 'GB2312'
    #conversion_options = {'linearize_tables':True}

    extra_css = '''
             @font-face { font-family: "DroidFont", serif, sans-serif;  src: url(res:///system/fonts/DroidSansFallback.ttf); }\n
             body {
                  margin-right: 8pt;
                  font-family: 'DroidFont', serif;}
             .left_content {font-family: 'DroidFont', serif, sans-serif}
            '''

    # This is the setting that does not seem to take effect on this site.
    remove_tags_after = [dict(name='div', attrs={'id':'toolbar_tb'})]
    keep_only_tags = [dict(name='div', attrs={'id':['headline','bodytext']})]
    remove_tags = [
        dict(name='div', attrs={'id':['tabdiv','toolbar_tt','toolbar_tb','bottom1','sponsor','nav','column2']}),
    ]

    def preprocess_html(self, soup):
        # Strip inline styles and fixed widths so the text reflows on e-readers.
        for item in soup.findAll(style=True):
            del item['style']
        for item in soup.findAll(width=True):
            del item['width']
        return soup
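
One possible reason, offered as a guess rather than a diagnosis: as far as I understand calibre's cleanup order, keep_only_tags is applied before remove_tags_after. If the div with id='toolbar_tb' sits outside the 'headline' and 'bodytext' divs, it has already been discarded by the time remove_tags_after looks for it, so nothing gets cut. The sketch below imitates that order with the standard library only; the page markup and the helper names are invented for illustration and are not calibre code.

```python
import xml.etree.ElementTree as ET

# Hypothetical page layout; the real cn.wsj.com markup may differ.
PAGE = """
<html><body>
  <div id="headline">Headline</div>
  <div id="bodytext">Article text</div>
  <div id="toolbar_tb">Share and print links</div>
  <div id="sponsor">Ad</div>
</body></html>
"""


def keep_only(root, ids):
    """Mimic keep_only_tags: build a new body holding only the matching divs."""
    body = ET.Element("body")
    for div in root.iter("div"):
        if div.get("id") in ids:
            body.append(div)
    return body


def find_by_id(root, wanted):
    """Return the first element whose id attribute equals wanted, else None."""
    for element in root.iter():
        if element.get("id") == wanted:
            return element
    return None


root = ET.fromstring(PAGE)
kept = keep_only(root, {"headline", "bodytext"})

# The marker exists in the full page but not in what is left after filtering.
print(find_by_id(root, "toolbar_tb") is not None)
print(find_by_id(kept, "toolbar_tb") is not None)
```

If that is what happens here, the marker div has nothing left to anchor on, and relying on keep_only_tags plus remove_tags alone would be the thing to try.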
Old 07-10-2010, 07:12 AM   #2290
bikecd
Junior Member
bikecd began at the beginning.
 
Posts: 1
Karma: 10
Join Date: Jul 2010
Device: Amazon Kindle
Hello!

Great program!!!! You all do great work! I do have one issue, however. I tried to create my own recipe for El País, a Spanish newspaper, since the provided recipes fetch the print version of the articles on the webpage. I wanted a recipe that gets the print version of the articles in the PRINT EDITION each morning, but to no avail. I failed miserably! Can anyone help create a recipe for El País that gets only the articles in the daily PRINT EDITION?

THANKS!!!
Old 07-10-2010, 07:48 AM   #2291
sibermage
Junior Member
sibermage began at the beginning.
 
Posts: 3
Karma: 10
Join Date: Jul 2010
Device: Sony PRS600
Quote:
Originally Posted by rty View Post
Here it is: Recipe for SINGTAO DAILY CANADA

Language: Chinese (Traditional)
Tested OK on B&N Nook e-reader.

Updated: Recipe updated to remove the hidden/bogus tab character that prevented the recipe from being imported into Calibre.
Thanks RTY. That worked.
Old 07-10-2010, 11:14 AM   #2292
rty
Zealot
rty got an A in P-Chem.
 
Posts: 108
Karma: 6066
Join Date: Apr 2010
Location: Singapore
Device: iPad Air, Kindle DXG, Kindle Paperwhite
Quote:
Originally Posted by jasonfedelem View Post
I see that you are the author of it. Have you considered changing the listed name to "Austin American Statesmen"? I think that's what most people would look for...
Nah, if you go to the website, you can tell that the publisher doesn't seem to agree.

But here's another recipe you asked for: Waco Tribune.
Attached Files
File Type: zip Waco Tribune Herald.zip (664 Bytes, 216 views)

Last edited by rty; 07-14-2010 at 10:54 AM.
Old 07-10-2010, 11:18 AM   #2293
mikegps1
Junior Member
mikegps1 began at the beginning.
 
Posts: 4
Karma: 10
Join Date: Jul 2010
Device: sony prs600
Times Online - subscription version

Since my last post a couple of days ago, I've tried to update the Times Online recipe as the paper now requires a subscription for access to newsfeeds.

My first attempt is below. I get errors in lines 41 and 43. Can anyone help, please?

BTW, version 0.7.8 is great; calibre gets better all the time.


************************************************** ******
#!/usr/bin/env python

__license__ = 'GPL v3'
__copyright__ = '2008-2009, Darko Miletic <darko.miletic at gmail.com>'
'''
timesonline.co.uk
'''
import re

from calibre.web.feeds.news import BasicNewsRecipe
from calibre.ebooks.BeautifulSoup import Tag

class Timesonline(BasicNewsRecipe):
    title = 'The Times Online'
    __author__ = 'Darko Miletic and Sujata Raman'
    description = 'UK news'
    publisher = 'timesonline.co.uk'
    category = 'news, politics, UK'
    oldest_article = 2
    max_articles_per_feed = 100
    no_stylesheets = True
    use_embedded_content = False
    simultaneous_downloads = 1
    encoding = 'ISO-8859-1'
    remove_javascript = True
    language = 'en_GB'
    recursions = 9
    # Tells calibre to ask for a username and password for this recipe.
    needs_subscription = True
    # The URL has to be a quoted string; the elided part ("...") is as posted.
    LOGIN = 'http://www.timesplus.co.uk/tto/news/...lightbox=false'
    keep_only_tags = [
        dict(name='div', attrs= {'id':['region-column1and2-layout2']}),
        {'class' : ['subheading']},
        dict(name='div', attrs= {'id':['dynamic-image-holder']}),
        dict(name='div', attrs= {'class':['article-author']}),
        dict(name='div', attrs= {'id':['related-article-links']}),
    ]

    remove_tags = [
        dict(name=['embed','object','form','iframe']),
        dict(name='span', attrs = {'class':'float-left padding-left-8 padding-top-2'}),
        dict(name='div', attrs= {'id':['region-footer','region-column2-layout2','grid-column4','login-status','comment-sort-order']}),
        dict(name='div', attrs= {'class': ['debate-quote-container','clear','your-comment','float-left related-attachements-container','float-left padding-bottom-5 padding-top-8','puff-top']}),
        dict(name='span', attrs = {'id': ['comment-count']}),
        dict(name='ul',attrs = {'id': 'read-all-comments'}),
        dict(name='a', attrs = {'class':'reg-bold'}),
    ]

    extra_css = '''
        .small{font-family :Arial,Helvetica,sans-serif; font-size:x-small;}
        .byline{font-family :Arial,Helvetica,sans-serif; font-size:x-small; background:#F8F1D8;}
        .color-666{font-family :Arial,Helvetica,sans-serif; font-size:x-small; color:#666666; }
        h1{font-family:Georgia,Times New Roman,Times,serif;font-size:large; }
        .color-999 {color:#999999;}
        .x-small {font-size:x-small;}
        #related-article-links{font-family :Arial,Helvetica,sans-serif; font-size:small;}
        h2{color:#333333;font-family :Georgia,Times New Roman,Times,serif; font-size:small;}
        p{font-family :Arial,Helvetica,sans-serif; font-size:small;}
    '''
    feeds = [
        (u'Top stories from Times Online', u'http://www.timesonline.co.uk/tol/feeds/rss/topstories.xml' ),
        ('Latest Business News', 'http://www.timesonline.co.uk/tol/feeds/rss/business.xml'),
        ('Economics', 'http://www.timesonline.co.uk/tol/feeds/rss/economics.xml'),
        ('World News', 'http://www.timesonline.co.uk/tol/feeds/rss/worldnews.xml'),
        ('UK News', 'http://www.timesonline.co.uk/tol/feeds/rss/uknews.xml'),
        ('Travel News', 'http://www.timesonline.co.uk/tol/feeds/rss/travel.xml'),
        ('Sports News', 'http://www.timesonline.co.uk/tol/feeds/rss/sport.xml'),
        ('Film News', 'http://www.timesonline.co.uk/tol/feeds/rss/film.xml'),
        ('Tech news', 'http://www.timesonline.co.uk/tol/feeds/rss/tech.xml'),
        ('Literary Supplement', 'http://www.timesonline.co.uk/tol/feeds/rss/thetls.xml'),
    ]

    def get_cover_url(self):
        # Take the cover image from the newspapers index page, if present.
        cover_url = None
        index = 'http://www.timesonline.co.uk/tol/newspapers/'
        soup = self.index_to_soup(index)
        link_item = soup.find(name = 'div',attrs ={'class': "float-left margin-right-15"})
        if link_item:
            cover_url = link_item.img['src']
        return cover_url

    def get_article_url(self, article):
        return article.get('guid', None)

    def get_browser(self):
        # The base-class method needs the recipe instance passed in explicitly.
        br = BasicNewsRecipe.get_browser(self)
        if self.username is not None and self.password is not None:
            # The form and field names are as posted and untested against the new login page.
            br.open(self.LOGIN)
            br.select_form(name='loginForm')
            br['username'] = self.username
            br['password'] = self.password
            br.submit()
        return br

    def preprocess_html(self, soup):
        # Declare language and charset in the page head.
        soup.html['xml:lang'] = self.language
        soup.html['lang'] = self.language
        mlang = Tag(soup,'meta',[("http-equiv","Content-Language"),("content",self.language)])
        mcharset = Tag(soup,'meta',[("http-equiv","Content-Type"),("content","text/html; charset=ISO-8859-1")])
        soup.head.insert(0,mlang)
        soup.head.insert(1,mcharset)
        return self.adeify_images(soup)

    def postprocess_html(self,soup,first):
        # Drop the pagination link text left over from multi-page articles.
        for tag in soup.findAll(text = ['Previous Page','Next Page']):
            tag.extract()
        return soup
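
Errors reported at specific line numbers when a recipe is loaded are often plain Python syntax problems, such as a URL that is not in quotes or indentation lost when pasting into a forum. A rough way to find them before importing is to compile the recipe text with Python itself. The helper name and the sample snippets below are made up for illustration.

```python
def first_syntax_error(source, filename='recipe.py'):
    """Return (line number, message) for the first syntax error, or None."""
    try:
        compile(source, filename, 'exec')
    except SyntaxError as err:   # IndentationError is a subclass of SyntaxError
        return err.lineno, err.msg
    return None


# Hypothetical snippets reproducing two common paste problems.
unquoted_url = "title = 'Example'\nLOGIN = http://www.example.com/login\n"
lost_indent = "class Example(object):\ntitle = 'Example'\n"
fine = "class Example(object):\n    title = 'Example'\n"

for name, src in [('unquoted_url', unquoted_url),
                  ('lost_indent', lost_indent),
                  ('fine', fine)]:
    print(name, first_syntax_error(src))
```

To check a saved recipe, read the file and pass its text to first_syntax_error. This only catches syntax problems; a recipe that compiles can still fail at run time, for example on a changed login form.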
Old 07-11-2010, 06:43 AM   #2294
iLeaveYou
Junior Member
iLeaveYou began at the beginning.
 
Posts: 5
Karma: 10
Join Date: Jul 2010
Device: Kindle DX
WOW!!!
0.7.8 is just great.
Now everybody can access a Romanian recipe.
I am asking again if somebody could do a recipe for this:
http://www.realitatea.net/rss.html
They probably have the best RSS feeds for Romanian news.
Thank you.

Last edited by iLeaveYou; 07-11-2010 at 06:47 AM.
Old 07-11-2010, 10:43 AM   #2295
Starson17
Wizard
Starson17 can program the VCR without an owner's manual.
 
Posts: 4,004
Karma: 177841
Join Date: Dec 2009
Device: WinMo: IPAQ; Android: HTC HD2, Archos 7o; Java:Gravity T
Quote:
Originally Posted by Dereks View Post
I saw there was a discussion about problems with the Google Reader recipe a while ago. Was it solved?
I solved it a few hours ago. It's still being tested - there's a dedicated thread with the fixed recipe, if you're interested.