MobileRead Forums > E-Book Software > Calibre > Recipes

Old 09-23-2009, 07:05 AM   #751
GRiker
Comparer of the Ephemeris
GRiker ought to be getting tired of karma fortunes by now.
 
Posts: 1,496
Karma: 424697
Join Date: Mar 2009
Device: iPad
MichaelMSeattle:

Add the following to your recipe:
Code:
	def print_version(self, url):
		return url + '?pagewanted=print'
This will append the necessary suffix to fetch the print version. You can find a description of the function here.

G
Old 09-23-2009, 08:38 AM   #752
kiklop74
Guru
kiklop74 can program the VCR without an owner's manual.
 
Posts: 800
Karma: 194644
Join Date: Dec 2007
Location: Argentina
Device: Kindle Voyage
That will not work since NYT has quite good scraping protection.

Here is a recipe that works for the NYT Magazine; it can easily be modified for other parts of the NYT site.
Attached Files
File Type: zip nytmag.zip (1.3 KB, 200 views)
Old 09-23-2009, 01:01 PM   #753
MichaelMSeattle
Enthusiast
MichaelMSeattle began at the beginning.
 
Posts: 30
Karma: 16
Join Date: Sep 2009
Device: sony prs-505/600
help with NYTMagazine

Quote:
Originally Posted by kiklop74 View Post
That will not work since NYT has quite good scraping protection.

Here is a recipe that works for the NYT Magazine; it can easily be modified for other parts of the NYT site.
Thanks very much for responding so quickly! I love how you were able to get the cover image.

Your recipe returned the main articles of the magazine but not the sub-sections (which are listed in the TOC). I modified the recipe to add the sub-section feeds, but that only added the sub-sections to the TOC; their articles still come through empty.

For all the sub articles (not those in the main section) I just see:
"This article was downloaded by calibre from http://www.nytimes.com/2009/09/20/magazine/20Letters-t-001.html" (or whatever was the source).

I'm attaching the full recipe below. Thanks again for your help!
-Mike

==============================================
#!/usr/bin/env python

__license__ = 'GPL v3'
__copyright__ = '2009, Darko Miletic <darko.miletic at gmail.com>'
'''
nytimes.com/pages/magazine
'''

import time
from calibre.web.feeds.news import BasicNewsRecipe

class NewYorkTimesMagazine(BasicNewsRecipe):
    title = 'The New York Times Magazine3'
    __author__ = 'Darko Miletic'
    description = 'News from New York'
    publisher = 'The New York Times'
    category = 'news, politics, US'
    delay = 1
    language = 'en_US'
    oldest_article = 10
    max_articles_per_feed = 100
    no_stylesheets = True
    use_embedded_content = False
    encoding = 'cp1252'
    INDEX = 'http://www.nytimes.com/pages/magazine/'

    conversion_options = {
        'comments' : description
        ,'tags' : category
        ,'language' : language
        ,'publisher': publisher
    }


    keep_only_tags = [dict(name='div', attrs={'id':'article'})]

    remove_tags = [
        dict(name='div', attrs={'class':['header','nextArticleLink clearfix','correctionNote']})
        ,dict(name='div', attrs={'id':['toolsRight','articleInline','readerscomment','authorId']})
        ,dict(name=['object','link'])
    ]

    remove_tags_after = dict(name='div',attrs={'id':'pageLinks'})

    feeds = [(u'Articles', u'http://feeds.nytimes.com/nyt/rss/Magazine' ),
             (u'The Ethicist', u'http://ethicist.blogs.nytimes.com/feed/'),
             (u'Medium', u'http://themedium.blogs.nytimes.com/feed/'),
             (u'Motherload', u'http://parenting.blogs.nytimes.com/feed/')
            ]


    def append_page(self, soup, appendtag, position):
        pager = soup.find('div',attrs={'id':'pageLinks'})
        if pager:
            atag = pager.find('a',attrs={'title':'Next Page'})
            if atag:
                soup2 = self.index_to_soup('http://www.nytimes.com' + atag['href'])
                st = soup2.find('div',attrs={'id':'articleInline'})
                if st:
                    st.extract()
                tt = soup2.find('div',attrs={'class':'nextArticleLink clearfix'})
                if tt:
                    tt.extract()
                texttag = soup2.find('div', attrs={'id':'articleBody'})
                for it in texttag.findAll(style=True):
                    del it['style']
                for it in texttag.findAll(attrs={'id':'authorId'}):
                    it.extract()
                for it in texttag.findAll(attrs={'class':'correctionNote'}):
                    it.extract()
                newpos = len(texttag.contents)
                self.append_page(soup2,texttag,newpos)
                pager.extract()
                pager2 = texttag.find('div',attrs={'id':'pageLinks'})
                if pager2:
                    pager2.extract()
                texttag.extract()
                appendtag.insert(position,texttag)

    def get_article_url(self, article):
        return article.get('guid', None)

    def preprocess_html(self, soup):
        for item in soup.findAll(style=True):
            del item['style']
        self.append_page(soup, soup.body, 3)
        return soup

    def get_cover_url(self):
        cover = None
        soup = self.index_to_soup(self.INDEX)
        tag = soup.find('div',attrs={'id':'ABcolumnPromo'})
        if tag:
            st = time.strptime(tag.h3.string,'%m.%d.%Y')
            year = str(st.tm_year)
            month = "%.2d" % st.tm_mon
            day = "%.2d" % st.tm_mday
            cover = 'http://graphics8.nytimes.com/images/' + year + '/' + month +'/' + day +'/magazine/' + day +'cover-395.jpg'
        return cover
Old 09-23-2009, 09:29 PM   #754
gregcd
Connoisseur
gregcd rocks like Gibraltar!
 
Posts: 90
Karma: 100000
Join Date: Jan 2009
Location: New Zealand
Device: prs-t1, prs-650 to sell
Hi all, I'm updating a custom recipe to change the base font size for an LRF with html2lrf_options =
However, the recipe (New Scientist) already uses this. What is the command to use?
Old 09-23-2009, 09:39 PM   #755
kiklop74
Guru
kiklop74 can program the VCR without an owner's manual.
 
Posts: 800
Karma: 194644
Join Date: Dec 2007
Location: Argentina
Device: Kindle Voyage
Quote:
Originally Posted by gregcd View Post
Hi all, I'm updating a custom recipe to change the base font size for an LRF with html2lrf_options =
However, the recipe (New Scientist) already uses this. What is the command to use?
html2lrf_options is obsolete (it applies to calibre 0.5.x and earlier). Use the new directive instead:

conversion_options

See example here
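
For the base font size question above, a minimal sketch of how conversion_options might be set in a recipe. The class name, title, and the value 16 are assumptions for illustration, not taken from the New Scientist recipe; the try/except stand-in is only there so the snippet can be run outside calibre.

```python
try:
    from calibre.web.feeds.news import BasicNewsRecipe
except ImportError:
    # Stand-in so the snippet can be read and run outside calibre.
    class BasicNewsRecipe(object):
        pass


class LargerFontRecipe(BasicNewsRecipe):
    # Hypothetical recipe name and title, for illustration only.
    title = 'Example recipe with a larger base font'

    # Keys are conversion option names, written with underscores.
    # The value 16 is an assumed example, not a recommendation.
    conversion_options = {
        'base_font_size': 16,
    }
```

If the recipe already defines conversion_options (as the NYT Magazine recipe earlier in this thread does), the new key would go into the existing dictionary rather than into a second one.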
Old 09-24-2009, 09:30 AM   #756
kiklop74
Guru
kiklop74 can program the VCR without an owner's manual.
 
Posts: 800
Karma: 194644
Join Date: Dec 2007
Location: Argentina
Device: Kindle Voyage
Business Standard (India's daily newspaper)
Attached Files
File Type: zip business_standard.zip (1.2 KB, 185 views)
Old 09-24-2009, 11:59 AM   #757
CABITSS
Member
CABITSS began at the beginning.
 
Posts: 13
Karma: 10
Join Date: Sep 2009
Device: amazonkindle
Custom Recipe

Can someone help and create a recipe for The Toronto Star?
Thanks in advance.
Old 09-24-2009, 12:11 PM   #758
kiklop74
Guru
kiklop74 can program the VCR without an owner's manual.
 
Posts: 800
Karma: 194644
Join Date: Dec 2007
Location: Argentina
Device: Kindle Voyage
It is already present in this thread:

https://www.mobileread.com/forums/sho...&postcount=747
Old 09-24-2009, 05:10 PM   #759
Andreiko
Junior Member
Andreiko began at the beginning.
 
Posts: 5
Karma: 10
Join Date: Sep 2009
Device: Kindle DX, Sony-505
I am asking again whether you guys could make a recipe for inosmi.ru.
Here is the RSS feed: http://www.inosmi.ru/misc/export/xml...ranslation.xml

That is, if this is possible; I understand it takes time.
Old 09-24-2009, 05:11 PM   #760
Andreiko
Junior Member
Andreiko began at the beginning.
 
Posts: 5
Karma: 10
Join Date: Sep 2009
Device: Kindle DX, Sony-505
http://www.inosmi.ru/misc/export/xml...ranslation.xml

Can someone please make a recipe out of this feed?
I know I have already asked, but maybe it went unnoticed. Sorry for repeating.
Old 09-24-2009, 10:18 PM   #761
bhandarisaurabh
Enthusiast
bhandarisaurabh began at the beginning.
 
Posts: 49
Karma: 10
Join Date: Aug 2009
Device: none
Quote:
Originally Posted by kiklop74 View Post
Business Standard (India's daily newspaper)
Thanks a lot, you are a genius.
Old 09-25-2009, 03:25 AM   #762
L4ur3nt
Junior Member
L4ur3nt began at the beginning.
 
Posts: 5
Karma: 10
Join Date: Sep 2009
Device: Cybook Gen3
Quote:
Originally Posted by L4ur3nt View Post
Hello all,

I have a problem with this RSS feed:

http://rss.futura-sciences.com/packfs

I get some strange things in the text, like this:



Does somebody know why? Thank you very much!
Old 09-25-2009, 03:57 AM   #763
highwaykind
Connoisseur
highwaykind began at the beginning.
 
Posts: 54
Karma: 10
Join Date: May 2009
Device: Kindle Touch
Quote:
Originally Posted by kiklop74 View Post
Here goes:
Thank you!!
Old 09-25-2009, 07:54 AM   #764
olaf
Enthusiast
olaf is on a distinguished road
 
Posts: 43
Karma: 50
Join Date: May 2009
Device: Kindle3
What is the best way to change "smart quotes" (beginning quote, end quote) into a fixed single quote? My recipe is showing a special question mark character in each place where one of those quote marks occurs.
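
One possible approach, sketched in plain Python under assumptions: the helper name straighten_quotes is hypothetical, and where to call it inside a recipe is left open. A question mark character of this kind may also point to an encoding mismatch (for example the recipe's encoding setting), in which case replacing the quotes only hides the symptom.

```python
# Map the four common "smart" quote code points to plain ASCII quotes.
SMART_QUOTES = {
    0x2018: u"'",   # left single quotation mark
    0x2019: u"'",   # right single quotation mark
    0x201C: u'"',   # left double quotation mark
    0x201D: u'"',   # right double quotation mark
}


def straighten_quotes(text):
    """Return text with curly quotes replaced by straight ASCII quotes."""
    return text.translate(SMART_QUOTES)
```

For example, straighten_quotes(u'\u201cIt\u2019s\u201d') gives the string "It's" wrapped in straight double quotes.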
Old 09-25-2009, 09:45 AM   #765
CABITSS
Member
CABITSS began at the beginning.
 
Posts: 13
Karma: 10
Join Date: Sep 2009
Device: amazonkindle
You are Brilliant

Quote:
Originally Posted by kiklop74 View Post
The Toronto Star:
Thank you so much.
You are brilliant.
Thanks once again for your quick and workable response.
All times are GMT -4.