Old 06-05-2011, 06:53 PM   #1
purcelljf
Enthusiast
purcelljf began at the beginning.
 
Posts: 29
Karma: 10
Join Date: Aug 2010
Device: ipod touch
Recipe Request: WSJ Spanish Edition

Hi all, I am fairly new to Calibre.

It doesn't appear there is a recipe for the Spanish edition of the Wall Street Journal. http://online.wsj.com/public/page/espanol-inicio.html

I tried the simple approach of just adding the RSS feed http://online.wsj.com/xml/rss/3_7687.xml, but that doesn't appear to work, so I guess I need a more custom recipe.

I have started looking at some of the tutorials, but the first scenario didn't seem to apply: the print-layout version of these pages ends with "#printMode" rather than being preceded by an additional HTML path, as in an example I saw.

Anyway, while I am going to keep plugging along with the tutorials, I thought somebody might have a suggestion to shorten my learning curve. Thanks.
Old 06-17-2011, 03:46 AM   #2
sexymax15
Enthusiast
sexymax15 began at the beginning.
 
Posts: 30
Karma: 12
Join Date: Jun 2011
Location: India
Device: Kindle 3g
There is no need to use the print version. Even if you request it with "def print_version(self, url): return url + '#printMode'", you don't get a print page in calibre; it still parses all the images, the header, the footer, etc. Here's my recipe. It works fine and fetches all the articles, and I have not detected any problems.
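A possible explanation for why appending "#printMode" changes nothing: the part of a URL after "#" is a fragment, which is handled on the client side and is not included in the HTTP request. This is an assumption about the cause and is not confirmed anywhere in this thread. The sketch below uses only the Python standard library; `url_sent_to_server` is a hypothetical helper name and the article URL is made up for illustration.

```python
from urllib.parse import urldefrag


def print_version(url):
    """Standalone copy of the hook quoted above: append the print-mode fragment."""
    return url + '#printMode'


def url_sent_to_server(url):
    """Hypothetical helper: return the URL without its fragment,
    i.e. the part an HTTP request can actually carry."""
    return urldefrag(url).url


# Made-up article URL, used only for illustration.
article = 'http://online.wsj.com/article/EXAMPLE.html'

with_fragment = print_version(article)
requested = url_sent_to_server(with_fragment)

# The fragment is stripped, so the request is for the same page as before.
print(with_fragment)
print(requested)
print(requested == article)
```

If that is indeed what happens, the downloaded HTML is the ordinary article page, which is consistent with cleaning it up through keep_only_tags and remove_tags instead.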


Quote:
# Created by sexymax15 ....sexymax15@gmail.com
# Wall Street Journal (Spanish) recipe
from calibre.web.feeds.news import BasicNewsRecipe


class AdvancedUserRecipe1308289809(BasicNewsRecipe):
    title = u'Wall Street Journal(Spanish)'
    oldest_article = 7            # maximum article age, in days
    max_articles_per_feed = 20
    use_embedded_content = False  # download the linked article pages

    remove_empty_feeds = True
    no_stylesheets = True
    remove_javascript = True

    # Drop images and the page elements that are not part of the article text.
    remove_tags = [
        dict(name='img'),
        {'class': ['header', 'articleSection first', 'articleThumbnail_1',
                   'insettipUnit insetZoomTarget', 'insetZoomTargetBox',
                   'insettipBox ', 'insettip']},
    ]
    # Keep only the article body, byline and headline (written as a list of specs).
    keep_only_tags = [{'class': ['articlePage', 'byline',
                                 'articleHeadlineBox headlineType-newswire']}]
    extra_css = ''' h1 {font-family:georgia,serif;font-size: large} '''

    feeds = [(u'Wall Street Journal(Spanish)', u'http://online.wsj.com/xml/rss/3_7687.xml')]
Attached Files
File Type: zip Wall Street Journal(Spanish)_1121.zip (695 Bytes, 235 views)