07-08-2010, 07:53 PM   #2275
einstuerzende
rty,

I've been fumbling around with making a recipe for cn.wsj.com without much success. If you have time and are taking any requests, I'd appreciate whatever help you could give. I'm trying to get the Traditional character edition, which I think means putting "big5" at the front of every URL path (e.g. http://cn.wsj.com/big5/20100708/FRX003561.asp).
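To make the "big5" idea concrete, here is a rough sketch of the URL rewrite I have in mind. The helper name to_big5_url is made up for illustration, and the non-big5 URL in the usage line is only my guess at what the Simplified link looks like; I haven't confirmed the site maps every page this way.

```python
from urllib.parse import urlsplit, urlunsplit


def to_big5_url(url, prefix="big5"):
    """Insert `prefix` as the first path segment of `url`.

    Hypothetical helper: it assumes the Traditional edition lives at the
    same path as the Simplified one, just under /big5/.
    """
    parts = urlsplit(url)
    # Drop empty segments so a leading "/" does not produce "//".
    segments = [s for s in parts.path.split("/") if s]
    if segments and segments[0] == prefix:
        return url  # already a Traditional-edition link
    new_path = "/" + "/".join([prefix] + segments)
    return urlunsplit(
        (parts.scheme, parts.netloc, new_path, parts.query, parts.fragment)
    )


# Assumed Simplified-edition URL, rewritten to the form quoted above.
print(to_big5_url("http://cn.wsj.com/20100708/FRX003561.asp"))
```

If that mapping holds, something like this could be applied to each article link before the recipe fetches it.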

On another note, I made a couple of modifications to the BBC Chinese recipe to make it pull the Traditional version; otherwise it's pretty much as rty built it. Sharing in case it's of use to anyone:

Spoiler:
from calibre.web.feeds.news import BasicNewsRecipe


class AdvancedUserRecipe1277443634(BasicNewsRecipe):
    title = u'BBC 中文網'
    oldest_article = 7
    max_articles_per_feed = 100

    # Traditional-character ("trad") feeds: (section title, feed URL)
    feeds = [
        (u'\u4e3b\u9801', u'http://www.bbc.co.uk/zhongwen/trad/index.xml'),
        (u'\u570B\u969B\u65b0\u805e', u'http://www.bbc.co.uk/zhongwen/trad/world/index.xml'),
        (u'\u5169\u5CB8\u4E09\u5730', u'http://www.bbc.co.uk/zhongwen/trad/china/index.xml'),
        (u'\u91D1\u878D\u8CA1\u7D93', u'http://www.bbc.co.uk/zhongwen/trad/business/index.xml'),
        (u'\u7DB2\u4E0A\u4E92\u52D5', u'http://www.bbc.co.uk/zhongwen/trad/interactive/index.xml'),
        (u'\u97F3\u8996\u5716\u7247', u'http://www.bbc.co.uk/zhongwen/trad/multimedia/index.xml'),
        (u'\u5206\u6790\u8A55\u8AD6', u'http://www.bbc.co.uk/zhongwen/trad/indepth/index.xml'),
    ]

    # Use the device's fallback font so the Chinese text renders
    extra_css = '''
    @font-face {font-family: "DroidFont", serif, sans-serif; src: url(res:///system/fonts/DroidSansFallback.ttf); }\n
    body {margin-right: 8pt; font-family: 'DroidFont', serif;}\n
    h1 {font-family: 'DroidFont', serif;}\n
    .articledescription {font-family: 'DroidFont', serif;}
    '''

    __author__ = 'rty'
    __version__ = '1.0'
    language = 'zh-HANT'
    publisher = 'British Broadcasting Corporation'
    description = 'BBC news in Chinese'
    category = 'News, Chinese'
    remove_javascript = True
    use_embedded_content = False
    no_stylesheets = True
    encoding = 'UTF-8'
    conversion_options = {'linearize_tables': True}
    masthead_url = 'http://wscdn.bbc.co.uk/zhongwen/trad/images/1024/brand.jpg'

    # Keep only the headline, summary, body text and date stamp
    keep_only_tags = [
        dict(name='h1'),
        dict(name='p', attrs={'class': ['primary-topic', 'summary']}),
        dict(name='div', attrs={'class': ['bodytext', 'datestamp']}),
    ]