06-25-2010, 12:28 PM   #16
ctos
Location: Beijing, China
Device: Nook, G1
Quote:
Originally Posted by rty
It works well on my recipe for BBC Chinese (http://www.bbc.co.uk/zhongwen/simp/indepth/index.xml)

Spoiler:

Code:
from calibre.web.feeds.news import BasicNewsRecipe

class AdvancedUserRecipe1277443634(BasicNewsRecipe):
    title          = u'BBC Chinese'
    oldest_article = 7
    max_articles_per_feed = 100

    feeds          = [
        # \u4e3b\u9875 = "Home", \u5206\u6790\u8bc4\u8bba = "Analysis and commentary"
        (u'\u4e3b\u9875', u'http://www.bbc.co.uk/zhongwen/simp/index.xml'),
        (u'\u5206\u6790\u8bc4\u8bba', u'http://www.bbc.co.uk/zhongwen/simp/indepth/index.xml'),
        ]
    extra_css = '''
        @font-face {font-family: "DroidFont", serif, sans-serif; src: url(res:///system/fonts/DroidSansFallback.ttf); }
        body {margin-right: 8pt; font-family: 'DroidFont', serif;}
        h1 {font-family: 'DroidFont', serif, sans-serif}
        '''
    __author__            = 'rty'
    __version__            = '1.0'
    language = 'zh-HANS'
    publisher = 'British Broadcasting Corporation'
    description           = 'BBC news in Chinese'
    category              = 'News, Chinese'
    remove_javascript = True
    use_embedded_content   = False
    no_stylesheets = True
    encoding               = 'UTF-8'
    conversion_options = {'linearize_tables':True} 
    masthead_url = 'http://wscdn.bbc.co.uk/zhongwen/simp/images/1024/brand.jpg'
    keep_only_tags = [
                              dict(name='h1'),
                              dict(name='p', attrs={'class':['primary-topic','summary']}),
                              dict(name='div', attrs={'class':['bodytext','datestamp']}), 
                              ]


But there is still a problem on the XML feed page. Please look at the first photo: the article summary/description lines on the feed page show up as ??????? characters. The article itself is fine.

From what I have observed, the problem only happens on XML feed pages with UTF-8 encoding.

Any idea how to solve this?
Spoiler: [attached photos]

Did you try this?

- Open the EPUB file with a zip tool and find the .css file inside it.
- Search for every "font-family:" declaration and replace the font after it with the font you have defined.
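
If doing this by hand gets tedious, the two steps above can be roughed out in Python with only the standard library. This is just a sketch, not something tested on a Nook: the function names are made up for illustration, it assumes the stylesheets are UTF-8, and it leaves @font-face blocks alone so the font definition itself is not rewritten. It also keeps the "mimetype" entry uncompressed, which the EPUB container format expects.

```python
import re
import zipfile

# A whole "font-family: ..." declaration; the value runs up to ';' or '}'.
FONT_FAMILY_RE = re.compile(r"font-family\s*:\s*[^;}]+", re.IGNORECASE)
# An entire @font-face block, which should be left untouched.
FONT_FACE_RE = re.compile(r"@font-face\s*\{[^}]*\}", re.IGNORECASE)


def replace_font_families(css_text, new_family):
    """Point every font-family declaration outside @font-face at new_family."""
    def swap(chunk):
        # A function replacement avoids backslash handling in new_family.
        return FONT_FAMILY_RE.sub(lambda _m: "font-family: " + new_family, chunk)

    parts = []
    pos = 0
    for match in FONT_FACE_RE.finditer(css_text):
        parts.append(swap(css_text[pos:match.start()]))
        parts.append(match.group(0))  # keep the @font-face block as it is
        pos = match.end()
    parts.append(swap(css_text[pos:]))
    return "".join(parts)


def rewrite_epub_fonts(src_path, dst_path, new_family):
    """Copy an EPUB to dst_path, rewriting font-family in every .css entry."""
    with zipfile.ZipFile(src_path) as src, zipfile.ZipFile(dst_path, "w") as dst:
        for info in src.infolist():  # original entry order is preserved
            data = src.read(info.filename)
            if info.filename.lower().endswith(".css"):
                text = data.decode("utf-8")  # assumption: UTF-8 stylesheets
                data = replace_font_families(text, new_family).encode("utf-8")
            # "mimetype" must stay uncompressed; everything else is deflated.
            if info.filename == "mimetype":
                compress = zipfile.ZIP_STORED
            else:
                compress = zipfile.ZIP_DEFLATED
            dst.writestr(info, data, compress_type=compress)
```

For example, rewrite_epub_fonts('bbc.epub', 'bbc_fixed.epub', '"DroidFont", serif') would write a new file and leave the original untouched (both file names are hypothetical). Writing to a separate output file makes it easy to compare the result before copying it to the device.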