Quote:
Originally Posted by kovidgoyal
The following recipe works for me with quotes preserved:
Are you sure? Different authors write in different styles, and some of them use the "classical" keyboard quotes, which are indeed preserved. I tried on three computers and always get the same result, even taking your proposal into account: the more "typographical" quotes, whether single or double, are turned into the replacement character.
I experimented a lot but found no solution, whereas when I download another newspaper with a similar structure (i.e. using both "classical" and "typographic" quotes), everything is all right, the encoding being ISO-8859-1. Its recipe is:
Code:
#!/usr/bin/env python2
# vim:fileencoding=utf-8
# License: GPLv3 Copyright: 2016, Kovid Goyal <kovid at kovidgoyal.net>
from __future__ import (unicode_literals, division, absolute_import,
                        print_function)
from calibre.web.feeds.recipes import BasicNewsRecipe


def classes(classes):
    q = frozenset(classes.split(' '))
    return dict(attrs={'class': lambda x: x and frozenset(x.split()).intersection(q)})


class BerlinerZeitung(BasicNewsRecipe):
    title = 'Berliner Zeitung'
    __author__ = 'Kovid Goyal'
    language = 'de'
    description = 'Berliner Zeitung RSS'
    timefmt = ' [%d.%m.%Y]'
    ignore_duplicate_articles = {'title', 'url'}
    remove_empty_feeds = True

    # oldest_article = 7.0
    no_stylesheets = True
    remove_javascript = True
    use_embedded_content = False
    publication_type = 'newspaper'

    keep_only_tags = [
        classes('dm_article_body dm_article_header'),
    ]
    remove_tags = [
        classes('dm_article_share'),
    ]

    feeds = [x.split() for x in [
        'Berlin http://www.berliner-zeitung.de/blueprint/servlet/xml/berliner-zeitung/23699382-asYahooFeed.xml',
        'Brandenburg http://www.berliner-zeitung.de/blueprint/servlet/xml/berliner-zeitung/23699570-asYahooFeed.xml',
        'Politik http://www.berliner-zeitung.de/blueprint/servlet/xml/berliner-zeitung/23699614-asYahooFeed.xml',
        'Wirtschaft http://www.berliner-zeitung.de/blueprint/servlet/xml/berliner-zeitung/23699644-asYahooFeed.xml',
        'Sport http://www.berliner-zeitung.de/blueprint/servlet/xml/berliner-zeitung/23699874-asYahooFeed.xml',
        'Kultur http://www.berliner-zeitung.de/blueprint/servlet/xml/berliner-zeitung/23700020-asYahooFeed.xml',
        'Panorama http://www.berliner-zeitung.de/blueprint/servlet/xml/berliner-zeitung/23700178-asYahooFeed.xml',
        'Wissen http://www.berliner-zeitung.de/blueprint/servlet/xml/berliner-zeitung/23700222-asYahooFeed.xml',
        'Digital http://www.berliner-zeitung.de/blueprint/servlet/xml/berliner-zeitung/23700594-asYahooFeed.xml',
        'Ratgeber http://www.berliner-zeitung.de/blueprint/servlet/xml/berliner-zeitung/23700190-asYahooFeed.xml',
    ]]
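
For what it's worth, the symptom I describe (keyboard quotes survive, typographic quotes become the replacement character) is what you would expect when bytes in a Windows-1252-style encoding are decoded as UTF-8. The following is only a sketch to illustrate that, outside of calibre; the sample sentence is made up, and I am only assuming that this is what happens with the failing site. If that assumption holds, setting the recipe's encoding attribute (e.g. encoding = 'cp1252') might be worth a try, but I have not verified it for this case.

```python
# -*- coding: utf-8 -*-
# Sketch only: shows how typographic quotes turn into U+FFFD when the
# bytes are decoded with the wrong charset. The sample text is invented.

# German-style typographic quotes (U+201E, U+201C) plus plain ASCII quotes.
sample = u'Er sagte: \u201eGuten Tag\u201c und "Hallo"'

# Pages labelled ISO-8859-1 often really use cp1252, where the
# typographic quotes are the single bytes 0x84 and 0x93.
raw = sample.encode('cp1252')

# Decoding those bytes as UTF-8: 0x84 and 0x93 are invalid on their own,
# so each one is replaced by U+FFFD. The ASCII quotes are unaffected.
wrong = raw.decode('utf-8', errors='replace')

# Decoding with the matching charset restores the original text.
right = raw.decode('cp1252')

print(repr(wrong))
print(repr(right))
```

Note that strict ISO-8859-1 has no typographic quotes at all, so a page that shows them while declaring ISO-8859-1 is most likely cp1252 in practice.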