|
|
#16 |
|
Wizard
Posts: 1,088
Karma: 11562565
Join Date: Mar 2013
Location: Guben, Brandenburg, Germany
Device: Kobo Clara 2E, Tolino Shine 3
|
|
|
|
|
|
|
#17 |
|
Wizard
Posts: 1,088
Karma: 11562565
Join Date: Mar 2013
Location: Guben, Brandenburg, Germany
Device: Kobo Clara 2E, Tolino Shine 3
|
Code:
encoding = 'windows-1252' |
|
|
|
|
|
|
|
#18 |
|
Wizard
Posts: 1,088
Karma: 11562565
Join Date: Mar 2013
Location: Guben, Brandenburg, Germany
Device: Kobo Clara 2E, Tolino Shine 3
|
Sorry, but the problem itself intrigues me.
Could it be that there is something wrong with preprocess_regexps? I tried simply replacing A with B, using:
Code:
preprocess_regexps = [
    (re.compile(r'A', re.DOTALL),
     lambda match: 'B'),
]
|
|
|
|
|
|
#19 |
|
creator of calibre
Posts: 45,670
Karma: 28549304
Join Date: Oct 2006
Location: Mumbai, India
Device: Various
|
Make sure you have the right indentation for preprocess_regexps; it should be at the same level as title, for example.
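To make the indentation point concrete: preprocess_regexps is a class attribute, so it has to sit in the class body at the same level as title. The sketch below does not use calibre at all. `FakeRecipe` and `apply_preprocess_regexps` are hypothetical names invented for illustration, and the loop only approximates how such (pattern, function) pairs are typically applied to downloaded HTML; it is not calibre's actual code.

```python
import re


class FakeRecipe:
    # Hypothetical stand-in for a BasicNewsRecipe subclass (not calibre code).
    title = 'kath.net'
    # Same indentation level as title, so this is a class attribute.
    preprocess_regexps = [
        (re.compile(r'A', re.DOTALL), lambda match: 'B'),
    ]


class RecipeWithoutRegexps:
    # Hypothetical recipe that defines no preprocess_regexps at all.
    title = 'no rules'


def apply_preprocess_regexps(recipe_cls, html):
    """Apply every (compiled pattern, replacement function) pair in order."""
    for pattern, replacement in getattr(recipe_cls, 'preprocess_regexps', []):
        html = pattern.sub(replacement, html)
    return html


result = apply_preprocess_regexps(FakeRecipe, '<p>A plain A</p>')
untouched = apply_preprocess_regexps(RecipeWithoutRegexps, '<p>A plain A</p>')
print(result)
print(untouched)
```

If the list were indented deeper, for example inside a method, it would no longer be an attribute of the class, and in this sketch the text would come back unchanged, as with `RecipeWithoutRegexps`.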
|
|
|
|
|
|
#20 |
|
Wizard
Posts: 1,088
Karma: 11562565
Join Date: Mar 2013
Location: Guben, Brandenburg, Germany
Device: Kobo Clara 2E, Tolino Shine 3
|
It looks like this:
Code:
class AdvancedUserRecipe1295262156(BasicNewsRecipe):
    title = u'kath.net'
    __author__ = 'Bobus'
    description = u'Katholische Nachrichten'
    oldest_article = 7
    language = 'de'
    max_articles_per_feed = 100
    no_stylesheets = True
    encoding = 'iso-8859-1'
    preprocess_regexps = [
        (re.compile(r'A', re.DOTALL),
         lambda match: 'B'),
Edit: The missing square bracket is, in fact, there (a copy/paste error).
Last edited by Leonatus; 05-17-2019 at 10:42 AM. |
|
|
|
|
|
|
|
#21 |
|
creator of calibre
Posts: 45,670
Karma: 28549304
Join Date: Oct 2006
Location: Mumbai, India
Device: Various
|
Post the actual recipe file as a zipped-up attachment; there is too much chance of things changing with copy/paste.
|
|
|
|
|
|
#22 |
|
Wizard
Posts: 1,088
Karma: 11562565
Join Date: Mar 2013
Location: Guben, Brandenburg, Germany
Device: Kobo Clara 2E, Tolino Shine 3
|
Here we are!
|
|
|
|
|
|
#23 |
|
creator of calibre
Posts: 45,670
Karma: 28549304
Join Date: Oct 2006
Location: Mumbai, India
Device: Various
|
The following recipe works for me with quotes preserved:
Code:
from calibre.web.feeds.news import BasicNewsRecipe
class AdvancedUserRecipe1295262156(BasicNewsRecipe):
    title = u'kath.net'
    __author__ = 'Bobus'
    description = u'Katholische Nachrichten'
    oldest_article = 7
    language = 'de'
    max_articles_per_feed = 100
    no_stylesheets = True
    encoding = 'cp1252'
    feeds = [(u'kath.net', u'https://www.kath.net/2005/xml/index.xml')]
    def print_version(self, url):
        return url + "/print/yes"
    def get_browser(self, *a, **kwargs):
        kwargs['verify_ssl_certificates'] = False
        return BasicNewsRecipe.get_browser(self, *a, **kwargs)
    extra_css = 'td.textb {font-size: medium;}'
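Why the encoding setting matters for typographic quotes can be checked in plain Python, independent of calibre. In Windows-1252 (cp1252) the bytes 0x84, 0x93 and 0x94 are typographic quotation marks, while in ISO-8859-1 the same bytes decode to invisible C1 control characters. The sample bytes below are made up for illustration and are not taken from kath.net.

```python
# Made-up sample: typographic quotes encoded as Windows-1252 bytes.
raw = b'\x84Hallo\x93 und \x93Hello\x94'

# Decoding with the matching encoding yields real quotation marks.
as_cp1252 = raw.decode('cp1252')

# ISO-8859-1 maps the same bytes to C1 control characters instead.
as_latin1 = raw.decode('iso-8859-1')

# Treating the bytes as UTF-8 fails; errors='replace' substitutes U+FFFD.
as_utf8 = raw.decode('utf-8', errors='replace')

print(as_cp1252)
print(repr(as_latin1))
print(repr(as_utf8))
```

The third decode shows one common way the replacement character can arise: bytes in a single-byte encoding are interpreted as UTF-8 somewhere along the way. Whether that is what happens with this particular feed is not established in the thread.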
|
|
|
|
|
|
#24 |
|
Wizard
Posts: 1,088
Karma: 11562565
Join Date: Mar 2013
Location: Guben, Brandenburg, Germany
Device: Kobo Clara 2E, Tolino Shine 3
|
Nope, no success here. Same appearance as always.
|
|
|
|
|
|
#25 |
|
Wizard
Posts: 1,088
Karma: 11562565
Join Date: Mar 2013
Location: Guben, Brandenburg, Germany
Device: Kobo Clara 2E, Tolino Shine 3
|
Perhaps it is important to know that I receive the news via Joel Goguen's "KoboTouch-extended" plugin in the Kobo EPUB format (kepub), as with basically all my books.
|
|
|
|
|
|
#26 |
|
creator of calibre
Posts: 45,670
Karma: 28549304
Join Date: Oct 2006
Location: Mumbai, India
Device: Various
|
look at the downloaded epub file using the calibre viewer first.
|
|
|
|
|
|
#27 |
|
Wizard
Posts: 1,088
Karma: 11562565
Join Date: Mar 2013
Location: Guben, Brandenburg, Germany
Device: Kobo Clara 2E, Tolino Shine 3
|
Yes, that's what I do, which makes my last post superfluous: I already said above that the appearance in calibre's e-book viewer is the same as on my e-reader. I apologize!
|
|
|
|
|
|
#28 | |
|
Wizard
Posts: 1,088
Karma: 11562565
Join Date: Mar 2013
Location: Guben, Brandenburg, Germany
Device: Kobo Clara 2E, Tolino Shine 3
|
Quote:
Are you sure? There are various authors writing in various styles, some of them using the "classical" keyboard quotes, which are indeed preserved. I tried on three computers, and the result is always the same, even taking your proposal into account: the more "typographical" quotes are turned into the replacement character, whether single or double. I experimented a lot but found no solution, whereas when I download another newspaper with a similar structure (i.e. using both "classical" and "typographic" quotes), everything is all right, the encoding being ISO-8859-1. Its recipe is:
Code:
#!/usr/bin/env python2
# vim:fileencoding=utf-8
# License: GPLv3 Copyright: 2016, Kovid Goyal <kovid at kovidgoyal.net>
from __future__ import (unicode_literals, division, absolute_import,
                        print_function)
from calibre.web.feeds.recipes import BasicNewsRecipe
def classes(classes):
    q = frozenset(classes.split(' '))
    return dict(attrs={'class': lambda x: x and frozenset(x.split()).intersection(q)})
class BerlinerZeitung(BasicNewsRecipe):
    title = 'Berliner Zeitung'
    __author__ = 'Kovid Goyal'
    language = 'de'
    description = 'Berliner Zeitung RSS'
    timefmt = ' [%d.%m.%Y]'
    ignore_duplicate_articles = {'title', 'url'}
    remove_empty_feeds = True
    # oldest_article = 7.0
    no_stylesheets = True
    remove_javascript = True
    use_embedded_content = False
    publication_type = 'newspaper'
    keep_only_tags = [
        classes('dm_article_body dm_article_header'),
    ]
    remove_tags = [
        classes('dm_article_share'),
    ]
    feeds = [x.split() for x in [
        'Berlin http://www.berliner-zeitung.de/blueprint/servlet/xml/berliner-zeitung/23699382-asYahooFeed.xml',
        'Brandenburg http://www.berliner-zeitung.de/blueprint/servlet/xml/berliner-zeitung/23699570-asYahooFeed.xml',
        'Politik http://www.berliner-zeitung.de/blueprint/servlet/xml/berliner-zeitung/23699614-asYahooFeed.xml',
        'Wirtschaft http://www.berliner-zeitung.de/blueprint/servlet/xml/berliner-zeitung/23699644-asYahooFeed.xml',
        'Sport http://www.berliner-zeitung.de/blueprint/servlet/xml/berliner-zeitung/23699874-asYahooFeed.xml',
        'Kultur http://www.berliner-zeitung.de/blueprint/servlet/xml/berliner-zeitung/23700020-asYahooFeed.xml',
        'Panorama http://www.berliner-zeitung.de/blueprint/servlet/xml/berliner-zeitung/23700178-asYahooFeed.xml',
        'Wissen http://www.berliner-zeitung.de/blueprint/servlet/xml/berliner-zeitung/23700222-asYahooFeed.xml',
        'Digital http://www.berliner-zeitung.de/blueprint/servlet/xml/berliner-zeitung/23700594-asYahooFeed.xml',
        'Ratgeber http://www.berliner-zeitung.de/blueprint/servlet/xml/berliner-zeitung/23700190-asYahooFeed.xml',
    ]]
|
|
|
|
|
|
|
#29 |
|
Wizard
Posts: 1,088
Karma: 11562565
Join Date: Mar 2013
Location: Guben, Brandenburg, Germany
Device: Kobo Clara 2E, Tolino Shine 3
|
Eventually, I came to the conclusion that the problem cannot be resolved, as the characters in question are already replaced on the journal's RSS page. Strangely enough, empty squares appear there instead of the quotes, for example. Why this happens, and why there are no problems when the RSS feed is subscribed to on a computer, will remain a secret of the deepest depths of the internet, at least for my simple mind.
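One way to test the idea that the characters are already damaged at the source is to look at the raw bytes before any decoding takes place. The sketch below assumes the feed is delivered as UTF-8, which the thread does not confirm. In that case a replacement character baked into the source shows up as the byte sequence EF BF BD. The function name and the sample strings are hypothetical, and the samples stand in for bytes saved from a feed.

```python
# UTF-8 encoding of U+FFFD, the Unicode replacement character.
REPLACEMENT_UTF8 = '\ufffd'.encode('utf-8')


def count_baked_in_replacements(raw):
    """Count U+FFFD characters already present in raw UTF-8 bytes."""
    return raw.count(REPLACEMENT_UTF8)


# Made-up samples, not real feed content.
damaged_at_source = '<title>\ufffdGlaube\ufffd heute</title>'.encode('utf-8')
intact = '<title>\u201eGlaube\u201c heute</title>'.encode('utf-8')

damaged_count = count_baked_in_replacements(damaged_at_source)
intact_count = count_baked_in_replacements(intact)
print(damaged_count)
print(intact_count)
```

A non-zero count on the untouched download would mean no encoding setting in the recipe can recover the original quotes, because the information is already gone before calibre sees it.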
|
|
|
|