|
|
#31 |
|
Enthusiast
Posts: 34
Karma: 10
Join Date: Dec 2012
Device: Kindle 4 & Kindle PW 3G
|
Hi Divingduck,
thanks for your hints. The debug directory is what I already used for my last post. The tip about the print statements comes in handy; however, when I try to fill these with the regexps, e.g. like this: Code:
print '*** c-overline tag --->:', (re.compile(r'(<span class="c-overline">[^<]*)(</span>)', re.DOTALL|re.IGNORECASE), lambda match: match.group(1) + ': ' + match.group(2))
print '*** hcf-location-mark --->:', (re.compile(r'(<span class="hcf-location-mark">[^<]*)(</span>)', re.DOTALL|re.IGNORECASE), lambda match: match.group(1) + '. ' + match.group(2))
all I get is this output: Code:
*** c-overline tag --->: (<_sre.SRE_Pattern object at 0x7fdef18de540>, <function <lambda> at 0x7fdee09ddaa0>)
*** hcf-location-mark --->: (<_sre.SRE_Pattern object at 0x7fdee0dcb5e8>, <function <lambda> at 0x7fdee09ddaa0>)
I think I'm really stuck here, and this is quite frustrating. Thanks a lot in advance. Hegi. |
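The output above is simply how Python prints the tuple itself: a compiled pattern object and a function object. To see what a rule actually does, it has to be applied to some text. A minimal sketch, assuming a made-up `sample_html` string and a hypothetical helper `apply_rule` (neither is part of calibre), written for Python 3:

```python
import re

# One (pattern, replacement function) pair in the style used above.
rule = (
    re.compile(r'(<span class="c-overline">[^<]*)(</span>)', re.DOTALL | re.IGNORECASE),
    lambda match: match.group(1) + ': ' + match.group(2),
)


def apply_rule(rule, html):
    """Hypothetical helper: apply one (compiled pattern, function) pair to a string."""
    pattern, func = rule
    return pattern.sub(func, html)


# Made-up input for illustration; real input would come from the debug directory.
sample_html = '<span class="c-overline">Berlin</span><h2>Headline</h2>'

result = apply_rule(rule, sample_html)
print('*** c-overline tag --->:', result)
```

If the printed string is identical to the input, the pattern did not match, which is usually the more useful thing to learn from such a print.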
|
|
|
|
|
#32 |
|
Wizard
Posts: 1,150
Karma: 1404167
Join Date: Nov 2010
Location: Germany
Device: Sony PRS-650
|
You're welcome.
I had a bit of time to take a closer look at the problem and saw two things. The first is to keep in mind when a regex is applied. You are using preprocess_regexps, which works on the downloaded HTML as its input. You can therefore check debug\input\ to see what the downloaded HTML looks like to calibre at the moment the regex manipulates it. The second problem is that the class you are looking for includes spaces in its name, and that does not work (I think it never has). Taking that into account, I would do it slightly differently: I don't match the complete class string, only the end of the class name, which is enough for a unique identification: ... c-overline--article"> ... </span> ... Code:
(re.compile(r'(c-overline--article">[^>]*)(</span>)', re.DOTALL|re.IGNORECASE), lambda match: match.group(1) + ': ' + match.group(2)) |
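The difference between matching the full class string and matching only its tail can be checked outside calibre. A small sketch, assuming a made-up `sample_html` whose class attribute holds two names separated by a space (the real markup should be taken from debug\input\); `add_colon` is a hypothetical stand-in for the lambda:

```python
import re

# Original attempt: expects the class attribute to be exactly "c-overline".
old_pattern = re.compile(r'(<span class="c-overline">[^<]*)(</span>)',
                         re.DOTALL | re.IGNORECASE)

# Suggested variant: anchors only on the tail of the class attribute.
new_pattern = re.compile(r'(c-overline--article">[^>]*)(</span>)',
                         re.DOTALL | re.IGNORECASE)


def add_colon(match):
    """Same replacement as the lambda: append ': ' before the closing tag."""
    return match.group(1) + ': ' + match.group(2)


# Assumed markup for illustration only.
sample_html = ('<span class="c-overline c-overline--article">Konjunktur</span>'
               '<h2>Headline</h2>')

old_result = old_pattern.sub(add_colon, sample_html)
new_result = new_pattern.sub(add_colon, sample_html)

print(old_result == sample_html)  # True means the old pattern changed nothing
print(new_result)
```

With this sample the old pattern leaves the text unchanged, while the tail-anchored pattern inserts the colon before `</span>`.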
|
|
|
|
|
|
|
#33 |
|
Enthusiast
Posts: 34
Karma: 10
Join Date: Dec 2012
Device: Kindle 4 & Kindle PW 3G
|
Thanks Divingduck,
... as usual, the problem was hiding in plain sight, and once you know the solution, everything seems simple and easy. I took the liberty of merging my earlier fork of your recipe with your current version to come up with an improved version. Please feel free to review, edit or enhance it even further. My evolutionary changes over the last five years:
For Amazon Kindle [4|Paperwhite] these settings work nicely: Code:
# if you want to reduce size for a b/w or E-ink device, uncomment the following 4 lines:
compress_news_images = True
#compress_news_images_auto_size = 16
scale_news_images = (400,300)
compress_news_images_max_size = 35
Thanks again and looking forward to your comments. Hegi. |
|
|
|
|
|
#34 |
|
Wizard
Posts: 1,150
Karma: 1404167
Join Date: Nov 2010
Location: Germany
Device: Sony PRS-650
|
Thanks, you are welcome. It's fine for me.
DD PS: No need to ask for approval. I like your changes for the recipe. |
|
|
|
|
|
#35 |
|
Enthusiast
Posts: 34
Karma: 10
Join Date: Dec 2012
Device: Kindle 4 & Kindle PW 3G
|
Hi Divingduck,
I have noticed for some time that for some article formats pictures are no longer downloaded by the recipe, while for other articles it still works. I haven't had a chance yet to dig deeper, but I wonder whether you have already had a look at it? Cheers, Hegi. |
|
|
|
|
|
|
|
#36 |
|
Wizard
Posts: 1,150
Karma: 1404167
Join Date: Nov 2010
Location: Germany
Device: Sony PRS-650
|
Hi Hegi,
Hope you are well these days. Yes I did, but quite some time ago
|
|
|
|
|
|
#37 |
|
Wizard
Posts: 1,150
Karma: 1404167
Join Date: Nov 2010
Location: Germany
Device: Sony PRS-650
|
I made a quick update as I saw a small issue today.
@hegi, I forgot to mention: you need to integrate your additional code for your Kindle. I still use my old Sony device. Best regards, DD Last edited by Divingduck; 12-29-2020 at 06:01 AM. |
|
|
|
|
|
#38 | ||
|
Enthusiast
Posts: 34
Karma: 10
Join Date: Dec 2012
Device: Kindle 4 & Kindle PW 3G
|
Hi Divingduck,
thanks a lot for your quick reply and the new version of the recipe. According to your comments, this is the version from March 2018, when we last wrote about this. However, a quick diff of the versions shows some changes. I am currently running both versions (mine and this one) to compare the output. Could it be that you omitted updating the comments (date/version) in your last changes? According to my analysis, the most relevant difference between our versions is the following code in my recipe: Quote:
Quote:
Thanks a lot in advance ... Hegi. |
||
|
|
|
|
|
#39 |
|
Enthusiast
Posts: 34
Karma: 10
Join Date: Dec 2012
Device: Kindle 4 & Kindle PW 3G
|
Hi Divingduck,
... I just saw that you sent another post on Tuesday while I was preparing mine. Sorry for any confusion caused. As it appears, your change fixed the picture-gallery issue ... well, at least almost. Within the galleries there is some "extra content", mostly "internal ads". Have a look here: https://www.wiwo.de/politik/deutschl.../26760374.html. As a result, only the first 5 pics go into the ebook, and the additional text is only for 7 out of 17 in ... I suspect these are tweaks to nag readers into buying premium ... Due to the changes in the articles (the teaser text no longer seems to start with a location), this additional CSS code of mine (no longer included in your version) seems obsolete: Code:
.hcf-location-mark {font-style: italic; font-weight:bold}
.c-overline {font-size: 1em; text-align: left;font-style: normal; font-weight:bold}
I won't play with this any longer for now ... well, let's say for this year. Thanks and all the best to you, Hegi. |
|
|
|