#1 |
Junior Member
Posts: 3
Karma: 10
Join Date: Jan 2012
Device: SGT
|
remove_tags_after with more values
Hello,
I want to put feeds from more than one website into a single .epub. Basically it works quite well, but I run into problems with the details. "feeds" is obviously an array: PHP Code:
But "remove_tags_before" and "remove_tags_after" require a single value. Is it possible to rewrite the code so that I can use an array as well? I may have to override a function of "BasicNewsRecipe". But which one? Please, can you help me? I am new to Python but I have some experience in other programming languages, e.g. Java, VB, C#. So please feel free to get a bit more technical. Regards, Tom |
#2 | |
Junior Member
Posts: 9
Karma: 10
Join Date: Jan 2012
Device: SONY PRS-T1
|
Quote:
|
|
#3 | |||
Junior Member
Posts: 3
Karma: 10
Join Date: Jan 2012
Device: SGT
|
Yes, I can. Sorry for that, I thought it was obvious because I use more than one news source.
I need this for the "Neue Zürcher Zeitung" and for the "Rheinische Post", and certainly for some more. All those news sources should be collected in _one_ single EPUB file. Quote:
Quote:
Quote:
|
|||
#4 |
Junior Member
Posts: 9
Karma: 10
Join Date: Jan 2012
Device: SONY PRS-T1
|
I see what you mean now. I didn't read it carefully enough. For multiple feeds, what you ask does make sense.
Can't you work with a condition: IF this feed is selected, THEN remove_tags_before and remove_tags_after are THIS, and so on? |
#5 |
creator of calibre
Posts: 45,347
Karma: 27182818
Join Date: Oct 2006
Location: Mumbai, India
Device: Various
|
No, you cannot have per-feed values in remove_tags_*. You will need to implement the cleanup yourself, in preprocess_html.
|
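A minimal sketch of what such a cleanup in preprocess_html might look like, assuming the soup object behaves like BeautifulSoup 4 (`find`, `extract`, `previous_sibling`, `next_sibling`, `parent`). The selectors in `START_SPECS` and `END_SPECS`, the helpers `first_match` and `remove_beyond`, and the recipe class name are all hypothetical placeholders, not part of calibre. Instead of asking which feed an article came from, it simply tries one selector per news source and uses the first that matches.

```python
try:
    # Only importable inside calibre's recipe environment.
    from calibre.web.feeds.news import BasicNewsRecipe
except ImportError:
    # Stand-in so the helpers below can be tried outside calibre.
    BasicNewsRecipe = object

# Hypothetical selectors, one per news source; replace with the real markup.
START_SPECS = [dict(id='article-start'), dict(attrs={'class': 'story-header'})]
END_SPECS = [dict(id='article-end'), dict(attrs={'class': 'story-footer'})]


def first_match(soup, specs):
    """Return the first tag matched by any spec, trying specs in order."""
    for spec in specs:
        tag = soup.find(**spec)
        if tag is not None:
            return tag
    return None


def remove_beyond(tag, direction):
    """Extract every sibling of tag, and of its ancestors up to <body>,
    in the given direction ('previous_sibling' or 'next_sibling')."""
    while tag is not None and getattr(tag, 'name', None) != 'body':
        sibling = getattr(tag, direction)
        while sibling is not None:
            # Remember the next node before detaching the current one.
            following = getattr(sibling, direction)
            sibling.extract()
            sibling = following
        tag = tag.parent


class CombinedFeeds(BasicNewsRecipe):
    title = 'Combined feeds (example)'

    def preprocess_html(self, soup):
        # Drop everything before the first matching start marker.
        start = first_match(soup, START_SPECS)
        if start is not None:
            remove_beyond(start, 'previous_sibling')
        # Drop everything after the first matching end marker.
        end = first_match(soup, END_SPECS)
        if end is not None:
            remove_beyond(end, 'next_sibling')
        return soup
```

Because the selectors are tried in order, a page from one source is left untouched by the selectors meant for another, provided the markers do not collide between sites.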
#6 |
Junior Member
Posts: 3
Karma: 10
Join Date: Jan 2012
Device: SGT
|
Thanks kovidgoyal,
I found "preprocess_html" myself yesterday and started to get to grips with soup. If I find a solution, I'll post it here, but I think it'll take some time: learning Python plus a new library, with little spare time. I'll keep you up to date. |
#7 | |
Zealot
Posts: 103
Karma: 10
Join Date: Sep 2013
Device: Kindle Paperwhite (2012)
|
Quote:
remove_tags_after shows this example:
Code:
remove_tags_after = [dict(id='content')]
Code:
TypeError: find() argument after ** must be a mapping, not list
(BTW: I would love it if one could use lists in these cases as well. I am writing a recipe for one magazine and it uses some special formatting for certain articles, so for those articles I have to somehow re-implement remove_tags_before myself. Hopefully it won't be that hard :-)) |
|
#8 |
creator of calibre
Posts: 45,347
Karma: 27182818
Join Date: Oct 2006
Location: Mumbai, India
Device: Various
|
Lists containing dicts work perfectly well for remove_tags_after. That has nothing to do with them being per-feed.
remove_tags_after can be either a dict or a list containing dicts. |
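To illustrate the "dict or list of dicts" idea, and why the TypeError quoted earlier in the thread appears when a list reaches a `find(**spec)` call directly, here is a small self-contained sketch. `as_spec_list` and `fake_find` are hypothetical names used only for this illustration; they are not calibre functions, and calibre's own handling may differ.

```python
def as_spec_list(value):
    """Normalise a single dict or a list of dicts to a list of dicts."""
    if value is None:
        return []
    if isinstance(value, dict):
        return [value]
    return list(value)


def fake_find(**kwargs):
    """Stand-in for soup.find(); it just echoes its keyword arguments."""
    return kwargs


# Unpacking a list with ** is what raises the TypeError: ** needs a mapping.
try:
    fake_find(**[dict(id='content')])
    unpack_failed = False
except TypeError:
    unpack_failed = True

# Normalising first lets the same loop handle both forms.
matches = [fake_find(**spec) for spec in as_spec_list([dict(id='content')])]
single = [fake_find(**spec) for spec in as_spec_list(dict(id='content'))]
```

Code that accepts both forms only needs the normalisation step up front; everything after it can treat the value as a list.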
#9 |
Zealot
Posts: 103
Karma: 10
Join Date: Sep 2013
Device: Kindle Paperwhite (2012)
|
Sorry, my bad. I meant remove_tags_before. True, the example for that is not a list, but it still links to the remove_tags syntax, which says it takes a list of dicts. A sentence saying that remove_tags_before only accepts a single dict and not a list of them would be helpful (making it accept lists would be even better :-)).
|
#10 |
creator of calibre
Posts: 45,347
Karma: 27182818
Join Date: Oct 2006
Location: Mumbai, India
Device: Various
|
You can match multiple kinds of things with a single dict, but here you go:
https://github.com/kovidgoyal/calibr...dd72bb84cfd12e |
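As a rough illustration of matching several kinds of elements with a single dict: BeautifulSoup's `find` accepts a list of tag names and a compiled regular expression as an attribute value, so one spec can cover more than one layout. The tag names and class names below are hypothetical placeholders, not taken from any real site.

```python
import re

# Hypothetical selector: matches a <div> or <section> whose class is
# either "article-footer" or "related-links".
article_end = dict(
    name=['div', 'section'],
    attrs={'class': re.compile(r'^(article-footer|related-links)$')},
)

# In a recipe this could then be assigned as a single value, e.g.:
#     remove_tags_after = article_end
```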
#11 |
Zealot
Posts: 103
Karma: 10
Join Date: Sep 2013
Device: Kindle Paperwhite (2012)
|
Thanks!
|