Old 01-17-2012, 06:45 AM   #1
niederrhymer
remove_tags_after with more values

Hello,

I want to combine feeds from more than one website into a single .epub. Basically it works quite well, but I run into problems with the details.

"feeds" obviously is an array:

Code:
feeds = [
    (u'FAZ - Politik', u'http://www.faz.net/aktuell/politik/?rssview=1'),
    (u'RP - Duesseldorf', u'http://feeds.rp-online.de/rp-online/rss/duesseldorf-stadt?format=xml'),
    (u'International', u'http://www.nzz.ch/nachrichten/international?rss=true'),
]

So is "remove_tags".

But "remove_tags_before" and "remove_tags_after" require a single value.

Is it possible to rewrite the code so that I can use an array for them as well? I may have to override a method of "BasicNewsRecipe". But which one?

Please, can you help me?

I am new to Python, but I have some experience in other programming languages, e.g. Java, VB, C#. So please feel free to get a bit more technical.

regards
Tom
Old 01-17-2012, 07:08 AM   #2
roedi06
Quote:
But "remove_tags_before" and "remove_tags_after" require a single value.

Is it possible to rewrite the code so that I can use an array for them as well? I may have to override a method of "BasicNewsRecipe". But which one?
Can you explain to me what the point of multiple "remove_tags_before" and "remove_tags_after" values is? If you use more than one of those, you are basically implementing "remove_tags". "before" and "after" are a bonus that makes life easier if you just want to keep a certain portion of the page, without the hassle of removing every other tag. Within that portion, you still need to specify other remove_tags, of course.
Old 01-17-2012, 08:00 AM   #3
niederrhymer
Yes, I can. Sorry about that, I thought it was obvious because I use more than one news source.

I need this for the "Neue Zürcher Zeitung" and for the "Rheinische Post", and certainly for some more.

All those news sources should be collected in _one_ single EPUB file.

Quote:
Originally Posted by roedi06
Can you explain to me what the point of multiple "remove_tags_before" and "remove_tags_after" values is?
"remove_tags" is quite good if you only want to remove some pictures or picture captions. It's a mess to use it as a substitute for "remove_tags_after" and "remove_tags_before".

Quote:
If you use more than one of those, you are basically implementing "remove_tags".
That's exactly what I want, and I want it to depend on the current source, e.g. "remove_tags_before = dict(id='headline')" for the Rheinische Post and "remove_tags_before = dict(name='p', attrs={'class':'dachzeile'})" for the NZZ. And I don't want to have two recipes and two EPUB files.

Quote:
"before" and "after" are a bonus that makes life easier if you just want to keep a certain portion of the page, without the hassle of removing every other tag. Within that portion, you still need to specify other remove_tags, of course.
Any ideas?
Old 01-17-2012, 11:42 AM   #4
roedi06
I see what you mean now. I didn't read it carefully enough. For multiple feeds, what you ask does make sense.

Can't you work with a condition: IF this feed is selected, THEN remove_tags_before and remove_tags_after are THIS, and so on?
Old 01-17-2012, 11:48 AM   #5
kovidgoyal
creator of calibre
No, you cannot have per-feed values in remove_tags_*. You will need to implement the cleanup yourself, in preprocess_html.
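
A per-source cleanup along those lines might look like the sketch below. It does not need to know which feed an article came from: it tries each candidate spec in turn and trims everything before the first one that matches. The function names and the CANDIDATE_SPECS list are hypothetical, the two specs are the ones quoted earlier in the thread, and the code assumes that soup behaves like a BeautifulSoup 4 tree (find, extract, parent, previous_sibling). Older calibre builds shipped BeautifulSoup 3, where the sibling attribute is spelled previousSibling. This is not calibre's own implementation.

```python
# Specs quoted earlier in the thread: Rheinische Post first, then NZZ.
CANDIDATE_SPECS = [
    dict(id='headline'),
    dict(name='p', attrs={'class': 'dachzeile'}),
]


def remove_beyond(tag, direction):
    """Detach every sibling of `tag` in `direction`, then do the same for each
    ancestor, stopping at <body>. `direction` is 'previous_sibling' or
    'next_sibling'."""
    while tag is not None and getattr(tag, 'name', None) != 'body':
        sibling = getattr(tag, direction)
        while sibling is not None:
            # Read the following sibling first: extract() unlinks the node.
            following = getattr(sibling, direction)
            sibling.extract()
            sibling = following
        tag = tag.parent


def trim_before_first_match(soup, candidate_specs):
    """Apply a remove_tags_before-style trim using the first spec that
    matches. Returns the matching spec, or None if nothing matched."""
    for spec in candidate_specs:
        tag = soup.find(**spec)
        if tag is not None:
            remove_beyond(tag, 'previous_sibling')
            return spec
    return None
```

Inside a recipe this would be called from preprocess_html(self, soup), which has to return the soup again: call trim_before_first_match(soup, CANDIDATE_SPECS), then return soup. Because the first matching spec wins, the most distinctive spec should come first; a generic one such as id='headline' could also match pages from another source.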
Old 01-18-2012, 03:15 AM   #6
niederrhymer
Thanks kovidgoyal,

I came across "preprocess_html" myself yesterday and started to get to grips with soup. If I find a solution, I'll post it here, but I think it'll take some time: learning Python plus a new library, with little spare time.

I'll keep you up-to-date.
Old 08-20-2016, 09:44 PM   #7
sup
Quote:
Originally Posted by kovidgoyal
No, you cannot have per-feed values in remove_tags_*. You will need to implement the cleanup yourself, in preprocess_html.
If that is indeed the case, would you please update the documentation here?

remove_tags_after shows this example:

Code:
remove_tags_after = [dict(id='content')]
which is actually wrong because if used like that (with the "[]" making it into a list), I got this:

Code:
TypeError: find() argument after ** must be a mapping, not list
Also, the documentation should note that even if the basic syntax is the same as with remove_tags, it must not be a list.

(BTW: I would love it if one could use lists in these cases as well. I am writing a recipe for a magazine that uses special formatting for certain articles, so for those articles I have to somehow re-implement remove_tags_before myself. Hopefully it won't be that hard :-)).
Old 08-20-2016, 11:15 PM   #8
kovidgoyal
Lists containing dicts work perfectly well for remove_tags_after. That has nothing to do with them being per-feed.

remove_tags_after can be either a dict or a list containing dicts.
Old 08-21-2016, 07:28 AM   #9
sup
Sorry, my bad. I meant remove_tags_before. True, the example for that is not a list, but it still links to the remove_tags syntax, which says it takes a list of dicts. A sentence saying that remove_tags_before only accepts a single dict and not a list of them would be helpful (making it accept lists would be even better :-)).
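
The TypeError quoted earlier in the thread is ordinary Python behaviour rather than anything specific to calibre: unpacking with ** requires a mapping, so passing a list fails before any search starts. The following is a minimal sketch of how a dict-or-list setting could be normalized before use. Both as_spec_list and find are hypothetical stand-ins, not calibre's or BeautifulSoup's own code.

```python
def find(**criteria):
    """Hypothetical stand-in for soup.find(); it only echoes its keyword arguments."""
    return criteria


def as_spec_list(value):
    """Normalize a setting that may be None, a single dict, or a list of dicts."""
    if value is None:
        return []
    if isinstance(value, dict):
        return [value]
    return list(value)


setting = [dict(id='content')]

# Unpacking the list itself fails; this is the error reported in the thread.
try:
    find(**setting)
except TypeError as err:
    print('TypeError:', err)

# Unpacking each dict of the normalized list works for both forms.
for spec in as_spec_list(setting):
    print(find(**spec))
for spec in as_spec_list(dict(id='content')):
    print(find(**spec))
```

With a normalization step like this, the same loop handles a single dict and a list of dicts, so the setting no longer has to be documented as accepting only one of the two.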
Old 08-21-2016, 09:41 AM   #10
kovidgoyal
You can match multiple kinds of things with a single dict, but here you go:

https://github.com/kovidgoyal/calibr...dd72bb84cfd12e
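
As an illustration of matching several kinds of element with one dict: BeautifulSoup's find() accepts a list or a compiled regular expression wherever it accepts a string, so a single spec can cover more than one tag name or class. This is only a sketch. Apart from 'dachzeile', which comes from earlier in the thread, the tag and class names are made up, and the snippet builds the dicts without running them against a page.

```python
import re

# Matches whichever <h1> or <h2> comes first in the document.
by_tag_name = dict(name=['h1', 'h2'])

# Matches a <p> whose class contains 'dachzeile' or 'headline'
# ('headline' as a class name is an invented example value).
by_class_pattern = dict(name='p', attrs={'class': re.compile(r'dachzeile|headline')})
```

Either dict could then be assigned to remove_tags_before or remove_tags_after in a recipe.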
Old 08-21-2016, 09:51 AM   #11
sup
Thanks!