#1 |
Zealot
Posts: 122
Karma: 10
Join Date: Jul 2010
Device: nook
|
postprocess_html
I have a recipe that I am working on.
It has a few tags in the middle of the article text like this: <p> </p> and some like this: <p> </p>. There is no way to remove them with remove_tags. I thought of something like this: Spoiler:
Once I add this function, the recipe does not give me any articles. Am I using it right?
On another recipe I am working on, I want to take the description from the RSS feed and use it to replace a tag in the article itself. Can I get the description as one of the variables that postprocess_html gets? What is the name of the description variable in calibre? Something along the lines of Spoiler:
or something like that?
#2 | ||
Wizard
Posts: 4,004
Karma: 177841
Join Date: Dec 2009
Device: WinMo: IPAQ; Android: HTC HD2, Archos 7o; Java:Gravity T
|
Quote:
Code:
# needs "import re" at the top of the recipe
preprocess_regexps = [
    (re.compile(r'<p> </p>', re.DOTALL|re.IGNORECASE), lambda match: '')
]
Quote:
Code:
def parse_feeds(self):
    feeds = BasicNewsRecipe.parse_feeds(self)
    for feed in feeds:
        for article in feed.articles[:]:
            # Show the feed description stored on each article
            print('article.text_summary is: %s' % article.text_summary)
    return feeds
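To try the preprocess_regexps idea outside calibre, here is a minimal sketch. It assumes calibre applies each (pattern, function) pair to the raw HTML roughly as pattern.sub(function, html). The helper apply_regexps, the sample HTML, and the second, broader pattern (which also catches &nbsp; and tag attributes) are illustrative assumptions, not something taken from the posts.

```python
import re

# Same shape as the recipe attribute: a list of (compiled pattern, function) pairs.
preprocess_regexps = [
    # Literal empty paragraph, as in the post.
    (re.compile(r'<p> </p>', re.DOTALL | re.IGNORECASE), lambda match: ''),
    # Assumption: a broader pattern for paragraphs holding only whitespace or &nbsp;.
    (re.compile(r'<p[^>]*>(?:\s|&nbsp;)*</p>', re.IGNORECASE), lambda match: ''),
]


def apply_regexps(html, regexps):
    """Hypothetical stand-in for how the pairs are applied to raw HTML."""
    for pattern, func in regexps:
        html = pattern.sub(func, html)
    return html


sample = '<div><p>Text</p><p> </p><P>&nbsp;</P><p class="x">\n</p><p>More</p></div>'
cleaned = apply_regexps(sample, preprocess_regexps)
print(cleaned)  # <div><p>Text</p><p>More</p></div>
```

If the empty paragraphs in the real page contain something else (for example a <br/>), the pattern would need adjusting to match the actual source.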
#3 |
Zealot
The preprocess_regexps was a stroke of genius.
As for the pre/postprocess_html, I just can't get the article soup and the article.text_summary to meet in one function. Have any ideas?
#4 | |
Wizard
Quote:
self.article_title_list[self.n] inside postprocess_html to access it and self.n = self.n + 1 inside postprocess_html to increment the index. I didn't check whether the order of the created list correctly matches the order in which postprocess_html accesses the articles, so you may have to deal with that.
Last edited by Starson17; 11-01-2010 at 02:11 PM.
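The list-plus-index idea can be sketched as follows. This is a rough illustration under stated assumptions, not a tested recipe: the class name SummaryDemoRecipe, the list name article_summary_list (the post calls its list article_title_list), and the attribute current_summary are all hypothetical. It also assumes that advancing the index only when first_fetch is true, and setting simultaneous_downloads to 1, keeps the list order and the processing order aligned, which is exactly the part the post says was not checked. The fallback class exists only so the sketch runs outside calibre.

```python
try:
    from calibre.web.feeds.news import BasicNewsRecipe
except ImportError:
    # Stand-in so the sketch can run outside calibre; NOT the real class.
    class BasicNewsRecipe(object):
        def parse_feeds(self):
            return []


class SummaryDemoRecipe(BasicNewsRecipe):  # hypothetical recipe name
    title = 'Summary demo'
    # Assumption: fetching one article at a time keeps the download order
    # closer to the feed order, which the index-based lookup relies on.
    simultaneous_downloads = 1

    def parse_feeds(self):
        feeds = BasicNewsRecipe.parse_feeds(self)
        # Collect each article's feed description in feed order.
        self.article_summary_list = []
        self.n = 0
        for feed in feeds:
            for article in feed.articles[:]:
                self.article_summary_list.append(article.text_summary)
        return feeds

    def postprocess_html(self, soup, first_fetch):
        # Assumption: advance the index only on the first page of each article.
        if first_fetch and self.n < len(self.article_summary_list):
            # Summary for the article currently being processed.
            self.current_summary = self.article_summary_list[self.n]
            self.n = self.n + 1
            # ... insert self.current_summary into soup here ...
        return soup
```

If the orders turn out not to match, a lookup keyed on something stable, such as the article URL, would be a safer design than a running index.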
#5 |
Zealot
Got the idea. I think I can figure that out. Let me work on it a bit...
Any new thoughts on my maya recipe?
#6 |
Wizard
#9 |
Zealot
The site is "maya.tase.co.il".
I feel I should explain more exactly what it is. It is the local stock exchange's reports. On the first page that opens you get all the reports from all the companies for that day. You may want all the reports from yesterday. You might also want all the reports from one company going back to 1/1/2000. I thought this would be possible, but they are really not making my life easy.
#10 |
Zealot
How do I use extra_css to make the article itself go RTL? I am missing the XXXX. extra_css='XXXX{direction: rtl;}'
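A minimal sketch of one possible value for XXXX, assuming the generic body selector is enough to cover the whole article. The holder class name is hypothetical; in a real recipe the attribute would sit on the BasicNewsRecipe subclass.

```python
class RtlDemoSettings(object):  # hypothetical holder for the attribute
    # Assumption: 'body' targets the whole article page.
    extra_css = 'body { direction: rtl; }'
```

If the site wraps the article in its own container, that container's tag or class would be the more precise selector.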
#11 |
Wizard
#12 |
Zealot
I have done it before. It's my curse.
In any case, all I need to know is how to change the CSS on the article body itself. In other words, the article body name...
#13 |
Wizard
#14 |
Zealot
I'll give it a try.
You haven't shared your idea about maya yet.