Old 06-07-2012, 09:31 PM   #1
cnfmsu
Member
cnfmsu began at the beginning.
 
Posts: 18
Karma: 10
Join Date: Jan 2012
Device: Nook
Recipe for seekingalpha.com

Hi Kovid,

I am building a feed from seekingalpha.com. One thing I would like is to include the readers' comments. I have tried various options and cannot get this to work.

Here is my situation:

1> The recipe is included below.
2> I ran it in debug mode and checked the "input" folder: the comments are not in the HTML file at all.
3> I tried commenting out all of the keep_only_tags.append lines below, which leaves keep_only_tags as an empty list. The comments still do not appear in the HTML file.
4> When I open the web page and use "View Page Source", the comments are not part of the source.
5> However, when I examine the page with Firebug, the comments are there.
6> When I save the web page, the comments are there too.
7> You can check this page:
http://seekingalpha.com/article/6431...nk?source=feed

What did I do wrong? Am I missing something?
I need some direction on this, please!

Regards,



from calibre.web.feeds.news import BasicNewsRecipe


class AdvancedUserRecipe1335053294(BasicNewsRecipe):
    title = u'SA RSS'
    no_stylesheets = True
    use_embedded_content = False
    remove_javascript = True
    auto_cleanup = False
    keep_only_tags = []
    remove_tags = []

    # heading
    keep_only_tags.append(dict(name='div', attrs={'id':'page_header'}))
    # author profile
    keep_only_tags.append(dict(name='div', attrs={'class':'the_pic'}))
    keep_only_tags.append(dict(name='div', attrs={'class':'followup_contributor_info_text'}))
    keep_only_tags.append(dict(name='div', attrs={'class':'author_info_nav'}))
    keep_only_tags.append(dict(name='div', attrs={'class':'user_followers_following'}))
    # article body
    keep_only_tags.append(dict(name='div', attrs={'id':'article_body'}))
    # comments (none of these match, because the comments are not in the downloaded HTML)
    # keep_only_tags.append(dict(name='div', attrs={'id':'content_follow_up'}))
    # keep_only_tags.append(dict(name='div', attrs={'class':'comments_with_more'}))
    # keep_only_tags.append(dict(name='div', attrs={'id':'comments_section'}))
    # keep_only_tags.append(dict(name='div', attrs={'id':'comments'}))
    # keep_only_tags.append(dict(name='ul', attrs={'id':'talkback_list'}))
    # keep_only_tags.append(dict(name='div', attrs={'id':'comment_container'}))
    # keep_only_tags.append(dict(name='div', attrs={'class':'base_level'}))
    # keep_only_tags.append(dict(name='div', attrs={'class':'com_cont'}))

    feeds = [(u'Most Popular Articles', u'http://seekingalpha.com/listing/most-popular-articles.xml')]
Old 06-08-2012, 12:43 AM   #2
kovidgoyal
creator of calibre
Posts: 45,231
Karma: 27110894
Join Date: Oct 2006
Location: Mumbai, India
Device: Various
That is most likely because the comments are loaded via JavaScript. Turn off JavaScript in Firefox and you likely won't see any comments. calibre's news download system doesn't support JavaScript.
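A quick way to confirm this without a browser is to look for the comment container in the raw HTML, since that is all the download system gets to see. The sketch below is only an illustration for Python 3: it assumes the page has already been fetched into a string, the helper name has_element_id and both sample strings are made up, and comments_section is simply one of the ids tried in the recipe in post #1.

```python
from html.parser import HTMLParser


class _IdFinder(HTMLParser):
    """Records whether any start tag carries the wanted id attribute."""

    def __init__(self, wanted_id):
        super().__init__()
        self.wanted_id = wanted_id
        self.found = False

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs for this tag
        if dict(attrs).get('id') == self.wanted_id:
            self.found = True


def has_element_id(raw_html, element_id):
    """Return True if raw_html contains an element with the given id."""
    finder = _IdFinder(element_id)
    finder.feed(raw_html)
    return finder.found


# Hypothetical stand-in for the page as served, before any script has run.
served = '<div id="article_body"><p>Text</p></div><script src="c.js"></script>'
# Hypothetical stand-in for the page after a browser has run the script.
rendered = served + '<div id="comments_section"><ul id="talkback_list"></ul></div>'

print(has_element_id(served, 'comments_section'))
print(has_element_id(rendered, 'comments_section'))
```

If the check fails on the served HTML but succeeds on a copy saved from the browser, no combination of keep_only_tags can bring the comments back, because the elements were never downloaded.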
Old 06-08-2012, 01:29 PM   #3
cnfmsu
Kovid,
I checked, and the comments are indeed not part of the web page once I disable JavaScript.
Do you have any plans to support JavaScript in a future release?
Old 06-08-2012, 10:44 PM   #4
kovidgoyal
Not in the near future. Adding JavaScript support to web scrapers is not a trivial task.
Old 06-14-2012, 05:24 PM   #5
cnfmsu
Hi Kovid

I have a question about this scenario:
1> I am building the article list without RSS:
articles.append(dict(title=title, url=url, description=desc, date=date))
2> I would like to control the order in which the articles are listed within a section, so I add another element to the dict, e.g.:
articles.append(dict(sortseq=sortseq, title=title, url=url, description=desc, date=date))
3> Right before the "feeds.append", I call articles.sort()

But the list is not sorted by "sortseq".
I have also tried these forms:
s = sorted(s, key = lambda x: (x[1], x[2]))
s = sorted(s, key = operator.itemgetter(1, 2))
s.sort(key = operator.itemgetter(1, 2))

I still have the issue. My questions are:
- Is it OK to add "sortseq" to the dict?
- If yes, am I calling sort() correctly? Do you have an example for me?

Regards,
Old 06-14-2012, 10:05 PM   #6
kovidgoyal
key=lambda x:x['sortseq']
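In context, that key goes into the sort call on the list of article dicts. The positional forms tried in the question (x[1], itemgetter(1, 2)) look up the keys 1 and 2, which these dicts do not have, so they cannot work. A minimal sketch with made-up article data; sortseq is the poster's own extra key, and the titles and URLs are placeholders:

```python
from operator import itemgetter

# Made-up articles in the dict shape used in the question above.
articles = [
    dict(sortseq=3, title='Third', url='http://example.com/3', description='', date=''),
    dict(sortseq=1, title='First', url='http://example.com/1', description='', date=''),
    dict(sortseq=2, title='Second', url='http://example.com/2', description='', date=''),
]

# Sort in place by the value stored under the 'sortseq' key.
articles.sort(key=lambda x: x['sortseq'])

# Equivalent spelling: itemgetter takes the dict key, not a position.
by_seq = sorted(articles, key=itemgetter('sortseq'))

print([a['title'] for a in articles])  # ['First', 'Second', 'Third']
```

Either spelling is fine; the lambda form is the one given in the reply, and itemgetter('sortseq') is just a shorter way to write the same lookup.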
Old 06-14-2012, 10:40 PM   #7
cnfmsu
That works!
Thank you very much.