06-19-2013, 11:20 AM   #1
JeffreyZhao
"keep_only_tags" doesn't work?

I'm using the following test recipe to crawl infoq.com:

Code:
from calibre.web.feeds.news import BasicNewsRecipe

class InfoQ_Test(BasicNewsRecipe):
    title = u'InfoQ Test'
    auto_cleanup = True
    no_stylesheets = True

    # Intended to keep only the element with id="content" from each page
    keep_only_tags = [dict(id=['content'])]

    def parse_index(self):
        # Single feed named "Default" with two hard-coded test articles
        items = []

        items.append({ 'title': 'Article1', 'url': 'http://www.infoq.com/news/2013/06/stratos-2' })
        items.append({ 'title': 'Article2', 'url': 'http://www.infoq.com/news/2013/06/document-messaging-analysis' })

        return [("Default", items)]
I want to keep only the "div" with id="content" from the whole page, but calibre removes every element under "body" instead. Removing the "keep_only_tags" setting does get the article content successfully, but I'd like to know why it doesn't work with "keep_only_tags" in place.

Thanks
06-19-2013, 11:05 PM   #2
kovidgoyal
creator of calibre
Remove this line from your recipe:

auto_cleanup = True
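
For reference, here is a minimal sketch of the recipe from post #1 with that line removed. A likely explanation, which the thread itself does not state, is that auto_cleanup runs calibre's readability-based article extraction and rewrites the page, so the id="content" element is no longer there for keep_only_tags to match. The try/except fallback around the import is an illustrative addition so the file can also be loaded outside calibre; it is not needed when the recipe runs inside calibre.

```python
try:
    from calibre.web.feeds.news import BasicNewsRecipe
except ImportError:
    # Assumption: fallback so this file can be imported outside calibre.
    BasicNewsRecipe = object


class InfoQ_Test(BasicNewsRecipe):
    title = u'InfoQ Test'
    no_stylesheets = True
    # auto_cleanup is deliberately not set, so the manual cleanup below applies.

    # Keep only the element with id="content" from each article page.
    keep_only_tags = [dict(id=['content'])]

    def parse_index(self):
        # Return a single feed named "Default" with two hard-coded articles.
        items = [
            {'title': 'Article1',
             'url': 'http://www.infoq.com/news/2013/06/stratos-2'},
            {'title': 'Article2',
             'url': 'http://www.infoq.com/news/2013/06/document-messaging-analysis'},
        ]
        return [("Default", items)]
```

Saved under a name such as infoq_test.recipe (a hypothetical filename), the recipe can be tried from the command line with: ebook-convert infoq_test.recipe output.epub --test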

All times are GMT -4.