Old 10-08-2021, 11:25 PM   #1
njpig
keep_only_tags and the order of the related contents

Hi,

The order of the entries in keep_only_tags controls the order in which the matching content is displayed on the HTML page, right?

E.g., these two keep_only_tags lists will display the page content in a different order (note where the Hed entry, marked with an arrow, sits in each list).

import re

keep_only_tags = [
    dict(attrs={'class': re.compile('^SplitScreenContentHeaderHed')}),  # <--- first
    dict(attrs={'class': re.compile('^SplitScreenContentHeaderDek')}),
    dict(attrs={'class': re.compile('^SplitScreenContentHeaderByline')}),
    dict(attrs={'class': re.compile('^SplitScreenContentHeaderPublishDate')}),
    dict(attrs={'class': re.compile('^SplitScreenContentHeaderLedeBlock')}),
    dict(attrs={'class': re.compile('^SplitScreenContentHeaderCaption')}),
]

keep_only_tags = [
    dict(attrs={'class': re.compile('^SplitScreenContentHeaderDek')}),
    dict(attrs={'class': re.compile('^SplitScreenContentHeaderByline')}),
    dict(attrs={'class': re.compile('^SplitScreenContentHeaderPublishDate')}),
    dict(attrs={'class': re.compile('^SplitScreenContentHeaderLedeBlock')}),
    dict(attrs={'class': re.compile('^SplitScreenContentHeaderCaption')}),
    dict(attrs={'class': re.compile('^SplitScreenContentHeaderHed')}),  # <--- last
]
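
To make the question concrete, here is a toy model of the behaviour being asked about. It assumes each entry is applied in turn and its matches are appended to the output in that order. The names page, keep_only, hed_first and hed_last are made up for illustration, the class suffixes are invented, and this is not calibre's actual code.

```python
import re

# Toy stand-in for a parsed page: (class attribute, text) pairs in document order.
# The "-abc" style suffixes are invented for this example.
page = [
    ("SplitScreenContentHeaderHed-abc", "Headline"),
    ("SplitScreenContentHeaderDek-def", "Summary"),
    ("SplitScreenContentHeaderByline-ghi", "Author"),
]


def keep_only(page, patterns):
    """Apply each pattern in turn and append its matches to the output."""
    kept = []
    for pattern in patterns:
        for css_class, text in page:
            if pattern.search(css_class):
                kept.append(text)
    return kept


hed_first = [
    re.compile('^SplitScreenContentHeaderHed'),
    re.compile('^SplitScreenContentHeaderDek'),
    re.compile('^SplitScreenContentHeaderByline'),
]
# Same patterns, with the Hed pattern moved to the end.
hed_last = hed_first[1:] + hed_first[:1]

print(keep_only(page, hed_first))
print(keep_only(page, hed_last))
```

Under that assumption, the headline comes out first with the first list and last with the second, even though the page itself is unchanged.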
Old 10-09-2021, 12:12 AM   #2
kovidgoyal
creator of calibre
Yes, it does. If you don't care about the order, combine them all into one entry with this function:

Code:
def prefixed_classes(classes):
    # Space-separated class-name prefixes to match against
    q = frozenset(classes.split(' '))

    def matcher(x):
        # x is the tag's class attribute string (may be None or empty)
        if x:
            for candidate in frozenset(x.split()):
                for prefix in q:
                    if candidate.startswith(prefix):
                        return True
        return False
    return {'attrs': {'class': matcher}}

keep_only_tags = [prefixed_classes('prefix1 prefix2 prefix3')]
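
As a quick sanity check, the matcher can be called directly on sample class strings, without calibre installed. This is only a sketch: the function is repeated so the snippet runs on its own, and the sample class values ('SplitScreenContentHeaderHed-x1y2', 'sidebar promo') are invented.

```python
def prefixed_classes(classes):
    # Space-separated class-name prefixes to match against
    prefixes = frozenset(classes.split(' '))

    def matcher(class_attr):
        # class_attr is the tag's class attribute string (may be None or empty)
        if class_attr:
            for candidate in frozenset(class_attr.split()):
                for prefix in prefixes:
                    if candidate.startswith(prefix):
                        return True
        return False
    return {'attrs': {'class': matcher}}


spec = prefixed_classes('SplitScreenContentHeaderHed SplitScreenContentHeaderDek')
matcher = spec['attrs']['class']

print(matcher('SplitScreenContentHeaderHed-x1y2 other'))  # True: one class starts with a prefix
print(matcher('sidebar promo'))                           # False: no class starts with a prefix
print(matcher(None))                                      # False: tag has no class attribute
```

Because all prefixes live in a single entry, matching tags are selected in one pass rather than one pass per prefix, which is why this form is only suitable when the output order does not matter.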
Old 10-09-2021, 05:51 AM   #3
njpig
Thanks a lot!