#1
Zealot
Posts: 106
Karma: 10
Join Date: Sep 2020
Device: kindle paperwhite3/Oasis2
keep_only_tags and the order of the related contents
Hi,
The order of the classes in keep_only_tags controls the order in which the related content is displayed on the HTML page, right? E.g., these two keep_only_tags lists will display the page content in a different order:

keep_only_tags = [
    dict(attrs={'class': re.compile('^SplitScreenContentHeaderHed')}),  # <---
    dict(attrs={'class': re.compile('^SplitScreenContentHeaderDek')}),
    dict(attrs={'class': re.compile('^SplitScreenContentHeaderByline')}),
    dict(attrs={'class': re.compile('^SplitScreenContentHeaderPublishDate')}),
    dict(attrs={'class': re.compile('^SplitScreenContentHeaderLedeBlock')}),
    dict(attrs={'class': re.compile('^SplitScreenContentHeaderCaption')}),
]

keep_only_tags = [
    dict(attrs={'class': re.compile('^SplitScreenContentHeaderDek')}),
    dict(attrs={'class': re.compile('^SplitScreenContentHeaderByline')}),
    dict(attrs={'class': re.compile('^SplitScreenContentHeaderPublishDate')}),
    dict(attrs={'class': re.compile('^SplitScreenContentHeaderLedeBlock')}),
    dict(attrs={'class': re.compile('^SplitScreenContentHeaderCaption')}),
    dict(attrs={'class': re.compile('^SplitScreenContentHeaderHed')}),  # <---
]
#2
creator of calibre
Posts: 45,345
Karma: 27182818
Join Date: Oct 2006
Location: Mumbai, India
Device: Various
Yes, it does. If you don't care about the order, combine them all into one with this function:
Code:
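def prefixed_classes(classes):
    # Space-separated class-name prefixes to match against.
    q = frozenset(classes.split(' '))

    def matcher(x):
        # x is the tag's class attribute value (may be None or empty).
        if x:
            for candidate in frozenset(x.split()):
                for prefix in q:
                    if candidate.startswith(prefix):
                        return True
        return False
    return {'attrs': {'class': matcher}}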
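To see the difference between several ordered entries and one combined entry, here is a rough sketch. The `keep_only` helper and the `page` data are made up for illustration: they are a simplified stand-in for how a list of specs is applied one after another, not calibre's actual code, and the class suffixes are invented.

```python
def prefixed_classes(classes):
    # Same matcher as in the post above: true if any class starts with a prefix.
    q = frozenset(classes.split(' '))

    def matcher(x):
        if x:
            for candidate in frozenset(x.split()):
                for prefix in q:
                    if candidate.startswith(prefix):
                        return True
        return False
    return {'attrs': {'class': matcher}}


def keep_only(elements, specs):
    # Hypothetical toy model: apply each spec in turn and collect the
    # matching elements in document order, skipping ones already kept.
    kept = []
    for spec in specs:
        match = spec['attrs']['class']
        for el in elements:
            if match(el['class']) and el not in kept:
                kept.append(el)
    return [el['text'] for el in kept]


# Invented sample page, listed in document order.
page = [
    {'class': 'SplitScreenContentHeaderHed-abc', 'text': 'headline'},
    {'class': 'SplitScreenContentHeaderDek-def', 'text': 'dek'},
    {'class': 'SplitScreenContentHeaderByline-ghi extra', 'text': 'byline'},
    {'class': 'Sidebar-xyz', 'text': 'sidebar'},
]

# Separate entries: output follows the order of the list.
separate = [
    prefixed_classes('SplitScreenContentHeaderDek'),
    prefixed_classes('SplitScreenContentHeaderByline'),
    prefixed_classes('SplitScreenContentHeaderHed'),
]

# One combined entry: output follows the order on the page.
combined = [prefixed_classes(
    'SplitScreenContentHeaderHed SplitScreenContentHeaderDek '
    'SplitScreenContentHeaderByline'
)]

separate_order = keep_only(page, separate)
combined_order = keep_only(page, combined)
print(separate_order)  # ['dek', 'byline', 'headline']
print(combined_order)  # ['headline', 'dek', 'byline']
```

In a recipe, the combined form would presumably be a single entry, e.g. keep_only_tags = [prefixed_classes('SplitScreenContentHeaderHed SplitScreenContentHeaderDek ...')], with the remaining prefixes from the first post filled in.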
#3
Zealot
Posts: 106
Karma: 10
Join Date: Sep 2020
Device: kindle paperwhite3/Oasis2