|
|
#1 |
|
Zealot
Posts: 108
Karma: 10
Join Date: Sep 2020
Device: kindle paperwhite3/Oasis2
|
keep_only_tags and the order of the related contents
Hi,
Does the order of the entries in keep_only_tags control the order in which the matching content is displayed on the HTML page? For example, will these two keep_only_tags lists display the same content in a different order? The only difference is the position of the SplitScreenContentHeaderHed entry (marked with <---).
Code:
keep_only_tags = [
    dict(attrs={'class': re.compile('^SplitScreenContentHeaderHed')}),  # <---
    dict(attrs={'class': re.compile('^SplitScreenContentHeaderDek')}),
    dict(attrs={'class': re.compile('^SplitScreenContentHeaderByline')}),
    dict(attrs={'class': re.compile('^SplitScreenContentHeaderPublishDate')}),
    dict(attrs={'class': re.compile('^SplitScreenContentHeaderLedeBlock')}),
    dict(attrs={'class': re.compile('^SplitScreenContentHeaderCaption')}),
]
Code:
keep_only_tags = [
    dict(attrs={'class': re.compile('^SplitScreenContentHeaderDek')}),
    dict(attrs={'class': re.compile('^SplitScreenContentHeaderByline')}),
    dict(attrs={'class': re.compile('^SplitScreenContentHeaderPublishDate')}),
    dict(attrs={'class': re.compile('^SplitScreenContentHeaderLedeBlock')}),
    dict(attrs={'class': re.compile('^SplitScreenContentHeaderCaption')}),
    dict(attrs={'class': re.compile('^SplitScreenContentHeaderHed')}),  # <---
]
|
|
|
|
|
|
#2 |
|
creator of calibre
Posts: 45,617
Karma: 28549044
Join Date: Oct 2006
Location: Mumbai, India
Device: Various
|
Yes, they do. If you don't care about the order, combine them all into a single entry with this function:
Code:
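def prefixed_classes(classes):
    # classes: space-separated class-name prefixes to match
    q = frozenset(classes.split(' '))

    def matcher(x):
        # x is the tag's class attribute value (None if the tag has no class)
        if x:
            for candidate in frozenset(x.split()):
                for prefix in q:
                    if candidate.startswith(prefix):
                        return True
        return False
    return {'attrs': {'class': matcher}}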
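To make the difference concrete, here is a minimal sketch that compares an ordered list of specs with a single combined matcher. It uses a toy model of the filtering step, not calibre's actual implementation: `keep_only`, `class_matches` and the `page` data are hypothetical names invented for illustration, and the class suffixes are made up.

```python
import re


def prefixed_classes(classes):
    # Same helper as above: match any class token that starts with a prefix.
    prefixes = frozenset(classes.split(' '))

    def matcher(x):
        if x:
            for candidate in frozenset(x.split()):
                for prefix in prefixes:
                    if candidate.startswith(prefix):
                        return True
        return False
    return {'attrs': {'class': matcher}}


def class_matches(spec, class_attr):
    # Hypothetical stand-in for the parser's attribute matching:
    # the spec value is either a callable or a compiled regex.
    test = spec['attrs']['class']
    if callable(test):
        return test(class_attr)
    return test.search(class_attr) is not None


def keep_only(specs, elements):
    # Toy model (assumption): walk the specs in order and, for each spec,
    # collect the matching elements in document order.
    kept = []
    for spec in specs:
        for class_attr, text in elements:
            if class_matches(spec, class_attr) and text not in kept:
                kept.append(text)
    return kept


# Hypothetical page: (class attribute, content) pairs in document order.
page = [
    ('SplitScreenContentHeaderHed-abc', 'headline'),
    ('SplitScreenContentHeaderDek-def', 'subtitle'),
    ('SplitScreenContentHeaderByline-ghi', 'byline'),
]

# Separate entries: output follows the order of the entries.
dek_first = [
    dict(attrs={'class': re.compile('^SplitScreenContentHeaderDek')}),
    dict(attrs={'class': re.compile('^SplitScreenContentHeaderHed')}),
]

# One combined entry: output follows the order of the page itself.
combined = [prefixed_classes(
    'SplitScreenContentHeaderHed SplitScreenContentHeaderDek')]

print(keep_only(dek_first, page))  # ['subtitle', 'headline']
print(keep_only(combined, page))   # ['headline', 'subtitle']
```

In a recipe, the combined form would be a single entry such as keep_only_tags = [prefixed_classes('SplitScreenContentHeaderHed SplitScreenContentHeaderDek')], listing as many prefixes as needed, separated by single spaces.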
|
|
|
|
|
|
|
|
#3 | |
|
Zealot
Posts: 108
Karma: 10
Join Date: Sep 2020
Device: kindle paperwhite3/Oasis2
|
Quote:
|
|
|
|
|