10-08-2021, 11:25 PM   #1
njpig
keep_only_tags and the order of the related contents

Hi,

The order of the entries in keep_only_tags controls the order in which the matching content appears on the generated HTML page, right?

E.g., these two keep_only_tags lists would display the page contents in a different order:

import re  # at the top of the recipe file; needed for re.compile

# Variant 1: the Hed (headline) entry comes first
keep_only_tags = [
    dict(attrs={'class': re.compile('^SplitScreenContentHeaderHed')}),  # <---
    dict(attrs={'class': re.compile('^SplitScreenContentHeaderDek')}),
    dict(attrs={'class': re.compile('^SplitScreenContentHeaderByline')}),
    dict(attrs={'class': re.compile('^SplitScreenContentHeaderPublishDate')}),
    dict(attrs={'class': re.compile('^SplitScreenContentHeaderLedeBlock')}),
    dict(attrs={'class': re.compile('^SplitScreenContentHeaderCaption')}),
]

# Variant 2: the Hed (headline) entry comes last
keep_only_tags = [
    dict(attrs={'class': re.compile('^SplitScreenContentHeaderDek')}),
    dict(attrs={'class': re.compile('^SplitScreenContentHeaderByline')}),
    dict(attrs={'class': re.compile('^SplitScreenContentHeaderPublishDate')}),
    dict(attrs={'class': re.compile('^SplitScreenContentHeaderLedeBlock')}),
    dict(attrs={'class': re.compile('^SplitScreenContentHeaderCaption')}),
    dict(attrs={'class': re.compile('^SplitScreenContentHeaderHed')}),  # <---
]
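
To make the question concrete, here is a toy model of the behaviour I am assuming. It is not calibre's actual code: it just processes each entry in list order and appends whatever matches to the output. The `page` list, the class suffixes like `-abc`, and the `keep_only` helper are all made up for illustration.

```python
import re

# Hypothetical page: (class attribute, text) pairs in the order they appear in the HTML.
page = [
    ("SplitScreenContentHeaderHed-abc", "Headline"),
    ("SplitScreenContentHeaderDek-def", "Subtitle"),
    ("SplitScreenContentHeaderByline-ghi", "Author"),
]


def keep_only(elements, specs):
    """Toy stand-in for keep_only_tags (assumed behaviour, not calibre's code).

    Specs are handled in list order; for each spec, every matching
    element is appended to the result in page order.
    """
    kept = []
    for spec in specs:
        pattern = spec["attrs"]["class"]
        for cls, text in elements:
            if pattern.search(cls):
                kept.append(text)
    return kept


hed_first = [
    dict(attrs={"class": re.compile("^SplitScreenContentHeaderHed")}),
    dict(attrs={"class": re.compile("^SplitScreenContentHeaderDek")}),
]
hed_last = list(reversed(hed_first))

print(keep_only(page, hed_first))  # ['Headline', 'Subtitle']
print(keep_only(page, hed_last))   # ['Subtitle', 'Headline']
```

If calibre really works this way, the two lists above would give the same elements but with the headline at the top in one case and at the bottom in the other.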