#1
Guru
Posts: 645
Karma: 85520
Join Date: May 2021
Device: kindle
Foreign Affairs cover fails
Quote:
Removed line 156 and changed line 157 to replace the tags (or just use replace('small', 'large')). The cover URL is https://cdn-live.foreignaffairs.com/...over_large.jpg.webp?itok=LUFlkUCK and if we remove .webp the link stops working.
Last edited by unkn0wn; 05-03-2022 at 12:33 PM.
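A minimal sketch of the replace('small', 'large') idea, outside of calibre. The URL is a made-up placeholder (the real one is truncated above) and to_large_cover is a hypothetical helper name, not something in the recipe. It only touches the file name, so the .webp suffix and the itok query survive.

```python
def to_large_cover(url: str) -> str:
    """Swap 'small' for 'large' in the last path segment only."""
    # rpartition keeps everything before the final '/' untouched
    head, sep, filename = url.rpartition('/')
    return head + sep + filename.replace('small', 'large')


# Placeholder URL for illustration only, not the real Foreign Affairs link
example = 'https://example.com/covers/issue_cover_small.jpg.webp?itok=abc123'
print(to_large_cover(example))
```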

#2
MIT Technology Review: cover image fails to load
Code:
    # Match the hero <div> and the cover <img> by class-name prefix
    self.cover_url = soup.find(
        "div", attrs={"class": lambda name: name.startswith("magazineHero__image") if name else False}
    ).find(
        "img", src=True, attrs={"class": lambda x: x.startswith("image__img") if x else False}
    )["src"]
Also add remove_attributes = ['height', 'width'].
Last edited by unkn0wn; 05-03-2022 at 02:31 PM.
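To try the prefix-matching idea without calibre, here is a rough sketch using the standard library HTMLParser instead of BeautifulSoup. CoverFinder and the sample HTML are made up for illustration. Only the two class prefixes come from the snippet above.

```python
from html.parser import HTMLParser


class CoverFinder(HTMLParser):
    """Find the first <img> whose class starts with 'image__img',
    nested inside a <div> whose class starts with 'magazineHero__image'."""

    def __init__(self):
        super().__init__()
        self.depth = 0          # <div> nesting depth inside the hero div (0 = outside)
        self.cover_url = None

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        classes = (attrs.get('class') or '').split()
        if tag == 'div':
            # Enter the hero div, or track nested divs once inside it
            if self.depth or any(c.startswith('magazineHero__image') for c in classes):
                self.depth += 1
        elif tag == 'img' and self.depth and self.cover_url is None:
            if attrs.get('src') and any(c.startswith('image__img') for c in classes):
                self.cover_url = attrs['src']

    def handle_endtag(self, tag):
        if tag == 'div' and self.depth:
            self.depth -= 1


# Made-up markup for illustration only
sample = (
    '<div class="other"><img class="image__img--x" src="wrong.jpg"></div>'
    '<div class="magazineHero__image--abc"><div>'
    '<img class="image__img--def lazy" src="cover.jpg"></div></div>'
)
finder = CoverFinder()
finder.feed(sample)
print(finder.cover_url)
```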

#3
https://github.com/kovidgoyal/calibr...agazine.recipe
Cover fails.
Code:
    def get_cover_url(self):
        cover_url = None
        soup = self.index_to_soup('https://www.india-seminar.com/')
        # The cover is the first <img> whose src contains 'covers'
        citem = soup.find('img', src=lambda x: x and 'covers' in x)
        if citem:
            cover_url = 'https://www.india-seminar.com/' + citem['src']
        return cover_url

    remove_attributes = ['style', 'height', 'width']
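The string concatenation in get_cover_url assumes the src is a relative path with no leading slash. A possible alternative, sketched here with the standard library urljoin, also copes with a leading slash or an already absolute src. absolute_cover_url and the example paths are hypothetical.

```python
from urllib.parse import urljoin

BASE = 'https://www.india-seminar.com/'


def absolute_cover_url(src: str) -> str:
    """Resolve an <img> src against the site root."""
    return urljoin(BASE, src)


# Example paths are placeholders, not real cover files
print(absolute_cover_url('covers/example.jpg'))
print(absolute_cover_url('/covers/example.jpg'))
```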

#4
India Seminar
https://github.com/kovidgoyal/calibr...agazine.recipe
Import re and add these lines (from line 42) to skip a URL if tag_to_string returns an empty string. At present the recipe returns articles without titles in the ToC.
Quote:
Last edited by unkn0wn; 06-01-2022 at 02:24 AM.
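The quoted lines are missing from this post, so the following is only a guess at the idea, not the actual patch: collapse whitespace with re and skip any link whose title ends up empty. collect_articles and the (text, href) input shape are assumptions made for illustration.

```python
import re


def collect_articles(links):
    """links: iterable of (text, href) pairs.
    Returns article dicts, skipping links with blank titles."""
    articles = []
    for text, href in links:
        # Collapse runs of whitespace, then trim
        title = re.sub(r'\s+', ' ', text or '').strip()
        if not title:
            continue  # an empty title would show up as a blank ToC entry
        articles.append({'title': title, 'url': href})
    return articles


# Placeholder data for illustration only
links = [('  The\n Problem ', 'a.htm'), ('', 'b.htm'), ('   \n', 'c.htm'), (None, 'd.htm')]
print(collect_articles(links))
```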

#5
https://github.com/kovidgoyal/calibr...s_today.recipe
The default Business Today magazine page is for the next edition, and they keep adding articles to it. I changed the recipe to choose the present edition rather than the future edition that is still under construction. From line 28:
Code:
    def parse_index(self):
        soup = self.index_to_soup('https://www.businesstoday.in/magazine')
        issue = soup.find(attrs={'class': 'view-id-latest_issue_magzine'})
        # [1] picks the second issue link, i.e. the present edition rather than the upcoming one
        a = issue.findAll('a', href=lambda x: x and x.startswith('/magazine/issue/'))[1]
        url = a['href']
        self.log('issue =', url)
        soup = self.index_to_soup('https://www.businesstoday.in' + url)
        tag = soup.find(attrs={'class': 'issue-image'})
        if tag:
            self.cover_url = tag.find('img')['src']
        section = None
        sections = {}
Quote:
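The [1] index raises IndexError if the page ever lists only one issue link. A small sketch of the same selection with a fallback, working on plain href strings so it runs without calibre. pick_current_issue and the sample hrefs are hypothetical.

```python
def pick_current_issue(hrefs):
    """Return the second '/magazine/issue/' link (the present edition),
    falling back to the first if only one is listed, or None if there are none."""
    issue_links = [h for h in hrefs if h and h.startswith('/magazine/issue/')]
    if not issue_links:
        return None
    return issue_links[1] if len(issue_links) > 1 else issue_links[0]


# Placeholder hrefs for illustration only
hrefs = ['/magazine', '/magazine/issue/next-edition', None, '/magazine/issue/current-edition']
print(pick_current_issue(hrefs))
```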

#6
https://github.com/kovidgoyal/calibr...merican.recipe
Scientific American: cover and tags, line 14.
Code:
    keep_classes = {
        'article-header', 'article-content', 'article-media', 'article-author',
        'article-text', 'feature-article--header', 'feature-article--header-title',
        'opinion-article__header-title', 'author-bio',
    }
    remove_classes = {
        'aside-banner', 'moreToExplore', 'article-footer', 'flex-column--25',
        'article-author__suggested',
    }
Code:
    select = Select(self.index_to_soup(url, as_tree=True))
    # Take the first product image and drop its query string before appending ?w=800
    cover = [x.get('src', '') for x in select('main .product-detail__image img')][0].split('?')[0]
    self.cover_url = cover + '?w=800'
    feeds = []
and masthead_url = 'https://static.scientificamerican.com/sciam/assets/Image/newsletter/salogo.png'
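The split('?')[0] + '?w=800' step can also be written with urllib.parse, which makes the intent (replace whatever query is there with w=800) a bit more explicit. This is just a sketch. resized_cover and the example URL are made up.

```python
from urllib.parse import urlsplit, urlunsplit


def resized_cover(src: str, width: int = 800) -> str:
    """Replace any existing query string on src with w=<width>."""
    parts = urlsplit(src)
    return urlunsplit((parts.scheme, parts.netloc, parts.path, f'w={width}', ''))


# Placeholder URL for illustration only
print(resized_cover('https://example.com/img/cover.jpg?w=300&h=400'))
```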

#7
Foreign Affairs
The comments section and the issue section contain the same articles. I think setting the recipe to ignore duplicates is much easier.
Quote:
Quote:
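In a recipe this is normally just the ignore_duplicate_articles class attribute (for example {'url'}). To show what that amounts to, here is a plain Python sketch that drops repeated URLs from a parse_index-style list of (section, articles) tuples. drop_duplicates and the sample data are hypothetical, and this is not calibre's own implementation.

```python
def drop_duplicates(index):
    """index: list of (section_title, [article dicts with a 'url' key]).
    Keeps the first occurrence of each URL and drops sections left empty."""
    seen = set()
    result = []
    for section, articles in index:
        unique = []
        for art in articles:
            if art['url'] in seen:
                continue
            seen.add(art['url'])
            unique.append(art)
        if unique:
            result.append((section, unique))
    return result


# Placeholder data for illustration only
index = [
    ('Comments', [{'title': 'A', 'url': '/a'}, {'title': 'B', 'url': '/b'}]),
    ('Issue', [{'title': 'A', 'url': '/a'}, {'title': 'C', 'url': '/c'}]),
    ('Repeat', [{'title': 'B', 'url': '/b'}]),
]
print(drop_duplicates(index))
```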

#8
Nautilus
https://github.com/kovidgoyal/calibr...autilus.recipe
Cover method change. I also think oldest_article needs to be 60: oldest_article = 60 # days
Code:
    def get_cover_url(self):
        # classes() is the helper imported from calibre.web.feeds.news
        soup = self.index_to_soup('https://www.presspassnow.com/nautilus/issues/')
        div = soup.find('div', **classes('image-fade_in_back'))
        if div:
            self.cover_url = div.find('img', src=True)['src']
        return getattr(self, 'cover_url', None)

#9
Swarajya mag
https://github.com/kovidgoyal/calibr...warajya.recipe
Adding a description.
Code:
        if url.startswith('/'):
            url = 'https://swarajyamag.com' + url
        title = self.tag_to_string(a)
        desc = ''  # default, so articles without a byline do not raise NameError
        d = a.find_previous_sibling('a', **classes('_2nEd_'))
        if d:
            desc = 'By ' + self.tag_to_string(d)
        self.log(title, ' at ', url, '\n', desc)
        ans.append({'title': title, 'url': url, 'description': desc})
    return [('Articles', ans)]