02-23-2022, 05:17 AM   #1
unkn0wn
Evangelist
Posts: 442
Karma: 82686
Join Date: May 2021
Device: kindle
Update Indian Express

Fixing some tags and removing unnecessary banners.

https://github.com/kovidgoyal/calibr...00db42e010677d

Code:
    # Class-level attributes of the recipe; classes() comes from calibre.web.feeds.news
    remove_attributes = ['style', 'height', 'width']
    ignore_duplicate_articles = {'url'}

    keep_only_tags = [
        classes('heading-part full-details')
    ]
    remove_tags = [
        dict(name='nav', attrs={'class': 'ie-breadcrumb'}),
        dict(name='div', attrs={'id': 'ie_story_comments'}),
        dict(name='div', attrs={'class': ['ie-int-campign-ad', 'custom_read_button', 'unitimg', 'copyright']}),
        dict(name='img', attrs={'src': 'https://images.indianexpress.com/2021/06/explained-button-300-ie.jpeg'}),
        dict(name='a', attrs={'href': 'https://indianexpress.com/section/explained/?utm_source=newbanner'}),
        dict(name='img', attrs={'src': 'https://images.indianexpress.com/2021/06/opinion-button-300-ie.jpeg'}),
        dict(name='a', attrs={'href': 'https://indianexpress.com/section/opinion/?utm_source=newbanner'}),
        classes('share-social appstext storytags pdsc-related-modify news-guard'),
    ]

Last edited by unkn0wn; 02-23-2022 at 05:20 AM.

02-23-2022, 09:41 AM   #2
kovidgoyal
creator of calibre
Posts: 43,842
Karma: 22666666
Join Date: Oct 2006
Location: Mumbai, India
Device: Various
https://github.com/kovidgoyal/calibr...1f2022d995c99f

04-04-2022, 02:15 AM   #3
unkn0wn
Update

Quote:
masthead_url = 'https://indianexpress.com/wp-content/themes/indianexpress/images/indian-express-logo-n.svg'
Quote:
extra_css = '#storycenterbyline {font-size:small;}'
Code:
    def get_cover_url(self):
        # Use the first <meta> tag whose content URL ends in 'view/3.jpg' as the cover
        soup = self.index_to_soup('https://www.magzter.com/IN/The-Indian-Express-Ltd./The-Indian-Express-Mumbai/Newspaper/')
        for citem in soup.findAll('meta', content=lambda s: s and s.endswith('view/3.jpg')):
            return citem['content']
Quote:
remove_tags = [
    classes('share-social appstext storytags ie-int-campign-ad ie-breadcrumb custom_read_button unitimg copyright pdsc-related-modify news-guard')
]
The bold parts are to be added, and story-tags is to be replaced with storytags.
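
For anyone wondering why the separate dict(...) entries from the first post can be folded into a single classes(...) call: as far as I understand it, the helper builds a matcher that accepts any element carrying at least one of the space-separated class names. The snippet below is a stand-in written for illustration, not calibre's actual code, so treat the exact behaviour as an assumption. It also shows why story-tags and storytags are not interchangeable.

```python
def classes(names):
    # Stand-in modelled on my understanding of calibre's classes() helper
    # (an assumption): match when the element has at least one wanted class.
    wanted = frozenset(names.split(' '))
    return dict(attrs={
        'class': lambda value: bool(value) and bool(frozenset(value.split()) & wanted)
    })


# Pull out the class matcher so it can be tried on plain strings.
matcher = classes('share-social appstext storytags')['attrs']['class']

print(matcher('storytags wide'))  # True: one of the element's classes is wanted
print(matcher('story-tags'))      # False: the hyphenated name is a different class
print(matcher(None))              # False: element without a class attribute
```

So a single classes('a b c') entry should cover what would otherwise need one dict per class name.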

More feeds:
Quote:
('Research', 'https://indianexpress.com/section/research/feed/'),
('UPSC-CSE Key','https://indianexpress.com/section/upsc-current-affairs/feed/'),
('World','https://indianexpress.com/section/world/feed/'),
('Business', 'https://indianexpress.com/section/business/feed/'),

Last edited by unkn0wn; 04-04-2022 at 03:09 AM.

04-04-2022, 03:18 AM   #4
unkn0wn
Also, a cover URL for Hindustan Times:

https://github.com/kovidgoyal/calibr...n_times.recipe

I found that it's much easier to get covers from Magzter.

Code:
    def get_cover_url(self):
        soup = self.index_to_soup('https://www.magzter.com/IN/HT-Digital-Streams-Ltd./Hindustan-Times-Delhi/Newspaper/')
        for citem in soup.findAll('meta', content=lambda s: s and s.endswith('view/3.jpg')):
            return citem['content']
The same code gives daily-updated cover URLs for many other publications; only the Magzter link needs to change.
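
To try the matching logic outside calibre, here is a rough standalone sketch using only the Python standard library. It swaps in html.parser for calibre's index_to_soup, and the names CoverMetaFinder and find_cover_url as well as the sample HTML are made up for illustration. It is not real Magzter markup and not the recipe API.

```python
from html.parser import HTMLParser


class CoverMetaFinder(HTMLParser):
    """Remember the first <meta> whose content attribute ends with a suffix."""

    def __init__(self, suffix='view/3.jpg'):
        super().__init__()
        self.suffix = suffix
        self.cover_url = None

    def handle_starttag(self, tag, attrs):
        # Only look at <meta> tags, and stop once a cover has been found.
        if tag != 'meta' or self.cover_url is not None:
            return
        content = dict(attrs).get('content')
        if content and content.endswith(self.suffix):
            self.cover_url = content


def find_cover_url(html, suffix='view/3.jpg'):
    """Return the first matching meta content URL, or None if nothing matches."""
    finder = CoverMetaFinder(suffix)
    finder.feed(html)
    return finder.cover_url


# Hypothetical page snippet, for illustration only.
sample = '''
<html><head>
<meta property="og:title" content="Example Paper">
<meta property="og:image" content="https://example.invalid/covers/123/view/3.jpg">
</head><body></body></html>
'''

print(find_cover_url(sample))
```

Like the recipe code, this returns None when no matching tag exists, in which case the recipe would simply have no cover from this source.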

04-04-2022, 03:56 AM   #5
unkn0wn
More cover URLs for other recipes

India Today update: https://github.com/kovidgoyal/calibr...a_today.recipe

Code:
    extra_css = '[itemprop^="description"] {font-size: small; font-style: italic;}'

    def get_cover_url(self):
        soup = self.index_to_soup('https://www.magzter.com/IN/India-Today-Group/India-Today/News/')
        for citem in soup.findAll('meta', content=lambda s: s and s.endswith('view/3.jpg')):
            return citem['content']
We can't get this cover from the publication's own website.


THE WEEK
India

https://github.com/kovidgoyal/calibr...he_week.recipe

Cover URL and other updates.

Code:
    def get_cover_url(self):
        soup = self.index_to_soup('https://www.magzter.com/IN/Malayala_Manorama/THE_WEEK/Business/')
        for citem in soup.findAll('meta', content=lambda s: s and s.endswith('view/3.jpg')):
            return citem['content']
The cover image that the present recipe fetches is very low quality.

Remove everything from line 36 to line 57 (the end) of the present recipe, which doesn't load the images inside the article text (the images are in the src attribute), and add the following:

Code:
    keep_only_tags = [
        dict(name='h1'),
        dict(name='div', attrs={'class': ['article-title', 'article-image', 'articlecontentbody section']}),
    ]

    remove_tags = [
        dict(name='div', attrs={'class': 'highlights section'}),
    ]
Financial Express

Cover URL:
Code:
    def get_cover_url(self):
        soup = self.index_to_soup('https://www.magzter.com/IN/The-Indian-Express-Ltd./Financial-Express-Mumbai/Business/')
        for citem in soup.findAll('meta', content=lambda s: s and s.endswith('view/3.jpg')):
            return citem['content']

04-06-2022, 11:09 AM   #6
unkn0wn
Times of India

Quote:
remove_tags = [dict(name='div', attrs={'class': 'success_screen poll_withoutLogin hide'})]
Code:
    def get_cover_url(self):
        soup = self.index_to_soup('https://www.magzter.com/IN/Bennett-Coleman-and-Company-Limited/The-Times-of-India-Delhi/Newspaper/')
        for citem in soup.findAll('meta', content=lambda s: s and s.endswith('view/3.jpg')):
            return citem['content']
I didn't know we had a working recipe for ToI, and it ranks 60th in usage: https://calibre-ebook.com/dynamic/recipe-usage (are these stats up to date?)

LiveMint

Why not use the same image every day for LiveMint?

Quote:
masthead_url = 'https://images.livemint.com/static/livemint-logo-v1.svg'

cover_url = 'https://epsfs.hindustantimes.com/MINT/2022/04/06/Delhi/Delhi/5_01/9376f23b_01_mr.jpg'

Last edited by unkn0wn; 04-06-2022 at 11:56 AM.

04-07-2022, 06:07 AM   #7
kovidgoyal
Yes, the stats are up-to-date

04-07-2022, 01:03 PM   #8
unkn0wn
Code:
    def get_cover_url(self):
        soup = self.index_to_soup('https://www.magzter.com/IN/Bennett-Coleman-and-Company-Limited/The-Times-of-India-Delhi/Newspaper/')
        for citem in soup.findAll('meta', content=lambda s: s and s.endswith('view/3.jpg')):
            return citem['content']
Uhh, you missed adding this cover URL part to the ToI recipe.

It's just that fetching the daily front page as the cover makes it much more interesting. Sorry that I keep asking you to make so many changes.

04-07-2022, 09:31 PM   #9
kovidgoyal
If you send pull requests on GitHub, it will be easier to ensure I don't miss anything.

04-30-2022, 12:54 AM   #10
unkn0wn
Removing a new div in Indian Express

(This new div adds a lot of unnecessary content to every article.)
Add premium-story to the remove_tags classes.

New feeds:
('Political Pulse', 'https://indianexpress.com/section/india/political-pulse/feed/'),
('India', 'https://indianexpress.com/section/india/feed/'),

Last edited by unkn0wn; 04-30-2022 at 12:56 AM.

05-02-2022, 02:25 AM   #11
unkn0wn
Similar problem with Financial Express:
https://github.com/kovidgoyal/calibr...e_india.recipe

Add remove_tags = [classes('parent_also_read')]

05-02-2022, 06:01 PM   #12
bheeshmpita
Member
Posts: 21
Karma: 10
Join Date: Apr 2022
Device: android tablet
Quote:
Originally Posted by unkn0wn View Post
(this new div is adding too much unnecessary stuff to all the articles)
add premium-story to remove_tags classes
How do I get rid of 'best of Express Premium' appearing repeatedly in the news fetched from Indian Express?

05-03-2022, 02:40 AM   #13
unkn0wn
The changes were already made; just load the default IE recipe.

05-08-2022, 12:32 PM   #14
bheeshmpita
Quote:
Originally Posted by unkn0wn View Post
the changes were already made.. load from default IE recipe.
Oh OK. Is there any way for my custom recipe to pick up changes made to the base recipe automatically?

05-09-2022, 12:56 AM   #15
unkn0wn
No, just use the default recipe.