Help: outlook magazine India

unkn0wn · 04-30-2022, 01:34 AM

The present recipe doesn't work anymore.

New recipe:

Code:

import json
from calibre.web.feeds.news import BasicNewsRecipe, classes

class outlook(BasicNewsRecipe):
    title = 'Outlook Magazine'
    __author__ = 'unkn0wn'
    description = ''
    language = 'en_IN'
    use_embedded_content = False
    no_stylesheets = True
    remove_javascript = True
    remove_attributes = ['height', 'width', 'style']
    ignore_duplicate_articles = {'url'}
    
    
    def parse_index(self):
        soup = self.index_to_soup('https://www.outlookindia.com/magazine/archive')
        issue = soup.find(**classes('issue_listing'))
        a = issue.find('a', href=lambda x: x and x.startswith('/magazine/issue/'))
        url = a['href']
        self.log('Downloading issue:', url)
        self.cover_url = a.find('img', attrs={'src': True})['src']
        soup = self.index_to_soup('https://www.outlookindia.com' + url)
        ans = []

        for h3 in soup.findAll(['h3', 'h4'], attrs={'class':'tk-kepler-std-condensed-subhead'}):
            a = h3.find('a', href=True)
            if a is None:
                continue  # heading without a link, nothing to download
            url = a['href']
            title = self.tag_to_string(a)
            desc = h3.find_next_sibling('p')
            desc = self.tag_to_string(desc)
            self.log('\t\tFound article:', title)
            self.log('\t\t\t', url)
            self.log('\t\t\t\t', desc)
            ans.append({
                'title': title,
                'url': url,
                'description': desc})
        return [('Articles', ans)]

Help me with extracting the JSON in preprocess_raw_html:
soup = self.index_to_soup(raw)
script = soup.find('script', type="application/ld+json")

Example JSON from Outlook (save as .json):

Spoiler:

Quote:

{"@context":"https:\/\/schema.org","@type":"NewsArticle","mainEntityOfPage":"https:\/\/www.outlookindia.com\/magazine\/national\/cadres-of-political-parties-the-unsung-heroes-of-democracy-magazine-191473","headline":"Cadres Of Political Parties: The Unsung Heroes Of Democracy","inLanguage":"en","datePublished":"2022-04-15T11:41:52+05:30","dateModified":"2022-04-22T13:13:35+05:30","keywords":["Democracy","Political Parties","Politics","BSP","Dalits","BJP","CPI(M)", "DMK","Trinamool Congress (TMC)","Congress","AAP: Aam Aadmi Party"],"articleBody":"Democracy is unthinkable without political parties. And without a committed base of volunteers, workers and cadres, it is hard for any party to survive in the competitive arena of electoral politics. While the burgeoning research on the transactional relationship between politicians and voters has improved our understanding about the role of political workers as middlemen, the research on their role as activist of political parties is conspicuous by its absence.\r\n\r\nEven the knowledge of basic things such as the demographic profile of workers of a party largely relies on proxy measures such as its voting base. We assume that a BJP worker is more likely to be upper caste and urban, and a BSP activist would be a Dalit. Whereas the social base of political parties keeps shrinking and expanding, as we know from the experience of the BJP and the BSP over the past decade.\r\n\r\nALSO READ: Decoding The Journey Of Rashtriya Swayamsevak Sangh\r\n\r\nSimilarly, we have rarely inquired into the levels of ideological attachment that political workers display. Do they see themselves as ideological messengers of a party, or merely as loyal soldiers of an ambitious politician? Do political intermediaries display partisan loyalties, or do they change political preferences at the drop of a hat? We also have very little knowledge of what proportion of such workers see politics as a full-time vocation and main source of earning their living. What do part-time political workers expect in return from their engagement in this field?\r\n\r\nALSO READ: Social Awakening, Individual Character Building: What RSS Expects From Its Cadres\r\n\r\nAs party activists, politician’s agent, ticket aspirant, broker, fixer, etc all rolled into one, political workers are the most critical element of an understudied aspect of Indian politics. And this gaping hole is having serious consequences in interpreting the changing nature of Indian democracy.\r\n\r\nAmong other things, They put up posters, collect money, mobilise supporters and convince them to turn out in large numbers at the polling booth. \r\n\r\nEach year, India witnesses the emergence of dozens of new parties, but few manage to enter the Lok Sabha or Vidhan Sabha, and fewer manage to survive beyond two election cycles. Elections are also becoming increasingly costly, and yet the number of contesting candidates is also growingly rapidly, despite the fact that a vast majority of candidates lose their deposits. \r\n\r\nALSO READ: Unfailing Commitment Of CPI(M) Red Brigades\r\n\r\nIt is not uncommon for such contradictory tendencies to co-exist. The rapid proliferation of political parties and candidates has given rise to a scholarly consensus in defining the nature and character of Indian politics. Yet, empirical realities continue to defy the foundation of this consensus. The picture is much more complex than what the simple depictions and provocative generalisations tend to assume. To be clear, this assertion is not to suggest that the prevailing wisdom is mythical and has no connection with real world.  \r\n\r\nFor example, the scholarly consensus that Indian political parties have very thin penetration on the ground is rooted in the fact that institutional rules governing organisational life of India’s parties remain weak. Most of them, except ideologically visible parties such as the BJP, lack physical infrastructure such as party offices below district-level, paid staff, formal relationship with civil society-based affiliates (e.g. trade unions), training modules for cadres, among other things. Only some state-level parties like the CPI(M) in Kerala, DMK in Tamil Nadu and the BSP in Uttar Pradesh used to display similar traits.\r\n\r\nALSO READ: Death Of The Political Cadre: What Impacted Congress Party's Student And Youth Wings\r\n\r\nAmid this backdrop of organisational weakness, it is surprising that political parties in India tend to perform some core functions very well. First, they manage to mobilise a huge cache of human resources (workers and volunteers) and financial resources during elections. The spectacle of campaigning in India (though now much muted by the Election Commission’s restrictions) has perhaps no parallel on this planet. And very few would doubt the competitive nature of Indian elections. Across vast stretches of India’s political landscape, there are only a handful of constituencies that any party can boast as a long-term stronghold.\r\n\r\nOf course, we all have heard about BJP’s much-publicised panna pramukhs (activist responsible for one page of the electoral roll) in recent times, but our knowledge of workers affiliated to other parties remains limited to them acting primarily as vote mobilisers. They put posters, mobilise supporters for rallies and procession during the campaign, convince their supporters to turn out in large numbers at the polling booth, collect donations for their parties and local candidates, among other things. The data presented in Figure 1 bears testimony to the fact that during campaigns, these workers-as-canvassers manage to reach out to a large number of households.\r\n\r\n\r\nFigure 1: Increased Political Canvassing During Lok Sabha Elections Centre for Policy Research\r\n\r\n\r\nALSO READ: Shiv Sena In New Avtar: How Far Will The Tamed Beast Go?\r\n\r\nA majority of these workers often engage in such mobilisational activities with the aspiration that one day they will rise up the ranks. The party will reward them for their hard work and may even nominate them as candidates. Lakhs if not millions carry this hope every election cycle, and yet a majority of them continue volunteering at the same position where they started.\r\n\r\nSecond, India’s political parties often get depicted as machines that are assembled months before an election, before getting disassembled. If this is true, then what about the consensus that India is a patronage democracy, in which the main role of cadres is to act as intermediaries between citizens and the State? Or, do we think everyone in the system—voters, political workers, politicians, parties—are just freewheelers who keep shifting loyalties. If this is true, then what explains the stable patterns of electoral competition and resilience of existing power structures?  \r\n\r\nALSO READ: RJD: Time For Reappraisal Of Party Cadres\r\n\r\nIt is true that the capacity of the Indian State to meet the demand of its citizens effectively has continued to remain low; especially in a scenario when a large segment of the population is critically dependent on State services for their well-being. While distributive politics often occurs through partisan channels and ethnic networks, citizens turn to their local party workers to assist them in navigating government offices, meeting with administrative officials, or simply getting work done. In the process, these activists sometime engage in charging what some may describe as a ‘convenience fee’, while others call it ‘rent-extraction’. Notwithstanding the nomenclature, very few political workers (or brokers) earn enough to live a comfortable life.\r\n\r\nALSO READ: The Slow And Steady Decline Of BSP\r\n\r\nWhat motivates these political workers to continue volunteering with such little rewards? During the course of our research for our book Ideology and Identity: The Changing Party Systems of India that extensively uses National Election Study (NES) surveys conducted by Lokniti-CSDS, Pradeep Chhibber and I made some limited observations regarding party members. Since 1971, NES surveys have asked respondents not only which party they voted for, but also whether they identify or feel close to any party, and whether they are registered members of any party. The data indicates that since 1999, the proportion of those who identify with a political party and are a member of one, has been consistently above 30 per cent and 10 per cent respectively. While in comparative terms, the level of party identification in India is much lower than in Western democracies, but if one looks at party membership—the most advanced form of political identification—the metric compares favourably with the West.\r\n\r\nALSO READ: AAP: The Anti-corruption Zealots On The March\r\n\r\nSimilarly, consistent with the literature on ideological contestation in the arena of party politics in Western democracies, the party identifiers as well as the party members in India display greater commitment to the party’s ideological worldview than the voters. And contrary to the claims of voters and political elites of changing political preferences, we find robust patterns of ideological competition in Indian politics. This is neither to downplay the outsized role charismatic leaders play in driving voters, or the role of money and patronage during elections. In our framework, leaders are heuristic or communicating devices who covey complex ideological and policy platforms to voters in simple language. The party members act as messengers in this framework. And the patronage effects remain marginal in structuring the election narrative. After all, the dismal rates at which incumbents get re-nominated and re-elected should serve a reminder to exaggerated claims of patronage politics driving electoral outcomes.\r\n\r\nALSO READ: AAP At The Grassroots And Its Big Strides In Gujarat\r\n\r\nFinally, we noted that the social composition of those who identify with a party and report being a member, has undergone a major transformation. While the bottom half of the social pyramid—lower castes, the poor, women, religious minorities, less educated, etc—continue to remain proportionally under-represented, their share has risen significantly since the 1970s. In that sense, these two metrics reflects democratisation. In addition, the partisan composition of these two metrics has closely followed the trajectory of changing party systems in India. The data in Figure 2 on party identification reflects the altered position of the Congress and the BJP in the political setup.\r\n\r\n\r\nFigure 2: Increasing Political Identification with the BJP Centre for Policy Research\r\n\r\n\r\nALSO READ: Cadres Keep Kashmir's National Conference Relevant\r\n\r\nIn conclusion, these political workers in many ways remain the unsung heroes of any political setup, carrying the burden of democracy on their shoulders. Their engagement in everyday political life carries both; the promises as well as the pitfalls of our democratic project. Needless to say, we must focus our gaze on political parties and their volunteers to understand the changing nature of Indian democracy.\r\n\r\n***\r\n\r\n(This appeared in the print edition as &quot;Unsung Heroes of Democracy&quot;)\r\n\r\n(Views expressed are personal)\r\n\r\n\r\nRahul Verma is a Fellow at the Centre for Policy Research, New Delhi\r\n","description":"As activist, agent, ticket aspirant, broker, fixer, etc., all rolled into one, the political worker is the least studied element of Indian politics \r\n","image":{"@type":"ImageObject","url":"https:\/\/imgnew.outlookindia.com\/uploadimage\/library\/16_9\/16_9_5\/IMAGE_1649866541.webp","height":"675","width":"1200"},"author":[{"@type":"Person","name":"Rahul Verma","url":"https:\/\/www.outlookindia.com\/author\/rahul-verma-3540"}],"isPartOf":{"@type":["CreativeWork","Product"],"name":"Cadres Of Political Parties: The Unsung Heroes Of Democracy","productID":"outlookindia.com:basic","description":"As activist, agent, ticket aspirant, broker, fixer, etc., all rolled into one, the political worker is the least studied element of Indian politics \r\n","sku":"https:\/\/subscriptions.outlookindia.com\/"},"publisher":{"id":"https:\/\/www.outlookindia.com\/","@type":"NewsMediaOrganization","name":"Outlook India","url":"https:\/\/www.outlookindia.com\/","logo":{"@type":"ImageObject","url":"https:\/\/www.outlookindia.com\/images\/home_new_v4\/logo_outlook.svg","height":"60","width":"600"}},"isAccessibleForFree":"false"}

I think it's really simple JSON, but I don't know how to extract it and convert it to HTML.
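For readers following along, here is a minimal sketch of the kind of conversion being asked about. It is not the code from the commit linked in the reply, which is not reproduced in this thread. It assumes the page carries a single usable `application/ld+json` block shaped like the example above; the helper name `article_html_from_ld_json` and the output markup are hypothetical, and only the standard library is used so it can be tried outside calibre.

```python
import json
import re
from html import escape

# Matches the contents of every <script ... application/ld+json ...> element.
LD_JSON = re.compile(
    r'<script[^>]*application/ld\+json[^>]*>(.*?)</script>',
    re.DOTALL | re.IGNORECASE,
)


def article_html_from_ld_json(raw):
    """Build simple article HTML from the JSON-LD found in raw.

    Hypothetical helper: returns raw unchanged when no block with an
    articleBody can be parsed.
    """
    data = None
    for match in LD_JSON.finditer(raw):
        try:
            candidate = json.loads(match.group(1))
        except ValueError:
            continue  # malformed block, try the next one
        if isinstance(candidate, dict) and candidate.get('articleBody'):
            data = candidate
            break
    if data is None:
        return raw

    parts = ['<html><body><article>']
    parts.append('<h1>%s</h1>' % escape(data.get('headline', '')))
    desc = data.get('description', '').strip()
    if desc:
        parts.append('<p class="desc">%s</p>' % escape(desc))
    authors = data.get('author') or []
    if isinstance(authors, dict):
        authors = [authors]
    names = ', '.join(a.get('name', '') for a in authors if isinstance(a, dict))
    if names:
        parts.append('<p class="author">%s</p>' % escape(names))
    # The body separates paragraphs with runs of \r\n.
    for para in re.split(r'(?:\r?\n)+', data['articleBody']):
        para = para.strip()
        if para:
            parts.append('<p>%s</p>' % escape(para))
    parts.append('</article></body></html>')
    return '\n'.join(parts)
```

In a recipe, a function like this could be called from `preprocess_raw_html(self, raw, url)` and its return value handed back to calibre. Note that the sample JSON contains only one lead image URL and no inline images, so a conversion along these lines yields text only.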

kovidgoyal · 04-30-2022, 04:34 AM

https://github.com/kovidgoyal/calibr...ae099c036a7c0f

unkn0wn · 04-30-2022, 05:48 AM

unkn0wn · 05-01-2022, 03:43 AM

Financial Times has feeds.

The JSON is also very similar to the above.

Code:

import json, re
from calibre.web.feeds.news import BasicNewsRecipe

class ft(BasicNewsRecipe):
    title = 'Financial Times'
    language = 'en'
    __author__ = "Kovid Goyal"
    description = 'The Financial Times is one of the world’s leading news organisations, recognised internationally for its authority, integrity and accuracy.'
    oldest_article = 1.5
    max_articles_per_feed = 50
    no_stylesheets = True
    remove_javascript = True
    ignore_duplicate_articles = {'url'}
    remove_attributes = ['style', 'width', 'height']
    
    def get_cover_url(self):
        # Use today's front page as the cover, when the page has one.
        soup = self.index_to_soup('https://www.todayspapers.co.uk/the-financial-times-front-page-today/')
        tag = soup.find('div', attrs={'class': 'elementor-image'})
        img = tag.find('img', src=True) if tag else None
        if img:
            self.cover_url = img['src']
        # Falls back to the class default (None) when no image was found.
        return self.cover_url
    
    feeds = [
        ('World', 'https://www.ft.com/world?format=rss'),
        ('US', 'https://www.ft.com/world?format=rss'),
        ('Companies', 'https://www.ft.com/companies?format=rss'),
        ('Tech', 'https://www.ft.com/technology?format=rss'),
        ('Markets', 'https://www.ft.com/companies?format=rss'),
        ('Climate', 'https://www.ft.com/climate-capital?format=rss'),
        ('Opinion', 'https://www.ft.com/opinion?format=rss'),
        ('Life & Arts', 'https://www.ft.com/life-arts?format=rss'),
        ('how to spend it', 'https://www.ft.com/htsi?format=rss'),
        ]
        
calibre_most_common_ua = 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/80.0.3987.87 Safari/537.36'

I tried preprocessing the JSON, but I keep getting errors.

kovidgoyal · 05-01-2022, 10:37 AM

https://github.com/kovidgoyal/calibr...2704fb6a7ee713

unkn0wn · 05-01-2022, 10:56 AM

Thanks. I'm sorry, but I made an error in the feed links.

The US feed is supposed to be https://www.ft.com/us?format=rss instead of 'world',
and the Markets feed https://www.ft.com/markets?format=rss instead of 'companies'.
I didn't notice this before as I was just trying to make the JSON-to-HTML conversion work.
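As an illustration of the correction, here is the feed list with the two URLs fixed, plus a small check that flags any URL listed under more than one title. The helper `duplicate_feed_urls` is hypothetical and not part of calibre; the feed titles and URLs are the ones given in this thread.

```python
from collections import defaultdict

# Feed list with the US and Markets URLs corrected as described above.
feeds = [
    ('World', 'https://www.ft.com/world?format=rss'),
    ('US', 'https://www.ft.com/us?format=rss'),
    ('Companies', 'https://www.ft.com/companies?format=rss'),
    ('Tech', 'https://www.ft.com/technology?format=rss'),
    ('Markets', 'https://www.ft.com/markets?format=rss'),
    ('Climate', 'https://www.ft.com/climate-capital?format=rss'),
    ('Opinion', 'https://www.ft.com/opinion?format=rss'),
    ('Life & Arts', 'https://www.ft.com/life-arts?format=rss'),
    ('how to spend it', 'https://www.ft.com/htsi?format=rss'),
]


def duplicate_feed_urls(feed_list):
    """Map each URL that appears more than once to the titles using it."""
    by_url = defaultdict(list)
    for title, url in feed_list:
        by_url[url].append(title)
    return {url: titles for url, titles in by_url.items() if len(titles) > 1}


if __name__ == '__main__':
    # An empty dict means every section points at its own feed.
    print(duplicate_feed_urls(feeds))
```

Run against the earlier list, the same check would report the 'world' URL under both World and US, and the 'companies' URL under both Companies and Markets.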

unkn0wn · 05-02-2022, 02:36 AM

It wouldn't matter if you don't make these changes.

In the Outlook recipe, adding a description:

desc = h3.find_next_sibling('p')
desc = self.tag_to_string(desc)

ans.append({
'title': title,
'url': url,
'description': desc})

In the FT recipe:
masthead_url = 'https://im.ft-static.com/m/img/masthead_main.jpg'
and maybe put the Opinion feed before the World feed.

Why remove embedded images?

There must be a way.

unkn0wn · 05-03-2022, 06:57 AM

https://www.ft.com/todaysnewspaper/
There's also a UK edition. The edition might load automatically based on region.

I just changed the feeds part of the recipe to parse the articles from the print page.

I never thought to look for this page before. The print edition has far fewer articles than the feeds.

The cover_url is from the UK edition, and the UK edition has more sections and more articles, like FT Big Read, which is missing in the international edition. Maybe change the soup link to the UK edition (it has all the international articles).

Change the NoArticles text to 'The Financial Times Newspaper is not published on Sundays.'

unkn0wn · 05-03-2022, 08:17 AM

After small changes.

unkn0wn · 05-08-2022, 02:00 AM

I found that the Outlook issue from the issue archive isn't the latest (it's a week older).

I changed the recipe to find the latest issue.

https://github.com/kovidgoyal/calibr...k_india.recipe

Changes to lines 16-24:

Code:

    def parse_index(self):
        soup = self.index_to_soup('https://www.outlookindia.com/')
        a = soup.find('a', href=lambda x: x and x.startswith('/magazine/issue/'))
        url = a['href']
        self.log('Downloading issue:', url)
        soup = self.index_to_soup('https://www.outlookindia.com' + url)
        cover = soup.find(**classes('listingPage_lead_story'))
        self.cover_url = cover.find('img', attrs = {'src': True})['src']
        ans = []

unkn0wn · 05-24-2022, 10:08 AM

It turns out the latest edition loads all articles without the need to extract them from JSON, while the previous editions from the archive page needed a subscription. (There are no image links in the JSON, while the normal page loads images.)

So I just commented out the JSON code for future use, and changed other stuff.

https://github.com/kovidgoyal/calibr...k_india.recipe

There's another monthly magazine, Outlook Business, that requires exactly the same code with different links.

And another recipe for Business Today Magazine (somewhat similar to India Today).
