07-21-2022, 10:20 AM | #1 |
Evangelist
Posts: 444
Karma: 82686
Join Date: May 2021
Device: kindle
|
indian express update
Reordered the feeds and commented out (#'d) some of them, since the output is already large.
Updated remove_tags and made a few other changes. https://github.com/kovidgoyal/calibr...express.recipe |
07-22-2022, 02:28 AM | #2 |
Evangelist
Posts: 444
Karma: 82686
Join Date: May 2021
Device: kindle
|
Live Mint update
Can we italicize unresolved links to differentiate them from resolved links?
I tried something like this in postprocess_html, but it didn't work (it changes resolved links too). Is there another way?
Code:
def postprocess_html(self, soup):
    # Tag every link whose href is still an absolute http(s) URL
    for unresolved in soup.findAll('a', href=lambda x: x and x.startswith('http')):
        unresolved['id'] = 'unres-d'
    return soup

extra_css = '#unres-d{font-style:italic;}'
|
07-22-2022, 04:00 AM | #3 |
creator of calibre
Posts: 43,856
Karma: 22666666
Join Date: Oct 2006
Location: Mumbai, India
Device: Various
|
No, that processing is done after postprocess_html is run. You could do it by implementing postprocess_book in the recipe, but that isn't so easy.
|
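For what it's worth, here is a minimal sketch of just the marking step, using only the standard library. It assumes the links that are still unresolved are the ones whose href still starts with http:// or https:// at the point where the book is assembled, and it uses a class rather than an id, since an id is meant to be unique within a page. The function name mark_external_links and the class name unresolved are made up for illustration. Wiring this into postprocess_book is not shown and would need checking against the calibre source.

```python
import re

# Opening <a ...> tags and the href attribute inside them
_ANCHOR = re.compile(r'<a\b([^>]*)>', re.IGNORECASE)
_HREF = re.compile(r'''href\s*=\s*(["'])(.*?)\1''', re.IGNORECASE)
_HAS_CLASS = re.compile(r'\bclass\s*=', re.IGNORECASE)


def mark_external_links(html, css_class='unresolved'):
    """Add a class to anchors whose href is still an absolute http(s) URL."""
    def tag(match):
        attrs = match.group(1)
        href = _HREF.search(attrs)
        # Leave relative (already resolved) links untouched
        if not href or not href.group(2).startswith(('http://', 'https://')):
            return match.group(0)
        # Keep the sketch simple: do not merge with an existing class
        if _HAS_CLASS.search(attrs):
            return match.group(0)
        return '<a class="%s"%s>' % (css_class, attrs)
    return _ANCHOR.sub(tag, html)
```

With the anchors marked this way, a rule such as .unresolved{font-style:italic;} should style only the external links.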
08-11-2022, 07:53 AM | #4 | ||
Evangelist
Posts: 444
Karma: 82686
Join Date: May 2021
Device: kindle
|
Indian Express
Quote:
Quote:
|
||
08-18-2022, 01:44 AM | #5 |
Evangelist
Posts: 444
Karma: 82686
Join Date: May 2021
Device: kindle
|
Live Mint update
https://github.com/kovidgoyal/calibr...ivemint.recipe
Code:
# json and re need to be imported at the top of the recipe
import json
import re

def preprocess_raw_html(self, raw, *a):
    # Only pages carrying this marker script need the JSON-LD workaround
    if '<script>var wsjFlag=true;</script>' in raw:
        m = re.search(r'type="application/ld\+json">[^<]+?"@type": "NewsArticle"', raw)
        # Cut from the matched script tag and keep what follows its '>'
        raw1 = raw[m.start():]
        raw1 = raw1.split('>', 1)[1].strip()
        # raw_decode parses the leading JSON object and ignores the trailing HTML
        data = json.JSONDecoder().raw_decode(raw1)[0]
        value = data['hasPart']['value']
        # Start a new paragraph where a sentence runs straight into the next one
        body = data['articleBody'] + '</p> <p>' + re.sub(r'([a-z]\.|[0-9]\.)([A-Z])', r'\1 <p> \2', value)
        body = '<div class="FirstEle"> <p>' + body + '</p> </div>'
        raw = re.sub(r'<div class="FirstEle">([^}]*)</div>', body, raw)
    return raw

Also add this to extra_css in the same recipe:
.summary{font-style:italic; color:#404040;}
and set resolve_internal_links = True
Last edited by unkn0wn; 08-18-2022 at 03:23 AM. |
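The two ideas in that method (pulling the NewsArticle JSON-LD out of the raw page with raw_decode, and splitting run-on sentences into paragraphs) can be tried outside calibre. This is only a sketch under assumptions: the helper names extract_news_article and split_run_on and the sample page are invented for illustration, and the None check is an extra guard that is not in the recipe code above.

```python
import json
import re

LD_JSON_RE = re.compile(r'type="application/ld\+json">[^<]+?"@type": "NewsArticle"')
RUN_ON_RE = re.compile(r'([a-z]\.|[0-9]\.)([A-Z])')


def extract_news_article(raw):
    """Return the NewsArticle JSON-LD object embedded in raw, or None."""
    m = LD_JSON_RE.search(raw)
    if m is None:
        return None
    # Drop everything up to the end of the opening script tag
    tail = raw[m.start():].split('>', 1)[1].strip()
    # raw_decode stops at the end of the first JSON value, so the
    # trailing </script> and the rest of the page are ignored
    data, _end = json.JSONDecoder().raw_decode(tail)
    return data


def split_run_on(text):
    """Insert a paragraph break where a full stop is glued to a capital."""
    return RUN_ON_RE.sub(r'\1 <p> \2', text)


# Hypothetical page fragment, shaped like the markup the regex expects
sample = ('<html><script type="application/ld+json">'
          '{"@type": "NewsArticle", "articleBody": "Intro.", '
          '"hasPart": {"value": "First point.Second point."}}'
          '</script></html>')
article = extract_news_article(sample)
```

One thing that may be worth checking in the recipe itself: re.sub treats backslashes in a replacement string as escapes, so if an article body can contain a backslash, passing a function such as lambda m: body as the replacement avoids that.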
08-18-2022, 01:58 AM | #6 | ||
Evangelist
Posts: 444
Karma: 82686
Join Date: May 2021
Device: kindle
|
Indian Express
https://github.com/kovidgoyal/calibr...express.recipe
Remove lines 110-112 and replace them with:
Code:
h1 = soup.find('h1')
if h1:
    h2 = h1.findNext('h2')
    if h2:
        h2.name = 'p'
        h2['id'] = 'sub-d'
Quote:
extra_css additions Quote:
Last edited by unkn0wn; 08-18-2022 at 03:20 AM. |
||
08-18-2022, 02:13 AM | #7 |
Evangelist
Posts: 444
Karma: 82686
Join Date: May 2021
Device: kindle
|
Nautilus update
https://github.com/kovidgoyal/calibr...autilus.recipe
Code:
def preprocess_html(self, soup):
    for img in soup.findAll('img', attrs={'data-src': True}):
        img['src'] = img['data-src'].split('?')[0]
    for figcaption in soup.findAll('figcaption'):
        figcaption['id'] = 'fig-c'
    for ul in soup.findAll('ul', attrs={'class': [
        'breadcrumb', 'article-list_item-byline',
        'channel-article-author', 'article-author'
    ]}):
        ul.name = 'span'
        for li in ul.findAll('li'):
            li.name = 'p'
    return soup
Code:
extra_css = '''
    .article-list_item-byline{font-size:small;}
    blockquote{color:#404040; text-align:center;}
    #fig-c{font-size:small;}
    em{color:#202020;}
    .breadcrumb{color:gray; font-size:small;}
    .article-author{font-size:small;}
'''
Last edited by unkn0wn; 08-18-2022 at 05:07 AM. |
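The data-src handling above drops the query string with split('?')[0]. As a small aside, the same step can be written with urllib.parse, which also drops any #fragment. This is a sketch only: strip_query is a name made up for illustration and the URLs are placeholders, not real Nautilus image addresses.

```python
from urllib.parse import urlsplit, urlunsplit


def strip_query(url):
    """Return url without its query string and fragment."""
    parts = urlsplit(url)
    # Rebuild the URL from scheme, host and path only
    return urlunsplit((parts.scheme, parts.netloc, parts.path, '', ''))


full_size = strip_query('https://example.com/img/a.jpg?w=640&q=75')
```

For plain image URLs both approaches give the same result, so the one-line split in the recipe is probably fine as it is.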
08-18-2022, 08:21 AM | #8 |
creator of calibre
Posts: 43,856
Karma: 22666666
Join Date: Oct 2006
Location: Mumbai, India
Device: Various
|
Done. I suggest you just attach the modified recipe files; that is easier for both you and me.
|
08-18-2022, 09:46 AM | #9 |
Evangelist
Posts: 444
Karma: 82686
Join Date: May 2021
Device: kindle
|
Okay. I thought this would be easier for small changes.
|
|