#1 |
Guru
Posts: 616
Karma: 85520
Join Date: May 2021
Device: kindle
MIT Technology Review: the recipe still works, but the downloaded articles are missing their header content.
Code:
    articleHeaderRegex = '^.*contentHeader__wrapper.*$'
    editorLetterHeaderRegex = "^.*contentHeader--vertical__wrapper.*$"
    articleContentRegex = "^.*contentbody__wrapper.*$"
    imagePlaceHolderRegex = "^.*image__placeholder.*$"
    advertisementRegex = "^.*sliderAd__wrapper.*$"

    keep_only_tags = [
        dict(name='header', attrs={'class': re.compile(editorLetterHeaderRegex, re.IGNORECASE)}),
        dict(name='header', attrs={'class': re.compile(articleHeaderRegex, re.IGNORECASE)}),
        dict(name='div', attrs={'class': re.compile(articleContentRegex, re.IGNORECASE)})
    ]

    remove_tags = [
        dict(name="aside"),
        dict(name="svg"),
        dict(name="blockquote"),
        dict(name="img", attrs={'class': re.compile(imagePlaceHolderRegex, re.IGNORECASE)}),
        dict(name="div", attrs={'class': re.compile(advertisementRegex, re.IGNORECASE)}),
    ]

https://github.com/kovidgoyal/calibre/blob/3dd95981398777f3c958e733209f3583e783b98c/recipes/mit_technology_review.recipe

Only contentBody__wrapper still matches, which covers the body and most of the article. contentHeader__wrapper needs to be changed, but from what I found, different articles use different header classes:

contentArticleHeader--fullBleed__intro--30Y0q
contentArticleHeader__title--rp01p
contentArticleHeader--vertical__intro--2soVS

Please help me find an easier way to do this.
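One possible approach, offered only as a sketch: all three class names reported above share the prefix contentArticleHeader, so a single case-insensitive pattern on that prefix would cover them without listing each variant or its trailing hash. The names article_header_regex and header_spec below are hypothetical, and the snippet only checks the pattern against the class strings quoted in this post. It has not been run against the live site, and it is an assumption that these classes sit on the elements you actually want to keep.

```python
import re

# Class names reported in the post. The trailing hashes (e.g. "--30Y0q")
# look build-generated, so the pattern deliberately ignores them.
observed_header_classes = [
    'contentArticleHeader--fullBleed__intro--30Y0q',
    'contentArticleHeader__title--rp01p',
    'contentArticleHeader--vertical__intro--2soVS',
]

# Hypothetical replacement for the two header regexes in the recipe:
# match any class that contains the shared "contentArticleHeader" prefix.
article_header_regex = r'^.*contentArticleHeader.*$'
header_pattern = re.compile(article_header_regex, re.IGNORECASE)

# Sketch of a keep_only_tags entry. The tag name is left out on purpose,
# because the post does not say which elements carry these classes
# (assumption: restricting to name='header' might miss some of them).
header_spec = dict(attrs={'class': header_pattern})

# Check the pattern against the observed class names and confirm it
# does not also swallow the body wrapper class.
matches = [bool(header_pattern.match(c)) for c in observed_header_classes]
body_matches = bool(header_pattern.match('contentBody__wrapper'))
print(matches, body_matches)
```

If this holds up on real pages, header_spec could stand in for the two header entries in keep_only_tags. Because the pattern is broad, it may match nested elements as well, so it is worth checking the output on a few articles of each layout.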
#2 |
creator of calibre
Posts: 45,342
Karma: 27182818
Join Date: Oct 2006
Location: Mumbai, India
Device: Various