#1 |
Guru
Posts: 615
Karma: 85520
Join Date: May 2021
Device: kindle
|
The Hindu recipe is not recognizing articles from one section:

Code:
Found section: Sci-tech & Agri https://www.thehindu.com/todays-paper/tp-features/tp-sci-tech-and-agri/
Found section: Others https://www.thehindu.com/todays-paper/tp-miscellaneous/tp-others/
Found article: Celebrations break out as farmers from Punjab, Haryana reach home https://www.thehindu.com/todays-paper/tp-miscellaneous/tp-others/celebrations-break-out-as-farmers-from-punjab-haryana-reach-home/article37936952.ece
Found article: Trinamool promises Rs. 5,000 per month for women in Goa
. . .

All other sections and articles load perfectly.
https://www.thehindu.com/archive/print/2021/12/12/
https://github.com/kovidgoyal/calibr...s/hindu.recipe

EDIT: Okay, it looks like the section link doesn't show any article links for calibre to fetch:
https://www.thehindu.com/todays-pape...tech-and-agri/
but those article links are present in the today's paper list link.

Last edited by unkn0wn; 12-12-2021 at 08:21 AM. Reason: Maybe i found the reason
|
#2 |
creator of calibre
Posts: 45,337
Karma: 27182818
Join Date: Oct 2006
Location: Mumbai, India
Device: Various
|
That's because that page is empty on the website. Go to the Sci-tech & Agri page and see for yourself.
|
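For anyone who wants to check a section page for themselves, here is a minimal sketch that counts article-like links in a saved copy of the page. It assumes, based only on the article URL in the log from post #1, that article links end in `.ece`; the names `ArticleLinkCollector` and `find_article_links` are made up for this example and are not part of calibre or the recipe.

```python
from html.parser import HTMLParser


class ArticleLinkCollector(HTMLParser):
    """Collect hrefs that look like article pages (assumed to end in '.ece')."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        # Only anchor tags can carry article links.
        if tag != "a":
            return
        href = dict(attrs).get("href") or ""
        if href.endswith(".ece"):
            self.links.append(href)


def find_article_links(html):
    """Return the article-like links found in the HTML, in document order."""
    collector = ArticleLinkCollector()
    collector.feed(html)
    return collector.links
```

If `find_article_links` returns an empty list for the HTML of a section page, the recipe has nothing to pick up from that section, which would match the log output in post #1.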
#3 |
Guru
|
Today I got this error, although the website itself is working fine. I didn't want to start a new thread!
mechanize._response.httperror_seek_wrapper: HTTP Error 403: Forbidden
|
#4 |
creator of calibre
|
That will be because the website is using some kind of bot detection. Someone with more time than I have will need to figure out what is needed to bypass it.
Code:
calibre-debug -c 'from calibre import browser; br = browser(); br.open("https://www.thehindu.com/todays-paper/")'
Traceback (most recent call last):
  File "/usr/bin/calibre-debug", line 21, in <module>
    sys.exit(main())
  File "/home/kovid/work/calibre/src/calibre/debug.py", line 272, in main
    exec(opts.command)
  File "<string>", line 1, in <module>
  File "/usr/lib/python3.10/site-packages/mechanize/_mechanize.py", line 257, in open
    return self._mech_open(url_or_request, data, timeout=timeout)
  File "/usr/lib/python3.10/site-packages/mechanize/_mechanize.py", line 313, in _mech_open
    raise response
mechanize._response.get_seek_wrapper_class.<locals>.httperror_seek_wrapper: HTTP Error 403: Forbidden
|
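One way to start narrowing down a 403 like this is to compare the status codes returned for different User-Agent headers. The sketch below uses only the Python standard library rather than calibre's browser. The thread does not say what the site actually checks, so User-Agent filtering is purely an assumption here, and the `BROWSER_UA` string and all function names are invented for illustration.

```python
import urllib.error
import urllib.request

# Hypothetical browser-like User-Agent string, chosen only for illustration.
BROWSER_UA = "Mozilla/5.0 (X11; Linux x86_64; rv:95.0) Gecko/20100101 Firefox/95.0"


def build_request(url, user_agent=None):
    """Build a GET request, overriding the User-Agent only when one is given."""
    headers = {"User-Agent": user_agent} if user_agent else {}
    return urllib.request.Request(url, headers=headers)


def fetch_status(request, timeout=30):
    """Return the HTTP status code, including error codes such as 403."""
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as err:
        return err.code


def compare_user_agents(url, fetch=fetch_status):
    """Fetch the URL with the default and a browser-like User-Agent."""
    return {
        "default": fetch(build_request(url)),
        "browser-like": fetch(build_request(url, BROWSER_UA)),
    }
```

Calling `compare_user_agents("https://www.thehindu.com/todays-paper/")` performs two real requests. If only the default one comes back as 403, the User-Agent is a likely factor; if both do, the detection is probably based on something else. The `fetch` parameter exists so the comparison can be exercised with a stub instead of the network.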
#5 |
creator of calibre
|
Actually this should take care of it: https://github.com/kovidgoyal/calibr...b687d54a36e88c
|
#6 |
Guru
|
Thank you. You are a magician..
|