|  09-13-2023, 08:47 AM | #1 | 
| Connoisseur            Posts: 90 Karma: 50742 Join Date: Jan 2011 Device: PW5 | 
				
				Support for HTTP 308 redirects
			 
			
			The recipe fails when the URL responds with an HTTP 308. Sample recipe below.
Code:
from calibre.web.feeds.news import BasicNewsRecipe
class Http308RedirectRecipe(BasicNewsRecipe):
    title = "Http 308 Redirect Recipe"
    language = "en"
    def parse_index(self):
        return [
            (
                "Example",
                [
                    {
                        "url": "https://www.wsj.com/articles/mccarthy-biden-impeachment-inquiry-b9cc6530",
                        "title": "This url responds with a HTTP 308 redirect",
                    }
                ],
            ),
        ]
Code:
Fetching https://www.wsj.com/articles/mccarthy-biden-impeachment-inquiry-b9cc6530
Could not fetch link https://www.wsj.com/articles/mccarthy-biden-impeachment-inquiry-b9cc6530
Traceback (most recent call last):
  File "calibre/web/fetch/simple.py", line 278, in fetch_url
  File "mechanize/_mechanize.py", line 241, in open_novisit
  File "mechanize/_mechanize.py", line 313, in _mech_open
mechanize._response.get_seek_wrapper_class.<locals>.httperror_seek_wrapper: HTTP Error 308: Permanent Redirect
Code:
$ curl -A 'Mozilla/5.0' -Ii 'https://www.wsj.com/articles/mccarthy-biden-impeachment-inquiry-b9cc6530'
HTTP/2 308
content-length: 0
location: https://www.wsj.com/politics/mccarthy-biden-impeachment-inquiry-b9cc6530
date: Wed, 13 Sep 2023 12:46:09 GMT
x-proxy-cache: BYPASS
x-cache: Miss from cloudfront
via: 1.1 53b2bbb13e5db590d598ee4e9aa9bd80.cloudfront.net (CloudFront)
x-amz-cf-pop: HKG62-C2
x-amz-cf-id: mtOFJ7gK44Weycg7uPApuysAAakLzPg2kojmOCpiNpVi_Yk1bP7qZg== | 
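For readers unfamiliar with what "following" the 308 in the curl output involves, here is a minimal sketch of the step a client has to perform: read the `location` header and resolve it against the requested URL. The helper name `redirect_target` and the `REDIRECT_CODES` set are hypothetical and only for illustration; this is not how mechanize or calibre implement it.

```python
from urllib.parse import urljoin

# Status codes whose Location header a client is normally expected to follow.
REDIRECT_CODES = {301, 302, 303, 307, 308}


def redirect_target(request_url, status, headers):
    """Return the absolute URL to retry, or None if there is nothing to follow.

    Hypothetical helper for illustration only.
    """
    if status not in REDIRECT_CODES:
        return None
    # Header names are case-insensitive; HTTP/2 sends them in lowercase.
    lowered = {name.lower(): value for name, value in headers.items()}
    location = lowered.get("location")
    if not location:
        return None
    # Location may be relative, so resolve it against the request URL.
    return urljoin(request_url, location)


if __name__ == "__main__":
    print(redirect_target(
        "https://www.wsj.com/articles/mccarthy-biden-impeachment-inquiry-b9cc6530",
        308,
        {"location": "https://www.wsj.com/politics/mccarthy-biden-impeachment-inquiry-b9cc6530"},
    ))
```

A client that does not list 308 among the codes it follows will instead surface the response as an error, which matches the traceback above.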
|   |   | 
|  09-13-2023, 08:55 AM | #2 | 
| creator of calibre            Posts: 45,604 Karma: 28548974 Join Date: Oct 2006 Location: Mumbai, India Device: Various | 
			
			Your recipe worked fine for me in current calibre, and the recipe system most definitely handles redirects. From the use of curl I am guessing you are on Linux. Don't use whatever calibre package your distro ships; use the official binaries.
		 | 
|   |   | 
|  09-13-2023, 09:15 AM | #3 | 
| Connoisseur            Posts: 90 Karma: 50742 Join Date: Jan 2011 Device: PW5 | 
			
			Apologies for cutting too much of the log. I'm running the official 6.26.0 on macOS. I dug around, and it looks like the cause is that calibre is pinned to mechanize v0.4.7, while support for HTTP 308 redirects is available only from v0.4.8 (commit).
Code:
$ ebook-convert 'http308.recipe' .epub --test --debug-pipeline debug -vv
Conversion options changed from defaults:
  test: (2, 2)
  debug_pipeline: 'debug'
  verbose: 2
Resolved conversion options
calibre version: 6.26.0
{'asciiize': False,
 'author_sort': None,
 'authors': None,
 'base_font_size': 0,
 'book_producer': None,
 'change_justification': 'original',
 'chapter': None,
 'chapter_mark': 'pagebreak',
 'comments': None,
 'cover': None,
 'debug_pipeline': 'debug',
 'dehyphenate': True,
 'delete_blank_paragraphs': True,
 'disable_font_rescaling': False,
 'dont_download_recipe': False,
 'dont_split_on_page_breaks': True,
 'duplicate_links_in_toc': False,
 'embed_all_fonts': False,
 'embed_font_family': None,
 'enable_heuristics': False,
 'epub_flatten': False,
 'epub_inline_toc': False,
 'epub_max_image_size': 'none',
 'epub_toc_at_end': False,
 'epub_version': '2',
 'expand_css': False,
 'extra_css': None,
 'extract_to': None,
 'filter_css': None,
 'fix_indents': True,
 'flow_size': 260,
 'font_size_mapping': None,
 'format_scene_breaks': True,
 'html_unwrap_factor': 0.4,
 'input_encoding': None,
 'input_profile': <calibre.customize.profiles.InputProfile object at 0x118f1fbe0>,
 'insert_blank_line': False,
 'insert_blank_line_size': 0.5,
 'insert_metadata': False,
 'isbn': None,
 'italicize_common_cases': True,
 'keep_ligatures': False,
 'language': None,
 'level1_toc': None,
 'level2_toc': None,
 'level3_toc': None,
 'line_height': 0,
 'linearize_tables': False,
 'lrf': False,
 'margin_bottom': 5.0,
 'margin_left': 5.0,
 'margin_right': 5.0,
 'margin_top': 5.0,
 'markup_chapter_headings': True,
 'max_toc_links': 50,
 'minimum_line_height': 120.0,
 'no_chapters_in_toc': False,
 'no_default_epub_cover': False,
 'no_inline_navbars': False,
 'no_svg_cover': False,
 'output_profile': <calibre.customize.profiles.OutputProfile object at 0x118f1ead0>,
 'page_breaks_before': None,
 'prefer_metadata_cover': False,
 'preserve_cover_aspect_ratio': False,
 'pretty_print': True,
 'pubdate': None,
 'publisher': None,
 'rating': None,
 'read_metadata_from_opf': None,
 'remove_fake_margins': True,
 'remove_first_image': False,
 'remove_paragraph_spacing': False,
 'remove_paragraph_spacing_indent_size': 1.5,
 'renumber_headings': True,
 'replace_scene_breaks': '',
 'search_replace': None,
 'series': None,
 'series_index': None,
 'smarten_punctuation': False,
 'sr1_replace': '',
 'sr1_search': '',
 'sr2_replace': '',
 'sr2_search': '',
 'sr3_replace': '',
 'sr3_search': '',
 'start_reading_at': None,
 'subset_embedded_fonts': False,
 'tags': None,
 'test': (2, 2),
 'timestamp': None,
 'title': None,
 'title_sort': None,
 'toc_filter': None,
 'toc_threshold': 6,
 'toc_title': None,
 'transform_css_rules': None,
 'transform_html_rules': None,
 'unsmarten_punctuation': False,
 'unwrap_lines': True,
 'use_auto_toc': False,
 'verbose': 2}
1% Converting input to HTML...
InputFormatPlugin: Recipe Input running
Using custom recipe
Using user agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/87.0.4280.88 Safari/537.36
1% Fetching feeds...
1% Got feeds from index page
1% Trying to download cover...
1% Generating masthead...
Synthesizing mastheadImage
1% Starting download [4 threads]...
Fetching https://www.wsj.com/articles/mccarthy-biden-impeachment-inquiry-b9cc6530
Could not fetch link https://www.wsj.com/articles/mccarthy-biden-impeachment-inquiry-b9cc6530
Traceback (most recent call last):
  File "calibre/web/fetch/simple.py", line 278, in fetch_url
  File "mechanize/_mechanize.py", line 241, in open_novisit
  File "mechanize/_mechanize.py", line 313, in _mech_open
mechanize._response.get_seek_wrapper_class.<locals>.httperror_seek_wrapper: HTTP Error 308: Permanent Redirect
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
  File "calibre/web/fetch/simple.py", line 536, in process_links
  File "calibre/web/fetch/simple.py", line 283, in fetch_url
calibre.web.fetch.simple.FetchError: Permanent Redirect
https://www.wsj.com/articles/mccarthy-biden-impeachment-inquiry-b9cc6530 saved to 
Failed to download article: HTTP 308 Direct from https://www.wsj.com/articles/mccarthy-biden-impeachment-inquiry-b9cc6530
Traceback (most recent call last):
  File "calibre/utils/threadpool.py", line 99, in run
  File "calibre/web/feeds/news.py", line 1195, in fetch_article
  File "calibre/web/feeds/news.py", line 1190, in _fetch_article
Exception: Could not fetch article. The debug traceback is available earlier in this log | 
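The version argument in the post above can be sketched as a simple comparison. This assumes, as the post states, that 308 support first appears in mechanize v0.4.8; the helper names `parse_version` and `has_308_support` are hypothetical, and the version string is passed in by hand rather than read from mechanize itself.

```python
# Per the post above: 308 handling is said to arrive in mechanize v0.4.8.
FIRST_VERSION_WITH_308 = (0, 4, 8)


def parse_version(text):
    """Turn '0.4.7' or 'v0.4.7' into a tuple of ints such as (0, 4, 7)."""
    return tuple(int(part) for part in text.lstrip("v").split("."))


def has_308_support(mechanize_version):
    """Hypothetical check: is this version at or past the first one with 308 support?"""
    return parse_version(mechanize_version) >= FIRST_VERSION_WITH_308


if __name__ == "__main__":
    for version in ("0.4.7", "v0.4.8"):
        print(version, has_308_support(version))
```

Comparing tuples of integers rather than raw strings avoids mis-ordering versions such as 0.4.10 and 0.4.8.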
|   |   | 
|  09-13-2023, 09:19 AM | #4 | 
| creator of calibre            Posts: 45,604 Karma: 28548974 Join Date: Oct 2006 Location: Mumbai, India Device: Various | |
|   |   | 
|  09-15-2023, 11:38 AM | #5 | 
| Connoisseur  Posts: 51 Karma: 10 Join Date: Oct 2018 Device: kindle | 
			
			I'm having the same issue. I use the latest calibre binary (6.26.0) on Linux but still get the same error, "HTTP Error 308: Permanent Redirect", when converting the WSJ recipe.
		 | 
|   |   | 
|  09-15-2023, 12:13 PM | #6 | 
| creator of calibre            Posts: 45,604 Karma: 28548974 Join Date: Oct 2006 Location: Mumbai, India Device: Various | 
			
			Yes, why wouldn't you, since the fix has not yet been released.
		 | 
|   |   | 
|  09-16-2023, 02:22 AM | #7 | 
| Guru            Posts: 644 Karma: 85520 Join Date: May 2021 Device: kindle | 
			
			Looks like WSJ articles won't be loading text even with the redirect fix. The AMP versions of the links have stopped loading content. Maybe we are back to trying this: https://www.mobileread.com/forums/sh...4&postcount=17 EDIT: Oh wait, is it going to start working from the next update, after the redirect is fixed? https://github.com/unkn0w7n/calibre/...f7c539a973af9f Last edited by unkn0wn; 09-16-2023 at 02:59 AM. | 
|   |   | 
|  09-16-2023, 03:29 AM | #8 | 
| creator of calibre            Posts: 45,604 Karma: 28548974 Join Date: Oct 2006 Location: Mumbai, India Device: Various | 
			
			I am still getting some content in WSJ free. I don't subscribe, so I can't try the main recipe. See https://github.com/kovidgoyal/calibr...5305f9a1b4e721
		 | 
|   |   | 
|  09-16-2023, 03:30 AM | #9 | |
| creator of calibre            Posts: 45,604 Karma: 28548974 Join Date: Oct 2006 Location: Mumbai, India Device: Various | Quote: 
 | |
|   |   | 
|  09-17-2023, 06:08 PM | #10 | 
| Junior Member  Posts: 5 Karma: 10 Join Date: Feb 2017 Device: Kindle Voyage | |
|   |   | 
|  09-17-2023, 09:49 PM | #11 | 
| creator of calibre            Posts: 45,604 Karma: 28548974 Join Date: Oct 2006 Location: Mumbai, India Device: Various | 
			
			No, no one has sent me the creds; PM them to me.
		 | 
|   |   | 
|  09-18-2023, 01:46 AM | #12 | 
| Connoisseur  Posts: 51 Karma: 10 Join Date: Oct 2018 Device: kindle | |
|   |   | 
|  09-18-2023, 02:39 AM | #13 | 
| creator of calibre            Posts: 45,604 Karma: 28548974 Join Date: Oct 2006 Location: Mumbai, India Device: Various | 
			
			I tested it, and sadly only the headlines, hero image and first paragraph are present. As I said before, the rest of the content is transmitted encrypted and decrypted on the client. Looks like the AMP loophole @unknown found no longer works.
		 | 
|   |   | 