10-24-2022, 08:04 AM   #1
fengli
Connoisseur
Posts: 98
Karma: 10
Join Date: Aug 2022
Device: PC
Request: a recipe for Bloomberg

Requesting a recipe for Bloomberg. Can anyone help? Thanks a lot.

10-25-2022, 03:08 AM   #2
unkn0wn
Guru
Posts: 644
Karma: 85520
Join Date: May 2021
Device: kindle
I tried this once.

All links redirect to an "Are you a robot?" captcha page, because JavaScript is disabled.

If you can find a way to stop it redirecting, the whole article can be loaded from the raw HTML.

10-25-2022, 04:12 AM   #3
kovidgoyal
creator of calibre
Posts: 45,598
Karma: 28548962
Join Date: Oct 2006
Location: Mumbai, India
Device: Various
If you don't want to follow the redirect, do this:

Code:
def get_browser(self, *a, **kw):
    br = super().get_browser(*a, **kw)
    br.set_handle_redirect(False)
    return br
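
A rough illustration of what disabling redirects means in practice: the client stops at the 3xx response and raises an HTTP error that still carries the Location header, instead of fetching the target page. The sketch below reproduces that behaviour with the Python standard library only. It is an analogue built on urllib, not mechanize or calibre code; NoRedirect, RedirectingHandler, probe and demo are names made up for this example, and the tiny local server merely stands in for a site that answers with a 307.

```python
import threading
import urllib.error
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer


class NoRedirect(urllib.request.HTTPRedirectHandler):
    """Hypothetical handler: refuse to follow any redirect."""

    def redirect_request(self, req, fp, code, msg, headers, newurl):
        # Returning None means "not handled", so urllib raises HTTPError for the 3xx
        return None


class RedirectingHandler(BaseHTTPRequestHandler):
    """Stand-in server that answers every GET with a 307 to /captcha."""

    def do_GET(self):
        self.send_response(307)
        self.send_header('Location', '/captcha')
        self.send_header('Content-Length', '0')
        self.end_headers()

    def log_message(self, *args):
        pass  # keep the demo quiet


def probe(url):
    """Return (status, Location header) without following redirects."""
    # The empty ProxyHandler ignores proxy settings from the environment
    opener = urllib.request.build_opener(urllib.request.ProxyHandler({}), NoRedirect)
    try:
        with opener.open(url, timeout=10) as resp:
            return resp.status, None
    except urllib.error.HTTPError as e:
        return e.code, e.headers.get('Location')


def demo():
    """Start the stand-in server on a free local port and probe it once."""
    server = HTTPServer(('127.0.0.1', 0), RedirectingHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    try:
        return probe('http://127.0.0.1:%d/article' % server.server_port)
    finally:
        server.shutdown()
        server.server_close()
```

Calling demo() should hand back the 307 status together with the /captcha target, which is the information a recipe needs in order to decide for itself whether to follow the redirect.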

10-25-2022, 10:21 AM   #4
unkn0wn
Guru
Posts: 644
Karma: 85520
Join Date: May 2021
Device: kindle
I had to use a VPN to test this.
Don't try it twice in a row.
Code:
Traceback (most recent call last):
  File "calibre\web\fetch\simple.py", line 275, in fetch_url
  File "mechanize\_mechanize.py", line 241, in open_novisit
  File "mechanize\_mechanize.py", line 313, in _mech_open
mechanize._response.get_seek_wrapper_class.<locals>.httperror_seek_wrapper: HTTP Error 307: s2s_high_score

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "calibre\web\fetch\simple.py", line 533, in process_links
  File "calibre\web\fetch\simple.py", line 280, in fetch_url
calibre.web.fetch.simple.FetchError: Temporary Redirect
You might face the above error.

But if it works, all articles will load.
Attached Files
File Type: recipe Bloomberg Businessweek.recipe (4.1 KB, 199 views)

Last edited by unkn0wn; 10-25-2022 at 12:43 PM.

10-25-2022, 12:15 PM   #5
unkn0wn
Guru
Posts: 644
Karma: 85520
Join Date: May 2021
Device: kindle
I found this Google RSS feed, but it needs to follow the redirect from the Google link to Bloomberg without following the one from Bloomberg to the captcha page. How can I do this?

https://news.google.com/rss/search?q...=US&ceid=US:en

10-25-2022, 02:13 PM   #6
kovidgoyal
creator of calibre
Posts: 45,598
Karma: 28548962
Join Date: Oct 2006
Location: Mumbai, India
Device: Various
You can implement get_obfuscated_article() to fetch them manually, something like this:

Code:
articles_are_obfuscated = True

def get_obfuscated_article(self, url):
    br = self.get_browser()
    try:
        br.open(url)
    except Exception as e:
        # The refused redirect is raised as an error; its Location header holds the target URL
        url = e.hdrs.get('location')
    html = br.open(url).read()
Then save the HTML to a temporary file and return the path to that file.
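
One way the whole method might be finished, under some assumptions: redirects are disabled on the browser (as in post #3), the error raised for the redirect exposes its headers as hdrs, and the page is written out with the standard library's tempfile module. fetch_one_hop and the Fake* classes are hypothetical names used only so the sketch runs on its own; calibre has its own temporary-file helpers, which a real recipe might prefer. Unlike the snippet above, this version fetches the page only once when no redirect occurs.

```python
import tempfile


def fetch_one_hop(br, url):
    """Fetch url, following at most one redirect, and return the path of a
    temporary file holding the HTML (hypothetical helper)."""
    try:
        html = br.open(url).read()
    except Exception as e:
        # Assumption: a refused redirect raises an error with an `hdrs` mapping
        hdrs = getattr(e, 'hdrs', None)
        target = hdrs.get('location') if hdrs is not None else None
        if not target:
            raise  # not a redirect, so let the caller see the real error
        # A second redirect (e.g. to a captcha page) raises here and is not followed
        html = br.open(target).read()
    with tempfile.NamedTemporaryFile(suffix='.html', delete=False) as f:
        f.write(html)
    return f.name


# Stand-ins so the sketch can be run without calibre or network access
class FakeRedirect(Exception):
    def __init__(self, location):
        super().__init__('redirect')
        self.hdrs = {'location': location}


class FakeResponse:
    def __init__(self, body):
        self._body = body

    def read(self):
        return self._body


class FakeBrowser:
    def open(self, url):
        if url == 'https://feed.example/item':
            raise FakeRedirect('https://site.example/article')
        return FakeResponse(b'<html>article</html>')
```

Inside a recipe, get_obfuscated_article(self, url) could then simply return fetch_one_hop(self.get_browser(), url).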

10-26-2022, 01:39 AM   #7
fengli
Connoisseur
Posts: 98
Karma: 10
Join Date: Aug 2022
Device: PC
Quote:
Originally Posted by unkn0wn
I had to use a VPN to test this.
Don't try it twice in a row.
Code:
Traceback (most recent call last):
  File "calibre\web\fetch\simple.py", line 275, in fetch_url
  File "mechanize\_mechanize.py", line 241, in open_novisit
  File "mechanize\_mechanize.py", line 313, in _mech_open
mechanize._response.get_seek_wrapper_class.<locals>.httperror_seek_wrapper: HTTP Error 307: s2s_high_score

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "calibre\web\fetch\simple.py", line 533, in process_links
  File "calibre\web\fetch\simple.py", line 280, in fetch_url
calibre.web.fetch.simple.FetchError: Temporary Redirect
You might face the above error.

But if it works, all articles will load.

Solutions/Sustainability
Pursuits
Last Thing

The three sections above failed to download, but the recipe is otherwise very good. Thank you very much.

10-26-2022, 02:33 AM   #8
unkn0wn
Guru
Posts: 644
Karma: 85520
Join Date: May 2021
Device: kindle
Thanks.

Code:
Traceback (most recent call last):
  File "calibre\web\fetch\simple.py", line 275, in fetch_url
  File "mechanize\_mechanize.py", line 241, in open_novisit
  File "mechanize\_mechanize.py", line 313, in _mech_open
mechanize._response.get_seek_wrapper_class.<locals>.httperror_seek_wrapper: HTTP Error 307: s2s_high_score

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "calibre\web\fetch\simple.py", line 533, in process_links
  File "calibre\web\fetch\simple.py", line 280, in fetch_url
calibre.web.fetch.simple.FetchError: Temporary Redirect
Is there anything we can do about this error? Using a VPN to change the IP address makes it work, I think.

I was able to fetch the whole recipe, but at other times not all articles load.
Attached Files
File Type: recipe Bloomberg Businessweek.recipe (4.1 KB, 159 views)
File Type: recipe Bloomberg.recipe (3.0 KB, 157 views)

Last edited by unkn0wn; 10-26-2022 at 02:37 AM.

10-26-2022, 09:55 AM   #9
kovidgoyal
creator of calibre
Posts: 45,598
Karma: 28548962
Join Date: Oct 2006
Location: Mumbai, India
Device: Various
That will be bot protection. You can add the delay field to the recipe so that it sends only one request every delay seconds. Experiment a bit and see if a delay of 1 or 2 does the trick.
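
In a recipe the setting itself is a single class attribute, for example delay = 2. What such a delay amounts to can be sketched as a small throttle that makes each request wait until at least delay seconds have passed since the previous one. This is only an illustration of the idea, not calibre's implementation; Throttle and its clock and sleep parameters are names invented here.

```python
import time


class Throttle:
    """Hypothetical helper: allow at most one call every `delay` seconds."""

    def __init__(self, delay, clock=time.monotonic, sleep=time.sleep):
        self.delay = delay
        self._clock = clock  # injectable so the behaviour can be checked without waiting
        self._sleep = sleep
        self._last = None    # time of the previous request, if any

    def wait(self):
        """Block until `delay` seconds have elapsed since the previous call."""
        now = self._clock()
        if self._last is not None:
            remaining = self.delay - (now - self._last)
            if remaining > 0:
                self._sleep(remaining)
                now = self._clock()
        self._last = now
```

A fetch loop would call throttle.wait() before each download, so trying 1, 2 or 3 seconds only means changing the delay argument.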

10-27-2022, 06:04 AM   #10
unkn0wn
Guru
Posts: 644
Karma: 85520
Join Date: May 2021
Device: kindle
I added a delay and changed some things, and was able to download both recipes completely.
Attached Files
File Type: recipe Bloomberg Businessweek.recipe (4.7 KB, 196 views)
File Type: recipe Bloomberg.recipe (3.6 KB, 171 views)

10-27-2022, 08:23 PM   #11
fengli
Connoisseur
Posts: 98
Karma: 10
Join Date: Aug 2022
Device: PC
Quote:
Originally Posted by unkn0wn
I added a delay and changed some things, and was able to download both recipes completely.

The download does not run for me:

calibre, version 6.7.1 (win32, embedded-python: True)
Bloomberg Businessweek

Bloomberg Businessweek
Conversion options changed from defaults:
verbose: 2
output_profile: 'generic_eink'
Resolved conversion options
calibre version: 6.7.1
{'asciiize': False,
'author_sort': None,
'authors': None,
'base_font_size': 0,
'book_producer': None,
'change_justification': 'original',
'chapter': None,
'chapter_mark': 'pagebreak',
'comments': None,
'cover': None,
'debug_pipeline': None,
'dehyphenate': True,
'delete_blank_paragraphs': True,
'disable_font_rescaling': False,
'dont_download_recipe': False,
'dont_split_on_page_breaks': True,
'duplicate_links_in_toc': False,
'embed_all_fonts': False,
'embed_font_family': None,
'enable_heuristics': False,
'epub_flatten': False,
'epub_inline_toc': False,
'epub_toc_at_end': False,
'epub_version': '2',
'expand_css': False,
'extra_css': None,
'extract_to': None,
'filter_css': None,
'fix_indents': True,
'flow_size': 260,
'font_size_mapping': None,
'format_scene_breaks': True,
'html_unwrap_factor': 0.4,
'input_encoding': None,
'input_profile': <calibre.customize.profiles.InputProfile object at 0x000001D2F850E7A0>,
'insert_blank_line': False,
'insert_blank_line_size': 0.5,
'insert_metadata': False,
'isbn': None,
'italicize_common_cases': True,
'keep_ligatures': False,
'language': None,
'level1_toc': None,
'level2_toc': None,
'level3_toc': None,
'line_height': 0,
'linearize_tables': False,
'lrf': False,
'margin_bottom': 5.0,
'margin_left': 5.0,
'margin_right': 5.0,
'margin_top': 5.0,
'markup_chapter_headings': True,
'max_toc_links': 50,
'minimum_line_height': 120.0,
'no_chapters_in_toc': False,
'no_default_epub_cover': False,
'no_inline_navbars': False,
'no_svg_cover': False,
'output_profile': <calibre.customize.profiles.GenericEink object at 0x000001D2F850F130>,
'page_breaks_before': None,
'prefer_metadata_cover': False,
'preserve_cover_aspect_ratio': False,
'pretty_print': True,
'pubdate': None,
'publisher': None,
'rating': None,
'read_metadata_from_opf': None,
'remove_fake_margins': True,
'remove_first_image': False,
'remove_paragraph_spacing': False,
'remove_paragraph_spacing_indent_size': 1.5,
'renumber_headings': True,
'replace_scene_breaks': '',
'search_replace': None,
'series': None,
'series_index': None,
'smarten_punctuation': False,
'sr1_replace': '',
'sr1_search': '',
'sr2_replace': '',
'sr2_search': '',
'sr3_replace': '',
'sr3_search': '',
'start_reading_at': None,
'subset_embedded_fonts': False,
'tags': None,
'test': False,
'timestamp': None,
'title': None,
'title_sort': None,
'toc_filter': None,
'toc_threshold': 6,
'toc_title': None,
'transform_css_rules': None,
'transform_html_rules': None,
'unsmarten_punctuation': False,
'unwrap_lines': True,
'use_auto_toc': False,
'verbose': 2}
InputFormatPlugin: Recipe Input running
Downloading recipe urn: custom:1002
Traceback (most recent call last):
  File "runpy.py", line 196, in _run_module_as_main
  File "runpy.py", line 86, in _run_code
  File "site.py", line 82, in <module>
  File "site.py", line 77, in main
  File "site.py", line 49, in run_entry_point
  File "calibre\utils\ipc\worker.py", line 215, in main
  File "calibre\gui2\convert\gui_conversion.py", line 31, in gui_convert_recipe
  File "calibre\gui2\convert\gui_conversion.py", line 25, in gui_convert
  File "calibre\ebooks\conversion\plumber.py", line 1108, in run
  File "calibre\customize\conversion.py", line 242, in __call__
  File "calibre\ebooks\conversion\plugins\recipe_input.py", line 138, in convert
  File "calibre\web\feeds\news.py", line 1058, in download
  File "calibre\web\feeds\news.py", line 1227, in build_index
  File "<string>", line 31, in parse_index
  File "calibre\web\feeds\news.py", line 707, in index_to_soup
  File "mechanize\_mechanize.py", line 241, in open_novisit
  File "mechanize\_mechanize.py", line 313, in _mech_open
mechanize._response.get_seek_wrapper_class.<locals>.httperror_seek_wrapper: HTTP Error 307: s2s_high_score
Using proxies: {'http': '127.0.0.1:7890', 'https': '127.0.0.1:7890', 'ftp': '127.0.0.1:7890'}

10-28-2022, 02:31 AM   #12
unkn0wn
Guru
Posts: 644
Karma: 85520
Join Date: May 2021
Device: kindle
That may be because your IP was already flagged yesterday. Open Bloomberg in a browser and complete the verification, then try again, or wait 2 or 3 days.

I was able to load both recipes one after another, yesterday and today, from the same IP.

Maybe increase the delay to 3 seconds.

Last edited by unkn0wn; 10-28-2022 at 04:30 AM.

10-28-2022, 03:35 AM   #13
Comfy.n
want to learn what I want
Posts: 1,679
Karma: 7908443
Join Date: Sep 2020
Device: none
The bberg-businessweek recipe worked for me. It's pretty cool (24 articles fetched, all images included).

The other one returns: <urlopen error [Errno 11001] getaddrinfo failed>

I used the recipes from the latest source; I'm not sure whether they're the same as those attached in post #10.

10-28-2022, 05:16 AM   #14
unkn0wn
Guru
Posts: 644
Karma: 85520
Join Date: May 2021
Device: kindle
Retry, and check your internet access (a getaddrinfo failure means the host name could not be resolved).

10-28-2022, 10:18 PM   #15
fengli
Connoisseur
Posts: 98
Karma: 10
Join Date: Aug 2022
Device: PC
Success:
Bloomberg.recipe (3.6 KB)

Failure (tried many times):
Bloomberg Businessweek.recipe (4.7 KB)