Old 11-28-2020, 10:01 PM   #4906
JimmXinu
Plugin Developer
Posts: 7,024
Karma: 4604635
Join Date: Dec 2011
Location: Midwest USA
Device: Kobo Clara Colour running KOReader
Quote:
Originally Posted by AndersW
'The Vampire’s Templar' by TypeAxiom

I will also PM you the link. And stay safe out there.
In chapter 63, someone nested an excessive number of spoilers, which resulted in more than 512 nested tags (at least, that's what Firefox reports).

html5lib, basically by necessity, works by recursion, and that page fails to parse with 'RecursionError: maximum recursion depth exceeded'.
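For anyone who wants to check whether another chapter has the same problem, here is a rough sketch that counts tag nesting without recursing. It uses the standard library's html.parser rather than html5lib, so it is not what FanFicFare does internally; the names DepthCounter and max_nesting_depth are made up for this example, and the count is only approximate because tags left unclosed in the source (such as a bare <p> or <li>) are counted as still open.

```python
from html.parser import HTMLParser

# Elements that never take a closing tag, so they must not count as nesting.
VOID_TAGS = {"area", "base", "br", "col", "embed", "hr", "img", "input",
             "link", "meta", "source", "track", "wbr"}


class DepthCounter(HTMLParser):
    """Hypothetical helper: track the deepest open-tag nesting seen."""

    def __init__(self):
        super().__init__()
        self.depth = 0
        self.max_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in VOID_TAGS:
            return
        self.depth += 1
        self.max_depth = max(self.max_depth, self.depth)

    def handle_endtag(self, tag):
        if tag not in VOID_TAGS and self.depth > 0:
            self.depth -= 1


def max_nesting_depth(html):
    """Return the approximate maximum tag nesting depth of an HTML string."""
    counter = DepthCounter()
    counter.feed(html)
    counter.close()
    return counter.max_depth
```

Feeding it the saved chapter HTML and comparing the result against a few hundred should be enough to tell whether a page is likely to hit the same error.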

I hesitate to add code that aggressively discards all the comments before parsing the HTML, because I didn't write this adapter and don't read the site, so I'm not sure where the notes, news, etc. appear on the page when they are present.

You can download the rest of that story in one of three ways: use chapter ranges to download 1-62 and 64- as separate parts, turn on continue_on_chapter_error, or exclude the chapter with ignore_chapter_url_list:

Code:
## continue on and just report the error in the chapter text.
[https://www.scribblehub.com/series/137441/the-vampires-templar/]
continue_on_chapter_error:true

## pretend chapter 63 doesn't exist.
[https://www.scribblehub.com/series/137441/the-vampires-templar/]
ignore_chapter_url_list:
 https://www.scribblehub.com/read/137441-the-vampires-templar/chapter/157336/
If you can show other examples of the same thing, I will revisit it, or perhaps one of the other devs who have worked on adapter_scribblehubcom.py will care to look at it. I suspect simply discarding everything after '</main>' would work.
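To make that last idea concrete, here is an untested-against-the-site sketch of what "discard everything after '</main>'" could look like before the HTML is handed to the parser. The function name truncate_after_main is hypothetical and this is not code from adapter_scribblehubcom.py; it also assumes the deeply nested comments really do sit after the first '</main>', which I have not confirmed.

```python
import re

# Matches the closing main tag, tolerating case and stray whitespace.
MAIN_CLOSE = re.compile(r"</main\s*>", re.IGNORECASE)


def truncate_after_main(html):
    """Hypothetical helper: keep everything up to and including the first
    '</main>' and drop the rest. Returns the input unchanged if the page
    has no '</main>' at all."""
    match = MAIN_CLOSE.search(html)
    if match is None:
        return html
    return html[:match.end()]
```

The truncated string no longer closes its outer tags, but an HTML5 parser is expected to close open elements implicitly at end of input, so that should not matter for extracting the chapter text.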