11-21-2014, 12:46 PM   #3518
JimmXinu
Plugin Developer
Quote:
Originally Posted by cryzed
Did you try explicitly specifying the parser for the BeautifulSoup instance?
Yep.

Quote:
Originally Posted by cryzed
And if I remember correctly, the error occurred in the BaseAdapter.utf8FromSoup method. Is the BeautifulSoup instance that is passed to it really a BeautifulSoup 3 or BeautifulSoup 4 instance?
Yeah, I modified the adapter to use bs4 and BaseAdapter.utf8FromSoup to accept either.

The error comes from the part of utf8FromSoup that does a findAll over every tag to strip off extra attributes. If I bypass that step it works, so a more forgiving way of spinning through the tags may work. The improperly nested tags are what confuse it.
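
For illustration only, here's a rough sketch of what a "more forgiving" pass could look like. This is not the FanFicFare code and not what utf8FromSoup does: it swaps the findAll-over-a-tree approach for an event-based pass with Python 3's standard-library html.parser, which never builds a tree and so has nothing to trip over when tags are nested wrong. The names AttrStripper, strip_attrs and the KEEP_ATTRS whitelist are made up for this example. Note that it only drops attributes; it does not repair the nesting, and it discards comments and doctype declarations.

```python
from html import escape
from html.parser import HTMLParser

# Hypothetical whitelist: the attributes the real adapter keeps are not
# listed in this post, so these four are an assumption.
KEEP_ATTRS = {"href", "src", "alt", "title"}


class AttrStripper(HTMLParser):
    """Re-emit HTML event by event, dropping attributes not in KEEP_ATTRS.

    No tree is built, so improperly nested tags pass through unchanged.
    Comments and doctype declarations are not re-emitted.
    """

    def __init__(self):
        # convert_charrefs=False keeps entities in text as written, so
        # they can be echoed back verbatim by the two handlers below.
        super().__init__(convert_charrefs=False)
        self.out = []

    def _open_tag(self, tag, attrs, closer):
        # Attribute values arrive unescaped, so escape them again on output.
        kept = "".join(
            ' %s="%s"' % (name, escape(value or "", quote=True))
            for name, value in attrs
            if name in KEEP_ATTRS
        )
        self.out.append("<%s%s%s>" % (tag, kept, closer))

    def handle_starttag(self, tag, attrs):
        self._open_tag(tag, attrs, "")

    def handle_startendtag(self, tag, attrs):
        self._open_tag(tag, attrs, " /")

    def handle_endtag(self, tag):
        self.out.append("</%s>" % tag)

    def handle_data(self, data):
        self.out.append(data)

    def handle_entityref(self, name):
        self.out.append("&%s;" % name)

    def handle_charref(self, name):
        self.out.append("&#%s;" % name)


def strip_attrs(html):
    """Return html with every attribute outside KEEP_ATTRS removed."""
    parser = AttrStripper()
    parser.feed(html)
    parser.close()
    return "".join(parser.out)
```

Something like strip_attrs('<b><i class="x">hi</b></i>') hands back the same badly nested markup minus the class attribute, so whatever parses it afterwards still has to cope with the nesting itself.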

Quote:
Originally Posted by cryzed
... is it possible that the raw HTML modifications the adapter does in places shortly beforehand are at fault?
That's a good question--I hadn't checked that. But no, skipping those didn't help.

BTW, I did consult your code from the package-magic branch and I'm using part of it, thanks for that.