Old 08-15-2012, 09:48 AM   #8
lrui
Enthusiast
Posts: 49
Karma: 475062
Join Date: Aug 2012
Device: nook simple touch
Quote:
Originally Posted by Steven630 View Post
Thanks. Yet again, it failed.

After I started downloading, nothing indicated that Calibre had found "span" or "div" etc. I suspect this method won't work however hard we try. That is, it's not two class="a1" or other mistakes that led to the failure, but the method in the first place. (Yes, there are two class="a1", but what counts when you use find... in beautifulsoup is the first one. So the second class="a1" would be ignored when the first one is found.) And in theory at least, your method to find "span" and so on should work, but didn't. What do you think?

As for match_regexps, that didn't work either, although I'm not sure if simply adding "match_regexps" and "recursion" to the recipe is enough. Wait, seems that match_regexps is not for multi-page articles in the first place...

pagenum = soup.findAll('span')

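Change soup.find to soup.findAll, as in the line above. soup.find returns only the first matching tag, while soup.findAll returns a list of every match.

Can you try it again?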
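Here is a rough sketch of the difference, in case it helps. The HTML snippet is made up for illustration (it is not taken from the site you are scraping), and it assumes the standalone bs4 package; calibre recipes use calibre's bundled BeautifulSoup, so the import line in a real recipe will differ.

```python
# Assumption: the standalone bs4 package is installed. Inside a calibre
# recipe the soup object is handed to you, so no import is needed there.
from bs4 import BeautifulSoup

# Hypothetical markup with two tags sharing class="a1", like the page discussed.
html = '<div><span class="a1">1</span><span class="a1">2</span></div>'
soup = BeautifulSoup(html, "html.parser")

# find returns only the first matching tag (or None if nothing matches).
first = soup.find('span')

# findAll returns a list of every matching tag, so the second span is kept.
pagenum = soup.findAll('span')

# Collect the text of each span to see what was actually found.
texts = [span.get_text() for span in pagenum]
print(first.get_text())
print(texts)
```

If pagenum comes back empty on the real page, the spans are probably not in the HTML that calibre downloaded, which would point to a problem other than find versus findAll.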
