Some of the information in my articles is truncated, and the full plot summary sits behind a kind of "See more..." link. To extract it, I need to request a second page beyond the article itself. What would be the recommended way to do this?
I thought about setting a recursion level in the recipe and then detecting which page I am on: extract the article if it is an article page, and extract the plot summary if it is a plot summary page. It seems I would have to keep one dictionary for the articles and another for the plot summaries so I could match them up afterwards. That seems like a lot of work, and besides, I am hoping for something more general purpose (what if the other URL were not linked from the original article, preventing me from using recursion?).
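To show the bookkeeping I mean, here is a rough sketch in plain Python, not tied to the recipe API. The "/plotsummary" URL suffix and the names page_kind, article_key, record_page and merged are all made up for illustration; a real site would need its own rules.

```python
# Hypothetical convention: the plot summary page is the article URL
# followed by "/plotsummary". A real site will differ.
PLOT_SUFFIX = "/plotsummary"

articles = {}  # article key -> article text
plots = {}     # article key -> plot summary text


def page_kind(url):
    """Classify a URL as a plot summary page or an article page."""
    return "plot" if url.rstrip("/").endswith(PLOT_SUFFIX) else "article"


def article_key(url):
    """Map both page types to one shared key (the article URL)."""
    url = url.rstrip("/")
    if url.endswith(PLOT_SUFFIX):
        url = url[:-len(PLOT_SUFFIX)]
    return url


def record_page(url, text):
    """Store the extracted text in the dictionary for its page type."""
    target = plots if page_kind(url) == "plot" else articles
    target[article_key(url)] = text


def merged():
    """Return each article with its plot summary appended, if one was seen."""
    return {
        key: body + "\n\n" + plots[key] if key in plots else body
        for key, body in articles.items()
    }
```

Even in this small form, every page has to be classified and matched back to its article, which is the overhead I would like to avoid.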
I also thought about putting in "import requests", fetching the page with that module, and feeding the result into my own BeautifulSoup instance. That would be a lot more straightforward than the recursion approach, but I found that the requests module is not built in.
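Since requests is not available, the same idea could perhaps be done with only the Python standard library. This is just a sketch under assumptions: it is written for Python 3, the element id "plot-summary" and the names TextById, extract_text and fetch_plot_summary are hypothetical, and it uses html.parser in place of BeautifulSoup.

```python
from html.parser import HTMLParser
from urllib.request import urlopen

# Elements that never have a closing tag, so they must not affect nesting depth.
VOID_TAGS = {"area", "base", "br", "col", "embed", "hr", "img", "input",
             "link", "meta", "source", "track", "wbr"}


class TextById(HTMLParser):
    """Collect the text inside the first element with a given id."""

    def __init__(self, wanted_id):
        super().__init__()
        self.wanted_id = wanted_id
        self.depth = 0      # nesting depth inside the wanted element
        self.done = False   # set once the wanted element has closed
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if self.done or tag in VOID_TAGS:
            return
        if self.depth:
            self.depth += 1
        elif dict(attrs).get("id") == self.wanted_id:
            self.depth = 1

    def handle_endtag(self, tag):
        if tag in VOID_TAGS or not self.depth:
            return
        self.depth -= 1
        if self.depth == 0:
            self.done = True

    def handle_data(self, data):
        if self.depth:
            self.chunks.append(data)


def extract_text(html, element_id):
    """Return the whitespace-normalized text of the element with element_id."""
    parser = TextById(element_id)
    parser.feed(html)
    parser.close()
    return " ".join("".join(parser.chunks).split())


def fetch_plot_summary(url, element_id="plot-summary"):
    """Download the extra page and pull the summary text out of it."""
    with urlopen(url, timeout=30) as response:
        html = response.read().decode("utf-8", errors="replace")
    return extract_text(html, element_id)
```

Something like fetch_plot_summary(summary_url) could then be called while processing each article, which would not depend on the summary page being reachable through recursion.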
Last edited by ireadtheinternet; 10-31-2014 at 07:21 AM.
Reason: correction, clarity