Hello. Could someone help me create a recipe for a site that does not have an RSS feed?
The base URL is "http://archiveofourown.org/tags/Sherlock%20(TV)/works", but the actual story titles appear inside HTML on the page that looks like this:
Code:
<!--title, author, fandom-->
<div class="header module">
<h4 title="title">
<a href="/works/117685">Disorder</a>
by
<!-- do not cache -->
</h4>
I cannot figure out how to extract the article ID number (117685 in the link above) for use in the recipe. I am guessing that I will have to parse the page's HTML, but I have never done that kind of extraction before, and I am not familiar with Python or Beautiful Soup.
Thanks.
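For reference, here is a rough sketch of the kind of extraction described above. It uses only Python's standard library (html.parser and re) instead of Beautiful Soup, and it assumes that every story link on the page has the same href="/works/<number>" form as the excerpt. The names WorkLinkParser, extract_works, and SAMPLE_HTML are made up for illustration and are not part of any recipe API.

```python
import re
from html.parser import HTMLParser

# Sample markup copied from the page excerpt in the post above.
SAMPLE_HTML = """
<!--title, author, fandom-->
<div class="header module">
<h4 title="title">
<a href="/works/117685">Disorder</a>
by
<!-- do not cache -->
</h4>
"""

# Assumption: story links always look like /works/<digits>.
WORK_HREF = re.compile(r"^/works/(\d+)$")


class WorkLinkParser(HTMLParser):
    """Collects (work_id, title) pairs from <a href="/works/N"> links."""

    def __init__(self):
        super().__init__()
        self.works = []          # finished (work_id, title) pairs
        self._current_id = None  # ID of the link being read, if any
        self._text = []          # text pieces seen inside that link

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        href = dict(attrs).get("href") or ""
        match = WORK_HREF.match(href)
        if match:
            # Remember the numeric ID and start collecting the link text.
            self._current_id = match.group(1)
            self._text = []

    def handle_data(self, data):
        if self._current_id is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._current_id is not None:
            title = "".join(self._text).strip()
            self.works.append((self._current_id, title))
            self._current_id = None


def extract_works(html):
    """Return a list of (work_id, title) tuples found in the HTML string."""
    parser = WorkLinkParser()
    parser.feed(html)
    parser.close()
    return parser.works


if __name__ == "__main__":
    for work_id, title in extract_works(SAMPLE_HTML):
        print(work_id, title)
```

On the excerpt above, this prints "117685 Disorder". The same idea should carry over to Beautiful Soup: find the a tags whose href starts with /works/ and take the number from the end of the href.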