MobileRead Forums - View Single Post - Custom recipes (archive, read-only)

Flexicat · 09-17-2010, 06:09 PM

Hello. Can someone give me some assistance in creating a recipe for a site that does not have an RSS feed?

The base URL is "http://archiveofourown.org/tags/Sherlock%20(TV)/works", but the actual story titles seem to be embedded in HTML that looks like this on the page:

Code:

  <!--title, author, fandom-->
  <div class="header module">
    <h4 title="title">
      <a href="/works/117685">Disorder</a>
      by
      <!-- do not cache -->
    </h4>

As a result, I cannot figure out how to extract the article ID number (117685 in the snippet above) for use in the recipe. I am guessing that I will have to parse the page's HTML, but I have never done that kind of extraction before, and I am not familiar with Python or Beautiful Soup.

Thanks.
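
For anyone landing on this question, here is a minimal sketch of the extraction step being asked about. It is not a calibre recipe and not necessarily how the thread resolved it; it uses only Python's standard library so it can be run on its own. It assumes that every story link on the listing page has the form /works/<number>, as in the snippet above. The names WorkLinkParser, extract_works and SAMPLE_HTML are made up for illustration, and the closing </div> in the sample was added to make the fragment complete.

```python
import re
from html.parser import HTMLParser
from urllib.parse import urljoin

BASE_URL = "http://archiveofourown.org"  # site root, taken from the post

# Markup copied from the post; the closing </div> is an addition.
SAMPLE_HTML = """
<!--title, author, fandom-->
<div class="header module">
  <h4 title="title">
    <a href="/works/117685">Disorder</a>
    by
    <!-- do not cache -->
  </h4>
</div>
"""

# Assumption: story links look exactly like /works/<digits>.
WORK_HREF = re.compile(r"^/works/(\d+)$")


class WorkLinkParser(HTMLParser):
    """Collect the ID, title and absolute URL of every /works/<digits> link."""

    def __init__(self):
        super().__init__()
        self.works = []
        self._current_id = None   # set while inside a matching <a> tag
        self._title_parts = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        href = dict(attrs).get("href") or ""
        match = WORK_HREF.match(href)
        if match:
            self._current_id = match.group(1)
            self._title_parts = []

    def handle_data(self, data):
        # Text between <a ...> and </a> is the story title.
        if self._current_id is not None:
            self._title_parts.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._current_id is not None:
            self.works.append({
                "id": self._current_id,
                "title": "".join(self._title_parts).strip(),
                "url": urljoin(BASE_URL, "/works/" + self._current_id),
            })
            self._current_id = None


def extract_works(html):
    """Return a list of dicts with 'id', 'title' and 'url' keys."""
    parser = WorkLinkParser()
    parser.feed(html)
    return parser.works


if __name__ == "__main__":
    for work in extract_works(SAMPLE_HTML):
        print(work["id"], work["title"], work["url"])
```

In a calibre recipe, logic like this would most likely go into a parse_index method on a BasicNewsRecipe subclass, which returns a list of (feed title, articles) pairs where each article is a dict with at least 'title' and 'url'. The page itself can be fetched there with self.index_to_soup(url), in which case the links could be found through the returned soup object instead of HTMLParser.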
