08-31-2011, 02:55 PM   #4
Starson17
Wizard
Quote:
Originally Posted by zhixiangpan
Dear Starson17:

Thank you for the help. I have checked some posts about the append_pages code, but I don't know how to write the code to fetch the hyperlink. The link in my page looks like this:

Code:
<center>
<table border="0" align="center">
<tbody>
<tr>
<td>
<a href="/GB/14562/15549575.html">
<img src="/img/next_b.gif" border="0"/>
</a>
</td>
</tr>
</tbody>
</table>
</center>
Can you help me?
Without looking closely at your page, I can't be sure, but something like this may work:
Code:
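        pager = soup.find('a')
        # pager is the <a> tag itself; guard against a missing <a> or <img>
        if pager is not None and pager.img is not None \
                and pager.img.get('src') == "/img/next_b.gif":
            nexturl = self.INDEX + pager['href']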
Find the <a> tag, see if it has an <img> tag that points to the "next" image (whatever that is), and if so, grab the href and append it to the INDEX.

If you don't know what pager is, see the various recipes that use append_page.
I hate posting code without testing it, so that part is up to you.
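
If you want to try the matching logic outside calibre first, here is a rough standalone sketch. It is not recipe code: it swaps the soup object for Python's standard html.parser so it runs on its own, and the INDEX value, the NextLinkFinder class and the find_next_url function are names made up for illustration. It checks every <a> on the page rather than only the first one, and keeps the href of the link that wraps the "next" image.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin

INDEX = "http://example.com"  # hypothetical base URL; a real recipe defines its own INDEX
NEXT_IMG = "/img/next_b.gif"  # image src taken from the quoted markup


class NextLinkFinder(HTMLParser):
    """Record the href of the first <a> that wraps the 'next' image."""

    def __init__(self):
        super().__init__()
        self.current_href = None  # href of the <a> we are currently inside, if any
        self.next_href = None     # href of the link that contains NEXT_IMG

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "a":
            self.current_href = attrs.get("href")
        elif tag == "img" and attrs.get("src") == NEXT_IMG:
            # Only keep the first match, and only if we are inside a link.
            if self.current_href and self.next_href is None:
                self.next_href = self.current_href

    def handle_endtag(self, tag):
        if tag == "a":
            self.current_href = None


def find_next_url(html, index=INDEX):
    """Return the absolute URL of the 'next' link, or None if there is none."""
    parser = NextLinkFinder()
    parser.feed(html)
    if parser.next_href is None:
        return None
    return urljoin(index, parser.next_href)


if __name__ == "__main__":
    sample = ('<td><a href="/GB/14562/15549575.html">'
              '<img src="/img/next_b.gif" border="0"/></a></td>')
    print(find_next_url(sample))
```

The sketch uses urljoin instead of plain string concatenation so that a trailing slash on INDEX or a relative href does not produce a malformed URL. Inside a recipe you would keep working with the soup object and only borrow the idea of looping over all the links.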