09-01-2010, 07:45 PM   #2589
TonytheBookworm
Addict
Posts: 264
Karma: 62
Join Date: May 2010
Device: kindle 2, kindle 3, Kindle fire
Quote:
Originally Posted by Starson17
I'll leave you to play with that. I'm sure a closer look at your code and the page you're scraping would let me make better comments, but I'm short on time today. Good Luck!
I did the "parent" thing that you mentioned and it worked. I still have a couple of issues that are probably simple fixes (I hope), but I can't grasp what is happening even after looking at the output log.

Issues I'm having:
1) For whatever reason I always get one article that is a full run of the whole index page. I'm not sure why, unless the code searches for artIntroShort and then for the <a> tags inside it and doesn't find any (the webmaster isn't consistent). My guess is that link['href'] somehow ends up being None, so the url ends up being just the index page, though I can't find where that happens in my output log.
2) This is the one that puzzles me the most. The person who asked for help with this recipe ran into a similar problem with the XML feed (that is why I didn't use the feed; I was trying this method to get the thumbnails). For some reason the thumbnails don't come through. I looked in Firebug and they appear to be wrapped inside the mainContent tag. I even tried commenting out keep_only_tags and got the same results.
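For issue 1, here is a rough sketch of the kind of guard that might keep the index page from slipping in as an article. It is not the code from the attached recipe: the names article_url and collect_articles and the INDEX_URL value are made up for illustration, and it assumes each candidate is a (title, link) pair where link is either None (no <a> was found inside the artIntroShort block) or something with a .get() method, which is how BeautifulSoup tags behave. Using .get('href') instead of link['href'] means a missing attribute comes back as None rather than raising an error.

```python
from urllib.parse import urljoin

# Hypothetical placeholder; the real index URL is whatever the recipe scrapes.
INDEX_URL = 'http://www.example.com/index.html'


def article_url(link, index_url=INDEX_URL):
    """Return an absolute article URL, or None if the link is unusable."""
    if link is None:
        # No <a> tag was found in this block.
        return None
    href = link.get('href')
    if not href:
        # The <a> tag exists but has no href (or an empty one).
        return None
    url = urljoin(index_url, href)
    if url == index_url:
        # The link points back at the index itself, so skip it.
        return None
    return url


def collect_articles(candidates, index_url=INDEX_URL):
    """Build a list of article dicts, skipping entries with no usable link."""
    articles = []
    for title, link in candidates:
        url = article_url(link, index_url)
        if url is None:
            continue
        articles.append({'title': title, 'url': url,
                         'description': '', 'date': ''})
    return articles
```

If something like this were dropped into parse_index, the blocks where the webmaster left out the link would simply be skipped instead of producing an article that points at the index. It does nothing for the thumbnail problem in issue 2.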

Anyway, whenever you get some free time, have a look at this if you don't mind. Thanks!

Attached: Code that gets articles but has issues
Attached Files
File Type: rar gtest.rar (1,010 Bytes, 259 views)