Old 04-16-2011, 05:32 AM   #7
DarkElf
Junior Member
Posts: 6
Karma: 10
Join Date: Apr 2011
Device: Kindle 3
Quote:
I think I understood it the first time. (If I didn't, then I still don't understand). Why do you think images worked with those two recipes? Do they not use relative links for images, or is there something different about your site? I'm just curious to know the answer to this, even if it doesn't help you.
Perhaps (I haven't checked) those recipes use absolute links for their images, while my site uses relative links. Anyway, I don't know whether images actually worked with those recipes...

Quote:
I understand you'd like to know how to change the internal base url so that relative urls for images work correctly after the ad page is skipped. I don't know the answer, but I posted how I'd try. I'd change relative links for images to full links with postprocess_html, so the internal base url should be irrelevant. You asked how to pass the correct part. I'd have to think about it. Is it available in the soup of the page?
No, it is not available in the final (correct) soup.

Quote:
If not, didn't you have it in skip_ad_pages method?
Yes, it is in the soup of the first (wrong) page, so it is available inside the skip_ad_pages method.
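
For reference, the postprocess_html idea quoted above could look roughly like this. It is only a sketch: absolutize_image_srcs and base_url are hypothetical names, and it assumes the correct base URL was saved somewhere earlier (for example while skip_ad_pages still had the first page). Each tag only needs to behave like a dict, which BeautifulSoup tags do.

```python
from urllib.parse import urljoin


def absolutize_image_srcs(img_tags, base_url):
    """Rewrite relative src attributes to absolute URLs.

    img_tags: iterable of dict-like tags (e.g. what soup.findAll('img') returns).
    base_url: URL of the real article page (assumed to be known by the caller).
    """
    for img in img_tags:
        src = img.get('src')
        if src:
            # urljoin leaves already-absolute URLs unchanged
            img['src'] = urljoin(base_url, src)
    return img_tags
```

Inside a recipe this would be called from postprocess_html with the img tags of the soup, so the internal base URL no longer matters.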

Anyway, in the meantime I found two workarounds which solve my problem.
The first:
I discovered that the final, correct link is also available in the feed, but inside the "guid" tag rather than the "link" tag, so I overrode the get_article_url method to extract the correct link directly, with no need for skip_ad_pages.
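
A minimal sketch of this first workaround, not my exact recipe code. It assumes the article object passed to get_article_url behaves like a dict (a feedparser entry) and exposes the guid under 'guid' or 'id'; guid_article_url and GuidLinkMixin are hypothetical names used only for illustration.

```python
def guid_article_url(article):
    """Return the entry's guid when it looks like a URL, else fall back to its link."""
    guid = article.get('guid') or article.get('id')
    if guid and guid.startswith(('http://', 'https://')):
        return guid
    return article.get('link')


class GuidLinkMixin:
    # Hypothetical mixin: in a real recipe this method would sit directly
    # on the BasicNewsRecipe subclass.
    def get_article_url(self, article):
        return guid_article_url(article)
```

The fallback to 'link' is there only so that entries without a usable guid are still downloaded.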

The second:
With a bit of easy "reverse engineering" I worked out how to parse/decode the wrong link to obtain the right one, again by overriding the get_article_url method.
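
The actual decoding depends on the site, so the following is only an illustration of the idea under an assumed scheme: it supposes the wrong link carries the real address percent-encoded in a query parameter named "url". That parameter name and the function decode_redirect_link are assumptions, not what my site really uses.

```python
from urllib.parse import urlparse, parse_qs


def decode_redirect_link(link, param='url'):
    """Extract the real article URL from a redirect-style link.

    Assumes the target is stored percent-encoded in the query parameter
    `param`; returns the link unchanged when that parameter is missing.
    """
    query = parse_qs(urlparse(link).query)  # parse_qs also percent-decodes values
    targets = query.get(param)
    return targets[0] if targets else link
```

In the recipe, get_article_url would then return decode_redirect_link applied to the entry's link.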

Either way, the images now work...