11-29-2012, 09:16 AM   #3
rmflight
That doesn't seem right. I'm using this recipe, and it fetches the articles quite nicely, without any trouble.

Basically, I want access to the URL returned by "print_version" so that I can trim off the last bit, follow the link the article defines for a table (see line 383 here for an example), download that page, soupify it, extract the table, and insert it directly into the article.
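
To make those steps concrete, here is a rough sketch in plain Python of what I have in mind. It is not recipe code: the function names (trim_last_segment, extract_first_table, inline_table), the fetch callable, and the example.com URLs are all made up for illustration. I am also assuming that "trim off the last bit" means dropping the final path segment, and that the table link is a relative href I have already located in the article. Inside a real recipe I would expect to work on the soup (for example via index_to_soup) rather than on raw strings.

```python
import re
from urllib.parse import urljoin


def trim_last_segment(url):
    """Drop the final path segment, keeping the trailing slash (assumed meaning of 'trim')."""
    return url.rsplit("/", 1)[0] + "/"


def extract_first_table(html):
    """Return the first complete <table>...</table> as a string, or None if there is none."""
    depth = 0
    start = None
    # Walk over opening and closing table tags, counting depth so nested tables stay intact.
    for m in re.finditer(r"<(/?)table\b[^>]*>", html, re.I):
        if not m.group(1):
            if depth == 0:
                start = m.start()
            depth += 1
        elif depth > 0:
            depth -= 1
            if depth == 0:
                return html[start:m.end()]
    return None


def inline_table(article_html, print_url, href, fetch):
    """Replace the <a> pointing at href with the table found on the linked page.

    fetch is a hypothetical callable taking a URL and returning that page's HTML.
    """
    table_url = urljoin(trim_last_segment(print_url), href)
    table = extract_first_table(fetch(table_url))
    if table is None:
        return article_html  # no table on the linked page: leave the article untouched
    anchor = re.compile(
        r'<a\b[^>]*href="%s"[^>]*>.*?</a>' % re.escape(href), re.I | re.S
    )
    # A function replacement avoids backslash handling in the table markup.
    return anchor.sub(lambda m: table, article_html, count=1)
```

Called as inline_table(article_html, print_url, "table1.html", fetch), this only works if print_url is available at that point, which is exactly the piece I am missing.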

To do that, I need the original URL that was used to download the article. It doesn't seem like it should be hard to do.

Are you telling me to use "get_obfuscated_article()" to write a fully custom download method for this article? It doesn't seem like I should have to, because, as I said, a few simple tweaks to the recipe already get 99% of the content I need. I want to do this in post-processing because not every article will have tables, but many of them will have images, which the basic recipe handles very nicely.