09-26-2010, 12:37 AM | #1 |
Addict
Posts: 264
Karma: 62
Join Date: May 2010
Device: kindle 2, kindle 3, Kindle fire
|
Nfl Recipe -- Almost done need a little help
Starson,
If you get a few minutes, could you look at this code and maybe explain to me why I never get the pcard content (the photo with the player's stats)? I don't see where I'm removing it anywhere, and I'm parsing the */printable/* link, and that page has the pcard. Thanks. Spoiler:
I see iframe is turned off by default. How do I turn it back on?
Last edited by TonytheBookworm; 09-26-2010 at 12:48 AM. |
09-26-2010, 08:36 AM | #2 | |
Wizard
Posts: 4,004
Karma: 177841
Join Date: Dec 2009
Device: WinMo: IPAQ; Android: HTC HD2, Archos 7o; Java:Gravity T
|
Quote:
edit: Hint - the page that Firefox or IE gets sent is not necessarily the same as what Calibre is sent. It's time to get out TamperData. |
|
09-26-2010, 12:36 PM | #3 | |
Addict
|
Quote:
Referer=http://www.nfl.com/news/story/09000d5d81acc392/article/broncos-rb-moreno-out-vs-colts-buckhalter-expected-to-start
That referer looks like nothing more than the current URL. I then tried to figure this out and noticed you had a conversation with Kovid about this. So could you help me, or maybe explain to me how to go about using this (or would I )? Spoiler:
My first thought was to simply take the line req = mechanize.Request(url, headers = {'Referer':'http://referer_site.com/'}) and change it to req = mechanize.Request(url, headers = {'Referer':url}), but I don't think that is right. Thanks, by the way. |
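For reference, the idea in this post (sending the article's own URL as the Referer) can be sketched as follows. This is only an illustration: urllib.request.Request from the standard library stands in for mechanize.Request, on the assumption that both accept the same url and headers arguments, and request_with_referer is a made-up helper name. Whether the site actually checks the Referer is not established here.

```python
from urllib.request import Request

def request_with_referer(url, referer):
    """Build a request object that carries an explicit Referer header.

    Stand-in for mechanize.Request(url, headers={...}); nothing is sent
    over the network until the request is actually opened.
    """
    return Request(url, headers={'Referer': referer})

# The article URL reported by TamperData earlier in this post.
article_url = ('http://www.nfl.com/news/story/09000d5d81acc392/article/'
               'broncos-rb-moreno-out-vs-colts-buckhalter-expected-to-start')

# Use the page's own URL as the Referer, matching what the browser sent.
req = request_with_referer(article_url, article_url)
print(req.get_header('Referer'))
```

Passing the referer as a separate argument keeps the helper usable if the site turns out to expect a different referring page.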
|
09-26-2010, 12:50 PM | #4 |
Wizard
|
1) You start with your recipe and make sure you're removing nothing, then print what the site sends you. You compare that to what you see in Firefox. If there's something that Firefox gets, but Calibre does not, then it's time to figure out why. That's where you are now (I assume you are sure that your recipe does not receive pcard, even with everything turned on).
2) Once you're sure that you're getting different things, you start tracking down how the site knows the difference between Calibre's request and FF's request. It could be user agent, headers, cookies, etc. TamperData (or Live HTTP Headers) will tell you what Firefox sends. These commands inside get_browser will show you what Calibre sends: Code:
# Print HTTP headers.
br.set_debug_http(True)
br.set_debug_responses(True)
br.set_debug_redirects(True)
As an example, I ran into this problem with a Skeptic Blog - I got a Bad Behavior error. It turned out the site wanted an Accept header. I also ran into it with a Comic recipe. There, it turned out the site wanted a referer header, etc. |
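Once both sets of headers are captured (TamperData for the browser, the debug output for the recipe), the comparison in step 2 can be done mechanically. A minimal sketch, assuming the headers have been copied by hand into two dictionaries. The header values below are invented for illustration, and header_diff is a hypothetical helper, not part of Calibre or mechanize.

```python
def header_diff(browser_headers, recipe_headers):
    """Compare two header dictionaries, ignoring case in header names.

    Returns (missing, different):
      missing   - headers the browser sent that the recipe did not
      different - headers both sent, but with different values
    """
    browser = {name.lower(): value for name, value in browser_headers.items()}
    recipe = {name.lower(): value for name, value in recipe_headers.items()}
    missing = sorted(name for name in browser if name not in recipe)
    different = sorted(name for name in browser
                       if name in recipe and browser[name] != recipe[name])
    return missing, different

# Invented example values; paste in the real captured headers instead.
firefox = {
    'User-Agent': 'Mozilla/5.0 (example browser)',
    'Accept': 'text/html',
    'Referer': 'http://www.nfl.com/',
}
calibre = {
    'User-agent': 'Mozilla/5.0 (example recipe)',
    'Accept': 'text/html',
}

missing, different = header_diff(firefox, calibre)
print('missing from recipe:', missing)
print('sent with different values:', different)
```

Anything listed as missing or different is a candidate for how the site tells the two requests apart, and can then be tried one header at a time.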
09-26-2010, 10:36 PM | #5 | |
Addict
|
Quote:
|
|
09-27-2010, 08:23 AM | #6 | |
Wizard
|
Quote:
BTW, I'm not saying that the headers are definitely your problem. For all I know the missing part is built by script or flash, or Ajax, etc. It's up to you to find out where the missing stuff is coming from. It's just that after everything else is eliminated, when you see one thing in FF and another in your printed soup, it's often because the site is actually sending two different things, and that's usually due to a diff in the headers sent by FF vs. Calibre. |
|
09-27-2010, 06:43 PM | #7 |
Addict
|
Okay, after messing with this for a while I finally figured out why the pcard is not showing up. Yet I don't know exactly how to fix it. So could you hook the jumper cables to me and give me a jump-start, please?
When using Live HTTP Headers and TamperData, I noticed that a request is sent out for http://www.nfl.com/widget/playercard...n=2010&gameId= (which turns out to be the pcard data). So my question is: do I add that as an addheader, or is it a br.open('http://www.nfl.com/widget/playercard?esbId=EDW720778&season=2010&gameId=')? Sorry for all the questions, but I'm totally in the dark on this one. |
09-27-2010, 07:50 PM | #8 | |
Wizard
|
Quote:
Code:
<iframe src="/widget/playercard?esbId=NOR780922&season=2010&gameId=" id="pcard-EOCVFPSS" frameborder="0"></iframe>
Code:
soup = self.index_to_soup(URL) |
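The src in the iframe above is site-relative, so it presumably has to be resolved against the article URL before it can be used as the URL in index_to_soup. A rough sketch of that step using only the Python standard library (html.parser instead of the soup the recipe already has); IframeSrcFinder and iframe_urls are made-up names for illustration.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin

class IframeSrcFinder(HTMLParser):
    """Collect the src attribute of every <iframe> start tag that is fed in."""

    def __init__(self):
        super().__init__()
        self.sources = []

    def handle_starttag(self, tag, attrs):
        if tag == 'iframe':
            src = dict(attrs).get('src')
            if src:
                self.sources.append(src)

def iframe_urls(page_url, html):
    """Return absolute URLs for all iframes found in html."""
    finder = IframeSrcFinder()
    finder.feed(html)
    finder.close()
    # urljoin resolves a site-relative src against the page it came from.
    return [urljoin(page_url, src) for src in finder.sources]

page_url = ('http://www.nfl.com/news/story/09000d5d81acc392/article/'
            'broncos-rb-moreno-out-vs-colts-buckhalter-expected-to-start')
snippet = ('<iframe src="/widget/playercard?esbId=NOR780922&season=2010&gameId=" '
           'id="pcard-EOCVFPSS" frameborder="0"></iframe>')

for url in iframe_urls(page_url, snippet):
    print(url)
```

Inside a recipe the same src could be read straight off the tag in the existing soup; the resolution with urljoin is the part that carries over.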
|
09-27-2010, 10:25 PM | #9 | |
Addict
|
Quote:
1) I found the iframe Code:
<div class="articleText">
<p>CHICAGO -- The Bears say they will hold defensive tackle <a href="/players/tommieharris/profile?id=HAR548445">Tommie Harris</a> out of Monday night's game against the <a href="/teams/greenbaypackers/profile?team=GB">Green Bay Packers</a> on a coach's decision.</p>
<p>
<div class="pcard-wrapper nfl-tag-right" id="pcard-JMEDKDWV-wrapper">
<iframe src="/widget/playercard?esbId=HAR548445&season=2010&gameId=" id="pcard-JMEDKDWV" frameborder="0"></iframe>
</div>
Something like this maybe? Spoiler:
Just not grasping this one yet.
Last edited by TonytheBookworm; 09-27-2010 at 11:21 PM. Reason: still pluggin |
|
09-28-2010, 09:24 AM | #10 | |
Wizard
|
Quote:
You don't want the <head>, etc. I haven't looked at that page, so I can't tell you exactly what or how much you'll want in tag_from_newsoup, but you know how to do that. Once tag_from_newsoup is extracted, you can either soup.insert(wherever, tag_from_newsoup) or use replaceWith. I know you've used both of them previously. You might just use replaceWith on the <iframe> tag.
So you lied when you said "#no clue on this". You've got most of it; it's just a matter of putting it all together. (Do I get partial author credit on this? Writing all these posts is harder than just writing the recipe.) |
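The overall flow described in this post (fetch the page behind the iframe, keep only the wanted part, and put it where the iframe tag was) can be sketched without Calibre at all. Note the swap: this uses plain string substitution with a regular expression rather than the soup's replaceWith, purely so it runs standalone. fetch_fragment is a hypothetical callable standing in for self.index_to_soup plus whatever extraction produces tag_from_newsoup, and the regular expression only handles iframes written like the one quoted in this thread.

```python
import re
from urllib.parse import urljoin

# Matches <iframe ... src="..." ...></iframe> and captures the src value.
IFRAME_RE = re.compile(
    r'<iframe\b[^>]*\bsrc="([^"]+)"[^>]*>\s*</iframe>', re.IGNORECASE)

def inline_iframes(page_url, html, fetch_fragment):
    """Replace each iframe in html with the markup fetch_fragment returns.

    fetch_fragment(url) must return the HTML to show in place of the
    iframe (for a player card, just the card markup, not the <head>).
    """
    def substitute(match):
        absolute_url = urljoin(page_url, match.group(1))
        return fetch_fragment(absolute_url)
    return IFRAME_RE.sub(substitute, html)

# Demonstration with a fake fetcher, so no network access is needed.
requested = []

def fake_fetch(url):
    requested.append(url)
    return '<div class="pcard">player card goes here</div>'

page_url = 'http://www.nfl.com/news/story/example'  # placeholder article URL
article = ('<p><iframe src="/widget/playercard?esbId=HAR548445&season=2010&gameId=" '
           'id="pcard-JMEDKDWV" frameborder="0"></iframe></p>')

result = inline_iframes(page_url, article, fake_fetch)
print(result)
```

In the recipe itself the same three steps would be done on the soup objects, with replaceWith on the <iframe> tag taking the place of the substitution.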
|
09-28-2010, 11:48 AM | #11 | |
Addict
|
Quote:
Also, how do you get the src from a tag? I've also been wondering whether this recipe is worth all the trouble, so it might be a while before it gets completed. |
|
09-28-2010, 12:01 PM | #12 | ||
Wizard
|
Quote:
Do this: item.img['src']
Quote:
I find it more fun to figure out how to do the recipe than to actually write it. It's your recipe, not mine, so you're the author (if you ever finish the grunt work and get it functioning). |
||
|