10-26-2010, 04:01 PM | #46 |
Zealot
Posts: 122
Karma: 10
Join Date: Jul 2010
Device: nook
|
i know you must be busy, but i would really appreciate it if you could take a look at the site and the code when you have the time.
i got tamper data and as far as i can see, the only parameter that makes a difference is rsSearchRes_pgNo. i just don't really know what i am doing with this and feel a little lost. |
10-26-2010, 04:24 PM | #47 | |
Wizard
Posts: 4,004
Karma: 177841
Join Date: Dec 2009
Device: WinMo: IPAQ; Android: HTC HD2, Archos 7o; Java:Gravity T
|
Quote:
If I'm going to look at it, I need you to tell me how rsSearchRes_pgNo is used? Part of the URL? Part of a Header? Part of a Cookie? I think you said you see it in TamperData - what field? What format?
Edit: I looked at your page (I think I'm at the right one), but I'm not sure how to get the next 30 you want. Tell me in detail what to press or change on that page to get the next group of data. Where? Tell me how to know when I've got all the data - what stops appearing or appears or whatever.
Last edited by Starson17; 10-26-2010 at 04:28 PM. |
|
10-26-2010, 05:31 PM | #48 |
Zealot
|
this is my practice page. it has the same structure as the more important pages, but it is almost static.
1. i have the total number of articles. you get it in the variable "report3" in the function make links.
2. every page has 30, so you know how many you have already.
3. my example page has 67 articles. (you can print report3 to check that).
4. 67 reports means 3 pages. if you go to the bottom of the page you can read the numbers. it will say "1 of 3" (in the wrong language).
5. if you want to go to the next page you can replace the "1" with a "2" and hit enter or you can click on the gray arrows.
6. on this page you have a lot of pages to play with. 2138 pages with 30 articles each.
as for this:
"If I'm going to look at it, I need you to tell me how rsSearchRes_pgNo is used? Part of the URL? Part of a Header? Part of a Cookie? I think you said you see it in TamperData - what field? What format?"
i have no idea. i never did anything higher than C language. i didn't know what RSS was a month and a half ago. give me a microcontroller, on the other hand, and things will start blinking, moving, beeping and doing all sorts of cool stuff. |
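The page arithmetic in the list above (67 articles at 30 per page gives 3 pages) can be sketched as follows. This is only an illustration in modern Python 3; the function names `page_count` and `page_numbers` are hypothetical and do not come from the recipe under discussion.

```python
ARTICLES_PER_PAGE = 30  # page size stated in the post above


def page_count(total_articles, per_page=ARTICLES_PER_PAGE):
    """Return how many result pages are needed for total_articles."""
    # Integer ceiling division: a partly filled last page still counts.
    return (total_articles + per_page - 1) // per_page


def page_numbers(total_articles, per_page=ARTICLES_PER_PAGE):
    """Return the 1-based page numbers to request, in order."""
    return list(range(1, page_count(total_articles, per_page) + 1))


if __name__ == "__main__":
    print(page_count(67))    # 3
    print(page_numbers(67))  # [1, 2, 3]
```

With the total taken from a variable such as "report3", the list returned by `page_numbers` would be the set of page values to send, one request per page.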
10-26-2010, 06:15 PM | #49 |
Wizard
|
So a more basic question, then: Where did the string "rsSearchRes_pgNo" come from? Is it in the page source? Did you see it in some output? In TamperData? I just need something to start on.
|
10-27-2010, 03:15 AM | #50 |
Zealot
|
in tamperdata
on the right hand side near the bottom.
you have to scroll down a bit to see it. as far as i can see, you can leave all the fields on the right hand side except "rsSearchRes_pgNo" blank and you will still get your next page.
Last edited by marbs; 10-27-2010 at 04:10 AM. |
10-27-2010, 02:34 PM | #51 |
Wizard
|
I found it immediately after I posted. As time permits, I'll look it over.
|
10-27-2010, 04:04 PM | #52 |
Zealot
|
even if you don't get around to it, thank you very much.
it really is a great help. especially the fact that you don't just give the answers, you send me out looking (in a confined area) for them. i really am learning a lot. |
10-27-2010, 04:36 PM | #53 |
Wizard
|
I didn't have much time, but briefly, what's happening is that some JavaScript runs when you enter a next page number at the bottom of your page. That number is added to a POST which goes to your url. You saw the POST in TamperData. You need to simulate it, or at least simulate the important parts, like rsSearchRes_PgNo.
It's done as follows:
Code:
data = urllib.urlencode({'rsSearchRes_PgNo': '2'})
url = 'http:// whatever'
br.open(url, data)
The data that you send in the POST can be seen with:
Code:
# Print HTTP headers.
br.set_debug_http(True)
|
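For readers on a current Python, here is a minimal sketch of the same idea using only the standard library instead of the mechanize browser (`br`) used above. It builds the POST but does not send it. The URL is a placeholder and the helper name `build_page_request` is hypothetical. The field name is passed as a parameter because the thread spells it both rsSearchRes_pgNo and rsSearchRes_PgNo; whichever TamperData actually shows is the one to use.

```python
from urllib.parse import urlencode
from urllib.request import Request


def build_page_request(url, page_no, field_name="rsSearchRes_PgNo",
                       extra_fields=None):
    """Build (but do not send) a POST asking for one results page."""
    fields = dict(extra_fields or {})   # other form fields, if any are needed
    fields[field_name] = str(page_no)   # the page selector seen in TamperData
    body = urlencode(fields).encode("ascii")
    # Supplying data makes urllib issue a POST instead of a GET.
    return Request(url, data=body)


if __name__ == "__main__":
    # Placeholder URL, not the real site.
    req = build_page_request("http://example.invalid/search", 2)
    print(req.get_method())  # POST
    print(req.data)          # b'rsSearchRes_PgNo=2'
```

Printing the method and body before sending anything is a cheap way to check that the request matches what TamperData recorded.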
10-27-2010, 04:43 PM | #54 | |
Wizard
|
Quote:
You can look at the Mechanize docs for more info: http://wwwsearch.sourceforge.net/mec...-added-headers |
|
10-27-2010, 05:42 PM | #55 | |
Zealot
|
Quote:
i copied from google reader. i'll take another look at google and a look at greader. |
|
10-27-2010, 05:58 PM | #56 |
Wizard
|
If that's what you did, then you need to track down whether you're getting the page back that you expect to get. If not, then you need to track down if you are sending what you think you need to send in the POST data. It's just a matter of sending the right data in the POST, checking the results, etc. If you send the right data, you should get back the right page. Have you gotten back page 2 yet?
|
10-28-2010, 05:21 AM | #57 |
Zealot
|
i recreated the post perfectly
it still does not work.
i also do not know how to deal with the difference between a request on this page and this page. or how to deal with dates. i think i will wait for when you have time and energy to lead the way. i am dreaming of POST and GET in tamper data and it is time to step it down a notch. at least for a day or two. here is the code: Spoiler:
|
10-28-2010, 11:21 AM | #58 |
Wizard
|
I'm not sure what this means. Your code has append_page code and other stuff that looks to me like it's just getting in the way of the simple problem you need to solve. You want to be able to retrieve page 1 and page 2. Page 1 should come by default. Page 2 should come when you do the post with the right parameters. If it doesn't, perhaps there is other protection, such as cookies or referer, etc.
I'm not sure if this: "i recreated the post perfectly" means that you downloaded page 1 or page 2, but until you get both pages properly retrieved, there's not much point in using append_page and trying to put them together as a multipage recipe does. |
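If the site does check cookies or the referer, one way to test that theory is to fetch page 1 first so that any cookies are stored, then send the page 2 POST from the same session with a Referer header. The sketch below uses the Python 3 standard library rather than mechanize. The helper names are hypothetical, and whether this particular site needs cookies or a referer at all is an untested assumption.

```python
from http.cookiejar import CookieJar
from urllib.parse import urlencode
from urllib.request import HTTPCookieProcessor, Request, build_opener


def make_session():
    """Return an opener that stores cookies and sends them back."""
    jar = CookieJar()
    return build_opener(HTTPCookieProcessor(jar)), jar


def page_request_with_referer(url, page_no, field_name="rsSearchRes_PgNo"):
    """Build a POST for page_no that names the same URL as its referer."""
    body = urlencode({field_name: str(page_no)}).encode("ascii")
    return Request(url, data=body, headers={"Referer": url})


def fetch_first_two_pages(url):
    """Fetch page 1 (plain GET), then page 2 (POST) in one session."""
    opener, _jar = make_session()
    page1 = opener.open(url).read()   # GET; any cookies land in the jar
    page2 = opener.open(page_request_with_referer(url, 2)).read()
    return page1, page2
```

`fetch_first_two_pages` needs network access and the real URL, so it is defined here but not called.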
10-28-2010, 12:18 PM | #59 |
Zealot
|
what i meant was that i copied all the parameters of the post request.
i can't get the 2nd page so there is nothing to append. |
10-28-2010, 02:08 PM | #60 |
Wizard
|
OK, and I presume you've monitored what you sent and the response, and you can see that the first time you request, it sends the request for page 1, and the second time you request it sends the request for page 2? What do you get from the page 2 request? Is it page 1?
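A quick way to answer the "is it page 1?" question is to compare the two response bodies. The sketch below does that with a hash; the function names are hypothetical. Note the limitation: two downloads of the same page can still differ if the server embeds timestamps or session tokens, so a mismatch is weaker evidence than a match.

```python
import hashlib


def body_fingerprint(html_bytes):
    """Return a short, stable digest of a response body."""
    return hashlib.sha256(html_bytes).hexdigest()


def looks_like_same_page(first_body, second_body):
    """True when the 'page 2' response is byte-identical to page 1."""
    return body_fingerprint(first_body) == body_fingerprint(second_body)


if __name__ == "__main__":
    # Stand-in bodies; real ones would come from the two requests.
    page1 = b"<html>articles 1-30</html>"
    page2 = b"<html>articles 31-60</html>"
    print(looks_like_same_page(page1, page1))  # True
    print(looks_like_same_page(page1, page2))  # False
```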
|