02-09-2010, 04:47 PM   #1407
Starson17
Wizard
Quote:
Originally Posted by kovidgoyal
Stick some print statements into fetch_url to debug the session. Also try customizing get_browser to disable cookies/handle refreshes, etc.
I'm not sure if you saw my edit about the 0A character problem. I hadn't thought of debugging fetch_url; I was going down the road of trying to use preprocess_regexps.

I'm not sure I've understood it, but it looks like preprocess_regexps will let me match and replace some portion of the fetched HTML page before it gets processed. I was thinking I could just remove the 0A (line feed) character that was causing the problem. (I also have some other uses for processing the HTML with a regex search and replace.)
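For anyone following along, here is a rough sketch of the idea rather than the actual recipe code. It assumes preprocess_regexps is a list of (compiled pattern, replacement function) pairs, as the recipe API describes. The href pattern, the sample HTML, and the apply_preprocess_regexps helper are made up for illustration; in a real recipe the list would be a class attribute and calibre would apply it to each fetched page.

```python
import re

# Same shape as a recipe's preprocess_regexps: (compiled pattern, function) pairs.
# Hypothetical rule: strip line feeds (0x0A) that ended up inside href values.
preprocess_regexps = [
    (re.compile(r'href="[^"]*"'),
     lambda match: match.group(0).replace('\n', '')),
]


def apply_preprocess_regexps(html, rules):
    """Illustrative stand-in for the substitution pass: apply each rule in order."""
    for pattern, func in rules:
        html = pattern.sub(func, html)
    return html


# Made-up page fragment with a stray newline in the middle of a link.
raw = '<a href="http://example.com/comics/\n2010/02/09">strip</a>\n<p>text</p>'
cleaned = apply_preprocess_regexps(raw, preprocess_regexps)
print(cleaned)
```

Restricting the rule to href values leaves the ordinary line breaks between tags alone; only the newline inside the link is removed.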

However, the API documentation describes using re.compile to compile the regex, so I think I need to import re. Would this approach work, and if so, where do I import re from?

Edit:

OK, I should learn to think before typing. I solved it (with your help). The import format was easy to find: I just searched for where you used re.compile and found that the answer was just 'import re'.

The print statement in fetch_url was absolutely vital: it let me see that the fetch was getting a '\n' at the broken link point. I was able to remove that character with preprocess_regexps.
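In case it helps someone else, the kind of print statement that makes a stray '\n' visible looks roughly like the sketch below. It is not the actual fetch_url code; show_hidden_chars and the sample URL are hypothetical. The point is just that printing repr() of a string shows control characters as escapes instead of rendering them.

```python
def show_hidden_chars(data, label='fetched'):
    """Print repr(data) so control characters such as a line feed show up as escapes."""
    text = repr(data)
    print('%s: %s' % (label, text))
    return text


# Made-up URL with a line feed embedded in it.
sample = 'http://example.com/page\n.html'
shown = show_hidden_chars(sample)
```

A plain print of the same string would just wrap onto a second line, which is easy to miss in a long debug log.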

Thanks for the help!

Last edited by Starson17; 02-09-2010 at 05:12 PM.