01-23-2008, 09:33 AM   #1
alexxxm
Addict
alexxxm has a complete set of Star Wars action figures.
 
Posts: 223
Karma: 356
Join Date: Aug 2007
Device: Rocket; Hiebook; N700; Sony 505; Kindle DX ...
python coding...

I am trying to write a simple applet for web2lrf/libprs500 to download the magazine The Atlantic (http://www.theatlantic.com/), which is free as of today...

Damn, I don't know Python, so I have a couple of problems...

1) Under http://www.theatlantic.com/doc/current, all the links are relative (e.g. <a href="/doc/200801/millbank">), so I began with:


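import re

preprocess_regexps = [(re.compile(i[0], re.IGNORECASE | re.DOTALL), i[1]) for i in
[
# Rewrite site-relative links as absolute ones. The pattern has no capture
# group, so the replacement returns the whole new prefix, slash included.
(r'<a href="/', lambda match: '<a href="http://www.theatlantic.com/'),
]
]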


... is it right?
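
One way to check a rule like this is to run it on a sample link outside web2lrf. The sketch below assumes the replacement callable is used the way `re.sub` uses it (an assumption about libprs500, not something confirmed here); `BASE` and `apply_rules` are hypothetical helper names. Note that a pattern without parentheses has no group 1, so calling `match.group(1)` on it raises IndexError; the sketch returns the whole replacement string instead.

```python
import re

BASE = "http://www.theatlantic.com"  # site root, taken from the post

# Same (pattern, replacement) layout as preprocess_regexps.
rules = [(re.compile(p, re.IGNORECASE | re.DOTALL), f) for p, f in [
    # No capture group is needed: the whole match is replaced.
    (r'<a href="/', lambda match: '<a href="' + BASE + '/'),
]]


def apply_rules(html):
    """Apply every (pattern, replacement) pair in order (hypothetical helper)."""
    for pattern, func in rules:
        html = pattern.sub(func, html)
    return html


if __name__ == "__main__":
    print(apply_rules('<a href="/doc/200801/millbank">'))
```

Links that are already absolute do not start with `href="/`, so the rule leaves them alone.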

2) At the end of every run I get this error (loosely translated by me from the Italian version of Windows):

Exception exceptions.WindowsError: WindowsError(32, 'Impossible to access the file. File is used by another process') in <bound method atlantic.__del__ of <atlantic.atlantic object at 0x0111A690>> ignored

I should add that I get this error with other scripts I tried to write for other newspapers too, but there it didn't prevent the LRF output from being written.

In this case, however, the LRF contains just the header and nothing else. That probably has something to do with question 1)...
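
On the error in 2): Python reports an exception raised inside a `__del__` method on stderr and then ignores it, which would fit the fact that the other scripts still wrote their LRF. A minimal sketch of that behaviour, where the class `Recipe` and its failing cleanup are hypothetical stand-ins and not libprs500 code:

```python
class Recipe:
    """Hypothetical stand-in for a recipe object that cleans up in __del__."""

    def __del__(self):
        # Simulate a cleanup step failing, e.g. deleting a file that is still open.
        raise OSError(32, "File is used by another process")


def run_demo():
    recipe = Recipe()
    # Dropping the last reference runs __del__ (immediately in CPython).
    # The exception is reported on stderr and then ignored, not propagated.
    del recipe
    return "still running"


if __name__ == "__main__":
    print(run_demo())
```

If that is what happens here, the message is only a symptom of a file still being open at cleanup time, and the empty LRF would have a separate cause.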

Any ideas?

Alessandro