JPluck & www.nytimes.com


T T
04-25-2003, 11:57 PM
I'm trying to pluck the site www.nytimes.com using JPluck, because supposedly this application supports cookies.

I've enabled cookies and selected the "Internet Explorer" cookies file. Then I try to pluck the start page of www.nytimes.com (0 pages deep). This seems to work fine. I can see that I am still logged in because my login name is displayed.

When I try to increase to 1 page deep, I get logged out for some reason. I've excluded links which I think might cause trouble, for example "www.nytimes.com/logout". Nothing I've tried works. The article pages only display a login prompt. Perhaps I've excluded too many links?

Has anyone had better luck with this?

I'm including my .jxl file below. I had to change the extension to .ixl in order to attach it. Please rename it back to .jxl after you download it.

Any help would be much appreciated.

Alexander Turcic
04-29-2003, 02:08 AM
Does anyone know more about Plucker Desktop's cookie handling? It sounds like a cookie problem to me.

T T
04-29-2003, 08:08 PM
Hi,

Just to be clear, this is JPluck, the java app. Not Plucker Desktop.

The way JPluck handles cookies seems amazing. One only has to select which browser cookie file to use (IE, Netscape, etc.). Then, I guess, JPluck searches this file for the necessary cookies.

Not sure how well the whole mechanism works, but I know that it does work if I have only one download level.

Alexander Turcic
04-30-2003, 02:13 AM
Well, the thing is that cookies are sometimes not only read but also written. Does JPluck automatically allow sites to write back to the browser cookie files?

Laurens
05-07-2003, 06:18 AM
Answering all the questions at once.

The reason why NYTimes does not work with MSIE cookies is that JPluck does not write new cookies back to the cookie store. Apparently NYTimes sends a new cookie value during downloading that it wants to see when you download subpages.

However, you can pluck NYTimes correctly when you use a Mozilla/Netscape cookies.txt file. This uses a different cookie store implementation that holds new cookies in RAM. Fortunately, MSIE lets you export its cookies to cookies.txt format. Simply point JPluck to the exported cookies.txt file. I tried this with TT's JXL and it works fine.

JPluck never writes new cookies to disk. It only holds new cookies in memory while it's downloading the site. After the conversion is complete it discards the new cookies.
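
To illustrate the behaviour described above, here is a minimal sketch of a cookie store that reads a Netscape-format cookies.txt, accepts new cookie values while a download is running, and never writes anything back to disk. This is not JPluck's actual code: the class name InMemoryCookieStore, its methods, and the cookie name SESSION used in the usage note are hypothetical, and the domain matching is deliberately simplified (paths, expiry, and the secure flag are ignored).

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical sketch of a RAM-only cookie store; not JPluck's real class. */
class InMemoryCookieStore {
    // Key is "domain<TAB>name"; the map lives only in memory.
    private final Map<String, String> cookies = new LinkedHashMap<>();

    /** Loads lines in Netscape cookies.txt format (7 tab-separated fields). */
    void load(List<String> lines) {
        for (String line : lines) {
            if (line.isEmpty() || line.startsWith("#")) continue; // skip comments
            String[] f = line.split("\t", -1); // -1 keeps an empty value field
            if (f.length != 7) continue;       // skip malformed lines
            // Fields: domain, subdomain flag, path, secure, expiry, name, value
            cookies.put(f[0] + "\t" + f[5], f[6]);
        }
    }

    /** Records a Set-Cookie header value received from the given host. */
    void update(String host, String setCookieHeader) {
        String pair = setCookieHeader.split(";", 2)[0].trim(); // "name=value"
        int eq = pair.indexOf('=');
        if (eq <= 0) return;
        String name = pair.substring(0, eq);
        String value = pair.substring(eq + 1);
        String key = host + "\t" + name;
        // Simplification: overwrite an existing cookie of the same name
        // whose domain already matches this host.
        for (String existing : cookies.keySet()) {
            String[] k = existing.split("\t", 2);
            if (k[1].equals(name) && domainMatches(host, k[0])) {
                key = existing;
                break;
            }
        }
        cookies.put(key, value);
    }

    /** Builds the value of the Cookie request header for a host. */
    String cookieHeaderFor(String host) {
        StringBuilder sb = new StringBuilder();
        for (Map.Entry<String, String> e : cookies.entrySet()) {
            String[] k = e.getKey().split("\t", 2);
            if (!domainMatches(host, k[0])) continue;
            if (sb.length() > 0) sb.append("; ");
            sb.append(k[1]).append('=').append(e.getValue());
        }
        return sb.toString();
    }

    private static boolean domainMatches(String host, String domain) {
        String d = domain.startsWith(".") ? domain.substring(1) : domain;
        return host.equals(d) || host.endsWith("." + d);
    }
}
```

With a store like this, a cookie loaded from cookies.txt as SESSION=old and then replaced by the site with SESSION=new during the start-page download would be sent as SESSION=new on the subpage requests. A read-only store would keep sending the old value, which matches the symptom of being logged out one level deep.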

Until the MSIE cookie handling is fixed, the best solution would be to use Mozilla just to log in to sites and obtain a cookie. You can still use MSIE for normal browsing.

Alexander Turcic
05-07-2003, 06:49 AM
Hey Laurens, seems my guess (as a non-Plucker user) wasn't so bad after all :)

T T
05-10-2003, 01:40 PM
Thanks, Laurens. I tried to export the IE cookies file as you suggested, but now I'm getting the following error:

[Unhandled HTTP response] 302: Moved Temporarily

Do you get the same thing? In addition, when I click on the link from my browser, it tells me that the page was not downloaded. I have JPluck 0.9 RC4, which is the latest version, I believe.

Thanks again for your help.