I am using 0.41n on an XP workstation with 512 MB of RAM and plenty of disk space. I have the heap set to 1296. Yes, really: 1296.
I have some pretty large documents (web sites that I take with me). The sites range from about 200 to 1500 KB, and at that size they are slow to crawl. The site is www.candlepowerforums.com, a UBB forum; I have 0.41n crawl multiple topics within the same site.
If Sunrise manages to finish crawling a topic, the system does not crash; other times it simply crashes and Sunrise closes.
Having the heap set this high seems to help, but I really can't go much higher; the system will not allow it. 1300 seems to be the max.
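In case it helps pin down that ceiling: I'm assuming Sunrise runs on a Java VM and that the heap setting maps to the JVM's -Xmx value (that mapping is my guess, not something I've confirmed). A tiny test class like the one below prints the largest heap the VM was actually granted; on 32-bit Windows that usually tops out around 1300 MB because the heap has to be one contiguous block of address space, which matches what I'm seeing.

    public class HeapCheck {
        public static void main(String[] args) {
            // maxMemory() reports (approximately) the -Xmx ceiling the
            // JVM was actually granted at startup, in bytes.
            long maxMb = Runtime.getRuntime().maxMemory() / (1024 * 1024);
            System.out.println("Max heap the JVM will use: " + maxMb + " MB");
        }
    }

Running it as "java -Xmx1296m HeapCheck" and nudging the number upward should show exactly where the VM refuses to start on this machine.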
The forum topics on the server load slowly (I think the server itself is slow), so each thread takes 5-10+ minutes.
Any ideas why it crashes or what I might be able to do about the crashes?
It crashes more often if I try to crawl multiple documents at the same time, so I have scaled back to one document at a time. That makes it slow to finish getting all the documents I crawl, though: as long as 45 minutes for a full update.
Another question: I know that Sunrise accesses 1-5 documents at a time, and I can see that within each document it crawls 2 links at a time. Is there any way to raise that to more than 2 links per document at once? That might help me finish faster, because the forums I follow can pause a long time on some links. Running fewer documents at once has made Sunrise more stable, but at the cost of a full pass over all my documents taking 45 minutes. Doing as many as 4 links at a time could potentially make a huge difference, and I believe 4 is the suggested max?
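To illustrate what I mean by "links at a time" (this is just my own sketch of the idea, not Sunrise's actual code, and the topic URLs are placeholders): with a fixed pool of 4 worker threads, one slow link only ties up one worker instead of stalling the whole document, which is why I'd hope 4 instead of 2 would finish faster against a slow server.

    import java.io.InputStream;
    import java.net.URL;
    import java.util.Arrays;
    import java.util.List;
    import java.util.concurrent.ExecutorService;
    import java.util.concurrent.Executors;
    import java.util.concurrent.TimeUnit;

    public class LinkPoolSketch {
        // Stand-in for the per-document "links at a time" setting.
        static final int LINKS_AT_ONCE = 4;

        public static void main(String[] args) throws InterruptedException {
            // Placeholder topic URLs; a real crawl would pull these
            // from the page being processed.
            List<String> links = Arrays.asList(
                    "http://www.candlepowerforums.com/ubb/topic-1",
                    "http://www.candlepowerforums.com/ubb/topic-2",
                    "http://www.candlepowerforums.com/ubb/topic-3",
                    "http://www.candlepowerforums.com/ubb/topic-4");

            // Four workers fetch concurrently; the pool size caps how
            // many requests hit the server at once.
            ExecutorService pool = Executors.newFixedThreadPool(LINKS_AT_ONCE);
            for (final String link : links) {
                pool.submit(new Runnable() {
                    public void run() { fetch(link); }
                });
            }
            pool.shutdown();
            pool.awaitTermination(10, TimeUnit.MINUTES);
        }

        static void fetch(String link) {
            try {
                InputStream in = new URL(link).openStream();
                byte[] buf = new byte[8192];
                long total = 0;
                for (int n; (n = in.read(buf)) != -1; ) total += n;
                in.close();
                System.out.println(link + ": " + total + " bytes");
            } catch (Exception e) {
                System.out.println(link + ": failed (" + e.getMessage() + ")");
            }
        }
    }

Even at 4, the pool still caps the load, so it shouldn't hit the server any harder than 4 simultaneous requests.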
In any case, I'd take any ideas you have that might make it more stable and faster.