12-22-2008, 04:52 PM   #6
Robotech_Master
Fanatic
Quote:
Originally Posted by tompe
Yes, that is what plucker is good at. You can give a web address to plucker and specify the maximum level of links to follow and then get one file suitable for offline browsing.
OK, I tried it, and Plucker crashed. My guess is that it choked on an example link within the document, one whose host and port are placeholders rather than real values:

Processing http://my-conf-server:my-conf-port/testdatabase.jsp...
Traceback (most recent call last):
  File "C:\Program Files\Plucker/parser/python/PyPlucker/Spider.py", line 1734, in ?
    sys.exit(realmain(None))
  File "C:\Program Files\Plucker/parser/python/PyPlucker/Spider.py", line 1719, in realmain
    retval = main (config, exclusion_lists)
  File "C:\Program Files\Plucker/parser/python/PyPlucker/Spider.py", line 1124, in main
    spider.process_all(verbose=verbosity)
  File "C:\Program Files\Plucker/parser/python/PyPlucker/Spider.py", line 623, in process_all
    self.process (verbose, estimate, statusfile)
  File "C:\Program Files\Plucker/parser/python/PyPlucker/Spider.py", line 732, in process
    post_data=post_data)
  File "C:\Program Files\Plucker/parser/python\PyPlucker\Retriever.py", line 313, in retrieve
    result = self._retrieve (url, alias_list, post_data)
  File "C:\Program Files\Plucker/parser/python\PyPlucker\Retriever.py", line 212, in _retrieve
    webdoc = self._urlopener.open (real_url, post_data)
  File "C:\Program Files\Plucker\parser\python\vm\lib\urllib.py", line 176, in open
    return getattr(self, name)(url)
  File "C:\Program Files\Plucker\parser\python\vm\lib\urllib.py", line 277, in open_http
    h = httplib.HTTP(host)
  File "C:\Program Files\Plucker\parser\python\vm\lib\httplib.py", line 666, in __init__
    self._conn = self._connection_class(host, port)
  File "C:\Program Files\Plucker\parser\python\vm\lib\httplib.py", line 342, in __init__
    self._set_hostport(host, port)
  File "C:\Program Files\Plucker\parser\python\vm\lib\httplib.py", line 348, in _set_hostport
    port = int(host[i+1:])
ValueError: invalid literal for int(): my-conf-port
Installing channel output to destinations...
Setting new due date...
Tasks completed for all channels.
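
For what it's worth, the last frame of the traceback shows the port part of the URL ("my-conf-port") being passed to int(), which is what blows up. Here is a rough sketch of how a link list could be pre-screened for that kind of placeholder URL before handing it to a spider. This is not Plucker's code: it is written for a current Python 3 rather than the older interpreter bundled with Plucker, and the function names has_usable_port and split_links are made up for illustration.

```python
from urllib.parse import urlsplit


def has_usable_port(url):
    """Return True if the URL has no port or a numeric one.

    Hypothetical helper, not part of Plucker. It relies on the standard
    library raising ValueError when the port cannot be read as an integer.
    """
    try:
        urlsplit(url).port  # accessing .port raises ValueError for "my-conf-port"
    except ValueError:
        return False
    return True


def split_links(urls):
    """Separate links that can be fetched from ones with a bad port."""
    usable, skipped = [], []
    for url in urls:
        if has_usable_port(url):
            usable.append(url)
        else:
            skipped.append(url)
    return usable, skipped


if __name__ == "__main__":
    links = [
        "http://example.com/index.html",
        "http://my-conf-server:my-conf-port/testdatabase.jsp",
    ]
    usable, skipped = split_links(links)
    print("would fetch:", usable)
    print("would skip:", skipped)
```

Skipping the bad link instead of aborting would let the rest of the crawl finish, which is probably what you want when the offending URL is just sample text in the documentation.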