Old 03-31-2003, 03:54 PM   #3
Alexander Turcic
Test 1: Small Channel, Low Graphics

CNN: http://wireless.cnn.com/avantgo/cnn/index.html

Plucker:
  • Link-Depth: 3
  • Image Settings: Include Image (1000s of colors), thumbnail for images > 320x320
  • Other Settings: ZLIB
  • Target: PALM\PROGRAMS\Plucker
  • Conversion Time: insignificant
  • File Size: 84,994 Bytes (get file <a href="http://www.turcic.com/forums/supplements/isilo-plucker/CNN Pluck.pdb">here</a>)
iSiloX:
  • Link-Depth: 2
  • Image Settings: Include Image (16-bit 64K colors), Compress
  • Other Settings: Process table formatting
  • Target: PALM\PROGRAMS\iSilo
  • Conversion Time: insignificant
  • File Size: 47,382 Bytes (get file <a href="http://www.turcic.com/forums/supplements/isilo-plucker/CNN iSilo.pdb">here</a>)
Observations:
There is a bug in the current Plucker Desktop parser. As the log shows (get file <a href="http://www.turcic.com/forums/supplements/isilo-plucker/CNN Pluck.log">here</a>), Plucker downloaded and saved each page twice. This explains why the Plucker file is almost twice as big as the iSiloX file (84,994 vs. 47,382 bytes). If I reduce the Plucker Link-Depth to 2, the pages are no longer downloaded twice, but the “<< Main” link in the header of each CNN article then points to an unresolved URL.
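To put a number on "almost twice as big", and to show how one might spot double downloads in a list of fetched URLs, here is a small Python sketch. The two file sizes are the ones reported above. The `duplicated_urls` helper and its sample input are my own illustration and do not assume anything about the real Plucker log format.

```python
from collections import Counter

# File sizes reported in this post, in bytes.
PLUCKER_BYTES = 84_994
ISILO_BYTES = 47_382


def size_ratio(larger: int, smaller: int) -> float:
    """Return how many times bigger `larger` is than `smaller`."""
    return larger / smaller


def duplicated_urls(fetched_urls):
    """Return the URLs that occur more than once, mapped to their counts.

    `fetched_urls` is a hypothetical list of URLs extracted from a log;
    how to extract them depends on the log format, which is not assumed here.
    """
    counts = Counter(fetched_urls)
    return {url: n for url, n in counts.items() if n > 1}


if __name__ == "__main__":
    ratio = size_ratio(PLUCKER_BYTES, ISILO_BYTES)
    print(f"Plucker / iSiloX size ratio: {ratio:.2f}")

    # Made-up entries, only to show what a double download would look like.
    sample = ["index.html", "a.html", "a.html", "b.html", "b.html"]
    print(duplicated_urls(sample))
```

The ratio comes out a bit under 1.8, which fits pages being stored twice while some shared overhead is stored once.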

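Note that the two tools count Link-Depth differently: in Plucker it starts at 1 (= retrieve the starting page only), whereas in iSiloX it starts at 0. The Plucker Link-Depth of 3 used above therefore corresponds to the iSiloX Link-Depth of 2.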
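The off-by-one between the two depth conventions can be sketched in a few lines of Python. This is only an illustration over a made-up in-memory link graph, not how either converter is actually implemented; the function names `plucker_to_isilox_depth` and `crawl` are hypothetical. The crawl keeps a `seen` set so that each page is fetched once, which is the behaviour one would expect from a parser without the double-download bug.

```python
from collections import deque


def plucker_to_isilox_depth(plucker_depth: int) -> int:
    """Convert a 1-based Plucker Link-Depth to the 0-based iSiloX value."""
    if plucker_depth < 1:
        raise ValueError("Plucker Link-Depth starts at 1")
    return plucker_depth - 1


def crawl(links, start, plucker_depth):
    """Breadth-first walk limited by a Plucker-style depth.

    `links` maps a page name to the pages it links to (toy data, no network).
    Depth 1 means "starting page only". Each page is returned at most once.
    """
    seen = {start}
    order = []
    queue = deque([(start, 1)])
    while queue:
        page, depth = queue.popleft()
        order.append(page)
        if depth >= plucker_depth:
            continue  # depth limit reached, do not follow this page's links
        for target in links.get(page, []):
            if target not in seen:
                seen.add(target)
                queue.append((target, depth + 1))
    return order


if __name__ == "__main__":
    # Toy site: articles link back to the index, like a "<< Main" link.
    toy_links = {
        "index": ["a", "b"],
        "a": ["index", "c"],
        "b": ["index"],
        "c": [],
    }
    print(plucker_to_isilox_depth(3))
    print(crawl(toy_links, "index", 3))
```

In the toy graph the back links to "index" are already in `seen`, so they resolve to the stored start page instead of triggering a second download.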

In both cases, conversion time was below 10 seconds.

Screenshots:
  • CNN Home iSilo
  • CNN Home Plucker
  • CNN Article iSilo
  • CNN Article Plucker