explode - plucker to html


pruss
04-07-2009, 10:01 PM
I've finally updated the explode utility for compatibility with newer versions of the Plucker format, and made it work a bit smarter. Here are Windows binaries, and source code:
http://www.1src.com/freeware/fileinfo.php?id=1916

To run, do:
explode --directory=outdir filename.pdb

Then your home page is outdir\default.html

You can adjust the jpeg compression with --jpeg-quality=x (where x ranges from 0 to 100).
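For anyone who wants to batch this from a script, here is a minimal sketch of a wrapper around the command line above. It uses only the two switches mentioned in this post. The helper names, the assumption that explode is on the PATH, and the choice to put --jpeg-quality before the file name (like --directory) are mine and are not taken from the explode source.

```python
import subprocess
from pathlib import Path


def build_explode_command(pdb_file, out_dir, jpeg_quality=None, exe="explode"):
    """Build the argument list for one explode run.

    Only --directory and --jpeg-quality are used, as described in the post.
    `exe` is an assumption: it presumes explode can be found on the PATH.
    """
    cmd = [exe, f"--directory={out_dir}"]
    if jpeg_quality is not None:
        # The post gives the valid range as 0 to 100.
        if not 0 <= jpeg_quality <= 100:
            raise ValueError("jpeg quality must be between 0 and 100")
        cmd.append(f"--jpeg-quality={jpeg_quality}")
    cmd.append(str(pdb_file))
    return cmd


def explode_to_html(pdb_file, out_dir, jpeg_quality=None):
    """Run explode and return the path of the generated home page."""
    cmd = build_explode_command(pdb_file, out_dir, jpeg_quality)
    subprocess.run(cmd, check=True)  # raises CalledProcessError on failure
    return Path(out_dir) / "default.html"
```

Calling explode_to_html("filename.pdb", "outdir") should then hand back the path to default.html inside outdir, provided the run succeeds.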

No direct epub support yet. I might add it one day. But I don't have an epub capable device, so my motivation is low.

This is a kind of sad thing for me. I've been involved in the Plucker project for five or six years, and now it's looking like the end of the format is in sight--it's time to release converters to other formats so people can migrate their converters. :-( Oh well. I still do all my ebook reading on my TX, and 95% of that with Plucker.

nrapallo
04-08-2009, 12:38 PM
Thanks a lot! It worked like a charm!

Also, I had to log in (thanks to my Sony Clie TH-55 days) to get your upload at 1src.com. Most won't be so lucky.

So with the author's permission (pruss), I attach explode v0.11 to this thread for others to have access to it.
For years, everyone kept telling me that plucker to .html was not possible; and thanks to you, they are ALL wrong now!

I converted the 43MB Wikipedia 2006.pdb in plucker format (ftp://ftp.wizzy.com/pub/wizzy/palm/) quickly as follows:
E:\ebooks> explode --directory=Wikipedia Wikipedia.pdb

I'm converting it now to .imp and from there to .mobi/.prc and .epub, so thanks again for your (much appreciated) update to explode.c.

nrapallo
04-08-2009, 12:50 PM
WOW!!!!

It converted on its first run to my REB1200 .imp format, but the resulting file is TOO huge to be used on my ebook reader. After about 4.5 hours of processing, my conversion software (eBook Publisher) created a 151MB file!

The PC viewer can load it, so I attach some screenshots as the "proof" of concept. Hey, it was fun to even be able to try this.

Now on to round two: decrease the resulting .imp file size by eliminating some or all of the images and/or repetitive text. I would need a 10-fold reduction to make this useable, and I fear that may not be feasible. But I WILL try.... (to boldly go where no man has gone before)

nrapallo
04-08-2009, 12:52 PM
OK, I've completed the round one conversions of this to .mobi/.prc and .epub and the results were similar to my previous conversion to .imp.

Using Mobipocket Creator with standard compression resulted in a 138MB .prc, whereas using calibre resulted in a 141MB .epub (see some sample ADE screenshots). In all three conversions, the images seem to be stored in full 16M-color depth (as extracted by explode.exe) and are the culprit for the tremendous filesize increase. The (compressed) text occupied only about 10% (12 to 15MB) of the resulting ebooks.

Time to focus on image resolution reduction, but I probably will not get ANY better compression result than Plucker's 43MB .pdb, so this is an exercise in futility. I don't mind, though...

nrapallo
04-08-2009, 12:53 PM
You could also reduce the compression accuracy of the jpeg files. I've just posted a new version of explode with a --jpeg-quality=xxx switch (untested). I noticed that the previous version always set the quality to 100 (!). If you set it to 85 or so, you might find a nice improvement, unless of course your further conversion tools themselves reduce the jpeg quality.

Actually, I was thinking of replacing them with 256-color .gif images (which yields about 50% filesize savings), and I will see if using a 16-color/grayscale .gif reduces this any further. Worth a shot!

However, I will also try your new --jpeg-quality switch. "Whatever works best" is my motto...

Thanks again!
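As a rough illustration of why dropping from 16M colors to 256 or 16 colors should help, here is a small sketch that compares the raw (uncompressed) pixel data per image at each palette size. The 320x240 dimensions are purely hypothetical, since the actual image sizes are not given here, and real .gif and .jpeg files are compressed, so the savings on disk will not match these ratios exactly.

```python
def bits_per_pixel(colors):
    """Smallest number of bits that can index `colors` distinct values."""
    return max(1, (colors - 1).bit_length())


def raw_image_bytes(width, height, colors):
    """Uncompressed pixel data size in bytes, ignoring headers and palette."""
    total_bits = width * height * bits_per_pixel(colors)
    return (total_bits + 7) // 8  # round up to whole bytes


# Hypothetical 320x240 image; the real image dimensions are an assumption.
for colors in (2 ** 24, 256, 16):
    print(colors, bits_per_pixel(colors), raw_image_bytes(320, 240, colors))
```

In raw terms, 16 colors need 4 bits per pixel against 24 bits for 16M colors, so the uncompressed pixel data shrinks by a factor of six before any file-format compression is applied.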

nrapallo
04-08-2009, 01:00 PM
Using 4-bit .gif improved the filesize of the final ebook a lot. The .imp now is 50 MB, .epub now 43 MB and .prc (stnd) is 44 MB & .prc (high compression) is 37 MB. There was very little image quality loss reducing from 16M colors to 16 colors. :)

Still too big to be useful. :smack:

If I exclude images, then the resulting filesizes are: 23 MB .imp, 14 MB .epub and 16 MB .prc (stnd).
:chinscratch: much more useable...

I'll experiment a bit more and see if using the Wikipedia 2006 CD source directly, instead of the Plucker-extracted .html/images, works just as well or better.

Stay tuned...
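To put the numbers so far side by side, here is a quick sketch that computes how much each variant shrinks relative to the first full-color conversions. All sizes are the MB figures reported in this thread (.prc at standard compression); the dictionary layout and function name are just for illustration.

```python
# File sizes in MB as reported in this thread (.prc = standard compression).
SIZES_MB = {
    "full-color images": {".imp": 151, ".prc": 138, ".epub": 141},
    "4-bit .gif images": {".imp": 50, ".prc": 44, ".epub": 43},
    "no images": {".imp": 23, ".prc": 16, ".epub": 14},
}


def reduction_factors(baseline, variant):
    """Return baseline size divided by variant size for each format."""
    return {fmt: round(baseline[fmt] / variant[fmt], 1) for fmt in baseline}


baseline = SIZES_MB["full-color images"]
for name, sizes in SIZES_MB.items():
    print(name, reduction_factors(baseline, sizes))
```

By this arithmetic the 4-bit .gif versions are roughly 3 times smaller than the first conversions, and only the image-free .epub gets close to the 10-fold reduction mentioned earlier.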

nrapallo
04-09-2009, 10:20 AM
WOW!! WOW!! WOW!! I just tried that 50 MB Wikipedia 2006 .imp file on my REB1200 ebook reader and it works flawlessly!!!! All the links, images and text (though the text needs better formatting) were there and useable.

I had to use Impserve to load it and that took several minutes, but WOW!! WOW!! WOW!!

I'm a bit excited... :snicker:

p.s. doesn't work on my EBW1150, though! :angry:

nrapallo
04-10-2009, 12:04 PM
Amazing, this software has already been downloaded over 5,000 times!!! Check it out. :2thumbsup

I guess it sure filled a void! ;)

nrapallo
08-15-2009, 01:34 PM
Some HUGE .prc, .epub and REB1200 .imp (http://www.mediafire.com/?sharekey=20a7cfff64c9b6fa36df4e8dca1419690d0b12fced63f529b8eada0a1ae8665a) ebooks made from that 2006 Wikipedia CD Selection from the SOS Children website.

If I exclude images, then the resulting ebook filesizes are: 22 MB REB1200 .imp, 12 MB .epub and 9 MB .prc (high-compression) and may be much more useful. A 15 MB .prc (standard compression) is also available for those readers that cannot handle high (Huff-Dic) compression.

Just a test... :) ...Try this (http://www.mediafire.com/nrapallo) for more previews of my 2006 Wikipedia work-in-progress ebooks ...

Blue Tyson
08-17-2009, 05:17 AM
I use Plucker all the time, so thanks very much. This is a great idea.

nrapallo
08-20-2009, 05:40 PM
EDIT: I've stopped hijacking this thread and now have a new thread called
Creating HUGE ebooks from the 2006 Wikipedia CD Selection (http://www.mobileread.com/forums/showthread.php?t=54166) to continue any 2006 Wikipedia posts.

Please use this thread only to discuss any issues you may have with explode.c...

pruss
02-23-2013, 11:55 PM
I've made some more updates to explode, and it's now available here (https://code.google.com/p/plucker/downloads/list).