Old 05-17-2012, 08:15 AM   #231
HarryT
eBook Enthusiast
Posts: 85,560
Karma: 93980341
Join Date: Nov 2006
Location: UK
Device: Kindle Oasis 2, iPad Pro 10.5", iPhone 6
Quote:
Originally Posted by JoeD View Post
It's very, very possible with text. I won't pretend to know the algorithms in any detail, but if you google Huffman encoding, that's one of the ways they can compress so heavily.

The numbers I used are real. I made a fresh copy of my Apache log, 5.8MB in size (yes, I rounded to 6MB in my OP). After gzip -9 compression it came to 180KB, or 121KB if bzip2'd instead.
That implies massive redundancy, as I said in my previous post: 5.8MB down to 180KB is a factor of about 32. If you compress a typical piece of "real world" text (e.g. a book) you'll typically get a compression factor of only 2 or 3.
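
For anyone who wants to try the comparison themselves, here is a rough Python sketch using the standard gzip and bz2 modules. Everything in it is made up for illustration: make_log_like() fabricates Apache-style lines (it is not JoeD's log), and make_random_letters() is only a crude low-redundancy stand-in, not real prose. To check the "factor of 2 or 3" figure for books, read an actual book file into bytes and pass it to compression_ratio() instead.

```python
import bz2
import gzip
import random


def compression_ratio(data: bytes, compress) -> float:
    """Original size divided by compressed size (higher = more redundancy)."""
    return len(data) / len(compress(data))


def gzip9(data: bytes) -> bytes:
    # Same compression level as `gzip -9` on the command line.
    return gzip.compress(data, compresslevel=9)


def bzip2(data: bytes) -> bytes:
    return bz2.compress(data, compresslevel=9)


def make_log_like(n_lines: int, seed: int = 0) -> bytes:
    """Hypothetical access-log lines: mostly identical, a few varying fields."""
    rng = random.Random(seed)
    paths = ["/index.html", "/forum/showthread.php", "/images/logo.png"]
    lines = []
    for i in range(n_lines):
        ip = f"192.168.0.{rng.randint(1, 20)}"
        path = rng.choice(paths)
        lines.append(
            f'{ip} - - [17/May/2012:08:{i % 60:02d}:00 +0000] '
            f'"GET {path} HTTP/1.1" 200 5120\n'
        )
    return "".join(lines).encode("ascii")


def make_random_letters(n_chars: int, seed: int = 0) -> bytes:
    """Uniformly random letters and spaces: very little for a compressor to exploit."""
    rng = random.Random(seed)
    alphabet = "abcdefghijklmnopqrstuvwxyz "
    return "".join(rng.choices(alphabet, k=n_chars)).encode("ascii")


if __name__ == "__main__":
    samples = {
        "log-like": make_log_like(5000),
        "random letters": make_random_letters(200_000),
    }
    for name, data in samples.items():
        print(
            f"{name}: gzip -9 factor {compression_ratio(data, gzip9):.1f}, "
            f"bzip2 factor {compression_ratio(data, bzip2):.1f}"
        )
```

The log-like sample should compress far better than the random letters, simply because nearly every line repeats the one before it. Real prose normally lands somewhere between the two.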