I found a legitimate 45MB log file and skimmed it to make sure it had typically varied data (many of my huge debug logs are filled with repetitive "I'm stuck on this one file" lines).
This 45MB log compressed to under 2MB with 7z and to 4MB with zip.
By comparison, an 80MB error log with a different timestamp on every line but mostly the same message went down to 470K.
Do remember that while "real" data, especially logs, is not super repetitive, neither is it super random.
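If you want to see the same effect without digging up your own logs, here is a rough Python sketch. Everything in it is made up for illustration: the log lines are synthetic, the helper names `make_samples` and `ratios` are mine, and `zlib` and `lzma` are only stand-ins for what zip and 7z typically use (DEFLATE and LZMA), not the actual tools. Exact numbers will differ from real files; the ordering is the point.

```python
import lzma
import os
import random
import zlib


def make_samples(lines=5000, seed=0):
    """Build three synthetic inputs (all invented for this illustration)."""
    rng = random.Random(seed)
    words = ["open", "close", "read", "write", "retry", "timeout",
             "cache", "socket", "queue", "worker", "flush", "index"]
    levels = ["INFO", "WARN", "ERROR", "DEBUG"]

    # Same message on every line; only the timestamp changes.
    repetitive = "".join(
        f"12:{i // 60 % 60:02d}:{i % 60:02d} ERROR stuck on this one file\n"
        for i in range(lines)
    ).encode()

    # Varied messages: random level, random words, random id.
    varied = "".join(
        f"12:{i // 60 % 60:02d}:{i % 60:02d} {rng.choice(levels)} "
        f"{' '.join(rng.choices(words, k=rng.randint(3, 8)))} "
        f"id={rng.randrange(10**6)}\n"
        for i in range(lines)
    ).encode()

    # Truly random bytes, same size as the varied sample.
    noise = os.urandom(len(varied))

    return {"repetitive": repetitive, "varied": varied, "random": noise}


def ratios(data):
    """Return compressed size / original size for zlib and LZMA."""
    return {
        "zlib": len(zlib.compress(data, 9)) / len(data),
        "lzma": len(lzma.compress(data)) / len(data),
    }


if __name__ == "__main__":
    for name, data in make_samples().items():
        r = ratios(data)
        print(f"{name:>10}: zlib {r['zlib']:.3f}  lzma {r['lzma']:.3f}")
```

The repetitive sample should shrink the most, the random bytes should barely shrink at all (or even grow slightly), and the varied log should land in between, which is roughly where real logs tend to sit.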
OK, this was fun. Back to work for me now.