MobileRead Forums > E-Book General > News

05-17-2012, 04:06 AM   #226
HarryT
eBook Enthusiast
Posts: 85,557
Karma: 93980341
Join Date: Nov 2006
Location: UK
Device: Kindle Oasis 2, iPad Pro 10.5", iPhone 6
Quote:
Originally Posted by morantis
Our client list, which is 540,000 email addresses and names(a little shorter than a log line) is only 37 KB.
Perhaps you mean 37MB? 37kB for 540,000 addresses is rather less than a tenth of a byte per client.
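For anyone who wants to check the arithmetic, here is a minimal sketch in Python. The function name bytes_per_entry is made up for illustration, and it assumes 1 KB = 1024 bytes; using 1000 instead does not change the conclusion.

```python
KB = 1024          # assumption: binary kilobytes
MB = 1024 * KB


def bytes_per_entry(total_bytes: int, entries: int) -> float:
    """Average number of bytes available for each entry in the file."""
    return total_bytes / entries


entries = 540_000                                  # figure quoted from the post
as_claimed = bytes_per_entry(37 * KB, entries)     # about 0.07 bytes per client
if_megabytes = bytes_per_entry(37 * MB, entries)   # about 72 bytes per client

print(f"37 KB: {as_claimed:.3f} bytes per entry")
print(f"37 MB: {if_megabytes:.1f} bytes per entry")
```

Roughly 72 bytes per entry is at least plausible for an uncompressed name plus email address, which is why 37MB looks like the likelier figure.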

05-17-2012, 04:19 AM   #227
plib
Guru
Posts: 777
Karma: 6356004
Join Date: Jan 2012
Device: Kobo Touch
Quote:
Originally Posted by morantis
I didn't use a router command, lol, wow, really? I used the debugger port for my router through SSH, and to the other gentleman that has that huge file as a log, something is very wrong there. I just peeked at our web server log, which maintained about the same line formatting and is still maintaining a full log for Apache at 7 years and it is right around 87 KB, just like any other file of that type. Our client list, which is 540,000 email addresses and names(a little shorter than a log line) is only 37 KB.
As each character in a name or address takes a byte, even if every email address amounted to just "@", that would be 527KB. Admittedly my logs include a lot more than just the connection IP address, but your math seems a little off.

Also, could you explain how your "nothing special" router is running Apache as a basic service, seeing as we're talking about basic, unaltered consumer routers? And I'd be interested to know how you can operate SSH on the "debugger port" of your router. Does the router come with a special install of Dropbear? How did you program your public/private key combo to get into SSH on the "debugger port"?

Oh, and the Linux gurus on the DD-WRT forums would really like to know what the address of the "debugger port" is. I'm presuming it's not 22, as that's the standard port rather than a "debugger" address, and most people I know don't operate SSH on the standard port anyway as it gets pinged all the time from China. How do you access it? PuTTY, WinSCP, some manufacturer utility I could download from their website?

Enquiring minds would like to know.

Last edited by plib; 05-17-2012 at 04:52 AM.

05-17-2012, 06:36 AM   #228
JoeD
Guru
Posts: 895
Karma: 4383958
Join Date: Nov 2007
Device: na
Quick comment on the file sizes: there's a chance the logs are stored compressed and decompressed on the fly (zless) for reading. It's common for a 1MB log to compress to 50-60KB gzip'd (30-40KB bzip2'd, although I don't think zless supports those).

I've checked the Apache log for my internal server. It's currently at 6MB (180KB when compressed) and the earliest entry is mid February. That's a low-activity server that is accessible only via the LAN. I wouldn't be surprised if a publicly accessible Apache server had much larger logs, even if it's used infrequently, just from all the worms and automated exploit attempts that will appear in it.

In terms of web access, if you combine all the people who use our connection, we visit websites far more often than I access the Apache server. When you consider that connecting to a single website and loading one page can result in tens of connections entering the log (image loading, ad access, pulling in scripts, CSS...), the logs generated for even a home user are going to strain routers that don't have dedicated space for logs. So I would take an educated guess that if my router did log all access, the logs would be significantly more than 6MB per month.

If you've used SSH to access the logs, that still implies you're running some sort of command on the router, as most routers that allow telnet or SSH support multiple configuration commands once connected. What router model do you have, and what did you execute to view such detailed logs?

The only way I can get long-term logs out of my router is to configure it to log to an external server that supports syslog.

I'm sure there are routers available with plenty of space for logging even in high-usage areas. However, I doubt most consumer routers come with more than a token amount of space.

Last edited by JoeD; 05-17-2012 at 07:03 AM.

05-17-2012, 06:42 AM   #229
HarryT
eBook Enthusiast
Quote:
Originally Posted by JoeD
Quick comment on the file sizes, there's a chance the logs are compressed and uncompressed (zless) on the fly to read. It's common for a 1MB log to compress to 50-60KB gzip'd (30-40KB bzip'd although don't think zless supports those).
I find that hard to believe. You can get a factor of 2 or perhaps 3 when you compress a text file, but a factor of 20? Unless the information in the file is massively redundant (lots of repeated strings) I just don't see how it could be done.

05-17-2012, 07:13 AM   #230
JoeD
Guru
Quote:
Originally Posted by HarryT
I find that hard to believe. You can get a factor of 2 or perhaps 3 when you compress a text file, but a factor of 20? Unless the information in the file is massively redundant (lots of repeated strings) I just don't see how it could be done.
It's very possible with text. I won't pretend to know the algorithms in detail, but if you google Huffman coding, that's one of the techniques they use to compress so heavily.

The numbers I used are real. I made a fresh copy of my Apache log, 5.8MB in size (yes, I rounded to 6MB in my OP). After gzip -9 compression it is 180KB, or 121KB with bzip2 instead.

Don't forget, when it comes to logs, there can be quite a lot of repetition. For example, my private IP will occur numerous times, dates will be repeated multiple times, pages accessed might occur several times.
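If anyone wants to try this without a real server, here is a rough sketch in Python. The log lines are entirely synthetic (the IPs, paths and sizes are invented, and make_fake_log is a hypothetical helper), so the exact ratios will differ from a real Apache log. It only shows the kind of repetition described above being exploited by gzip and bzip2.

```python
import bz2
import gzip
import random


def make_fake_log(lines: int, seed: int = 0) -> bytes:
    """Build a synthetic access log; every value here is invented."""
    rng = random.Random(seed)
    ips = [f"192.168.1.{n}" for n in range(10, 15)]
    paths = ["/", "/index.html", "/style.css", "/logo.png", "/app.js"]
    second = 0
    out = []
    for _ in range(lines):
        second += rng.randint(0, 3)  # requests arrive a few seconds apart
        hh, mm, ss = second // 3600 % 24, second // 60 % 60, second % 60
        out.append(
            f'{rng.choice(ips)} - - [13/May/2012:{hh:02d}:{mm:02d}:{ss:02d} -0400] '
            f'"GET {rng.choice(paths)} HTTP/1.1" 200 {rng.randint(200, 9000)}\n'
        )
    return "".join(out).encode("ascii")


def ratio(raw: bytes, packed: bytes) -> float:
    """How many times smaller the packed data is."""
    return len(raw) / len(packed)


log = make_fake_log(20_000)
gz = gzip.compress(log, compresslevel=9)   # same level as `gzip -9`
bz = bz2.compress(log, compresslevel=9)

print(f"raw {len(log)} bytes, gzip {ratio(log, gz):.1f}x, bzip2 {ratio(log, bz):.1f}x")
```

Swapping in a real log file (read as bytes) in place of make_fake_log would give numbers comparable to the ones quoted in this thread.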

Last edited by JoeD; 05-17-2012 at 07:17 AM.

05-17-2012, 07:15 AM   #231
HarryT
eBook Enthusiast
Quote:
Originally Posted by JoeD
It's very very possible with text. I won't pretend to know anything about the algorithms used in detail, but if you google huffman encoding, that's one of the ways they can compress so heavily.

The numbers I used are real. I made a fresh copy of my apache log, 5.8MB in size (yes I rounded to 6MB in my OP). After gzip -9 compression, 180KB gzip'd, or if bzip2'd instead 121KB.
That implies massive redundancy, as I said in my previous post. If you compress a typical piece of "real world" text (e.g. a book), you'll typically get a factor of 2 or 3 in compression.

05-17-2012, 07:21 AM   #232
jkeene
Connoisseur
Posts: 83
Karma: 85586
Join Date: Nov 2010
Device: Kindle 3
I just compressed a recent log from one of our Apache servers with WinZip: 2,995,170 bytes in, 87,252 bytes out, a 97% reduction.

A typical line in the log is
Quote:
[ip address removed] - [userid removed] [13/May/2012:20:01:08 -0400] "PROPFIND /svn-repos/Main/!svn/vcc/default HTTP/1.1" 207 414
As accesses usually arrive as a burst of consecutive lines, the timestamps, IP addresses, user IDs and URLs are quite similar and very compressible.
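To put those byte counts in the same terms as the "factor of 20" question earlier in the thread, here is a small sketch in Python. The helper name compression_stats is made up; the two byte counts are the ones reported above.

```python
from typing import Tuple


def compression_stats(original_bytes: int, compressed_bytes: int) -> Tuple[float, float]:
    """Return (fraction of space saved, factor by which the file shrank)."""
    saved = 1 - compressed_bytes / original_bytes
    factor = original_bytes / compressed_bytes
    return saved, factor


saved, factor = compression_stats(2_995_170, 87_252)
print(f"saved {saved:.1%}, i.e. about {factor:.0f}x smaller")
```

A 97% saving corresponds to a factor of roughly 34, so this particular log shrank by more than the factor of 20 that was being doubted.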

All that notwithstanding, typical consumer routers don't run Apache.

05-17-2012, 07:53 AM   #233
JoeD
Guru
Quote:
Originally Posted by HarryT
That implies massive redundancy, as I said in my previous post. If you compress a typical piece of "real world" text (eg a book) you'll typically get a factor of 2 or 3 from it in the way of compression.
Yes, but we're discussing log files and they have a lot of redundancy.

For a book, yes I'd expect something in the range 2-4x compression, depending on the text.

05-17-2012, 07:57 AM   #234
HarryT
eBook Enthusiast
Quote:
Originally Posted by JoeD
Yes, but we're discussing log files and they have a lot of redundancy.

For a book, yes I'd expect something in the range 2-4x compression, depending on the text.
I agree, but I was actually thinking of the earlier poster's claim that a list of 540,000 e-mail addresses and names had been compressed to 37k. That's just... improbable ... no matter how you look at it.

05-17-2012, 08:02 AM   #235
JoeD
Guru
Quote:
Originally Posted by HarryT
I agree, but I was actually thinking of the earlier poster's claim that a list of 540,000 e-mail addresses and names had been compressed to 37k. That's just... improbable ... no matter how you look at it.
Ah, I was referring to his log file size (where he mentioned Apache and 87KB for 7 years' worth). A client list might compress quite well if there are a lot of repeated names and domain names, but I agree 37k sounds very unlikely.

Last edited by JoeD; 05-17-2012 at 08:08 AM.

05-17-2012, 09:19 AM   #236
Rob Lister
Fanatic
Posts: 532
Karma: 3293888
Join Date: Oct 2011
Location: Virginia
Device: Nook Simple Touch
Quote:
Originally Posted by morantis
I didn't use a router command, lol, wow, really? I used the debugger port for my router through SSH, and to the other gentleman that has that huge file as a log, something is very wrong there. I just peeked at our web server log, which maintained about the same line formatting and is still maintaining a full log for Apache at 7 years and it is right around 87 KB, just like any other file of that type. Our client list, which is 540,000 email addresses and names(a little shorter than a log line) is only 37 KB.


I don't want to say you're being evasive but nothing in the post above relates to the claim you made and my request for evidence of it. I'll make it simple for you:

You claimed, and I quote:
Quote:
Just to let you know, having the logging function disabled on the router does not stop logs from being created, you simply don't have access to them through the consumer or administrator interface. I can go into any router anywhere anytime and pull every website visited, providing that it is not five years ago and the logs have gone past the allocated size, but I guarantee that I can pull at least 3 to 4 years worth of logs.
I have a Cisco E2000 router. I've had it for 2 years. So tell us what process or procedure you would use to pull logs for the last 2 years.

Last edited by Rob Lister; 05-17-2012 at 09:24 AM.

05-17-2012, 10:17 AM   #237
plib
Guru
Quote:
Originally Posted by Rob Lister
I have a Cisco E2000 router. I've had it for 2 years. So tell us what process or procedure you would use to pull logs for the last 2 years.
I'll go for a bit of that.

Router: Cisco/Linksys WRT610N v2
Firmware: DD-WRT v24-sp2 (08/12/10) std-usb-ftp - build 14929 with Frater's OTRW installed.

Just to make it easier for him, I already have SSH access set up via either PuTTY or WinSCP. I'm not sure how to get at his "debugger port", but I'm sure he can tell us.

05-17-2012, 10:49 AM   #238
ApK
Award-Winning Participant
Posts: 7,390
Karma: 68329346
Join Date: Feb 2010
Location: NJ, USA
Device: Kindle
Quote:
Originally Posted by JoeD
Ah, I was referring to his log file size (where he mentioned apache and 87KB for 7 years worth). A client list might compress quite well if there's a lot of repeated names and domain names, but I agree 37k sounds very unlikely.
Just because I'll do almost anything to avoid real work, I just created an email list with 1.2 million name/address entries.
There is a lot of repetition, but I did make random changes to groups of a hundred thousand or so at a time.

The raw text file is 47.8 MB. 7Zip's Ultra level compression gets it down to UNDER 8 KILOBYTES. Pretty impressive. I'm going to spend a few minutes trying with one of my actual huge log files and see what happens.
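A scaled-down version of this experiment can be sketched in Python. This is not the actual list or tool used above: the names and addresses are invented, make_client_list is a hypothetical helper, the list has 120,000 entries rather than 1.2 million, and Python's lzma module at its default preset stands in for 7-Zip's Ultra setting.

```python
import lzma


def make_client_list(groups: int, per_group: int) -> bytes:
    """Each group repeats one invented name/address line many times."""
    chunks = []
    for g in range(groups):
        line = f"Client Number{g} <client{g}@example.com>\n"
        chunks.append(line * per_group)
    return "".join(chunks).encode("ascii")


raw = make_client_list(groups=12, per_group=10_000)   # 120,000 entries
packed = lzma.compress(raw)                           # default preset, not "Ultra"

print(f"{len(raw)} bytes raw, {len(packed)} bytes packed")
```

The point is only that a list built from large blocks of near-identical lines collapses to a tiny fraction of its raw size, whatever the exact compressor.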

05-17-2012, 10:53 AM   #239
plib
Guru
Quote:
Originally Posted by ApK
Just because I'll do almost anything to avoid real work, I just created an email list with 1.2 million name/address entries.
So you decided becoming a spammer isn't real work?

(Not that I'm disagreeing)

05-17-2012, 10:54 AM   #240
HarryT
eBook Enthusiast
Quote:
Originally Posted by ApK
Just because I'll do almost anything to avoid real work, I just created an email list with 1.2 million name/address entries.
There is a lot of repetition, but I did make random changes to groups of a hundred thousand or so at a time.

The raw text file is 47.8 MB. 7Zip's Ultra level compression gets it down to UNDER 8 KILOBYTES. Pretty impressive. I'm going to spend a few minutes trying with one of my actual huge log files and see what happens.
Yeah - if you have 100,000 identical lines, they will essentially be replaced by one marker saying "repeat this 100,000 times". If you made your file 10 million lines long, it would still compress to 8k. Artificial cases like that aren't a terribly good test, because they compress in a way that "real" data doesn't.
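The difference between artificial and varied data is easy to see with a small sketch in Python. Everything here is an assumption for illustration: the addresses are random 12-letter strings at example.com, the helper names are made up, and zlib's DEFLATE is used as a stand-in for the 7-Zip compressor mentioned above.

```python
import random
import string
import zlib


def packed_size(text: str) -> int:
    """Size in bytes after DEFLATE at the highest compression level."""
    return len(zlib.compress(text.encode("ascii"), 9))


def random_address(rng: random.Random) -> str:
    """An invented address with a random 12-letter local part."""
    local = "".join(rng.choice(string.ascii_lowercase) for _ in range(12))
    return f"{local}@example.com\n"


rng = random.Random(0)
count = 20_000
identical = random_address(rng) * count                          # one line repeated
distinct = "".join(random_address(rng) for _ in range(count))    # every line differs

print("identical lines:", packed_size(identical), "bytes")
print("distinct lines: ", packed_size(distinct), "bytes")
```

Both inputs are the same length before compression, so any gap in the output sizes comes purely from how repetitive the content is. A real client list would presumably fall somewhere between the two extremes.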

All times are GMT -4.