12-29-2006, 01:03 PM | #1 |
Fully Converged
Posts: 18,170
Karma: 14021202
Join Date: Oct 2002
Location: Switzerland
Device: Too many to count here.
|
Apologies for the outage
D'oh, we just had our first real outage. I think Murphy's Law had a hand in it, because it struck during the only two days when I couldn't access the Internet for unrelated reasons.
I wish I could say who was to blame for the outage, like a squirrel that somehow squeezed its way into our server, but the truth is: I don't know. A note popped up in my e-mail saying that our Apache server went down on December 28, 2006 05:22:23 PST. A simple restart did the trick, alas with a day's delay thanks to my absence. We'll make sure this is not going to happen again! Thanks to everyone who informed us, including David @ TeleRead, Laurens, Simon, Roland, Daniel, and everyone else! Anyway, I hope every one of you has saddled up and is ready to ride into a Happy New Year! |
12-29-2006, 01:34 PM | #2 |
Feedbooks.com Co-Founder
Posts: 2,263
Karma: 145123
Join Date: Nov 2006
Location: Paris, France
Device: Sony PRS-t-1/350/300/500/505/600/700, Nexus S, iPad
|
We use another machine to ping the server on Feedbooks. If the ping fails, it sends a text message to our cellphones, which is pretty useful. You still need an Internet connection with SSH access to the server to restart everything, though...
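For anyone who wants to try the same idea, here is a minimal sketch of such a check, meant to run from a second machine (for example from cron). It is not the Feedbooks setup: an HTTP request stands in for the ping, and `notify` is a hypothetical callable that you would wire to whatever SMS or e-mail gateway you have.

```python
import urllib.error
import urllib.request
from typing import Callable


def site_is_up(url: str, timeout: float = 10.0) -> bool:
    """Return True if the URL answers with an HTTP status below 500."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status < 500
    except urllib.error.HTTPError as err:
        # The server answered, but with an error status (4xx or 5xx).
        return err.code < 500
    except (urllib.error.URLError, OSError):
        # DNS failure, refused connection, timeout, and so on.
        return False


def check_and_alert(
    url: str,
    notify: Callable[[str], None],
    probe: Callable[[str], bool] = site_is_up,
) -> bool:
    """Probe the site once; call notify (hypothetical alert hook) if it is down."""
    if probe(url):
        return True
    notify(f"{url} is not responding")
    return False
```

Passing `print` as `notify` is enough to try it out. The `probe` parameter only exists so the check can be swapped or tested without touching the network.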
|
12-29-2006, 01:46 PM | #3 |
iLiad Maniac
Posts: 1,382
Karma: 2369
Join Date: Apr 2006
Location: Germany
Device: Bookeen Opus (i love that thing) and iPad (what an irony)
|
If a simple Apache restart did the trick, you might want to take a look at this: http://www.tildeslash.com/monit/
|
12-29-2006, 02:02 PM | #4 |
GadgetGirl
Posts: 10
Karma: 10
Join Date: Dec 2006
Location: Southern California
Device: Sony Reader / eBookman (currently misplaced)
|
Welcome back!
Here I'd just found the forum a few days ago, and you were gone! I'm glad that all it took was a simple server restart.
|
12-29-2006, 02:49 PM | #5 |
Fully Converged
Posts: 18,170
Karma: 14021202
Join Date: Oct 2002
Location: Switzerland
Device: Too many to count here.
|
Thanks for the tips, guys! It's really my fault, because I didn't plan a backup solution for a situation like this (me not being around while the server goes down). It looks as if the Apache process didn't complete a graceful restart after the logs had been rotated (which happens every day).
It's just weird because we didn't change any settings. |
12-29-2006, 03:16 PM | #6 |
Fully Converged
Posts: 18,170
Karma: 14021202
Join Date: Oct 2002
Location: Switzerland
Device: Too many to count here.
|
OK, here is what happened: every night, we rotate our log files. After rotation, the Apache server receives a "USR1" signal telling it to reload gracefully. "Gracefully" means that Apache's child processes get to finish the requests they are serving before they exit. The problem is that on rare occasions, especially under high system load, some children may still be busy finishing their requests while the master process is already reloading. When this happens, Apache fails to restart.
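To make the sequence concrete, here is a rough sketch of the step that follows log rotation: send USR1 to the Apache master process, wait a moment, then check that the master is still alive. This is not our actual rotation script. The PID file location is whatever your installation uses (it is passed in as a parameter), the settle time is an arbitrary assumption, and the code only works on POSIX systems.

```python
import os
import signal
import time
from pathlib import Path


def read_pid(pid_file: Path) -> int:
    """Read the master process ID from a PID file."""
    return int(pid_file.read_text().strip())


def is_running(pid: int) -> bool:
    """Check for a live process by sending the no-op signal 0."""
    try:
        os.kill(pid, 0)
    except ProcessLookupError:
        return False
    except PermissionError:
        # The process exists but belongs to another user.
        return True
    return True


def graceful_reload(pid_file: Path, settle_seconds: float = 5.0) -> bool:
    """Send USR1 to the master, wait, and report whether it survived."""
    pid = read_pid(pid_file)
    os.kill(pid, signal.SIGUSR1)  # the same signal described above
    time.sleep(settle_seconds)    # assumed grace period, tune to taste
    return is_running(pid)
```

If `graceful_reload` returns False, the master is gone and a full start is needed. The point of the final check is that a failed reload gets noticed right away instead of a day later.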
Googling reveals this link: Quote:
|
|
12-29-2006, 03:22 PM | #7 |
Evangelist
Posts: 490
Karma: 1641
Join Date: Oct 2006
Location: Louisville
Device: Sony Reader PRS-500
|
Just tell it to sleep for 5 minutes, that should be plenty of time for anything it is doing.
Glad to see you guys are back up. I wondered what had happened; after the site had been down for over a day, I started to wonder whether it had been taken down for some reason. I checked a few other eBook sites, but didn't see any notices or posts, so I wasn't sure what was going on. |
12-29-2006, 04:59 PM | #8 |
iLiad Maniac
Posts: 1,382
Karma: 2369
Join Date: Apr 2006
Location: Germany
Device: Bookeen Opus (i love that thing) and iPad (what an irony)
|
Alexander, try monit. It lets you monitor Apache and restart it if it doesn't respond to HTTP requests. You can make it send you an email if the restart does not work. It's a pretty awesome tool, that monit thingy. And you can monitor almost anything else on the machine.
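To show the logic described above (check, restart, escalate by email), here is a small sketch of one monitoring cycle. It is not monit's configuration syntax or its implementation, just the idea. The names `is_healthy`, `restart`, and `alert` are hypothetical hooks you would supply yourself, and the recheck delay is an assumed value.

```python
import subprocess
import time
from typing import Callable, Sequence


def watchdog_cycle(
    is_healthy: Callable[[], bool],
    restart: Callable[[], None],
    alert: Callable[[str], None],
    recheck_delay: float = 30.0,
) -> str:
    """Run one monitoring cycle and report what happened."""
    if is_healthy():
        return "ok"
    restart()
    time.sleep(recheck_delay)  # give the service time to come back
    if is_healthy():
        return "restarted"
    # The restart did not help, so a human needs to know.
    alert("restart did not bring the service back")
    return "failed"


def restart_with(command: Sequence[str]) -> Callable[[], None]:
    """Build a restart hook that runs the given command."""
    def run() -> None:
        subprocess.run(command, check=False)
    return run
```

A restart hook could be built with something like `restart_with(["apachectl", "restart"])`, assuming `apachectl` is on the path. A real tool like monit does far more than this, which is the reason to use it rather than a home-grown loop.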
|
12-31-2006, 07:33 AM | #9 |
Fully Converged
Posts: 18,170
Karma: 14021202
Join Date: Oct 2002
Location: Switzerland
Device: Too many to count here.
|
Thanks for the tips, guys!
We just installed monit and configured it to make sure that this kind of outage is not going to happen again. |
01-03-2007, 12:21 PM | #10 |
Drama Queen
Posts: 784
Karma: 11712
Join Date: Nov 2002
Location: United States
Device: Palm Tungsten T|T3
|
Happy New Year everyone.
Love, Murphy's Law |
|