Yesterday's Outage.
Yesterday's outage was due to a power failure at the data center.
All TextDrive servers were affected, beginning at approximately 10:15 GMT. The outage lasted approximately four hours, after which all servers except Pendrell were restored.
Here is the information posted by The Planet regarding the outage:
At approx. 4:15 AM CST, a pair of redundant Powerware 500KVA UPS units failed – creating a power failure in Section B of our D2 datacenter. Emergency teams were deployed within minutes and power was restored, however intermittent power outages have continued to occur until 6:45 AM CST. Powerware, JT Packard, and electricians are currently onsite with over 100 Planet technicians working to resolve the issue. We do NOT anticipate any further outages. A formal RFO will be released once the team debriefs. We apologize for all the issues this has caused you. We will be performing a sweep of the data center as soon as this issue is completely resolved to reboot and ensure connectivity of all Active servers. Thank you for your patience in this matter.
During the outage, Pendrell and Bidwell experienced power loss. Pendrell did not recover from it. We power cycled Pendrell many times: each time it would boot and then lose connectivity a few minutes later. Finally, one of The Planet technicians and I disconnected the network cable just before a reboot. Once the web and mail servers were back up and normalized, the network cable was plugged back in. Pendrell has been running since; it likely had trouble starting its services while connected to the network.
Yesterday’s outage left us all with some concerns, so we had a meeting at The Planet to address them.
Power Outage – How do we know it won’t happen again?
Let me first start by saying that 4 hours of downtime is the longest I have ever seen The Planet down. I am certain that this will make The Planet stronger and we won’t see them down again. That said, the focus of TextDrive is our customers and not our data center, so we will do what’s necessary to keep you running, and will be discussing some changes in the current setup to make that happen.
TextDrive has been growing fast, and as a result we now have a valuable presence at our data center. We now have “Tier 1” status. Tier 1 servers have immediate escalation rights and, in cases such as this one, are dealt with first. This means that if something goes down, something goes wrong, or we’re simply ordering a new server, we’ll get first priority.
Last, the TextDrive team has put their heads together to re-invent shared hosting (if we haven’t already). This is rather outside the scope of the ‘status blog’, so I’m going to have to kindly ask that you visit the TextDrive blog a little later today to see what’s in the works. Enjoy.
[TextDrive Status Updates] 7:48:44 AM
ARRLWeb: ARISS Packet BBS Back Up. This refers to the amateur radio packet digital data communication system on board the International Space Station. Amateur radio operators worldwide can access the packet bulletin board system on board the ISS whenever the ISS is in range.
The first time we stood outside in the darkness to spot the ISS, we waited patiently for the very bright space station to
[Common Sense Technology] 7:46:11 AM