Service Alert: Multiple Customer Outage
As we work to improve our web hosting infrastructure, we are also working hard to improve our communication links with you. As you may have noticed, over the past few months, we have implemented a staff blog on our website, as well as a Customer Support Forum, where you can ask and answer each other’s technical questions. We will also work hard to answer each question. By now, you should have a user account on Smooth Stone Services with which you are able to open Support Tickets. You can use this same account to visit our forums.
Our blog is located at http://www.smoothstoneservices.com/blog, and our forums are located at http://www.smoothstoneservices.com/forum. Please note these resources and use them at your convenience. They are a free resource to you! We hope that they will be useful to you, in addition to our growing Hosting Knowledge Base and our Wiki.
We are also implementing a new email alert system for cases where our clients experience serious downtime. These are manually written emails that we will strive to send within 24 hours of any event that takes your website down, explaining what happened, what we did to fix it, and any additional steps taken.
Unfortunately, this is the first such email.
Cause / Scope
At approximately midnight (Eastern Time) between Thursday evening and Friday morning, all customer websites that rely on a database became unusable. The cause was a script that we run automatically every night to back up your databases: it was not deleting old database backups properly. As a result, our server’s disk filled to 100%, causing MySQL to crash.
If your website is not tied to a database, then your website was not affected. If you are unsure, please feel free to contact us.
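For readers curious about what "deleting old database backups" involves, here is a minimal sketch of an age-based cleanup step. It is not our actual backup script: the function name `prune_old_backups`, the `.sql.gz` file suffix, and the retention period are all assumptions chosen for illustration.

```python
import time
from pathlib import Path


def prune_old_backups(backup_dir, max_age_days, now=None, suffix=".sql.gz"):
    """Delete backup files older than max_age_days and return the deleted paths.

    Hypothetical helper: the suffix and retention period are assumptions,
    not details of the real backup script.
    """
    now = time.time() if now is None else now
    cutoff = now - max_age_days * 86400  # 86400 seconds per day
    deleted = []
    for path in sorted(Path(backup_dir).glob(f"*{suffix}")):
        # Only regular files whose last-modified time is before the cutoff.
        if path.is_file() and path.stat().st_mtime < cutoff:
            path.unlink()
            deleted.append(path)
    return deleted
```

A nightly job could call something like `prune_old_backups("/path/to/backups", 7)` after each backup run, so that disk usage stays bounded by roughly one week of backups. The path and the seven-day window here are placeholders.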
Fix
We received a phone call at approximately noon on Friday indicating that a customer’s website was down. As we investigated the problem, we quickly isolated the reason MySQL was unreachable and cleared out the old database backups. Full service was restored to all websites by 12:30pm.
Thus, your website was down for approximately 12 hours & 30 minutes.
This is a very long amount of downtime, and is completely unacceptable. Please accept our sincere apologies. Please also read below to learn what we have done to prevent this from happening in the future.
Additional Steps Taken
Ironically, we were planning to implement additional monitoring on our server infrastructure this weekend. However, we have already gone above & beyond what our plans had called for.
- We have now implemented a sophisticated monitoring platform that tests our server for: free disk space, server load, Apache uptime, and MySQL uptime. If any of these tests fail, automated commands will run (when appropriate), and email & text message alerts will be sent to us. We have thoroughly tested this system to ensure it works as expected.
- Fixed our MySQL database backup script to properly remove old backups.
- Implemented URL monitoring for our primary URL, www.smoothstoneservices.com, through an external service, which will help us identify and verify any downtime in the future.
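As a rough illustration of the kinds of checks listed above, the sketch below tests free disk space, server load, and whether Apache and MySQL accept connections. It is not the monitoring platform we deployed: the function names, the thresholds (90% disk, load of 8.0), and the ports (80 and 3306, the usual defaults) are assumptions, and the load check relies on `os.getloadavg`, which is only available on Unix-like systems.

```python
import os
import shutil
import socket


def disk_used_percent(path="/"):
    """Return the percentage of disk space in use on the filesystem holding path."""
    usage = shutil.disk_usage(path)
    return 100.0 * usage.used / usage.total


def port_is_open(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def evaluate(disk_pct, load1, services, disk_limit=90.0, load_limit=8.0):
    """Turn raw measurements into a list of failure messages (empty if healthy).

    The limits are illustrative assumptions, not production values.
    """
    failures = []
    if disk_pct >= disk_limit:
        failures.append(f"disk usage at {disk_pct:.0f}%")
    if load1 >= load_limit:
        failures.append(f"1-minute load average at {load1:.2f}")
    for name, reachable in services.items():
        if not reachable:
            failures.append(f"{name} is not reachable")
    return failures


def run_checks(host="127.0.0.1"):
    """Collect measurements from the local server and evaluate them."""
    services = {
        "Apache": port_is_open(host, 80),    # assumed default HTTP port
        "MySQL": port_is_open(host, 3306),   # assumed default MySQL port
    }
    load1 = os.getloadavg()[0]  # Unix-only
    return evaluate(disk_used_percent("/"), load1, services)
```

In a setup like this, a scheduler would call `run_checks()` every few minutes and send an email or text message whenever the returned list is not empty. A port check only confirms that a service accepts connections, so a real monitor would likely also run a test query or page request.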
Thank you for your continued trust in Smooth Stone Services. We appreciate your patience as we work out the kinks of moving into our new Web Hosting environment.
As always, if you have any questions, comments, or concerns, please feel free to open a Support Ticket, email us, or call us at 864-640-1329.