This is a guest post from Tim Hollis, VP of Operations at JetApps! JetApps has returned this year to exhibit at the cPanel Conference, October 1st – 3rd in Houston, Texas. If you haven’t already, take a look at the agenda, book your room (discounted rates apply until September 9th!), and get registered!
As a software company, nothing makes us happier here at JetApps than hearing stories of how JetBackup has significantly improved the backup and restore process for hosting providers. Please enjoy the following true story of a JetBackup user who found a way to reduce his backup window by 50%!
On A Mission…
We were on a mission to find a way to run daily backups of our entire farm of 150 servers onto a single backup destination server. Our ultimate goal was to complete this daunting task in a backup window of 7 hours or less, between 11pm and 6am, to avoid disturbing our clients during business hours. Before continuing this true story of a “Mission Impossible” task that would eventually become reality, let’s discuss the technical specs of our configuration:
1st Run: Here We Go!
Our first backup job run time was not pretty, to say the least. But without incremental backups kicking in to reduce the amount of data to back up, it was hard to tell exactly where we stood.
Conclusion: We need 1 full incremental backup to better understand how long each server takes to run backups.
2nd Run: First Incremental Backup
A great improvement over the first run, but still nowhere near our goal of 7 hours or less to fit inside our backup window.
Conclusion: There is a major IO & network bottleneck. Having all the servers run at the same time is clearly slowing down the backup process. We decided to use JetBackup 3.1’s push notification API call to send a remote command to a log server, recording the start and end times of each backup job on each server. After waiting another 24 hours to collect the data, we did some calculations and decided we needed to expand our backup window to 12 hours, from 10pm to 10am. This allowed us to schedule 20 server connections to our destination server at a time, reducing the IO stress on our destination server.
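The staggering described above can be sketched roughly as follows. This is only an illustration, not JetBackup's scheduler or the exact method used in this story: the server names, the `plan_waves` helper, and the idea of splitting the window into equal slots are all assumptions. The only real figures are the ones from the story (150 servers, 20 connections at a time, a 12-hour window starting at 10pm).

```python
from datetime import datetime, timedelta
import math


def plan_waves(servers, max_concurrent, window_start, window_hours):
    """Split servers into waves of at most max_concurrent and give each
    wave an equal slot inside the backup window (hypothetical helper)."""
    wave_count = math.ceil(len(servers) / max_concurrent)
    slot = timedelta(hours=window_hours) / wave_count
    plan = []
    for i in range(wave_count):
        batch = servers[i * max_concurrent:(i + 1) * max_concurrent]
        plan.append((window_start + i * slot, batch))
    return plan, slot


# Made-up server names; the date is arbitrary, only the 10pm start matters.
servers = [f"server{n:03d}" for n in range(1, 151)]
window_start = datetime(2000, 1, 1, 22, 0)
plan, slot = plan_waves(servers, 20, window_start, 12)

for start, batch in plan:
    print(start.strftime("%H:%M"), len(batch), "servers")
```

With these numbers the window divides into 8 waves of 90 minutes each, with the last wave holding the remaining 10 servers. The per-server run times collected on the log server could then be used to decide which servers share a wave.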
3rd Run: Improved Scheduling and Hours
We made quite a bit of progress reducing the total backup time per server, but we still have work to do!
Conclusion: We need a closer look to see what exactly is happening in real time. After running “top” and “ps” commands throughout the night we found some interesting results:
- IOTOP showed consistent 30MB-40MB of both read and write, squeezing the most out of the disks.
- Server CPU load was hovering between 40-60%. Working on the bash CLI was smooth, though any action relating to the “/home” 12TB mount had a 1-3 second delay.
- All the servers were running either “find” or “cp” commands that consumed most of the disks’ IO.
It was time to harness the power of JetBackup 3.2! The need to use “find” and “cp” commands was completely removed, as JetBackup 3.2 now creates everything “on the fly” using hardlinks. In addition, this newer version of JetBackup improved general speed by using open sockets, saving roughly half a second per account. That might not sound like much, but when you are backing up 15,000 cPanel accounts it adds up: it shaved 2 hours off our backup time!
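The arithmetic behind that 2-hour figure is easy to check. The snippet below is just a back-of-the-envelope calculation using the two numbers quoted in the story; the variable names are ours.

```python
accounts = 15_000          # cPanel accounts, from the story
saved_per_account = 0.5    # seconds saved per account (approximate figure from the story)

total_saved_seconds = accounts * saved_per_account
total_saved_hours = total_saved_seconds / 3600

print(f"{total_saved_seconds:.0f} seconds = {total_saved_hours:.2f} hours")
# prints "7500 seconds = 2.08 hours"
```

Note that this is the cumulative saving summed over all accounts.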
4th Run: JetBackup 3.2 Pays Off!
We shaved off another hour per server just by optimizing JetBackup! Good results, but not good enough. We can do better.
Conclusion: We need to solve the disk IO speed limits. So it was back to the drawing board to reconfigure our backup destination. We took an old Dell R510 server and filled it up with the following:
- 2 x 146GB 15K (for booting FreeNAS)
- 10 x 6TB SATA hard drives configured with RAID 5 (for storage data)
- 2 x 1TB SSD hard drives configured with RAID 1 (for write cache)
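For a rough sense of how much space this layout provides, here is a small sketch. It assumes textbook RAID 5 and RAID 1 behavior and ignores filesystem overhead; the helper names are made up for illustration.

```python
def raid5_usable_tb(drives, size_tb):
    # RAID 5 spends one drive's worth of space on parity.
    return (drives - 1) * size_tb


def raid1_usable_tb(size_tb):
    # RAID 1 mirrors the data, so usable space equals a single drive.
    return size_tb


print("Storage (10 x 6TB, RAID 5):", raid5_usable_tb(10, 6), "TB")
print("Write cache (2 x 1TB, RAID 1):", raid1_usable_tb(1), "TB")
```

Under those assumptions the storage array offers about 54TB of raw usable space and the SSD write cache about 1TB.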
5th Run: SSD Cache & FreeNAS
AMAZING! 150 servers, 15,000+ cPanel accounts backed up in 0.5 – 2.5 hours per server!
The SSD write cache made a HUGE difference.
Conclusion: I had to see this with my own eyes to believe it. Sure enough, when I logged into the destination server the next night, the write speed was around 300MB per second! By recalculating the run time of each backup job and optimizing the schedule for each server, I should be able to shave off another 10 to 15 minutes. Integrating JetBackup 3.3’s incremental multi-scheduling feature should eliminate redundant backups and continue to reduce the total run time. I see my backup window per server being reduced to under 2 hours in the very near future!
We hope you enjoyed this true story from a hosting provider using JetBackup. The founder of JetBackup, Eli Alum, will be presenting in Houston, Texas October 2nd at the cPanel conference so don’t miss it! JetApps is also a sponsor of the cPanel conference for the 2nd year in a row. We met a lot of great cPanel partners at our booth last year and came away with some of our largest JetApps Partners. Please come by and visit us at our booth. We would love to hear how you currently configure your backups and provide you with some tips on how to make them even faster and more efficient!