Joined: 12 Apr 2005
Subject: [How To] Back Up Your Board [Work In Progress]
Mon Apr 25, 2005 8:12 pm
One of the easiest steps you can take to protect yourself from hackers, hardware problems, server crashes, or hosting issues is to back up your forum. Yet it is surprising how few people actually take this simple step. This article covers a few items to consider and then details how the phpBB Doctor manages our backup process. For those of you who don't want to read the entire article, here's the summary.
- Back up source code before and after each major change
- Back up your data daily
- Automatically download your data backup file to a different server
If you do those three things, then you will have minimal downtime and data loss if / when hackers or hardware issues strike.
There are really two components of your board: source code and data. You generally know when your source code changes (or at least you should!). Data can change at any time. We'll cover source code strategies first, and then your data.
At the phpBB Doctor we use a three-stage configuration. We have an "alpha" board that is essentially offline. It runs on a linux server that belongs to one of our developers. Because it's offline there is no need to keep the server up 24x7; the code on that box is often in various stages of repair. At any time we can reconstruct the "alpha" box simply by restoring a copy of our production board. We use this box to play around with new ideas and for initial MOD development.
Our "beta" forum is located on the same server as our production board. As a result it is exposed to the Internet, so anyone can help test MODs. The "beta" board is used for final MOD development and testing.
Our "production" board is what you're looking at now.
So what does our backup strategy for source code look like?
Our "alpha" and "beta" boards are often restored to production status simply by copying the source (php, tpl, css, and so on) from our production board, so we won't really talk about those. Our production board is backed up before and after each code change. That way, if a problem is not discovered during alpha or beta testing, we can quickly restore a "known working" board with minimal fuss. This source backup is copied to another folder on our server and downloaded to an off-server RAID device. If our server completely crashes, we can be up and running on a new host very quickly. As a test of our process, we moved a large board (150K+ posts, 8K+ users) from one server to another in about an hour: copy the source files, update config.php and the phpbb_config table, restore the database, and we were up and running.
Backing up your source code does not have to be difficult. We keep a simple log that includes the date and a list of changes that were made. This is currently kept in a text file offline, but we intend to move this into our phpBB database so all of our developers can access it. The phpBB Group uses SourceForge and CVS to manage their coding process. But it doesn't have to be that difficult. Just make a copy of every source file in the forum folder (and include all sub-folders) and you're ready to restore. Do this before and after every code change so that you don't lose anything.
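To make the whole-folder copy concrete, here is a minimal sketch of a source backup as a single tar job. This is an illustration, not our exact script; the function name, paths, and layout are examples you would adapt to your own host:

```shell
#!/bin/sh
# Minimal sketch of a source backup: archive the whole forum folder
# (including all sub-folders) into a date-stamped tarball.
# Usage: backup_source <forum_dir> <backup_dir> -- both paths are examples.
backup_source() {
    src=$1
    dest=$2
    stamp=$(date +%Y-%m-%d)
    mkdir -p "$dest"
    # -C keeps paths in the archive relative to the forum's parent folder
    tar -czf "$dest/forum_src_$stamp.tar.gz" -C "$(dirname "$src")" "$(basename "$src")"
}
```

Run it once before a code change and once after, and keep both archives alongside your change log.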
Source backups are generally a manual process; in a perfect world, database backups are automatic. We will share the process and scripts that we use to back up the phpBB Doctor site. These scripts are specific to *nix hosts running MySQL.
We used to back up our database on Saturday night. We thought weekly backups were good enough. Then we had a SQL update script go awry on a Thursday, and lost new members that had registered on Sunday through Thursday of that week. Now we back up every day.
We use the mysqldump program to extract data from our phpBB databases. The admin control panel (ACP) provides a mechanism for taking backups, but it only includes the tables it knows about rather than everything in the database, it cannot be scheduled to run automatically, and it pulls the data down to your desktop. So here's what we do, along with some other ideas.
We use cron (a *nix utility) for scheduling. Our backups run at 11:59 PM every night. Here is the command used:
mysqldump -h localhost -u root --opt $dbname > $dbpath/backup.sql 2>>$dbpath/backup_errors.out
mysqldump is the command being run
-h localhost says that the source database is on the localhost (-h means host)
-u root means that the user (-u) is root
--opt is shorthand for a bunch of options that will be covered later
$dbname is filled in by the script as the name of the database to be backed up
> $dbpath/backup.sql says to direct the output from mysqldump into a specified path and the filename is backup.sql
2>>$dbpath/backup_errors.out says to direct stderr (file descriptor 2, where error messages go) to the file backup_errors.out. The double >> arrow means to append rather than overwrite.
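For completeness, the cron side of this is a single crontab line. The script path below is an example, not our actual path:

```shell
# min hour day month weekday  command
# run the backup script at 11:59 PM every night
59 23 * * * /usr/local/bin/backup_forums.sh
```

Edit your crontab with `crontab -e` and add a line like this one.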
This command creates a file called backup.sql in the specified folder. The next steps in the script go like this:
# create new backup filename
filename=$dbpath/$dbname.sql.gz
# zip, and copy the output
gzip $output_file
mv $output_file.gz $filename
The first line sets up a variable called filename, built from the database name variable used earlier. The gzip utility is then used to compress (zip) the output file, and finally the mv (move) command is used to put the compressed file into a specific location for downloading.
You may or may not know what "root" is. Suffice it to say you will not have access to root unless you own your own server. The "root" user is the administrator of the server, and automatically has access to everything. Our script uses root because we host a number of different web sites and root allows us to back up every database for every site all at once.
This is all well and good. So we have a compressed backup file for our database. Now what? This is where it gets good. You may have a yahoo or gmail account with 1 GB of storage. Why not email yourself a backup of your database? As long as it's under the attachment size limit you can consider this approach. However, once your board becomes more active (and we all want that, right?) then this option might not work for you anymore.
What we have done is configure an off-site linux server with ftp access to our production server. This off-site server connects to our production server every morning at 2am and downloads the backup files for the day. It then runs a script that appends the date to each backup file and copies it to a RAID device. With this strategy we have - at any given time - the most recent backup file on our production server, and daily backup files (identified by date) on a redundant disk array. At this time we have daily backups stretching back several months stored on a 250GB mirror. A mirror is the simplest type of RAID, but it still provides basic redundancy: if one disk fails, the other contains an exact copy of the data.
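The date-appending step on the off-site box can be as small as this sketch. The function name and paths are made-up examples, not our production script:

```shell
#!/bin/sh
# Sketch: take a freshly downloaded backup.sql.gz, append today's date,
# and move it onto the archive (RAID) mount.
# Usage: archive_backup <file> <archive_dir>
archive_backup() {
    file=$1
    archive=$2
    stamp=$(date +%Y-%m-%d)
    base=$(basename "$file" .gz)
    mkdir -p "$archive"
    # e.g. backup.sql.gz -> /raid/backups/backup.sql.2005-04-25.gz
    mv "$file" "$archive/$base.$stamp.gz"
}
```

Run from cron shortly after the ftp download completes, once per database file.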
The final and perhaps most important step in all of this? Test your backup files! You can have the most robust backup strategy in the universe, but if your files are screwed up to the point where you can't restore, it's the same as having no backups at all. So take your source code backup (you have one of those, right?) and your database backup, and go to another server. Upload your source, restore your database, make a few adjustments to your config table, and test your forum. If it works once, that's great. Do it again in three months or so.
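Between full restore tests, a quick automated sanity check will catch the worst failures (a corrupt archive, an empty dump). This is only a sketch, and it assumes a plain gzipped mysqldump-style SQL file; it is no substitute for actually restoring:

```shell
#!/bin/sh
# Sketch: sanity-check a gzipped SQL dump before trusting it.
# Returns non-zero if the archive is corrupt or doesn't look like a dump.
# Usage: check_backup <file.sql.gz>
check_backup() {
    f=$1
    gzip -t "$f" 2>/dev/null || return 1                 # archive integrity
    gunzip -c "$f" | grep -q 'CREATE TABLE' || return 1  # looks like a SQL dump
}
```

You could run this nightly right after the backup and mail yourself only on failure.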
We'll cover the adjustments needed when moving to a new server in another article soon. Hopefully this gives you some ideas about how to back up and maintain your board.