I work with MongoDB on a day-to-day basis, and one of the challenges is backing up the different databases we have deployed. Backups are essential for recovery in case of a mishap. Our workloads are mostly write-heavy and the databases keep growing, and so do the chances of failure. Initially we took backups manually, which was fine while traffic was limited, but as registrations and engagement increased we started looking to automate the process. The Linux cron job scheduler and Amazon S3 came to the rescue.
Prerequisites:
- MongoDB 2.33 or higher deployed on a Linux environment.
- Amazon S3 account to store all backup data.
Tools Required (or their alternatives):
- S3cmd: a free command-line tool and client for uploading, retrieving, and managing data in Amazon S3.
- Linux Cron utility
- Mongodump utility provided by MongoDB
- Tar archiving utility (GNU version).
- Command-line interpreter (CLI).
Installation and Configuration of S3cmd:
- Run this in the CLI or shell to install s3cmd
sudo yum --enablerepo=epel install s3cmd
- Display all buckets in your Amazon S3 account
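The section title mentions configuration, but the commands for it and for the bucket listing are not shown. A minimal sketch, assuming s3cmd's standard `--configure` and `ls` subcommands and the default config file location `~/.s3cfg`; the `S3CMD_FOUND` variable is only an illustrative helper:

```shell
# One-time interactive setup: prompts for the AWS access key and secret key
# and stores them in ~/.s3cfg. Commented out here because it needs a terminal.
# s3cmd --configure

# List all buckets as a quick check that the stored credentials work.
if command -v s3cmd >/dev/null 2>&1; then
    s3cmd ls || echo "s3cmd ls failed; check the credentials in ~/.s3cfg" >&2
    S3CMD_FOUND=yes
else
    echo "s3cmd is not installed or not on PATH" >&2
    S3CMD_FOUND=no
fi
```

If the listing shows the bucket you plan to use, the upload step later in the script should be able to reach it with the same credentials.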
Writing a Shell Script to Backup MongoDB:
- Force file synchronization and lock writes on MongoDB using fsync
mongo admin --eval "printjson(db.fsyncLock())"
- Declare some variables
TIMESTAMP=`date +%F-%H%M`  # use as is
S3_BUCKET_NAME="name-of-s3-bucket"
- Take a MongoDB dump using the mongodump utility
$MONGODUMP_PATH -h $MONGO_HOST:$MONGO_PORT -d $MONGO_DATABASE -u $USER_NAME -p $PASSWORD
- Unlock database writes
mongo admin --eval "printjson(db.fsyncUnlock())"
- Add timestamp to backup folder name
mv dump mongodb-$HOSTNAME-$TIMESTAMP
- Archive the folder (tar cf only bundles the files; use czf with a .tar.gz name if you also want gzip compression)
tar cf mongodb-$HOSTNAME-$TIMESTAMP.tar mongodb-$HOSTNAME-$TIMESTAMP
- Upload to S3
/usr/local/bin/s3cmd put mongodb-$HOSTNAME-$TIMESTAMP.tar s3://$S3_BUCKET_NAME/$S3_BUCKET_PATH/mongodb-$HOSTNAME-$TIMESTAMP.tar
- Delete local dump folder
/bin/rm -r mongodb-$HOSTNAME-*  # the * removes every local backup folder and archive for this host
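Put together, the steps above might look roughly like the sketch below. It is not the exact script we run: every default value is a placeholder, `S3CMD_PATH`, `unlock_db`, and `run_backup` are names introduced here for illustration, and the cleanup removes only the current run's files rather than everything matching `mongodb-$HOSTNAME-*`. The one deliberate change is that the database is unlocked even when mongodump fails, so a broken dump cannot leave writes blocked.

```shell
#!/bin/bash
# Placeholder settings (assumptions); override them via the environment.
MONGODUMP_PATH="${MONGODUMP_PATH:-/usr/bin/mongodump}"
MONGO_HOST="${MONGO_HOST:-localhost}"
MONGO_PORT="${MONGO_PORT:-27017}"
MONGO_DATABASE="${MONGO_DATABASE:-mydb}"
USER_NAME="${USER_NAME:-backup_user}"
PASSWORD="${PASSWORD:-change-me}"
S3_BUCKET_NAME="${S3_BUCKET_NAME:-name-of-s3-bucket}"
S3_BUCKET_PATH="${S3_BUCKET_PATH:-mongodb-backups}"
S3CMD_PATH="${S3CMD_PATH:-/usr/local/bin/s3cmd}"
HOSTNAME="${HOSTNAME:-$(hostname)}"

# Release the write lock taken by fsyncLock.
unlock_db() {
    mongo admin --eval "printjson(db.fsyncUnlock())"
}

run_backup() {
    local timestamp name dump_status status
    timestamp=$(date +%F-%H%M)
    name="mongodb-$HOSTNAME-$timestamp"

    # Flush pending writes to disk and block new ones.
    mongo admin --eval "printjson(db.fsyncLock())" || return 1

    # Dump into ./dump, then unlock whether or not the dump succeeded.
    "$MONGODUMP_PATH" -h "$MONGO_HOST:$MONGO_PORT" -d "$MONGO_DATABASE" \
        -u "$USER_NAME" -p "$PASSWORD"
    dump_status=$?
    unlock_db
    [ "$dump_status" -eq 0 ] || return "$dump_status"

    # Rename, archive, and upload; stop at the first step that fails.
    mv dump "$name" &&
    tar cf "$name.tar" "$name" &&
    "$S3CMD_PATH" put "$name.tar" "s3://$S3_BUCKET_NAME/$S3_BUCKET_PATH/$name.tar"
    status=$?

    # Remove this run's local folder and archive.
    rm -rf "$name" "$name.tar"
    return "$status"
}

# Run only when the mongo shell is available on this machine.
if command -v mongo >/dev/null 2>&1; then
    run_backup
else
    echo "mongo shell not found; backup skipped" >&2
fi
```

Keeping the steps inside one function makes the exit status meaningful: a non-zero status means the dump, the archive, or the upload failed, which is useful once the script runs unattended.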
Run this script at a particular time using crontab
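As an illustration of the scheduling step, the snippet below builds a crontab entry that would run the backup every day at 02:30. The script path, log path, and schedule are hypothetical; pick whatever suits your traffic pattern.

```shell
# Hypothetical paths; adjust to where the script and log actually live.
SCRIPT_PATH="/opt/scripts/mongodb_backup.sh"
LOG_PATH="/var/log/mongodb_backup.log"

# Cron fields: minute hour day-of-month month day-of-week, then the command.
# "30 2 * * *" means every day at 02:30.
CRON_ENTRY="30 2 * * * /bin/bash $SCRIPT_PATH >> $LOG_PATH 2>&1"

# Print the entry so it can be pasted into the editor opened by `crontab -e`.
echo "$CRON_ENTRY"
```

Redirecting both stdout and stderr to a log file keeps a record of each run, since cron jobs have no terminal to print to.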