February 25, 2008

Synchronize Files with rsync

I have been looking for an effective way to synchronize files across machines for quite some time. I researched online and found quite a few programs that synchronize files. However, no single program stood out from the rest or had all the features I was looking for. I really needed to implement something, so last week I decided to see what I could do with rsync. After about a day, I came up with a simple shell script that worked for me. It is not glamorous, but it gets the job done. Here is an explanation of how you can set up something similar.

Requirements

The image above illustrates an example computing environment that will be used in this post. This environment can be expanded, but for the simple shell script to work, the following items are necessary.

  • Every machine has ssh and rsync installed.

  • The Sync Server is the main synchronization point. The clients (Laptop, Home and Work Desktop) always synchronize their data to it, and not to each other.

  • SSH is already configured to provide the necessary level of security for the data you are synchronizing. The proper configuration of the SSH server and clients is beyond the scope of this post.

  • The clients must use the same top-level synchronization directory. The Sync Server's top-level synchronization directory can differ from the clients'.

In this example, the following directory structure will be used to store the synchronizable data. Both the clients and the Sync Server use the same top-level synchronization directory.

/sync/ - the top-level synchronization directory
/sync/mydocs/ - directory for personal documents
/sync/apps/ - directory for application data
/sync/apps/bash/ - directory that holds the synchronization shell script (IMPORTANT)
/sync/apps/ffox/ - copy of your firefox profile, modify your ~/.mozilla/firefox/profiles.ini to point here or use a symlink
/sync/data/ - directory containing other data
/sync/otherdata/ - directory containing data that we do not want to synchronize

If you have a user account on a restricted server and want to use it as the Sync Server, simply alter your paths to use your home directory.

~yourlogin/sync/
~yourlogin/sync/mydocs/
~yourlogin/sync/apps/
~yourlogin/sync/apps/bash/
~yourlogin/sync/apps/ffox/
~yourlogin/sync/data/

Setup

  1. Create the top-level synchronization directory on all clients and the Sync Server. Set permissions and ownership as appropriate. The permissions of the files and directories are preserved during synchronization.

    $ mkdir /sync
  2. On one of your clients, create the directory that will store the shell script. In this example, that directory is /sync/apps/bash/.

    $ mkdir /sync/apps
    $ mkdir /sync/apps/bash
  3. Create the following file. Adjust the paths and the server name (sync-server.dns in this example) to match your configuration.

    /sync/apps/bash/bashrc

    # Options shared by every alias: archive mode, skip files that are newer on the receiver, verbose.
    sync_rsync_options='-auv --exclude-from=/sync/apps/bash/sync-exclude'
    # Sources and destination for each direction.
    sync_directory_up='/sync/{mydocs,apps,data} sync-server.dns:/sync/'
    sync_directory_down='sync-server.dns:/sync/{mydocs,apps,data} /sync/'

    # Upload aliases (client to Sync Server); -n makes the command a dry run.
    alias sync-up-pretend="rsync -n --delete-delay --delete-excluded ${sync_rsync_options} ${sync_directory_up}"
    alias sync-up-full="rsync --delete-delay --delete-excluded ${sync_rsync_options} ${sync_directory_up}"
    alias sync-up-update="rsync --delete-excluded ${sync_rsync_options} ${sync_directory_up}"

    # Download aliases (Sync Server to client); -O leaves directory modification times alone.
    alias sync-down-pretend="rsync -On --delete-delay ${sync_rsync_options} ${sync_directory_down}"
    alias sync-down-full="rsync -O --delete-delay ${sync_rsync_options} ${sync_directory_down}"
    alias sync-down-update="rsync -O ${sync_rsync_options} ${sync_directory_down}"

    # The variables were expanded into the alias definitions above and are no longer needed.
    unset sync_rsync_options sync_directory_up sync_directory_down

    This file provides you with six simple command aliases for synchronization. They are explained later in the Usage section.

  4. Create an exclude pattern file. Its basic purpose is to prevent temporary files and cached data from being synchronized and wasting bandwidth and storage. If you want complete synchronization, leave the file blank. Here is an example file:

    /sync/apps/bash/sync-exclude

    #General excludes
    *~

    #Mozilla Firefox
    Cache/
    XUL.mfasl
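One way to check an exclude file before pointing rsync at the real server is a dry run between two scratch directories. The sketch below is only an illustration: the temporary directory and the sample file names are made up for the demonstration, and it assumes rsync is installed on the local machine.

```shell
#!/bin/sh
# Sketch: try the exclude patterns on throwaway directories.
# All paths and file names below are hypothetical demo data.
set -eu

if ! command -v rsync >/dev/null 2>&1; then
    echo "rsync not installed; skipping the check"
    exit 0
fi

work=$(mktemp -d)
mkdir -p "$work/src/apps/ffox/Cache" "$work/dst"

# Two files that should be kept and two that should be excluded.
touch "$work/src/notes.txt" "$work/src/notes.txt~"
touch "$work/src/apps/ffox/prefs.js" "$work/src/apps/ffox/Cache/entry"

# Same patterns as the example exclude file.
cat > "$work/sync-exclude" <<'EOF'
#General excludes
*~
#Mozilla Firefox
Cache/
XUL.mfasl
EOF

# -n makes this a dry run; -i itemizes what would be transferred.
report=$(rsync -ani --exclude-from="$work/sync-exclude" "$work/src/" "$work/dst/")
printf '%s\n' "$report"

rm -rf "$work"
```

If the patterns behave as intended, notes.txt and prefs.js appear in the itemized list, while the backup file ending in ~ and the Cache directory do not.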

Installation

  1. Copy the shell script folder to every client.

  2. Add the following line to the bottom of your .bashrc file.

    ~yourlogin/.bashrc

    ...
    source /sync/apps/bash/bashrc

    This will make the new synchronization command aliases available in all future bash sessions. You can also run the above line as a command, and the new aliases will be available immediately. Now you are ready to synchronize your data!

  3. Move directories and files under the top-level synchronization directory and create symlinks to data that needs to appear somewhere else.
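The last step can be sketched as follows. This is a minimal illustration in a scratch directory, so nothing on a real system is touched; demo_home and demo_sync are hypothetical stand-ins for your home directory and /sync, and Documents is just an example name.

```shell
#!/bin/sh
# Sketch: relocate a directory under the sync tree and leave a symlink behind.
# demo_home and demo_sync are hypothetical stand-ins for ~ and /sync.
set -eu

scratch=$(mktemp -d)
demo_home="$scratch/home"
demo_sync="$scratch/sync"
mkdir -p "$demo_home/Documents" "$demo_sync"
echo "draft" > "$demo_home/Documents/todo.txt"

# Move the real directory into the sync tree...
mv "$demo_home/Documents" "$demo_sync/mydocs"
# ...and point the old location at its new home.
ln -s "$demo_sync/mydocs" "$demo_home/Documents"

# The file is still reachable through the old path.
cat "$demo_home/Documents/todo.txt"
```

Keeping the real data under the sync directory and the symlink outside it matters because rsync in archive mode copies a symlink as a symlink rather than following it.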

Usage

  • sync-up-full

    Uploads all newly modified files to the Sync Server and deletes all files that are no longer present on the client from the Sync Server. CAUTION! It may delete important files from the Sync Server. If these files exist on other clients, a sync-up-full from one of those clients will restore them. Use with care.

  • sync-down-full

    Downloads all newly modified files from the Sync Server to the client and deletes all files no longer present on the Sync Server from the client. CAUTION! It may delete important files that have not yet been synchronized to the Sync Server with sync-up-update.

  • sync-up-update

    Uploads all newly modified files to the Sync Server.

  • sync-down-update

    Downloads all newly modified files from the Sync Server.

  • sync-up-pretend, sync-down-pretend

    These commands will give you an overview of what will happen if you run sync-up-full or sync-down-full. No files are modified.
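Since the full commands can delete files, it may help to make the preview a habit. The helper below is a rough sketch of that idea and not part of the script above: sync_with_preview is a hypothetical function name, and it takes plain rsync arguments rather than the aliases, because a shell function cannot call an alias by name through a variable.

```shell
#!/bin/sh
# Sketch: show a dry run first, then ask before doing the real transfer.
# sync_with_preview is a hypothetical helper, not part of the script above.
# Its arguments are passed to rsync unchanged (options, sources, destination).
sync_with_preview() {
    # -n (--dry-run) lists what would change without touching any files.
    rsync -n "$@" || return 1

    printf 'Apply these changes? [y/N] '
    read -r sync_answer || return 1
    case $sync_answer in
        [yY]) rsync "$@" ;;
        *)    echo "Nothing changed." ;;
    esac
}

# Example call (commented out; the host name is the placeholder used in this post):
# sync_with_preview -auv --delete-delay /sync/mydocs sync-server.dns:/sync/
```

Anything other than y or Y leaves both sides untouched, which mirrors running a pretend command and then deciding not to follow it with the full one.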

While this is a very simple synchronization setup, so far it has worked well for me. It is still a work in progress. I encourage you to read the rsync manual. It explains in detail the exclude patterns and the source and target specification rules.
