Content Distribution with Rsync
By Bryant Durrell
At first glance, the rsync utility may seem like one of myriad solutions for copying files from one machine to another. Yet, when used on large systems, rsync is a powerful tool for content distribution and synchronization. And when it comes to speed and security, rsync is superior to previous generic file distribution mechanisms.
Rsync's most direct ancestors are the suite of "r" utilities: rlogin, rsh, and rdist. These command-line programs are most famous for being serious security holes. Fortunately, the holes have been plugged on most modern Unix distributions. But using rdist, rsync's closest relative, is still potentially risky. Once you've opened up rdist, you've opened up all the other "r" commands for every user on that system.
Why take the risk? Rsync can run as a secure, stand-alone server, and that reduces the possibility of compromising your password file. Rsync is also quicker than rdist. The rsync protocol transfers only the differences between two sets of files, so if you have a multimegabyte content tree, the entire tree won't need to be transferred each time you update.
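To make the idea of transferring only the differences concrete, here is a minimal Python sketch. It is not rsync's actual protocol: it assumes fixed-size blocks and a plain SHA-256 lookup, it tests every byte offset with a full hash instead of a cheap rolling checksum, and the names block_signatures, make_delta, and apply_delta are hypothetical. The receiver summarizes the copy it already has, the sender describes the new file as "copy this old block" or "here are some new bytes," and only the new bytes need to cross the network.

```python
import hashlib

BLOCK_SIZE = 4  # deliberately tiny so the example is easy to trace by hand


def block_signatures(old: bytes, block_size: int = BLOCK_SIZE) -> dict:
    """Map the hash of each fixed-size block of the old data to its offset."""
    sigs = {}
    for offset in range(0, len(old), block_size):
        block = old[offset:offset + block_size]
        # Keep the first offset seen for any repeated block.
        sigs.setdefault(hashlib.sha256(block).hexdigest(), offset)
    return sigs


def make_delta(new: bytes, sigs: dict, block_size: int = BLOCK_SIZE) -> list:
    """Describe the new data as ("copy", offset, length) and ("literal", bytes) steps."""
    delta = []
    literal = bytearray()
    i = 0
    while i < len(new):
        window = new[i:i + block_size]
        offset = sigs.get(hashlib.sha256(window).hexdigest())
        if offset is not None:
            # The receiver already has this block: flush pending new bytes,
            # then refer to the old block instead of sending its contents.
            if literal:
                delta.append(("literal", bytes(literal)))
                literal.clear()
            delta.append(("copy", offset, len(window)))
            i += len(window)
        else:
            # No match at this position: send one byte and slide forward.
            literal.append(new[i])
            i += 1
    if literal:
        delta.append(("literal", bytes(literal)))
    return delta


def apply_delta(old: bytes, delta: list) -> bytes:
    """Rebuild the new data from the old data plus the delta instructions."""
    out = bytearray()
    for op in delta:
        if op[0] == "copy":
            _, offset, length = op
            out += old[offset:offset + length]
        else:
            out += op[1]
    return bytes(out)


if __name__ == "__main__":
    old = b"the quick brown fox jumps over the lazy dog"
    new = b"the quick red fox jumps over the lazy dog"
    delta = make_delta(new, block_signatures(old))
    sent = sum(len(op[1]) for op in delta if op[0] == "literal")
    print("rebuilt correctly:", apply_delta(old, delta) == new)
    print("literal bytes sent:", sent, "of", len(new))
```

In this sketch, a small edit in the middle of a file costs only the changed bytes plus a list of block references, which is why a large content tree with few changes updates quickly.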
Furthermore, rsync includes reasonably sophisticated mechanisms for ensuring the completeness of a copy. If a file transfer is interrupted for some reason (perhaps the network died halfway through), the partial file is deleted so as not to leave incomplete files on the server.
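The general pattern behind this kind of guarantee can be sketched in a few lines of Python. This illustrates the idea only and makes no claim about rsync's internals; the function name copy_atomically and the ".partial-" prefix are hypothetical. The copy is written to a temporary file in the destination directory and renamed into place only once it is complete. If anything goes wrong, the temporary file is removed and any existing destination file is left untouched.

```python
import os
import shutil
import tempfile


def copy_atomically(src: str, dest: str) -> None:
    """Copy src to dest so that dest is never left half-written."""
    dest_dir = os.path.dirname(os.path.abspath(dest))
    # Create the temporary file next to the destination so the final
    # rename stays on one filesystem.
    fd, tmp_path = tempfile.mkstemp(dir=dest_dir, prefix=".partial-")
    try:
        with os.fdopen(fd, "wb") as tmp, open(src, "rb") as source:
            shutil.copyfileobj(source, tmp)
        # Only a complete copy is moved into place.
        os.replace(tmp_path, dest)
    except BaseException:
        # Interrupted or failed: delete the partial file, then re-raise.
        try:
            os.unlink(tmp_path)
        except FileNotFoundError:
            pass
        raise
```

A caller would use it as copy_atomically("index.html", "/var/www/index.html"), where both paths are placeholders. Readers of the destination see either the old file or the complete new one.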
One popular alternative to rsync is CVS, which gives you version control along with file distribution. Version control systems are essential to good content management, but CVS might not be the right option for you.