11 July 2009
Backing up and synchronising data
In which I relate my past and present strategies for backing up and synchronising data small and large.
Since 1997 I've gone through many backup strategies and backup media, beginning with a pile of 1.44MB floppies, moving on to CD-RWs and DVD-RWs, then to mirroring between various machines using a combination of rsync (fast one-way sync) and Unison (slower two-way sync). Mirroring also lets me have the data available on the machines where I need it, while DVDs are too small to be worth the time it takes to write them.
The simple part is managing my small "working set" of about 300MB of code and documents: my flash drive goes everywhere, and the first and last thing I do when using a machine is run Unison to synchronise between the machine's home folder and the flash drive. Unison is speedy on Linux (although much slower than rsync on large folders) but slow on Windows, where it is better to sync just the subset that you need on that machine. This being South Africa, where internet access is neither guaranteed nor fast, I won't be trading my flash drive in for Dropbox either.
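As a rough sketch of that first-and-last-thing routine, here is how the Unison call could be wrapped in Python. The helper names, the `/media/flash` mount point and the folder names in the usage note are assumptions for illustration, not my actual setup; `-batch` (no prompts, conflicts skipped) and `-path` (restrict the sync to a sub-folder of both roots) are standard Unison options, but check them against your installed version.

```python
import subprocess


def unison_command(home, flash, paths=()):
    """Build a Unison command line for a two-way sync of home <-> flash.

    `paths` optionally restricts the sync to a subset of folders (relative
    to both roots), which helps on machines where Unison is slow.
    """
    cmd = ["unison", str(home), str(flash), "-batch"]
    for p in paths:
        cmd += ["-path", str(p)]
    return cmd


def sync(home, flash, paths=(), dry_run=True):
    """Return the command; only actually run Unison when dry_run is False."""
    cmd = unison_command(home, flash, paths)
    if not dry_run:
        subprocess.run(cmd, check=True)
    return cmd
```

On a Windows machine one might call `sync(home, "/media/flash", ["Documents/Office"], dry_run=False)` to sync only the folder needed there, and leave `paths` empty on Linux to sync everything.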
But the disk-sized data became complex: archived documents, books and PDFs, photos, music, videos. The reason was a proliferation of machines that I used, and a poor choice of naming scheme, with each computer keeping folders like "mirror/HomeLaptop" that would mirror a subset of the data "owned" by HomeLaptop. The more partitions and hosts (home laptop, home desktop, lab or office desktop, external drive, and so forth), the more rsync scripts to maintain. And, while I could read the mirrored data, using one-way sync meant I could only make changes to the data on the "owner" machine.
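To see why the script count got out of hand, here is a small sketch that enumerates the one-way mirror jobs under the old naming scheme. It assumes the worst case, where every host mirrors every other host's data; in practice only a subset was mirrored, and the function and host names are hypothetical.

```python
from itertools import permutations


def mirror_jobs(hosts):
    """List (owner, holder, folder) for each one-way mirror.

    Under the per-owner scheme, `holder` keeps a read-only copy of the
    data owned by `owner` in a folder named mirror/<owner>.
    """
    return [(owner, holder, f"mirror/{owner}")
            for owner, holder in permutations(hosts, 2)]


hosts = ["HomeLaptop", "HomeDesktop", "OfficeDesktop", "ExternalDrive"]
jobs = mirror_jobs(hosts)
print(len(jobs))  # 4 hosts give 4 * 3 = 12 one-way jobs
```

Each added host adds two jobs per existing host, so the number of scripts grows roughly with the square of the number of hosts.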
In setting up my new desktop, I've moved to a naming scheme that uses a shared namespace across all machines, and uses Unison and the Grsync GUI to make them eventually consistent, although no machine has all of the data in the namespace. The top-level "libraries" are Code, Documents, Pictures, Music, Software, and Video. Documents contains the second-level folders Home, Office, Archives and Books: Home and Office are synced by Unison to the flash drive to keep them always consistent. However, "Software/LinuxDesktop" may diverge over time between the Home and Office desktops, and become consistent later thanks to Unison, Grsync and a travelling drive. The strategy will evolve over time.
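The practical benefit of the shared namespace is that the same path means the same thing everywhere, so the folders to sync between any two machines are simply the ones they both hold. A minimal sketch, where the assignment of libraries to machines is made up for illustration:

```python
# Which paths of the shared namespace each location holds (hypothetical).
HOLDINGS = {
    "HomeDesktop": {"Code", "Documents/Home", "Documents/Archives", "Music", "Pictures"},
    "OfficeDesktop": {"Code", "Documents/Office", "Software/LinuxDesktop"},
    "FlashDrive": {"Code", "Documents/Home", "Documents/Office"},
}


def common_paths(a, b, holdings=HOLDINGS):
    """Paths held by both locations, i.e. the ones worth syncing between them."""
    return sorted(holdings[a] & holdings[b])


print(common_paths("HomeDesktop", "FlashDrive"))
print(common_paths("OfficeDesktop", "FlashDrive"))
```

Because the paths are identical on every machine, one generic sync script can serve any pair of locations, instead of one script per owner and mirror.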
Other people have different solutions. Using only one computer (with backups to an external drive) is a common one. People with multiple machines at home might mimic an office environment by having a central always-on network share (even a dedicated NAS device) where everything resides and is backed up en masse. My machines are three at home (one laptop and a dual-boot desktop) and one at the office, plus a flash disk and a disk drive in between. So I've gone with a distributed approach where I only use local storage, and get "backups" as a side effect of the drives being synchronised occasionally.