8.8 Live Updates

As we saw earlier in this chapter, a filesystem cannot be replaced in its entirety while it is mounted from the storage media where it is stored. Hence, we need ways to update a filesystem's content while it is mounted. There are quite a few ways to do this, each with its own advantages and disadvantages. In this section we will discuss three such methods: the rsync utility, package management tools, and ad hoc scripts.

8.8.1 The rsync Utility

rsync is a remote updating utility that allows you to synchronize a local directory tree with a remote server. It relies on the rsync algorithm to transfer only the differences between the local and remote files. It can preserve file permissions, file ownership, symbolic links, access times, and device entries. rsync can use either rsh or ssh to communicate with the remote server. Given its features, rsync is a good candidate for updating network-enabled embedded systems. rsync is available from its project web site, along with documentation and a mailing list, at http://samba.anu.edu.au/rsync/. In addition to the documentation available from the project's site, there is a very good introductory tutorial by Michael Holve available at http://everythinglinux.org/rsync/.

To use rsync, you must have rsync available on a server (either installed there and reachable through a remote shell such as rsh or ssh, or running as a daemon) and an rsync client running in the embedded system. I will not cover the installation of an rsync server or the detailed use of the rsync client, since they are already well covered by the tutorial mentioned earlier and the rest of the rsync documentation. I will, nevertheless, explain how to cross-compile and install rsync for use on your target.

To begin, download and extract a copy of the rsync package to your ${PRJROOT}/sysapps directory. For my UI module, for example, I used rsync 2.5.6. With the package extracted, move to its directory for the rest of the manipulations:

$ cd ${PRJROOT}/sysapps/rsync-2.5.6/

Now, configure and compile the package:

$ CC=arm-linux-gcc CPPFLAGS="-DHAVE_GETTIMEOFDAY_TZ=1" ./configure \
> --host=$TARGET --prefix=${TARGET_PREFIX}
$ make

Replace arm-linux-gcc with arm-uclibc-gcc to compile against uClibc instead of glibc. Here we must set CPPFLAGS to define HAVE_GETTIMEOFDAY_TZ to 1. Otherwise, the compilation fails because the configure script is unable to correctly determine the number of arguments used for gettimeofday( ) on the target.

With the compilation complete, install the rsync binary on your target's root filesystem and strip it:

$ cp rsync ${PRJROOT}/rootfs/bin
$ arm-linux-strip ${PRJROOT}/rootfs/bin/rsync

The stripped binary is 185 KB in size when dynamically linked with either uClibc or glibc, 270 KB when statically linked with uClibc, and 655 KB when statically linked with glibc.

The same binary can be used both on the command line and as a daemon. The --daemon option instructs rsync to run as a daemon. In our case, we will be using rsync on the command line only. To use rsync, you need to have either rsh or ssh installed on your target. rsh is available as part of the netkit-rsh package from ftp://ftp.uk.linux.org/pub/linux/Networking/netkit/. ssh is available as part of the OpenSSH package, which we will discuss in depth in Chapter 10. Though that discussion concentrates on the use of the SSH daemon generated by OpenSSH (sshd), the SSH client (ssh) is also generated during the compilation of the OpenSSH package. In the following, I will assume that you are using ssh, not rsh, since it provides a secure transfer channel. The downside to using ssh, however, is that the dynamically linked and stripped SSH client is above 1.1 MB in size, and is even larger when linked statically. rsh, on the other hand, is only 8 KB when dynamically linked and stripped.

Once rsync is installed on your target, you can use a command such as the following on your target to update its root filesystem:

# rsync -e "ssh -l root" -r -l -p -t -D -v --progress \
> 192.168.172.50:/home/karim/control-project/user-interface/rootfs/* /
root@192.168.172.50's password:
receiving file list ... done
bin/
dev/
etc/
lib/
sbin/
tmp/
usr/bin/
usr/sbin/
bin/busybox
750756 (100%)
bin/tinylogin
39528 (100%)
etc/inittab
377 (100%)
etc/profile
58 (100%)
lib/ld-2.2.1.so
111160 (100%)
lib/libc-2.2.1.so
1242208 (100%)
 ...
sbin/nftl_format
8288 (100%)
sbin/nftldump
7308 (100%)
sbin/unlock
3648 (100%)
bin/
dev/
etc/
lib/
sbin/
wrote 32540 bytes  read 2144597 bytes  150147.38 bytes/sec
total size is 3478029  speedup is 1.60

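This command copies the content of my UI module project workspace rootfs directory from my host, whose IP address is 192.168.172.50, to my target's root directory. For this command to run successfully, my host must be running sshd and must have rsync installed, because the client starts the remote rsync process through ssh. The rsync daemon is required only for transfers that address the server with the rsync:// or host::module syntax.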

The options you need are:

-e: Passes to rsync the name of the application to use to connect to the remote server. (In this case, we use ssh -l root to connect as root to the server. You could replace root with whichever username is most appropriate. If no username is provided, ssh tries to connect using the same username as the session's owner.)
-r: Recursively copies directories.
-l: Preserves symbolic links.
-p: Preserves file permissions.
-t: Preserves timestamps.
-D: Preserves device nodes.
-v: Provides verbose output.
--progress: Reports transfer progress.

While running, rsync provides a list of each file or directory copied, and maintains a counter displaying the percentage of the transfer already completed. When done, rsync will have replicated the remote directory locally, and the target's root filesystem will be synchronized with the up-to-date directory on the server.
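If you run this command often, it may be convenient to wrap it in a small shell function. The following is only a sketch: update_rootfs and the RSYNC variable are hypothetical names, not part of rsync. It passes the same options as the command above, but ends the source path with a slash instead of /*. A trailing slash tells rsync to copy the content of the directory, hidden files included, without relying on the remote shell to expand the wildcard.

```sh
#!/bin/sh
# Sketch: wrap the rsync invocation shown above in a function.
# update_rootfs and RSYNC are hypothetical names, not part of rsync.

# Usage: update_rootfs USER HOST SRC_DIR [DEST_DIR]
update_rootfs() {
    user=$1 host=$2 src=$3 dest=${4:-/}

    # RSYNC lets you select another rsync binary; it defaults to "rsync"
    "${RSYNC:-rsync}" -e "ssh -l $user" -r -l -p -t -D -v --progress \
        "$host:$src/" "$dest"
}
```

With this function defined on the target, update_rootfs root 192.168.172.50 /home/karim/control-project/user-interface/rootfs would request the same kind of transfer as the command shown earlier.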

If you would like to check which files would be updated, without carrying out the actual update, you can use the -n option to do a "dry run" of rsync:

# rsync -e "ssh -l root" -r -l -p -t -D -v --progress -n \
> 192.168.172.50:/home/karim/control-project/user-interface/rootfs/* /
root@192.168.172.50's password:
receiving file list ... done
bin/busybox
bin/tinylogin
etc/inittab
etc/profile
lib/ld-2.2.1.so
lib/libc-2.2.1.so        
 ...
sbin/nftl_format
sbin/nftldump
sbin/unlock
wrote 176 bytes  read 5198 bytes  716.53 bytes/sec
total size is 3478029  speedup is 647.20
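In a script, you may want only the list of files from a dry run. The sketch below filters rsync's verbose output from standard input. It assumes the output has the same layout as the sample above; other rsync versions may print different status lines, so treat the patterns as assumptions. The pending_files name is hypothetical.

```sh
#!/bin/sh
# Sketch: keep only the file names from "rsync -n -v" output read on stdin.
# pending_files is a hypothetical name; the patterns assume the output
# layout shown in the sample above.
pending_files() {
    grep -v -e '^receiving file list' \
            -e '^wrote ' \
            -e '^total size' \
            -e 'password:' \
            -e '^ *\.\.\.' \
            -e '^ *$'
}
```

Piping the dry-run command into pending_files | wc -l would then give a rough count of the files an update would touch.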

For more information on the use of rsync, both as a client and a server, have a look at the command's manpage and the documentation available from the project's web site.

8.8.2 Package Management Tools

Simultaneously updating all the software packages that make up a root filesystem, as we have done in the previous section using rsync, is not always possible or desirable. Sometimes, the best approach is to upgrade each package separately using a package management system such as those commonly used in workstation and server distributions. If you are using Linux on your workstation, for example, you are probably already familiar with one of the two main package management systems used with Linux, the RPM Package Manager (RPM) or the Debian package management system (dpkg), whichever your distribution is based on. Because of these systems' good track records at helping users and system administrators keep their systems up to date and in perfect working condition, it may be tempting to try to cross-compile the tools that power these systems for use in an embedded system. Both systems are, however, demanding in terms of system resources, and are not well adapted for direct use in embedded systems.

Fortunately, there are tools aimed at embedded systems that can deal with packages in a way that enables us to obtain much of the functionality provided by more powerful packaging tools without requiring as many system resources. Two such tools are BusyBox's dpkg command and the Itsy Package Management System (iPKG). The dpkg BusyBox command allows us to install dpkg packages in an embedded system. Much like other BusyBox commands, it can be optionally configured as part of the busybox binary. iPKG is the package management system used by the Familiar distribution I mentioned earlier in this book. It is available from its project web site at http://www.handhelds.org/z/wiki/iPKG, along with usage documentation. iPKG relies on its own package format, but can also handle dpkg packages.

Instructions on how to build iPKG packages are available at http://www.handhelds.org/z/wiki/BuildingIpkgs. For instructions on how to build dpkg packages, have a look at the Debian New Maintainers' Guide and the Dpkg Internals Manual, both available from http://www.debian.org/doc/devel-manuals. The use of the BusyBox dpkg command is explained in the BusyBox documentation, and the use of the ipkg tool, which is part of the iPKG package management system, is explained on the project's web site.

8.8.3 Ad Hoc Scripts

If, for some reason, the tools discussed earlier are not adapted to the task of updating an embedded system's root filesystem, we can still update it using more basic file-handling utilities. In essence, we can either copy each file using the cp command or patch sets of files using the patch command, or use a combination of both. Either way, we need to have a method to package the modifications on the host, and a method to apply the modification packages on the target. The simplest way to create and apply modification packages is to use shell scripts.

diff and patch

Although the diff and patch pair can be used to patch entire directory hierarchies, these tools deal with symbolic links as if they were ordinary files and end up copying the content of the linked file instead of creating a symbolic link. Hence, the patch created by diff -aurN oldrootfs/ rootfs/ is useless. Plans for modifying the utilities to deal appropriately with symbolic links are part of both packages' future projects.

In creating such scripts, we need to make sure that the dependencies between files are respected. If, for example, we are updating a library, we must make sure that the binaries on the filesystem that depend on that library will still be functional with the new library version. For example, the binary format used by uClibc has changed between Versions 0.9.14 and 0.9.15. Hence, any application linked with uClibc Version 0.9.14 and earlier must be updated if uClibc is updated to 0.9.15 or later. Although such changes are infrequent, they must be carefully considered. In general, any update involving libraries must be carefully carried out to avoid rendering the system unusable. For further information on the correct way to update libraries, see the "Upgrading Libraries" subsection of Chapter 7 in Running Linux.
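Before updating a library, it can help to list the binaries in your root filesystem that depend on it. The following sketch does this on the host by reading each file's dynamic section with readelf and looking for NEEDED entries. The needs_lib name and the READELF variable are hypothetical, and I assume that your cross-development toolchain provides a readelf named arm-linux-readelf; adjust it to match your setup.

```sh
#!/bin/sh
# Sketch: list the files under a directory that declare a dependency on
# a given shared library. needs_lib and READELF are hypothetical names.

# Assumption: the cross toolchain's readelf is called arm-linux-readelf
READELF=${READELF:-arm-linux-readelf}

# Usage: needs_lib ROOTFS_DIR LIBRARY_SONAME
needs_lib() {
    dir=$1 lib=$2

    find "$dir" -type f | while IFS= read -r f; do
        # Non-ELF files make readelf fail; its errors are discarded
        if "$READELF" -d "$f" 2>/dev/null | grep NEEDED | grep -qF "[$lib"; then
            printf '%s\n' "$f"
        fi
    done
}
```

For example, needs_lib ${PRJROOT}/rootfs libc.so.0 would print the files that name libc.so.0 as a required library. Statically linked binaries have no such entries and are not reported.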

8.8.3.1 Installing the patch utility

The first step in creating update scripts is having the appropriate tools available both on the host and the target. Since diff and patch are most likely already installed on your host, let's see how patch can be installed for the target.

To install patch on your target's root filesystem, start by downloading the GNU patch utility from the GNU project's FTP site at ftp://ftp.gnu.org/gnu/patch/. For my UI module, for example, I used patch 2.5.4. With the package downloaded, extract it in your ${PRJROOT}/sysapps directory.

Now, create a build directory for the utility:

$ cd ${PRJROOT}/sysapps
$ mkdir build-patch
$ cd build-patch

Configure, build, and install the package:

$ CC=arm-uclibc-gcc ../patch-2.5.4/configure --host=$TARGET \
> --prefix=${TARGET_PREFIX}
$ make LDFLAGS="-static"
$ make install

Notice that we are using uClibc and are linking the command statically. We could have also used glibc or diet libc. Regardless of the library being used, linking patch statically ensures that it will not fail to run on your target during an update because of a missing or an incomplete library installation.

The patch utility has been installed in ${TARGET_PREFIX}/bin. You can copy it from that directory to your root filesystem's /bin directory for use on your target. Once in your target's root filesystem, use the appropriate strip command to reduce the size of the utility. For example, here is how I install patch for my UI module:

$ cp ${TARGET_PREFIX}/bin/patch ${PRJROOT}/rootfs/bin
$ cd ${PRJROOT}/rootfs/bin
$ ls -al patch
-rwxrwxr-x    1 karim    karim      252094 Sep  5 16:23 patch
$ arm-linux-strip patch
$ ls -al patch
-rwxrwxr-x    1 karim    karim      113916 Sep  5 16:23 patch

8.8.3.2 Scripts for performing updates

Using the target update guidelines discussed earlier, here is a basic shell script that can be used on the host to create a package for updating the target's root filesystem:

#!/bin/sh

# File: createupdate
# Parameter $1: directory containing original root filesystem
# Parameter $2: directory containing updated root filesystem
# Parameter $3: directory where patches and updates are to be stored
# Parameter $4: updated uClibc library version

# Diff the /etc directories (diff exits with status 1 when it finds
# differences, which is the expected case here)
diff -urN "$1/etc" "$2/etc" > "$3/etc.diff"

# Copy BusyBox and TinyLogin
cp "$2/bin/busybox" "$2/bin/tinylogin" "$3/"

# Copy uClibc components (the wildcards stay outside the quotes so
# that the shell still expands them)
cp "$2"/lib/*"$4"* "$3"

The script makes a few assumptions. First, it assumes that neither /etc nor any of its subdirectories contain symbolic links. Though this is true in most cases, we can still exclude any such symbolic links explicitly using the -x or -X options. Also, the script updates BusyBox, TinyLogin, and uClibc. You need to add the appropriate cp and diff commands for your setup.

The script can be used as follows:

$ cd ${PRJROOT}
$ mkdir tmp/rootfsupdate
$ createupdate oldrootfs/ rootfs/ tmp/rootfsupdate/ 0.9.14

In this case, oldrootfs contains the root filesystem as found on the target, rootfs contains the latest version of the root filesystem, tmp/rootfsupdate contains the files and patches used to update the target, and the new uClibc version is 0.9.14.
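As mentioned, createupdate assumes that /etc holds no symbolic links. If yours does, one way to build the pattern file expected by diff's -X option is to list the base names of the links with find. This is a sketch only; the symlink_excludes name is hypothetical. Keep in mind that diff matches these patterns against base names, so a regular file that has the same name as a link elsewhere in the tree is excluded as well.

```sh
#!/bin/sh
# Sketch: print the base names of all symbolic links under a directory,
# one per line, for use as a pattern file with "diff -X".
# symlink_excludes is a hypothetical name.

# Usage: symlink_excludes DIR > PATTERN_FILE
symlink_excludes() {
    find "$1" -type l | while IFS= read -r link; do
        basename "$link"
    done | sort -u
}
```

You could then run symlink_excludes rootfs/etc > tmp/etc.exclude and add -X tmp/etc.exclude to the diff command in createupdate. The excluded links must still be updated by other means, such as ln -sf on the target.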

The following script updates the target using the update directory created above:

#!/bin/sh

# File: applyupdate
# Parameter $1: absolute path of dir containing patches and updates
# Parameter $2: old uClibc version
# Parameter $3: new uClibc version

# Patch /etc (run this script from the root directory, since -p1
# leaves paths of the form etc/... in the patch)
patch -p1 < "$1/etc.diff"

# Copy BusyBox and TinyLogin
cp "$1/busybox" "$1/tinylogin" /bin/

# Copy updated uClibc components
cp "$1"/*"$3"* /lib

# Update uClibc symbolic links
ln -sf "libuClibc-$3.so" /lib/libc.so.0
for file in ld-uClibc libcrypt libdl libm libpthread libresolv libutil
do
    ln -sf "$file-$3.so" "/lib/$file.so.0"
done

# Remove old uClibc components
rm -rf /lib/*"$2"*

This script is a little longer than the script used to create the update. The added complexity is due to the care taken in replacing the C library components. Notice that we use ln -sf instead of deleting the links and then using ln -s. This is very important because deleting the links outright would render the system unusable: dynamically linked commands, possibly including ln itself, could no longer load the C library. You would then have to shut down the target and reprogram its storage device using the appropriate means.

To run the script, copy the rootfsupdate directory to your target's /tmp directory and run the script:

# applyupdate /tmp/rootfsupdate 0.9.13 0.9.14

You can run the update script on your host to test it before using it on your actual target. Here are the steps involved:

1. From the ${PRJROOT} directory, copy the old root filesystem (possibly oldrootfs) to tmp.
2. Modify the script to remove absolute references to /. Replace, for example, references to /etc with references to etc.
3. Run the script on the copied filesystem.
4. Verify manually that everything has been updated adequately.
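Instead of editing the script for each test, you could pass the root directory as a parameter. The sketch below applies this idea to the library-replacement steps of applyupdate only. The update_uclibc name and the fourth parameter are hypothetical additions, not part of the original script. On the target the root parameter is left empty; on the host it points to the copied filesystem.

```sh
#!/bin/sh
# Sketch: the uClibc replacement steps of applyupdate with a root prefix.
# update_uclibc and its ROOT parameter are hypothetical additions.

# Usage: update_uclibc UPDATE_DIR OLD_VERSION NEW_VERSION [ROOT]
update_uclibc() {
    upd=$1 old=$2 new=$3 root=${4:-}

    # An empty version would make the wildcards below match every file
    [ -n "$old" ] && [ -n "$new" ] || return 1

    # Copy the new library files before touching any link
    cp "$upd"/*"$new"* "$root/lib" || return 1

    # Repoint the links with ln -sf, as in applyupdate
    ln -sf "libuClibc-$new.so" "$root/lib/libc.so.0"
    for file in ld-uClibc libcrypt libdl libm libpthread libresolv libutil
    do
        ln -sf "$file-$new.so" "$root/lib/$file.so.0"
    done

    # Remove the old library files last
    rm -rf "$root"/lib/*"$old"*
}
```

On the host, update_uclibc tmp/rootfsupdate 0.9.13 0.9.14 tmp/oldrootfs would exercise the same steps against the copy, leaving your workstation's own /lib untouched.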

Copying an Entire Directory Tree Without GNU cp

When building an embedded Linux system, you will often need to copy entire directories from one location to another as efficiently as possible while keeping files, directories, and symbolic links intact. I have already done this a few times in the course of my earlier explanations and have repeatedly used the cp -a command to accomplish this. Although the -a option has been part of GNU cp for some time, it may not be installed on your system if you are not using Linux. If, for some reason, GNU cp is not available on your system, you can still obtain the same result as cp -a using a command that is a combination of cd and tar. Let's take a closer look at this command and how it works. This is how the command looks in its generalized form:

$ (cd SRC_DIR && tar cf - .) | (cd DEST_DIR && tar xvf -)

This command has two parts. The one on the left of the | character changes directories to SRC_DIR and initiates a tar in that directory. Specifically, tar is told to create a tar archive of the content of the directory from which it runs and to dump the resulting archive to the standard output. In simple uses of tar for archiving, the command is followed by a greater-than sign (>) and the name of either a tape device or a disk file. Here we aren't actually saving the output; we're just using tar as a convenient way to put the files into a stream and put it elsewhere.

On Unix command shells, the | is used to create a pipe between the output of the command on the left and the input of the command on the right. Hence, the archive dumped on the standard output by the command on the left is fed as the standard input for the command on the right. In turn, the command on the right of | changes to the DEST_DIR and initiates a tar in that directory. Contrary to the first tar, this one extracts the content found on its standard input into the directory from which it is executed.

The net effect of this command is that the files and directories found in the SRC_DIR directory are copied as-is to the DEST_DIR directory. The content of DEST_DIR is thereafter identical to that of SRC_DIR.
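If you use this command regularly, you may prefer to wrap it in a shell function. Here is a minimal sketch; the copy_tree name is hypothetical. It differs from the generalized form in two ways, both my own choices: it creates the destination directory if needed, and it uses tar's p option to preserve permissions instead of the v option, so that nothing is printed.

```sh
#!/bin/sh
# Sketch: copy a directory tree with cd and tar instead of "cp -a".
# copy_tree is a hypothetical name.

# Usage: copy_tree SRC_DIR DEST_DIR
copy_tree() {
    src=$1 dest=$2

    mkdir -p "$dest" || return 1

    # Left side archives SRC_DIR to stdout; right side extracts the
    # archive from stdin into DEST_DIR, keeping permissions (p)
    (cd "$src" && tar cf - .) | (cd "$dest" && tar xpf -)
}
```

For example, copy_tree rootfs tmp/rootfs-copy duplicates rootfs, symbolic links included, under tmp/rootfs-copy.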

Though this command is of little use if your system already has GNU cp, you may find it helpful on systems that don't have GNU cp. If you are using a standard Linux workstation or server distribution, cp -a remains the better option for copying entire directory trees.