<title>Features and Benefits</title>
<para>
+<indexterm><primary>backup</primary></indexterm>
+<indexterm><primary>UNIX system files</primary></indexterm>
+<indexterm><primary>system tools</primary></indexterm>
+<indexterm><primary>Samba mailing lists</primary></indexterm>
The Samba project is over 10 years old. During the early history
of Samba, UNIX administrators were its key implementors. UNIX administrators
use UNIX system tools to back up UNIX system files. Over the past
<title>Discussion of Backup Solutions</title>
<para>
+<indexterm><primary>Meccano set</primary></indexterm>
+<indexterm><primary>training course</primary></indexterm>
During discussions at a Microsoft Windows training course, one of
the pro-UNIX delegates stunned the class when he pointed out that Windows
NT4 is limiting compared with UNIX. He likened UNIX to a Meccano set
</para>
<para>
+<indexterm><primary>networking advocates</primary></indexterm>
+<indexterm><primary>clear purpose preferred</primary></indexterm>
One of the Windows networking advocates retorted that if she wanted a
Meccano set, she would buy one. She made it clear that a complex single
tool that does more than is needed but does it with a clear purpose and
</para>
<para>
+<indexterm><primary>due diligence</primary></indexterm>
+<indexterm><primary>research</primary></indexterm>
+<indexterm><primary>backup solution</primary></indexterm>
Please note that all information here is provided as is and without recommendation
of fitness or suitability. The network administrator is strongly encouraged to
perform due diligence research before implementing any backup solution, whether free
<para>
<indexterm><primary>BackupPC</primary></indexterm>
+<indexterm><primary>rsync</primary></indexterm>
+<indexterm><primary>rsyncd</primary></indexterm>
BackupPC version 2.0.0 has been released on <ulink url="http://backuppc.sourceforge.net">SourceForge</ulink>.
New features include support for <command>rsync/rsyncd</command> and internationalization of the CGI interface
(including English, French, Spanish, and German).
</para>
<para>
+<indexterm><primary>BackupPC</primary></indexterm>
+<indexterm><primary>laptops</primary></indexterm>
+<indexterm><primary>SMB</primary></indexterm>
+<indexterm><primary>smbclient</primary></indexterm>
+<indexterm><primary>tar</primary></indexterm>
+<indexterm><primary>rsh</primary></indexterm>
+<indexterm><primary>ssh</primary></indexterm>
+<indexterm><primary>rsync</primary></indexterm>
BackupPC is a high-performance Perl-based package for backing up Linux,
UNIX, and Windows PCs and laptops to a server's disk. BackupPC is highly
configurable and easy to install and maintain. SMB (via smbclient),
</para>
<para>
+<indexterm><primary>RAID</primary></indexterm>
+<indexterm><primary>local disk</primary></indexterm>
+<indexterm><primary>network storage</primary></indexterm>
Given the ever-decreasing cost of disks and RAID systems, it is now
practical and cost-effective to back up a large number of machines onto
a server's local disk or network storage. This is what BackupPC does.
</para>
<para>
+<indexterm><primary>GNU GPL</primary></indexterm>
BackupPC is free software distributed under the GNU GPL.
BackupPC runs on Linux/UNIX/freenix servers and has been tested
on Linux, UNIX, Windows 9x/Me, Windows 98, Windows 200x, Windows XP, and Mac OS X clients.
<sect2>
<title>Rsync</title>
- <para><command>rsync</command> is a flexible program for efficiently copying files or
+ <para>
+<indexterm><primary>rsync</primary></indexterm>
+<indexterm><primary>ftp</primary></indexterm>
+<indexterm><primary>http</primary></indexterm>
+<indexterm><primary>scp</primary></indexterm>
+<indexterm><primary>rcp</primary></indexterm>
+<indexterm><primary>checksum-search</primary></indexterm>
+ <command>rsync</command> is a flexible program for efficiently copying files or
directory trees.</para>
<para><command>rsync</command> has many options to select which files will be copied
and how they are to be transferred. It may be used as an
alternative to <command>ftp</command>, http, <command>scp</command>, or <command>rcp</command>.</para>
- <para>The rsync remote-update protocol allows rsync to transfer just
+ <para>
+<indexterm><primary>remote-update protocol</primary></indexterm>
+<indexterm><primary>transfer differences</primary></indexterm>
+<indexterm><primary>differences</primary></indexterm>
+ The rsync remote-update protocol allows rsync to transfer just
the differences between two sets of files across the network link,
using an efficient checksum-search algorithm described in the
technical report that accompanies the rsync package.</para>
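<para>
The checksum-search idea can be illustrated with a short Python sketch. It is a simplified
model written for this discussion, not code taken from the rsync package: the function names,
the block length, the modulus, and the use of SHA-256 as the strong check are all assumptions.
A cheap weak checksum is rolled one byte at a time across the new data, and a strong hash is
computed only where the weak checksum matches a block of the old data.
</para>
<programlisting>
```python
import hashlib

M = 65536  # modulus for both checksum halves (chosen for this sketch)


def weak_checksum(block):
    """Return the weak (a, b) checksum pair for a block of bytes."""
    a = sum(block) % M
    # The first byte is weighted by the block length, the last byte by 1.
    b = sum((len(block) - i) * byte for i, byte in enumerate(block)) % M
    return a, b


def roll(a, b, out_byte, in_byte, block_len):
    """Slide the window one byte without re-reading the whole block."""
    a = (a - out_byte + in_byte) % M
    b = (b - block_len * out_byte + a) % M
    return a, b


def block_signatures(old, block_len):
    """Index every full block of the old data by weak, then strong, checksum."""
    sigs = {}
    for index in range(len(old) // block_len):
        block = old[index * block_len:(index + 1) * block_len]
        strong = hashlib.sha256(block).digest()  # stand-in strong hash
        sigs.setdefault(weak_checksum(block), {})[strong] = index
    return sigs


def find_matches(old, new, block_len=4):
    """Return (offset_in_new, block_index_in_old) for each old block found in new."""
    sigs = block_signatures(old, block_len)
    matches = []
    if block_len > len(new):
        return matches
    a, b = weak_checksum(new[:block_len])
    offset = 0
    while True:
        candidates = sigs.get((a, b))
        if candidates:
            # Weak hit: confirm with the strong hash before trusting it.
            strong = hashlib.sha256(new[offset:offset + block_len]).digest()
            if strong in candidates:
                matches.append((offset, candidates[strong]))
        if offset + block_len >= len(new):
            break
        a, b = roll(a, b, new[offset], new[offset + block_len], block_len)
        offset += 1
    return matches
```
</programlisting>
<para>
Only the data between the matched blocks would need to cross the network link; the matched
blocks can be referenced by index. Unlike this sketch, a real implementation would skip ahead
by a whole block after each match rather than testing every offset.
</para>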
<para>
<indexterm><primary>Amanda</primary></indexterm>
+<indexterm><primary>native dump</primary></indexterm>
+<indexterm><primary>GNU tar</primary></indexterm>
Amanda, the Advanced Maryland Automatic Network Disk Archiver, is a backup system that
allows the administrator of a LAN to set up a single master backup server to back up
multiple hosts to a single large capacity tape drive. Amanda uses native dump and/or
<title>Features and Benefits</title>
<para>
+<indexterm><primary>availability</primary></indexterm>
+<indexterm><primary>intolerance</primary></indexterm>
+<indexterm><primary>vital task</primary></indexterm>
Network administrators are often concerned about the availability of file and print
services. Network users tend to be intolerant of failures in the services they depend
on to perform vital tasks.
<blockquote>
<para>
+<indexterm><primary>fail</primary></indexterm>
+<indexterm><primary>managed by humans</primary></indexterm>
+<indexterm><primary>economically wise</primary></indexterm>
+<indexterm><primary>anticipate failure</primary></indexterm>
All humans fail; in both great and small ways we fail continually. Machines fail too.
Computers are machines that are managed by humans; the fallout from failure
can be spectacular. Your responsibility is to deal with failure, to anticipate it
</para>
<para>
+<indexterm><primary>high availability</primary></indexterm>
+<indexterm><primary>CIFS/SMB</primary></indexterm>
+<indexterm><primary>state of knowledge</primary></indexterm>
Parenthetically, in the following discussion there are seeds of information on how to
provision a network infrastructure against failure. Our purpose here is not to provide
a lengthy dissertation on the subject of high availability. Additionally, we have made
<title>Technical Discussion</title>
<para>
+<indexterm><primary>SambaXP conference</primary></indexterm>
+<indexterm><primary>Germany</primary></indexterm>
+<indexterm><primary>inspired structure</primary></indexterm>
The following summary was part of a presentation by Jeremy Allison at the SambaXP 2003
conference that was held at Goettingen, Germany, in April 2003. Material has been added
from other sources, but it was Jeremy who inspired the structure that follows.
<title>The Ultimate Goal</title>
<para>
+<indexterm><primary>clustering technologies</primary></indexterm>
+<indexterm><primary>affordable power</primary></indexterm>
+<indexterm><primary>unstoppable services</primary></indexterm>
All clustering technologies aim to achieve one or more of the following:
</para>
<para>
A clustered file server ideally has the following properties:
+<indexterm><primary>clustered file server</primary></indexterm>
+<indexterm><primary>connect transparently</primary></indexterm>
+<indexterm><primary>transparently reconnected</primary></indexterm>
+<indexterm><primary>distributed file system</primary></indexterm>
</para>
<itemizedlist>
<itemizedlist>
<listitem>
<para>
+<indexterm><primary>state information</primary></indexterm>
All TCP/IP connections are dependent on state information.
</para>
<para>
+<indexterm><primary>TCP failover</primary></indexterm>
The TCP connection involves a packet sequence number. This
sequence number would need to be dynamically updated on all
machines in the cluster to effect seamless TCP failover.
</listitem>
<listitem>
<para>
+<indexterm><primary>CIFS/SMB</primary></indexterm>
+<indexterm><primary>TCP</primary></indexterm>
CIFS/SMB (the Windows networking protocols) uses TCP connections.
</para>
<para>
All current SMB clusters are failover solutions
&smbmdash; they rely on the clients to reconnect. They provide server
failover, but clients can lose information due to a server failure.
+<indexterm><primary>server failure</primary></indexterm>
</para></listitem>
</itemizedlist>
</para>
<para>
Servers keep state information about client connections.
<itemizedlist>
+<indexterm><primary>state</primary></indexterm>
<listitem><para>CIFS/SMB involves a lot of state.</para></listitem>
<listitem><para>Every file open must be compared with other open files
to check share modes.</para></listitem>
<title>The Front-End Challenge</title>
<para>
+<indexterm><primary>cluster servers</primary></indexterm>
+<indexterm><primary>single server</primary></indexterm>
+<indexterm><primary>TCP data streams</primary></indexterm>
+<indexterm><primary>front-end virtual server</primary></indexterm>
+<indexterm><primary>virtual server</primary></indexterm>
+<indexterm><primary>de-multiplex</primary></indexterm>
+<indexterm><primary>SMB</primary></indexterm>
To make it possible for a cluster of file servers to appear as a single server that has one
name and one IP address, the incoming TCP data streams from clients must be processed by the
front-end virtual server. This server must de-multiplex the incoming packets at the SMB protocol
</para>
<para>
+<indexterm><primary>IPC$ connections</primary></indexterm>
+<indexterm><primary>RPC calls</primary></indexterm>
One could split off all IPC$ connections and RPC calls to a single server that handles printing
and user lookup requirements. RPC printing handles are shared between different IPC$ sessions &smbmdash; it is
hard to split this across clustered servers!
<title>Demultiplexing SMB Requests</title>
<para>
+<indexterm><primary>SMB requests</primary></indexterm>
+<indexterm><primary>SMB state information</primary></indexterm>
+<indexterm><primary>front-end virtual server</primary></indexterm>
+<indexterm><primary>complicated problem</primary></indexterm>
De-multiplexing of SMB requests requires knowledge of SMB state information,
all of which must be held by the front-end <emphasis>virtual</emphasis> server.
This is a perplexing and complicated problem to solve.
</para>
<para>
+<indexterm><primary>vuid</primary></indexterm>
+<indexterm><primary>tid</primary></indexterm>
+<indexterm><primary>fid</primary></indexterm>
Windows XP and later have changed semantics so state information (vuid, tid, fid)
must match for a successful operation. This makes things simpler than before and is a
positive step forward.
</para>
<para>
+<indexterm><primary>SMB requests</primary></indexterm>
+<indexterm><primary>Terminal Server</primary></indexterm>
SMB requests are sent by vuid to their associated server. No code exists today to
effect this solution. This problem is conceptually similar to the problem of
correctly handling multiple requests from Windows 2000
</para>
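<para>
Since no code exists for this, the following Python sketch is purely hypothetical. It shows
only the bookkeeping that vuid-based routing would require of a front-end server: remember
which backend owns each vuid, and send every later request carrying that vuid to the same
backend. The class name, the round-robin assignment, and the backend names are assumptions
made for illustration; how vuids are actually allocated and how tid and fid state is tracked
are ignored.
</para>
<programlisting>
```python
import itertools


class VuidRouter:
    """Hypothetical front-end table mapping a session vuid to a backend server."""

    def __init__(self, backends):
        # Round-robin assignment is an arbitrary choice for this sketch.
        self._next_backend = itertools.cycle(backends)
        self._by_vuid = {}

    def session_setup(self, vuid):
        """Bind a new session to a backend and remember the choice."""
        backend = next(self._next_backend)
        self._by_vuid[vuid] = backend
        return backend

    def route(self, vuid):
        """Return the backend that holds the state for this vuid."""
        try:
            return self._by_vuid[vuid]
        except KeyError:
            raise LookupError(f"no session for vuid {vuid}") from None

    def logoff(self, vuid):
        """Forget a session once the client has logged off."""
        self._by_vuid.pop(vuid, None)


router = VuidRouter(["smb1", "smb2"])
router.session_setup(100)
router.session_setup(101)
print(router.route(100), router.route(101))
```
</programlisting>
<para>
Even in this toy form, the table is state that the front-end server must hold and must not
lose, which is the heart of the difficulty described above.
</para>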
<para>
+<indexterm><primary>de-multiplexing</primary></indexterm>
One possibility is to start by exposing the server pool to clients directly.
This could eliminate the de-multiplexing step.
</para>
</para>
<para>
+<indexterm><primary>backend</primary></indexterm>
+<indexterm><primary>SMB semantics</primary></indexterm>
+<indexterm><primary>share modes</primary></indexterm>
+<indexterm><primary>locking</primary></indexterm>
+<indexterm><primary>oplock</primary></indexterm>
+<indexterm><primary>distributed file systems</primary></indexterm>
Many distributed file systems could be adopted as the backend for our cluster, so long as
SMB semantics are kept in mind (share modes, locking, and oplock issues in particular).
Common free distributed file systems include:
</itemizedlist>
<para>
+<indexterm><primary>server pool</primary></indexterm>
The server pool (cluster) can use any distributed file system backend if all SMB
semantics are performed within this pool.
</para>
<title>Restrictive Constraints on Distributed File Systems</title>
<para>
+<indexterm><primary>SMB services</primary></indexterm>
+<indexterm><primary>oplock handling</primary></indexterm>
+<indexterm><primary>server pool</primary></indexterm>
+<indexterm><primary>backend file system pool</primary></indexterm>
Where a clustered server provides purely SMB services, oplock handling
may be done within the server pool without imposing a need for this to
be passed to the backend file system pool.
</para>
<para>
+<indexterm><primary>NFS</primary></indexterm>
+<indexterm><primary>interoperability</primary></indexterm>
On the other hand, where the server pool also provides NFS or other file services,
it will be essential that the implementation be oplock-aware so it can
interoperate with SMB services. This is a significant challenge today. A failure
<title>Server Pool Communications</title>
<para>
+<indexterm><primary>POSIX semantics</primary></indexterm>
+<indexterm><primary>SMB</primary></indexterm>
+<indexterm><primary>POSIX locks</primary></indexterm>
+<indexterm><primary>SMB locks</primary></indexterm>
Most backend file systems support POSIX file semantics. This makes it difficult
to push SMB semantics back into the file system. POSIX locks have different properties
and semantics from SMB locks.
</para>
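<para>
One such difference can be observed directly on a POSIX system. POSIX record locks belong to
the process rather than to the open file handle, so a second handle opened by the same process
can lock a byte range that the first handle has already locked. An SMB client would expect the
second request to conflict. The Python sketch below is a minimal demonstration, assuming a
Linux or other POSIX host where the <command>fcntl</command> module is available; the function
name is an invention of this example.
</para>
<programlisting>
```python
import fcntl
import tempfile


def second_handle_can_lock():
    """Return True if a second handle in this process can lock an already locked range."""
    with tempfile.NamedTemporaryFile() as first:
        first.write(b"0123456789")
        first.flush()
        # Exclusive, non-blocking lock on the first 10 bytes via the first handle.
        fcntl.lockf(first, fcntl.LOCK_EX | fcntl.LOCK_NB, 10)
        with open(first.name, "rb+") as second:
            try:
                # Same range, different handle, same process.
                fcntl.lockf(second, fcntl.LOCK_EX | fcntl.LOCK_NB, 10)
            except OSError:
                return False
            return True


print(second_handle_can_lock())
```
</programlisting>
<para>
POSIX locks are also advisory, and closing any handle on the file releases all of the locks the
process holds on it. A server that presents SMB semantics therefore has to track lock ownership
itself rather than rely on the file system alone.
</para>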
<para>
+<indexterm><primary>smbd</primary></indexterm>
+<indexterm><primary>tdb</primary></indexterm>
+<indexterm><primary>Clustered smbds</primary></indexterm>
All <command>smbd</command> processes in the server pool must communicate
very quickly. The current <parameter>tdb</parameter> file structure that Samba
uses is not suitable for this across a network, so clustered <command>smbd</command>s must use something else.
</para>
<itemizedlist>
+<indexterm><primary>Myrinet</primary></indexterm>
+<indexterm><primary>scalable coherent interface</primary><see>SCI</see></indexterm>
<listitem><para>
Proprietary shared memory bus (example: Myrinet or SCI [scalable coherent interface]).
These are high-cost items.
</para></listitem>
<listitem><para>
+<indexterm><primary>failure semantics</primary></indexterm>
+<indexterm><primary>oplock messages</primary></indexterm>
Failure semantics need to be defined. Samba behaves the same way as Windows.
When oplock messages fail, a file open request is allowed, but this is
potentially dangerous in a clustered environment. So how should interserver
<title>A Simple Solution</title>
<para>
+<indexterm><primary>failover servers</primary></indexterm>
+<indexterm><primary>exported file system</primary></indexterm>
+<indexterm><primary>distributed locking protocol</primary></indexterm>
Allowing failover servers to handle different functions within the exported file system
removes the problem of requiring a distributed locking protocol.
</para>
<para>
+<indexterm><primary>high-speed server interconnect</primary></indexterm>
+<indexterm><primary>complex file name space</primary></indexterm>
If only one server is active in a pair, the need for high-speed server interconnect is avoided.
This allows the use of existing high-availability solutions, instead of inventing a new one.
This simpler solution comes at a price &smbmdash; the need to manage a more
</para>
<para>
+<indexterm><primary>virtual server</primary></indexterm>
The <emphasis>virtual server</emphasis> is still needed to redirect requests to backend
servers. Backend file space integrity is the responsibility of the administrator.
</para>
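<para>
The arrangement can be modeled in a few lines of Python. This is an illustrative sketch under
stated assumptions, not an existing Samba facility: the class names, share names, and host
names are hypothetical. Each part of the exported name space is owned by one failover pair,
only one member of a pair is active at a time, and the virtual server simply looks up the
active member.
</para>
<programlisting>
```python
class FailoverPair:
    """Two servers of which exactly one is active at any time."""

    def __init__(self, primary, standby):
        self.active = primary
        self.standby = standby

    def fail_over(self):
        """Promote the standby server and demote the failed one."""
        self.active, self.standby = self.standby, self.active


class VirtualServer:
    """Redirects each exported share to the active server of its pair."""

    def __init__(self):
        self._exports = {}

    def export(self, share, pair):
        # Keeping shares disjoint is the administrator's job in this model.
        if share in self._exports:
            raise ValueError(f"share {share!r} is already exported")
        self._exports[share] = pair

    def redirect(self, share):
        return self._exports[share].active


accounts = FailoverPair("fs1a", "fs1b")
projects = FailoverPair("fs2a", "fs2b")
front = VirtualServer()
front.export("accounts", accounts)
front.export("projects", projects)

accounts.fail_over()
print(front.redirect("accounts"), front.redirect("projects"))
```
</programlisting>
<para>
Because no two active servers ever serve the same share, no locking traffic passes between
them. The cost is the mapping table itself, which the administrator must keep correct.
</para>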
<title>High-Availability Server Products</title>
<para>
+<indexterm><primary>resource failover</primary></indexterm>
+<indexterm><primary>high-availability services</primary></indexterm>
+<indexterm><primary>dedicated heartbeat</primary></indexterm>
+<indexterm><primary>LAN</primary></indexterm>
+<indexterm><primary>failover process</primary></indexterm>
Failover servers must communicate in order to handle resource failover. This is essential
for high-availability services. The use of a dedicated heartbeat is a common technique to
introduce some intelligence into the failover process. This is often done over a dedicated
<para>
<indexterm><primary>SCSI</primary></indexterm>
+<indexterm><primary>Red Hat Cluster Manager</primary></indexterm>
+<indexterm><primary>Microsoft Wolfpack</primary></indexterm>
+<indexterm><primary>Fiber Channel</primary></indexterm>
+<indexterm><primary>failover communication</primary></indexterm>
Many failover solutions (like Red Hat Cluster Manager and Microsoft Wolfpack)
can use a shared SCSI or Fiber Channel disk storage array for failover communication.
Information regarding Red Hat high availability solutions for Samba may be obtained from
</para>
<para>
+<indexterm><primary>Linux High Availability project</primary></indexterm>
The Linux High Availability project is worth consulting if you want
to build a highly available Samba file server solution. Please consult the home page at
<ulink url="http://www.linux-ha.org/">www.linux-ha.org/</ulink>.
</para>
<para>
+<indexterm><primary>backend failures</primary></indexterm>
+<indexterm><primary>continuity of service</primary></indexterm>
Front-end server complexity remains a challenge for high availability because it must deal
gracefully with backend failures, while at the same time providing continuity of service
to all network clients.