Merge branch 'master' in devel-3.0
author NeilBrown <neilb@suse.de>
Tue, 10 Mar 2009 05:47:02 +0000 (16:47 +1100)
committer NeilBrown <neilb@suse.de>
Tue, 10 Mar 2009 05:47:02 +0000 (16:47 +1100)
62 files changed:
ANNOUNCE-3.0-devel1 [new file with mode: 0644]
ANNOUNCE-3.0-devel2 [new file with mode: 0644]
Assemble.c
Build.c
Create.c
Detail.c
Examine.c
Grow.c
Incremental.c
Kill.c
Makefile
Manage.c
Monitor.c
Query.c
ReadMe.c
TODO
bitmap.c
config.c
crc32.c [new file with mode: 0644]
crc32.h [new file with mode: 0644]
inventory
kernel-patch-2.6.25 [new file with mode: 0644]
kernel-patch-2.6.27 [new file with mode: 0644]
managemon.c [new file with mode: 0644]
mapfile.c
md.4
mdadm.8
mdadm.c
mdadm.conf.5
mdadm.h
mdadm.spec
mdassemble.8
mdassemble.c
mdmon.8 [new file with mode: 0644]
mdmon.c [new file with mode: 0644]
mdmon.h [new file with mode: 0644]
mdopen.c
mdstat.c
monitor.c [new file with mode: 0644]
msg.c [new file with mode: 0644]
msg.h [new file with mode: 0644]
platform-intel.c [new file with mode: 0644]
platform-intel.h [new file with mode: 0644]
probe_roms.c [new file with mode: 0644]
probe_roms.h [new file with mode: 0644]
restripe.c
sg_io.c [new file with mode: 0644]
super-ddf.c [new file with mode: 0644]
super-intel.c [new file with mode: 0644]
super0.c
super1.c
sysfs.c
test
tests/03r0assem
tests/03r5assemV1
tests/06name
tests/08imsm-overlap [new file with mode: 0644]
tests/09imsm-create-fail-rebuild [new file with mode: 0644]
tests/env-08imsm-overlap [new file with mode: 0644]
tests/env-09imsm-create-fail-rebuild [new file with mode: 0644]
udev-md-raid.rules [new file with mode: 0644]
util.c

diff --git a/ANNOUNCE-3.0-devel1 b/ANNOUNCE-3.0-devel1
new file mode 100644
index 0000000..89ed2e3
--- /dev/null
@@ -0,0 +1,84 @@
+Subject:  ANNOUNCE: mdadm 3.0-devel1 - A tool for managing Soft RAID under Linux
+
+I am pleased to announce the availability of
+   mdadm version 3.0-devel1
+
+It is available at the usual places:
+   countrycode=xx.
+   http://www.${countrycode}kernel.org/pub/linux/utils/raid/mdadm/
+and via git at
+   git://neil.brown.name/mdadm
+   http://neil.brown.name/git?p=mdadm
+
+Note that this is a "devel" release.  It is not intended for
+production use yet, but rather for testing and ongoing development.
+
+The significant change which justifies the new major version number is
+that mdadm can now handle metadata updates entirely in userspace.
+This allows mdadm to support metadata formats that the kernel knows
+nothing about.
+
+Currently two such metadata formats are supported:
+  - DDF  - The SNIA standard format
+  - Intel Matrix - The metadata used by recent Intel ICH controllers.
+
+The manual pages have not yet been updated, but here is a brief outline.
+
+Externally managed metadata introduces the concept of a 'container'.
+A container is a collection of (normally) physical devices which have
+a common set of metadata.  A container is assembled as an md array, but
+is left 'inactive'.
+
+A container can contain one or more data arrays.  These are composed from
+slices (partitions?) of various devices in the container.
+
+For example, a 5-device DDF set can contain a RAID1 using the first
+half of two devices, a RAID0 using the first half of the remaining 3 devices,
+and a RAID5 over the second half of all 5 devices.
+
+A container can be created with
+
+   mdadm --create /dev/md0 -e ddf -n5 /dev/sd[abcde]
+
+or "-e imsm" to use the Intel Matrix Storage Manager.
+
+An array can be created within a container either by giving the
+container name as the only member:
+
+   mdadm -C /dev/md1 --level raid1 -n 2 /dev/md0
+
+or by listing the component devices
+
+   mdadm -C /dev/md2 --level raid0 -n 3 /dev/sd[cde]
+
+To assemble a container, it is easiest just to pass each device in turn to
+mdadm -I
+
+  for i in /dev/sd[abcde]
+  do mdadm -I $i
+  done
+
+This will assemble the container and the components.
+
+Alternately the container can be assembled explicitly
+
+   mdadm -A /dev/md0 /dev/sd[abcde]
+
+Then the components can all be assembled with
+
+   mdadm -I /dev/md0
+
+For each container, mdadm will start a program called "mdmon" which will
+monitor the array and effect any metadata updates needed.  The array is
+initially assembled readonly. It is up to "mdmon" to mark the metadata 
+as 'dirty' and switch the array to 'read-write'.
+
+The version 0.90 and 1.x metadata formats supported by previous
+versions of mdadm are still supported and the kernel still performs
+the same updates it used to.  The new 'mdmon' approach is only used for
+newly introduced metadata types.
+
+Any testing and feedback will be greatly appreciated.
+
+NeilBrown  18th September 2008
+
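The 5-device DDF example in the announcement above can be checked with a little arithmetic. The following is only a sketch: it assumes all devices are the same size, ignores any space taken by metadata, and `array_capacity` is a hypothetical helper, not part of mdadm.

```c
/* Usable capacity of one data array built from equal-sized slices.
 * level: RAID level (0, 1 or 5); n: number of slices;
 * slice: size of each slice, in any unit.
 * Hypothetical helper for illustration only. */
unsigned long long array_capacity(int level, int n, unsigned long long slice)
{
	switch (level) {
	case 0:		/* striping: every slice holds data */
		return (unsigned long long)n * slice;
	case 1:		/* mirroring: one slice worth of data */
		return slice;
	case 5:		/* one slice worth of parity */
		return n > 1 ? (unsigned long long)(n - 1) * slice : 0;
	default:	/* other levels are not covered by this sketch */
		return 0;
	}
}
```

With five devices of 1000 units each, every half is 500 units, so the RAID1 on two halves gives 500, the RAID0 on three halves gives 1500, and the RAID5 over five halves gives 2000.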
diff --git a/ANNOUNCE-3.0-devel2 b/ANNOUNCE-3.0-devel2
new file mode 100644
index 0000000..0f2924c
--- /dev/null
@@ -0,0 +1,98 @@
+Subject:  ANNOUNCE: mdadm 3.0-devel2 - A tool for managing Soft RAID under Linux
+
+I am pleased to announce the availability of
+   mdadm version 3.0-devel2
+
+It is available at the usual places:
+   countrycode=xx.
+   http://www.${countrycode}kernel.org/pub/linux/utils/raid/mdadm/
+and via git at
+   git://neil.brown.name/mdadm
+   http://neil.brown.name/git?p=mdadm
+
+Note that this is a "devel" release.  It should be used with
+caution, though it is believed to be close to release-candidate stage.
+
+The significant change which justifies the new major version number is
+that mdadm can now handle metadata updates entirely in userspace.
+This allows mdadm to support metadata formats that the kernel knows
+nothing about.
+
+Currently two such metadata formats are supported:
+  - DDF  - The SNIA standard format
+  - Intel Matrix - The metadata used by recent Intel ICH controllers.
+
+Also the approach to device names has changed significantly.
+
+If udev is installed on the system, mdadm will not create any devices
+in /dev.  Rather it allows udev to manage those devices.  For this to work
+as expected, the included udev rules file should be installed.
+
+If udev is not installed, mdadm will still create devices and symlinks
+as required, and will also remove them when the array is stopped.
+
+mdadm now requires all devices which do not have a standard name (mdX
+or md_dX) to live in the directory /dev/md/.  Names in this directory
+will always be created as symlinks back to the standard name in /dev.
+
+The man pages contain some information about the new externally managed
+metadata.  However see below for a more condensed overview.
+
+Externally managed metadata introduces the concept of a 'container'.
+A container is a collection of (normally) physical devices which have
+a common set of metadata.  A container is assembled as an md array, but
+is left 'inactive'.
+
+A container can contain one or more data arrays.  These are composed from
+slices (partitions?) of various devices in the container.
+
+For example, a 5-device DDF set can contain a RAID1 using the first
+half of two devices, a RAID0 using the first half of the remaining 3 devices,
+and a RAID5 over the second half of all 5 devices.
+
+A container can be created with
+
+   mdadm --create /dev/md0 -e ddf -n5 /dev/sd[abcde]
+
+or "-e imsm" to use the Intel Matrix Storage Manager.
+
+An array can be created within a container either by giving the
+container name as the only member:
+
+   mdadm -C /dev/md1 --level raid1 -n 2 /dev/md0
+
+or by listing the component devices
+
+   mdadm -C /dev/md2 --level raid0 -n 3 /dev/sd[cde]
+
+To assemble a container, it is easiest just to pass each device in turn to 
+mdadm -I
+
+  for i in /dev/sd[abcde]
+  do mdadm -I $i
+  done
+
+This will assemble the container and the components.
+
+Alternately the container can be assembled explicitly
+
+   mdadm -A /dev/md0 /dev/sd[abcde]
+
+Then the components can all be assembled with
+
+   mdadm -I /dev/md0
+
+For each container, mdadm will start a program called "mdmon" which will
+monitor the array and effect any metadata updates needed.  The array is
+initially assembled readonly. It is up to "mdmon" to mark the metadata 
+as 'dirty' and switch the array to 'read-write'.
+
+The version 0.90 and 1.x metadata formats supported by previous
+versions of mdadm are still supported and the kernel still performs
+the same updates it used to.  The new 'mdmon' approach is only used for
+newly introduced metadata types.
+
+Any testing and feedback will be greatly appreciated.
+
+NeilBrown  5th November 2008
+
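The naming rule in the second announcement (standard names are mdX or md_dX, everything else lives in /dev/md/) could be tested roughly as follows. This is a sketch under assumptions: `is_standard_md_name` is a hypothetical helper, it takes the name without the /dev/ prefix, and it treats X as one or more decimal digits. It is not the check mdadm itself performs.

```c
#include <ctype.h>
#include <string.h>

/* Return 1 if 'name' looks like a standard md device name, i.e.
 * "md" or "md_d" followed by one or more digits; otherwise 0.
 * Hypothetical helper for illustration only. */
int is_standard_md_name(const char *name)
{
	const char *p;

	if (strncmp(name, "md_d", 4) == 0)
		p = name + 4;		/* partitionable form: md_dX */
	else if (strncmp(name, "md", 2) == 0)
		p = name + 2;		/* plain form: mdX */
	else
		return 0;
	if (*p == '\0')
		return 0;		/* a number is required */
	for (; *p; p++)
		if (!isdigit((unsigned char)*p))
			return 0;
	return 1;
}
```

Under this reading, "md0" and "md_d1" are standard names, while a name such as "home" would be expected under /dev/md/ as a symlink back to the standard device.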
diff --git a/Assemble.c b/Assemble.c
index ab8faed..99f3599 100644
@@ -50,7 +50,32 @@ static int name_matches(char *found, char *required, char *homehost)
        return 0;
 }
 
-int Assemble(struct supertype *st, char *mddev, int mdfd,
+static int is_member_busy(char *metadata_version)
+{
+       /* check if the given member array is active */
+       struct mdstat_ent *mdstat = mdstat_read(1, 0);
+       struct mdstat_ent *ent;
+       int busy = 0;
+
+       for (ent = mdstat; ent; ent = ent->next) {
+               if (ent->metadata_version == NULL)
+                       continue;
+               if (strncmp(ent->metadata_version, "external:", 9) != 0)
+                       continue;
+               if (!is_subarray(&ent->metadata_version[9]))
+                       continue;
+               /* Skip first char - it can be '/' or '-' */
+               if (strcmp(&ent->metadata_version[10], metadata_version+1) == 0) {
+                       busy = 1;
+                       break;
+               }
+       }
+       free_mdstat(mdstat);
+
+       return busy;
+}
+
+int Assemble(struct supertype *st, char *mddev,
             mddev_ident_t ident,
             mddev_dev_t devlist, char *backup_file,
             int readonly, int runstop,
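The string comparison at the heart of is_member_busy() in the hunk above can be tried in isolation. This is a minimal sketch, not the function from the patch: `same_member` is a hypothetical name, the is_subarray() check is left out, and the example strings in the note below are assumed formats used only for illustration.

```c
#include <string.h>

/* 'active' is the metadata_version string of a running array and
 * 'wanted' is the text version of the member we want to assemble.
 * After the "external:" prefix, the first character may be '/' or '-',
 * so it is skipped on both sides before comparing.
 * Hypothetical helper for illustration only. */
int same_member(const char *active, const char *wanted)
{
	if (strncmp(active, "external:", 9) != 0)
		return 0;	/* not externally managed metadata */
	if (active[9] == '\0' || wanted[0] == '\0')
		return 0;	/* nothing left to compare */
	return strcmp(active + 10, wanted + 1) == 0;
}
```

For instance, "external:/md0/0" and "external:-md0/0" would both match a wanted member of "/md0/0", while "external:/md0/1" would not.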
@@ -111,10 +136,13 @@ int Assemble(struct supertype *st, char *mddev, int mdfd,
         *    START_ARRAY
         *
         */
-       int clean = 0;
-       int must_close = 0;
+       int mdfd;
+       int clean;
+       int auto_assem = (mddev == NULL && !ident->uuid_set &&
+                         ident->super_minor == UnSet && ident->name[0] == 0
+                         && ident->container == NULL && ident->member == NULL);
        int old_linux = 0;
-       int vers = 0; /* Keep gcc quite - it really is initialised */
+       int vers = vers; /* Keep gcc quiet - it really is initialised */
        struct {
                char *devname;
                int uptodate; /* set once we decide that this device is as
@@ -132,36 +160,23 @@ int Assemble(struct supertype *st, char *mddev, int mdfd,
        int chosen_drive;
        int change = 0;
        int inargv = 0;
+       int report_missmatch;
        int bitmap_done;
-       int start_partial_ok = (runstop >= 0) && (force || devlist==NULL || mdfd < 0);
+       int start_partial_ok = (runstop >= 0) && 
+               (force || devlist==NULL || auto_assem);
        unsigned int num_devs;
        mddev_dev_t tmpdev;
        struct mdinfo info;
+       struct mdinfo *content = NULL;
        char *avail;
        int nextspare = 0;
+       char *name = NULL;
+       int trustworthy;
+       char chosen_name[1024];
 
        if (get_linux_version() < 2004000)
                old_linux = 1;
 
-       if (mdfd >= 0) {
-               vers = md_get_version(mdfd);
-               if (vers <= 0) {
-                       fprintf(stderr, Name ": %s appears not to be an md device.\n", mddev);
-                       return 1;
-               }
-               if (vers < 9000) {
-                       fprintf(stderr, Name ": Assemble requires driver version 0.90.0 or later.\n"
-                               "    Upgrade your kernel or try --build\n");
-                       return 1;
-               }
-
-               if (ioctl(mdfd, GET_ARRAY_INFO, &info.array)>=0) {
-                       fprintf(stderr, Name ": device %s already active - cannot assemble it\n",
-                               mddev);
-                       return 1;
-               }
-               ioctl(mdfd, STOP_ARRAY, NULL); /* just incase it was started but has no content */
-       }
        /*
         * If any subdevs are listed, then any that don't
         * match ident are discarded.  Remainder must all match and
@@ -178,12 +193,18 @@ int Assemble(struct supertype *st, char *mddev, int mdfd,
                        mddev ? mddev : "further assembly");
                return 1;
        }
+
        if (devlist == NULL)
                devlist = conf_get_devs();
-       else if (mdfd >= 0)
+       else if (mddev)
                inargv = 1;
 
+       report_missmatch = ((inargv && verbose >= 0) || verbose > 0);
  try_again:
+       /* We come back here when doing auto-assembly and attempting some
+        * set of devices failed.  Those are now marked as ->used==2 and
+        * we ignore them and try again
+        */
 
        tmpdev = devlist; num_devs = 0;
        while (tmpdev) {
@@ -203,7 +224,7 @@ int Assemble(struct supertype *st, char *mddev, int mdfd,
 
        /* first walk the list of devices to find a consistent set
         * that match the criterea, if that is possible.
-        * We flag the one we like with 'used'.
+        * We flag the ones we like with 'used'.
         */
        for (tmpdev = devlist;
             tmpdev;
@@ -217,14 +238,14 @@ int Assemble(struct supertype *st, char *mddev, int mdfd,
 
                if (ident->devices &&
                    !match_oneof(ident->devices, devname)) {
-                       if ((inargv && verbose>=0) || verbose > 0)
+                       if (report_missmatch)
                                fprintf(stderr, Name ": %s is not one of %s\n", devname, ident->devices);
                        continue;
                }
 
                dfd = dev_open(devname, O_RDONLY|O_EXCL);
                if (dfd < 0) {
-                       if ((inargv && verbose >= 0) || verbose > 0)
+                       if (report_missmatch)
                                fprintf(stderr, Name ": cannot open device %s: %s\n",
                                        devname, strerror(errno));
                        tmpdev->used = 2;
@@ -238,72 +259,120 @@ int Assemble(struct supertype *st, char *mddev, int mdfd,
                                devname);
                        tmpdev->used = 2;
                } else if (!tst && (tst = guess_super(dfd)) == NULL) {
-                       if ((inargv && verbose >= 0) || verbose > 0)
+                       if (report_missmatch)
                                fprintf(stderr, Name ": no recogniseable superblock on %s\n",
                                        devname);
                        tmpdev->used = 2;
                } else if (tst->ss->load_super(tst,dfd, NULL)) {
-                       if ((inargv && verbose >= 0) || verbose > 0)
+                       if (report_missmatch)
                                fprintf( stderr, Name ": no RAID superblock on %s\n",
                                         devname);
                } else {
-                       tst->ss->getinfo_super(tst, &info);
+                       content = &info;
+                       memset(content, 0, sizeof(*content));
+                       tst->ss->getinfo_super(tst, content);
                }
                if (dfd >= 0) close(dfd);
 
+               if (tst && tst->sb && tst->ss->container_content
+                   && tst->loaded_container) {
+                       /* tmpdev is a container.  We need to be either
+                        * looking for a member, or auto-assembling
+                        */
+                       if (st) {
+                               /* already found some components, this cannot
+                                * be another one.
+                                */
+                               if (report_missmatch)
+                                       fprintf(stderr, Name ": %s is a container, but we are looking for components\n",
+                                               devname);
+                               goto loop;
+                       }
+
+                       if (ident->container) {
+                               if (ident->container[0] == '/' &&
+                                   !same_dev(ident->container, devname)) {
+                                       if (report_missmatch)
+                                               fprintf(stderr, Name ": %s is not the container required (%s)\n",
+                                                       devname, ident->container);
+                                       goto loop;
+                               }
+                               if (ident->container[0] != '/') {
+                                       /* we have a uuid */
+                                       int uuid[4];
+                                       if (!parse_uuid(ident->container, uuid) ||
+                                           !same_uuid(content->uuid, uuid, tst->ss->swapuuid)) {
+                                               if (report_missmatch)
+                                                       fprintf(stderr, Name ": %s has wrong UUID to be required container\n",
+                                                               devname);
+                                               goto loop;
+                                       }
+                               }
+                       }
+                       /* It is worth looking inside this container.
+                        */
+               next_member:
+                       if (tmpdev->content)
+                               content = tmpdev->content;
+                       else
+                               content = tst->ss->container_content(tst);
+
+                       tmpdev->content = content->next;
+                       if (tmpdev->content == NULL)
+                               tmpdev->used = 2;
+
+               } else if (ident->container || ident->member) {
+                       /* No chance of this matching if we don't have
+                        * a container */
+                       if (report_missmatch)
+                               fprintf(stderr, Name ": %s is not a container, and one is required.\n",
+                                       devname);
+                       goto loop;
+               }
+
                if (ident->uuid_set && (!update || strcmp(update, "uuid")!= 0) &&
                    (!tst || !tst->sb ||
-                    same_uuid(info.uuid, ident->uuid, tst->ss->swapuuid)==0)) {
-                       if ((inargv && verbose >= 0) || verbose > 0)
+                    same_uuid(content->uuid, ident->uuid, tst->ss->swapuuid)==0)) {
+                       if (report_missmatch)
                                fprintf(stderr, Name ": %s has wrong uuid.\n",
                                        devname);
                        goto loop;
                }
                if (ident->name[0] && (!update || strcmp(update, "name")!= 0) &&
                    (!tst || !tst->sb ||
-                    name_matches(info.name, ident->name, homehost)==0)) {
-                       if ((inargv && verbose >= 0) || verbose > 0)
+                    name_matches(content->name, ident->name, homehost)==0)) {
+                       if (report_missmatch)
                                fprintf(stderr, Name ": %s has wrong name.\n",
                                        devname);
                        goto loop;
                }
                if (ident->super_minor != UnSet &&
                    (!tst || !tst->sb ||
-                    ident->super_minor != info.array.md_minor)) {
-                       if ((inargv && verbose >= 0) || verbose > 0)
+                    ident->super_minor != content->array.md_minor)) {
+                       if (report_missmatch)
                                fprintf(stderr, Name ": %s has wrong super-minor.\n",
                                        devname);
                        goto loop;
                }
                if (ident->level != UnSet &&
                    (!tst || !tst->sb ||
-                    ident->level != info.array.level)) {
-                       if ((inargv && verbose >= 0) || verbose > 0)
+                    ident->level != content->array.level)) {
+                       if (report_missmatch)
                                fprintf(stderr, Name ": %s has wrong raid level.\n",
                                        devname);
                        goto loop;
                }
                if (ident->raid_disks != UnSet &&
                    (!tst || !tst->sb ||
-                    ident->raid_disks!= info.array.raid_disks)) {
-                       if ((inargv && verbose >= 0) || verbose > 0)
+                    ident->raid_disks!= content->array.raid_disks)) {
+                       if (report_missmatch)
                                fprintf(stderr, Name ": %s requires wrong number of drives.\n",
                                        devname);
                        goto loop;
                }
-               if (mdfd < 0) {
+               if (auto_assem) {
                        if (tst == NULL || tst->sb == NULL)
                                continue;
-                       if (update == NULL &&
-                           tst->ss->match_home(tst, homehost)==0) {
-                               if ((inargv && verbose >= 0) || verbose > 0)
-                                       fprintf(stderr, Name ": %s is not built for host %s.\n",
-                                               devname, homehost);
-                               /* Auto-assemble, and this is not a usable host */
-                               /* if update != NULL, we are updating the host
-                                * name... */
-                               goto loop;
-                       }
                }
                /* If we are this far, then we are nearly commited to this device.
                 * If the super_block doesn't exist, or doesn't match others,
@@ -320,6 +389,33 @@ int Assemble(struct supertype *st, char *mddev, int mdfd,
                        return 1;
                }
 
+               if (tst && tst->sb && tst->ss->container_content
+                   && tst->loaded_container) {
+                       /* we have the one container we need, don't keep
+                        * looking.  If the chosen member is active, skip.
+                        */
+                       if (is_member_busy(content->text_version)) {
+                               if (report_missmatch)
+                                       fprintf(stderr, Name ": member %s in %s is already assembled\n",
+                                               content->text_version,
+                                               devname);
+                               tst->ss->free_super(tst);
+                               tst = NULL;
+                               content = NULL;
+                               if (auto_assem)
+                                       goto loop;
+                               return 1;
+                       }
+                       st = tst; tst = NULL;
+                       if (!auto_assem && tmpdev->next != NULL) {
+                               fprintf(stderr, Name ": %s is a container, but is not "
+                                       "only device given: confused and aborting\n",
+                                       devname);
+                               st->ss->free_super(st);
+                               return 1;
+                       }
+                       break;
+               }
                if (st == NULL)
                        st = dup_super(tst);
                if (st->minor_version == -1)
@@ -332,21 +428,22 @@ int Assemble(struct supertype *st, char *mddev, int mdfd,
                         * Or, if we are auto assembling, we just ignore the second
                         * for now.
                         */
-                       if (mdfd < 0)
+                       if (auto_assem)
                                goto loop;
                        if (homehost) {
                                int first = st->ss->match_home(st, homehost);
                                int last = tst->ss->match_home(tst, homehost);
-                               if (first+last == 1) {
+                               if (first != last &&
+                                   (first == 1 || last == 1)) {
                                        /* We can do something */
                                        if (first) {/* just ignore this one */
-                                               if ((inargv && verbose >= 0) || verbose > 0)
+                                               if (report_missmatch)
                                                        fprintf(stderr, Name ": %s misses out due to wrong homehost\n",
                                                                devname);
                                                goto loop;
                                        } else { /* reject all those sofar */
                                                mddev_dev_t td;
-                                               if ((inargv && verbose >= 0) || verbose > 0)
+                                               if (report_missmatch)
                                                        fprintf(stderr, Name ": %s overrides previous devices due to good homehost\n",
                                                                devname);
                                                for (td=devlist; td != tmpdev; td=td->next)
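The hunk above replaces `first+last == 1` with a more explicit test. The difference only shows when match_home() returns something other than 0 or 1. The sketch below assumes, based on the `case -1:` handled later in this patch, that -1 is a possible result; its exact meaning is not stated in this diff. Both function names are hypothetical.

```c
/* Old test: exactly one of the two results is 1 and the other is 0. */
int old_can_decide(int first, int last)
{
	return first + last == 1;
}

/* New test: the results differ and at least one of them is 1,
 * so a pairing of 1 with -1 is also accepted. */
int new_can_decide(int first, int last)
{
	return first != last && (first == 1 || last == 1);
}
```

For the pairs (1,0) and (0,1) both tests agree. For (1,-1) the sum is 0, so the old test refuses to decide while the new one goes ahead.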
@@ -367,53 +464,93 @@ int Assemble(struct supertype *st, char *mddev, int mdfd,
                tmpdev->used = 1;
 
        loop:
+               if (tmpdev->content)
+                       goto next_member;
                if (tst)
                        tst->ss->free_super(tst);
        }
 
-       if (mdfd < 0) {
-               /* So... it is up to me to open the device.
-                * We create a name '/dev/md/XXX' based on the info in the
-                * superblock, and call open_mddev on that
-                */
-               mdu_array_info_t inf;
-               char *c;
-               if (!st || !st->sb) {
-                       return 2;
-               }
-               st->ss->getinfo_super(st, &info);
-               c = strchr(info.name, ':');
-               if (c) c++; else c= info.name;
-               if (isdigit(*c) && ((ident->autof & 7)==4 || (ident->autof&7)==6))
-                       /* /dev/md/d0 style for partitionable */
-                       xasprintf(&mddev, "/dev/md/d%s", c);
+       if (!st || !st->sb || !content)
+               return 2;
+
+       /* Now need to open the array device.  Use create_mddev */
+       if (content == &info)
+               st->ss->getinfo_super(st, content);
+
+       trustworthy = FOREIGN;
+       switch (st->ss->match_home(st, homehost)) {
+       case 0:
+               trustworthy = FOREIGN;
+               name = content->name;
+               break;
+       case 1:
+               trustworthy = LOCAL;
+               name = strchr(content->name, ':');
+               if (name)
+                       name++;
                else
-                       xasprintf(&mddev, "/dev/md/%s", c);
-               mdfd = open_mddev(mddev, ident->autof);
-               if (mdfd < 0) {
-                       st->ss->free_super(st);
-                       free(devices);
+                       name = content->name;
+               break;
+       case -1:
+               trustworthy = FOREIGN;
+               break;
+       }
+       if (!auto_assem && trustworthy == FOREIGN)
+               /* If the array is listed in mdadm.conf or on
+                * command line, then we trust the name
+                * even if the array doesn't look local
+                */
+               trustworthy = LOCAL;
+
+       if (content->name[0] == 0 &&
+           content->array.level == LEVEL_CONTAINER) {
+               name = content->text_version;
+               trustworthy = METADATA;
+       }
+       mdfd = create_mddev(mddev, name, ident->autof, trustworthy,
+                           chosen_name);
+       if (mdfd < 0) {
+               st->ss->free_super(st);
+               free(devices);
+               if (auto_assem)
                        goto try_again;
-               }
-               vers = md_get_version(mdfd);
-               if (ioctl(mdfd, GET_ARRAY_INFO, &inf)==0) {
-                       for (tmpdev = devlist ;
-                            tmpdev && tmpdev->used != 1;
-                            tmpdev = tmpdev->next)
-                               ;
-                       fprintf(stderr, Name ": %s already active, cannot restart it!\n", mddev);
-                       if (tmpdev)
-                               fprintf(stderr, Name ":   %s needed for %s...\n",
-                                       mddev, tmpdev->devname);
-                       close(mdfd);
-                       mdfd = -1;
-                       st->ss->free_super(st);
-                       free(devices);
+               return 1;
+       }
+       mddev = chosen_name;
+       vers = md_get_version(mdfd);
+       if (vers < 9000) {
+               fprintf(stderr, Name ": Assemble requires driver version 0.90.0 or later.\n"
+                       "    Upgrade your kernel or try --build\n");
+               close(mdfd);
+               return 1;
+       }
+       if (mddev_busy(fd2devnum(mdfd))) {
+               fprintf(stderr, Name ": %s already active, cannot restart it!\n",
+                       mddev);
+               for (tmpdev = devlist ;
+                    tmpdev && tmpdev->used != 1;
+                    tmpdev = tmpdev->next)
+                       ;
+               if (tmpdev && auto_assem)
+                       fprintf(stderr, Name ":   %s needed for %s...\n",
+                               mddev, tmpdev->devname);
+               close(mdfd);
+               mdfd = -3;
+               st->ss->free_super(st);
+               free(devices);
+               if (auto_assem)
                        goto try_again;
-               }
-               must_close = 1;
+               return 1;
        }
+       ioctl(mdfd, STOP_ARRAY, NULL); /* just in case it was started but has no content */
 
+#ifndef MDASSEMBLE
+       if (content != &info) {
+               /* This is a member of a container.  Try starting the array. */
+               return assemble_container_content(st, mdfd, content, runstop,
+                                          chosen_name, verbose);
+       }
+#endif
        /* Ok, no bad inconsistancy, we can try updating etc */
        bitmap_done = 0;
        for (tmpdev = devlist; tmpdev; tmpdev=tmpdev->next) if (tmpdev->used == 1) {
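The name selection in the switch on match_home() above can be summarised in a few lines. This is a sketch only: `choose_name` is a hypothetical helper, and the "host:name" form used in the note below is an assumption drawn from the strchr(content->name, ':') call in the patch.

```c
#include <string.h>

/* Pick the name to use for the device from the array name stored in
 * the metadata.  'match' mirrors the three cases in the patch:
 *   1  -> treated as local: drop anything up to and including ':'
 *   0  -> treated as foreign: keep the full name
 *  -1  -> no name is chosen (NULL)
 * Hypothetical helper for illustration only. */
const char *choose_name(int match, const char *name)
{
	const char *c;

	switch (match) {
	case 1:
		c = strchr(name, ':');
		return c ? c + 1 : name;
	case 0:
		return name;
	default:
		return NULL;
	}
}
```

So a stored name of "myhost:data" would become "data" when the array is treated as local, and would stay "myhost:data" when it is treated as foreign.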
@@ -446,19 +583,19 @@ int Assemble(struct supertype *st, char *mddev, int mdfd,
 
                        tst = dup_super(st);
                        tst->ss->load_super(tst, dfd, NULL);
-                       tst->ss->getinfo_super(tst, &info);
+                       tst->ss->getinfo_super(tst, content);
 
-                       memcpy(info.uuid, ident->uuid, 16);
-                       strcpy(info.name, ident->name);
-                       info.array.md_minor = minor(stb2.st_rdev);
+                       memcpy(content->uuid, ident->uuid, 16);
+                       strcpy(content->name, ident->name);
+                       content->array.md_minor = minor(stb2.st_rdev);
 
-                       tst->ss->update_super(tst, &info, update,
+                       tst->ss->update_super(tst, content, update,
                                              devname, verbose,
                                              ident->uuid_set, homehost);
                        if (strcmp(update, "uuid")==0 &&
                            !ident->uuid_set) {
                                ident->uuid_set = 1;
-                               memcpy(ident->uuid, info.uuid, 16);
+                               memcpy(ident->uuid, content->uuid, 16);
                        }
                        if (dfd < 0)
                                fprintf(stderr, Name ": Cannot open %s for superblock update\n",
@@ -472,7 +609,7 @@ int Assemble(struct supertype *st, char *mddev, int mdfd,
                        if (strcmp(update, "uuid")==0 &&
                            ident->bitmap_fd >= 0 && !bitmap_done) {
                                if (bitmap_update_uuid(ident->bitmap_fd,
-                                                      info.uuid,
+                                                      content->uuid,
                                                       tst->ss->swapuuid) != 0)
                                        fprintf(stderr, Name ": Could not update uuid on external bitmap.\n");
                                else
@@ -489,7 +626,7 @@ int Assemble(struct supertype *st, char *mddev, int mdfd,
                        remove_partitions(dfd);
 
                        tst->ss->load_super(tst, dfd, NULL);
-                       tst->ss->getinfo_super(tst, &info);
+                       tst->ss->getinfo_super(tst, content);
                        tst->ss->free_super(tst);
                        close(dfd);
                }
@@ -498,10 +635,10 @@ int Assemble(struct supertype *st, char *mddev, int mdfd,
 
                if (verbose > 0)
                        fprintf(stderr, Name ": %s is identified as a member of %s, slot %d.\n",
-                               devname, mddev, info.disk.raid_disk);
+                               devname, mddev, content->disk.raid_disk);
                devices[devcnt].devname = devname;
                devices[devcnt].uptodate = 0;
-               devices[devcnt].i = info;
+               devices[devcnt].i = *content;
                devices[devcnt].i.disk.major = major(stb.st_rdev);
                devices[devcnt].i.disk.minor = minor(stb.st_rdev);
                if (most_recent < devcnt) {
@@ -509,17 +646,17 @@ int Assemble(struct supertype *st, char *mddev, int mdfd,
                            > devices[most_recent].i.events)
                                most_recent = devcnt;
                }
-               if (info.array.level == -4)
+               if (content->array.level == -4)
                        /* with multipath, the raid_disk from the superblock is meaningless */
                        i = devcnt;
                else
                        i = devices[devcnt].i.disk.raid_disk;
                if (i+1 == 0) {
-                       if (nextspare < info.array.raid_disks)
-                               nextspare = info.array.raid_disks;
+                       if (nextspare < content->array.raid_disks)
+                               nextspare = content->array.raid_disks;
                        i = nextspare++;
                } else {
-                       if (i >= info.array.raid_disks &&
+                       if (i >= content->array.raid_disks &&
                            i >= nextspare)
                                nextspare = i+1;
                }
@@ -542,8 +679,8 @@ int Assemble(struct supertype *st, char *mddev, int mdfd,
                            == devices[devcnt].i.events
                            && (devices[best[i]].i.disk.minor
                                != devices[devcnt].i.disk.minor)
-                           && st->ss->major == 0
-                           && info.array.level != -4) {
+                           && st->ss == &super0
+                           && content->array.level != LEVEL_MULTIPATH) {
                                /* two different devices with identical superblock.
                                 * Could be a mis-detection caused by overlapping
                                 * partitions.  fail-safe.
@@ -558,7 +695,7 @@ int Assemble(struct supertype *st, char *mddev, int mdfd,
                                        inargv ? "the list" :
                                           "the\n      DEVICE list in mdadm.conf"
                                        );
-                               if (must_close) close(mdfd);
+                               close(mdfd);
                                return 1;
                        }
                        if (best[i] == -1
@@ -574,21 +711,21 @@ int Assemble(struct supertype *st, char *mddev, int mdfd,
                        mddev);
                if (st)
                        st->ss->free_super(st);
-               if (must_close) close(mdfd);
+               close(mdfd);
                return 1;
        }
 
        if (update && strcmp(update, "byteorder")==0)
                st->minor_version = 90;
 
-       st->ss->getinfo_super(st, &info);
-       clean = info.array.state & 1;
+       st->ss->getinfo_super(st, content);
+       clean = content->array.state & 1;
 
        /* now we have some devices that might be suitable.
         * I wonder how many
         */
-       avail = malloc(info.array.raid_disks);
-       memset(avail, 0, info.array.raid_disks);
+       avail = malloc(content->array.raid_disks);
+       memset(avail, 0, content->array.raid_disks);
        okcnt = 0;
        sparecnt=0;
        for (i=0; i< bestcnt ;i++) {
@@ -600,7 +737,7 @@ int Assemble(struct supertype *st, char *mddev, int mdfd,
                /* note: we ignore error flags in multipath arrays
                 * as they don't make sense
                 */
-               if (info.array.level != -4)
+               if (content->array.level != -4)
                        if (!(devices[j].i.disk.state & (1<<MD_DISK_SYNC))) {
                                if (!(devices[j].i.disk.state
                                      & (1<<MD_DISK_FAULTY)))
@@ -610,15 +747,15 @@ int Assemble(struct supertype *st, char *mddev, int mdfd,
                if (devices[j].i.events+event_margin >=
                    devices[most_recent].i.events) {
                        devices[j].uptodate = 1;
-                       if (i < info.array.raid_disks) {
+                       if (i < content->array.raid_disks) {
                                okcnt++;
                                avail[i]=1;
                        } else
                                sparecnt++;
                }
        }
-       while (force && !enough(info.array.level, info.array.raid_disks,
-                               info.array.layout, 1,
+       while (force && !enough(content->array.level, content->array.raid_disks,
+                               content->array.layout, 1,
                                avail, okcnt)) {
                /* Choose the newest best drive which is
                 * not up-to-date, update the superblock
@@ -628,7 +765,7 @@ int Assemble(struct supertype *st, char *mddev, int mdfd,
                struct supertype *tst;
                long long current_events;
                chosen_drive = -1;
-               for (i=0; i<info.array.raid_disks && i < bestcnt; i++) {
+               for (i=0; i<content->array.raid_disks && i < bestcnt; i++) {
                        int j = best[i];
                        if (j>=0 &&
                            !devices[j].uptodate &&
@@ -662,8 +799,8 @@ int Assemble(struct supertype *st, char *mddev, int mdfd,
                        devices[chosen_drive].i.events = 0;
                        continue;
                }
-               info.events = devices[most_recent].i.events;
-               tst->ss->update_super(tst, &info, "force-one",
+               content->events = devices[most_recent].i.events;
+               tst->ss->update_super(tst, content, "force-one",
                                     devices[chosen_drive].devname, verbose,
                                     0, NULL);
 
@@ -685,7 +822,7 @@ int Assemble(struct supertype *st, char *mddev, int mdfd,
                /* If there are any other drives of the same vintage,
                 * add them in as well.  We can't lose and we might gain
                 */
-               for (i=0; i<info.array.raid_disks && i < bestcnt ; i++) {
+               for (i=0; i<content->array.raid_disks && i < bestcnt ; i++) {
                        int j = best[i];
                        if (j >= 0 &&
                            !devices[j].uptodate &&
@@ -716,29 +853,32 @@ int Assemble(struct supertype *st, char *mddev, int mdfd,
                if ((fd=dev_open(devices[j].devname, O_RDONLY|O_EXCL))< 0) {
                        fprintf(stderr, Name ": Cannot open %s: %s\n",
                                devices[j].devname, strerror(errno));
-                       if (must_close) close(mdfd);
+                       close(mdfd);
                        return 1;
                }
                if (st->ss->load_super(st,fd, NULL)) {
                        close(fd);
                        fprintf(stderr, Name ": RAID superblock has disappeared from %s\n",
                                devices[j].devname);
-                       if (must_close) close(mdfd);
+                       close(mdfd);
                        return 1;
                }
                close(fd);
        }
        if (st->sb == NULL) {
                fprintf(stderr, Name ": No suitable drives found for %s\n", mddev);
-               if (must_close) close(mdfd);
+               close(mdfd);
                return 1;
        }
-       st->ss->getinfo_super(st, &info);
+       st->ss->getinfo_super(st, content);
+#ifndef MDASSEMBLE
+       sysfs_init(content, mdfd, 0);
+#endif
        for (i=0; i<bestcnt; i++) {
                int j = best[i];
                unsigned int desired_state;
 
-               if (i < info.array.raid_disks)
+               if (i < content->array.raid_disks)
                        desired_state = (1<<MD_DISK_ACTIVE) | (1<<MD_DISK_SYNC);
                else
                        desired_state = 0;
@@ -775,10 +915,10 @@ int Assemble(struct supertype *st, char *mddev, int mdfd,
 #endif
        }
        if (force && !clean &&
-           !enough(info.array.level, info.array.raid_disks,
-                   info.array.layout, clean,
+           !enough(content->array.level, content->array.raid_disks,
+                   content->array.layout, clean,
                    avail, okcnt)) {
-               change += st->ss->update_super(st, &info, "force-array",
+               change += st->ss->update_super(st, content, "force-array",
                                        devices[chosen_drive].devname, verbose,
                                               0, NULL);
                clean = 1;
@@ -790,14 +930,14 @@ int Assemble(struct supertype *st, char *mddev, int mdfd,
                if (fd < 0) {
                        fprintf(stderr, Name ": Could not open %s for write - cannot Assemble array.\n",
                                devices[chosen_drive].devname);
-                       if (must_close) close(mdfd);
+                       close(mdfd);
                        return 1;
                }
                if (st->ss->store_super(st, fd)) {
                        close(fd);
                        fprintf(stderr, Name ": Could not re-write superblock on %s\n",
                                devices[chosen_drive].devname);
-                       if (must_close) close(mdfd);
+                       close(mdfd);
                        return 1;
                }
                close(fd);
@@ -808,7 +948,7 @@ int Assemble(struct supertype *st, char *mddev, int mdfd,
         * The code of doing this lives in Grow.c
         */
 #ifndef MDASSEMBLE
-       if (info.reshape_active) {
+       if (content->reshape_active) {
                int err = 0;
                int *fdlist = malloc(sizeof(int)* bestcnt);
                for (i=0; i<bestcnt; i++) {
@@ -825,14 +965,14 @@ int Assemble(struct supertype *st, char *mddev, int mdfd,
                                fdlist[i] = -1;
                }
                if (!err)
-                       err = Grow_restart(st, &info, fdlist, bestcnt, backup_file);
+                       err = Grow_restart(st, content, fdlist, bestcnt, backup_file);
                while (i>0) {
                        i--;
                        if (fdlist[i]>=0) close(fdlist[i]);
                }
                if (err) {
                        fprintf(stderr, Name ": Failed to restore critical section for reshape, sorry.\n");
-                       if (must_close) close(mdfd);
+                       close(mdfd);
                        return err;
                }
        }
@@ -840,30 +980,29 @@ int Assemble(struct supertype *st, char *mddev, int mdfd,
        /* count number of in-sync devices according to the superblock.
         * We must have this number to start the array without -s or -R
         */
-       req_cnt = info.array.working_disks;
+       req_cnt = content->array.working_disks;
 
        /* Almost ready to actually *do* something */
        if (!old_linux) {
                int rv;
-               if ((vers % 100) >= 1) { /* can use different versions */
-                       mdu_array_info_t inf;
-                       memset(&inf, 0, sizeof(inf));
-                       inf.major_version = st->ss->major;
-                       inf.minor_version = st->minor_version;
-                       rv = ioctl(mdfd, SET_ARRAY_INFO, &inf);
-               } else
-                       rv = ioctl(mdfd, SET_ARRAY_INFO, NULL);
 
+               /* First, fill in the map, so that udev can find our name
+                * as soon as we become active.
+                */
+               map_update(NULL, fd2devnum(mdfd), content->text_version,
+                          content->uuid, chosen_name);
+
+               rv = set_array_info(mdfd, st, content);
                if (rv) {
-                       fprintf(stderr, Name ": SET_ARRAY_INFO failed for %s: %s\n",
+                       fprintf(stderr, Name ": failed to set array info for %s: %s\n",
                                mddev, strerror(errno));
-                       if (must_close) close(mdfd);
+                       close(mdfd);
                        return 1;
                }
                if (ident->bitmap_fd >= 0) {
                        if (ioctl(mdfd, SET_BITMAP_FILE, ident->bitmap_fd) != 0) {
                                fprintf(stderr, Name ": SET_BITMAP_FILE failed.\n");
-                               if (must_close) close(mdfd);
+                               close(mdfd);
                                return 1;
                        }
                } else if (ident->bitmap_file) {
@@ -872,13 +1011,13 @@ int Assemble(struct supertype *st, char *mddev, int mdfd,
                        if (bmfd < 0) {
                                fprintf(stderr, Name ": Could not open bitmap file %s\n",
                                        ident->bitmap_file);
-                               if (must_close) close(mdfd);
+                               close(mdfd);
                                return 1;
                        }
                        if (ioctl(mdfd, SET_BITMAP_FILE, bmfd) != 0) {
                                fprintf(stderr, Name ": Failed to set bitmapfile for %s\n", mddev);
                                close(bmfd);
-                               if (must_close) close(mdfd);
+                               close(mdfd);
                                return 1;
                        }
                        close(bmfd);
@@ -895,14 +1034,15 @@ int Assemble(struct supertype *st, char *mddev, int mdfd,
                                j = chosen_drive;
 
                        if (j >= 0 /* && devices[j].uptodate */) {
-                               if (ioctl(mdfd, ADD_NEW_DISK,
-                                         &devices[j].i.disk)!=0) {
+                               rv = add_disk(mdfd, st, content, &devices[j].i);
+
+                               if (rv) {
                                        fprintf(stderr, Name ": failed to add "
                                                        "%s to %s: %s\n",
                                                devices[j].devname,
                                                mddev,
                                                strerror(errno));
-                                       if (i < info.array.raid_disks
+                                       if (i < content->array.raid_disks
                                            || i == bestcnt)
                                                okcnt--;
                                        else
@@ -912,49 +1052,67 @@ int Assemble(struct supertype *st, char *mddev, int mdfd,
                                                        "to %s as %d\n",
                                                devices[j].devname, mddev,
                                                devices[j].i.disk.raid_disk);
-                       } else if (verbose > 0 && i < info.array.raid_disks)
+                       } else if (verbose > 0 && i < content->array.raid_disks)
                                fprintf(stderr, Name ": no uptodate device for "
                                                "slot %d of %s\n",
                                        i, mddev);
                }
 
+               if (content->array.level == LEVEL_CONTAINER) {
+                       if (verbose >= 0) {
+                               fprintf(stderr, Name ": Container %s has been "
+                                       "assembled with %d drive%s",
+                                       mddev, okcnt+sparecnt, okcnt+sparecnt==1?"":"s");
+                               if (okcnt < content->array.raid_disks)
+                                       fprintf(stderr, " (out of %d)",
+                                               content->array.raid_disks);
+                               fprintf(stderr, "\n");
+                       }
+                       sysfs_uevent(content, "change");
+                       wait_for(chosen_name);
+                       close(mdfd);
+                       return 0;
+               }
+
                if (runstop == 1 ||
                    (runstop <= 0 &&
-                    ( enough(info.array.level, info.array.raid_disks,
-                             info.array.layout, clean, avail, okcnt) &&
+                    ( enough(content->array.level, content->array.raid_disks,
+                             content->array.layout, clean, avail, okcnt) &&
                       (okcnt >= req_cnt || start_partial_ok)
                             ))) {
                        if (ioctl(mdfd, RUN_ARRAY, NULL)==0) {
                                if (verbose >= 0) {
                                        fprintf(stderr, Name ": %s has been started with %d drive%s",
                                                mddev, okcnt, okcnt==1?"":"s");
-                                       if (okcnt < info.array.raid_disks)
-                                               fprintf(stderr, " (out of %d)", info.array.raid_disks);
+                                       if (okcnt < content->array.raid_disks)
+                                               fprintf(stderr, " (out of %d)", content->array.raid_disks);
                                        if (sparecnt)
                                                fprintf(stderr, " and %d spare%s", sparecnt, sparecnt==1?"":"s");
                                        fprintf(stderr, ".\n");
                                }
-                               if (info.reshape_active &&
-                                   info.array.level >= 4 &&
-                                   info.array.level <= 6) {
+                               if (content->reshape_active &&
+                                   content->array.level >= 4 &&
+                                   content->array.level <= 6) {
                                        /* might need to increase the size
                                         * of the stripe cache - default is 256
                                         */
-                                       if (256 < 4 * (info.array.chunk_size/4096)) {
+                                       if (256 < 4 * (content->array.chunk_size/4096)) {
                                                struct mdinfo *sra = sysfs_read(mdfd, 0, 0);
                                                if (sra)
                                                        sysfs_set_num(sra, NULL,
                                                                      "stripe_cache_size",
-                                                                     (4 * info.array.chunk_size / 4096) + 1);
+                                                                     (4 * content->array.chunk_size / 4096) + 1);
                                        }
                                }
-                               if (must_close) {
+                               close(mdfd);
+                               wait_for(mddev);
+                               if (auto_assem) {
                                        int usecs = 1;
-                                       close(mdfd);
                                        /* There is a nasty race with 'mdadm --monitor'.
                                         * If it opens this device before we close it,
                                         * it gets an incomplete open on which IO
-                                        * doesn't work and the capacity if wrong.
+                                        * doesn't work and the capacity is
+                                        * wrong.
                                         * If we reopen (to check for layered devices)
                                         * before --monitor closes, we loose.
                                         *
@@ -979,59 +1137,57 @@ int Assemble(struct supertype *st, char *mddev, int mdfd,
                        fprintf(stderr, Name ": failed to RUN_ARRAY %s: %s\n",
                                mddev, strerror(errno));
 
-                       if (!enough(info.array.level, info.array.raid_disks,
-                                   info.array.layout, 1, avail, okcnt))
+                       if (!enough(content->array.level, content->array.raid_disks,
+                                   content->array.layout, 1, avail, okcnt))
                                fprintf(stderr, Name ": Not enough devices to "
                                        "start the array.\n");
-                       else if (!enough(info.array.level,
-                                        info.array.raid_disks,
-                                        info.array.layout, clean,
+                       else if (!enough(content->array.level,
+                                        content->array.raid_disks,
+                                        content->array.layout, clean,
                                         avail, okcnt))
                                fprintf(stderr, Name ": Not enough devices to "
                                        "start the array while not clean "
                                        "- consider --force.\n");
 
-                       if (must_close) {
+                       if (auto_assem)
                                ioctl(mdfd, STOP_ARRAY, NULL);
-                               close(mdfd);
-                       }
+                       close(mdfd);
                        return 1;
                }
                if (runstop == -1) {
                        fprintf(stderr, Name ": %s assembled from %d drive%s",
                                mddev, okcnt, okcnt==1?"":"s");
-                       if (okcnt != info.array.raid_disks)
-                               fprintf(stderr, " (out of %d)", info.array.raid_disks);
+                       if (okcnt != content->array.raid_disks)
+                               fprintf(stderr, " (out of %d)", content->array.raid_disks);
                        fprintf(stderr, ", but not started.\n");
-                       if (must_close) close(mdfd);
+                       close(mdfd);
                        return 0;
                }
                if (verbose >= -1) {
                        fprintf(stderr, Name ": %s assembled from %d drive%s", mddev, okcnt, okcnt==1?"":"s");
                        if (sparecnt)
                                fprintf(stderr, " and %d spare%s", sparecnt, sparecnt==1?"":"s");
-                       if (!enough(info.array.level, info.array.raid_disks,
-                                   info.array.layout, 1, avail, okcnt))
+                       if (!enough(content->array.level, content->array.raid_disks,
+                                   content->array.layout, 1, avail, okcnt))
                                fprintf(stderr, " - not enough to start the array.\n");
-                       else if (!enough(info.array.level,
-                                        info.array.raid_disks,
-                                        info.array.layout, clean,
+                       else if (!enough(content->array.level,
+                                        content->array.raid_disks,
+                                        content->array.layout, clean,
                                         avail, okcnt))
                                fprintf(stderr, " - not enough to start the "
                                        "array while not clean - consider "
                                        "--force.\n");
                        else {
-                               if (req_cnt == info.array.raid_disks)
+                               if (req_cnt == content->array.raid_disks)
                                        fprintf(stderr, " - need all %d to start it", req_cnt);
                                else
-                                       fprintf(stderr, " - need %d of %d to start", req_cnt, info.array.raid_disks);
+                                       fprintf(stderr, " - need %d of %d to start", req_cnt, content->array.raid_disks);
                                fprintf(stderr, " (use --run to insist).\n");
                        }
                }
-               if (must_close) {
+               if (auto_assem)
                        ioctl(mdfd, STOP_ARRAY, NULL);
-                       close(mdfd);
-               }
+               close(mdfd);
                return 1;
        } else {
                /* The "chosen_drive" is a good choice, and if necessary, the superblock has
@@ -1047,6 +1203,92 @@ int Assemble(struct supertype *st, char *mddev, int mdfd,
                }
 
        }
-       if (must_close) close(mdfd);
+       close(mdfd);
        return 0;
 }
+
+#ifndef MDASSEMBLE
+int assemble_container_content(struct supertype *st, int mdfd,
+                              struct mdinfo *content, int runstop,
+                              char *chosen_name, int verbose)
+{
+       struct mdinfo *dev, *sra;
+       int working = 0, preexist = 0;
+       struct map_ent *map = NULL;
+
+       sysfs_init(content, mdfd, 0);
+
+       sra = sysfs_read(mdfd, 0, GET_VERSION);
+       if (sra == NULL || strcmp(sra->text_version, content->text_version) != 0)
+               if (sysfs_set_array(content, md_get_version(mdfd)) != 0) {
+                       close(mdfd);
+                       return 1;
+               }
+       if (sra)
+               sysfs_free(sra);
+
+       for (dev = content->devs; dev; dev = dev->next)
+               if (sysfs_add_disk(content, dev) == 0)
+                       working++;
+               else if (errno == EEXIST)
+                       preexist++;
+       if (working == 0) {
+               close(mdfd);
+               return 1;/* Nothing new, don't try to start */
+       } else if (runstop > 0 ||
+                (working + preexist) >= content->array.working_disks) {
+               int err;
+
+               map_update(&map, fd2devnum(mdfd),
+                          content->text_version,
+                          content->uuid, chosen_name);
+               switch(content->array.level) {
+               case LEVEL_LINEAR:
+               case LEVEL_MULTIPATH:
+               case 0:
+                       err = sysfs_set_str(content, NULL, "array_state",
+                                           "active");
+                       break;
+               default:
+                       err = sysfs_set_str(content, NULL, "array_state",
+                                     "readonly");
+                       /* start mdmon if needed. */
+                       if (!err) {
+                               if (!mdmon_running(st->container_dev))
+                                       start_mdmon(st->container_dev);
+                               ping_monitor(devnum2devname(st->container_dev));
+                       }
+                       break;
+               }
+               if (!err)
+                       sysfs_set_safemode(content, content->safe_mode_delay);
+               if (verbose >= 0) {
+                       if (err)
+                               fprintf(stderr, Name
+                                       ": array %s now has %d devices",
+                                       chosen_name, working + preexist);
+                       else
+                               fprintf(stderr, Name
+                                       ": Started %s with %d devices",
+                                       chosen_name, working + preexist);
+                       if (preexist)
+                               fprintf(stderr, " (%d new)", working);
+                       fprintf(stderr, "\n");
+               }
+               if (!err)
+                       wait_for(chosen_name);
+               close(mdfd);
+               return 0;
+               /* FIXME should have an O_EXCL and wait for read-auto */
+       } else {
+               if (verbose >= 0)
+                       fprintf(stderr, Name
+                               ": %s assembled with %d devices but "
+                               "not started\n",
+                               chosen_name, working);
+               close(mdfd);
+               return 1;
+       }
+}
+#endif
+
diff --git a/Build.c b/Build.c
index 1e213ce..52fc0ca 100644 (file)
--- a/Build.c
+++ b/Build.c
 #define START_MD               _IO (MD_MAJOR, 2)
 #define STOP_MD                _IO (MD_MAJOR, 3)
 
-int Build(char *mddev, int mdfd, int chunk, int level, int layout,
-         int raiddisks,
-         mddev_dev_t devlist, int assume_clean,
-         char *bitmap_file, int bitmap_chunk, int write_behind, int delay, int verbose)
+int Build(char *mddev, int chunk, int level, int layout,
+         int raiddisks, mddev_dev_t devlist, int assume_clean,
+         char *bitmap_file, int bitmap_chunk, int write_behind,
+         int delay, int verbose, int autof)
 {
        /* Build a linear or raid0 arrays without superblocks
         * We cannot really do any checks, we just do it.
@@ -59,6 +59,10 @@ int Build(char *mddev, int mdfd, int chunk, int level, int layout,
        int bitmap_fd;
        unsigned long long size = ~0ULL;
        unsigned long long bitmapsize;
+       int mdfd;
+       char chosen_name[1024];
+       int uuid[4] = {0,0,0,0};
+       struct map_ent *map = NULL;
 
        /* scan all devices, make sure they really are block devices */
        for (dv = devlist; dv; dv=dv->next) {
@@ -112,6 +116,18 @@ int Build(char *mddev, int mdfd, int chunk, int level, int layout,
                        break;
                }
 
+       /* We need to create the device.  It can have no name. */
+       map_lock(&map);
+       mdfd = create_mddev(mddev, NULL, autof, LOCAL,
+                           chosen_name);
+       if (mdfd < 0) {
+               map_unlock(&map);
+               return 1;
+       }
+       mddev = chosen_name;
+
+       map_update(&map, fd2devnum(mdfd), "none", uuid, chosen_name);
+       map_unlock(&map);
 
        vers = md_get_version(mdfd);
 
@@ -140,17 +156,17 @@ int Build(char *mddev, int mdfd, int chunk, int level, int layout,
                if (ioctl(mdfd, SET_ARRAY_INFO, &array)) {
                        fprintf(stderr, Name ": SET_ARRAY_INFO failed for %s: %s\n",
                                mddev, strerror(errno));
-                       return 1;
+                       goto abort;
                }
        } else if (bitmap_file) {
                fprintf(stderr, Name ": bitmaps not supported with this kernel\n");
-               return 1;
+               goto abort;
        }
 
        if (bitmap_file && level <= 0) {
                fprintf(stderr, Name ": bitmaps not meaningful with level %s\n",
                        map_num(pers, level)?:"given");
-               return 1;
+               goto abort;
        }
        /* now add the devices */
        for ((i=0), (dv = devlist) ; dv ; i++, dv=dv->next) {
@@ -211,7 +227,7 @@ int Build(char *mddev, int mdfd, int chunk, int level, int layout,
                                if (bitmap_chunk == UnSet) {
                                        fprintf(stderr, Name ": %s cannot be openned.",
                                                bitmap_file);
-                                       return 1;
+                                       goto abort;
                                }
 #endif
                                if (vers < 9003) {
@@ -224,20 +240,20 @@ int Build(char *mddev, int mdfd, int chunk, int level, int layout,
                                bitmapsize = size>>9; /* FIXME wrong for RAID10 */
                                if (CreateBitmap(bitmap_file, 1, NULL, bitmap_chunk,
                                                 delay, write_behind, bitmapsize, major)) {
-                                       return 1;
+                                       goto abort;
                                }
                                bitmap_fd = open(bitmap_file, O_RDWR);
                                if (bitmap_fd < 0) {
                                        fprintf(stderr, Name ": %s cannot be openned.",
                                                bitmap_file);
-                                       return 1;
+                                       goto abort;
                                }
                        }
                        if (bitmap_fd >= 0) {
                                if (ioctl(mdfd, SET_BITMAP_FILE, bitmap_fd) < 0) {
                                        fprintf(stderr, Name ": Cannot set bitmap file for %s: %s\n",
                                                mddev, strerror(errno));
-                                       return 1;
+                                       goto abort;
                                }
                        }
                }
@@ -265,6 +281,8 @@ int Build(char *mddev, int mdfd, int chunk, int level, int layout,
        if (verbose >= 0)
                fprintf(stderr, Name ": array %s built and started.\n",
                        mddev);
+       close(mdfd);
+       wait_for(mddev);
        return 0;
 
  abort:
@@ -272,5 +290,6 @@ int Build(char *mddev, int mdfd, int chunk, int level, int layout,
            ioctl(mdfd, STOP_ARRAY, 0);
        else
            ioctl(mdfd, STOP_MD, 0);
+       close(mdfd);
        return 1;
 }
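
The Build.c hunks above replace each early `return 1` with `goto abort`, because Build() now opens `mdfd` itself and has to release it on every exit path. The following is a minimal, self-contained sketch of that single-exit cleanup idiom, not mdadm code: `build_sketch`, `sketch_live_fds`, and the `fail_step` parameter are hypothetical names introduced only for illustration, and the sketch assumes a POSIX system.

```c
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

/* Count of descriptors this sketch currently holds open (illustration only). */
static int live_fds;

/* Returns how many descriptors the sketch has not yet released. */
int sketch_live_fds(void)
{
	return live_fds;
}

/*
 * Hypothetical stand-in for a Build()-style routine: it opens its own
 * descriptor and then runs two steps. fail_step selects which step (1 or 2)
 * pretends to fail; any other value lets the function succeed.
 */
int build_sketch(const char *path, int fail_step)
{
	int fd = open(path, O_RDONLY);

	if (fd < 0)
		return 1;	/* nothing acquired yet, so a plain return is safe */
	live_fds++;

	if (fail_step == 1) {
		fprintf(stderr, "sketch: step 1 failed\n");
		goto abort;
	}
	if (fail_step == 2) {
		fprintf(stderr, "sketch: step 2 failed\n");
		goto abort;
	}

	/* the success path releases the descriptor too */
	close(fd);
	live_fds--;
	return 0;

 abort:
	/* single cleanup point shared by every failure after open() */
	close(fd);
	live_fds--;
	return 1;
}
```

Calling `build_sketch("/dev/null", 1)` returns 1 and leaves `sketch_live_fds()` at 0, which is the property the `close(mdfd)` added under the `abort:` label is meant to give Build().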
diff --git a/Create.c b/Create.c
index 9e65d0a..d33f891 100644
--- a/Create.c
+++ b/Create.c
 #include       "md_p.h"
 #include       <ctype.h>
 
-int Create(struct supertype *st, char *mddev, int mdfd,
+static int default_layout(struct supertype *st, int level, int verbose)
+{
+       int layout = UnSet;
+
+       if (st && st->ss->default_layout)
+               layout = st->ss->default_layout(level);
+
+       if (layout == UnSet)
+               switch(level) {
+               default: /* no layout */
+                       layout = 0;
+                       break;
+               case 10:
+                       layout = 0x102; /* near=2, far=1 */
+                       if (verbose > 0)
+                               fprintf(stderr,
+                                       Name ": layout defaults to n2\n");
+                       break;
+               case 5:
+               case 6:
+                       layout = map_name(r5layout, "default");
+                       if (verbose > 0)
+                               fprintf(stderr,
+                                       Name ": layout defaults to %s\n", map_num(r5layout, layout));
+                       break;
+               case LEVEL_FAULTY:
+                       layout = map_name(faultylayout, "default");
+
+                       if (verbose > 0)
+                               fprintf(stderr,
+                                       Name ": layout defaults to %s\n", map_num(faultylayout, layout));
+                       break;
+               }
+
+       return layout;
+}
+
+
+int Create(struct supertype *st, char *mddev,
           int chunk, int level, int layout, unsigned long long size, int raiddisks, int sparedisks,
           char *name, char *homehost, int *uuid,
           int subdevs, mddev_dev_t devlist,
           int runstop, int verbose, int force, int assume_clean,
-          char *bitmap_file, int bitmap_chunk, int write_behind, int delay)
+          char *bitmap_file, int bitmap_chunk, int write_behind, int delay, int autof)
 {
        /*
         * Create a new raid array.
@@ -55,6 +93,7 @@ int Create(struct supertype *st, char *mddev, int mdfd,
         * if runstop==run, or raiddisks disks were used,
         * RUN_ARRAY
         */
+       int mdfd;
        unsigned long long minsize=0, maxsize=0;
        char *mindisc = NULL;
        char *maxdisc = NULL;
@@ -66,31 +105,35 @@ int Create(struct supertype *st, char *mddev, int mdfd,
        int second_missing = subdevs * 2;
        int missing_disks = 0;
        int insert_point = subdevs * 2; /* where to insert a missing drive */
+       int total_slots;
        int pass;
        int vers;
        int rv;
        int bitmap_fd;
+       int have_container = 0;
+       int container_fd = -1;
+       int need_mdmon = 0;
        unsigned long long bitmapsize;
-       struct mdinfo info;
+       struct mdinfo info, *infos;
+       int did_default = 0;
+       int do_default_layout = 0;
+       unsigned long safe_mode_delay = 0;
+       char chosen_name[1024];
+       struct map_ent *map = NULL;
+       unsigned long long newsize;
 
        int major_num = BITMAP_MAJOR_HI;
 
        memset(&info, 0, sizeof(info));
 
-       vers = md_get_version(mdfd);
-       if (vers < 9000) {
-               fprintf(stderr, Name ": Create requires md driver version 0.90.0 or later\n");
-               return 1;
-       } else {
-               mdu_array_info_t inf;
-               memset(&inf, 0, sizeof(inf));
-               ioctl(mdfd, GET_ARRAY_INFO, &inf);
-               if (inf.working_disks != 0) {
-                       fprintf(stderr, Name ": another array by this name"
-                               " is already running.\n");
-                       return 1;
-               }
+       if (level == UnSet) {
+               /* "ddf" and "imsm" metadata only support one level - should possibly
+                * push this into metadata handler??
+                */
+               if (st && (st->ss == &super_ddf || st->ss == &super_imsm))
+                       level = LEVEL_CONTAINER;
        }
+
        if (level == UnSet) {
                fprintf(stderr,
                        Name ": a RAID level is needed to create an array.\n");
@@ -116,11 +159,55 @@ int Create(struct supertype *st, char *mddev, int mdfd,
                        Name ": This level does not support spare devices\n");
                return 1;
        }
+
+       if (subdevs == 1 && strcmp(devlist->devname, "missing") != 0) {
+               /* If given a single device, it might be a container, and we can
+                * extract a device list from there
+                */
+               mdu_array_info_t inf;
+               int fd;
+
+               memset(&inf, 0, sizeof(inf));
+               fd = open(devlist->devname, O_RDONLY);
+               if (fd >= 0 &&
+                   ioctl(fd, GET_ARRAY_INFO, &inf) == 0 &&
+                   inf.raid_disks == 0) {
+                       /* yep, looks like a container */
+                       if (st) {
+                               rv = st->ss->load_super(st, fd,
+                                                       devlist->devname);
+                               if (rv == 0)
+                                       have_container = 1;
+                       } else {
+                               st = guess_super(fd);
+                               if (st && !(rv = st->ss->
+                                           load_super(st, fd,
+                                                      devlist->devname)))
+                                       have_container = 1;
+                               else
+                                       st = NULL;
+                       }
+                       if (have_container) {
+                               subdevs = raiddisks;
+                               first_missing = subdevs * 2;
+                               second_missing = subdevs * 2;
+                               insert_point = subdevs * 2;
+                       }
+               }
+               if (fd >= 0)
+                       close(fd);
+       }
+       if (st && st->ss->external && sparedisks) {
+               fprintf(stderr,
+                       Name ": This metadata type does not support "
+                       "spare disks at create time\n");
+               return 1;
+       }
        if (subdevs > raiddisks+sparedisks) {
                fprintf(stderr, Name ": You have listed more devices (%d) than are in the array(%d)!\n", subdevs, raiddisks+sparedisks);
                return 1;
        }
-       if (subdevs < raiddisks+sparedisks) {
+       if (!have_container && subdevs < raiddisks+sparedisks) {
                fprintf(stderr, Name ": You haven't given enough devices (real or missing) to create this array\n");
                return 1;
        }
@@ -131,32 +218,12 @@ int Create(struct supertype *st, char *mddev, int mdfd,
        }
 
        /* now set some defaults */
-       if (layout == UnSet)
-               switch(level) {
-               default: /* no layout */
-                       layout = 0;
-                       break;
-               case 10:
-                       layout = 0x102; /* near=2, far=1 */
-                       if (verbose > 0)
-                               fprintf(stderr,
-                                       Name ": layout defaults to n1\n");
-                       break;
-               case 5:
-               case 6:
-                       layout = map_name(r5layout, "default");
-                       if (verbose > 0)
-                               fprintf(stderr,
-                                       Name ": layout defaults to %s\n", map_num(r5layout, layout));
-                       break;
-               case LEVEL_FAULTY:
-                       layout = map_name(faultylayout, "default");
 
-                       if (verbose > 0)
-                               fprintf(stderr,
-                                       Name ": layout defaults to %s\n", map_num(faultylayout, layout));
-                       break;
-               }
+
+       if (layout == UnSet) {
+               do_default_layout = 1;
+               layout = default_layout(st, level, verbose);
+       }
 
        if (level == 10)
                /* check layout fits in array*/
@@ -182,6 +249,7 @@ int Create(struct supertype *st, char *mddev, int mdfd,
        case 1:
        case LEVEL_FAULTY:
        case LEVEL_MULTIPATH:
+       case LEVEL_CONTAINER:
                if (chunk) {
                        chunk = 0;
                        if (verbose > 0)
@@ -192,15 +260,25 @@ int Create(struct supertype *st, char *mddev, int mdfd,
                fprintf(stderr, Name ": unknown level %d\n", level);
                return 1;
        }
+
+       newsize = size * 2;
+       if (st && ! st->ss->validate_geometry(st, level, layout, raiddisks,
+                                             chunk, size*2, NULL, &newsize, verbose>=0))
+               return 1;
+       if (size == 0) {
+               size = newsize / 2;
+               if (size && verbose > 0)
+                       fprintf(stderr, Name ": setting size to %lluK\n",
+                               (unsigned long long)size);
+       }
 
        /* now look at the subdevs */
        info.array.active_disks = 0;
        info.array.working_disks = 0;
        dnum = 0;
-       for (dv=devlist; dv; dv=dv->next, dnum++) {
+       for (dv=devlist; dv && !have_container; dv=dv->next, dnum++) {
                char *dname = dv->devname;
-               unsigned long long ldsize, freesize;
-               int fd;
+               unsigned long long freesize;
                if (strcasecmp(dname, "missing")==0) {
                        if (first_missing > dnum)
                                first_missing = dnum;
@@ -212,18 +290,6 @@ int Create(struct supertype *st, char *mddev, int mdfd,
                info.array.working_disks++;
                if (dnum < raiddisks)
                        info.array.active_disks++;
-               fd = open(dname, O_RDONLY|O_EXCL);
-               if (fd <0 ) {
-                       fprintf(stderr, Name ": Cannot open %s: %s\n",
-                               dname, strerror(errno));
-                       fail=1;
-                       continue;
-               }
-               if (!get_dev_size(fd, dname, &ldsize)) {
-                       fail = 1;
-                       close(fd);
-                       continue;
-               }
                if (st == NULL) {
                        struct createinfo *ci = conf_get_create_info();
                        if (ci)
@@ -231,33 +297,46 @@ int Create(struct supertype *st, char *mddev, int mdfd,
                }
                if (st == NULL) {
                        /* Need to choose a default metadata, which is different
-                        * depending on the sizes of devices
+                        * depending on geometry of array.
                         */
                        int i;
                        char *name = "default";
-                       if (level >= 1 && ldsize > (0x7fffffffULL<<10))
-                               name = "default/large";
-                       for(i=0; !st && superlist[i]; i++)
+                       for(i=0; !st && superlist[i]; i++) {
                                st = superlist[i]->match_metadata_desc(name);
+                               if (do_default_layout)
+                                       layout = default_layout(st, level, verbose);
+                               if (st && !st->ss->validate_geometry
+                                               (st, level, layout, raiddisks,
+                                                chunk, size*2, dname, &freesize,
+                                                verbose > 0))
+                                       st = NULL;
+                       }
 
                        if (!st) {
-                               fprintf(stderr, Name ": internal error - no default metadata style\n");
+                               fprintf(stderr, Name ": device %s not suitable "
+                                       "for any style of array\n",
+                                       dname);
                                exit(2);
                        }
-                       if (st->ss->major != 0 ||
+                       if (st->ss != &super0 ||
                            st->minor_version != 90)
-                               fprintf(stderr, Name ": Defaulting to version"
-                                       " %d.%d metadata\n",
-                                       st->ss->major,
-                                       st->minor_version);
-               }
-               freesize = st->ss->avail_size(st, ldsize >> 9);
-               if (freesize == 0) {
-                       fprintf(stderr, Name ": %s is too small: %luK\n",
-                               dname, (unsigned long)(ldsize>>10));
-                       fail = 1;
-                       close(fd);
-                       continue;
+                               did_default = 1;
+               } else {
+                       if (do_default_layout)
+                               layout = default_layout(st, level, verbose);
+                       if (!st->ss->validate_geometry(st, level, layout,
+                                                      raiddisks,
+                                                      chunk, size*2, dname,
+                                                      &freesize,
+                                                      verbose > 0)) {
+
+                               fprintf(stderr,
+                                       Name ": %s is not suitable for "
+                                       "this array.\n",
+                                       dname);
+                               fail = 1;
+                               continue;
+                       }
                }
 
                freesize /= 2; /* convert to K */
@@ -267,10 +346,10 @@ int Create(struct supertype *st, char *mddev, int mdfd,
                }
 
                if (size && freesize < size) {
-                       fprintf(stderr, Name ": %s is smaller that given size."
-                               " %lluK < %lluK + superblock\n", dname, freesize, size);
+                       fprintf(stderr, Name ": %s is smaller than given size."
+                               " %lluK < %lluK + metadata\n",
+                               dname, freesize, size);
                        fail = 1;
-                       close(fd);
                        continue;
                }
                if (maxdisc == NULL || (maxdisc && freesize > maxsize)) {
@@ -282,24 +361,38 @@ int Create(struct supertype *st, char *mddev, int mdfd,
                        minsize = freesize;
                }
                if (runstop != 1 || verbose >= 0) {
+                       int fd = open(dname, O_RDONLY);
+                       if (fd <0 ) {
+                               fprintf(stderr, Name ": Cannot open %s: %s\n",
+                                       dname, strerror(errno));
+                               fail=1;
+                               continue;
+                       }
                        warn |= check_ext2(fd, dname);
                        warn |= check_reiser(fd, dname);
                        warn |= check_raid(fd, dname);
+                       close(fd);
                }
-               close(fd);
        }
+       if (have_container)
+               info.array.working_disks = raiddisks;
        if (fail) {
                fprintf(stderr, Name ": create aborted\n");
                return 1;
        }
        if (size == 0) {
-               if (mindisc == NULL) {
+               if (mindisc == NULL && !have_container) {
                        fprintf(stderr, Name ": no size and no drives given - aborting create.\n");
                        return 1;
                }
-               if (level > 0 || level == LEVEL_MULTIPATH || level == LEVEL_FAULTY) {
+               if (level > 0 || level == LEVEL_MULTIPATH
+                   || level == LEVEL_FAULTY
+                   || st->ss->external ) {
                        /* size is meaningful */
-                       if (minsize > 0x100000000ULL && st->ss->major == 0) {
+                       if (!st->ss->validate_geometry(st, level, layout,
+                                                      raiddisks,
+                                                      chunk, minsize*2,
+                                                      NULL, NULL, 0)) {
                                fprintf(stderr, Name ": devices too large for RAID level %d\n", level);
                                return 1;
                        }
@@ -308,13 +401,21 @@ int Create(struct supertype *st, char *mddev, int mdfd,
                                fprintf(stderr, Name ": size set to %lluK\n", size);
                }
        }
-       if (level > 0 && ((maxsize-size)*100 > maxsize)) {
+       if (!have_container && level > 0 && ((maxsize-size)*100 > maxsize)) {
                if (runstop != 1 || verbose >= 0)
-                       fprintf(stderr, Name ": largest drive (%s) exceed size (%lluK) by more than 1%%\n",
+                       fprintf(stderr, Name ": largest drive (%s) exceeds size (%lluK) by more than 1%%\n",
                                maxdisc, size);
                warn = 1;
        }
 
+       if (st->ss->detail_platform && st->ss->detail_platform(0, 1) != 0) {
+               if (runstop != 1 || verbose >= 0)
+                       fprintf(stderr, Name ": %s unable to enumerate platform support\n"
+                               "    array may not be compatible with hardware/firmware\n",
+                               st->ss->name);
+               warn = 1;
+       }
+
        if (warn) {
                if (runstop!= 1) {
                        if (!ask("Continue creating array? ")) {
@@ -331,7 +432,8 @@ int Create(struct supertype *st, char *mddev, int mdfd,
         * as missing, so that a reconstruct happens (faster than re-parity)
         * FIX: Can we do this for raid6 as well?
         */
-       if (assume_clean==0 && force == 0 && first_missing >= raiddisks) {
+       if (st->ss->external == 0 &&
+           assume_clean==0 && force == 0 && first_missing >= raiddisks) {
                switch ( level ) {
                case 4:
                case 5:
@@ -348,6 +450,7 @@ int Create(struct supertype *st, char *mddev, int mdfd,
         * into a spare, else the create will fail
         */
        if (assume_clean == 0 && force == 0 && first_missing < raiddisks &&
+           st->ss->external == 0 &&
            second_missing >= raiddisks && level == 6) {
                insert_point = raiddisks - 1;
                if (insert_point == first_missing)
@@ -357,12 +460,34 @@ int Create(struct supertype *st, char *mddev, int mdfd,
                missing_disks++;
        }
 
-       if (level <= 0 && first_missing != subdevs * 2) {
+       if (level <= 0 && first_missing < subdevs * 2) {
                fprintf(stderr,
                        Name ": This level does not support missing devices\n");
                return 1;
        }
 
+       /* We need to create the device */
+       map_lock(&map);
+       mdfd = create_mddev(mddev, name, autof, LOCAL, chosen_name);
+       if (mdfd < 0)
+               return 1;
+       mddev = chosen_name;
+
+       vers = md_get_version(mdfd);
+       if (vers < 9000) {
+               fprintf(stderr, Name ": Create requires md driver version 0.90.0 or later\n");
+               goto abort;
+       } else {
+               mdu_array_info_t inf;
+               memset(&inf, 0, sizeof(inf));
+               ioctl(mdfd, GET_ARRAY_INFO, &inf);
+               if (inf.working_disks != 0) {
+                       fprintf(stderr, Name ": another array by this name"
+                               " is already running.\n");
+                       goto abort;
+               }
+       }
+
        /* Ok, lets try some ioctls */
 
        info.array.level = level;
@@ -382,12 +507,16 @@ int Create(struct supertype *st, char *mddev, int mdfd,
             ( level == 6 && (insert_point < raiddisks
                              || second_missing < raiddisks))
             ||
+            ( level <= 0 )
+            ||
             assume_clean
-               )
+               ) {
                info.array.state = 1; /* clean, but one+ drive will be missing*/
-       else
+               info.resync_start = ~0ULL;
+       } else {
                info.array.state = 0; /* not clean, but no errors */
-
+               info.resync_start = 0;
+       }
        if (level == 10) {
                /* for raid10, the bitmap size is the capacity of the array,
                 * which is array.size * raid_disks / ncopies;
@@ -424,7 +553,6 @@ int Create(struct supertype *st, char *mddev, int mdfd,
                + info.array.failed_disks;
        info.array.layout = layout;
        info.array.chunk_size = chunk*1024;
-       info.array.major_version = st->ss->major;
 
        if (name == NULL || *name == 0) {
                /* base name on mddev */
@@ -435,6 +563,7 @@ int Create(struct supertype *st, char *mddev, int mdfd,
                 *  /dev/md/home -> home
                 *  /dev/mdhome -> home
                 */
+               /* FIXME compare this with rules in create_mddev */
                name = strrchr(mddev, '/');
                if (name) {
                        name++;
@@ -451,7 +580,37 @@ int Create(struct supertype *st, char *mddev, int mdfd,
                }
        }
        if (!st->ss->init_super(st, &info.array, size, name, homehost, uuid))
-               return 1;
+               goto abort;
+
+       total_slots = info.array.nr_disks;
+       sysfs_init(&info, mdfd, 0);
+       st->ss->getinfo_super(st, &info);
+
+       if (did_default && verbose >= 0) {
+               if (is_subarray(info.text_version)) {
+                       int dnum = devname2devnum(info.text_version+1);
+                       char *path;
+                       int mdp = get_mdp_major();
+                       struct mdinfo *mdi;
+                       if (dnum >= 0)
+                               path = map_dev(MD_MAJOR, dnum, 1);
+                       else
+                               path = map_dev(mdp, (-1-dnum)<< 6, 1);
+
+                       mdi = sysfs_read(-1, dnum, GET_VERSION);
+
+                       fprintf(stderr, Name ": Creating array inside "
+                               "%s container %s\n", 
+                               mdi?mdi->text_version:"managed", path);
+                       sysfs_free(mdi);
+               } else
+                       fprintf(stderr, Name ": Defaulting to version"
+                               " %s metadata\n", info.text_version);
+       }
+
+       map_update(&map, fd2devnum(mdfd), info.text_version,
+                  info.uuid, chosen_name);
+       map_unlock(&map);
 
        if (bitmap_file && vers < 9003) {
                major_num = BITMAP_MAJOR_HOSTENDIAN;
@@ -464,31 +623,55 @@ int Create(struct supertype *st, char *mddev, int mdfd,
        if (bitmap_file && strcmp(bitmap_file, "internal")==0) {
                if ((vers%100) < 2) {
                        fprintf(stderr, Name ": internal bitmaps not supported by this kernel.\n");
-                       return 1;
+                       goto abort;
                }
                if (!st->ss->add_internal_bitmap(st, &bitmap_chunk,
                                                 delay, write_behind,
                                                 bitmapsize, 1, major_num)) {
                        fprintf(stderr, Name ": Given bitmap chunk size not supported.\n");
-                       return 1;
+                       goto abort;
                }
                bitmap_file = NULL;
        }
 
 
+       sysfs_init(&info, mdfd, 0);
 
-       if ((vers % 100) >= 1) { /* can use different versions */
-               mdu_array_info_t inf;
-               memset(&inf, 0, sizeof(inf));
-               inf.major_version = st->ss->major;
-               inf.minor_version = st->minor_version;
-               rv = ioctl(mdfd, SET_ARRAY_INFO, &inf);
-       } else
-               rv = ioctl(mdfd, SET_ARRAY_INFO, NULL);
+       if (st->ss->external && st->subarray[0]) {
+               /* member */
+
+               /* When creating a member, we need to be careful
+                * to negotiate with mdmon properly.
+                * If it is already running, we cannot write to
+                * the devices and must ask it to do that part.
+                * If it isn't running, we write to the devices,
+                * and then start it.
+                * We hold an exclusive open on the container
+                * device to make sure mdmon doesn't exit after
+                * we checked that it is running.
+                *
+                * If it is already running, we reuse it via st->update_tail.
+                */
+               container_fd = open_dev_excl(st->container_dev);
+               if (container_fd < 0) {
+                       fprintf(stderr, Name ": Cannot get exclusive "
+                               "open on container - weird.\n");
+                       goto abort;
+               }
+               if (mdmon_running(st->container_dev)) {
+                       if (verbose)
+                               fprintf(stderr, Name ": reusing mdmon "
+                                       "for %s.\n",
+                                       devnum2devname(st->container_dev));
+                       st->update_tail = &st->updates;
+               } else
+                       need_mdmon = 1;
+       }
+       rv = set_array_info(mdfd, st, &info);
        if (rv) {
-               fprintf(stderr, Name ": SET_ARRAY_INFO failed for %s: %s\n",
+               fprintf(stderr, Name ": failed to set array info for %s: %s\n",
                        mddev, strerror(errno));
-               return 1;
+               goto abort;
        }
 
        if (bitmap_file) {
@@ -499,22 +682,22 @@ int Create(struct supertype *st, char *mddev, int mdfd,
                                 delay, write_behind,
                                 bitmapsize,
                                 major_num)) {
-                       return 1;
+                       goto abort;
                }
                bitmap_fd = open(bitmap_file, O_RDWR);
                if (bitmap_fd < 0) {
                        fprintf(stderr, Name ": weird: %s cannot be openned\n",
                                bitmap_file);
-                       return 1;
+                       goto abort;
                }
                if (ioctl(mdfd, SET_BITMAP_FILE, bitmap_fd) < 0) {
                        fprintf(stderr, Name ": Cannot set bitmap file for %s: %s\n",
                                mddev, strerror(errno));
-                       return 1;
+                       goto abort;
                }
        }
 
-
+       infos = malloc(sizeof(*infos) * total_slots);
 
        for (pass=1; pass <=2 ; pass++) {
                mddev_dev_t moved_disk = NULL; /* the disk that was moved out of the insert point */
@@ -523,76 +706,153 @@ int Create(struct supertype *st, char *mddev, int mdfd,
                     dv=(dv->next)?(dv->next):moved_disk, dnum++) {
                        int fd;
                        struct stat stb;
+                       struct mdinfo *inf = &infos[dnum];
 
-                       info.disk.number = dnum;
+                       if (dnum >= total_slots)
+                               abort();
                        if (dnum == insert_point) {
                                moved_disk = dv;
+                               continue;
                        }
-                       info.disk.raid_disk = info.disk.number;
-                       if (info.disk.raid_disk < raiddisks)
-                               info.disk.state = (1<<MD_DISK_ACTIVE) |
+                       if (strcasecmp(dv->devname, "missing")==0)
+                               continue;
+                       if (have_container)
+                               moved_disk = NULL;
+                       if (have_container && dnum < info.array.raid_disks - 1)
+                               /* repeatedly use the container */
+                               moved_disk = dv;
+
+                       switch(pass) {
+                       case 1:
+                               *inf = info;
+
+                               inf->disk.number = dnum;
+                               inf->disk.raid_disk = dnum;
+                               if (inf->disk.raid_disk < raiddisks)
+                                       inf->disk.state = (1<<MD_DISK_ACTIVE) |
                                                (1<<MD_DISK_SYNC);
-                       else
-                               info.disk.state = 0;
-                       if (dv->writemostly == 1)
-                               info.disk.state |= (1<<MD_DISK_WRITEMOSTLY);
-
-                       if (dnum == insert_point ||
-                           strcasecmp(dv->devname, "missing")==0) {
-                               info.disk.major = 0;
-                               info.disk.minor = 0;
-                               info.disk.state = (1<<MD_DISK_FAULTY);
-                       } else {
-                               fd = open(dv->devname, O_RDONLY|O_EXCL);
-                               if (fd < 0) {
-                                       fprintf(stderr, Name ": failed to open %s after earlier success - aborting\n",
-                                               dv->devname);
-                                       return 1;
+                               else
+                                       inf->disk.state = 0;
+
+                               if (dv->writemostly == 1)
+                                       inf->disk.state |= (1<<MD_DISK_WRITEMOSTLY);
+
+                               if (have_container)
+                                       fd = -1;
+                               else {
+                                       if (st->ss->external && st->subarray[0])
+                                               fd = open(dv->devname, O_RDWR);
+                                       else
+                                               fd = open(dv->devname, O_RDWR|O_EXCL);
+
+                                       if (fd < 0) {
+                                               fprintf(stderr, Name ": failed to open %s "
+                                                       "after earlier success - aborting\n",
+                                                       dv->devname);
+                                               goto abort;
+                                       }
+                                       fstat(fd, &stb);
+                                       inf->disk.major = major(stb.st_rdev);
+                                       inf->disk.minor = minor(stb.st_rdev);
+                               }
+                               if (fd >= 0)
+                                       remove_partitions(fd);
+                               if (st->ss->add_to_super(st, &inf->disk,
+                                                        fd, dv->devname))
+                                       goto abort;
+                               st->ss->getinfo_super(st, inf);
+                               safe_mode_delay = inf->safe_mode_delay;
+
+                               if (have_container && verbose > 0)
+                                       fprintf(stderr, Name ": Using %s for device %d\n",
+                                               map_dev(inf->disk.major,
+                                                       inf->disk.minor,
+                                                       0), dnum);
+
+                               if (!have_container) {
+                                       /* getinfo_super might have lost these ... */
+                                       inf->disk.major = major(stb.st_rdev);
+                                       inf->disk.minor = minor(stb.st_rdev);
                                }
-                               fstat(fd, &stb);
-                               info.disk.major = major(stb.st_rdev);
-                               info.disk.minor = minor(stb.st_rdev);
-                               remove_partitions(fd);
-                               close(fd);
-                       }
-                       switch(pass){
-                       case 1:
-                               st->ss->add_to_super(st, &info.disk);
                                break;
                        case 2:
-                               if (info.disk.state == 1) break;
-                               Kill(dv->devname, 0, 1); /* Just be sure it is clean */
-                               Kill(dv->devname, 0, 1); /* and again, there could be two superblocks */
-                               st->ss->write_init_super(st, &info.disk,
-                                                        dv->devname);
-
-                               if (ioctl(mdfd, ADD_NEW_DISK, &info.disk)) {
-                                       fprintf(stderr, Name ": ADD_NEW_DISK for %s failed: %s\n",
+                               inf->errors = 0;
+                               rv = 0;
+
+                               rv = add_disk(mdfd, st, &info, inf);
+
+                               if (rv) {
+                                       fprintf(stderr,
+                                               Name ": ADD_NEW_DISK for %s "
+                                               "failed: %s\n",
                                                dv->devname, strerror(errno));
                                        st->ss->free_super(st);
-                                       return 1;
+                                       goto abort;
                                }
-
                                break;
                        }
-                       if (dv == moved_disk && dnum != insert_point) break;
+                       if (!have_container &&
+                           dv == moved_disk && dnum != insert_point) break;
+               }
+               if (pass == 1) {
+                       st->ss->write_init_super(st);
+                       flush_metadata_updates(st);
                }
        }
+       free(infos);
        st->ss->free_super(st);
 
-       /* param is not actually used */
-       if (runstop == 1 || subdevs >= raiddisks) {
-               mdu_param_t param;
-               if (ioctl(mdfd, RUN_ARRAY, &param)) {
-                       fprintf(stderr, Name ": RUN_ARRAY failed: %s\n",
-                               strerror(errno));
-                       Manage_runstop(mddev, mdfd, -1, 0);
-                       return 1;
+       if (level == LEVEL_CONTAINER) {
+               /* No need to start.  But we should signal udev to
+                * create links */
+               sysfs_uevent(&info, "change");
+               if (verbose >= 0)
+                       fprintf(stderr, Name ": container %s prepared.\n", mddev);
+               wait_for(chosen_name);
+       } else if (runstop == 1 || subdevs >= raiddisks) {
+               if (st->ss->external) {
+                       switch(level) {
+                       case LEVEL_LINEAR:
+                       case LEVEL_MULTIPATH:
+                       case 0:
+                               sysfs_set_str(&info, NULL, "array_state",
+                                             "active");
+                               need_mdmon = 0;
+                               break;
+                       default:
+                               sysfs_set_str(&info, NULL, "array_state",
+                                             "readonly");
+                               break;
+                       }
+                       sysfs_set_safemode(&info, safe_mode_delay);
+               } else {
+                       /* param is not actually used */
+                       mdu_param_t param;
+                       if (ioctl(mdfd, RUN_ARRAY, &param)) {
+                               fprintf(stderr, Name ": RUN_ARRAY failed: %s\n",
+                                       strerror(errno));
+                               Manage_runstop(mddev, mdfd, -1, 0);
+                               goto abort;
+                       }
                }
                if (verbose >= 0)
                        fprintf(stderr, Name ": array %s started.\n", mddev);
+               if (st->ss->external && st->subarray[0]) {
+                       if (need_mdmon)
+                               start_mdmon(st->container_dev);
+
+                       ping_monitor(devnum2devname(st->container_dev));
+                       close(container_fd);
+               }
+               wait_for(chosen_name);
        } else {
                fprintf(stderr, Name ": not starting array - not enough devices.\n");
        }
+       close(mdfd);
        return 0;
+
+ abort:
+       if (mdfd >= 0)
+               close(mdfd);
+       return 1;
 }
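Note on the Create.c hunks above: the early "return 1" exits are replaced by jumps to a single "abort:" label, so the array descriptor is closed on every failure path. As shown, that label closes mdfd only; "infos" is freed only on the success path. The following is a minimal sketch of the same single-exit pattern, not mdadm code. The helper name with_cleanup and its parameters are hypothetical, and it also releases its scratch allocation on the failure path.

```c
#include <assert.h>
#include <fcntl.h>
#include <stdlib.h>
#include <unistd.h>

/* Hypothetical helper (not part of mdadm): open `path`, allocate
 * `slots` ints of scratch space, and release both on every path.
 * Returns 0 on success and 1 on failure, like Create(). */
int with_cleanup(const char *path, size_t slots)
{
	int fd = -1;
	int *scratch = NULL;

	assert(path != NULL);

	fd = open(path, O_RDONLY);
	if (fd < 0)
		goto abort;

	scratch = malloc(sizeof(*scratch) * slots);
	if (scratch == NULL)
		goto abort;

	/* ... real work would go here ... */

	free(scratch);
	close(fd);
	return 0;

 abort:
	free(scratch);		/* free(NULL) is a no-op */
	if (fd >= 0)
		close(fd);
	return 1;
}
```

Initialising fd to -1 and scratch to NULL before the first jump is what makes the shared cleanup code safe to reach from any point.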
diff --git a/Detail.c b/Detail.c
index 8f86ead..dea605e 100644
--- a/Detail.c
+++ b/Detail.c
@@ -30,6 +30,7 @@
 #include       "mdadm.h"
 #include       "md_p.h"
 #include       "md_u.h"
+#include       <dirent.h>
 
 int Detail(char *dev, int brief, int export, int test, char *homehost)
 {
@@ -56,6 +57,8 @@ int Detail(char *dev, int brief, int export, int test, char *homehost)
        int max_disks = MD_SB_DISKS; /* just a default */
        struct mdinfo info;
        struct mdinfo *sra;
+       char *member = NULL;
+       char *container = NULL;
 
        int rv = test ? 4 : 1;
        int avail_disks = 0;
@@ -96,7 +99,21 @@ int Detail(char *dev, int brief, int export, int test, char *homehost)
                stb.st_rdev = 0;
        rv = 0;
 
-       if (st) max_disks = st->max_devs;
+       if (st)
+               max_disks = st->max_devs;
+
+       if (sra && is_subarray(sra->text_version) &&
+               strchr(sra->text_version+1, '/')) {
+               /* This is a subarray of some container.
+                * We want the name of the container, and the member
+                */
+               char *s = strchr(sra->text_version+1, '/');
+               int dn;
+               *s++ = '\0';
+               member = s;
+               dn = devname2devnum(sra->text_version+1);
+               container = map_dev(dev2major(dn), dev2minor(dn), 1);
+       }
 
        /* try to load a superblock */
        for (d= 0; d<max_disks; d++) {
@@ -111,7 +128,8 @@ int Detail(char *dev, int brief, int export, int test, char *homehost)
                        continue;
                if ((dv=map_dev(disk.major, disk.minor, 1))) {
                        if ((!st || !st->sb) &&
-                           (disk.state & (1<<MD_DISK_ACTIVE))) {
+                           (array.raid_disks == 0 || 
+                            (disk.state & (1<<MD_DISK_ACTIVE)))) {
                                /* try to read the superblock from this device
                                 * to get more info
                                 */
@@ -119,8 +137,9 @@ int Detail(char *dev, int brief, int export, int test, char *homehost)
                                if (fd2 >=0 && st &&
                                    st->ss->load_super(st, fd2, NULL) == 0) {
                                        st->ss->getinfo_super(st, &info);
-                                       if (info.array.ctime != array.ctime ||
-                                           info.array.level != array.level)
+                                       if (array.raid_disks != 0 && /* container */
+                                           (info.array.ctime != array.ctime ||
+                                            info.array.level != array.level))
                                                st->ss->free_super(st);
                                }
                                if (fd2 >= 0) close(fd2);
@@ -132,30 +151,69 @@ int Detail(char *dev, int brief, int export, int test, char *homehost)
        c = map_num(pers, array.level);
 
        if (export) {
-               if (c)
-                       printf("MD_LEVEL=%s\n", c);
-               printf("MD_DEVICES=%d\n", array.raid_disks);
-               if (sra && sra->array.major_version < 0)
-                       printf("MD_METADATA=%s\n", sra->text_version);
-               else
-                       printf("MD_METADATA=%d.%02d\n",
-                              array.major_version, array.minor_version);
+               if (array.raid_disks) {
+                       if (c)
+                               printf("MD_LEVEL=%s\n", c);
+                       printf("MD_DEVICES=%d\n", array.raid_disks);
+               } else {
+                       printf("MD_LEVEL=container\n");
+                       printf("MD_DEVICES=%d\n", array.nr_disks);
+               }
+               if (container) {
+                       printf("MD_CONTAINER=%s\n", container);
+                       printf("MD_MEMBER=%s\n", member);
+               } else {
+                       if (sra && sra->array.major_version < 0)
+                               printf("MD_METADATA=%s\n", sra->text_version);
+                       else
+                               printf("MD_METADATA=%d.%02d\n",
+                                      array.major_version, array.minor_version);
+               }
+               
+               if (st && st->sb) {
+                       struct mdinfo info;
+                       char nbuf[64];
+                       struct map_ent *mp, *map = NULL;
+                       st->ss->getinfo_super(st, &info);
+                       fname_from_uuid(st, &info, nbuf, ':');
+                       printf("MD_UUID=%s\n", nbuf+5);
+                       mp = map_by_uuid(&map, info.uuid);
+                       if (mp && mp->path &&
+                           strncmp(mp->path, "/dev/md/", 8) == 0)
+                               printf("MD_DEVNAME=%s\n", mp->path+8);
 
-               if (st && st->sb)
-                       st->ss->export_detail_super(st);
+                       if (st->ss->export_detail_super)
+                               st->ss->export_detail_super(st);
+               } else {
+                       struct map_ent *mp, *map = NULL;
+                       mp = map_by_devnum(&map, fd2devnum(fd));
+                       if (mp && mp->path &&
+                           strncmp(mp->path, "/dev/md/", 8) == 0)
+                               printf("MD_DEVNAME=%s\n", mp->path+8);
+               }
                goto out;
        }
 
        if (brief) {
                mdu_bitmap_file_t bmf;
-               printf("ARRAY %s level=%s num-devices=%d", dev,
-                      c?c:"-unknown-",
-                      array.raid_disks );
-               if (sra && sra->array.major_version < 0)
-                       printf(" metadata=%s", sra->text_version);
+               if (array.raid_disks)
+                       printf("ARRAY %s level=%s num-devices=%d", dev,
+                              c?c:"-unknown-",
+                              array.raid_disks );
                else
-                       printf(" metadata=%d.%02d",
-                              array.major_version, array.minor_version);
+                       printf("ARRAY %s level=container num-devices=%d",
+                              dev, array.nr_disks);
+
+               if (container) {
+                       printf(" container=%s", container);
+                       printf(" member=%s", member);
+               } else {
+                       if (sra && sra->array.major_version < 0)
+                               printf(" metadata=%s", sra->text_version);
+                       else
+                               printf(" metadata=%d.%02d",
+                                      array.major_version, array.minor_version);
+               }
 
                /* Only try GET_BITMAP_FILE for 0.90.01 and later */
                if (vers >= 9001 &&
@@ -180,14 +238,19 @@ int Detail(char *dev, int brief, int export, int test, char *homehost)
 
                printf("%s:\n", dev);
 
+               if (container)
+                       printf("      Container : %s, member %s\n", container, member);
+               else {
                if (sra && sra->array.major_version < 0)
                        printf("        Version : %s\n", sra->text_version);
                else
                        printf("        Version : %d.%02d\n",
                               array.major_version, array.minor_version);
+               }
 
                atime = array.ctime;
-               printf("  Creation Time : %.24s\n", ctime(&atime));
+               if (atime)
+                       printf("  Creation Time : %.24s\n", ctime(&atime));
                if (array.raid_disks == 0) c = "container";
                printf("     Raid Level : %s\n", c?c:"-unknown-");
                if (larray_size)
@@ -206,9 +269,13 @@ int Detail(char *dev, int brief, int export, int test, char *homehost)
                                printf("  Used Dev Size : %d%s\n", array.size,
                                       human_size((long long)array.size<<10));
                }
-               printf("   Raid Devices : %d\n", array.raid_disks);
+               if (array.raid_disks)
+                       printf("   Raid Devices : %d\n", array.raid_disks);
                printf("  Total Devices : %d\n", array.nr_disks);
-               printf("Preferred Minor : %d\n", array.md_minor);
+               if (!container && 
+                   ((sra == NULL && array.major_version == 0) ||
+                    (sra && sra->array.major_version == 0)))
+                       printf("Preferred Minor : %d\n", array.md_minor);
                if (sra == NULL || sra->array.major_version >= 0)
                        printf("    Persistence : Superblock is %spersistent\n",
                               array.not_persistent?"not ":"");
@@ -222,17 +289,22 @@ int Detail(char *dev, int brief, int export, int test, char *homehost)
                } else if (array.state & (1<<MD_SB_BITMAP_PRESENT))
                        printf("  Intent Bitmap : Internal\n\n");
                atime = array.utime;
-               printf("    Update Time : %.24s\n", ctime(&atime));
-               printf("          State : %s%s%s%s\n",
-                      (array.state&(1<<MD_SB_CLEAN))?"clean":"active",
-                      array.active_disks < array.raid_disks? ", degraded":"",
-                      (!e || e->percent < 0) ? "" :
-                       (e->resync) ? ", resyncing": ", recovering",
-                      larray_size ? "": ", Not Started");
-               printf(" Active Devices : %d\n", array.active_disks);
+               if (atime)
+                       printf("    Update Time : %.24s\n", ctime(&atime));
+               if (array.raid_disks)
+                       printf("          State : %s%s%s%s\n",
+                              (array.state&(1<<MD_SB_CLEAN))?"clean":"active",
+                              array.active_disks < array.raid_disks? ", degraded":"",
+                              (!e || e->percent < 0) ? "" :
+                              (e->resync) ? ", resyncing": ", recovering",
+                              larray_size ? "": ", Not Started");
+               if (array.raid_disks)
+                       printf(" Active Devices : %d\n", array.active_disks);
                printf("Working Devices : %d\n", array.working_disks);
-               printf(" Failed Devices : %d\n", array.failed_disks);
-               printf("  Spare Devices : %d\n", array.spare_disks);
+               if (array.raid_disks) {
+                       printf(" Failed Devices : %d\n", array.failed_disks);
+                       printf("  Spare Devices : %d\n", array.spare_disks);
+               }
                printf("\n");
                if (array.level == 5) {
                        c = map_num(r5layout, array.layout);
@@ -306,7 +378,45 @@ This is pretty boring
                if (st && st->sb)
                        st->ss->detail_super(st, homehost);
 
-               printf("    Number   Major   Minor   RaidDevice State\n");
+               if (array.raid_disks == 0 && sra && sra->array.major_version == -1
+                   && sra->array.minor_version == -2 && sra->text_version[0] != '/') {
+                       /* This looks like a container.  Find any active arrays
+                        * that claim to be a member.
+                        */
+                       DIR *dir = opendir("/sys/block");
+                       struct dirent *de;
+
+                       printf("  Member Arrays :");
+
+                       while (dir && (de = readdir(dir)) != NULL) {
+                               char path[200];
+                               char vbuf[1024];
+                               int nlen = strlen(sra->sys_name);
+                               int dn;
+                               if (de->d_name[0] == '.')
+                                       continue;
+                               sprintf(path, "/sys/block/%s/md/metadata_version",
+                                       de->d_name);
+                               if (load_sys(path, vbuf) < 0)
+                                       continue;
+                               if (strncmp(vbuf, "external:", 9) != 0 ||
+                                   !is_subarray(vbuf+9) ||
+                                   strncmp(vbuf+10, sra->sys_name, nlen) != 0 ||
+                                   vbuf[10+nlen] != '/')
+                                       continue;
+                               dn = devname2devnum(de->d_name);
+                               printf(" %s", map_dev(dev2major(dn),
+                                                     dev2minor(dn), 1));
+                       }
+                       if (dir)
+                               closedir(dir);
+                       printf("\n\n");
+               }
+
+               if (array.raid_disks)
+                       printf("    Number   Major   Minor   RaidDevice State\n");
+               else
+                       printf("    Number   Major   Minor   RaidDevice\n");
        }
        disks = malloc(max_disks * sizeof(mdu_disk_info_t));
        for (d=0; d<max_disks; d++) {
@@ -350,6 +460,9 @@ This is pretty boring
                        else
                                printf("   %5d   %5d    %5d    %5d     ",
                                       disk.number, disk.major, disk.minor, disk.raid_disk);
+               }
+               if (!brief && array.raid_disks) {
+
                        if (disk.state & (1<<MD_DISK_FAULTY)) {
                                printf(" faulty");
                                if (disk.raid_disk < array.raid_disks &&
@@ -401,7 +514,7 @@ This is pretty boring
                }
                if (!brief) printf("\n");
        }
-       if (spares && brief) printf(" spares=%d", spares);
+       if (spares && brief && array.raid_disks) printf(" spares=%d", spares);
        if (brief && st && st->sb)
                st->ss->brief_detail_super(st);
        st->ss->free_super(st);
@@ -417,3 +530,44 @@ out:
        close(fd);
        return rv;
 }
+
+int Detail_Platform(struct superswitch *ss, int scan, int verbose)
+{
+       /* Display platform capabilities for the given metadata format.
+        * 'scan' in this context means iterate over all metadata types.
+        */
+       int i;
+       int err = 1;
+
+       if (ss && ss->detail_platform)
+               err = ss->detail_platform(verbose, 0);
+       else if (ss) {
+               if (verbose)
+                       fprintf(stderr, Name ": %s metadata is platform independent\n",
+                               ss->name ? : "[no name]");
+       } else if (!scan) {
+               if (verbose)
+                       fprintf(stderr, Name ": specify a metadata type or --scan\n");
+       }
+
+       if (!scan)
+               return err;
+
+       for (i = 0; superlist[i]; i++) {
+               struct superswitch *meta = superlist[i];
+
+               if (meta == ss)
+                       continue;
+               if (verbose)
+                       fprintf(stderr, Name ": checking metadata %s\n",
+                               meta->name ? : "[no name]");
+               if (!meta->detail_platform) {
+                       if (verbose)
+                               fprintf(stderr, Name ": %s metadata is platform independent\n",
+                                       meta->name ? : "[no name]");
+               } else
+                       err |= meta->detail_platform(verbose, 0);
+       }
+
+       return err;
+}
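Note on the Detail.c hunks above: for a member array, Detail() skips the first character of sra->text_version, cuts the string at the next '/', and treats the two halves as the container's device name and the member name. The following is a minimal sketch of that in-place split. The helper name split_subarray is hypothetical, and the "/md127/0" input form used in the usage line is an assumption based on the +1 offset and the strchr() call in the patch.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical helper (not part of mdadm): split a string of the
 * assumed form "/<container>/<member>" in place.  On success the two
 * output pointers refer into `text_version` and 0 is returned;
 * -1 is returned if the string does not have that shape. */
int split_subarray(char *text_version, char **container, char **member)
{
	char *slash;

	assert(text_version && container && member);

	if (text_version[0] != '/')
		return -1;
	slash = strchr(text_version + 1, '/');
	if (slash == NULL)
		return -1;

	*slash = '\0';			/* terminate the container name */
	*container = text_version + 1;	/* skip the leading '/' */
	*member = slash + 1;
	return 0;
}
```

Called on a writable buffer holding "/md127/0", this would yield "md127" and "0". Because the split overwrites the separator, later readers of the buffer see only the container part, which is also true of the patched Detail().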
diff --git a/Examine.c b/Examine.c
index 5de9202..3827e7e 100644
--- a/Examine.c
+++ b/Examine.c
@@ -123,12 +123,13 @@ int Examine(mddev_dev_t devlist, int brief, int export, int scan,
                                st->ss->getinfo_super(st, &ap->info);
                                st->ss->free_super(st);
                        }
-                       if (!(ap->info.disk.state & MD_DISK_SYNC))
+                       if (!(ap->info.disk.state & (1<<MD_DISK_SYNC)))
                                ap->spares++;
                        d = dl_strdup(devlist->devname);
                        dl_add(ap->devs, d);
                } else if (export) {
-                       st->ss->export_examine_super(st);
+                       if (st->ss->export_examine_super)
+                               st->ss->export_examine_super(st);
                } else {
                        printf("%s:\n",devlist->devname);
                        st->ss->examine_super(st, homehost);
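Note on the Examine.c hunk above: the MD_DISK_* names are bit positions, not masks, so the old test "state & MD_DISK_SYNC" checked the wrong bit. The following is a minimal sketch of the difference. The enum values are assumptions chosen to mirror the kernel's md_p.h (faulty 0, active 1, sync 2), and both function names are hypothetical.

```c
#include <assert.h>

/* Bit positions, assumed here to match md_p.h; illustrative only. */
enum { DISK_FAULTY = 0, DISK_ACTIVE = 1, DISK_SYNC = 2 };

/* Correct: shift the bit position into a mask before testing. */
int is_in_sync(int state)
{
	return (state & (1 << DISK_SYNC)) != 0;
}

/* Pre-patch form: uses the bit position itself as the mask.  With the
 * values assumed above, 2 is the mask for the ACTIVE bit, so this
 * reports "active" rather than "in sync". */
int is_in_sync_buggy(int state)
{
	return (state & DISK_SYNC) != 0;
}
```

Under these assumed values, a device that is in sync but not active would have been counted as a spare by the old test, and an active device that is not in sync would not.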
diff --git a/Grow.c b/Grow.c
index a8194bf..14e48f5 100644
--- a/Grow.c
+++ b/Grow.c
@@ -69,7 +69,7 @@ int Grow_Add_device(char *devname, int fd, char *newdev)
                return 1;
        }
 
-       nfd = open(newdev, O_RDWR|O_EXCL);
+       nfd = open(newdev, O_RDWR|O_EXCL|O_DIRECT);
        if (nfd < 0) {
                fprintf(stderr, Name ": cannot open %s\n", newdev);
                return 1;
@@ -396,7 +396,8 @@ struct mdp_backup_super {
        __u64   arraystart;
        __u64   length;
        __u32   sb_csum;        /* csum of preceeding bytes. */
-};
+       __u8 pad[512-68];
+} __attribute__((aligned(512))) bsb;
 
 int bsb_csum(char *buf, int len)
 {
@@ -420,7 +421,6 @@ int Grow_reshape(char *devname, int fd, int quiet, char *backup_file,
        struct mdu_array_info_s array;
        char *c;
 
-       struct mdp_backup_super bsb;
        struct supertype *st;
 
        int nlevel, olevel;
@@ -720,7 +720,8 @@ int Grow_reshape(char *devname, int fd, int quiet, char *backup_file,
                 * a leading superblock 4K earlier.
                 */
                for (i=array.raid_disks; i<d; i++) {
-                       char buf[4096];
+                       char abuf[4096+512];
+                       char *buf = (char*)(((unsigned long)abuf+511)& ~511);
                        if (i==d-1 && backup_file) {
                                /* This is the backup file */
                                offsets[i] = 8;
@@ -731,7 +732,7 @@ int Grow_reshape(char *devname, int fd, int quiet, char *backup_file,
                                fprintf(stderr, Name ": could not seek...\n");
                                goto abort;
                        }
-                       memset(buf, 0, sizeof(buf));
+                       memset(buf, 0, 4096);
                        bsb.devstart = __cpu_to_le64(offsets[i]);
                        bsb.sb_csum = bsb_csum((char*)&bsb, ((char*)&bsb.sb_csum)-((char*)&bsb));
                        memcpy(buf, &bsb, sizeof(bsb));
@@ -793,7 +794,7 @@ int Grow_reshape(char *devname, int fd, int quiet, char *backup_file,
                        if (lseek64(fdlist[i], (offsets[i]+last_block)<<9, 0) < 0 ||
                            write(fdlist[i], &bsb, sizeof(bsb)) != sizeof(bsb) ||
                            fsync(fdlist[i]) != 0) {
-                               fprintf(stderr, Name ": %s: fail to save metadata for critical region backups.\n",
+                               fprintf(stderr, Name ": %s: failed to save metadata for critical region backups.\n",
                                        devname);
                                goto abort_resume;
                        }
@@ -882,7 +883,6 @@ int Grow_restart(struct supertype *st, struct mdinfo *info, int *fdlist, int cnt
 
        for (i=old_disks-(backup_file?1:0); i<cnt; i++) {
                struct mdinfo dinfo;
-               struct mdp_backup_super bsb;
                char buf[4096];
                int fd;
 
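Note on the Grow.c hunks above: once the new device is opened with O_DIRECT, the buffers passed to write() generally need to be aligned, which is why the patch pads the backup superblock to 512 bytes and rounds "abuf" up to a 512-byte boundary. The following is a minimal sketch of that rounding. The helper name align_up is hypothetical, and it uses uintptr_t where the patch casts through unsigned long.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper (not part of mdadm): return the first address at
 * or after `p` that is a multiple of `align`.  `align` must be a power
 * of two, and the caller must have over-allocated by at least
 * `align - 1` bytes so the result stays inside the buffer. */
char *align_up(char *p, uintptr_t align)
{
	uintptr_t addr = (uintptr_t)p;
	uintptr_t offset;

	assert(align != 0 && (align & (align - 1)) == 0);

	/* distance from addr to the next boundary, 0 if already aligned */
	offset = (align - (addr & (align - 1))) & (align - 1);
	return p + offset;
}
```

With a buffer declared as "char abuf[4096+512]", align_up(abuf, 512) leaves at least 4096 usable bytes after the returned pointer, matching the sizes used in the patch.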
diff --git a/Incremental.c b/Incremental.c
index 08e0e6f..99fc1bf 100644
--- a/Incremental.c
+++ b/Incremental.c
@@ -48,7 +48,8 @@ int Incremental(char *devname, int verbose, int runstop,
         * 2/ Find metadata, reject if none appropriate (check
         *       version/name from args)
         * 3/ Check if there is a match in mdadm.conf
-        * 3a/ if not, check for homehost match.  If no match, reject.
+        * 3a/ if not, check for homehost match.  If no match, assemble as
+        *    a 'foreign' array.
         * 4/ Determine device number.
         * - If in mdadm.conf with std name, use that
         * - UUID in /var/run/mdadm.map  use that
@@ -56,6 +57,7 @@ int Incremental(char *devname, int verbose, int runstop,
         * - Choose a free, high number.
         * - Use a partitioned device unless strong suggestion not to.
         *         e.g. auto=md
+        *   Don't choose partitioned for containers.
         * 5/ Find out if array already exists
         * 5a/ if it does not
         * - choose a name, from mdadm.conf or 'name' field in array.
@@ -67,6 +69,7 @@ int Incremental(char *devname, int verbose, int runstop,
         * - add the device
         * 6/ Make sure /var/run/mdadm.map contains this array.
         * 7/ Is there enough devices to possibly start the array?
+        *     For a container, this means running Incremental_container.
         * 7a/ if not, finish with success.
         * 7b/ if yes,
         * - read all metadata and arrange devices like -A does
@@ -74,20 +77,22 @@ int Incremental(char *devname, int verbose, int runstop,
         *   start the array (auto-readonly).
         */
        struct stat stb;
-       struct mdinfo info, info2;
+       struct mdinfo info;
        struct mddev_ident_s *array_list, *match;
        char chosen_name[1024];
        int rv;
-       int devnum;
        struct map_ent *mp, *map = NULL;
        int dfd, mdfd;
        char *avail;
        int active_disks;
+       int trustworthy = FOREIGN;
+       char *name_to_use;
+       mdu_array_info_t ainf;
+
        struct createinfo *ci = conf_get_create_info();
-       char *name;
 
 
-       /* 1/ Check if devices is permitted by mdadm.conf */
+       /* 1/ Check if device is permitted by mdadm.conf */
 
        if (!conf_test_dev(devname)) {
                if (verbose >= 0)
@@ -137,9 +142,10 @@ int Incremental(char *devname, int verbose, int runstop,
                close(dfd);
                return 1;
        }
-       st->ss->getinfo_super(st, &info);
        close (dfd);
 
+       memset(&info, 0, sizeof(info));
+       st->ss->getinfo_super(st, &info);
        /* 3/ Check if there is a match in mdadm.conf */
 
        array_list = conf_get_ident(NULL);
@@ -148,7 +154,7 @@ int Incremental(char *devname, int verbose, int runstop,
                if (array_list->uuid_set &&
                    same_uuid(array_list->uuid, info.uuid, st->ss->swapuuid)
                    == 0) {
-                       if (verbose >= 2)
+                       if (verbose >= 2 && array_list->devname)
                                fprintf(stderr, Name
                                        ": UUID differs from %s.\n",
                                        array_list->devname);
@@ -156,7 +162,7 @@ int Incremental(char *devname, int verbose, int runstop,
                }
                if (array_list->name[0] &&
                    strcasecmp(array_list->name, info.name) != 0) {
-                       if (verbose >= 2)
+                       if (verbose >= 2 && array_list->devname)
                                fprintf(stderr, Name
                                        ": Name differs from %s.\n",
                                        array_list->devname);
@@ -164,7 +170,7 @@ int Incremental(char *devname, int verbose, int runstop,
                }
                if (array_list->devices &&
                    !match_oneof(array_list->devices, devname)) {
-                       if (verbose >= 2)
+                       if (verbose >= 2 && array_list->devname)
                                fprintf(stderr, Name
                                        ": Not a listed device for %s.\n",
                                        array_list->devname);
@@ -172,7 +178,7 @@ int Incremental(char *devname, int verbose, int runstop,
                }
                if (array_list->super_minor != UnSet &&
                    array_list->super_minor != info.array.md_minor) {
-                       if (verbose >= 2)
+                       if (verbose >= 2 && array_list->devname)
                                fprintf(stderr, Name
                                        ": Different super-minor to %s.\n",
                                        array_list->devname);
@@ -182,7 +188,7 @@ int Incremental(char *devname, int verbose, int runstop,
                    !array_list->name[0] &&
                    !array_list->devices &&
                    array_list->super_minor == UnSet) {
-                       if (verbose  >= 2)
+                       if (verbose >= 2 && array_list->devname)
                                fprintf(stderr, Name
                             ": %s doesn't have any identifying information.\n",
                                        array_list->devname);
@@ -191,10 +197,15 @@ int Incremental(char *devname, int verbose, int runstop,
                /* FIXME, should I check raid_disks and level too?? */
 
                if (match) {
-                       if (verbose >= 0)
-                               fprintf(stderr, Name
+                       if (verbose >= 0) {
+                               if (match->devname && array_list->devname)
+                                       fprintf(stderr, Name
                   ": we match both %s and %s - cannot decide which to use.\n",
-                                       match->devname, array_list->devname);
+                                               match->devname, array_list->devname);
+                               else
+                                       fprintf(stderr, Name
+                                               ": multiple lines in mdadm.conf match\n");
+                       }
                        return 2;
                }
                match = array_list;
@@ -204,24 +215,13 @@ int Incremental(char *devname, int verbose, int runstop,
         * but don't trust the 'name' in the array. Thus a 'random' minor
         * number will be assigned, and the device name will be based
         * on that. */
-       name = info.name;
-       if (!match) {
-               if (homehost == NULL ||
-                   st->ss->match_home(st, homehost) == 0) {
-                       if (verbose >= 0)
-                               fprintf(stderr, Name
-             ": not found in mdadm.conf and not identified by homehost.\n");
-                       name = NULL;
-               }
-       }
-       /* 4/ Determine device number. */
-       /* - If in mdadm.conf with std name, get number from name. */
-       /* - UUID in /var/run/mdadm.map  get number from mapping */
-       /* - If name is suggestive, use that. unless in use with */
-       /*           different uuid. */
-       /* - Choose a free, high number. */
-       /* - Use a partitioned device unless strong suggestion not to. */
-       /*         e.g. auto=md */
+       if (match)
+               trustworthy = LOCAL;
+       else if (homehost == NULL ||
+                st->ss->match_home(st, homehost) != 1)
+               trustworthy = FOREIGN;
+       else
+               trustworthy = LOCAL;
 
        /* There are three possible sources for 'autof':  command line,
         * ARRAY line in mdadm.conf, or CREATE line in mdadm.conf.
@@ -233,86 +233,73 @@ int Incremental(char *devname, int verbose, int runstop,
        if (autof == 0)
                autof = ci->autof;
 
-       if (match && (rv = is_standard(match->devname, &devnum))) {
-               devnum = (rv > 0) ? (-1-devnum) : devnum;
-       } else if ((mp = map_by_uuid(&map, info.uuid)) != NULL)
-               devnum = mp->devnum;
-       else {
-               /* Have to guess a bit. */
-               int use_partitions = 1;
-               char *np, *ep;
-               if ((autof&7) == 3 || (autof&7) == 5)
-                       use_partitions = 0;
-               np = name ? strchr(name, ':') : ":NONAME";
-               if (np)
-                       np++;
-               else
-                       np = name;
-               devnum = strtoul(np, &ep, 10);
-               if (ep > np && *ep == 0) {
-                       /* This is a number.  Let check that it is unused. */
-                       if (mddev_busy(use_partitions ? (-1-devnum) : devnum))
-                               devnum = -1;
-               } else
-                       devnum = -1;
-
-               if (devnum < 0) {
-                       /* Haven't found anything yet, choose something free */
-                       devnum = find_free_devnum(use_partitions);
-
-                       if (devnum == NoMdDev) {
-                               fprintf(stderr, Name
-                                       ": No spare md devices!!\n");
-                               return 2;
-                       }
-               } else
-                       devnum = use_partitions ? (-1-devnum) : devnum;
+       if (st->ss->container_content && st->loaded_container) {
+               /* This is a pre-built container array, so we do something
+                * rather different.
+                */
+               return Incremental_container(st, devname, verbose, runstop,
+                                            autof, trustworthy);
        }
-       mdfd = open_mddev_devnum(match ? match->devname : NULL,
-                                devnum,
-                                name,
-                                chosen_name, autof >> 3);
-       if (mdfd < 0) {
-               fprintf(stderr, Name ": failed to open %s: %s.\n",
-                       chosen_name, strerror(errno));
-               return 2;
+       name_to_use = strchr(info.name, ':');
+       if (name_to_use)
+               name_to_use++;
+       else
+               name_to_use = info.name;
+
+       if ((!name_to_use || name_to_use[0] == 0) &&
+           info.array.level == LEVEL_CONTAINER &&
+           trustworthy == LOCAL) {
+               name_to_use = info.text_version;
+               trustworthy = METADATA;
        }
-       /* 5/ Find out if array already exists */
-       if (! mddev_busy(devnum)) {
-       /* 5a/ if it does not */
-       /* - choose a name, from mdadm.conf or 'name' field in array. */
-       /* - create the array */
-       /* - add the device */
-               mdu_array_info_t ainf;
-               mdu_disk_info_t disk;
-               char md[20];
+
+       /* 4/ Check if array exists.
+        */
+       map_lock(&map);
+       mp = map_by_uuid(&map, info.uuid);
+       if (mp) {
+               mdfd = open_mddev(mp->path, 0);
+               if (mdfd < 0 && mddev_busy(mp->devnum)) {
+                       /* maybe udev hasn't created it yet. */
+                       char buf[50];
+                       sprintf(buf, "%d:%d", dev2major(mp->devnum),
+                               dev2minor(mp->devnum));
+                       mdfd = dev_open(buf, O_RDWR);
+               }
+       } else
+               mdfd = -1;
+
+       if (mdfd < 0) {
                struct mdinfo *sra;
+               struct mdinfo dinfo;
 
-               memset(&ainf, 0, sizeof(ainf));
-               ainf.major_version = st->ss->major;
-               ainf.minor_version = st->minor_version;
-               if (ioctl(mdfd, SET_ARRAY_INFO, &ainf) != 0) {
-                       fprintf(stderr, Name
-                               ": SET_ARRAY_INFO failed for %s: %s\b",
+               /* Couldn't find an existing array, maybe make a new one */
+               mdfd = create_mddev(match ? match->devname : NULL,
+                                   name_to_use, autof, trustworthy, chosen_name);
+
+               if (mdfd < 0)
+                       return 1;
+
+               sysfs_init(&info, mdfd, 0);
+
+               if (set_array_info(mdfd, st, &info) != 0) {
+                       fprintf(stderr, Name ": failed to set array info for %s: %s\n",
                                chosen_name, strerror(errno));
                        close(mdfd);
                        return 2;
                }
-               sprintf(md, "%d.%d\n", st->ss->major, st->minor_version);
-               sra = sysfs_read(mdfd, devnum, GET_VERSION);
-               sysfs_set_str(sra, NULL, "metadata_version", md);
-               memset(&disk, 0, sizeof(disk));
-               disk.major = major(stb.st_rdev);
-               disk.minor = minor(stb.st_rdev);
-               sysfs_free(sra);
-               if (ioctl(mdfd, ADD_NEW_DISK, &disk) != 0) {
+
+               dinfo = info;
+               dinfo.disk.major = major(stb.st_rdev);
+               dinfo.disk.minor = minor(stb.st_rdev);
+               if (add_disk(mdfd, st, &info, &dinfo) != 0) {
                        fprintf(stderr, Name ": failed to add %s to %s: %s.\n",
                                devname, chosen_name, strerror(errno));
                        ioctl(mdfd, STOP_ARRAY, 0);
                        close(mdfd);
                        return 2;
                }
-               sra = sysfs_read(mdfd, devnum, GET_DEVS);
+               sra = sysfs_read(mdfd, fd2devnum(mdfd), GET_DEVS);
                if (!sra || !sra->devs || sra->devs->disk.raid_disk >= 0) {
                        /* It really should be 'none' - must be old buggy
                         * kernel, and mdadm -I may not be able to complete.
@@ -326,6 +313,12 @@ int Incremental(char *devname, int verbose, int runstop,
                        sysfs_free(sra);
                        return 2;
                }
+               info.array.working_disks = 1;
+               sysfs_free(sra);
+               /* 6/ Make sure /var/run/mdadm.map contains this array. */
+               map_update(&map, fd2devnum(mdfd),
+                          info.text_version,
+                          info.uuid, chosen_name);
        } else {
        /* 5b/ if it does */
        /* - check one drive in array to make sure metadata is a reasonably */
@@ -333,38 +326,31 @@ int Incremental(char *devname, int verbose, int runstop,
        /* - add the device */
                char dn[20];
                int dfd2;
-               mdu_disk_info_t disk;
                int err;
                struct mdinfo *sra;
                struct supertype *st2;
-               sra = sysfs_read(mdfd, devnum, (GET_VERSION | GET_DEVS |
-                                               GET_STATE));
+               struct mdinfo info2, *d;
+
+               strcpy(chosen_name, mp->path);
+
+               sra = sysfs_read(mdfd, fd2devnum(mdfd), (GET_DEVS | GET_STATE));
 
-               if (sra->array.major_version != st->ss->major ||
-                   sra->array.minor_version != st->minor_version) {
-                       if (verbose >= 0)
-                               fprintf(stderr, Name
-             ": %s has different metadata to chosen array %s %d.%d %d.%d.\n",
-                                       devname, chosen_name,
-                                       sra->array.major_version,
-                                       sra->array.minor_version,
-                                       st->ss->major, st->minor_version);
-                       close(mdfd);
-                       return 1;
-               }
                sprintf(dn, "%d:%d", sra->devs->disk.major,
                        sra->devs->disk.minor);
                dfd2 = dev_open(dn, O_RDONLY);
                st2 = dup_super(st);
-               if (st2->ss->load_super(st2, dfd2, NULL)) {
+               if (st2->ss->load_super(st2, dfd2, NULL) ||
+                   st->ss->compare_super(st, st2) != 0) {
                        fprintf(stderr, Name
-                               ": Strange error loading metadata for %s.\n",
-                               chosen_name);
+                               ": metadata mismatch between %s and "
+                               "chosen array %s\n",
+                               devname, chosen_name);
                        close(mdfd);
                        close(dfd2);
                        return 2;
                }
                close(dfd2);
+               memset(&info2, 0, sizeof(info2));
                st2->ss->getinfo_super(st2, &info2);
                st2->ss->free_super(st2);
                if (info.array.level != info2.array.level ||
@@ -376,17 +362,19 @@ int Incremental(char *devname, int verbose, int runstop,
                        close(mdfd);
                        return 2;
                }
-               memset(&disk, 0, sizeof(disk));
-               disk.major = major(stb.st_rdev);
-               disk.minor = minor(stb.st_rdev);
-               err = ioctl(mdfd, ADD_NEW_DISK, &disk);
+               info2.disk.major = major(stb.st_rdev);
+               info2.disk.minor = minor(stb.st_rdev);
+               /* add disk needs to know about containers */
+               if (st->ss->external)
+                       sra->array.level = LEVEL_CONTAINER;
+               err = add_disk(mdfd, st2, sra, &info2);
                if (err < 0 && errno == EBUSY) {
                        /* could be another device present with the same
                         * disk.number. Find and reject any such
                         */
                        find_reject(mdfd, st, sra, info.disk.number,
                                    info.events, verbose, chosen_name);
-                       err = ioctl(mdfd, ADD_NEW_DISK, &disk);
+                       err = add_disk(mdfd, st2, sra, &info2);
                }
                if (err < 0) {
                        fprintf(stderr, Name ": failed to add %s to %s: %s.\n",
@@ -394,25 +382,41 @@ int Incremental(char *devname, int verbose, int runstop,
                        close(mdfd);
                        return 2;
                }
+               info.array.working_disks = 0;
+               for (d = sra->devs; d; d=d->next)
+                       info.array.working_disks ++;
+                       
        }
-       /* 6/ Make sure /var/run/mdadm.map contains this array. */
-       map_update(&map, devnum,
-                  info.array.major_version,
-                  info.array.minor_version,
-                  info.uuid, chosen_name);
 
        /* 7/ Is there enough devices to possibly start the array? */
        /* 7a/ if not, finish with success. */
+       if (info.array.level == LEVEL_CONTAINER) {
+               /* Try to assemble within the container */
+               close(mdfd);
+               map_unlock(&map);
+               sysfs_uevent(&info, "change");
+               if (verbose >= 0)
+                       fprintf(stderr, Name
+                               ": container %s now has %d devices\n",
+                               chosen_name, info.array.working_disks);
+               wait_for(chosen_name);
+               if (runstop < 0)
+                       return 0; /* don't try to assemble */
+               return Incremental(chosen_name, verbose, runstop,
+                                  NULL, homehost, autof);
+       }
        avail = NULL;
        active_disks = count_active(st, mdfd, &avail, &info);
        if (enough(info.array.level, info.array.raid_disks,
                   info.array.layout, info.array.state & 1,
-                  avail, active_disks) == 0) {
+                  avail, active_disks) == 0 ||
+           (runstop < 0 && active_disks < info.array.raid_disks)) {
                free(avail);
                if (verbose >= 0)
                        fprintf(stderr, Name
                             ": %s attached to %s, not enough to start (%d).\n",
                                devname, chosen_name, active_disks);
+               map_unlock(&map);
                close(mdfd);
                return 0;
        }
@@ -423,18 +427,18 @@ int Incremental(char *devname, int verbose, int runstop,
        /*             are enough, */
        /*   + add any bitmap file  */
        /*   + start the array (auto-readonly). */
-{
-       mdu_array_info_t ainf;
 
        if (ioctl(mdfd, GET_ARRAY_INFO, &ainf) == 0) {
                if (verbose >= 0)
                        fprintf(stderr, Name
                           ": %s attached to %s which is already active.\n",
                                devname, chosen_name);
-               close (mdfd);
+               close(mdfd);
+               map_unlock(&map);
                return 0;
        }
-}
+
+       map_unlock(&map);
        if (runstop > 0 || active_disks >= info.array.working_disks) {
                struct mdinfo *sra;
                /* Let's try to start it */
@@ -457,9 +461,9 @@ int Incremental(char *devname, int verbose, int runstop,
                        }
                        close(bmfd);
                }
-               sra = sysfs_read(mdfd, devnum, 0);
+               sra = sysfs_read(mdfd, fd2devnum(mdfd), 0);
                if ((sra == NULL || active_disks >= info.array.working_disks)
-                   && name != NULL)
+                   && trustworthy != FOREIGN)
                        rv = ioctl(mdfd, RUN_ARRAY, NULL);
                else
                        rv = sysfs_set_str(sra, NULL,
@@ -470,6 +474,7 @@ int Incremental(char *devname, int verbose, int runstop,
                           ": %s attached to %s, which has been started.\n",
                                        devname, chosen_name);
                        rv = 0;
+                       wait_for(chosen_name);
                } else {
                        fprintf(stderr, Name
                              ": %s attached to %s, but failed to start: %s.\n",
@@ -620,12 +625,11 @@ int IncrementalScan(int verbose)
        devs = conf_get_ident(NULL);
 
        for (me = mapl ; me ; me = me->next) {
-               char path[1024];
                mdu_array_info_t array;
                mdu_bitmap_file_t bmf;
                struct mdinfo *sra;
-               int mdfd = open_mddev_devnum(me->path, me->devnum,
-                                            NULL, path, 0);
+               int mdfd = open_mddev(me->path, 0);
+
                if (mdfd < 0)
                        continue;
                if (ioctl(mdfd, GET_ARRAY_INFO, &array) == 0 ||
@@ -635,7 +639,8 @@ int IncrementalScan(int verbose)
                }
                /* Ok, we can try this one.   Maybe it needs a bitmap */
                for (mddev = devs ; mddev ; mddev = mddev->next)
-                       if (strcmp(mddev->devname, me->path) == 0)
+                       if (mddev->devname
+                           && strcmp(mddev->devname, me->path) == 0)
                                break;
                if (mddev && mddev->bitmap_file) {
                        /*
@@ -680,3 +685,115 @@ int IncrementalScan(int verbose)
        }
        return rv;
 }
+
+static char *container2devname(char *devname)
+{
+       char *mdname = NULL;
+
+       if (devname[0] == '/') {
+               int fd = open(devname, O_RDONLY);
+               if (fd >= 0) {
+                       mdname = devnum2devname(fd2devnum(fd));
+                       close(fd);
+               }
+       } else {
+               int uuid[4];
+               struct map_ent *mp, *map = NULL;
+                                       
+               if (!parse_uuid(devname, uuid))
+                       return mdname;
+               mp = map_by_uuid(&map, uuid);
+               if (mp)
+                       mdname = devnum2devname(mp->devnum);
+               map_free(map);
+       }
+
+       return mdname;
+}
+
+int Incremental_container(struct supertype *st, char *devname, int verbose,
+                         int runstop, int autof, int trustworthy)
+{
+       /* Collect the contents of this container and for each
+        * array, choose a device name and assemble the array.
+        */
+
+       struct mdinfo *list = st->ss->container_content(st);
+       struct mdinfo *ra;
+       struct map_ent *map = NULL;
+
+       map_lock(&map);
+
+       for (ra = list ; ra ; ra = ra->next) {
+               int mdfd;
+               char chosen_name[1024];
+               struct map_ent *mp;
+               struct mddev_ident_s *match = NULL;
+               int err;
+
+               mp = map_by_uuid(&map, ra->uuid);
+
+               if (mp) {
+                       mdfd = open_dev(mp->devnum);
+                       strcpy(chosen_name, mp->path);
+               } else {
+
+                       /* Check in mdadm.conf for devices == devname and
+                        * member == ra->text_version after second slash.
+                        */
+                       char *sub = strchr(ra->text_version+1, '/');
+                       struct mddev_ident_s *array_list;
+                       if (sub) {
+                               sub++;
+                               array_list = conf_get_ident(NULL);
+                       } else
+                               array_list = NULL;
+                       for(; array_list ; array_list = array_list->next) {
+                               char *dn;
+                               if (array_list->member == NULL ||
+                                   array_list->container == NULL)
+                                       continue;
+                               if (strcmp(array_list->member, sub) != 0)
+                                       continue;
+                               if (array_list->uuid_set &&
+                                   !same_uuid(ra->uuid, array_list->uuid, st->ss->swapuuid))
+                                       continue;
+                               dn = container2devname(array_list->container);
+                               if (dn == NULL)
+                                       continue;
+                               if (strncmp(dn, ra->text_version+1,
+                                           strlen(dn)) != 0 ||
+                                   ra->text_version[strlen(dn)+1] != '/') {
+                                       free(dn);
+                                       continue;
+                               }
+                               free(dn);
+                               /* we have a match */
+                               match = array_list;
+                               if (verbose>0)
+                                       fprintf(stderr, Name ": match found for member %s\n",
+                                               array_list->member);
+                               break;
+                       }
+
+                       mdfd = create_mddev(match ? match->devname : NULL,
+                                           ra->name,
+                                           autof,
+                                           trustworthy,
+                                           chosen_name);
+               }
+
+               if (mdfd < 0) {
+                       fprintf(stderr, Name ": failed to open %s: %s.\n",
+                               chosen_name, strerror(errno));
+                       return 2;
+               }
+
+               err = assemble_container_content(st, mdfd, ra, runstop,
+                                                chosen_name, verbose);
+               if (err)
+                       return err;
+       }
+       map_unlock(&map);
+       return 0;
+}
diff --git a/Kill.c b/Kill.c
index b1e19b5..96b270f 100644 (file)
--- a/Kill.c
+++ b/Kill.c
@@ -34,7 +34,7 @@
 #include       "md_u.h"
 #include       "md_p.h"
 
-int Kill(char *dev, int force, int quiet)
+int Kill(char *dev, int force, int quiet, int noexcl)
 {
        /*
         * Nothing fancy about Kill.  It just zeroes out a superblock
@@ -44,6 +44,8 @@ int Kill(char *dev, int force, int quiet)
        int fd, rv = 0;
        struct supertype *st;
 
+       if (force)
+               noexcl = 1;
        fd = open(dev, O_RDWR|(force ? 0 : O_EXCL));
        if (fd < 0) {
                if (!quiet)
@@ -63,10 +65,8 @@ int Kill(char *dev, int force, int quiet)
        if (force && rv >= 2)
                rv = 0; /* ignore bad data in superblock */
        if (rv== 0 || (force && rv >= 2)) {
-               mdu_array_info_t info;
-               info.major_version = -1; /* zero superblock */
                st->ss->free_super(st);
-               st->ss->init_super(st, &info, 0, "", NULL, NULL);
+               st->ss->init_super(st, NULL, 0, "", NULL, NULL);
                if (st->ss->store_super(st, fd)) {
                        if (!quiet)
                                fprintf(stderr, Name ": Could not zero superblock on %s\n",
diff --git a/Makefile b/Makefile
index 24ad694..94a55d9 100644 (file)
--- a/Makefile
+++ b/Makefile
 # e.g.  make CXFLAGS=-O to optimise
 TCC = tcc
 UCLIBC_GCC = $(shell for nm in i386-uclibc-linux-gcc i386-uclibc-gcc; do which $$nm > /dev/null && { echo $$nm ; exit; } ; done; echo false No uclibc found )
-DIET_GCC = diet gcc
+#DIET_GCC = diet gcc
+# sorry, but diet-libc doesn't know about posix_memalign, 
+# so we cannot use it any more.
+DIET_GCC = gcc -DHAVE_STDINT_H
 
 KLIBC=/home/src/klibc/klibc-0.77
 
@@ -40,6 +43,9 @@ KLIBC_GCC = gcc -nostdinc -iwithprefix include -I$(KLIBC)/klibc/include -I$(KLIB
 CC = $(CROSS_COMPILE)gcc
 CXFLAGS = -ggdb
 CWFLAGS = -Wall -Werror -Wstrict-prototypes
+ifdef WARN_UNUSED
+CWFLAGS += -Wp,-D_FORTIFY_SOURCE=2 -O
+endif
 
 ifdef DEBIAN
 CPPFLAGS= -DDEBIAN
@@ -69,27 +75,37 @@ MAN8DIR = $(MANDIR)/man8
 OBJS =  mdadm.o config.o mdstat.o  ReadMe.o util.o Manage.o Assemble.o Build.o \
        Create.o Detail.o Examine.o Grow.o Monitor.o dlink.o Kill.o Query.o \
        Incremental.o \
-       mdopen.o super0.o super1.o bitmap.o restripe.o sysfs.o sha1.o \
-       mapfile.o
+       mdopen.o super0.o super1.o super-ddf.o super-intel.o bitmap.o \
+       restripe.o sysfs.o sha1.o mapfile.o crc32.o sg_io.o msg.o \
+       platform-intel.o probe_roms.o
+
 SRCS =  mdadm.c config.c mdstat.c  ReadMe.c util.c Manage.c Assemble.c Build.c \
        Create.c Detail.c Examine.c Grow.c Monitor.c dlink.c Kill.c Query.c \
        Incremental.c \
-       mdopen.c super0.c super1.c bitmap.c restripe.c sysfs.c sha1.c \
-       mapfile.c
+       mdopen.c super0.c super1.c super-ddf.c super-intel.c bitmap.c \
+       restripe.c sysfs.c sha1.c mapfile.c crc32.c sg_io.c msg.c \
+       platform-intel.c probe_roms.c
+
+MON_OBJS = mdmon.o monitor.o managemon.o util.o mdstat.o sysfs.o config.o \
+       Kill.o sg_io.o dlink.o ReadMe.o super0.o super1.o super-intel.o \
+       super-ddf.o sha1.o crc32.o msg.o Monitor.o bitmap.o \
+       platform-intel.o probe_roms.o
+
 
 STATICSRC = pwgr.c
 STATICOBJS = pwgr.o
 
 ASSEMBLE_SRCS := mdassemble.c Assemble.c Manage.c config.c dlink.c util.c \
-       super0.c super1.c sha1.c sysfs.c
-ASSEMBLE_AUTO_SRCS := mdopen.c mdstat.c
+       super0.c super1.c super-ddf.c super-intel.c sha1.c crc32.c sg_io.c mdstat.c \
+       platform-intel.c probe_roms.c sysfs.c
+ASSEMBLE_AUTO_SRCS := mdopen.c
 ASSEMBLE_FLAGS:= $(CFLAGS) -DMDASSEMBLE
 ifdef MDASSEMBLE_AUTO
 ASSEMBLE_SRCS += $(ASSEMBLE_AUTO_SRCS)
 ASSEMBLE_FLAGS += -DMDASSEMBLE_AUTO
 endif
 
-all : mdadm mdadm.man md.man mdadm.conf.man
+all : mdadm mdmon mdadm.man md.man mdadm.conf.man
 
 everything: all mdadm.static swap_super test_stripe \
        mdassemble mdassemble.auto mdassemble.static mdassemble.man \
@@ -119,6 +135,10 @@ mdadm.Os : $(SRCS) mdadm.h
 mdadm.O2 : $(SRCS) mdadm.h
        gcc -o mdadm.O2 $(CFLAGS)  -DHAVE_STDINT_H -O2 $(SRCS)
 
+mdmon : $(MON_OBJS)
+       $(CC) $(LDFLAGS) -o mdmon $(MON_OBJS) $(LDLIBS)
+msg.o: msg.c msg.h
+
 test_stripe : restripe.c mdadm.h
        $(CC) $(CXFLAGS) $(LDFLAGS) -o test_stripe -DMAIN restripe.c
 
@@ -156,13 +176,15 @@ mdadm.conf.man : mdadm.conf.5
 mdassemble.man : mdassemble.8
        nroff -man mdassemble.8 > mdassemble.man
 
-$(OBJS) : mdadm.h bitmap.h
+$(OBJS) : mdadm.h mdmon.h bitmap.h
+$(MON_OBJS) : mdadm.h mdmon.h bitmap.h
 
 sha1.o : sha1.c sha1.h md5.h
        $(CC) $(CFLAGS) -DHAVE_STDINT_H -o sha1.o -c sha1.c
 
-install : mdadm install-man
+install : mdadm mdmon install-man install-udev
        $(INSTALL) -D $(STRIP) -m 755 mdadm $(DESTDIR)$(BINDIR)/mdadm
+       $(INSTALL) -D $(STRIP) -m 755 mdmon $(DESTDIR)$(BINDIR)/mdmon
 
 install-static : mdadm.static install-man
        $(INSTALL) -D $(STRIP) -m 755 mdadm.static $(DESTDIR)$(BINDIR)/mdadm
@@ -181,6 +203,9 @@ install-man: mdadm.8 md.4 mdadm.conf.5
        $(INSTALL) -D -m 644 md.4 $(DESTDIR)$(MAN4DIR)/md.4
        $(INSTALL) -D -m 644 mdadm.conf.5 $(DESTDIR)$(MAN5DIR)/mdadm.conf.5
 
+install-udev: udev-md-raid.rules
+       $(INSTALL) -D -m 644 udev-md-raid.rules $(DESTDIR)/lib/udev/rules.d/64-md-raid.rules
+
 uninstall:
        rm -f $(DESTDIR)$(MAN8DIR)/mdadm.8 md.4 $(DESTDIR)$(MAN4DIR)/md.4 $(DESTDIR)$(MAN5DIR)/mdadm.conf.5 $(DESTDIR)$(BINDIR)/mdadm
 
@@ -188,7 +213,8 @@ test: mdadm test_stripe swap_super
        @echo "Please run 'sh ./test' as root"
 
 clean : 
-       rm -f mdadm $(OBJS) $(STATICOBJS) core *.man mdadm.tcc mdadm.uclibc mdadm.static *.orig *.porig *.rej *.alt \
+       rm -f mdadm mdmon $(OBJS) $(MON_OBJS) $(STATICOBJS) core *.man \
+       mdadm.tcc mdadm.uclibc mdadm.static *.orig *.porig *.rej *.alt \
        mdadm.Os mdadm.O2 \
        mdassemble mdassemble.static mdassemble.auto mdassemble.uclibc \
        mdassemble.klibc swap_super \
diff --git a/Manage.c b/Manage.c
index 160778e..7afd89b 100644 (file)
--- a/Manage.c
+++ b/Manage.c
@@ -30,6 +30,7 @@
 #include "mdadm.h"
 #include "md_u.h"
 #include "md_p.h"
+#include <ctype.h>
 
 #define REGISTER_DEV           _IO (MD_MAJOR, 1)
 #define START_MD               _IO (MD_MAJOR, 2)
@@ -45,11 +46,57 @@ int Manage_ro(char *devname, int fd, int readonly)
         *
         */
        mdu_array_info_t array;
+#ifndef MDASSEMBLE
+       struct mdinfo *mdi;
+#endif
 
        if (md_get_version(fd) < 9000) {
                fprintf(stderr, Name ": need md driver version 0.90.0 or later\n");
                return 1;
        }
+#ifndef MDASSEMBLE
+       /* If this is an externally-managed array, we need to modify the
+        * metadata_version so that mdmon doesn't undo our change.
+        */
+       mdi = sysfs_read(fd, -1, GET_LEVEL|GET_VERSION);
+       if (mdi &&
+           mdi->array.major_version == -1 &&
+           mdi->array.level > 0 &&
+           is_subarray(mdi->text_version)) {
+               char vers[64];
+               strcpy(vers, "external:");
+               strcat(vers, mdi->text_version);
+               if (readonly > 0) {
+                       int rv;
+                       /* We set readonly ourselves. */
+                       vers[9] = '-';
+                       sysfs_set_str(mdi, NULL, "metadata_version", vers);
+
+                       close(fd);
+                       rv = sysfs_set_str(mdi, NULL, "array_state", "readonly");
+
+                       if (rv < 0) {
+                               fprintf(stderr, Name ": failed to set readonly for %s: %s\n",
+                                       devname, strerror(errno));
+
+                               vers[9] = mdi->text_version[0];
+                               sysfs_set_str(mdi, NULL, "metadata_version", vers);
+                               return 1;
+                       }
+               } else {
+                       char *cp;
+                       /* We cannot set read/write - must signal mdmon */
+                       vers[9] = '/';
+                       sysfs_set_str(mdi, NULL, "metadata_version", vers);
+
+                       cp = strchr(vers+10, '/');
+                       if (cp)
+                               *cp = 0;
+                       ping_monitor(vers+10);
+               }
+               return 0;
+       }
+#endif
        if (ioctl(fd, GET_ARRAY_INFO, &array)) {
                fprintf(stderr, Name ": %s does not appear to be active.\n",
                        devname);
@@ -74,17 +121,70 @@ int Manage_ro(char *devname, int fd, int readonly)
 
 #ifndef MDASSEMBLE
 
+static void remove_devices(int devnum, char *path)
+{
+       /* Remove all 'standard' devices for 'devnum', including
+        * partitions.  Also remove names at 'path' - possibly with
+        * partition suffixes - which link to those names.
+        */
+       char base[40];
+       char *path2;
+       char link[1024];
+       int n;
+       int part;
+       char *be;
+       char *pe;
+
+       if (devnum >= 0)
+               sprintf(base, "/dev/md%d", devnum);
+       else
+               sprintf(base, "/dev/md_d%d", -1-devnum);
+       be = base + strlen(base);
+       if (path) {
+               path2 = malloc(strlen(path)+20);
+               strcpy(path2, path);
+               pe = path2 + strlen(path2);
+       } else
+               path2 = NULL;
+       
+       for (part = 0; part < 16; part++) {
+               if (part) {
+                       sprintf(be, "p%d", part);
+                       if (path) {
+                               if (isdigit(pe[-1]))
+                                       sprintf(pe, "p%d", part);
+                               else
+                                       sprintf(pe, "%d", part);
+                       }
+               }
+               /* FIXME test if really is md device ?? */
+               unlink(base);
+               if (path) {
+                       n = readlink(path2, link, sizeof(link));
+                       if (n && strlen(base) == n &&
+                           strncmp(link, base, n) == 0)
+                               unlink(path2);
+               }
+       }
+}
+       
+
 int Manage_runstop(char *devname, int fd, int runstop, int quiet)
 {
        /* Run or stop the array. array must already be configured
         * required >= 0.90.0
+        * Only print failure messages if quiet == 0;
+        * quiet > 0 means really be quiet
+        * quiet < 0 means we will try again if it fails.
         */
        mdu_param_t param; /* unused */
 
        if (runstop == -1 && md_get_version(fd) < 9000) {
                if (ioctl(fd, STOP_MD, 0)) {
-                       if (!quiet) fprintf(stderr, Name ": stopping device %s failed: %s\n",
-                                           devname, strerror(errno));
+                       if (quiet == 0) fprintf(stderr,
+                                               Name ": stopping device %s "
+                                               "failed: %s\n",
+                                               devname, strerror(errno));
                        return 1;
                }
        }
@@ -111,25 +211,77 @@ int Manage_runstop(char *devname, int fd, int runstop, int quiet)
        } else if (runstop < 0){
                struct map_ent *map = NULL;
                struct stat stb;
-               if (ioctl(fd, STOP_ARRAY, NULL)) {
-                       if (quiet==0) {
-                               fprintf(stderr, Name ": fail to stop array %s: %s\n",
+               struct mdinfo *mdi;
+               int devnum;
+               /* If this is an mdmon managed array, just write 'inactive'
+                * to the array state and let mdmon clear up.
+                */
+               devnum = fd2devnum(fd);
+               mdi = sysfs_read(fd, -1, GET_LEVEL|GET_VERSION);
+               if (mdi &&
+                   mdi->array.level > 0 &&
+                   is_subarray(mdi->text_version)) {
+                       /* This is mdmon managed. */
+                       close(fd);
+                       if (sysfs_set_str(mdi, NULL,
+                                         "array_state", "inactive") < 0) {
+                               if (quiet == 0)
+                                       fprintf(stderr, Name
+                                               ": failed to stop array %s: %s\n",
+                                               devname, strerror(errno));
+                               return 1;
+                       }
+
+                       /* Give monitor a chance to act */
+                       ping_monitor(mdi->text_version);
+
+                       fd = open(devname, O_RDONLY);
+               } else if (mdi &&
+                          mdi->array.major_version == -1 &&
+                          mdi->array.minor_version == -2 &&
+                          !is_subarray(mdi->text_version)) {
+                       /* container, possibly mdmon-managed.
+                        * Make sure mdmon isn't opening it, which
+                        * would interfere with the 'stop'
+                        */
+                       ping_monitor(mdi->sys_name);
+               }
+
+               if (fd >= 0 && ioctl(fd, STOP_ARRAY, NULL)) {
+                       if (quiet == 0) {
+                               fprintf(stderr, Name
+                                       ": failed to stop array %s: %s\n",
                                        devname, strerror(errno));
                                if (errno == EBUSY)
                                        fprintf(stderr, "Perhaps a running "
                                                "process, mounted filesystem "
                                                "or active volume group?\n");
                        }
+                       if (mdi)
+                               sysfs_free(mdi);
                        return 1;
                }
+               /* prior to 2.6.28, KOBJ_CHANGE was not sent when an md array
+                * was stopped, so we'll do it here just to be sure.  Drop any
+                * partitions as well...
+                */
+               if (fd >= 0)
+                       ioctl(fd, BLKRRPART, 0);
+               if (mdi)
+                       sysfs_uevent(mdi, "change");
+
+               
+               if (devnum != NoMdDev &&
+                   (stat("/dev/.udev", &stb) != 0 ||
+                    check_env("MDADM_NO_UDEV"))) {
+                       struct map_ent *mp = map_by_devnum(&map, devnum);
+                       remove_devices(devnum, mp ? mp->path : NULL);
+               }
+
+
                if (quiet <= 0)
                        fprintf(stderr, Name ": stopped %s\n", devname);
-               if (fstat(fd, &stb) == 0) {
-                       int devnum;
-                       if (major(stb.st_rdev) == MD_MAJOR)
-                               devnum = minor(stb.st_rdev);
-                       else
-                               devnum = -1-(minor(stb.st_rdev)>>6);
+               if (devnum != NoMdDev) {
                        map_delete(&map, devnum);
                        map_write(map);
                        map_free(map);
@@ -201,6 +353,7 @@ int Manage_subdevs(char *devname, int fd,
        struct supertype *st, *tst;
        int duuid[4];
        int ouuid[4];
+       int lfd = -1;
 
        if (ioctl(fd, GET_ARRAY_INFO, &array)) {
                fprintf(stderr, Name ": cannot get array info for %s\n",
@@ -227,6 +380,7 @@ int Manage_subdevs(char *devname, int fd,
                unsigned long long ldsize;
                char dvname[20];
                char *dnprintable = dv->devname;
+               int err;
 
                next = dv->next;
                jnext = 0;
@@ -311,9 +465,14 @@ int Manage_subdevs(char *devname, int fd,
                        return 1;
                case 'a':
                        /* add the device */
-
+                       if (tst->subarray[0]) {
+                               fprintf(stderr, Name ": Cannot add disks to a"
+                                       " \'member\' array, perform this"
+                                       " operation on the parent container\n");
+                               return 1;
+                       }
                        /* Make sure it isn't in use (in 2.6 or later) */
-                       tfd = open(dv->devname, O_RDONLY|O_EXCL);
+                       tfd = open(dv->devname, O_RDONLY|O_EXCL|O_DIRECT);
                        if (tfd < 0) {
                                fprintf(stderr, Name ": Cannot open %s: %s\n",
                                        dv->devname, strerror(errno));
@@ -332,7 +491,9 @@ int Manage_subdevs(char *devname, int fd,
                        }
                        close(tfd);
 
-                       if (array.major_version == 0 &&
+
+                       if (!tst->ss->external &&
+                           array.major_version == 0 &&
                            md_get_version(fd)%100 < 2) {
                                if (ioctl(fd, HOT_ADD_DISK,
                                          (unsigned long)stb.st_rdev)==0) {
@@ -347,12 +508,16 @@ int Manage_subdevs(char *devname, int fd,
                                return 1;
                        }
 
-                       if (array.not_persistent == 0) {
+                       if (array.not_persistent == 0 || tst->ss->external) {
 
                                /* need to find a sample superblock to copy, and
-                                * a spare slot to use
+                                * a spare slot to use.
+                                * For 'external' array (well, container based),
+                                * we can just load the metadata for the array.
                                 */
-                               for (j = 0; j < tst->max_devs; j++) {
+                               if (tst->ss->external) {
+                                       tst->ss->load_super(tst, fd, NULL);
+                               } else for (j = 0; j < tst->max_devs; j++) {
                                        char *dev;
                                        int dfd;
                                        disc.number = j;
@@ -374,6 +539,7 @@ int Manage_subdevs(char *devname, int fd,
                                        close(dfd);
                                        break;
                                }
+                               /* FIXME this is a bad test to be using */
                                if (!tst->sb) {
                                        fprintf(stderr, Name ": cannot find valid superblock in this array - HELP\n");
                                        return 1;
@@ -453,12 +619,21 @@ int Manage_subdevs(char *devname, int fd,
                        disc.minor = minor(stb.st_rdev);
                        disc.number =j;
                        disc.state = 0;
-                       if (array.not_persistent==0) {
+                       if (array.not_persistent==0 || tst->ss->external) {
+                               int dfd;
                                if (dv->writemostly == 1)
                                        disc.state |= 1 << MD_DISK_WRITEMOSTLY;
-                               tst->ss->add_to_super(tst, &disc);
-                               if (tst->ss->write_init_super(tst, &disc,
-                                                             dv->devname))
+                               dfd = open(dv->devname, O_RDWR | O_EXCL|O_DIRECT);
+                               if (tst->ss->add_to_super(tst, &disc, dfd,
+                                                         dv->devname)) {
+                                       close(dfd);
+                                       return 1;
+                               }
+                               /* write_init_super will close 'dfd' */
+                               if (tst->ss->external)
+                                       /* mdmon will write the metadata */
+                                       close(dfd);
+                               else if (tst->ss->write_init_super(tst))
                                        return 1;
                        } else if (dv->re_add) {
                                /*  this had better be raid1.
@@ -491,7 +666,52 @@ int Manage_subdevs(char *devname, int fd,
                        }
                        if (dv->writemostly == 1)
                                disc.state |= (1 << MD_DISK_WRITEMOSTLY);
-                       if (ioctl(fd,ADD_NEW_DISK, &disc)) {
+                       if (tst->ss->external) {
+                               /* add a disk to an external metadata container
+                                * only if mdmon is around to see it
+                                */
+                               struct mdinfo new_mdi;
+                               struct mdinfo *sra;
+                               int container_fd;
+                               int devnum = fd2devnum(fd);
+
+                               container_fd = open_dev_excl(devnum);
+                               if (container_fd < 0) {
+                                       fprintf(stderr, Name ": add failed for %s:"
+                                               " could not get exclusive access to container\n",
+                                               dv->devname);
+                                       return 1;
+                               }
+
+                               if (!mdmon_running(devnum)) {
+                                       fprintf(stderr, Name ": add failed for %s: mdmon not running\n",
+                                               dv->devname);
+                                       close(container_fd);
+                                       return 1;
+                               }
+
+                               sra = sysfs_read(container_fd, -1, 0);
+                               if (!sra) {
+                                       fprintf(stderr, Name ": add failed for %s: sysfs_read failed\n",
+                                               dv->devname);
+                                       close(container_fd);
+                                       return 1;
+                               }
+                               sra->array.level = LEVEL_CONTAINER;
+                               /* Need to set data_offset and component_size */
+                               tst->ss->getinfo_super(tst, &new_mdi);
+                               new_mdi.disk.major = disc.major;
+                               new_mdi.disk.minor = disc.minor;
+                               if (sysfs_add_disk(sra, &new_mdi) != 0) {
+                                       fprintf(stderr, Name ": add new device to external metadata"
+                                               " failed for %s\n", dv->devname);
+                                       close(container_fd);
+                                       return 1;
+                               }
+                               ping_monitor(devnum2devname(devnum));
+                               sysfs_free(sra);
+                               close(container_fd);
+                       } else if (ioctl(fd, ADD_NEW_DISK, &disc)) {
                                fprintf(stderr, Name ": add new device failed for %s as %d: %s\n",
                                        dv->devname, j, strerror(errno));
                                return 1;
@@ -502,13 +722,94 @@ int Manage_subdevs(char *devname, int fd,
 
                case 'r':
                        /* hot remove */
+                       if (tst->subarray[0]) {
+                               fprintf(stderr, Name ": Cannot remove disks from a"
+                                       " \'member\' array, perform this"
+                                       " operation on the parent container\n");
+                               return 1;
+                       }
+                       if (tst->ss->external) {
+                               /* To remove a device from a container, we must
+                                * check that it isn't in use in an array.
+                                * This involves looking in the 'holders'
+                                * directory - there must be just one entry,
+                                * the container.
+                                * To ensure that it doesn't get used as a
+                                * hot spare while we are checking, we
+                                * get an O_EXCL open on the container
+                                */
+                               int dnum = fd2devnum(fd);
+                               lfd = open_dev_excl(dnum);
+                               if (lfd < 0) {
+                                       fprintf(stderr, Name
+                                               ": Cannot get exclusive access "
+                                               "to container - odd\n");
+                                       return 1;
+                               }
+                               /* in the detached case it is not possible to
+                                * check if we are the unique holder, so just
+                                * rely on the 'detached' checks
+                                */
+                               if (strcmp(dv->devname, "detached") == 0 ||
+                                   sysfs_unique_holder(dnum, stb.st_rdev))
+                                       /* pass */;
+                               else {
+                                       fprintf(stderr, Name
+                                               ": %s is %s, cannot remove.\n",
+                                               dnprintable,
+                                               errno == EEXIST ? "still in use":
+                                               "not a member");
+                                       close(lfd);
+                                       return 1;
+                               }
+                       }
                        /* FIXME check that it is a current member */
-                       if (ioctl(fd, HOT_REMOVE_DISK, (unsigned long)stb.st_rdev)) {
+                       err = ioctl(fd, HOT_REMOVE_DISK, (unsigned long)stb.st_rdev);
+                       if (err && errno == ENODEV) {
+                               /* Old kernels rejected this if no personality
+                                * registered */
+                               struct mdinfo *sra = sysfs_read(fd, 0, GET_DEVS);
+                               struct mdinfo *dv = NULL;
+                               if (sra)
+                                       dv = sra->devs;
+                               for ( ; dv ; dv=dv->next)
+                                       if (dv->disk.major == major(stb.st_rdev) &&
+                                           dv->disk.minor == minor(stb.st_rdev))
+                                               break;
+                               if (dv)
+                                       err = sysfs_set_str(sra, dv,
+                                                           "state", "remove");
+                               else
+                                       err = -1;
+                               if (sra)
+                                       sysfs_free(sra);
+                       }
+                       if (err) {
                                fprintf(stderr, Name ": hot remove failed "
                                        "for %s: %s\n", dnprintable,
                                        strerror(errno));
+                               if (lfd >= 0)
+                                       close(lfd);
                                return 1;
                        }
+                       if (tst->ss->external) {
+                               /*
+                                * Before dropping our exclusive open we make an
+                                * attempt at preventing mdmon from seeing an
+                                * 'add' event before reconciling this 'remove'
+                                * event.
+                                */
+                               char *name = devnum2devname(fd2devnum(fd));
+
+                               if (!name) {
+                                       fprintf(stderr, Name ": unable to get container name\n");
+                                       return 1;
+                               }
+
+                               ping_manager(name);
+                               free(name);
+                       }
+                       close(lfd);
                        if (verbose >= 0)
                                fprintf(stderr, Name ": hot removed %s\n",
                                        dnprintable);
diff --git a/Monitor.c b/Monitor.c
index 3825600..af53129 100644 (file)
--- a/Monitor.c
+++ b/Monitor.c
@@ -165,7 +165,10 @@ int Monitor(mddev_dev_t devlist,
        if (devlist == NULL) {
                mddev_ident_t mdlist = conf_get_ident(NULL);
                for (; mdlist; mdlist=mdlist->next) {
-                       struct state *st = malloc(sizeof *st);
+                       struct state *st;
+                       if (mdlist->devname == NULL)
+                               continue;
+                       st = malloc(sizeof *st);
                        if (st == NULL)
                                continue;
                        st->devname = strdup(mdlist->devname);
@@ -604,10 +607,7 @@ int Wait(char *dev)
                        strerror(errno));
                return 2;
        }
-       if (major(stb.st_rdev) == MD_MAJOR)
-               devnum = minor(stb.st_rdev);
-       else
-               devnum = -1-(minor(stb.st_rdev)/64);
+       devnum = stat2devnum(&stb);
 
        while(1) {
                struct mdstat_ent *ms = mdstat_read(1, 0);
@@ -618,6 +618,13 @@ int Wait(char *dev)
                                break;
 
                if (!e || e->percent < 0) {
+                       if (e && e->metadata_version &&
+                           strncmp(e->metadata_version, "external:", 9) == 0) {
+                               if (is_subarray(&e->metadata_version[9]))
+                                       ping_monitor(&e->metadata_version[9]);
+                               else
+                                       ping_monitor(devnum2devname(devnum));
+                       }
                        free_mdstat(ms);
                        return rv;
                }
@@ -626,3 +633,107 @@ int Wait(char *dev)
                mdstat_wait(5);
        }
 }
+
+static char *clean_states[] = {
+       "clear", "inactive", "readonly", "read-auto", "clean", NULL };
+
+int WaitClean(char *dev, int verbose)
+{
+       int fd;
+       struct mdinfo *mdi;
+       int rv = 1;
+       int devnum;
+
+       fd = open(dev, O_RDONLY); 
+       if (fd < 0) {
+               if (verbose)
+                       fprintf(stderr, Name ": Couldn't open %s: %s\n", dev, strerror(errno));
+               return 1;
+       }
+
+       devnum = fd2devnum(fd);
+       mdi = sysfs_read(fd, devnum, GET_VERSION|GET_LEVEL|GET_SAFEMODE);
+       if (!mdi) {
+               if (verbose)
+                       fprintf(stderr, Name ": Failed to read sysfs attributes for "
+                               "%s\n", dev);
+               close(fd);
+               return 0;
+       }
+
+       switch(mdi->array.level) {
+       case LEVEL_LINEAR:
+       case LEVEL_MULTIPATH:
+       case 0:
+               /* safemode delay is irrelevant for these levels */
+               rv = 0;
+               
+       }
+
+       /* for internal metadata the kernel handles the final clean
+        * transition, containers can never be dirty
+        */
+       if (!is_subarray(mdi->text_version))
+               rv = 0;
+
+       /* safemode disabled ? */
+       if (mdi->safe_mode_delay == 0)
+               rv = 0;
+
+       if (rv) {
+               int state_fd = sysfs_open(fd2devnum(fd), NULL, "array_state");
+               char buf[20];
+               fd_set fds;
+               struct timeval tm;
+
+               /* minimize the safe_mode_delay and prepare to wait up to 5s
+                * for writes to quiesce
+                */
+               sysfs_set_safemode(mdi, 1);
+               tm.tv_sec = 5;
+               tm.tv_usec = 0;
+
+               /* give mdmon a chance to checkpoint resync */
+               sysfs_set_str(mdi, NULL, "sync_action", "idle");
+
+               FD_ZERO(&fds);
+
+               /* wait for array_state to be clean */
+               while (1) {
+                       rv = read(state_fd, buf, sizeof(buf));
+                       if (rv < 0)
+                               break;
+                       if (sysfs_match_word(buf, clean_states) <= 4)
+                               break;
+                       FD_SET(state_fd, &fds);
+                       rv = select(state_fd + 1, &fds, NULL, NULL, &tm);
+                       if (rv < 0 && errno != EINTR)
+                               break;
+                       lseek(state_fd, 0, SEEK_SET);
+               }
+               if (rv < 0)
+                       rv = 1;
+               else if (ping_monitor(mdi->text_version) == 0) {
+                       /* we need to ping to close the window between array
+                        * state transitioning to clean and the metadata being
+                        * marked clean
+                        */
+                       rv = 0;
+               } else
+                       rv = 1;
+               if (rv && verbose)
+                       fprintf(stderr, Name ": Error waiting for %s to be clean\n",
+                               dev);
+
+               /* restore the original safe_mode_delay */
+               sysfs_set_safemode(mdi, mdi->safe_mode_delay);
+               close(state_fd);
+       }
+
+       sysfs_free(mdi);
+       close(fd);
+
+       return rv;
+}
+
+
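Note on the device-number convention used above: the lines removed from Manage_runstop() and Wait() show how a devnum was derived from st_rdev, and remove_devices() shows how a devnum maps back to a /dev name. The following is a minimal sketch of that round trip, assuming Linux/glibc. The helper names devnum_from_rdev() and devnum_to_base() are hypothetical and are not the stat2devnum()/fd2devnum() helpers this commit calls; the sketch only mirrors the removed inline logic and the sprintf() calls in remove_devices().

```c
#include <stdio.h>
#include <sys/types.h>
#include <sys/sysmacros.h>   /* major(), minor(), makedev() on glibc */

#define MD_MAJOR 9           /* block major of non-partitionable md arrays */

/* Hypothetical helper: mirrors the inline code this commit removes.
 * Plain md devices keep their minor number as devnum; any other major
 * is treated as a partitionable array, 64 minors per array, encoded as
 * a negative devnum. */
static int devnum_from_rdev(dev_t rdev)
{
    if (major(rdev) == MD_MAJOR)
        return (int)minor(rdev);
    return -1 - (int)(minor(rdev) >> 6);
}

/* Hypothetical helper: same naming rule as remove_devices(). */
static void devnum_to_base(int devnum, char *buf, size_t len)
{
    if (devnum >= 0)
        snprintf(buf, len, "/dev/md%d", devnum);
    else
        snprintf(buf, len, "/dev/md_d%d", -1 - devnum);
}
```

A caller would pass stb.st_rdev from fstat() to devnum_from_rdev() and then build the base name into a buffer such as the 40-byte base[] used in remove_devices(). Unlike this sketch, a real helper would also need to reject devices that are not md arrays at all.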
diff --git a/Query.c b/Query.c
index 190ee29..dc69eb8 100644 (file)
--- a/Query.c
+++ b/Query.c
@@ -96,7 +96,7 @@ int Query(char *dev)
        if (superror == 0) {
                /* array might be active... */
                st->ss->getinfo_super(st, &info);
-               if (st->ss->major == 0) {
+               if (st->ss == &super0) {
                        mddev = get_md_name(info.array.md_minor);
                        disc.number = info.disk.number;
                        activity = "undetected";
@@ -121,7 +121,7 @@ int Query(char *dev)
                       activity,
                       map_num(pers, info.array.level),
                       mddev);
-               if (st->ss->major == 0)
+               if (st->ss == &super0)
                        put_md_name(mddev);
        }
        return 0;
diff --git a/ReadMe.c b/ReadMe.c
index 818be0a..dd53c48 100644 (file)
--- a/ReadMe.c
+++ b/ReadMe.c
@@ -24,7 +24,7 @@
 
 #include "mdadm.h"
 
-char Version[] = Name " - v2.6.9 - 10th March 2009\n";
+char Version[] = Name " - v3.0-devel2 - 5th November 2008\n";
 
 /*
  * File: ReadMe.c
@@ -107,6 +107,7 @@ struct option long_options[] = {
     {"query",    0, 0, 'Q'},
     {"examine-bitmap", 0, 0, 'X'},
     {"auto-detect", 0, 0, AutoDetect},
+    {"detail-platform", 0, 0, DetailPlatform},
 
     /* synonyms */
     {"monitor",   0, 0, 'F'},
@@ -161,6 +162,7 @@ struct option long_options[] = {
     {"readwrite", 0, 0, 'w'},
     {"no-degraded",0,0,  NoDegraded },
     {"wait",     0, 0, 'W'},
+    {"wait-clean", 0, 0, Waitclean },
 
     /* For Detail/Examine */
     {"brief",    0, 0, 'b'},
@@ -465,6 +467,7 @@ char Help_misc[] =
 "  --query       -Q   : Display general information about how a\n"
 "                       device relates to the md driver\n"
 "  --detail      -D   : Display details of an array\n"
+"  --detail-platform  : Display hardware/firmware details\n"
 "  --examine     -E   : Examine superblock on an array component\n"
 "  --examine-bitmap -X: Display contents of a bitmap file\n"
 "  --zero-superblock  : erase the MD superblock from a device.\n"
@@ -581,16 +584,49 @@ char Help_config[] =
 /* name/number mappings */
 
 mapping_t r5layout[] = {
-       { "left-asymmetric", 0},
-       { "right-asymmetric", 1},
-       { "left-symmetric", 2},
-       { "right-symmetric", 3},
-
-       { "default", 2},
-       { "la", 0},
-       { "ra", 1},
-       { "ls", 2},
-       { "rs", 3},
+       { "left-asymmetric", ALGORITHM_LEFT_ASYMMETRIC},
+       { "right-asymmetric", ALGORITHM_RIGHT_ASYMMETRIC},
+       { "left-symmetric", ALGORITHM_LEFT_SYMMETRIC},
+       { "right-symmetric", ALGORITHM_RIGHT_SYMMETRIC},
+
+       { "default", ALGORITHM_LEFT_SYMMETRIC},
+       { "la", ALGORITHM_LEFT_ASYMMETRIC},
+       { "ra", ALGORITHM_RIGHT_ASYMMETRIC},
+       { "ls", ALGORITHM_LEFT_SYMMETRIC},
+       { "rs", ALGORITHM_RIGHT_SYMMETRIC},
+
+       { "parity-first", ALGORITHM_PARITY_0},
+       { "parity-last", ALGORITHM_PARITY_N},
+       { "ddf-zero-restart", ALGORITHM_RIGHT_ASYMMETRIC},
+       { "ddf-N-restart", ALGORITHM_LEFT_ASYMMETRIC},
+       { "ddf-N-continue", ALGORITHM_LEFT_SYMMETRIC},
+
+       { NULL, 0}
+};
+mapping_t r6layout[] = {
+       { "left-asymmetric", ALGORITHM_LEFT_ASYMMETRIC},
+       { "right-asymmetric", ALGORITHM_RIGHT_ASYMMETRIC},
+       { "left-symmetric", ALGORITHM_LEFT_SYMMETRIC},
+       { "right-symmetric", ALGORITHM_RIGHT_SYMMETRIC},
+
+       { "default", ALGORITHM_LEFT_SYMMETRIC},
+       { "la", ALGORITHM_LEFT_ASYMMETRIC},
+       { "ra", ALGORITHM_RIGHT_ASYMMETRIC},
+       { "ls", ALGORITHM_LEFT_SYMMETRIC},
+       { "rs", ALGORITHM_RIGHT_SYMMETRIC},
+
+       { "parity-first", ALGORITHM_PARITY_0},
+       { "parity-last", ALGORITHM_PARITY_N},
+       { "ddf-zero-restart", ALGORITHM_ROTATING_ZERO_RESTART},
+       { "ddf-N-restart", ALGORITHM_ROTATING_N_RESTART},
+       { "ddf-N-continue", ALGORITHM_ROTATING_N_CONTINUE},
+
+       { "left-asymmetric-6", ALGORITHM_LEFT_ASYMMETRIC_6},
+       { "right-asymmetric-6", ALGORITHM_RIGHT_ASYMMETRIC_6},
+       { "left-symmetric-6", ALGORITHM_LEFT_SYMMETRIC_6},
+       { "right-symmetric-6", ALGORITHM_RIGHT_SYMMETRIC_6},
+       { "parity-first-6", ALGORITHM_PARITY_0_6},
+
        { NULL, 0}
 };
 
@@ -613,6 +649,7 @@ mapping_t pers[] = {
        { "raid10", 10},
        { "10", 10},
        { "faulty", LEVEL_FAULTY},
+       { "container", LEVEL_CONTAINER},
        { NULL, 0}
 };
 
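Note on the name/number tables above: each table is a NULL-terminated array of name/number pairs, so several spellings ("left-symmetric", "ls", "default") can resolve to the same layout. The following is a minimal sketch of such a lookup. The struct shape, the table layout_sketch[], the function lookup_layout() and the -1 "not found" value are assumptions for illustration; the numeric values are the literal ones from the removed r5layout lines.

```c
#include <stddef.h>
#include <string.h>

/* Assumed shape of a table entry: a name and the number it maps to. */
typedef struct mapping {
    const char *name;
    int num;
} mapping_t;

/* Illustrative table using the literals from the removed r5layout. */
static const mapping_t layout_sketch[] = {
    { "left-asymmetric", 0 },
    { "right-asymmetric", 1 },
    { "left-symmetric", 2 },
    { "right-symmetric", 3 },
    { "default", 2 },
    { "la", 0 },
    { "ra", 1 },
    { "ls", 2 },
    { "rs", 3 },
    { NULL, 0 }              /* terminator: name == NULL ends the scan */
};

/* Hypothetical lookup: linear scan until the NULL terminator.
 * Returns the mapped number, or -1 when the name is unknown. */
static int lookup_layout(const mapping_t *map, const char *name)
{
    for (; map->name != NULL; map++)
        if (strcmp(map->name, name) == 0)
            return map->num;
    return -1;
}
```

Replacing the bare literals with ALGORITHM_* constants, as this hunk does, keeps the table in step with the definitions used elsewhere and leaves room for the RAID6-only names in r6layout.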
diff --git a/TODO b/TODO
index f79163b..279d20d 100644 (file)
--- a/TODO
+++ b/TODO
@@ -1,3 +1,38 @@
+ - add 'name' field to metadata type and use it.
+ - use validate_geometry more
+ - metadata should be able to check/reject bitmap stuff.
+
+DDF:
+  Three new metadata types:
+    ddf - used only to create a container.
+    ddf-bvd - used to create an array in a container
+    ddf-svd - used to create a secondary array from bvds.
+
+  Usage:
+    mdadm -C /dev/ddf1 /dev/sd[abcdef]
+    mdadm -C /dev/md1 -e ddf /dev/sd[a-f]
+    mdadm -C /dev/md1 -l container /dev/sd[a-f]
+
+        Each of these creates a new ddf container using all those
+       devices.  The name 'ddf*' signals that ddf metadata should be used.
+       '-e ddf' only supports one level - 'container'.  'container' is only
+       supported by ddf.
+
+    mdadm -C /dev/md1 -l0 -n4 /dev/ddf1 # or maybe not ???
+    mdadm -C /dev/md1 -l1 -n2 /dev/sda /dev/sdb
+       If exactly one device is given, and it is a container, we select
+       devices from that container.
+       If devices are given that are already in use, they must be in use by
+       a container, and the array is created in the container.
+       If devices given are bvds, we slip under the hood to make
+         the svd arrays.
+
+    mdadm -A /dev/ddf ......
+       Base drives make a container.  Anything in that container is started
+        auto-read-only.
+        If /dev/ddf is already assembled, we assemble bvds and svds inside it.
+
+
 2005-dec-20
   Want an incremental assembly mode to work nicely with udev.
   Core usage would be something like
diff --git a/bitmap.c b/bitmap.c
index 352be5d..b9bbaeb 100644 (file)
--- a/bitmap.c
+++ b/bitmap.c
@@ -131,11 +131,13 @@ bitmap_info_t *bitmap_fd_read(int fd, int brief)
         */
        unsigned long long total_bits = 0, read_bits = 0, dirty_bits = 0;
        bitmap_info_t *info;
-       char *buf, *unaligned;
+       void *buf;
        int n, skip;
 
-       unaligned = malloc(8192*2);
-       buf = (char*) ((unsigned long)unaligned | 8191)+1;
+       if (posix_memalign(&buf, 512, 8192) != 0) {
+               fprintf(stderr, Name ": failed to allocate 8192 bytes\n");
+               return NULL;
+       }
        n = read(fd, buf, 8192);
 
        info = malloc(sizeof(*info));
@@ -154,7 +156,6 @@ bitmap_info_t *bitmap_fd_read(int fd, int brief)
                fprintf(stderr, Name ": failed to read superblock of bitmap "
                        "file: %s\n", strerror(errno));
                free(info);
-               free(unaligned);
                return NULL;
        }
        memcpy(&info->sb, buf, sizeof(info->sb));
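Note on the bitmap.c change above: the old code over-allocated with malloc() and rounded the pointer up by hand, which meant the original pointer had to be kept around for free(). posix_memalign() returns a pointer that is already aligned and can be passed to free() directly. The sketch below shows that pattern in isolation. `alloc_aligned` and `is_aligned` are hypothetical helpers written for illustration and are not part of mdadm.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200112L
#endif
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical helper: return `size` bytes whose address is a multiple of
 * `align`, or NULL on failure.  The result can be released with free(). */
static void *alloc_aligned(size_t align, size_t size)
{
    void *buf = NULL;

    /* posix_memalign() wants a power of two that is also a multiple of
     * sizeof(void *); 512, as used in the patch, satisfies both. */
    assert(align >= sizeof(void *) && (align & (align - 1)) == 0);
    if (posix_memalign(&buf, align, size) != 0)
        return NULL;
    return buf;
}

/* Check that an address is a multiple of a power-of-two alignment. */
static int is_aligned(const void *p, size_t align)
{
    return ((uintptr_t)p & (align - 1)) == 0;
}
```

A caller would request `alloc_aligned(512, 8192)`, read into the buffer, and call free() on the same pointer on every exit path.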
diff --git a/config.c b/config.c
index 78bbb9d..7e09b5c 100644 (file)
--- a/config.c
+++ b/config.c
@@ -261,12 +261,44 @@ mddev_dev_t load_partitions(void)
                d->devname = strdup(name);
                d->next = rv;
                d->used = 0;
+               d->content = NULL;
                rv = d;
        }
        fclose(f);
        return rv;
 }
 
+mddev_dev_t load_containers(void)
+{
+       struct mdstat_ent *mdstat = mdstat_read(1, 0);
+       struct mdstat_ent *ent;
+       mddev_dev_t d;
+       mddev_dev_t rv = NULL;
+
+       if (!mdstat)
+               return NULL;
+
+       for (ent = mdstat; ent; ent = ent->next)
+               if (ent->metadata_version &&
+                   strncmp(ent->metadata_version, "external:", 9) == 0 &&
+                   !is_subarray(&ent->metadata_version[9])) {
+                       d = malloc(sizeof(*d));
+                       if (!d)
+                               continue;
+                       if (asprintf(&d->devname, "/dev/%s", ent->dev) < 0) {
+                               free(d);
+                               continue;
+                       }
+                       d->next = rv;
+                       d->used = 0;
+                       d->content = NULL;
+                       rv = d;
+               }
+       free_mdstat(mdstat);
+
+       return rv;
+}
+
 struct createinfo createinfo = {
        .autof = 2, /* by default, create devices with standard names */
        .symlinks = 1,
@@ -398,7 +430,8 @@ void devline(char *line)
        struct conf_dev *cd;
 
        for (w=dl_next(line); w != line; w=dl_next(w)) {
-               if (w[0] == '/' || strcasecmp(w, "partitions") == 0) {
+               if (w[0] == '/' || strcasecmp(w, "partitions") == 0 ||
+                   strcasecmp(w, "containers") == 0) {
                        cd = malloc(sizeof(*cd));
                        cd->name = strdup(w);
                        cd->next = cdevlist;
@@ -434,6 +467,8 @@ void arrayline(char *line)
        mis.bitmap_fd = -1;
        mis.bitmap_file = NULL;
        mis.name[0] = 0;
+       mis.container = NULL;
+       mis.member = NULL;
 
        for (w=dl_next(line); w!=line; w=dl_next(w)) {
                if (w[0] == '/') {
@@ -516,19 +551,24 @@ void arrayline(char *line)
                } else if (strncasecmp(w, "auto=", 5) == 0 ) {
                        /* whether to create device special files as needed */
                        mis.autof = parse_auto(w+5, "auto type", 0);
+               } else if (strncasecmp(w, "member=", 7) == 0) {
+                       /* subarray within a container */
+                       mis.member = strdup(w+7);
+               } else if (strncasecmp(w, "container=", 10) == 0) {
+                       /* the container holding this subarray.  Either a device name
+                        * or a uuid */
+                       mis.container = strdup(w+10);
                } else {
                        fprintf(stderr, Name ": unrecognised word on ARRAY line: %s\n",
                                w);
                }
        }
-       if (mis.devname == NULL)
-               fprintf(stderr, Name ": ARRAY line with no device\n");
-       else if (mis.uuid_set == 0 && mis.devices == NULL && mis.super_minor == UnSet && mis.name[0] == 0)
+       if (mis.uuid_set == 0 && mis.devices == NULL && mis.super_minor == UnSet && mis.name[0] == 0)
                fprintf(stderr, Name ": ARRAY line %s has no identity information.\n", mis.devname);
        else {
                mi = malloc(sizeof(*mi));
                *mi = mis;
-               mi->devname = strdup(mis.devname);
+               mi->devname = mis.devname ? strdup(mis.devname) : NULL;
                mi->next = NULL;
                *mddevlp = mi;
                mddevlp = &mi->next;
@@ -558,10 +598,12 @@ void mailfromline(char *line)
                if (alert_mail_from == NULL)
                        alert_mail_from = strdup(w);
                else {
-                       char *t= NULL;
-                       xasprintf(&t, "%s %s", alert_mail_from, w);
-                       free(alert_mail_from);
-                       alert_mail_from = t;
+                       char *t = NULL;
+
+                       if (xasprintf(&t, "%s %s", alert_mail_from, w) > 0) {
+                               free(alert_mail_from);
+                               alert_mail_from = t;
+                       }
                }
        }
 }
@@ -711,11 +753,19 @@ mddev_ident_t conf_get_ident(char *dev)
        mddev_ident_t rv;
        load_conffile();
        rv = mddevlist;
-       while (dev && rv && strcmp(dev, rv->devname)!=0)
+       while (dev && rv && (rv->devname == NULL
+                            || strcmp(dev, rv->devname)!=0))
                rv = rv->next;
        return rv;
 }
 
+static void append_dlist(mddev_dev_t *dlp, mddev_dev_t list)
+{
+       while (*dlp)
+               dlp = &(*dlp)->next;
+       *dlp = list;
+}
+
 mddev_dev_t conf_get_devs()
 {
        glob_t globbuf;
@@ -733,13 +783,17 @@ mddev_dev_t conf_get_devs()
 
        load_conffile();
 
-       if (cdevlist == NULL)
-               /* default to 'partitions */
+       if (cdevlist == NULL) {
+               /* default to 'partitions' and 'containers' */
                dlist = load_partitions();
+               append_dlist(&dlist, load_containers());
+       }
 
        for (cd=cdevlist; cd; cd=cd->next) {
-               if (strcasecmp(cd->name, "partitions")==0 && dlist == NULL)
-                       dlist = load_partitions();
+               if (strcasecmp(cd->name, "partitions")==0)
+                       append_dlist(&dlist, load_partitions());
+               else if (strcasecmp(cd->name, "containers")==0)
+                       append_dlist(&dlist, load_containers());
                else {
                        glob(cd->name, flags, NULL, &globbuf);
                        flags |= GLOB_APPEND;
@@ -751,6 +805,7 @@ mddev_dev_t conf_get_devs()
                        t->devname = strdup(globbuf.gl_pathv[i]);
                        t->next = dlist;
                        t->used = 0;
+                       t->content = NULL;
                        dlist = t;
 /*     printf("one dev is %s\n", t->devname);*/
                }
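Note on append_dlist() in the config.c change above: it walks a pointer to the "next" slot rather than a pointer to the node, so the empty-list case needs no special handling and "partitions" and "containers" can be combined in any order. The sketch below reproduces that idiom on a simplified node type. `struct node`, `append_list` and `list_length` are hypothetical names used only for illustration; the real code operates on mddev_dev_t.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for mddev_dev_t; only the `next` link matters. */
struct node {
    const char *name;
    struct node *next;
};

/* Advance headp until it points at the NULL link that ends the list,
 * then store `list` there.  Works unchanged when *headp starts as NULL. */
static void append_list(struct node **headp, struct node *list)
{
    assert(headp != NULL);
    while (*headp)
        headp = &(*headp)->next;
    *headp = list;
}

/* Count the nodes reachable from n. */
static size_t list_length(const struct node *n)
{
    size_t len = 0;

    for (; n; n = n->next)
        len++;
    return len;
}
```

Appending a NULL list is harmless, which is why the patched conf_get_devs() can append the result of load_containers() without first checking whether any container was found.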
diff --git a/crc32.c b/crc32.c
new file mode 100644 (file)
index 0000000..12d08e5
--- /dev/null
+++ b/crc32.c
@@ -0,0 +1,340 @@
+/* crc32.c -- compute the CRC-32 of a data stream
+ * Copyright (C) 1995-2003 Mark Adler
+ * For conditions of distribution and use, see copyright notice in zlib.h
+ *
+ * Thanks to Rodney Brown <rbrown64@csc.com.au> for his contribution of faster
+ * CRC methods: exclusive-oring 32 bits of data at a time, and pre-computing
+ * tables for updating the shift register in one step with three exclusive-ors
+ * instead of four steps with four exclusive-ors.  This results in about a factor
+ * of two increase in speed on a Power PC G4 (PPC7455) using gcc -O3.
+ */
+
+/* @(#) $Id$ */
+
+/*
+  Note on the use of DYNAMIC_CRC_TABLE: there is no mutex or semaphore
+  protection on the static variables used to control the first-use generation
+  of the crc tables.  Therefore, if you #define DYNAMIC_CRC_TABLE, you should
+  first call get_crc_table() to initialize the tables before allowing more than
+  one thread to use crc32().
+ */
+
+#ifdef MAKECRCH
+#  include <stdio.h>
+#  ifndef DYNAMIC_CRC_TABLE
+#    define DYNAMIC_CRC_TABLE
+#  endif /* !DYNAMIC_CRC_TABLE */
+#endif /* MAKECRCH */
+
+/* #include "zutil.h"      / * for STDC and FAR definitions */
+#define STDC
+#define FAR
+#define Z_NULL ((void*)0)
+#define OF(X) X
+#define ZEXPORT
+typedef long ptrdiff_t;
+#define NOBYFOUR
+
+#define local static
+
+/* Find a four-byte integer type for crc32_little() and crc32_big(). */
+#ifndef NOBYFOUR
+#  ifdef STDC           /* need ANSI C limits.h to determine sizes */
+#    include <limits.h>
+#    define BYFOUR
+#    if (UINT_MAX == 0xffffffffUL)
+       typedef unsigned int u4;
+#    else
+#      if (ULONG_MAX == 0xffffffffUL)
+         typedef unsigned long u4;
+#      else
+#        if (USHRT_MAX == 0xffffffffUL)
+           typedef unsigned short u4;
+#        else
+#          undef BYFOUR     /* can't find a four-byte integer type! */
+#        endif
+#      endif
+#    endif
+#  endif /* STDC */
+#endif /* !NOBYFOUR */
+
+/* Definitions for doing the crc four data bytes at a time. */
+#ifdef BYFOUR
+#  define REV(w) (((w)>>24)+(((w)>>8)&0xff00)+ \
+                (((w)&0xff00)<<8)+(((w)&0xff)<<24))
+   local unsigned long crc32_little OF((unsigned long,
+                        const unsigned char FAR *, unsigned));
+   local unsigned long crc32_big OF((unsigned long,
+                        const unsigned char FAR *, unsigned));
+#  define TBLS 8
+#else
+#  define TBLS 1
+#endif /* BYFOUR */
+
+#ifdef DYNAMIC_CRC_TABLE
+
+local volatile int crc_table_empty = 1;
+local unsigned long FAR crc_table[TBLS][256];
+local void make_crc_table OF((void));
+#ifdef MAKECRCH
+   local void write_table OF((FILE *, const unsigned long FAR *));
+#endif /* MAKECRCH */
+
+/*
+  Generate tables for a byte-wise 32-bit CRC calculation on the polynomial:
+  x^32+x^26+x^23+x^22+x^16+x^12+x^11+x^10+x^8+x^7+x^5+x^4+x^2+x+1.
+
+  Polynomials over GF(2) are represented in binary, one bit per coefficient,
+  with the lowest powers in the most significant bit.  Then adding polynomials
+  is just exclusive-or, and multiplying a polynomial by x is a right shift by
+  one.  If we call the above polynomial p, and represent a byte as the
+  polynomial q, also with the lowest power in the most significant bit (so the
+  byte 0xb1 is the polynomial x^7+x^3+x+1), then the CRC is (q*x^32) mod p,
+  where a mod b means the remainder after dividing a by b.
+
+  This calculation is done using the shift-register method of multiplying and
+  taking the remainder.  The register is initialized to zero, and for each
+  incoming bit, x^32 is added mod p to the register if the bit is a one (where
+  x^32 mod p is p+x^32 = x^26+...+1), and the register is multiplied mod p by
+  x (which is shifting right by one and adding x^32 mod p if the bit shifted
+  out is a one).  We start with the highest power (least significant bit) of
+  q and repeat for all eight bits of q.
+
+  The first table is simply the CRC of all possible eight bit values.  This is
+  all the information needed to generate CRCs on data a byte at a time for all
+  combinations of CRC register values and incoming bytes.  The remaining tables
+  allow for word-at-a-time CRC calculation for both big-endian and little-
+  endian machines, where a word is four bytes.
+*/
+local void make_crc_table()
+{
+    unsigned long c;
+    int n, k;
+    unsigned long poly;                 /* polynomial exclusive-or pattern */
+    /* terms of polynomial defining this crc (except x^32): */
+    static volatile int first = 1;      /* flag to limit concurrent making */
+    static const unsigned char p[] = {0,1,2,4,5,7,8,10,11,12,16,22,23,26};
+
+    /* See if another task is already doing this (not thread-safe, but better
+       than nothing -- significantly reduces duration of vulnerability in
+       case the advice about DYNAMIC_CRC_TABLE is ignored) */
+    if (first) {
+        first = 0;
+
+        /* make exclusive-or pattern from polynomial (0xedb88320UL) */
+        poly = 0UL;
+        for (n = 0; n < sizeof(p)/sizeof(unsigned char); n++)
+            poly |= 1UL << (31 - p[n]);
+
+        /* generate a crc for every 8-bit value */
+        for (n = 0; n < 256; n++) {
+            c = (unsigned long)n;
+            for (k = 0; k < 8; k++)
+                c = c & 1 ? poly ^ (c >> 1) : c >> 1;
+            crc_table[0][n] = c;
+        }
+
+#ifdef BYFOUR
+        /* generate crc for each value followed by one, two, and three zeros,
+           and then the byte reversal of those as well as the first table */
+        for (n = 0; n < 256; n++) {
+            c = crc_table[0][n];
+            crc_table[4][n] = REV(c);
+            for (k = 1; k < 4; k++) {
+                c = crc_table[0][c & 0xff] ^ (c >> 8);
+                crc_table[k][n] = c;
+                crc_table[k + 4][n] = REV(c);
+            }
+        }
+#endif /* BYFOUR */
+
+        crc_table_empty = 0;
+    }
+    else {      /* not first */
+        /* wait for the other guy to finish (not efficient, but rare) */
+        while (crc_table_empty)
+            ;
+    }
+
+#ifdef MAKECRCH
+    /* write out CRC tables to crc32.h */
+    {
+        FILE *out;
+
+        out = fopen("crc32.h", "w");
+        if (out == NULL) return;
+        fprintf(out, "/* crc32.h -- tables for rapid CRC calculation\n");
+        fprintf(out, " * Generated automatically by crc32.c\n */\n\n");
+        fprintf(out, "local const unsigned long FAR ");
+        fprintf(out, "crc_table[TBLS][256] =\n{\n  {\n");
+        write_table(out, crc_table[0]);
+#  ifdef BYFOUR
+        fprintf(out, "#ifdef BYFOUR\n");
+        for (k = 1; k < 8; k++) {
+            fprintf(out, "  },\n  {\n");
+            write_table(out, crc_table[k]);
+        }
+        fprintf(out, "#endif\n");
+#  endif /* BYFOUR */
+        fprintf(out, "  }\n};\n");
+        fclose(out);
+    }
+#endif /* MAKECRCH */
+}
+
+#ifdef MAKECRCH
+local void write_table(out, table)
+    FILE *out;
+    const unsigned long FAR *table;
+{
+    int n;
+
+    for (n = 0; n < 256; n++)
+        fprintf(out, "%s0x%08lxUL%s", n % 5 ? "" : "    ", table[n],
+                n == 255 ? "\n" : (n % 5 == 4 ? ",\n" : ", "));
+}
+#endif /* MAKECRCH */
+
+#else /* !DYNAMIC_CRC_TABLE */
+/* ========================================================================
+ * Tables of CRC-32s of all single-byte values, made by make_crc_table().
+ */
+#include "crc32.h"
+#endif /* DYNAMIC_CRC_TABLE */
+
+/* =========================================================================
+ * This function can be used by asm versions of crc32()
+ */
+const unsigned long FAR * ZEXPORT get_crc_table(void)
+{
+#ifdef DYNAMIC_CRC_TABLE
+    if (crc_table_empty)
+        make_crc_table();
+#endif /* DYNAMIC_CRC_TABLE */
+    return (const unsigned long FAR *)crc_table;
+}
+
+/* ========================================================================= */
+#define DO1 crc = crc_table[0][((int)crc ^ (*buf++)) & 0xff] ^ (crc >> 8)
+#define DO8 DO1; DO1; DO1; DO1; DO1; DO1; DO1; DO1
+
+/* ========================================================================= */
+unsigned long ZEXPORT crc32(
+       unsigned long crc,
+       const unsigned char FAR *buf,
+       unsigned len)
+{
+    if (buf == Z_NULL) return 0UL;
+
+#ifdef DYNAMIC_CRC_TABLE
+    if (crc_table_empty)
+        make_crc_table();
+#endif /* DYNAMIC_CRC_TABLE */
+
+#ifdef BYFOUR
+    if (sizeof(void *) == sizeof(ptrdiff_t)) {
+        u4 endian;
+
+        endian = 1;
+        if (*((unsigned char *)(&endian)))
+            return crc32_little(crc, buf, len);
+        else
+            return crc32_big(crc, buf, len);
+    }
+#endif /* BYFOUR */
+/*    crc = crc ^ 0xffffffffUL;*/
+    while (len >= 8) {
+        DO8;
+        len -= 8;
+    }
+    if (len) do {
+        DO1;
+    } while (--len);
+    return crc /* ^ 0xffffffffUL*/;
+}
+
+#ifdef BYFOUR
+
+/* ========================================================================= */
+#define DOLIT4 c ^= *buf4++; \
+        c = crc_table[3][c & 0xff] ^ crc_table[2][(c >> 8) & 0xff] ^ \
+            crc_table[1][(c >> 16) & 0xff] ^ crc_table[0][c >> 24]
+#define DOLIT32 DOLIT4; DOLIT4; DOLIT4; DOLIT4; DOLIT4; DOLIT4; DOLIT4; DOLIT4
+
+/* ========================================================================= */
+local unsigned long crc32_little(crc, buf, len)
+    unsigned long crc;
+    const unsigned char FAR *buf;
+    unsigned len;
+{
+    register u4 c;
+    register const u4 FAR *buf4;
+
+    c = (u4)crc;
+    c = ~c;
+    while (len && ((ptrdiff_t)buf & 3)) {
+        c = crc_table[0][(c ^ *buf++) & 0xff] ^ (c >> 8);
+        len--;
+    }
+
+    buf4 = (const u4 FAR *)buf;
+    while (len >= 32) {
+        DOLIT32;
+        len -= 32;
+    }
+    while (len >= 4) {
+        DOLIT4;
+        len -= 4;
+    }
+    buf = (const unsigned char FAR *)buf4;
+
+    if (len) do {
+        c = crc_table[0][(c ^ *buf++) & 0xff] ^ (c >> 8);
+    } while (--len);
+    c = ~c;
+    return (unsigned long)c;
+}
+
+/* ========================================================================= */
+#define DOBIG4 c ^= *++buf4; \
+        c = crc_table[4][c & 0xff] ^ crc_table[5][(c >> 8) & 0xff] ^ \
+            crc_table[6][(c >> 16) & 0xff] ^ crc_table[7][c >> 24]
+#define DOBIG32 DOBIG4; DOBIG4; DOBIG4; DOBIG4; DOBIG4; DOBIG4; DOBIG4; DOBIG4
+
+/* ========================================================================= */
+local unsigned long crc32_big(crc, buf, len)
+    unsigned long crc;
+    const unsigned char FAR *buf;
+    unsigned len;
+{
+    register u4 c;
+    register const u4 FAR *buf4;
+
+    c = REV((u4)crc);
+    c = ~c;
+    while (len && ((ptrdiff_t)buf & 3)) {
+        c = crc_table[4][(c >> 24) ^ *buf++] ^ (c << 8);
+        len--;
+    }
+
+    buf4 = (const u4 FAR *)buf;
+    buf4--;
+    while (len >= 32) {
+        DOBIG32;
+        len -= 32;
+    }
+    while (len >= 4) {
+        DOBIG4;
+        len -= 4;
+    }
+    buf4++;
+    buf = (const unsigned char FAR *)buf4;
+
+    if (len) do {
+        c = crc_table[4][(c >> 24) ^ *buf++] ^ (c << 8);
+    } while (--len);
+    c = ~c;
+    return (unsigned long)(REV(c));
+}
+
+#endif /* BYFOUR */
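Note on crc32.c above: with NOBYFOUR defined only the byte-at-a-time path is compiled, and the 0xffffffff pre- and post-inversion inside crc32() is commented out, so the caller supplies the starting value. The sketch below rebuilds the polynomial and the first table from the term list in make_crc_table() and runs the same byte-wise update. The names `sketch_poly`, `sketch_make_table`, `sketch_table` and `sketch_crc32` are hypothetical, and fixed-width uint32_t is used here instead of unsigned long as a simplification.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

static uint32_t sketch_table[256];

/* Build the reflected polynomial from its exponents (x^32 omitted):
 * exponent e sets bit 31 - e, which yields 0xedb88320. */
static uint32_t sketch_poly(void)
{
    static const unsigned char p[] = {0,1,2,4,5,7,8,10,11,12,16,22,23,26};
    uint32_t poly = 0;
    size_t i;

    for (i = 0; i < sizeof(p) / sizeof(p[0]); i++)
        poly |= (uint32_t)1 << (31 - p[i]);
    return poly;
}

/* Fill the table with the CRC of every possible 8-bit value. */
static void sketch_make_table(void)
{
    uint32_t poly = sketch_poly();
    uint32_t c;
    int n, k;

    for (n = 0; n < 256; n++) {
        c = (uint32_t)n;
        for (k = 0; k < 8; k++)
            c = (c & 1) ? poly ^ (c >> 1) : c >> 1;
        sketch_table[n] = c;
    }
}

/* Byte-wise update with no pre- or post-inversion, matching the patched
 * crc32(): the caller chooses the initial register value. */
static uint32_t sketch_crc32(uint32_t crc, const unsigned char *buf, size_t len)
{
    assert(buf != NULL || len == 0);
    while (len--)
        crc = sketch_table[(crc ^ *buf++) & 0xff] ^ (crc >> 8);
    return crc;
}
```

Calling sketch_make_table() once and then passing 0xffffffff as the initial value and inverting the result gives the conventional CRC-32; passing other initial values gives the raw register behaviour that the patched function exposes.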
diff --git a/crc32.h b/crc32.h
new file mode 100644 (file)
index 0000000..8053b61
--- /dev/null
+++ b/crc32.h
@@ -0,0 +1,441 @@
+/* crc32.h -- tables for rapid CRC calculation
+ * Generated automatically by crc32.c
+ */
+
+local const unsigned long FAR crc_table[TBLS][256] =
+{
+  {
+    0x00000000UL, 0x77073096UL, 0xee0e612cUL, 0x990951baUL, 0x076dc419UL,
+    0x706af48fUL, 0xe963a535UL, 0x9e6495a3UL, 0x0edb8832UL, 0x79dcb8a4UL,
+    0xe0d5e91eUL, 0x97d2d988UL, 0x09b64c2bUL, 0x7eb17cbdUL, 0xe7b82d07UL,
+    0x90bf1d91UL, 0x1db71064UL, 0x6ab020f2UL, 0xf3b97148UL, 0x84be41deUL,
+    0x1adad47dUL, 0x6ddde4ebUL, 0xf4d4b551UL, 0x83d385c7UL, 0x136c9856UL,
+    0x646ba8c0UL, 0xfd62f97aUL, 0x8a65c9ecUL, 0x14015c4fUL, 0x63066cd9UL,
+    0xfa0f3d63UL, 0x8d080df5UL, 0x3b6e20c8UL, 0x4c69105eUL, 0xd56041e4UL,
+    0xa2677172UL, 0x3c03e4d1UL, 0x4b04d447UL, 0xd20d85fdUL, 0xa50ab56bUL,
+    0x35b5a8faUL, 0x42b2986cUL, 0xdbbbc9d6UL, 0xacbcf940UL, 0x32d86ce3UL,
+    0x45df5c75UL, 0xdcd60dcfUL, 0xabd13d59UL, 0x26d930acUL, 0x51de003aUL,
+    0xc8d75180UL, 0xbfd06116UL, 0x21b4f4b5UL, 0x56b3c423UL, 0xcfba9599UL,
+    0xb8bda50fUL, 0x2802b89eUL, 0x5f058808UL, 0xc60cd9b2UL, 0xb10be924UL,
+    0x2f6f7c87UL, 0x58684c11UL, 0xc1611dabUL, 0xb6662d3dUL, 0x76dc4190UL,
+    0x01db7106UL, 0x98d220bcUL, 0xefd5102aUL, 0x71b18589UL, 0x06b6b51fUL,
+    0x9fbfe4a5UL, 0xe8b8d433UL, 0x7807c9a2UL, 0x0f00f934UL, 0x9609a88eUL,
+    0xe10e9818UL, 0x7f6a0dbbUL, 0x086d3d2dUL, 0x91646c97UL, 0xe6635c01UL,
+    0x6b6b51f4UL, 0x1c6c6162UL, 0x856530d8UL, 0xf262004eUL, 0x6c0695edUL,
+    0x1b01a57bUL, 0x8208f4c1UL, 0xf50fc457UL, 0x65b0d9c6UL, 0x12b7e950UL,
+    0x8bbeb8eaUL, 0xfcb9887cUL, 0x62dd1ddfUL, 0x15da2d49UL, 0x8cd37cf3UL,
+    0xfbd44c65UL, 0x4db26158UL, 0x3ab551ceUL, 0xa3bc0074UL, 0xd4bb30e2UL,
+    0x4adfa541UL, 0x3dd895d7UL, 0xa4d1c46dUL, 0xd3d6f4fbUL, 0x4369e96aUL,
+    0x346ed9fcUL, 0xad678846UL, 0xda60b8d0UL, 0x44042d73UL, 0x33031de5UL,
+    0xaa0a4c5fUL, 0xdd0d7cc9UL, 0x5005713cUL, 0x270241aaUL, 0xbe0b1010UL,
+    0xc90c2086UL, 0x5768b525UL, 0x206f85b3UL, 0xb966d409UL, 0xce61e49fUL,
+    0x5edef90eUL, 0x29d9c998UL, 0xb0d09822UL, 0xc7d7a8b4UL, 0x59b33d17UL,
+    0x2eb40d81UL, 0xb7bd5c3bUL, 0xc0ba6cadUL, 0xedb88320UL, 0x9abfb3b6UL,
+    0x03b6e20cUL, 0x74b1d29aUL, 0xead54739UL, 0x9dd277afUL, 0x04db2615UL,
+    0x73dc1683UL, 0xe3630b12UL, 0x94643b84UL, 0x0d6d6a3eUL, 0x7a6a5aa8UL,
+    0xe40ecf0bUL, 0x9309ff9dUL, 0x0a00ae27UL, 0x7d079eb1UL, 0xf00f9344UL,
+    0x8708a3d2UL, 0x1e01f268UL, 0x6906c2feUL, 0xf762575dUL, 0x806567cbUL,
+    0x196c3671UL, 0x6e6b06e7UL, 0xfed41b76UL, 0x89d32be0UL, 0x10da7a5aUL,
+    0x67dd4accUL, 0xf9b9df6fUL, 0x8ebeeff9UL, 0x17b7be43UL, 0x60b08ed5UL,
+    0xd6d6a3e8UL, 0xa1d1937eUL, 0x38d8c2c4UL, 0x4fdff252UL, 0xd1bb67f1UL,
+    0xa6bc5767UL, 0x3fb506ddUL, 0x48b2364bUL, 0xd80d2bdaUL, 0xaf0a1b4cUL,
+    0x36034af6UL, 0x41047a60UL, 0xdf60efc3UL, 0xa867df55UL, 0x316e8eefUL,
+    0x4669be79UL, 0xcb61b38cUL, 0xbc66831aUL, 0x256fd2a0UL, 0x5268e236UL,
+    0xcc0c7795UL, 0xbb0b4703UL, 0x220216b9UL, 0x5505262fUL, 0xc5ba3bbeUL,
+    0xb2bd0b28UL, 0x2bb45a92UL, 0x5cb36a04UL, 0xc2d7ffa7UL, 0xb5d0cf31UL,
+    0x2cd99e8bUL, 0x5bdeae1dUL, 0x9b64c2b0UL, 0xec63f226UL, 0x756aa39cUL,
+    0x026d930aUL, 0x9c0906a9UL, 0xeb0e363fUL, 0x72076785UL, 0x05005713UL,
+    0x95bf4a82UL, 0xe2b87a14UL, 0x7bb12baeUL, 0x0cb61b38UL, 0x92d28e9bUL,
+    0xe5d5be0dUL, 0x7cdcefb7UL, 0x0bdbdf21UL, 0x86d3d2d4UL, 0xf1d4e242UL,
+    0x68ddb3f8UL, 0x1fda836eUL, 0x81be16cdUL, 0xf6b9265bUL, 0x6fb077e1UL,
+    0x18b74777UL, 0x88085ae6UL, 0xff0f6a70UL, 0x66063bcaUL, 0x11010b5cUL,
+    0x8f659effUL, 0xf862ae69UL, 0x616bffd3UL, 0x166ccf45UL, 0xa00ae278UL,
+    0xd70dd2eeUL, 0x4e048354UL, 0x3903b3c2UL, 0xa7672661UL, 0xd06016f7UL,
+    0x4969474dUL, 0x3e6e77dbUL, 0xaed16a4aUL, 0xd9d65adcUL, 0x40df0b66UL,
+    0x37d83bf0UL, 0xa9bcae53UL, 0xdebb9ec5UL, 0x47b2cf7fUL, 0x30b5ffe9UL,
+    0xbdbdf21cUL, 0xcabac28aUL, 0x53b39330UL, 0x24b4a3a6UL, 0xbad03605UL,
+    0xcdd70693UL, 0x54de5729UL, 0x23d967bfUL, 0xb3667a2eUL, 0xc4614ab8UL,
+    0x5d681b02UL, 0x2a6f2b94UL, 0xb40bbe37UL, 0xc30c8ea1UL, 0x5a05df1bUL,
+    0x2d02ef8dUL
+#ifdef BYFOUR
+  },
+  {
+    0x00000000UL, 0x191b3141UL, 0x32366282UL, 0x2b2d53c3UL, 0x646cc504UL,
+    0x7d77f445UL, 0x565aa786UL, 0x4f4196c7UL, 0xc8d98a08UL, 0xd1c2bb49UL,
+    0xfaefe88aUL, 0xe3f4d9cbUL, 0xacb54f0cUL, 0xb5ae7e4dUL, 0x9e832d8eUL,
+    0x87981ccfUL, 0x4ac21251UL, 0x53d92310UL, 0x78f470d3UL, 0x61ef4192UL,
+    0x2eaed755UL, 0x37b5e614UL, 0x1c98b5d7UL, 0x05838496UL, 0x821b9859UL,
+    0x9b00a918UL, 0xb02dfadbUL, 0xa936cb9aUL, 0xe6775d5dUL, 0xff6c6c1cUL,
+    0xd4413fdfUL, 0xcd5a0e9eUL, 0x958424a2UL, 0x8c9f15e3UL, 0xa7b24620UL,
+    0xbea97761UL, 0xf1e8e1a6UL, 0xe8f3d0e7UL, 0xc3de8324UL, 0xdac5b265UL,
+    0x5d5daeaaUL, 0x44469febUL, 0x6f6bcc28UL, 0x7670fd69UL, 0x39316baeUL,
+    0x202a5aefUL, 0x0b07092cUL, 0x121c386dUL, 0xdf4636f3UL, 0xc65d07b2UL,
+    0xed705471UL, 0xf46b6530UL, 0xbb2af3f7UL, 0xa231c2b6UL, 0x891c9175UL,
+    0x9007a034UL, 0x179fbcfbUL, 0x0e848dbaUL, 0x25a9de79UL, 0x3cb2ef38UL,
+    0x73f379ffUL, 0x6ae848beUL, 0x41c51b7dUL, 0x58de2a3cUL, 0xf0794f05UL,
+    0xe9627e44UL, 0xc24f2d87UL, 0xdb541cc6UL, 0x94158a01UL, 0x8d0ebb40UL,
+    0xa623e883UL, 0xbf38d9c2UL, 0x38a0c50dUL, 0x21bbf44cUL, 0x0a96a78fUL,
+    0x138d96ceUL, 0x5ccc0009UL, 0x45d73148UL, 0x6efa628bUL, 0x77e153caUL,
+    0xbabb5d54UL, 0xa3a06c15UL, 0x888d3fd6UL, 0x91960e97UL, 0xded79850UL,
+    0xc7cca911UL, 0xece1fad2UL, 0xf5facb93UL, 0x7262d75cUL, 0x6b79e61dUL,
+    0x4054b5deUL, 0x594f849fUL, 0x160e1258UL, 0x0f152319UL, 0x243870daUL,
+    0x3d23419bUL, 0x65fd6ba7UL, 0x7ce65ae6UL, 0x57cb0925UL, 0x4ed03864UL,
+    0x0191aea3UL, 0x188a9fe2UL, 0x33a7cc21UL, 0x2abcfd60UL, 0xad24e1afUL,
+    0xb43fd0eeUL, 0x9f12832dUL, 0x8609b26cUL, 0xc94824abUL, 0xd05315eaUL,
+    0xfb7e4629UL, 0xe2657768UL, 0x2f3f79f6UL, 0x362448b7UL, 0x1d091b74UL,
+    0x04122a35UL, 0x4b53bcf2UL, 0x52488db3UL, 0x7965de70UL, 0x607eef31UL,
+    0xe7e6f3feUL, 0xfefdc2bfUL, 0xd5d0917cUL, 0xcccba03dUL, 0x838a36faUL,
+    0x9a9107bbUL, 0xb1bc5478UL, 0xa8a76539UL, 0x3b83984bUL, 0x2298a90aUL,
+    0x09b5fac9UL, 0x10aecb88UL, 0x5fef5d4fUL, 0x46f46c0eUL, 0x6dd93fcdUL,
+    0x74c20e8cUL, 0xf35a1243UL, 0xea412302UL, 0xc16c70c1UL, 0xd8774180UL,
+    0x9736d747UL, 0x8e2de606UL, 0xa500b5c5UL, 0xbc1b8484UL, 0x71418a1aUL,
+    0x685abb5bUL, 0x4377e898UL, 0x5a6cd9d9UL, 0x152d4f1eUL, 0x0c367e5fUL,
+    0x271b2d9cUL, 0x3e001cddUL, 0xb9980012UL, 0xa0833153UL, 0x8bae6290UL,
+    0x92b553d1UL, 0xddf4c516UL, 0xc4eff457UL, 0xefc2a794UL, 0xf6d996d5UL,
+    0xae07bce9UL, 0xb71c8da8UL, 0x9c31de6bUL, 0x852aef2aUL, 0xca6b79edUL,
+    0xd37048acUL, 0xf85d1b6fUL, 0xe1462a2eUL, 0x66de36e1UL, 0x7fc507a0UL,
+    0x54e85463UL, 0x4df36522UL, 0x02b2f3e5UL, 0x1ba9c2a4UL, 0x30849167UL,
+    0x299fa026UL, 0xe4c5aeb8UL, 0xfdde9ff9UL, 0xd6f3cc3aUL, 0xcfe8fd7bUL,
+    0x80a96bbcUL, 0x99b25afdUL, 0xb29f093eUL, 0xab84387fUL, 0x2c1c24b0UL,
+    0x350715f1UL, 0x1e2a4632UL, 0x07317773UL, 0x4870e1b4UL, 0x516bd0f5UL,
+    0x7a468336UL, 0x635db277UL, 0xcbfad74eUL, 0xd2e1e60fUL, 0xf9ccb5ccUL,
+    0xe0d7848dUL, 0xaf96124aUL, 0xb68d230bUL, 0x9da070c8UL, 0x84bb4189UL,
+    0x03235d46UL, 0x1a386c07UL, 0x31153fc4UL, 0x280e0e85UL, 0x674f9842UL,
+    0x7e54a903UL, 0x5579fac0UL, 0x4c62cb81UL, 0x8138c51fUL, 0x9823f45eUL,
+    0xb30ea79dUL, 0xaa1596dcUL, 0xe554001bUL, 0xfc4f315aUL, 0xd7626299UL,
+    0xce7953d8UL, 0x49e14f17UL, 0x50fa7e56UL, 0x7bd72d95UL, 0x62cc1cd4UL,
+    0x2d8d8a13UL, 0x3496bb52UL, 0x1fbbe891UL, 0x06a0d9d0UL, 0x5e7ef3ecUL,
+    0x4765c2adUL, 0x6c48916eUL, 0x7553a02fUL, 0x3a1236e8UL, 0x230907a9UL,
+    0x0824546aUL, 0x113f652bUL, 0x96a779e4UL, 0x8fbc48a5UL, 0xa4911b66UL,
+    0xbd8a2a27UL, 0xf2cbbce0UL, 0xebd08da1UL, 0xc0fdde62UL, 0xd9e6ef23UL,
+    0x14bce1bdUL, 0x0da7d0fcUL, 0x268a833fUL, 0x3f91b27eUL, 0x70d024b9UL,
+    0x69cb15f8UL, 0x42e6463bUL, 0x5bfd777aUL, 0xdc656bb5UL, 0xc57e5af4UL,
+    0xee530937UL, 0xf7483876UL, 0xb809aeb1UL, 0xa1129ff0UL, 0x8a3fcc33UL,
+    0x9324fd72UL
+  },
+  {
+    0x00000000UL, 0x01c26a37UL, 0x0384d46eUL, 0x0246be59UL, 0x0709a8dcUL,
+    0x06cbc2ebUL, 0x048d7cb2UL, 0x054f1685UL, 0x0e1351b8UL, 0x0fd13b8fUL,
+    0x0d9785d6UL, 0x0c55efe1UL, 0x091af964UL, 0x08d89353UL, 0x0a9e2d0aUL,
+    0x0b5c473dUL, 0x1c26a370UL, 0x1de4c947UL, 0x1fa2771eUL, 0x1e601d29UL,
+    0x1b2f0bacUL, 0x1aed619bUL, 0x18abdfc2UL, 0x1969b5f5UL, 0x1235f2c8UL,
+    0x13f798ffUL, 0x11b126a6UL, 0x10734c91UL, 0x153c5a14UL, 0x14fe3023UL,
+    0x16b88e7aUL, 0x177ae44dUL, 0x384d46e0UL, 0x398f2cd7UL, 0x3bc9928eUL,
+    0x3a0bf8b9UL, 0x3f44ee3cUL, 0x3e86840bUL, 0x3cc03a52UL, 0x3d025065UL,
+    0x365e1758UL, 0x379c7d6fUL, 0x35dac336UL, 0x3418a901UL, 0x3157bf84UL,
+    0x3095d5b3UL, 0x32d36beaUL, 0x331101ddUL, 0x246be590UL, 0x25a98fa7UL,
+    0x27ef31feUL, 0x262d5bc9UL, 0x23624d4cUL, 0x22a0277bUL, 0x20e69922UL,
+    0x2124f315UL, 0x2a78b428UL, 0x2bbade1fUL, 0x29fc6046UL, 0x283e0a71UL,
+    0x2d711cf4UL, 0x2cb376c3UL, 0x2ef5c89aUL, 0x2f37a2adUL, 0x709a8dc0UL,
+    0x7158e7f7UL, 0x731e59aeUL, 0x72dc3399UL, 0x7793251cUL, 0x76514f2bUL,
+    0x7417f172UL, 0x75d59b45UL, 0x7e89dc78UL, 0x7f4bb64fUL, 0x7d0d0816UL,
+    0x7ccf6221UL, 0x798074a4UL, 0x78421e93UL, 0x7a04a0caUL, 0x7bc6cafdUL,
+    0x6cbc2eb0UL, 0x6d7e4487UL, 0x6f38fadeUL, 0x6efa90e9UL, 0x6bb5866cUL,
+    0x6a77ec5bUL, 0x68315202UL, 0x69f33835UL, 0x62af7f08UL, 0x636d153fUL,
+    0x612bab66UL, 0x60e9c151UL, 0x65a6d7d4UL, 0x6464bde3UL, 0x662203baUL,
+    0x67e0698dUL, 0x48d7cb20UL, 0x4915a117UL, 0x4b531f4eUL, 0x4a917579UL,
+    0x4fde63fcUL, 0x4e1c09cbUL, 0x4c5ab792UL, 0x4d98dda5UL, 0x46c49a98UL,
+    0x4706f0afUL, 0x45404ef6UL, 0x448224c1UL, 0x41cd3244UL, 0x400f5873UL,
+    0x4249e62aUL, 0x438b8c1dUL, 0x54f16850UL, 0x55330267UL, 0x5775bc3eUL,
+    0x56b7d609UL, 0x53f8c08cUL, 0x523aaabbUL, 0x507c14e2UL, 0x51be7ed5UL,
+    0x5ae239e8UL, 0x5b2053dfUL, 0x5966ed86UL, 0x58a487b1UL, 0x5deb9134UL,
+    0x5c29fb03UL, 0x5e6f455aUL, 0x5fad2f6dUL, 0xe1351b80UL, 0xe0f771b7UL,
+    0xe2b1cfeeUL, 0xe373a5d9UL, 0xe63cb35cUL, 0xe7fed96bUL, 0xe5b86732UL,
+    0xe47a0d05UL, 0xef264a38UL, 0xeee4200fUL, 0xeca29e56UL, 0xed60f461UL,
+    0xe82fe2e4UL, 0xe9ed88d3UL, 0xebab368aUL, 0xea695cbdUL, 0xfd13b8f0UL,
+    0xfcd1d2c7UL, 0xfe976c9eUL, 0xff5506a9UL, 0xfa1a102cUL, 0xfbd87a1bUL,
+    0xf99ec442UL, 0xf85cae75UL, 0xf300e948UL, 0xf2c2837fUL, 0xf0843d26UL,
+    0xf1465711UL, 0xf4094194UL, 0xf5cb2ba3UL, 0xf78d95faUL, 0xf64fffcdUL,
+    0xd9785d60UL, 0xd8ba3757UL, 0xdafc890eUL, 0xdb3ee339UL, 0xde71f5bcUL,
+    0xdfb39f8bUL, 0xddf521d2UL, 0xdc374be5UL, 0xd76b0cd8UL, 0xd6a966efUL,
+    0xd4efd8b6UL, 0xd52db281UL, 0xd062a404UL, 0xd1a0ce33UL, 0xd3e6706aUL,
+    0xd2241a5dUL, 0xc55efe10UL, 0xc49c9427UL, 0xc6da2a7eUL, 0xc7184049UL,
+    0xc25756ccUL, 0xc3953cfbUL, 0xc1d382a2UL, 0xc011e895UL, 0xcb4dafa8UL,
+    0xca8fc59fUL, 0xc8c97bc6UL, 0xc90b11f1UL, 0xcc440774UL, 0xcd866d43UL,
+    0xcfc0d31aUL, 0xce02b92dUL, 0x91af9640UL, 0x906dfc77UL, 0x922b422eUL,
+    0x93e92819UL, 0x96a63e9cUL, 0x976454abUL, 0x9522eaf2UL, 0x94e080c5UL,
+    0x9fbcc7f8UL, 0x9e7eadcfUL, 0x9c381396UL, 0x9dfa79a1UL, 0x98b56f24UL,
+    0x99770513UL, 0x9b31bb4aUL, 0x9af3d17dUL, 0x8d893530UL, 0x8c4b5f07UL,
+    0x8e0de15eUL, 0x8fcf8b69UL, 0x8a809decUL, 0x8b42f7dbUL, 0x89044982UL,
+    0x88c623b5UL, 0x839a6488UL, 0x82580ebfUL, 0x801eb0e6UL, 0x81dcdad1UL,
+    0x8493cc54UL, 0x8551a663UL, 0x8717183aUL, 0x86d5720dUL, 0xa9e2d0a0UL,
+    0xa820ba97UL, 0xaa6604ceUL, 0xaba46ef9UL, 0xaeeb787cUL, 0xaf29124bUL,
+    0xad6fac12UL, 0xacadc625UL, 0xa7f18118UL, 0xa633eb2fUL, 0xa4755576UL,
+    0xa5b73f41UL, 0xa0f829c4UL, 0xa13a43f3UL, 0xa37cfdaaUL, 0xa2be979dUL,
+    0xb5c473d0UL, 0xb40619e7UL, 0xb640a7beUL, 0xb782cd89UL, 0xb2cddb0cUL,
+    0xb30fb13bUL, 0xb1490f62UL, 0xb08b6555UL, 0xbbd72268UL, 0xba15485fUL,
+    0xb853f606UL, 0xb9919c31UL, 0xbcde8ab4UL, 0xbd1ce083UL, 0xbf5a5edaUL,
+    0xbe9834edUL
+  },
+  {
+    0x00000000UL, 0xb8bc6765UL, 0xaa09c88bUL, 0x12b5afeeUL, 0x8f629757UL,
+    0x37def032UL, 0x256b5fdcUL, 0x9dd738b9UL, 0xc5b428efUL, 0x7d084f8aUL,
+    0x6fbde064UL, 0xd7018701UL, 0x4ad6bfb8UL, 0xf26ad8ddUL, 0xe0df7733UL,
+    0x58631056UL, 0x5019579fUL, 0xe8a530faUL, 0xfa109f14UL, 0x42acf871UL,
+    0xdf7bc0c8UL, 0x67c7a7adUL, 0x75720843UL, 0xcdce6f26UL, 0x95ad7f70UL,
+    0x2d111815UL, 0x3fa4b7fbUL, 0x8718d09eUL, 0x1acfe827UL, 0xa2738f42UL,
+    0xb0c620acUL, 0x087a47c9UL, 0xa032af3eUL, 0x188ec85bUL, 0x0a3b67b5UL,
+    0xb28700d0UL, 0x2f503869UL, 0x97ec5f0cUL, 0x8559f0e2UL, 0x3de59787UL,
+    0x658687d1UL, 0xdd3ae0b4UL, 0xcf8f4f5aUL, 0x7733283fUL, 0xeae41086UL,
+    0x525877e3UL, 0x40edd80dUL, 0xf851bf68UL, 0xf02bf8a1UL, 0x48979fc4UL,
+    0x5a22302aUL, 0xe29e574fUL, 0x7f496ff6UL, 0xc7f50893UL, 0xd540a77dUL,
+    0x6dfcc018UL, 0x359fd04eUL, 0x8d23b72bUL, 0x9f9618c5UL, 0x272a7fa0UL,
+    0xbafd4719UL, 0x0241207cUL, 0x10f48f92UL, 0xa848e8f7UL, 0x9b14583dUL,
+    0x23a83f58UL, 0x311d90b6UL, 0x89a1f7d3UL, 0x1476cf6aUL, 0xaccaa80fUL,
+    0xbe7f07e1UL, 0x06c36084UL, 0x5ea070d2UL, 0xe61c17b7UL, 0xf4a9b859UL,
+    0x4c15df3cUL, 0xd1c2e785UL, 0x697e80e0UL, 0x7bcb2f0eUL, 0xc377486bUL,
+    0xcb0d0fa2UL, 0x73b168c7UL, 0x6104c729UL, 0xd9b8a04cUL, 0x446f98f5UL,
+    0xfcd3ff90UL, 0xee66507eUL, 0x56da371bUL, 0x0eb9274dUL, 0xb6054028UL,
+    0xa4b0efc6UL, 0x1c0c88a3UL, 0x81dbb01aUL, 0x3967d77fUL, 0x2bd27891UL,
+    0x936e1ff4UL, 0x3b26f703UL, 0x839a9066UL, 0x912f3f88UL, 0x299358edUL,
+    0xb4446054UL, 0x0cf80731UL, 0x1e4da8dfUL, 0xa6f1cfbaUL, 0xfe92dfecUL,
+    0x462eb889UL, 0x549b1767UL, 0xec277002UL, 0x71f048bbUL, 0xc94c2fdeUL,
+    0xdbf98030UL, 0x6345e755UL, 0x6b3fa09cUL, 0xd383c7f9UL, 0xc1366817UL,
+    0x798a0f72UL, 0xe45d37cbUL, 0x5ce150aeUL, 0x4e54ff40UL, 0xf6e89825UL,
+    0xae8b8873UL, 0x1637ef16UL, 0x048240f8UL, 0xbc3e279dUL, 0x21e91f24UL,
+    0x99557841UL, 0x8be0d7afUL, 0x335cb0caUL, 0xed59b63bUL, 0x55e5d15eUL,
+    0x47507eb0UL, 0xffec19d5UL, 0x623b216cUL, 0xda874609UL, 0xc832e9e7UL,
+    0x708e8e82UL, 0x28ed9ed4UL, 0x9051f9b1UL, 0x82e4565fUL, 0x3a58313aUL,
+    0xa78f0983UL, 0x1f336ee6UL, 0x0d86c108UL, 0xb53aa66dUL, 0xbd40e1a4UL,
+    0x05fc86c1UL, 0x1749292fUL, 0xaff54e4aUL, 0x322276f3UL, 0x8a9e1196UL,
+    0x982bbe78UL, 0x2097d91dUL, 0x78f4c94bUL, 0xc048ae2eUL, 0xd2fd01c0UL,
+    0x6a4166a5UL, 0xf7965e1cUL, 0x4f2a3979UL, 0x5d9f9697UL, 0xe523f1f2UL,
+    0x4d6b1905UL, 0xf5d77e60UL, 0xe762d18eUL, 0x5fdeb6ebUL, 0xc2098e52UL,
+    0x7ab5e937UL, 0x680046d9UL, 0xd0bc21bcUL, 0x88df31eaUL, 0x3063568fUL,
+    0x22d6f961UL, 0x9a6a9e04UL, 0x07bda6bdUL, 0xbf01c1d8UL, 0xadb46e36UL,
+    0x15080953UL, 0x1d724e9aUL, 0xa5ce29ffUL, 0xb77b8611UL, 0x0fc7e174UL,
+    0x9210d9cdUL, 0x2aacbea8UL, 0x38191146UL, 0x80a57623UL, 0xd8c66675UL,
+    0x607a0110UL, 0x72cfaefeUL, 0xca73c99bUL, 0x57a4f122UL, 0xef189647UL,
+    0xfdad39a9UL, 0x45115eccUL, 0x764dee06UL, 0xcef18963UL, 0xdc44268dUL,
+    0x64f841e8UL, 0xf92f7951UL, 0x41931e34UL, 0x5326b1daUL, 0xeb9ad6bfUL,
+    0xb3f9c6e9UL, 0x0b45a18cUL, 0x19f00e62UL, 0xa14c6907UL, 0x3c9b51beUL,
+    0x842736dbUL, 0x96929935UL, 0x2e2efe50UL, 0x2654b999UL, 0x9ee8defcUL,
+    0x8c5d7112UL, 0x34e11677UL, 0xa9362eceUL, 0x118a49abUL, 0x033fe645UL,
+    0xbb838120UL, 0xe3e09176UL, 0x5b5cf613UL, 0x49e959fdUL, 0xf1553e98UL,
+    0x6c820621UL, 0xd43e6144UL, 0xc68bceaaUL, 0x7e37a9cfUL, 0xd67f4138UL,
+    0x6ec3265dUL, 0x7c7689b3UL, 0xc4caeed6UL, 0x591dd66fUL, 0xe1a1b10aUL,
+    0xf3141ee4UL, 0x4ba87981UL, 0x13cb69d7UL, 0xab770eb2UL, 0xb9c2a15cUL,
+    0x017ec639UL, 0x9ca9fe80UL, 0x241599e5UL, 0x36a0360bUL, 0x8e1c516eUL,
+    0x866616a7UL, 0x3eda71c2UL, 0x2c6fde2cUL, 0x94d3b949UL, 0x090481f0UL,
+    0xb1b8e695UL, 0xa30d497bUL, 0x1bb12e1eUL, 0x43d23e48UL, 0xfb6e592dUL,
+    0xe9dbf6c3UL, 0x516791a6UL, 0xccb0a91fUL, 0x740cce7aUL, 0x66b96194UL,
+    0xde0506f1UL
+  },
+  {
+    0x00000000UL, 0x96300777UL, 0x2c610eeeUL, 0xba510999UL, 0x19c46d07UL,
+    0x8ff46a70UL, 0x35a563e9UL, 0xa395649eUL, 0x3288db0eUL, 0xa4b8dc79UL,
+    0x1ee9d5e0UL, 0x88d9d297UL, 0x2b4cb609UL, 0xbd7cb17eUL, 0x072db8e7UL,
+    0x911dbf90UL, 0x6410b71dUL, 0xf220b06aUL, 0x4871b9f3UL, 0xde41be84UL,
+    0x7dd4da1aUL, 0xebe4dd6dUL, 0x51b5d4f4UL, 0xc785d383UL, 0x56986c13UL,
+    0xc0a86b64UL, 0x7af962fdUL, 0xecc9658aUL, 0x4f5c0114UL, 0xd96c0663UL,
+    0x633d0ffaUL, 0xf50d088dUL, 0xc8206e3bUL, 0x5e10694cUL, 0xe44160d5UL,
+    0x727167a2UL, 0xd1e4033cUL, 0x47d4044bUL, 0xfd850dd2UL, 0x6bb50aa5UL,
+    0xfaa8b535UL, 0x6c98b242UL, 0xd6c9bbdbUL, 0x40f9bcacUL, 0xe36cd832UL,
+    0x755cdf45UL, 0xcf0dd6dcUL, 0x593dd1abUL, 0xac30d926UL, 0x3a00de51UL,
+    0x8051d7c8UL, 0x1661d0bfUL, 0xb5f4b421UL, 0x23c4b356UL, 0x9995bacfUL,
+    0x0fa5bdb8UL, 0x9eb80228UL, 0x0888055fUL, 0xb2d90cc6UL, 0x24e90bb1UL,
+    0x877c6f2fUL, 0x114c6858UL, 0xab1d61c1UL, 0x3d2d66b6UL, 0x9041dc76UL,
+    0x0671db01UL, 0xbc20d298UL, 0x2a10d5efUL, 0x8985b171UL, 0x1fb5b606UL,
+    0xa5e4bf9fUL, 0x33d4b8e8UL, 0xa2c90778UL, 0x34f9000fUL, 0x8ea80996UL,
+    0x18980ee1UL, 0xbb0d6a7fUL, 0x2d3d6d08UL, 0x976c6491UL, 0x015c63e6UL,
+    0xf4516b6bUL, 0x62616c1cUL, 0xd8306585UL, 0x4e0062f2UL, 0xed95066cUL,
+    0x7ba5011bUL, 0xc1f40882UL, 0x57c40ff5UL, 0xc6d9b065UL, 0x50e9b712UL,
+    0xeab8be8bUL, 0x7c88b9fcUL, 0xdf1ddd62UL, 0x492dda15UL, 0xf37cd38cUL,
+    0x654cd4fbUL, 0x5861b24dUL, 0xce51b53aUL, 0x7400bca3UL, 0xe230bbd4UL,
+    0x41a5df4aUL, 0xd795d83dUL, 0x6dc4d1a4UL, 0xfbf4d6d3UL, 0x6ae96943UL,
+    0xfcd96e34UL, 0x468867adUL, 0xd0b860daUL, 0x732d0444UL, 0xe51d0333UL,
+    0x5f4c0aaaUL, 0xc97c0dddUL, 0x3c710550UL, 0xaa410227UL, 0x10100bbeUL,
+    0x86200cc9UL, 0x25b56857UL, 0xb3856f20UL, 0x09d466b9UL, 0x9fe461ceUL,
+    0x0ef9de5eUL, 0x98c9d929UL, 0x2298d0b0UL, 0xb4a8d7c7UL, 0x173db359UL,
+    0x810db42eUL, 0x3b5cbdb7UL, 0xad6cbac0UL, 0x2083b8edUL, 0xb6b3bf9aUL,
+    0x0ce2b603UL, 0x9ad2b174UL, 0x3947d5eaUL, 0xaf77d29dUL, 0x1526db04UL,
+    0x8316dc73UL, 0x120b63e3UL, 0x843b6494UL, 0x3e6a6d0dUL, 0xa85a6a7aUL,
+    0x0bcf0ee4UL, 0x9dff0993UL, 0x27ae000aUL, 0xb19e077dUL, 0x44930ff0UL,
+    0xd2a30887UL, 0x68f2011eUL, 0xfec20669UL, 0x5d5762f7UL, 0xcb676580UL,
+    0x71366c19UL, 0xe7066b6eUL, 0x761bd4feUL, 0xe02bd389UL, 0x5a7ada10UL,
+    0xcc4add67UL, 0x6fdfb9f9UL, 0xf9efbe8eUL, 0x43beb717UL, 0xd58eb060UL,
+    0xe8a3d6d6UL, 0x7e93d1a1UL, 0xc4c2d838UL, 0x52f2df4fUL, 0xf167bbd1UL,
+    0x6757bca6UL, 0xdd06b53fUL, 0x4b36b248UL, 0xda2b0dd8UL, 0x4c1b0aafUL,
+    0xf64a0336UL, 0x607a0441UL, 0xc3ef60dfUL, 0x55df67a8UL, 0xef8e6e31UL,
+    0x79be6946UL, 0x8cb361cbUL, 0x1a8366bcUL, 0xa0d26f25UL, 0x36e26852UL,
+    0x95770cccUL, 0x03470bbbUL, 0xb9160222UL, 0x2f260555UL, 0xbe3bbac5UL,
+    0x280bbdb2UL, 0x925ab42bUL, 0x046ab35cUL, 0xa7ffd7c2UL, 0x31cfd0b5UL,
+    0x8b9ed92cUL, 0x1daede5bUL, 0xb0c2649bUL, 0x26f263ecUL, 0x9ca36a75UL,
+    0x0a936d02UL, 0xa906099cUL, 0x3f360eebUL, 0x85670772UL, 0x13570005UL,
+    0x824abf95UL, 0x147ab8e2UL, 0xae2bb17bUL, 0x381bb60cUL, 0x9b8ed292UL,
+    0x0dbed5e5UL, 0xb7efdc7cUL, 0x21dfdb0bUL, 0xd4d2d386UL, 0x42e2d4f1UL,
+    0xf8b3dd68UL, 0x6e83da1fUL, 0xcd16be81UL, 0x5b26b9f6UL, 0xe177b06fUL,
+    0x7747b718UL, 0xe65a0888UL, 0x706a0fffUL, 0xca3b0666UL, 0x5c0b0111UL,
+    0xff9e658fUL, 0x69ae62f8UL, 0xd3ff6b61UL, 0x45cf6c16UL, 0x78e20aa0UL,
+    0xeed20dd7UL, 0x5483044eUL, 0xc2b30339UL, 0x612667a7UL, 0xf71660d0UL,
+    0x4d476949UL, 0xdb776e3eUL, 0x4a6ad1aeUL, 0xdc5ad6d9UL, 0x660bdf40UL,
+    0xf03bd837UL, 0x53aebca9UL, 0xc59ebbdeUL, 0x7fcfb247UL, 0xe9ffb530UL,
+    0x1cf2bdbdUL, 0x8ac2bacaUL, 0x3093b353UL, 0xa6a3b424UL, 0x0536d0baUL,
+    0x9306d7cdUL, 0x2957de54UL, 0xbf67d923UL, 0x2e7a66b3UL, 0xb84a61c4UL,
+    0x021b685dUL, 0x942b6f2aUL, 0x37be0bb4UL, 0xa18e0cc3UL, 0x1bdf055aUL,
+    0x8def022dUL
+  },
+  {
+    0x00000000UL, 0x41311b19UL, 0x82623632UL, 0xc3532d2bUL, 0x04c56c64UL,
+    0x45f4777dUL, 0x86a75a56UL, 0xc796414fUL, 0x088ad9c8UL, 0x49bbc2d1UL,
+    0x8ae8effaUL, 0xcbd9f4e3UL, 0x0c4fb5acUL, 0x4d7eaeb5UL, 0x8e2d839eUL,
+    0xcf1c9887UL, 0x5112c24aUL, 0x1023d953UL, 0xd370f478UL, 0x9241ef61UL,
+    0x55d7ae2eUL, 0x14e6b537UL, 0xd7b5981cUL, 0x96848305UL, 0x59981b82UL,
+    0x18a9009bUL, 0xdbfa2db0UL, 0x9acb36a9UL, 0x5d5d77e6UL, 0x1c6c6cffUL,
+    0xdf3f41d4UL, 0x9e0e5acdUL, 0xa2248495UL, 0xe3159f8cUL, 0x2046b2a7UL,
+    0x6177a9beUL, 0xa6e1e8f1UL, 0xe7d0f3e8UL, 0x2483dec3UL, 0x65b2c5daUL,
+    0xaaae5d5dUL, 0xeb9f4644UL, 0x28cc6b6fUL, 0x69fd7076UL, 0xae6b3139UL,
+    0xef5a2a20UL, 0x2c09070bUL, 0x6d381c12UL, 0xf33646dfUL, 0xb2075dc6UL,
+    0x715470edUL, 0x30656bf4UL, 0xf7f32abbUL, 0xb6c231a2UL, 0x75911c89UL,
+    0x34a00790UL, 0xfbbc9f17UL, 0xba8d840eUL, 0x79dea925UL, 0x38efb23cUL,
+    0xff79f373UL, 0xbe48e86aUL, 0x7d1bc541UL, 0x3c2ade58UL, 0x054f79f0UL,
+    0x447e62e9UL, 0x872d4fc2UL, 0xc61c54dbUL, 0x018a1594UL, 0x40bb0e8dUL,
+    0x83e823a6UL, 0xc2d938bfUL, 0x0dc5a038UL, 0x4cf4bb21UL, 0x8fa7960aUL,
+    0xce968d13UL, 0x0900cc5cUL, 0x4831d745UL, 0x8b62fa6eUL, 0xca53e177UL,
+    0x545dbbbaUL, 0x156ca0a3UL, 0xd63f8d88UL, 0x970e9691UL, 0x5098d7deUL,
+    0x11a9ccc7UL, 0xd2fae1ecUL, 0x93cbfaf5UL, 0x5cd76272UL, 0x1de6796bUL,
+    0xdeb55440UL, 0x9f844f59UL, 0x58120e16UL, 0x1923150fUL, 0xda703824UL,
+    0x9b41233dUL, 0xa76bfd65UL, 0xe65ae67cUL, 0x2509cb57UL, 0x6438d04eUL,
+    0xa3ae9101UL, 0xe29f8a18UL, 0x21cca733UL, 0x60fdbc2aUL, 0xafe124adUL,
+    0xeed03fb4UL, 0x2d83129fUL, 0x6cb20986UL, 0xab2448c9UL, 0xea1553d0UL,
+    0x29467efbUL, 0x687765e2UL, 0xf6793f2fUL, 0xb7482436UL, 0x741b091dUL,
+    0x352a1204UL, 0xf2bc534bUL, 0xb38d4852UL, 0x70de6579UL, 0x31ef7e60UL,
+    0xfef3e6e7UL, 0xbfc2fdfeUL, 0x7c91d0d5UL, 0x3da0cbccUL, 0xfa368a83UL,
+    0xbb07919aUL, 0x7854bcb1UL, 0x3965a7a8UL, 0x4b98833bUL, 0x0aa99822UL,
+    0xc9fab509UL, 0x88cbae10UL, 0x4f5def5fUL, 0x0e6cf446UL, 0xcd3fd96dUL,
+    0x8c0ec274UL, 0x43125af3UL, 0x022341eaUL, 0xc1706cc1UL, 0x804177d8UL,
+    0x47d73697UL, 0x06e62d8eUL, 0xc5b500a5UL, 0x84841bbcUL, 0x1a8a4171UL,
+    0x5bbb5a68UL, 0x98e87743UL, 0xd9d96c5aUL, 0x1e4f2d15UL, 0x5f7e360cUL,
+    0x9c2d1b27UL, 0xdd1c003eUL, 0x120098b9UL, 0x533183a0UL, 0x9062ae8bUL,
+    0xd153b592UL, 0x16c5f4ddUL, 0x57f4efc4UL, 0x94a7c2efUL, 0xd596d9f6UL,
+    0xe9bc07aeUL, 0xa88d1cb7UL, 0x6bde319cUL, 0x2aef2a85UL, 0xed796bcaUL,
+    0xac4870d3UL, 0x6f1b5df8UL, 0x2e2a46e1UL, 0xe136de66UL, 0xa007c57fUL,
+    0x6354e854UL, 0x2265f34dUL, 0xe5f3b202UL, 0xa4c2a91bUL, 0x67918430UL,
+    0x26a09f29UL, 0xb8aec5e4UL, 0xf99fdefdUL, 0x3accf3d6UL, 0x7bfde8cfUL,
+    0xbc6ba980UL, 0xfd5ab299UL, 0x3e099fb2UL, 0x7f3884abUL, 0xb0241c2cUL,
+    0xf1150735UL, 0x32462a1eUL, 0x73773107UL, 0xb4e17048UL, 0xf5d06b51UL,
+    0x3683467aUL, 0x77b25d63UL, 0x4ed7facbUL, 0x0fe6e1d2UL, 0xccb5ccf9UL,
+    0x8d84d7e0UL, 0x4a1296afUL, 0x0b238db6UL, 0xc870a09dUL, 0x8941bb84UL,
+    0x465d2303UL, 0x076c381aUL, 0xc43f1531UL, 0x850e0e28UL, 0x42984f67UL,
+    0x03a9547eUL, 0xc0fa7955UL, 0x81cb624cUL, 0x1fc53881UL, 0x5ef42398UL,
+    0x9da70eb3UL, 0xdc9615aaUL, 0x1b0054e5UL, 0x5a314ffcUL, 0x996262d7UL,
+    0xd85379ceUL, 0x174fe149UL, 0x567efa50UL, 0x952dd77bUL, 0xd41ccc62UL,
+    0x138a8d2dUL, 0x52bb9634UL, 0x91e8bb1fUL, 0xd0d9a006UL, 0xecf37e5eUL,
+    0xadc26547UL, 0x6e91486cUL, 0x2fa05375UL, 0xe836123aUL, 0xa9070923UL,
+    0x6a542408UL, 0x2b653f11UL, 0xe479a796UL, 0xa548bc8fUL, 0x661b91a4UL,
+    0x272a8abdUL, 0xe0bccbf2UL, 0xa18dd0ebUL, 0x62defdc0UL, 0x23efe6d9UL,
+    0xbde1bc14UL, 0xfcd0a70dUL, 0x3f838a26UL, 0x7eb2913fUL, 0xb924d070UL,
+    0xf815cb69UL, 0x3b46e642UL, 0x7a77fd5bUL, 0xb56b65dcUL, 0xf45a7ec5UL,
+    0x370953eeUL, 0x763848f7UL, 0xb1ae09b8UL, 0xf09f12a1UL, 0x33cc3f8aUL,
+    0x72fd2493UL
+  },
+  {
+    0x00000000UL, 0x376ac201UL, 0x6ed48403UL, 0x59be4602UL, 0xdca80907UL,
+    0xebc2cb06UL, 0xb27c8d04UL, 0x85164f05UL, 0xb851130eUL, 0x8f3bd10fUL,
+    0xd685970dUL, 0xe1ef550cUL, 0x64f91a09UL, 0x5393d808UL, 0x0a2d9e0aUL,
+    0x3d475c0bUL, 0x70a3261cUL, 0x47c9e41dUL, 0x1e77a21fUL, 0x291d601eUL,
+    0xac0b2f1bUL, 0x9b61ed1aUL, 0xc2dfab18UL, 0xf5b56919UL, 0xc8f23512UL,
+    0xff98f713UL, 0xa626b111UL, 0x914c7310UL, 0x145a3c15UL, 0x2330fe14UL,
+    0x7a8eb816UL, 0x4de47a17UL, 0xe0464d38UL, 0xd72c8f39UL, 0x8e92c93bUL,
+    0xb9f80b3aUL, 0x3cee443fUL, 0x0b84863eUL, 0x523ac03cUL, 0x6550023dUL,
+    0x58175e36UL, 0x6f7d9c37UL, 0x36c3da35UL, 0x01a91834UL, 0x84bf5731UL,
+    0xb3d59530UL, 0xea6bd332UL, 0xdd011133UL, 0x90e56b24UL, 0xa78fa925UL,
+    0xfe31ef27UL, 0xc95b2d26UL, 0x4c4d6223UL, 0x7b27a022UL, 0x2299e620UL,
+    0x15f32421UL, 0x28b4782aUL, 0x1fdeba2bUL, 0x4660fc29UL, 0x710a3e28UL,
+    0xf41c712dUL, 0xc376b32cUL, 0x9ac8f52eUL, 0xada2372fUL, 0xc08d9a70UL,
+    0xf7e75871UL, 0xae591e73UL, 0x9933dc72UL, 0x1c259377UL, 0x2b4f5176UL,
+    0x72f11774UL, 0x459bd575UL, 0x78dc897eUL, 0x4fb64b7fUL, 0x16080d7dUL,
+    0x2162cf7cUL, 0xa4748079UL, 0x931e4278UL, 0xcaa0047aUL, 0xfdcac67bUL,
+    0xb02ebc6cUL, 0x87447e6dUL, 0xdefa386fUL, 0xe990fa6eUL, 0x6c86b56bUL,
+    0x5bec776aUL, 0x02523168UL, 0x3538f369UL, 0x087faf62UL, 0x3f156d63UL,
+    0x66ab2b61UL, 0x51c1e960UL, 0xd4d7a665UL, 0xe3bd6464UL, 0xba032266UL,
+    0x8d69e067UL, 0x20cbd748UL, 0x17a11549UL, 0x4e1f534bUL, 0x7975914aUL,
+    0xfc63de4fUL, 0xcb091c4eUL, 0x92b75a4cUL, 0xa5dd984dUL, 0x989ac446UL,
+    0xaff00647UL, 0xf64e4045UL, 0xc1248244UL, 0x4432cd41UL, 0x73580f40UL,
+    0x2ae64942UL, 0x1d8c8b43UL, 0x5068f154UL, 0x67023355UL, 0x3ebc7557UL,
+    0x09d6b756UL, 0x8cc0f853UL, 0xbbaa3a52UL, 0xe2147c50UL, 0xd57ebe51UL,
+    0xe839e25aUL, 0xdf53205bUL, 0x86ed6659UL, 0xb187a458UL, 0x3491eb5dUL,
+    0x03fb295cUL, 0x5a456f5eUL, 0x6d2fad5fUL, 0x801b35e1UL, 0xb771f7e0UL,
+    0xeecfb1e2UL, 0xd9a573e3UL, 0x5cb33ce6UL, 0x6bd9fee7UL, 0x3267b8e5UL,
+    0x050d7ae4UL, 0x384a26efUL, 0x0f20e4eeUL, 0x569ea2ecUL, 0x61f460edUL,
+    0xe4e22fe8UL, 0xd388ede9UL, 0x8a36abebUL, 0xbd5c69eaUL, 0xf0b813fdUL,
+    0xc7d2d1fcUL, 0x9e6c97feUL, 0xa90655ffUL, 0x2c101afaUL, 0x1b7ad8fbUL,
+    0x42c49ef9UL, 0x75ae5cf8UL, 0x48e900f3UL, 0x7f83c2f2UL, 0x263d84f0UL,
+    0x115746f1UL, 0x944109f4UL, 0xa32bcbf5UL, 0xfa958df7UL, 0xcdff4ff6UL,
+    0x605d78d9UL, 0x5737bad8UL, 0x0e89fcdaUL, 0x39e33edbUL, 0xbcf571deUL,
+    0x8b9fb3dfUL, 0xd221f5ddUL, 0xe54b37dcUL, 0xd80c6bd7UL, 0xef66a9d6UL,
+    0xb6d8efd4UL, 0x81b22dd5UL, 0x04a462d0UL, 0x33cea0d1UL, 0x6a70e6d3UL,
+    0x5d1a24d2UL, 0x10fe5ec5UL, 0x27949cc4UL, 0x7e2adac6UL, 0x494018c7UL,
+    0xcc5657c2UL, 0xfb3c95c3UL, 0xa282d3c1UL, 0x95e811c0UL, 0xa8af4dcbUL,
+    0x9fc58fcaUL, 0xc67bc9c8UL, 0xf1110bc9UL, 0x740744ccUL, 0x436d86cdUL,
+    0x1ad3c0cfUL, 0x2db902ceUL, 0x4096af91UL, 0x77fc6d90UL, 0x2e422b92UL,
+    0x1928e993UL, 0x9c3ea696UL, 0xab546497UL, 0xf2ea2295UL, 0xc580e094UL,
+    0xf8c7bc9fUL, 0xcfad7e9eUL, 0x9613389cUL, 0xa179fa9dUL, 0x246fb598UL,
+    0x13057799UL, 0x4abb319bUL, 0x7dd1f39aUL, 0x3035898dUL, 0x075f4b8cUL,
+    0x5ee10d8eUL, 0x698bcf8fUL, 0xec9d808aUL, 0xdbf7428bUL, 0x82490489UL,
+    0xb523c688UL, 0x88649a83UL, 0xbf0e5882UL, 0xe6b01e80UL, 0xd1dadc81UL,
+    0x54cc9384UL, 0x63a65185UL, 0x3a181787UL, 0x0d72d586UL, 0xa0d0e2a9UL,
+    0x97ba20a8UL, 0xce0466aaUL, 0xf96ea4abUL, 0x7c78ebaeUL, 0x4b1229afUL,
+    0x12ac6fadUL, 0x25c6adacUL, 0x1881f1a7UL, 0x2feb33a6UL, 0x765575a4UL,
+    0x413fb7a5UL, 0xc429f8a0UL, 0xf3433aa1UL, 0xaafd7ca3UL, 0x9d97bea2UL,
+    0xd073c4b5UL, 0xe71906b4UL, 0xbea740b6UL, 0x89cd82b7UL, 0x0cdbcdb2UL,
+    0x3bb10fb3UL, 0x620f49b1UL, 0x55658bb0UL, 0x6822d7bbUL, 0x5f4815baUL,
+    0x06f653b8UL, 0x319c91b9UL, 0xb48adebcUL, 0x83e01cbdUL, 0xda5e5abfUL,
+    0xed3498beUL
+  },
+  {
+    0x00000000UL, 0x6567bcb8UL, 0x8bc809aaUL, 0xeeafb512UL, 0x5797628fUL,
+    0x32f0de37UL, 0xdc5f6b25UL, 0xb938d79dUL, 0xef28b4c5UL, 0x8a4f087dUL,
+    0x64e0bd6fUL, 0x018701d7UL, 0xb8bfd64aUL, 0xddd86af2UL, 0x3377dfe0UL,
+    0x56106358UL, 0x9f571950UL, 0xfa30a5e8UL, 0x149f10faUL, 0x71f8ac42UL,
+    0xc8c07bdfUL, 0xada7c767UL, 0x43087275UL, 0x266fcecdUL, 0x707fad95UL,
+    0x1518112dUL, 0xfbb7a43fUL, 0x9ed01887UL, 0x27e8cf1aUL, 0x428f73a2UL,
+    0xac20c6b0UL, 0xc9477a08UL, 0x3eaf32a0UL, 0x5bc88e18UL, 0xb5673b0aUL,
+    0xd00087b2UL, 0x6938502fUL, 0x0c5fec97UL, 0xe2f05985UL, 0x8797e53dUL,
+    0xd1878665UL, 0xb4e03addUL, 0x5a4f8fcfUL, 0x3f283377UL, 0x8610e4eaUL,
+    0xe3775852UL, 0x0dd8ed40UL, 0x68bf51f8UL, 0xa1f82bf0UL, 0xc49f9748UL,
+    0x2a30225aUL, 0x4f579ee2UL, 0xf66f497fUL, 0x9308f5c7UL, 0x7da740d5UL,
+    0x18c0fc6dUL, 0x4ed09f35UL, 0x2bb7238dUL, 0xc518969fUL, 0xa07f2a27UL,
+    0x1947fdbaUL, 0x7c204102UL, 0x928ff410UL, 0xf7e848a8UL, 0x3d58149bUL,
+    0x583fa823UL, 0xb6901d31UL, 0xd3f7a189UL, 0x6acf7614UL, 0x0fa8caacUL,
+    0xe1077fbeUL, 0x8460c306UL, 0xd270a05eUL, 0xb7171ce6UL, 0x59b8a9f4UL,
+    0x3cdf154cUL, 0x85e7c2d1UL, 0xe0807e69UL, 0x0e2fcb7bUL, 0x6b4877c3UL,
+    0xa20f0dcbUL, 0xc768b173UL, 0x29c70461UL, 0x4ca0b8d9UL, 0xf5986f44UL,
+    0x90ffd3fcUL, 0x7e5066eeUL, 0x1b37da56UL, 0x4d27b90eUL, 0x284005b6UL,
+    0xc6efb0a4UL, 0xa3880c1cUL, 0x1ab0db81UL, 0x7fd76739UL, 0x9178d22bUL,
+    0xf41f6e93UL, 0x03f7263bUL, 0x66909a83UL, 0x883f2f91UL, 0xed589329UL,
+    0x546044b4UL, 0x3107f80cUL, 0xdfa84d1eUL, 0xbacff1a6UL, 0xecdf92feUL,
+    0x89b82e46UL, 0x67179b54UL, 0x027027ecUL, 0xbb48f071UL, 0xde2f4cc9UL,
+    0x3080f9dbUL, 0x55e74563UL, 0x9ca03f6bUL, 0xf9c783d3UL, 0x176836c1UL,
+    0x720f8a79UL, 0xcb375de4UL, 0xae50e15cUL, 0x40ff544eUL, 0x2598e8f6UL,
+    0x73888baeUL, 0x16ef3716UL, 0xf8408204UL, 0x9d273ebcUL, 0x241fe921UL,
+    0x41785599UL, 0xafd7e08bUL, 0xcab05c33UL, 0x3bb659edUL, 0x5ed1e555UL,
+    0xb07e5047UL, 0xd519ecffUL, 0x6c213b62UL, 0x094687daUL, 0xe7e932c8UL,
+    0x828e8e70UL, 0xd49eed28UL, 0xb1f95190UL, 0x5f56e482UL, 0x3a31583aUL,
+    0x83098fa7UL, 0xe66e331fUL, 0x08c1860dUL, 0x6da63ab5UL, 0xa4e140bdUL,
+    0xc186fc05UL, 0x2f294917UL, 0x4a4ef5afUL, 0xf3762232UL, 0x96119e8aUL,
+    0x78be2b98UL, 0x1dd99720UL, 0x4bc9f478UL, 0x2eae48c0UL, 0xc001fdd2UL,
+    0xa566416aUL, 0x1c5e96f7UL, 0x79392a4fUL, 0x97969f5dUL, 0xf2f123e5UL,
+    0x05196b4dUL, 0x607ed7f5UL, 0x8ed162e7UL, 0xebb6de5fUL, 0x528e09c2UL,
+    0x37e9b57aUL, 0xd9460068UL, 0xbc21bcd0UL, 0xea31df88UL, 0x8f566330UL,
+    0x61f9d622UL, 0x049e6a9aUL, 0xbda6bd07UL, 0xd8c101bfUL, 0x366eb4adUL,
+    0x53090815UL, 0x9a4e721dUL, 0xff29cea5UL, 0x11867bb7UL, 0x74e1c70fUL,
+    0xcdd91092UL, 0xa8beac2aUL, 0x46111938UL, 0x2376a580UL, 0x7566c6d8UL,
+    0x10017a60UL, 0xfeaecf72UL, 0x9bc973caUL, 0x22f1a457UL, 0x479618efUL,
+    0xa939adfdUL, 0xcc5e1145UL, 0x06ee4d76UL, 0x6389f1ceUL, 0x8d2644dcUL,
+    0xe841f864UL, 0x51792ff9UL, 0x341e9341UL, 0xdab12653UL, 0xbfd69aebUL,
+    0xe9c6f9b3UL, 0x8ca1450bUL, 0x620ef019UL, 0x07694ca1UL, 0xbe519b3cUL,
+    0xdb362784UL, 0x35999296UL, 0x50fe2e2eUL, 0x99b95426UL, 0xfcdee89eUL,
+    0x12715d8cUL, 0x7716e134UL, 0xce2e36a9UL, 0xab498a11UL, 0x45e63f03UL,
+    0x208183bbUL, 0x7691e0e3UL, 0x13f65c5bUL, 0xfd59e949UL, 0x983e55f1UL,
+    0x2106826cUL, 0x44613ed4UL, 0xaace8bc6UL, 0xcfa9377eUL, 0x38417fd6UL,
+    0x5d26c36eUL, 0xb389767cUL, 0xd6eecac4UL, 0x6fd61d59UL, 0x0ab1a1e1UL,
+    0xe41e14f3UL, 0x8179a84bUL, 0xd769cb13UL, 0xb20e77abUL, 0x5ca1c2b9UL,
+    0x39c67e01UL, 0x80fea99cUL, 0xe5991524UL, 0x0b36a036UL, 0x6e511c8eUL,
+    0xa7166686UL, 0xc271da3eUL, 0x2cde6f2cUL, 0x49b9d394UL, 0xf0810409UL,
+    0x95e6b8b1UL, 0x7b490da3UL, 0x1e2eb11bUL, 0x483ed243UL, 0x2d596efbUL,
+    0xc3f6dbe9UL, 0xa6916751UL, 0x1fa9b0ccUL, 0x7ace0c74UL, 0x9461b966UL,
+    0xf10605deUL
+#endif
+  }
+};
index 32e0827..f52ec02 100755 (executable)
--- a/inventory
+++ b/inventory
@@ -24,6 +24,8 @@ ANNOUNCE-2.6.6
 ANNOUNCE-2.6.7
 ANNOUNCE-2.6.8
 ANNOUNCE-2.6.9
+ANNOUNCE-3.0-devel1
+ANNOUNCE-3.0-devel2
 Assemble.c
 bitmap.c
 bitmap.h
@@ -31,6 +33,8 @@ Build.c
 ChangeLog
 config.c
 COPYING
+crc32.c
+crc32.h
 Create.c
 Detail.c
 dlink.c
@@ -44,10 +48,13 @@ inventory
 kernel-patch-2.6.18
 kernel-patch-2.6.18.6
 kernel-patch-2.6.19
+kernel-patch-2.6.25
+kernel-patch-2.6.27
 Kill.c
 makedist
 Makefile
 Manage.c
+managemon.c
 mapfile.c
 md.4
 md5.h
@@ -59,6 +66,8 @@ mdadm.h
 mdadm.spec
 mdassemble.8
 mdassemble.c
+mdmon.c
+mdmon.h
 mdopen.c
 md_p.h
 mdstat.c
@@ -66,17 +75,23 @@ md_u.h
 misc/
 misc/syslog-events
 mkinitramfs
+monitor.c
 Monitor.c
+msg.c
+msg.h
 pwgr.c
 Query.c
 raid5extend.c
 ReadMe.c
 README.initramfs
 restripe.c
+sg_io.c
 sha1.c
 sha1.h
 super0.c
 super1.c
+super-ddf.c
+super-intel.c
 swap_super.c
 sysfs.c
 test
@@ -128,4 +143,5 @@ tests/check
 tests/testdev
 tests/ToTest
 TODO
+udev-md-raid.rules
 util.c
diff --git a/kernel-patch-2.6.25 b/kernel-patch-2.6.25
new file mode 100644 (file)
index 0000000..2329007
--- /dev/null
@@ -0,0 +1,199 @@
+Status: ok
+
+Support adding a spare to a live md array with external metadata.
+
+i.e. extend the 'md/dev-XXX/slot' attribute so that you can
+tell a device to fill a vacant slot in an md array.
+
+
+Signed-off-by: Neil Brown <neilb@suse.de>
+
+### Diffstat output
+ ./drivers/md/md.c        |   44 ++++++++++++++++++++++++++++++++++++++++----
+ ./drivers/md/multipath.c |    7 ++++++-
+ ./drivers/md/raid1.c     |    7 ++++++-
+ ./drivers/md/raid10.c    |   10 ++++++++--
+ ./drivers/md/raid5.c     |   10 ++++++++--
+ 5 files changed, 68 insertions(+), 10 deletions(-)
+
+diff .prev/drivers/md/md.c ./drivers/md/md.c
+--- .prev/drivers/md/md.c      2008-06-05 09:19:56.000000000 +1000
++++ ./drivers/md/md.c  2008-06-10 10:41:21.000000000 +1000
+@@ -1932,7 +1932,7 @@ slot_store(mdk_rdev_t *rdev, const char 
+               slot = -1;
+       else if (e==buf || (*e && *e!= '\n'))
+               return -EINVAL;
+-      if (rdev->mddev->pers) {
++      if (rdev->mddev->pers && slot == -1) {
+               /* Setting 'slot' on an active array requires also
+                * updating the 'rd%d' link, and communicating
+                * with the personality with ->hot_*_disk.
+@@ -1940,8 +1940,6 @@ slot_store(mdk_rdev_t *rdev, const char 
+                * failed/spare devices.  This normally happens automatically,
+                * but not when the metadata is externally managed.
+                */
+-              if (slot != -1)
+-                      return -EBUSY;
+               if (rdev->raid_disk == -1)
+                       return -EEXIST;
+               /* personality does all needed checks */
+@@ -1955,6 +1953,44 @@ slot_store(mdk_rdev_t *rdev, const char 
+               sysfs_remove_link(&rdev->mddev->kobj, nm);
+               set_bit(MD_RECOVERY_NEEDED, &rdev->mddev->recovery);
+               md_wakeup_thread(rdev->mddev->thread);
++      } else if (rdev->mddev->pers) {
++              mdk_rdev_t *rdev2;
++              struct list_head *tmp;
++              /* Activating a spare .. or possibly reactivating
++               * if we ever get bitmaps working here.
++               */
++
++              if (rdev->raid_disk != -1)
++                      return -EBUSY;
++
++              if (rdev->mddev->pers->hot_add_disk == NULL)
++                      return -EINVAL;
++
++              rdev_for_each(rdev2, tmp, rdev->mddev)
++                      if (rdev2->raid_disk == slot)
++                              return -EEXIST;
++
++              rdev->raid_disk = slot;
++              if (test_bit(In_sync, &rdev->flags))
++                      rdev->saved_raid_disk = slot;
++              else
++                      rdev->saved_raid_disk = -1;
++              err = rdev->mddev->pers->
++                      hot_add_disk(rdev->mddev, rdev);
++              if (err != 1) {
++                      rdev->raid_disk = -1;
++                      if (err == 0)
++                              return -EEXIST;
++                      return err;
++              }
++              sprintf(nm, "rd%d", rdev->raid_disk);
++              if (sysfs_create_link(&rdev->mddev->kobj, &rdev->kobj, nm))
++                      printk(KERN_WARNING
++                             "md: cannot register "
++                             "%s for %s\n",
++                             nm, mdname(rdev->mddev));
++
++              /* don't wakeup anyone, leave that to userspace. */
+       } else {
+               if (slot >= rdev->mddev->raid_disks)
+                       return -ENOSPC;
+@@ -4205,7 +4241,7 @@ static int add_new_disk(mddev_t * mddev,
+                       super_types[mddev->major_version].
+                               validate_super(mddev, rdev);
+                       err = mddev->pers->hot_add_disk(mddev, rdev);
+-                      if (err)
++                      if (err < 0)
+                               unbind_rdev_from_array(rdev);
+               }
+               if (err)
+
+diff .prev/drivers/md/multipath.c ./drivers/md/multipath.c
+--- .prev/drivers/md/multipath.c       2008-05-30 14:49:31.000000000 +1000
++++ ./drivers/md/multipath.c   2008-06-10 10:35:03.000000000 +1000
+@@ -284,10 +284,15 @@ static int multipath_add_disk(mddev_t *m
+       int found = 0;
+       int path;
+       struct multipath_info *p;
++      int first = 0;
++      int last = mddev->raid_disks - 1;
++
++      if (rdev->raid_disk >= 0)
++              first = last = rdev->raid_disk;
+       print_multipath_conf(conf);
+-      for (path=0; path<mddev->raid_disks; path++) 
++      for (path = first; path <= last; path++)
+               if ((p=conf->multipaths+path)->rdev == NULL) {
+                       q = rdev->bdev->bd_disk->queue;
+                       blk_queue_stack_limits(mddev->queue, q);
+
+diff .prev/drivers/md/raid10.c ./drivers/md/raid10.c
+--- .prev/drivers/md/raid10.c  2008-05-30 14:49:31.000000000 +1000
++++ ./drivers/md/raid10.c      2008-06-10 10:28:53.000000000 +1000
+@@ -1116,6 +1116,8 @@ static int raid10_add_disk(mddev_t *mdde
+       int found = 0;
+       int mirror;
+       mirror_info_t *p;
++      int first = 0;
++      int last = mddev->raid_disks - 1;
+       if (mddev->recovery_cp < MaxSector)
+               /* only hot-add to in-sync arrays, as recovery is
+@@ -1125,12 +1127,16 @@ static int raid10_add_disk(mddev_t *mdde
+       if (!enough(conf))
+               return 0;
++      if (rdev->raid_disk >= 0)
++              first = last = rdev->raid_disk;
++
+       if (rdev->saved_raid_disk >= 0 &&
++          rdev->saved_raid_disk >= first &&
+           conf->mirrors[rdev->saved_raid_disk].rdev == NULL)
+               mirror = rdev->saved_raid_disk;
+       else
+-              mirror = 0;
+-      for ( ; mirror < mddev->raid_disks; mirror++)
++              mirror = first;
++      for ( ; mirror <= last ; mirror++)
+               if ( !(p=conf->mirrors+mirror)->rdev) {
+                       blk_queue_stack_limits(mddev->queue,
+
+diff .prev/drivers/md/raid1.c ./drivers/md/raid1.c
+--- .prev/drivers/md/raid1.c   2008-05-30 14:49:31.000000000 +1000
++++ ./drivers/md/raid1.c       2008-06-10 10:41:00.000000000 +1000
+@@ -1103,8 +1103,13 @@ static int raid1_add_disk(mddev_t *mddev
+       int found = 0;
+       int mirror = 0;
+       mirror_info_t *p;
++      int first = 0;
++      int last = mddev->raid_disks - 1;
+-      for (mirror=0; mirror < mddev->raid_disks; mirror++)
++      if (rdev->raid_disk >= 0)
++              first = last = rdev->raid_disk;
++
++      for (mirror = first; mirror <= last; mirror++)
+               if ( !(p=conf->mirrors+mirror)->rdev) {
+                       blk_queue_stack_limits(mddev->queue,
+
+diff .prev/drivers/md/raid5.c ./drivers/md/raid5.c
+--- .prev/drivers/md/raid5.c   2008-05-30 14:49:35.000000000 +1000
++++ ./drivers/md/raid5.c       2008-06-10 10:27:51.000000000 +1000
+@@ -4399,21 +4399,27 @@ static int raid5_add_disk(mddev_t *mddev
+       int found = 0;
+       int disk;
+       struct disk_info *p;
++      int first = 0;
++      int last = conf->raid_disks - 1;
+       if (mddev->degraded > conf->max_degraded)
+               /* no point adding a device */
+               return 0;
++      if (rdev->raid_disk >= 0)
++              first = last = rdev->raid_disk;
++
+       /*
+        * find the disk ... but prefer rdev->saved_raid_disk
+        * if possible.
+        */
+       if (rdev->saved_raid_disk >= 0 &&
++          rdev->saved_raid_disk >= first &&
+           conf->disks[rdev->saved_raid_disk].rdev == NULL)
+               disk = rdev->saved_raid_disk;
+       else
+-              disk = 0;
+-      for ( ; disk < conf->raid_disks; disk++)
++              disk = first;
++      for ( ; disk <= last ; disk++)
+               if ((p=conf->disks + disk)->rdev == NULL) {
+                       clear_bit(In_sync, &rdev->flags);
+                       rdev->raid_disk = disk;
diff --git a/kernel-patch-2.6.27 b/kernel-patch-2.6.27
new file mode 100644 (file)
index 0000000..8d0785d
--- /dev/null
@@ -0,0 +1,36 @@
+touch_mnt_namespace when the mount flags change
+
+From: Dan Williams <dan.j.williams@intel.com>
+
+Daemons that need to be launched while the rootfs is read-only can now
+poll /proc/mounts to be notified when their O_RDWR requests may no
+longer end in EROFS.
+
+Cc: Kay Sievers <kay.sievers@vrfy.org>
+Cc: Neil Brown <neilb@suse.de>
+Signed-off-by: Dan Williams <dan.j.williams@intel.com>
+---
+
+ fs/namespace.c |    7 ++++++-
+ 1 files changed, 6 insertions(+), 1 deletions(-)
+
+
+diff --git a/fs/namespace.c b/fs/namespace.c
+index 6e283c9..1bd5ba2 100644
+--- a/fs/namespace.c
++++ b/fs/namespace.c
+@@ -1553,8 +1553,13 @@ static noinline int do_remount(struct nameidata *nd, int flags, int mnt_flags,
+       if (!err)
+               nd->path.mnt->mnt_flags = mnt_flags;
+       up_write(&sb->s_umount);
+-      if (!err)
++      if (!err) {
+               security_sb_post_remount(nd->path.mnt, flags, data);
++
++              spin_lock(&vfsmount_lock);
++              touch_mnt_namespace(nd->path.mnt->mnt_ns);
++              spin_unlock(&vfsmount_lock);
++      }
+       return err;
+ }
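
Note on the polling described in kernel-patch-2.6.27: a daemon started while the root filesystem is read-only can block in poll(2) on an open /proc/mounts descriptor and retry its O_RDWR open once a change is reported. The following is a minimal sketch, assuming the documented behaviour that a mount-table change is signalled as POLLERR and/or POLLPRI on that descriptor. The helper name `wait_for_mount_change` is hypothetical and is not taken from mdmon.

```c
#include <poll.h>

/*
 * Hypothetical helper: wait on a descriptor opened on /proc/mounts.
 * Returns 1 if a mount-table change was signalled, 0 on timeout
 * (or if nothing relevant happened), -1 on error or invalid descriptor.
 */
int wait_for_mount_change(int fd, int timeout_ms)
{
    struct pollfd pfd = { .fd = fd, .events = POLLERR | POLLPRI };
    int n = poll(&pfd, 1, timeout_ms);

    if (n < 0)
        return -1;              /* poll itself failed */
    if (n == 0)
        return 0;               /* timed out, nothing changed */
    if (pfd.revents & POLLNVAL)
        return -1;              /* descriptor is not open */

    return (pfd.revents & (POLLERR | POLLPRI)) ? 1 : 0;
}
```

A caller would open /proc/mounts read-only once, call this helper in a loop, and on a return value of 1 re-read the file and retry whatever open previously failed with EROFS.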
diff --git a/managemon.c b/managemon.c
new file mode 100644 (file)
index 0000000..e02c77e
--- /dev/null
@@ -0,0 +1,711 @@
+/*
+ * mdmon - monitor external metadata arrays
+ *
+ * Copyright (C) 2007-2008 Neil Brown <neilb@suse.de>
+ * Copyright (C) 2007-2008 Intel Corporation
+ *
+ * This program is free software; you can redistribute it and/or modify it
+ * under the terms and conditions of the GNU General Public License,
+ * version 2, as published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope it will be useful, but WITHOUT
+ * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
+ * FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License for
+ * more details.
+ *
+ * You should have received a copy of the GNU General Public License along with
+ * this program; if not, write to the Free Software Foundation, Inc.,
+ * 51 Franklin St - Fifth Floor, Boston, MA 02110-1301 USA.
+ */
+
+/*
+ * The management thread for monitoring active md arrays.
+ * This thread does things which might block such as memory
+ * allocation.
+ * In particular:
+ *
+ * - Find out about new arrays in this container.
+ *   Allocate the data structures and open the files.
+ *
+ *   For this we watch /proc/mdstat and find new arrays with
+ *   metadata type that confirms sharing. e.g. "md4"
+ *   When we find a new array we slip it into the list of
+ *   arrays and signal 'monitor' by writing to a pipe.
+ *
+ * - Respond to reshape requests by allocating new data structures
+ *   and opening new files.
+ *
+ *   These come as a change to raid_disks.  We allocate a new
+ *   version of the data structures and slip it into the list.
+ *   'monitor' will notice and release the old version.
+ *   Changes to level, chunksize, layout.. do not need re-allocation.
+ *   Reductions in raid_disks don't really either, but we handle
+ *   them the same way for consistency.
+ *
+ * - When a device is added to the container, we add it to the metadata
+ *   as a spare.