Run the tests on a RAM disk. It's still a UFS file system and is backed
by 20GB of disk, but this avoids a lot of I/O. Even though we disable
fsync, our tests do a lot of directory manipulations, some of which
force file system metadata to disk and flush slow device write caches
on UFS. This was a bottleneck that prevented effective scaling beyond 2
CPUs. Now we can use 4 CPUs, as on other OSes, for a huge speedup.