Since the patch
c76242c5("mdmon: get safe mode delay file descriptor
early"), safe_mode_dalay is set properly by initrd mdmon. But in some
cases with filesystem traffic since the very start of the system, it
might take a while to transit to clean state. Due to fact that new
mdmon does not wait for the old one to exit - it might happen that the
new one switches safe_mode_delay back to seconds, before old one exits.
As the result two mdmons are running concurrently on same array.
Wait for the old mdmon to exit by pinging it with SIGUSR1 signal, just
in case it is sleeping.
Signed-off-by: Pawel Baldysiak <pawel.baldysiak@intel.com>
Signed-off-by: Jes Sorensen <jsorensen@fb.com>
int fd;
int n;
long fl;
+ int rv;
/* first rule of survival... don't off yourself */
if (pid == getpid())
fl &= ~O_NONBLOCK;
fcntl(sock, F_SETFL, fl);
n = read(sock, buf, 100);
- /* Ignore result, it is just the wait that
- * matters
- */
+
+ /* If there is I/O going on it might took some time to get to
+ * clean state. Wait for monitor to exit fully to avoid races.
+ * Ping it with SIGUSR1 in case that it is sleeping */
+ for (n = 0; n < 25; n++) {
+ rv = kill(pid, SIGUSR1);
+ if (rv < 0)
+ break;
+ usleep(200000);
+ }
}
void remove_pidfile(char *devname)