From 0ff86a47924bededd3404df7874146cab7ddfd2c Mon Sep 17 00:00:00 2001
From: Michael Chapman
Date: Wed, 8 Apr 2015 16:51:42 +1000
Subject: [PATCH] qemu_migrate: use nested job when adding NBD to cookie

qemuMigrationCookieAddNBD is usually called from within an async
MIGRATION_OUT or MIGRATION_IN job, so it needs to start a nested job.

(The one exception is during the Begin phase when change protection
isn't enabled, but qemuDomainObjEnterMonitorAsync will behave the same
as qemuDomainObjEnterMonitor in this case.)

This bug was encountered with a libvirt client that repeatedly queries
the disk mirroring block job info during a migration. If one of these
queries occurs just as the Perform migration cookie is baked, libvirt
crashes.

Relevant logs are as follows:

    6701: warning : qemuDomainObjEnterMonitorInternal:1544 : This thread seems
      to be the async job owner; entering monitor without asking for a nested
      job is dangerous
[1] 6701: info : qemuMonitorSend:972 : QEMU_MONITOR_SEND_MSG: mon=0x7fefdc004700
      msg={"execute":"query-block","id":"libvirt-629"}
[2] 6699: info : qemuMonitorIOWrite:503 : QEMU_MONITOR_IO_WRITE: mon=0x7fefdc004700
      buf={"execute":"query-block","id":"libvirt-629"}
[3] 6704: info : qemuMonitorSend:972 : QEMU_MONITOR_SEND_MSG: mon=0x7fefdc004700
      msg={"execute":"query-block-jobs","id":"libvirt-630"}
[4] 6699: info : qemuMonitorJSONIOProcessLine:203 : QEMU_MONITOR_RECV_REPLY:
      mon=0x7fefdc004700 reply={"return": [...], "id": "libvirt-629"}
    6699: error : qemuMonitorJSONIOProcessLine:211 : internal error: Unexpected
      JSON reply '{"return": [...], "id": "libvirt-629"}'

At [1] qemuMonitorBlockStatsUpdateCapacity sends its request, then waits
on mon->notify. At [2] the request is written out to the monitor socket.
At [3] qemuMonitorBlockJobInfo sends its request, and also waits on
mon->notify. The reply from the first request is received at [4].
However, qemuMonitorJSONIOProcessLine is not expecting this reply since
the second request hadn't completed sending. The reply is dropped and an
error is returned.

qemuMonitorIO signals mon->notify twice during its error handling,
waking up both of the threads waiting on it. One of them clears mon->msg
as it exits qemuMonitorSend; the other crashes:

  qemuMonitorSend (mon=0x7fefdc004700, msg=) at qemu/qemu_monitor.c:975
  975         while (!mon->msg->finished) {
  (gdb) print mon->msg
  $1 = (qemuMonitorMessagePtr) 0x0

Signed-off-by: Michael Chapman
(cherry picked from commit 72df8314f02ac575b8407ab1d0d4fbfe82affd9c)
---
 src/qemu/qemu_migration.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/src/qemu/qemu_migration.c b/src/qemu/qemu_migration.c
index 39ca37c1a2..5daf12f38f 100644
--- a/src/qemu/qemu_migration.c
+++ b/src/qemu/qemu_migration.c
@@ -570,7 +570,9 @@ qemuMigrationCookieAddNBD(qemuMigrationCookiePtr mig,
     if (!(stats = virHashCreate(10, virHashValueFree)))
         goto cleanup;
 
-    qemuDomainObjEnterMonitor(driver, vm);
+    if (qemuDomainObjEnterMonitorAsync(driver, vm,
+                                       priv->job.asyncJob) < 0)
+        goto cleanup;
     rc = qemuMonitorBlockStatsUpdateCapacity(priv->mon, stats, false);
     if (qemuDomainObjExitMonitor(driver, vm) < 0)
         goto cleanup;
-- 
2.47.3