block: asynchronously stop the VM on I/O errors
With virtio-blk dataplane, I/O errors might occur while QEMU is not in the main I/O thread. However, it's invalid to call vm_stop when we're neither in a VCPU thread nor in the main I/O thread, even if we were to take the iothread mutex around it.

To avoid this problem, we can raise a request to the main I/O thread, similar to what QEMU does when vm_stop is called from a CPU thread. We know that bdrv_error_action is called from an AIO callback, and the moment at which the callback will fire is not well-defined; it depends on the moment at which the disk or OS finishes the operation, which can happen at any time. Note that QEMU is certainly not in a CPU thread and we do not need to call cpu_stop_current() like vm_stop() does.

However, we need to ensure that any action taken by management will result in correct detection of the error _and_ a running VM. In particular:

- the event must be raised after the iostatus has been set, so that "info block" will return an iostatus that matches the event.

- the VM must be stopped after the iostatus has been set, so that "info block" will return an iostatus that matches the runstate.

The ordering between the STOP and BLOCK_IO_ERROR events is preserved; BLOCK_IO_ERROR is documented to come first.

This makes bdrv_error_action() thread safe (assuming QMP events are, which is attacked by a separate series).

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
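The ordering rules in the message can be modeled with a small, single-threaded sketch. This is not QEMU code: every name in it (RS_*, stop_request_prepare, stop_request, error_action_stop, main_loop_iteration, cont, event_log) is a hypothetical stand-in, and real code would need proper synchronization between the worker thread and the main loop. It only illustrates the idea of recording a stop request for the main loop instead of stopping the VM directly, and of letting a "cont" consume a request that is still pending.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-ins; none of these are QEMU's real definitions. */
typedef enum { RS_RUNNING, RS_IO_ERROR, RS_NONE } RunState;

static RunState runstate = RS_RUNNING;     /* current VM state */
static RunState stop_requested = RS_NONE;  /* pending stop request, if any */
static bool stop_prepared = false;         /* set between prepare and request */
static int iostatus_err = 0;               /* 0 means "ok" */
static char event_log[128];                /* event names in emission order */

static void reset_model(void)
{
    runstate = RS_RUNNING;
    stop_requested = RS_NONE;
    stop_prepared = false;
    iostatus_err = 0;
    event_log[0] = '\0';
}

/* Append "<name> " to the log so the emission order can be inspected. */
static void emit(const char *name)
{
    size_t used = strlen(event_log);
    snprintf(event_log + used, sizeof(event_log) - used, "%s ", name);
}

/* Assumption: "prepare" merely marks that a request is about to be made. */
static void stop_request_prepare(void)
{
    stop_prepared = true;
}

/* Record the request; the VM is not stopped here. */
static void stop_request(RunState state)
{
    assert(stop_prepared);
    stop_prepared = false;
    stop_requested = state;
}

/* Worker-side error path, in the order the commit message describes. */
static void error_action_stop(int error)
{
    iostatus_err = error;        /* 1. iostatus first */
    stop_request_prepare();      /* 2. announce that a stop is coming */
    emit("BLOCK_IO_ERROR");      /* 3. event after the iostatus is set */
    stop_request(RS_IO_ERROR);   /* 4. the main loop stops the VM later */
}

/* Main-loop side: act on a pending request, if there is one. */
static void main_loop_iteration(void)
{
    if (stop_requested != RS_NONE) {
        runstate = stop_requested;
        stop_requested = RS_NONE;
        emit("STOP");
    }
}

/* Management "cont": consume a pending stop and keep the events paired. */
static void cont(void)
{
    if (stop_requested != RS_NONE) {
        stop_requested = RS_NONE;
        emit("STOP");
    } else if (runstate == RS_RUNNING) {
        return;                  /* nothing to resume */
    }
    runstate = RS_RUNNING;
    emit("RESUME");
}
```

In this model, calling error_action_stop() and then main_loop_iteration() logs BLOCK_IO_ERROR before STOP and leaves the VM in the I/O error state with the iostatus already set. If cont() runs before the main loop gets to the request, the log shows BLOCK_IO_ERROR, STOP, RESUME and the VM keeps running.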
parent 74892d2468
commit 2bd3bce8ef
21
block.c
@@ -3626,10 +3626,27 @@ void bdrv_error_action(BlockDriverState *bs, BlockErrorAction action,
                        bool is_read, int error)
 {
     assert(error >= 0);
-    bdrv_emit_qmp_error_event(bs, QEVENT_BLOCK_IO_ERROR, action, is_read);
+
     if (action == BDRV_ACTION_STOP) {
-        vm_stop(RUN_STATE_IO_ERROR);
+        /* First set the iostatus, so that "info block" returns an iostatus
+         * that matches the events raised so far (an additional error iostatus
+         * is fine, but not a lost one).
+         */
         bdrv_iostatus_set_err(bs, error);
+
+        /* Then raise the request to stop the VM and the event.
+         * qemu_system_vmstop_request_prepare has two effects.  First,
+         * it ensures that the STOP event always comes after the
+         * BLOCK_IO_ERROR event.  Second, it ensures that even if management
+         * can observe the STOP event and do a "cont" before the STOP
+         * event is issued, the VM will not stop.  In this case, vm_start()
+         * also ensures that the STOP/RESUME pair of events is emitted.
+         */
+        qemu_system_vmstop_request_prepare();
+        bdrv_emit_qmp_error_event(bs, QEVENT_BLOCK_IO_ERROR, action, is_read);
+        qemu_system_vmstop_request(RUN_STATE_IO_ERROR);
+    } else {
+        bdrv_emit_qmp_error_event(bs, QEVENT_BLOCK_IO_ERROR, action, is_read);
     }
 }
 
@@ -62,7 +62,7 @@ Data:
 - "action": action that has been taken, it's one of the following (json-string):
     "ignore": error has been ignored
     "report": error has been reported to the device
-    "stop": error caused VM to be stopped
+    "stop": the VM is going to stop because of the error
 
 Example:
 
@@ -1,7 +1,12 @@
 #include "qemu-common.h"
 #include "sysemu/sysemu.h"
 
-int vm_stop(RunState state)
+void qemu_system_vmstop_request_prepare(void)
 {
     abort();
 }
+
+void qemu_system_vmstop_request(RunState state)
+{
+    abort();
+}