We need some of the fields without having to poison everything else.
Signed-off-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
If the multifd_send_threads is not created when migration is failed,
multifd_save_cleanup would be called twice. In this senario, the
multifd_send_state is accessed after it has been released, the result
is that the source VM is crashing down.
Here is the coredump stack:
Program received signal SIGSEGV, Segmentation fault.
0x00005629333a78ef in multifd_send_terminate_threads (err=err@entry=0x0) at migration/ram.c:1012
1012 MultiFDSendParams *p = &multifd_send_state->params[i];
#0 0x00005629333a78ef in multifd_send_terminate_threads (err=err@entry=0x0) at migration/ram.c:1012
#1 0x00005629333ab8a9 in multifd_save_cleanup () at migration/ram.c:1028
#2 0x00005629333abaea in multifd_new_send_channel_async (task=0x562935450e70, opaque=<optimized out>) at migration/ram.c:1202
#3 0x000056293373a562 in qio_task_complete (task=task@entry=0x562935450e70) at io/task.c:196
#4 0x000056293373a6e0 in qio_task_thread_result (opaque=0x562935450e70) at io/task.c:111
#5 0x00007f475d4d75a7 in g_idle_dispatch () from /usr/lib64/libglib-2.0.so.0
#6 0x00007f475d4da9a9 in g_main_context_dispatch () from /usr/lib64/libglib-2.0.so.0
#7 0x0000562933785b33 in glib_pollfds_poll () at util/main-loop.c:219
#8 os_host_main_loop_wait (timeout=<optimized out>) at util/main-loop.c:242
#9 main_loop_wait (nonblocking=nonblocking@entry=0) at util/main-loop.c:518
#10 0x00005629334c5acf in main_loop () at vl.c:1810
#11 0x000056293334d7bb in main (argc=<optimized out>, argv=<optimized out>, envp=<optimized out>) at vl.c:4471
If the multifd_send_threads is not created when migration is failed.
In this senario, we don't call multifd_save_cleanup in multifd_new_send_channel_async.
Signed-off-by: Zhimin Feng <fengzhimin1@huawei.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
This function returns true if we are in the middle of a migration.
It is like migration_is_setup_or_active() with CANCELLING and COLO.
Adapt all callers that are needed.
Signed-off-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Test that this sequence works:
- launch source
- launch target
- start migration
- cancel migration
- relaunch target
- do migration again
Signed-off-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
If we do a cancel, we got out without one error, but we can't do the
rest of the output as in a normal situation.
Signed-off-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Be sure that we are not doing neither read/write after shutdown of the
QEMUFile.
Signed-off-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Add Artist graphics
Fix main memory allocation
Improve LDCW emulation wrt real hw
-----BEGIN PGP SIGNATURE-----
iQFRBAABCgA7FiEEekgeeIaLTbaoWgXAZN846K9+IV8FAl4vMa8dHHJpY2hhcmQu
aGVuZGVyc29uQGxpbmFyby5vcmcACgkQZN846K9+IV/3PwgAmw+Q/rT2kT19vMHd
1XHjK1WJNx4SFRQWxwMsbxDoWyFslUZH5G0z0l7zB1eG7ONEZBttUVyOVnyPH5Q5
DUmfHMvS838lHkLU+OWPbfbwB8WZzfwUwHi3u8ljRBM52RZYf+m69/yMRd8H+PmF
bDq3zCviAqvIIvWdSmPEfsx9v4WmrE2aULkKN2aZsHYHzkuHmPWfSYe2dzxTcO3z
zDXoscUVmtVk29jpwHV4gM7zl9uk8jyvfeeB2fZ2/EY4qgZ+iHrhtnglfCdCCDr0
G1Q5vugJ70lFkYM2EzpyU+leHUREXN7xqYm5Iycv4neO+aS2FFkNpxCZvPofihHo
rUFcOw==
=kj86
-----END PGP SIGNATURE-----
Merge remote-tracking branch 'remotes/rth/tags/pull-pa-20200127' into staging
Improve LASI emulation
Add Artist graphics
Fix main memory allocation
Improve LDCW emulation wrt real hw
# gpg: Signature made Mon 27 Jan 2020 18:53:35 GMT
# gpg: using RSA key 7A481E78868B4DB6A85A05C064DF38E8AF7E215F
# gpg: issuer "richard.henderson@linaro.org"
# gpg: Good signature from "Richard Henderson <richard.henderson@linaro.org>" [full]
# Primary key fingerprint: 7A48 1E78 868B 4DB6 A85A 05C0 64DF 38E8 AF7E 215F
* remotes/rth/tags/pull-pa-20200127:
target/hppa: Allow, but diagnose, LDCW aligned only mod 4
hw/hppa/machine: Map the PDC memory region with higher priority
hw/hppa/machine: Restrict the total memory size to 3GB
hw/hppa/machine: Correctly check the firmware is in PDC range
hppa: Add emulation of Artist graphics
seabios-hppa: update to latest version
hppa: Switch to tulip NIC by default
hppa: add emulation of LASI PS2 controllers
ps2: accept 'Set Key Make and Break' commands
hppa: Add support for LASI chip with i82596 NIC
hw/hppa/dino.c: Improve emulation of Dino PCI chip
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Fuzzing the Linux kernel with syzkaller allowed to find how to crash qemu
using a special SCSI_IOCTL_SEND_COMMAND. It hits the assertion in
ide_dma_cb() introduced in the commit a718978ed5 in July 2015.
Currently this bug is not reproduced by the unit tests.
Let's improve the ide-test to cover more PRDT cases including one
that causes this particular qemu crash.
The test is developed according to the Programming Interface for
Bus Master IDE Controller (Revision 1.0 5/16/94).
Signed-off-by: Alexander Popov <alex.popov@linux.com>
Message-id: 20191223175117.508990-3-alex.popov@linux.com
Signed-off-by: John Snow <jsnow@redhat.com>
The commit a718978ed5 from July 2015 introduced the assertion which
implies that the size of successful DMA transfers handled in ide_dma_cb()
should be multiple of 512 (the size of a sector). But guest systems can
initiate DMA transfers that don't fit this requirement.
For fixing that let's check the number of bytes prepared for the transfer
by the prepare_buf() handler. The code in ide_dma_cb() must behave
according to the Programming Interface for Bus Master IDE Controller
(Revision 1.0 5/16/94):
1. If PRDs specified a smaller size than the IDE transfer
size, then the Interrupt and Active bits in the Controller
status register are not set (Error Condition).
2. If the size of the physical memory regions was equal to
the IDE device transfer size, the Interrupt bit in the
Controller status register is set to 1, Active bit is set to 0.
3. If PRDs specified a larger size than the IDE transfer size,
the Interrupt and Active bits in the Controller status register
are both set to 1.
Signed-off-by: Alexander Popov <alex.popov@linux.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Message-id: 20191223175117.508990-2-alex.popov@linux.com
Signed-off-by: John Snow <jsnow@redhat.com>
The PA-RISC 1.1 specification says that LDCW must be aligned mod 16
or the operation is undefined. However, real hardware only generates
an unaligned access trap for unaligned mod 4.
Match real hardware, but diagnose with GUEST_ERROR a violation of
the specification.
At the same time fix a bug in the initialization of mop, where the
size was specified twice, and another to free the zero temporary.
Tested-by: Helge Deller <deller@gmx.de>
Reported-by: Helge Deller <deller@gmx.de>
Suggested-by: John David Anglin <dave.anglin@bell.net>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
The region in range [0xf0000000 - 0xf1000000] is the PDC area
(Processor Dependent Code), where the firmware is loaded.
This region has higher priority than the main memory.
When the machine has more than 3840MB of RAM, there is an
overlap. Since the PDC is closer to the CPU in the bus
hierarchy, it gets accessed first, and the CPU does not have
access to the RAM in this range.
To model the same behavior and keep a simple memory layout,
reduce the priority of the RAM region. The PDC region ends
overlapping the RAM.
Acked-by: Helge Deller <deller@gmx.de>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Message-Id: <20200109000525.24744-4-f4bug@amsat.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
The hardware expects DIMM slots of 1 or 2 GB, allowing up to
4 GB of memory. We want to accept the same amount of memory the
hardware can deal with. DIMMs of 768MB are not available.
However we have to deal with a firmware limitation: currently
SeaBIOS only supports 32-bit, and expects the RAM size in a
32-bit register. When using a 4GB configuration, the 32-bit
register get truncated and we report a size of 0MB to SeaBIOS,
which ends halting the machine:
$ qemu-system-hppa -m 4g -serial stdio
SeaBIOS: Machine configured with too little memory (0 MB), minimum is 16 MB.
SeaBIOS wants SYSTEM HALT.
The easiest way is to restrict the machine to 3GB of memory.
Acked-by: Helge Deller <deller@gmx.de>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Message-Id: <20200109000525.24744-3-f4bug@amsat.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
The firmware has to reside in the PDC range. If the Elf file
expects to load it below FIRMWARE_START, it is incorrect,
regardless the RAM size.
Acked-by: Helge Deller <deller@gmx.de>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Message-Id: <20200109000525.24744-2-f4bug@amsat.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
This adds emulation of Artist graphics good enough to get a text
console on both Linux and HP-UX. The X11 server from HP-UX also works.
Adjust boot-serial-test to disable graphics, so that SeaBIOS outputs
to the serial port, as expected by the test.
Signed-off-by: Sven Schnelle <svens@stackframe.org>
Message-Id: <20191220211512.3289-6-svens@stackframe.org>
[rth: Merge Helge's test for machine->enable_graphics]
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Helge Deller (13):
Add PDC_MEM_MAP and ENTRY_INIT_SRCH_FRST for OSF/MkLinux
Return non-existant BTLB for PDC_BLOCK_TLB
Add serial, parallel and LAN port support of LASI chip
Implement ENTRY_IO_BBLOCK_IN IODC function
Do not print \r on parisc SeaBIOS
Fix serial ports and add PDC_MODEL functions for special instructions enablement
Implement SeaBIOS returning additional addresses. Fixes HP-UX boot.
Fix mod_pgs (number of pages) for graphic cards
Merge pull request #3 from svenschnelle/sti
Merge pull request #4 from svenschnelle/parisc-qemu-4.1.0
parisc: Implement PDC rendenzvous
parisc: Improve soft power button emulation
parisc: Fix line wrapping in STI console code
Sven Schnelle (7):
parisc: fix PDC info for graphics adapter
parisc: add missing header guard to hppa.h
parisc: add LASI PS/2 emulation.
parisc: Add STI support
parisc: wire up graphics console
parisc: Add support for setting STI screen resolution
parisc: support LASI RTC register
Required for STI and LASI support. Also adds a few Bugfixes.
Signed-off-by: Sven Schnelle <svens@stackframe.org>
Message-Id: <20191220211512.3289-7-svens@stackframe.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Most HP PA-RISC machines have a Digital DS21142/43 Tulip network card,
only some very latest generation machines have an e1000 NIC.
Since qemu now provides an emulated tulip card, use that one instead.
Signed-off-by: Helge Deller <deller@gmx.de>
Message-Id: <20191221222530.GB27803@ls3530.fritz.box>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Sven Schnelle <svens@stackframe.org>
Message-Id: <20191220211512.3289-5-svens@stackframe.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
HP-UX sends both the 'Set key make and break (0xfc) and
'Set all key typematic make and break' (0xfa). QEMU response
with 'Resend' as it doesn't handle these commands. HP-UX than
reports an PS/2 max retransmission exceeded error. Add these
commands and just reply with ACK.
Signed-off-by: Sven Schnelle <svens@stackframe.org>
Message-Id: <20191220211512.3289-4-svens@stackframe.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
LASI is a built-in multi-I/O chip which supports serial, parallel,
network (Intel i82596 Apricot), sound and other functionalities.
LASI has been used in many HP PARISC machines.
This patch adds the necessary parts to allow Linux and HP-UX to detect
LASI and the network card.
Signed-off-by: Helge Deller <deller@gmx.de>
Signed-off-by: Sven Schnelle <svens@stackframe.org>
Message-Id: <20191220211512.3289-3-svens@stackframe.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
- iscsi: Cap block count from GET LBA STATUS (CVE-2020-1711)
- AioContext fixes in QMP commands for backup and bitmaps
- iotests fixes
-----BEGIN PGP SIGNATURE-----
iQIcBAABAgAGBQJeLyLGAAoJEH8JsnLIjy/WK3sP/jc+rZwTLQ/1RbF/vQBlnR+B
6Ff25xwHqF6FL2vR2ldsfUtzqxuxKGl2KJMv07YbvnKljiefOR8r4sCVgGUGjB4R
rpMAIu/7qjhE7/ybyibYUm8WxblP+v+ZAyuyK2KVC9GFizWkDXV+ArBeEEDTPX29
owN79UsZBcs+38TpQnr2fzW6LE9KhRlC3A+LIb9kd+KyrUosB+xCQBHxVu1eDiub
jahM+i3CN/NubpKmJXsZX8u+wn7pI1+1kEF2upBMcjxTIX1VTXxUDZs09sdYYU9p
5CMkPL9VC4qaI5fjp5KnFUlR5vppQudoV94GkNMboScuylEavhQ/izJuc3FLP113
EWAZB0aRv8zlcBffhDrFzj642sZV4Rm0tSFzHdBnPLAvWAC9OvrztsTNv2E7oNCV
lV6AfTiuNf9BtI9NsxwRyTuhIz+BfllrRFmVzualAQkwL9oxi8RnJbduw1uVzaYf
WmxIDvnhgKrHAdR/BtFQ1bml5HkQnflvxuIHNvJk4qENyo0/2PhrUi7eQJ//1I9A
bURXp3zrOcNm9kyoorIrSwktbxKG002NPu9+5QUWWdsRLzmftiy0IQnEBx/lDSPA
FH/CWwOukoV+z3qZgW8JnxnS5FXHHUDkdiAtV5mdN4YO9wN3IAojYfkeXQMnGjT/
5u47vAA+5Kkv9oMIbsQ/
=tsNA
-----END PGP SIGNATURE-----
Merge remote-tracking branch 'remotes/kevin/tags/for-upstream' into staging
Block layer patches:
- iscsi: Cap block count from GET LBA STATUS (CVE-2020-1711)
- AioContext fixes in QMP commands for backup and bitmaps
- iotests fixes
# gpg: Signature made Mon 27 Jan 2020 17:49:58 GMT
# gpg: using RSA key 7F09B272C88F2FD6
# gpg: Good signature from "Kevin Wolf <kwolf@redhat.com>" [full]
# Primary key fingerprint: DC3D EB15 9A9A F95D 3D74 56FE 7F09 B272 C88F 2FD6
* remotes/kevin/tags/for-upstream:
iscsi: Don't access non-existent scsi_lba_status_descriptor
iscsi: Cap block count from GET LBA STATUS (CVE-2020-1711)
block/backup: fix memory leak in bdrv_backup_top_append()
iotests: Test handling of AioContexts with some blockdev actions
blockdev: Return bs to the proper context on snapshot abort
blockdev: Acquire AioContext on dirty bitmap functions
block/backup-top: Don't acquire context while dropping top
blockdev: honor bdrv_try_set_aio_context() context requirements
blockdev: unify qmp_blockdev_backup and blockdev-backup transaction paths
blockdev: unify qmp_drive_backup and drive-backup transaction paths
blockdev: fix coding style issues in drive_backup_prepare
iotests: Add more "skip_if_unsupported" statements to the python tests
iotests.py: Let wait_migration wait even more
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
The tests of the dino chip with the Online-diagnostics CD
("ODE DINOTEST") now succeeds.
Additionally add some qemu trace events.
Signed-off-by: Helge Deller <deller@gmx.de>
Signed-off-by: Sven Schnelle <svens@stackframe.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Message-Id: <20191220211512.3289-2-svens@stackframe.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
In iscsi_co_block_status(), we may have received num_descriptors == 0
from the iscsi server. Therefore, we can't unconditionally access
lbas->descriptors[0]. Add the missing check.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Felipe Franciosi <felipe@nutanix.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Reviewed-by: John Snow <jsnow@redhat.com>
Reviewed-by: Peter Lieven <pl@kamp.de>
When querying an iSCSI server for the provisioning status of blocks (via
GET LBA STATUS), Qemu only validates that the response descriptor zero's
LBA matches the one requested. Given the SCSI spec allows servers to
respond with the status of blocks beyond the end of the LUN, Qemu may
have its heap corrupted by clearing/setting too many bits at the end of
its allocmap for the LUN.
A malicious guest in control of the iSCSI server could carefully program
Qemu's heap (by selectively setting the bitmap) and then smash it.
This limits the number of bits that iscsi_co_block_status() will try to
update in the allocmap so it can't overflow the bitmap.
Fixes: CVE-2020-1711
Cc: qemu-stable@nongnu.org
Signed-off-by: Felipe Franciosi <felipe@nutanix.com>
Signed-off-by: Peter Turschmid <peter.turschm@nutanix.com>
Signed-off-by: Raphael Norwitz <raphael.norwitz@nutanix.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
bdrv_open_driver() allocates bs->opaque according to drv->instance_size.
There is no need to allocate it and overwrite opaque in
bdrv_backup_top_append().
Reproducer:
$ QTEST_QEMU_BINARY=./x86_64-softmmu/qemu-system-x86_64 valgrind -q --leak-check=full tests/test-replication -p /replication/secondary/start
==29792== 24 bytes in 1 blocks are definitely lost in loss record 52 of 226
==29792== at 0x483AB1A: calloc (vg_replace_malloc.c:762)
==29792== by 0x4B07CE0: g_malloc0 (in /usr/lib64/libglib-2.0.so.0.6000.7)
==29792== by 0x12BAB9: bdrv_open_driver (block.c:1289)
==29792== by 0x12BEA9: bdrv_new_open_driver (block.c:1359)
==29792== by 0x1D15CB: bdrv_backup_top_append (backup-top.c:190)
==29792== by 0x1CC11A: backup_job_create (backup.c:439)
==29792== by 0x1CD542: replication_start (replication.c:544)
==29792== by 0x1401B9: replication_start_all (replication.c:52)
==29792== by 0x128B50: test_secondary_start (test-replication.c:427)
...
Fixes: 7df7868b96 ("block: introduce backup-top filter driver")
Signed-off-by: Eiichi Tsukata <devel@etsukata.com>
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Includes the following tests:
- Adding a dirty bitmap.
* RHBZ: 1782175
- Starting a drive-mirror to an NBD-backed target.
* RHBZ: 1746217, 1773517
- Aborting an external snapshot transaction.
* RHBZ: 1779036
- Aborting a blockdev backup transaction.
* RHBZ: 1782111
For each one of them, a VM with a number of disks running in an
IOThread AioContext is used.
Signed-off-by: Sergio Lopez <slp@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
external_snapshot_abort() calls to bdrv_set_backing_hd(), which
returns state->old_bs to the main AioContext, as it's intended to be
used then the BDS is going to be released. As that's not the case when
aborting an external snapshot, return it to the AioContext it was
before the call.
This issue can be triggered by issuing a transaction with two actions,
a proper blockdev-snapshot-sync and a bogus one, so the second will
trigger a transaction abort. This results in a crash with an stack
trace like this one:
#0 0x00007fa1048b28df in __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:50
#1 0x00007fa10489ccf5 in __GI_abort () at abort.c:79
#2 0x00007fa10489cbc9 in __assert_fail_base
(fmt=0x7fa104a03300 "%s%s%s:%u: %s%sAssertion `%s' failed.\n%n", assertion=0x5572240b44d8 "bdrv_get_aio_context(old_bs) == bdrv_get_aio_context(new_bs)", file=0x557224014d30 "block.c", line=2240, function=<optimized out>) at assert.c:92
#3 0x00007fa1048aae96 in __GI___assert_fail
(assertion=assertion@entry=0x5572240b44d8 "bdrv_get_aio_context(old_bs) == bdrv_get_aio_context(new_bs)", file=file@entry=0x557224014d30 "block.c", line=line@entry=2240, function=function@entry=0x5572240b5d60 <__PRETTY_FUNCTION__.31620> "bdrv_replace_child_noperm") at assert.c:101
#4 0x0000557223e631f8 in bdrv_replace_child_noperm (child=0x557225b9c980, new_bs=new_bs@entry=0x557225c42e40) at block.c:2240
#5 0x0000557223e68be7 in bdrv_replace_node (from=0x557226951a60, to=0x557225c42e40, errp=0x5572247d6138 <error_abort>) at block.c:4196
#6 0x0000557223d069c4 in external_snapshot_abort (common=0x557225d7e170) at blockdev.c:1731
#7 0x0000557223d069c4 in external_snapshot_abort (common=0x557225d7e170) at blockdev.c:1717
#8 0x0000557223d09013 in qmp_transaction (dev_list=<optimized out>, has_props=<optimized out>, props=0x557225cc7d70, errp=errp@entry=0x7ffe704c0c98) at blockdev.c:2360
#9 0x0000557223e32085 in qmp_marshal_transaction (args=<optimized out>, ret=<optimized out>, errp=0x7ffe704c0d08) at qapi/qapi-commands-transaction.c:44
#10 0x0000557223ee798c in do_qmp_dispatch (errp=0x7ffe704c0d00, allow_oob=<optimized out>, request=<optimized out>, cmds=0x5572247d3cc0 <qmp_commands>) at qapi/qmp-dispatch.c:132
#11 0x0000557223ee798c in qmp_dispatch (cmds=0x5572247d3cc0 <qmp_commands>, request=<optimized out>, allow_oob=<optimized out>) at qapi/qmp-dispatch.c:175
#12 0x0000557223e06141 in monitor_qmp_dispatch (mon=0x557225c69ff0, req=<optimized out>) at monitor/qmp.c:120
#13 0x0000557223e0678a in monitor_qmp_bh_dispatcher (data=<optimized out>) at monitor/qmp.c:209
#14 0x0000557223f2f366 in aio_bh_call (bh=0x557225b9dc60) at util/async.c:117
#15 0x0000557223f2f366 in aio_bh_poll (ctx=ctx@entry=0x557225b9c840) at util/async.c:117
#16 0x0000557223f32754 in aio_dispatch (ctx=0x557225b9c840) at util/aio-posix.c:459
#17 0x0000557223f2f242 in aio_ctx_dispatch (source=<optimized out>, callback=<optimized out>, user_data=<optimized out>) at util/async.c:260
#18 0x00007fa10913467d in g_main_dispatch (context=0x557225c28e80) at gmain.c:3176
#19 0x00007fa10913467d in g_main_context_dispatch (context=context@entry=0x557225c28e80) at gmain.c:3829
#20 0x0000557223f31808 in glib_pollfds_poll () at util/main-loop.c:219
#21 0x0000557223f31808 in os_host_main_loop_wait (timeout=<optimized out>) at util/main-loop.c:242
#22 0x0000557223f31808 in main_loop_wait (nonblocking=<optimized out>) at util/main-loop.c:518
#23 0x0000557223d13201 in main_loop () at vl.c:1828
#24 0x0000557223bbfb82 in main (argc=<optimized out>, argv=<optimized out>, envp=<optimized out>) at vl.c:4504
RHBZ: https://bugzilla.redhat.com/show_bug.cgi?id=1779036
Signed-off-by: Sergio Lopez <slp@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Dirty map addition and removal functions are not acquiring to BDS
AioContext, while they may call to code that expects it to be
acquired.
This may trigger a crash with a stack trace like this one:
#0 0x00007f0ef146370f in __GI_raise (sig=sig@entry=6)
at ../sysdeps/unix/sysv/linux/raise.c:50
#1 0x00007f0ef144db25 in __GI_abort () at abort.c:79
#2 0x0000565022294dce in error_exit
(err=<optimized out>, msg=msg@entry=0x56502243a730 <__func__.16350> "qemu_mutex_unlock_impl") at util/qemu-thread-posix.c:36
#3 0x00005650222950ba in qemu_mutex_unlock_impl
(mutex=mutex@entry=0x5650244b0240, file=file@entry=0x565022439adf "util/async.c", line=line@entry=526) at util/qemu-thread-posix.c:108
#4 0x0000565022290029 in aio_context_release
(ctx=ctx@entry=0x5650244b01e0) at util/async.c:526
#5 0x000056502221cd08 in bdrv_can_store_new_dirty_bitmap
(bs=bs@entry=0x5650244dc820, name=name@entry=0x56502481d360 "bitmap1", granularity=granularity@entry=65536, errp=errp@entry=0x7fff22831718)
at block/dirty-bitmap.c:542
#6 0x000056502206ae53 in qmp_block_dirty_bitmap_add
(errp=0x7fff22831718, disabled=false, has_disabled=<optimized out>, persistent=<optimized out>, has_persistent=true, granularity=65536, has_granularity=<optimized out>, name=0x56502481d360 "bitmap1", node=<optimized out>) at blockdev.c:2894
#7 0x000056502206ae53 in qmp_block_dirty_bitmap_add
(node=<optimized out>, name=0x56502481d360 "bitmap1", has_granularity=<optimized out>, granularity=<optimized out>, has_persistent=true, persistent=<optimized out>, has_disabled=false, disabled=false, errp=0x7fff22831718) at blockdev.c:2856
#8 0x00005650221847a3 in qmp_marshal_block_dirty_bitmap_add
(args=<optimized out>, ret=<optimized out>, errp=0x7fff22831798)
at qapi/qapi-commands-block-core.c:651
#9 0x0000565022247e6c in do_qmp_dispatch
(errp=0x7fff22831790, allow_oob=<optimized out>, request=<optimized out>, cmds=0x565022b32d60 <qmp_commands>) at qapi/qmp-dispatch.c:132
#10 0x0000565022247e6c in qmp_dispatch
(cmds=0x565022b32d60 <qmp_commands>, request=<optimized out>, allow_oob=<optimized out>) at qapi/qmp-dispatch.c:175
#11 0x0000565022166061 in monitor_qmp_dispatch
(mon=0x56502450faa0, req=<optimized out>) at monitor/qmp.c:145
#12 0x00005650221666fa in monitor_qmp_bh_dispatcher
(data=<optimized out>) at monitor/qmp.c:234
#13 0x000056502228f866 in aio_bh_call (bh=0x56502440eae0)
at util/async.c:117
#14 0x000056502228f866 in aio_bh_poll (ctx=ctx@entry=0x56502440d7a0)
at util/async.c:117
#15 0x0000565022292c54 in aio_dispatch (ctx=0x56502440d7a0)
at util/aio-posix.c:459
#16 0x000056502228f742 in aio_ctx_dispatch
(source=<optimized out>, callback=<optimized out>, user_data=<optimized out>) at util/async.c:260
#17 0x00007f0ef5ce667d in g_main_dispatch (context=0x56502449aa40)
at gmain.c:3176
#18 0x00007f0ef5ce667d in g_main_context_dispatch
(context=context@entry=0x56502449aa40) at gmain.c:3829
#19 0x0000565022291d08 in glib_pollfds_poll () at util/main-loop.c:219
#20 0x0000565022291d08 in os_host_main_loop_wait
(timeout=<optimized out>) at util/main-loop.c:242
#21 0x0000565022291d08 in main_loop_wait (nonblocking=<optimized out>)
at util/main-loop.c:518
#22 0x00005650220743c1 in main_loop () at vl.c:1828
#23 0x0000565021f20a72 in main
(argc=<optimized out>, argv=<optimized out>, envp=<optimized out>)
at vl.c:4504
Fix this by acquiring the AioContext at qmp_block_dirty_bitmap_add()
and qmp_block_dirty_bitmap_add().
RHBZ: https://bugzilla.redhat.com/show_bug.cgi?id=1782175
Signed-off-by: Sergio Lopez <slp@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
All paths that lead to bdrv_backup_top_drop(), except for the call
from backup_clean(), imply that the BDS AioContext has already been
acquired, so doing it there too can potentially lead to QEMU hanging
on AIO_WAIT_WHILE().
An easy way to trigger this situation is by issuing a two actions
transaction, with a proper and a bogus blockdev-backup, so the second
one will trigger a rollback. This will trigger a hang with an stack
trace like this one:
#0 0x00007fb680c75016 in __GI_ppoll (fds=0x55e74580f7c0, nfds=1, timeout=<optimized out>,
timeout@entry=0x0, sigmask=sigmask@entry=0x0) at ../sysdeps/unix/sysv/linux/ppoll.c:39
#1 0x000055e743386e09 in ppoll (__ss=0x0, __timeout=0x0, __nfds=<optimized out>, __fds=<optimized out>)
at /usr/include/bits/poll2.h:77
#2 0x000055e743386e09 in qemu_poll_ns
(fds=<optimized out>, nfds=<optimized out>, timeout=<optimized out>) at util/qemu-timer.c:336
#3 0x000055e743388dc4 in aio_poll (ctx=0x55e7458925d0, blocking=blocking@entry=true)
at util/aio-posix.c:669
#4 0x000055e743305dea in bdrv_flush (bs=bs@entry=0x55e74593c0d0) at block/io.c:2878
#5 0x000055e7432be58e in bdrv_close (bs=0x55e74593c0d0) at block.c:4017
#6 0x000055e7432be58e in bdrv_delete (bs=<optimized out>) at block.c:4262
#7 0x000055e7432be58e in bdrv_unref (bs=bs@entry=0x55e74593c0d0) at block.c:5644
#8 0x000055e743316b9b in bdrv_backup_top_drop (bs=bs@entry=0x55e74593c0d0) at block/backup-top.c:273
#9 0x000055e74331461f in backup_job_create
(job_id=0x0, bs=bs@entry=0x55e7458d5820, target=target@entry=0x55e74589f640, speed=0, sync_mode=MIRROR_SYNC_MODE_FULL, sync_bitmap=sync_bitmap@entry=0x0, bitmap_mode=BITMAP_SYNC_MODE_ON_SUCCESS, compress=false, filter_node_name=0x0, on_source_error=BLOCKDEV_ON_ERROR_REPORT, on_target_error=BLOCKDEV_ON_ERROR_REPORT, creation_flags=0, cb=0x0, opaque=0x0, txn=0x0, errp=0x7ffddfd1efb0) at block/backup.c:478
#10 0x000055e74315bc52 in do_backup_common
(backup=backup@entry=0x55e746c066d0, bs=bs@entry=0x55e7458d5820, target_bs=target_bs@entry=0x55e74589f640, aio_context=aio_context@entry=0x55e7458a91e0, txn=txn@entry=0x0, errp=errp@entry=0x7ffddfd1efb0)
at blockdev.c:3580
#11 0x000055e74315c37c in do_blockdev_backup
(backup=backup@entry=0x55e746c066d0, txn=0x0, errp=errp@entry=0x7ffddfd1efb0)
at /usr/src/debug/qemu-kvm-4.2.0-2.module+el8.2.0+5135+ed3b2489.x86_64/./qapi/qapi-types-block-core.h:1492
#12 0x000055e74315c449 in blockdev_backup_prepare (common=0x55e746a8de90, errp=0x7ffddfd1f018)
at blockdev.c:1885
#13 0x000055e743160152 in qmp_transaction
(dev_list=<optimized out>, has_props=<optimized out>, props=0x55e7467fe2c0, errp=errp@entry=0x7ffddfd1f088) at blockdev.c:2340
#14 0x000055e743287ff5 in qmp_marshal_transaction
(args=<optimized out>, ret=<optimized out>, errp=0x7ffddfd1f0f8)
at qapi/qapi-commands-transaction.c:44
#15 0x000055e74333de6c in do_qmp_dispatch
(errp=0x7ffddfd1f0f0, allow_oob=<optimized out>, request=<optimized out>, cmds=0x55e743c28d60 <qmp_commands>) at qapi/qmp-dispatch.c:132
#16 0x000055e74333de6c in qmp_dispatch
(cmds=0x55e743c28d60 <qmp_commands>, request=<optimized out>, allow_oob=<optimized out>)
at qapi/qmp-dispatch.c:175
#17 0x000055e74325c061 in monitor_qmp_dispatch (mon=0x55e745908030, req=<optimized out>)
at monitor/qmp.c:145
#18 0x000055e74325c6fa in monitor_qmp_bh_dispatcher (data=<optimized out>) at monitor/qmp.c:234
#19 0x000055e743385866 in aio_bh_call (bh=0x55e745807ae0) at util/async.c:117
#20 0x000055e743385866 in aio_bh_poll (ctx=ctx@entry=0x55e7458067a0) at util/async.c:117
#21 0x000055e743388c54 in aio_dispatch (ctx=0x55e7458067a0) at util/aio-posix.c:459
#22 0x000055e743385742 in aio_ctx_dispatch
(source=<optimized out>, callback=<optimized out>, user_data=<optimized out>) at util/async.c:260
#23 0x00007fb68543e67d in g_main_dispatch (context=0x55e745893a40) at gmain.c:3176
#24 0x00007fb68543e67d in g_main_context_dispatch (context=context@entry=0x55e745893a40) at gmain.c:3829
#25 0x000055e743387d08 in glib_pollfds_poll () at util/main-loop.c:219
#26 0x000055e743387d08 in os_host_main_loop_wait (timeout=<optimized out>) at util/main-loop.c:242
#27 0x000055e743387d08 in main_loop_wait (nonblocking=<optimized out>) at util/main-loop.c:518
#28 0x000055e74316a3c1 in main_loop () at vl.c:1828
#29 0x000055e743016a72 in main (argc=<optimized out>, argv=<optimized out>, envp=<optimized out>)
at vl.c:4504
Fix this by not acquiring the AioContext there, and ensuring all paths
leading to it have it already acquired (backup_clean()).
RHBZ: https://bugzilla.redhat.com/show_bug.cgi?id=1782111
Signed-off-by: Sergio Lopez <slp@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
bdrv_try_set_aio_context() requires that the old context is held, and
the new context is not held. Fix all the occurrences where it's not
done this way.
Suggested-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Sergio Lopez <slp@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Issuing a blockdev-backup from qmp_blockdev_backup takes a slightly
different path than when it's issued from a transaction. In the code,
this is manifested as some redundancy between do_blockdev_backup() and
blockdev_backup_prepare().
This change unifies both paths, merging do_blockdev_backup() and
blockdev_backup_prepare(), and changing qmp_blockdev_backup() to
create a transaction instead of calling do_backup_common() direcly.
As a side-effect, now qmp_blockdev_backup() is executed inside a
drained section, as it happens when creating a blockdev-backup
transaction. This change is visible from the user's perspective, as
the job gets paused and immediately resumed before starting the actual
work.
Signed-off-by: Sergio Lopez <slp@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Issuing a drive-backup from qmp_drive_backup takes a slightly
different path than when it's issued from a transaction. In the code,
this is manifested as some redundancy between do_drive_backup() and
drive_backup_prepare().
This change unifies both paths, merging do_drive_backup() and
drive_backup_prepare(), and changing qmp_drive_backup() to create a
transaction instead of calling do_backup_common() direcly.
As a side-effect, now qmp_drive_backup() is executed inside a drained
section, as it happens when creating a drive-backup transaction. This
change is visible from the user's perspective, as the job gets paused
and immediately resumed before starting the actual work.
Also fix tests 141, 185 and 219 to cope with the extra
JOB_STATUS_CHANGE lines.
Signed-off-by: Sergio Lopez <slp@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Fix a couple of minor coding style issues in drive_backup_prepare.
Signed-off-by: Sergio Lopez <slp@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
The python code already contains a possibility to skip tests if the
corresponding driver is not available in the qemu binary - use it
in more spots to avoid that the tests are failing if the driver has
been disabled.
While we're at it, we can now also remove some of the old checks that
were using iotests.supports_quorum() - and which were apparently not
working as expected since the tests aborted instead of being skipped
when "quorum" was missing in the QEMU binary.
Signed-off-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
The "migration completed" event may be sent (on the source, to be
specific) before the migration is actually completed, so the VM runstate
will still be "finish-migrate" instead of "postmigrate". So ask the
users of VM.wait_migration() to specify the final runstate they desire
and then poll the VM until it has reached that state. (This should be
over very quickly, so busy polling is fine.)
Without this patch, I see intermittent failures in the new iotest 280
under high system load. I have not yet seen such failures with other
iotests that use VM.wait_migration() and query-status afterwards, but
maybe they just occur even more rarely, or it is because they also wait
on the destination VM to be running.
Signed-off-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
The logic was inverted and reported running if the cpu was stopped.
Let's fix that.
Signed-off-by: Janosch Frank <frankja@linux.ibm.com>
Fixes: d1b468bc88 ("s390x/tcg: implement SIGP SENSE RUNNING STATUS")
Reviewed-by: David Hildenbrand <david@redhat.com>
Message-Id: <20200124134818.9981-1-frankja@linux.ibm.com>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
While working on the "Enable adapter interruption suppression again"
recently, I had to discover that the meaning of get_machine_class()
and the related *_allowed() wrappers is not very obvious. Add a more
verbose comment here to clarify how these should be used.
Signed-off-by: Thomas Huth <thuth@redhat.com>
Message-Id: <20200123170256.12386-1-thuth@redhat.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
The separate pointer is now redundant.
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20200123232248.1800-6-richard.henderson@linaro.org>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
I believe that the separate allocation of DisasFields from DisasContext
was meant to limit the places from which we could access fields. But
that plan did not go unchanged, and since DisasContext contains a pointer
to fields, the substructure is accessible everywhere.
By allocating the substructure with DisasContext, we improve the locality
of the accesses by avoiding one level of pointer chasing. In addition,
we avoid a dangling pointer to stack allocated memory, diagnosed by static
checkers.
Launchpad: https://bugs.launchpad.net/bugs/1661815
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20200123232248.1800-5-richard.henderson@linaro.org>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
All callers pass s->fields, so we might as well pass s directly.
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20200123232248.1800-4-richard.henderson@linaro.org>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
The DisasFields data is available from DisasContext.
We do not need to pass a separate argument.
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20200123232248.1800-3-richard.henderson@linaro.org>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
We will want to include the struct in DisasContext.
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20200123232248.1800-2-richard.henderson@linaro.org>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
The AIS feature has been disabled late in the v2.10 development cycle since
there were some issues with migration (see commit 3f2d07b3b0 -
"s390x/ais: for 2.10 stable: disable ais facility"). We originally wanted
to enable it again for newer machine types, but apparently we forgot to do
this so far. Let's do it now for the machines that support proper CPU models.
Buglink: https://bugzilla.redhat.com/show_bug.cgi?id=1756946
Signed-off-by: Thomas Huth <thuth@redhat.com>
Message-Id: <20200122101437.5069-1-thuth@redhat.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Tested-by: Matthew Rosato <mjrosato@linux.ibm.com>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
Commit ae71ed8610 replaced the use of global max_cpus variable
with a machine property, but introduced a unnecessary ifdef, as
this block is already in the 'not CONFIG_USER_ONLY' branch part:
86 #if defined(CONFIG_USER_ONLY)
87
...
106 #else /* !CONFIG_USER_ONLY */
107
...
292 static void do_ext_interrupt(CPUS390XState *env)
293 {
...
313 #ifndef CONFIG_USER_ONLY
314 MachineState *ms = MACHINE(qdev_get_machine());
315 unsigned int max_cpus = ms->smp.max_cpus;
316 #endif
To ease code review, remove the duplicated preprocessor macro,
and move the declarations at the beginning of the statement.
Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Message-Id: <20200121110349.25842-6-philmd@redhat.com>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
We currently check (by error) if the passed-in Error pointer errp
is non-null and return after realizing the first child of the
event facility in that case. Symptom is that 'virsh shutdown'
does not work, as the sclpquiesce device is not realized.
Fix this by (correctly) checking the local Error err.
Reported-by: Christian Borntraeger <borntraeger@de.ibm.com>
Fixes: 3d508334dd ("s390x/event-facility: Fix realize() error API violations")
Message-Id: <20200121095506.8537-1-cohuck@redhat.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Tested-by: Christian Borntraeger <borntraeger@de.ibm.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>