Since kernel commit 948b701a607f
(binfmt_misc: add persistent opened binary handler for containers)
kernel allows to load the interpreter at the configuration time.
In case of chroot, it allows to have the interpreter in the host root
filesystem and not to copy it to the chroot filesystem.
Signed-off-by: Laurent Vivier <laurent@vivier.eu>
Message-Id: <20180627205317.10343-3-laurent@vivier.eu>
move credential value to its own variable to be able to manage
more flags
Signed-off-by: Laurent Vivier <laurent@vivier.eu>
Message-Id: <20180627205317.10343-2-laurent@vivier.eu>
It's the old, lgpl vgabios implementation.
Was left in as fallback when we switched to seavgabios, so we could
easily switch back in case we see regressions. It's unused since years
now, reportedly doesn't even build, and lacks support for recently (and
not so recently) added display devices.
Zap it.
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
QNX reportedly requires this to boot.
Should also speed up booting other guests.
Note: Upstream seabios defaults this to 'n' to due to known problems
on physical hardware (qemu not affected), and wouldn't flip the default
to 'y'. So we adjust our local build config accordingly.
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Both bochs-display and ramfb are devices with a simple framebuffer and
no vga emulation or text mode. seavgabios has support for text mode
emulation (at vgabios call level), we are using that to provide some
vga compatibility support for these devices.
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
git shortlog rel-1.11.1..rel-1.11.2
-----------------------------------
Gerd Hoffmann (11):
optionrom: enable non-vga display devices
cbvga: factor out cbvga_setup_modes()
qemu: add bochs-display support
cbvga_setup_modes: use real mode number instead of 0x140
cbvga_list_modes: don't list current mode twice
cbvga_set_mode: disable clearmem in windows x86 emulator.
bochs_display_setup: return error on failure
pmm: use tmp zone on oom
vgasrc: add allocate_pmm()
qemu: add qemu ramfb support
cbvga_set_mode: refine clear display logic
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
When installing a TLB entry, remove any cached version of the
same page in the VTLB. If the existing TLB entry matches, do
not copy into the VTLB, but overwrite it.
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
In get_page_addr_code() when we check whether the TLB entry
is marked as TLB_RECHECK, we should not go down that code
path if the TLB entry is not valid at all (ie the TLB_INVALID
bit is set).
Tested-by: Laurent Vivier <laurent@vivier.eu>
Reported-by: Laurent Vivier <laurent@vivier.eu>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Message-Id: <20180629161731.16239-1-peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
In commit 71b9a45330 we changed the condition we use
to determine whether we need to refill the TLB in
get_page_addr_code() to
if (unlikely(env->tlb_table[mmu_idx][index].addr_code !=
(addr & (TARGET_PAGE_MASK | TLB_INVALID_MASK)))) {
This isn't the right check (it will falsely fail if the
input addr happens to have the low bit corresponding to
TLB_INVALID_MASK set, for instance). Replace it with a
use of the new tlb_hit() function, which is the correct test.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Message-Id: <20180629162122.19376-3-peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
The condition to check whether an address has hit against a particular
TLB entry is not completely trivial. We do this in various places, and
in fact in one place (get_page_addr_code()) we have got the condition
wrong. Abstract it out into new tlb_hit() and tlb_hit_page() inline
functions (one for a known-page-aligned address and one for an
arbitrary address), and use them in all the places where we had the
condition correct.
This is a no-behaviour-change patch; we leave fixing the buggy
code in get_page_addr_code() to a subsequent patch.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Message-Id: <20180629162122.19376-2-peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Commit 0b5c91f ("translate-all: use per-page locking in !user-mode",
2018-06-15) introduced per-page locking. It assumed that the physical
pages corresponding to a TB (at most two pages) are always distinct,
which is wrong. For instance, an xtensa test provided by Max Filippov
is broken by the commit, since the test maps two virtual pages
to the same physical page:
virt1: 7fff, virt2: 8000
phys1 6000fff, phys2 6000000
Fix it by removing the assumption from page_lock_pair.
If the two physical page addresses are equal, we only lock
the PageDesc once. Note that the two callers of page_lock_pair,
namely page_unlock_tb and tb_link_page, are also updated so that
we do not try to unlock the same PageDesc twice.
Fixes: 0b5c91f74f
Reported-by: Max Filippov <jcmvbkbc@gmail.com>
Tested-by: Max Filippov <jcmvbkbc@gmail.com>
Tested-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Emilio G. Cota <cota@braap.org>
Message-Id: <1529944302-14186-1-git-send-email-cota@braap.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
- add bpb/ppa15 features to default cpu model for z196 and later
- rework TOD handling and fix cpu hotplug under tcg
- various fixes
-----BEGIN PGP SIGNATURE-----
iQJGBAABCAAwFiEEw9DWbcNiT/aowBjO3s9rk8bwL68FAls6B/QSHGNvaHVja0By
ZWRoYXQuY29tAAoJEN7Pa5PG8C+vqCIQAII2fFuPicarkERO8DBbn/nzPWMNp20f
UUkhmS5w7Bng3Fq2hL3aihmRijMM7K4+6M4m0/mEf6Z07dQmC5M/FGNsV22/ToFb
wjGUZM/SaqJF+gosEvA/pzhc8GmgPCDF3z2Phbdqsa4Ck33sMKyZIGveH8jF9lDf
8HoljDQ06ckXhaIsX1DZ9I6o6u5CpoCOSLrwuLpVfzFK1fimD04B44GINZg/oJQ0
e++ac2OwyV7OgjdLiNlVJOI5UV7iVrgZRvtGLqLzRCWLJVDl85I2JwhQwx0mHv4Y
Of0SyiY3SF5jiV/FCLdd/k9CxMUzBXhZvq5vi7qtejakiVbVA+W5E1hugUUl794a
gjlsrkKnpFFAW6QU8/4bq88I9e0F4LXLfaK1dCnOsJX3kGIruUjLGfOTM/T2Cuz9
0oJPYgZs0NAxTh+9wsFOlhS0EBr8jczn+DRRQjUPrzWk2INLtl8kKyjh71UW9yl5
vYCZevK9KUuZXNGBsqVn2pS1rKbryoJVm3IZudJXBvyj5EPo4q3Tdxm7NZKGPIlu
e6VXHinhMtk7BZt5J2HNKGFIo5IPCcYQJCgs6+M1wRTHK6VtquVvas0y5aW5Wvl8
0lHI0fmflDoHbPiyRH9xvOS3r/y9xNYoUUG2GxjQqteW7tJAuhYy/rQIvBbVKufL
0xFKC4vdhV7c
=GF+m
-----END PGP SIGNATURE-----
Merge remote-tracking branch 'remotes/cohuck/tags/s390x-20180702' into staging
s390x updates:
- add bpb/ppa15 features to default cpu model for z196 and later
- rework TOD handling and fix cpu hotplug under tcg
- various fixes
# gpg: Signature made Mon 02 Jul 2018 12:09:40 BST
# gpg: using RSA key DECF6B93C6F02FAF
# gpg: Good signature from "Cornelia Huck <conny@cornelia-huck.de>"
# gpg: aka "Cornelia Huck <huckc@linux.vnet.ibm.com>"
# gpg: aka "Cornelia Huck <cornelia.huck@de.ibm.com>"
# gpg: aka "Cornelia Huck <cohuck@kernel.org>"
# gpg: aka "Cornelia Huck <cohuck@redhat.com>"
# Primary key fingerprint: C3D0 D66D C362 4FF6 A8C0 18CE DECF 6B93 C6F0 2FAF
* remotes/cohuck/tags/s390x-20180702:
s390x/tcg: fix locking problem with tcg_s390_tod_updated
s390x/kvm: indicate alignment in legacy_s390_alloc()
s390x/kvm: legacy_s390_alloc() only supports one allocation
s390x/tcg: fix CPU hotplug with single-threaded TCG
s390x/tcg: rearm the CKC timer during migration
s390x/tcg: implement SET CLOCK
s390x/tcg: SET CLOCK COMPARATOR can clear CKC interrupts
s390x/tcg: properly implement the TOD
s390x/tcg: drop tod_basetime
s390x/tod: factor out TOD into separate device
s390x/kvm: pass values instead of pointers to kvm_s390_set_clock_*()
s390x/tcg: avoid overflows in time2tod/tod2time
s390x/cpumodel: default enable bpb and ppa15 for z196 and later
loader: Check access size when calling rom_ptr() to avoid crashes
s390/ipl: fix ipl with -no-reboot
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Fix the --disable-tcg breakage introduced by 8bca9a03ec:
$ configure --disable-tcg
[...]
$ make -C i386-softmmu exec.o
make: Entering directory 'i386-softmmu'
CC exec.o
In file included from source/qemu/exec.c:62:0:
source/qemu/include/exec/ram_addr.h:96:6: error: conflicting types for ‘tb_invalidate_phys_range’
void tb_invalidate_phys_range(ram_addr_t start, ram_addr_t end);
^~~~~~~~~~~~~~~~~~~~~~~~
In file included from source/qemu/exec.c:24:0:
source/qemu/include/exec/exec-all.h:309:6: note: previous declaration of ‘tb_invalidate_phys_range’ was here
void tb_invalidate_phys_range(target_ulong start, target_ulong end);
^~~~~~~~~~~~~~~~~~~~~~~~
source/qemu/exec.c:1043:6: error: conflicting types for ‘tb_invalidate_phys_addr’
void tb_invalidate_phys_addr(AddressSpace *as, hwaddr addr, MemTxAttrs attrs)
^~~~~~~~~~~~~~~~~~~~~~~
In file included from source/qemu/exec.c:24:0:
source/qemu/include/exec/exec-all.h:308:6: note: previous declaration of ‘tb_invalidate_phys_addr’ was here
void tb_invalidate_phys_addr(target_ulong addr);
^~~~~~~~~~~~~~~~~~~~~~~
make: *** [source/qemu/rules.mak:69: exec.o] Error 1
make: Leaving directory 'i386-softmmu'
Tested to build x86_64-softmmu and i386-softmmu targets.
Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Message-id: 20180629200710.27626-1-f4bug@amsat.org
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
"move16 %a0@+,%a1@" and "fmovel (cpid=3) %a0@-,%fpcr"
share the same opcode.
To fix that, backport the fix from binutils:
2005-11-10 Andreas Schwab <schwab@suse.de>
* m68k-dis.c (print_insn_m68k): Only match FPU insns with
coprocessor ID 1.
Reported-by: Thomas Huth <huth@tuxfamily.org>
Signed-off-by: Laurent Vivier <laurent@vivier.eu>
Tested-by: Thomas Huth <huth@tuxfamily.org>
Message-Id: <20180625203559.21370-2-laurent@vivier.eu>
Doesn't build on 32bit clang. And because we run under qemu mutex
anyway they are not needed.
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Tested-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Message-id: 20180627111936.31019-1-kraxel@redhat.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
tcg_s390_tod_updated() is always called with the iothread being locked
(e.g. from S390TODClass->set() e.g. via HELPER(sck) or on incoming
migration). The helper we call takes the lock itself - bad.
Let's change that by factoring out updating the ckc timer. This now looks
much nicer than having to call a helper from another function.
While touching it we also make sure that env->ckc is updated even if the
new value is -1ULL, for now it would not have been modified in that case.
Reported-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20180629170520.13671-1-david@redhat.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
Let's do this for completeness reason, although we don't support e.g.
PCDIMM/NVDIMM, which would use the alignment for placing the memory
region in guest physical memory. But maybe someday we would want to
support something like this - then we don't forget about this if
allowing multiple allocations in legacy_s390_alloc().
Use the same alignment as we would set in qemu_anon_ram_alloc(). Our
fixed address satisfies this alignment (1MB). This implicitly sets the
alignment of the underlying memory region.
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20180628113817.30814-3-david@redhat.com>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
We always allocate at a fixed address, a second allocation can therefore
of course never work. We would simply overwrite mappings.
This can e.g. happen in s390_memory_init(), if trying to allocate more
than > 8TB. Let's just bail out, as there is no need for supporting it
(legacy handling for z/VM).
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20180628113817.30814-2-david@redhat.com>
Reviewed-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
run_on_cpu() doesn't seem to work reliably until the CPU has been fully
created if the single-threaded TCG main loop is already running.
Therefore, hotplugging a CPU under single-threaded TCG does currently
not work. We should use the direct call instead of going via
run_on_cpu().
So let's use run_on_cpu() for KVM only - KVM requires it due to the initial
CPU reset ioctl. As a nice side effect, we get rid of the ifdef.
Reviewed-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20180627134410.4901-10-david@redhat.com>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
If the CPU data is migrated after the TOD clock, the CKC timer of a CPU
is not rearmed. Let's rearm it when loading the CPU state.
Introduce tcg-stub.c just like kvm-stub.c for tcg specific stubs.
Reviewed-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20180627134410.4901-9-david@redhat.com>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
This allows a guest to change its TOD. We already take care of updating
all CKC timers from within S390TODClass.
Use MO_ALIGN to load the operand manually - this will properly trigger a
SPECIFICATION exception.
Acked-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20180627134410.4901-8-david@redhat.com>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
Let's stop the timer and delete any pending CKC IRQ before doing
anything else.
While at it, add a comment why the check for ckc == -1ULL is needed.
Reviewed-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20180627134410.4901-7-david@redhat.com>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
Right now, each CPU has its own TOD. Especially, the TOD will differ
based on creation time of a CPU - e.g. when hotplugging a CPU the times
will differ quite a lot, resulting in stall warnings in the guest.
Let's use a single TOD by implementing our new TOD device. Prepare it
for TOD-clock epoch extension.
Most importantly, whenever we set the TOD, we have to update the CKC
timer.
Introduce "tcg_s390x.h" just like "kvm_s390x.h" for tcg specific
function declarations that should not go into cpu.h.
Reviewed-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20180627134410.4901-6-david@redhat.com>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
Never set to anything but 0.
Reviewed-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20180627134410.4901-5-david@redhat.com>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
Let's treat this like a separate device. TCG will have to store the
actual state/time later on.
Include cpu-qom.h in kvm_s390x.h (due to S390CPU) to compile tod-kvm.c.
Reviewed-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20180627134410.4901-4-david@redhat.com>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
We are going to factor out the TOD into a separate device and use const
pointers for device class functions where possible. We are passing right
now ordinary pointers that should never be touched when setting the TOD.
Let's just pass the values directly.
Note that s390_set_clock() will be removed in a follow-on patch and
therefore its calling convention is not changed.
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20180627134410.4901-3-david@redhat.com>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
Big values for the TOD/ns clock can result in some overflows that can be
avoided. Not all overflows can be handled however, as the conversion either
multiplies by 4.096 or divided by 4.096.
Apply the trick used in the Linux kernel in arch/s390/include/asm/timex.h
for tod_to_ns() and use the same trick also for the conversion in the
other direction.
Reviewed-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20180627134410.4901-2-david@redhat.com>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
Most systems and host kernels provide the necessary building blocks for
bpb and ppa15. We can reverse the logic and default enable those
features, while still allowing to disable it via cpu model.
So let us add bpb and ppa15 to z196 and later default CPU model for the
qemu 3.0 machine. (like -cpu z13). Older machine types (e.g.
s390-ccw-virtio-2.12) will retain the old value and not provide those
bits in the default model.
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Message-Id: <20180626123830.18282-1-borntraeger@de.ibm.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
The rom_ptr() function allows direct access to the ROM blobs that we
load during startup. However, there are currently no checks for the
size of the accesses, so it's currently possible to crash QEMU for
example with:
$ echo "Insane in the mainframe" > /tmp/test.txt
$ s390x-softmmu/qemu-system-s390x -kernel /tmp/test.txt -append xyz
Segmentation fault (core dumped)
$ s390x-softmmu/qemu-system-s390x -kernel /tmp/test.txt -initrd /tmp/test.txt
Segmentation fault (core dumped)
$ echo -n HdrS > /tmp/hdr.txt
$ sparc64-softmmu/qemu-system-sparc64 -kernel /tmp/hdr.txt -initrd /tmp/hdr.txt
Segmentation fault (core dumped)
We need a possibility to check the size of the ROM area that we want
to access, thus let's add a size parameter to the rom_ptr() function
to avoid these problems.
Acked-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
Message-Id: <1530005740-25254-1-git-send-email-thuth@redhat.com>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
kexec/kdump as well as the bootloader use a subcode of diagnose 308
that is supposed to reset the I/O subsystem but not comprise a full
"reboot". With the latest refactoring this is now broken when
-no-reboot is used or when libvirt acts on a reboot QMP event, for
example a virt-install from iso images.
We need to mark these "subsystem resets" as special.
Fixes: a30fb811cb (s390x: refactor reset/reipl handling)
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Message-Id: <20180622102928.173420-1-borntraeger@de.ibm.com>
Acked-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
The xtensa frontend calls get_page_addr_code() from its
itlb_hit_test helper function. This function is really part
of the TCG core's internals, and calling it from a target
helper makes it awkward to make changes to that core code.
It also means that we don't pass the correct retaddr to
tlb_fill(), so we won't correctly handle the case where
an exception is generated.
The helper is used for the instructions IHI, IHU and IPFL.
Change it to call cpu_ldb_code_ra() instead.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
This will reduce the size of the patch in the next patch,
where the context will have to be a pointer.
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Max Filippov <jcmvbkbc@gmail.com>
The usage of DISAS_UPDATE is after noreturn helpers.
It is thus indistinguishable from DISAS_NORETURN.
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Max Filippov <jcmvbkbc@gmail.com>
ISA book documents that the first instruction of zero overhead loop
must fit completely into naturally aligned region of an instruction
fetch unit size. Check that condition and log a message if it's
violated.
Signed-off-by: Max Filippov <jcmvbkbc@gmail.com>
Out-Of-Band handlers need to protect shared state if there is any.
Mention it in the document. Meanwhile, touch up some other places too,
either with better English, or reordering of bullets.
Suggested-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Peter Xu <peterx@redhat.com>
Message-Id: <20180620073223.31964-6-peterx@redhat.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
In my Out-Of-Band test, "check -qcow2 060" fail with this:
--- /home/peterx/git/qemu/tests/qemu-iotests/060.out
+++ /home/peterx/git/qemu/bin/tests/qemu-iotests/060.out.bad
@@ -427,8 +427,8 @@
QMP_VERSION
{"return": {}}
qcow2: Image is corrupt: L2 table offset 0x2a2a2a00 unaligned (L1
index: 0); further non-fatal corruption events will be suppressed
-{"timestamp": {"seconds": TIMESTAMP, "microseconds": TIMESTAMP}, "event": "BLOCK_IMAGE_CORRUPTED", "data": {"device": "", "msg": "L2 table offset 0x2a2a2a0
0 unaligned (L1 index: 0)", "node-name": "drive", "fatal": false}}
read failed: Input/output error
+{"timestamp": {"seconds": TIMESTAMP, "microseconds": TIMESTAMP}, "event": "BLOCK_IMAGE_CORRUPTED", "data": {"device": "", "msg": "L2 table offset 0x2a2a2a0
0 unaligned (L1 index: 0)", "node-name": "drive", "fatal": false}}
{"return": ""}
{"return": {}}
{"timestamp": {"seconds": TIMESTAMP, "microseconds": TIMESTAMP},
"event": "SHUTDOWN", "data": {"guest": false}}
The order of the event and the in/out error line is swapped. I didn't
dig up the reason, but AFAIU what we want to verify is the event rather
than stderr. Let's drop the stderr line directly for this test.
Signed-off-by: Peter Xu <peterx@redhat.com>
Message-Id: <20180620073223.31964-5-peterx@redhat.com>
[Commit message touched up]
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Previously we clean up the queues when we got CLOSED event. It was used
to make sure we won't send leftover replies/events of a old client to a
new client which makes perfect sense. However this will also drop the
replies/events even if the output port of the previous chardev backend
is still open, which can lead to missing of the last replies/events.
Now this patch does an extra operation to flush the response queue
before cleaning up.
In most cases, a QMP session will be based on a bidirectional channel (a
TCP port, for example, we read/write to the same socket handle), so in
port and out port of the backend chardev are fundamentally the same
port. In these cases, it does not really matter much on whether we'll
flush the response queue since flushing will fail anyway. However there
can be cases where in & out ports of the QMP monitor's backend chardev
are separated. Here is an example:
cat $QMP_COMMANDS | qemu -qmp stdio ... | filter_commands
In this case, the backend is fd-typed, and it is connected to stdio
where in port is stdin and out port is stdout. Now if we drop all the
events on the response queue then filter_command process might miss some
events that it might expect. The thing is that, when stdin closes,
stdout might still be there alive!
In practice, I encountered SHUTDOWN event missing when running test with
iotest 087 with Out-Of-Band enabled. Here is one of the ways that this
can happen (after "quit" command is executed and QEMU quits the main
loop):
1. [main thread] QEMU queues a SHUTDOWN event into response queue.
2. "cat" terminates (to distinguish it from the animal, I quote it).
3. [monitor iothread] QEMU's monitor iothread reads EOF from stdin.
4. [monitor iothread] QEMU's monitor iothread calls the CLOSED event
hook for the monitor, which will destroy the response queue of the
monitor, then the SHUTDOWN event is dropped.
5. [main thread] QEMU's main thread cleans up the monitors in
monitor_cleanup(). When trying to flush pending responses, it sees
nothing. SHUTDOWN is lost forever.
Note that before the monitor iothread was introduced, step [4]/[5] could
never happen since the main loop was the only place to detect the EOF
event of stdin and run the CLOSED event hooks. Now things can happen in
parallel in the iothread.
Without this patch, iotest 087 will have ~10% chance to miss the
SHUTDOWN event and fail when with Out-Of-Band enabled:
--- /home/peterx/git/qemu/tests/qemu-iotests/087.out
+++ /home/peterx/git/qemu/bin/tests/qemu-iotests/087.out.bad
@@ -8,7 +8,6 @@
{"return": {}}
{"error": {"class": "GenericError", "desc": "'node-name' must be
specified for the root node"}}
{"return": {}}
-{"timestamp": {"seconds": TIMESTAMP, "microseconds": TIMESTAMP}, "event": "SHUTDOWN", "data": {"guest": false}}
=== Duplicate ID ===
@@ -53,7 +52,6 @@
{"return": {}}
{"return": {}}
{"return": {}}
-{"timestamp": {"seconds": TIMESTAMP, "microseconds": TIMESTAMP}, "event": "SHUTDOWN", "data": {"guest": false}}
This patch fixes the problem.
Fixes: 6d2d563f8c ("qmp: cleanup qmp queues properly", 2018-03-27)
Suggested-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Peter Xu <peterx@redhat.com>
Message-Id: <20180620073223.31964-4-peterx@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
[Commit message and a comment touched up]
Signed-off-by: Markus Armbruster <armbru@redhat.com>
The old names are confusing since both of the old functions are popping
an item from multiple queues rather than a single queue. In that
sense, *_pop_any() suites better than *_pop_one().
Since at it, touch up the function monitor_qmp_response_pop_any() a bit
to let the callers pass in a QMPResponse struct instead of returning a
struct. Change the return value to boolean to mark whether we have
popped a valid response instead.
Suggested-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Peter Xu <peterx@redhat.com>
Message-Id: <20180620073223.31964-3-peterx@redhat.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
It was unclear before on what does the CLOSED event mean. Meanwhile we
add a TODO to fix up the CLOSED event in the future when the in/out
ports are different for a chardev.
CC: Paolo Bonzini <pbonzini@redhat.com>
CC: "Marc-André Lureau" <marcandre.lureau@redhat.com>
CC: Stefan Hajnoczi <stefanha@redhat.com>
CC: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Peter Xu <peterx@redhat.com>
Message-Id: <20180620073223.31964-2-peterx@redhat.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>