79590 Commits

Author SHA1 Message Date
Cornelia Huck
3ff3c5d317 hw: add compat machines for 5.2
Add 5.2 machine types for arm/i440fx/q35/s390x/spapr.

Reviewed-by: Andrew Jones <drjones@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Greg Kurz <groug@kaod.org>
Acked-by: Christian Borntraeger <borntraeger@de.ibm.com>
Acked-by: David Gibson <david@gibson.dropbear.id.au>
Acked-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
Message-Id: <20200819144016.281156-1-cohuck@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
2020-08-19 10:45:48 -04:00
Michal Privoznik
c556600598 qmp: Expose MachineClass::default_ram_id
If a management application (like Libvirt) want's to preserve
migration ability and switch to '-machine memory-backend' it
needs to set exactly the same RAM id as QEMU would. Since the id
is machine type dependant, expose it under 'query-machines'
result. Some machine types don't have the attribute set (riscv
family for example), therefore the QMP attribute must be
optional.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Message-Id: <9384422f63fe594a54d801f9cb4539b1d2ce9b67.1590481402.git.mprivozn@redhat.com>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
[ehabkost: updated doc to "since 5.2"]
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
2020-08-19 10:45:21 -04:00
Philippe Mathieu-Daudé
0aca03a3f1 qdev: Document qdev_prop_set_drive_err() return value
Since commit 73ac1aac39 qdev_prop_set_drive_err() returns
a boolean value. Document it.

Fixes: 73ac1aac39 "Make functions taking Error ** return bool, not void"
Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Message-Id: <20200720121659.31886-1-f4bug@amsat.org>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
2020-08-19 10:44:29 -04:00
Gerd Hoffmann
c180ff4c79 seabios: update to 1.14-final
Due to 1.14 release being delayed it missed qemu 5.1.
The last snapshot has all code changes though, so this
changes only the version string in the binaries.

shortlog
========

Kevin O'Connor (1):
      docs: Note v1.14.0 release

Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
2020-08-19 08:35:24 +02:00
Peter Maydell
672b2f2695 Open 5.2 development tree
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2020-08-18 13:44:04 +01:00
Greg Kurz
3110f0ee19 spapr/xive: Use xive_source_esb_len()
static inline size_t xive_source_esb_len(XiveSource *xsrc)
{
    return (1ull << xsrc->esb_shift) * xsrc->nr_irqs;
}

Signed-off-by: Greg Kurz <groug@kaod.org>
Message-Id: <159733969034.320580.6571451425779179477.stgit@bahia.lan>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2020-08-14 13:35:45 +10:00
Greg Kurz
37035df51e nvram: Exit QEMU if NVRAM cannot contain all -prom-env data
Since commit 61f20b9dc5b7 ("spapr_nvram: Pre-initialize the NVRAM to
support the -prom-env parameter"), pseries machines can pre-initialize
the "system" partition in the NVRAM with the data passed to all -prom-env
parameters on the QEMU command line.

In this case it is assumed that all the data fits in 64 KiB, but the user
can easily pass more and crash QEMU:

$ qemu-system-ppc64 -M pseries $(for ((x=0;x<128;x++)); do \
  echo -n " -prom-env " ; printf "%0.sx" {1..1024}; \
  done) # this requires ~128 Kib
malloc(): corrupted top size
Aborted (core dumped)

This happens because we don't check if all the prom-env data fits in
the NVRAM and chrp_nvram_set_var() happily memcpy() it passed the
buffer.

This crash affects basically all ppc/ppc64 machine types that use -prom-env:
- pseries (all versions)
- g3beige
- mac99

and also sparc/sparc64 machine types:
- LX
- SPARCClassic
- SPARCbook
- SS-10
- SS-20
- SS-4
- SS-5
- SS-600MP
- Voyager
- sun4u
- sun4v

Add a max_len argument to chrp_nvram_create_system_partition() so that
it can check the available size before writing to memory.

Since NVRAM is populated at machine init, it seems reasonable to consider
this error as fatal. So, instead of reporting an error when we detect that
the NVRAM is too small and adapt all machine types to handle it, we simply
exit QEMU in all cases. This is still better than crashing. If someone
wants another behavior, I guess this can be reworked later.

Tested with:

$ yes q | \
  (for arch in ppc ppc64 sparc sparc64; do \
       echo == $arch ==; \
       qemu=${arch}-softmmu/qemu-system-$arch; \
       for mach in $($qemu -M help | awk '! /^Supported/ { print $1 }'); do \
           echo $mach; \
           $qemu -M $mach -monitor stdio -nodefaults -nographic \
           $(for ((x=0;x<128;x++)); do \
                 echo -n " -prom-env " ; printf "%0.sx" {1..1024}; \
             done) >/dev/null; \
        done; echo; \
   done)

Without the patch, affected machine types cause QEMU to report some
memory corruption and crash:

malloc(): corrupted top size

free(): invalid size

*** stack smashing detected ***: terminated

With the patch, QEMU prints the following message and exits:

NVRAM is too small. Try to pass less data to -prom-env

It seems that the conditions for the crash have always existed, but it
affects pseries, the machine type I care for, since commit 61f20b9dc5b7
only.

Fixes: 61f20b9dc5b7 ("spapr_nvram: Pre-initialize the NVRAM to support the -prom-env parameter")
RHBZ: https://bugzilla.redhat.com/show_bug.cgi?id=1867739
Reported-by: John Snow <jsnow@redhat.com>
Reviewed-by: Laurent Vivier <laurent@vivier.eu>
Signed-off-by: Greg Kurz <groug@kaod.org>
Message-Id: <159736033937.350502.12402444542194031035.stgit@bahia.lan>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2020-08-14 13:34:31 +10:00
Stefan Hajnoczi
44277bf914 aio-posix: keep aio_notify_me disabled during polling
Polling only monitors the ctx->notified field and does not need the
ctx->notifier EventNotifier to be signalled. Keep ctx->aio_notify_me
disabled while polling to avoid unnecessary EventNotifier syscalls.

This optimization improves virtio-blk 4KB random read performance by
18%. The following results are with an IOThread and the null-co block
driver:

Test         IOPS   Error
Before  244518.62 ± 1.20%
After   290706.11 ± 0.44%

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Message-id: 20200806131802.569478-4-stefanha@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2020-08-13 13:34:14 +01:00
Stefan Hajnoczi
601829f88e async: always set ctx->notified in aio_notify()
aio_notify() does not set ctx->notified when called with
ctx->aio_notify_me disabled. Therefore aio_notify_me needs to be enabled
during polling.

This is suboptimal since expensive event_notifier_set(&ctx->notifier)
and event_notifier_test_and_clear(&ctx->notifier) calls are required
when ctx->aio_notify_me is enabled.

Change aio_notify() so that aio->notified is always set, regardless of
ctx->aio_notify_me. This will make polling cheaper since
ctx->aio_notify_me can remain disabled. Move the
event_notifier_test_and_clear() to the fd handler function (which is now
no longer an empty function so "dummy" has been dropped from its name).

The next patch takes advantage of this by optimizing polling in
util/aio-posix.c.

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Message-id: 20200806131802.569478-3-stefanha@redhat.com

[Paolo Bonzini pointed out that the smp_wmb() in aio_notify_accept()
should be smp_wb() but the comment should be smp_wmb() instead of
smp_wb(). Fixed.
--Stefan]

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2020-08-13 13:32:59 +01:00
Stefan Hajnoczi
c13be5a1bf async: rename event_notifier_dummy_cb/poll()
The event_notifier_*() prefix can be confused with the EventNotifier
APIs that are also called event_notifier_*().

Rename the functions to aio_context_notifier_*() to make it clear that
they relate to the AioContext::notifier field.

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Message-id: 20200806131802.569478-2-stefanha@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2020-08-13 13:26:12 +01:00
Greg Kurz
1118b6b727 spapr/xive: Simplify error handling of kvmppc_xive_cpu_synchronize_state()
Now that kvmppc_xive_cpu_get_state() returns negative on error, use that
and get rid of the temporary Error object and error_propagate().

Signed-off-by: Greg Kurz <groug@kaod.org>
Message-Id: <159707852916.1489912.8376334685349668124.stgit@bahia.lan>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2020-08-13 21:09:38 +10:00
Greg Kurz
61203f2b35 ppc/xive: Simplify error handling in xive_tctx_realize()
Now that kvmppc_xive_cpu_connect() returns a negative errno on failure,
use that and get rid of the local_err boilerplate.

Signed-off-by: Greg Kurz <groug@kaod.org>
Message-Id: <159707852234.1489912.16410314514265848075.stgit@bahia.lan>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2020-08-13 21:07:28 +10:00
Greg Kurz
6cdc0e2063 spapr/xive: Simplify error handling in kvmppc_xive_connect()
Now that all these functions return a negative errno on failure, check
that and get rid of the local_err boilerplate.

Signed-off-by: Greg Kurz <groug@kaod.org>
Message-Id: <159707851537.1489912.1030839306195472651.stgit@bahia.lan>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2020-08-13 21:06:44 +10:00
Greg Kurz
2a8100cb61 ppc/xive: Fix error handling in vmstate_xive_tctx_*() callbacks
Now that kvmppc_xive_cpu_get_state() and kvmppc_xive_cpu_set_state()
return negative errnos on failures, use that instead local_err because
it is the recommended practice. Also return that instead of -1 since
vmstate expects negative errnos.

Signed-off-by: Greg Kurz <groug@kaod.org>
Message-Id: <159707850840.1489912.14912810818646455474.stgit@bahia.lan>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2020-08-13 21:05:06 +10:00
Greg Kurz
a845a54cbe spapr/xive: Fix error handling in kvmppc_xive_post_load()
Now that all these functions return a negative errno on failure, check
that because it is preferred to local_err. And most of all, propagate it
because vmstate expects negative errnos.

Signed-off-by: Greg Kurz <groug@kaod.org>
Message-Id: <159707850148.1489912.18355118622296682631.stgit@bahia.lan>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2020-08-13 21:04:05 +10:00
Greg Kurz
42a92d925d spapr/kvm: Fix error handling in kvmppc_xive_pre_save()
Now that kvmppc_xive_get_queues() returns a negative errno on failure, check
with that because it is preferred to local_err. And most of all, propagate
it because vmstate expects negative errnos.

Signed-off-by: Greg Kurz <groug@kaod.org>
Message-Id: <159707849455.1489912.6034461176847728064.stgit@bahia.lan>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2020-08-13 21:03:35 +10:00
Greg Kurz
d55daadcb8 spapr/xive: Rework error handling of kvmppc_xive_set_source_config()
Since kvm_device_access() returns a negative errno on failure, convert
kvmppc_xive_set_source_config() to use it for error checking. This allows
to get rid of the local_err boilerplate.

Propagate the return value so that callers may use it as well to check
failures.

Signed-off-by: Greg Kurz <groug@kaod.org>
Message-Id: <159707848764.1489912.17078842252160674523.stgit@bahia.lan>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2020-08-13 21:00:52 +10:00
Greg Kurz
d53482a73b spapr/xive: Rework error handling in kvmppc_xive_get_queues()
Since kvmppc_xive_get_queue_config() has a return value, convert
kvmppc_xive_get_queues() to use it for error checking. This allows
to get rid of the local_err boiler plate.

Propagate the return value so that callers may use it as well to check
failures.

Signed-off-by: Greg Kurz <groug@kaod.org>
Message-Id: <159707848069.1489912.14879208798696134531.stgit@bahia.lan>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2020-08-13 21:00:52 +10:00
Greg Kurz
f9a548edf2 spapr/xive: Rework error handling of kvmppc_xive_[gs]et_queue_config()
Since kvm_device_access() returns a negative errno on failure, convert
kvmppc_xive_get_queue_config() and kvmppc_xive_set_queue_config() to
use it for error checking. This allows to get rid of the local_err
boilerplate.

Propagate the return value so that callers may use it as well to check
failures.

Signed-off-by: Greg Kurz <groug@kaod.org>
Message-Id: <159707847357.1489912.2032291280645236480.stgit@bahia.lan>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2020-08-13 21:00:52 +10:00
Greg Kurz
5fa36b7ffb spapr/xive: Rework error handling of kvmppc_xive_cpu_[gs]et_state()
kvm_set_one_reg() returns a negative errno on failure, use that instead
of errno. Also propagate it to callers so they can use it to check
for failures and hopefully get rid of their local_err boilerplate.

Signed-off-by: Greg Kurz <groug@kaod.org>
Message-Id: <159707846665.1489912.14267225652103441921.stgit@bahia.lan>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2020-08-13 21:00:52 +10:00
Greg Kurz
b14adb4a27 spapr/xive: Rework error handling of kvmppc_xive_mmap()
Callers currently check failures of kvmppc_xive_mmap() through the
@errp argument, which isn't a recommanded practice. It is preferred
to use a return value when possible.

Since NULL isn't an invalid address in theory, it seems better to
return MAP_FAILED and to teach callers to handle it.

Signed-off-by: Greg Kurz <groug@kaod.org>
Message-Id: <159707845972.1489912.719896767746375765.stgit@bahia.lan>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2020-08-13 21:00:52 +10:00
Greg Kurz
46407a2531 spapr/xive: Rework error handling of kvmppc_xive_source_reset()
Since kvmppc_xive_source_reset_one() has a return value, convert
kvmppc_xive_source_reset() to use it for error checking. This
allows to get rid of the local_err boiler plate.

Propagate the return value so that callers may use it as well to check
failures.

Signed-off-by: Greg Kurz <groug@kaod.org>
Message-Id: <159707845245.1489912.9151822670764690034.stgit@bahia.lan>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2020-08-13 21:00:52 +10:00
Greg Kurz
3885ca6688 spapr/xive: Rework error handling of kvmppc_xive_cpu_connect()
Use error_setg_errno() instead of error_setg(strerror()). While here,
use -ret instead of errno since kvm_vcpu_enable_cap() returns a negative
errno on failure.

Use ERRP_GUARD() to ensure that errp can be passed to error_append_hint(),
and get rid of the local_err boilerplate.

Propagate the return value so that callers may use it as well to check
failures.

Signed-off-by: Greg Kurz <groug@kaod.org>
Message-Id: <159707844549.1489912.4862921680328017645.stgit@bahia.lan>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2020-08-13 21:00:52 +10:00
Greg Kurz
4a6891b838 spapr: Simplify error handling in spapr_phb_realize()
The spapr_phb_realize() function has a local_err variable which
is used to:

1) check failures of spapr_irq_findone() and spapr_irq_claim()

2) prepend extra information to the error message

Recent work from Markus Armbruster highlighted we get better
code when testing the return value of a function, rather than
setting up all the local_err boiler plate. For similar reasons,
it is now preferred to use ERRP_GUARD() and error_prepend()
rather than error_propagate_prepend().

Since spapr_irq_findone() and spapr_irq_claim() return negative
values in case of failure, do both changes.

This is just cleanup, no functional impact.

Signed-off-by: Greg Kurz <groug@kaod.org>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Message-Id: <159707843851.1489912.6108405733810934642.stgit@bahia.lan>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2020-08-13 21:00:52 +10:00
Greg Kurz
a490711934 spapr/xive: Convert KVM device fd checks to assert()
All callers guard these functions with an xive_in_kernel() helper. Make
it clear that they are only to be called when the KVM XIVE device exists.

Note that the check on xive is dropped in kvmppc_xive_disconnect(). It
really cannot be NULL since it comes from set_active_intc() which only
passes pointers to allocated objects.

Signed-off-by: Greg Kurz <groug@kaod.org>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Message-Id: <159679994169.876294.11026653581505077112.stgit@bahia.lan>
Reviewed-by: Cédric Le Goater <clg@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2020-08-13 20:58:52 +10:00
Greg Kurz
e519cdd9bc ppc/xive: Introduce dedicated kvm_irqchip_in_kernel() wrappers
Calls to the KVM XIVE device are guarded by kvm_irqchip_in_kernel(). This
ensures that QEMU won't try to use the device if KVM is disabled or if
an in-kernel irqchip isn't required.

When using ic-mode=dual with the pseries machine, we have two possible
interrupt controllers: XIVE and XICS. The kvm_irqchip_in_kernel() helper
will return true as soon as any of the KVM device is created. It might
lure QEMU to think that the other one is also around, while it is not.
This is exactly what happens with ic-mode=dual at machine init when
claiming IRQ numbers, which must be done on all possible IRQ backends,
eg. RTAS event sources or the PHB0 LSI table : only the KVM XICS device
is active but we end up calling kvmppc_xive_source_reset_one() anyway,
which fails. This doesn't cause any trouble because of another bug :
kvmppc_xive_source_reset_one() lacks an error_setg() and callers don't
see the failure.

Most of the other kvmppc_xive_* functions have similar xive->fd
checks to filter out the case when KVM XIVE isn't active. It
might look safer to have idempotent functions but it doesn't
really help to understand what's going on when debugging.

Since we already have all the kvm_irqchip_in_kernel() in place,
also have the callers to check xive->fd as well before calling
KVM XIVE specific code. This is straight-forward for the spapr
specific XIVE code. Some more care is needed for the platform
agnostic XIVE code since it cannot access xive->fd directly.
Introduce new in_kernel() methods in some base XIVE classes
for this purpose and implement them only in spapr.

In all cases, we still need to call kvm_irqchip_in_kernel() so that
compilers can optimize the kvmppc_xive_* calls away when CONFIG_KVM
isn't defined, thus avoiding the need for stubs.

Signed-off-by: Greg Kurz <groug@kaod.org>
Message-Id: <159679993438.876294.7285654331498605426.stgit@bahia.lan>
Reviewed-by: Cédric Le Goater <clg@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2020-08-13 20:56:01 +10:00
Greg Kurz
cf36e5b376 ppc/xive: Rework setup of XiveSource::esb_mmio
Depending on whether XIVE is emultated or backed with a KVM XIVE device,
the ESB MMIOs of a XIVE source point to an I/O memory region or a mapped
memory region.

This is currently handled by checking kvm_irqchip_in_kernel() returns
false in xive_source_realize(). This is a bit awkward as we usually
need to do extra things when we're using the in-kernel backend, not
less. But most important, we can do better: turn the existing "xive.esb"
memory region into a plain container, introduce an "xive.esb-emulated"
I/O subregion and rename the existing "xive.esb" subregion in the KVM
code to "xive.esb-kvm". Since "xive.esb-kvm" is added with overlap
and a higher priority, it prevails over "xive.esb-emulated" (ie.
a guest using KVM XIVE will interact with "xive.esb-kvm" instead of
the default "xive.esb-emulated" region.

While here, consolidate the computation of the MMIO region size in
a common helper.

Suggested-by: Cédric Le Goater <clg@kaod.org>
Signed-off-by: Greg Kurz <groug@kaod.org>
Message-Id: <159679992680.876294.7520540158586170894.stgit@bahia.lan>
Reviewed-by: Cédric Le Goater <clg@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2020-08-13 20:50:17 +10:00
Gustavo Romero
a72c71b77d target/ppc: Integrate icount to purr, vtb, and tbu40
Currently if option '-icount auto' is passed to the QEMU TCG to enable
counting instructions the VM crashes with the following error report when
Linux runs on it:

qemu-system-ppc64: Bad icount read

This happens because read/write access to the SPRs PURR, VTB, and TBU40
is not integrated to the icount framework.

This commit fixes that issue by making the read/write access of these
SPRs aware of icount framework, adding the proper gen_io_start() calls
before calling the helpers to load/store these SPRs in TCG and ensuring
that the associated TBs end immediately after, accordingly to what's in
docs/devel/tcg-icount.rst.

Signed-off-by: Gustavo Romero <gromero@linux.ibm.com>
Message-Id: <20200811153235.4527-1-gromero@linux.ibm.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2020-08-13 10:03:01 +10:00
Paolo Bonzini
0eca1f4b5c acceptance: use stable URLs for the Debian and Ubuntu installer
The kernel and initrd hashes seem to have changed for the Bionic
aarch64 installer, causing BootLinuxConsole.test_aarch64_xlnx_versal_virt
to fail.  Correct the paths to use the previous binaries instead of
the latest.  Do the same for the Lenny alpha installer for
consistency, even though those are unlikely to change.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Tested-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Message-Id: <20200810092941.154911-1-pbonzini@redhat.com>
2020-08-12 14:14:28 +02:00
Philippe Mathieu-Daudé
c45aed126f tests/acceptance/boot_linux: Extract common URL from xlnx-versal test
Both files refer to the same directory. Store the common part in
a new variable.

Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Message-Id: <20200810093050.28744-2-philmd@redhat.com>
2020-08-12 14:13:02 +02:00
Christian Schoenebeck
da9f2eda25 9pfs: clarify latency of v9fs_co_run_in_worker()
As we just fixed a severe performance issue with Treaddir request
handling, clarify this overall issue as a comment on
v9fs_co_run_in_worker() with the intention to hopefully prevent
such performance mistakes in future (and fixing other yet
outstanding ones).

Signed-off-by: Christian Schoenebeck <qemu_oss@crudebyte.com>
Message-Id: <4d34d332e1aaa8a2cf8dc0b5da4fd7727f2a86e8.1596012787.git.qemu_oss@crudebyte.com>
Signed-off-by: Christian Schoenebeck <qemu_oss@crudebyte.com>
2020-08-12 09:17:32 +02:00
Christian Schoenebeck
d2c5cf7ca1 9pfs: differentiate readdir lock between 9P2000.u vs. 9P2000.L
Previous patch suggests that it might make sense to use a different mutex
type now while handling readdir requests, depending on the precise
protocol variant, as v9fs_do_readdir_with_stat() (used by 9P2000.u) uses
a CoMutex to avoid deadlocks that might happen with QemuMutex otherwise,
whereas do_readdir_many() (used by 9P2000.L) should better use a
QemuMutex, as the precise behaviour of a failed CoMutex lock on fs driver
side would not be clear.

And to avoid the wrong lock type being used, be now strict and error out
if a 9P2000.L client sends a Tread on a directory, and likeweise error out
if a 9P2000.u client sends a Treaddir request.

This patch is just intended as transitional measure, as currently 9P2000.u
vs. 9P2000.L implementations currently differ where the main logic of
fetching directory entries is located at (9P2000.u still being more top
half focused, while 9P2000.L already being bottom half focused in regards
to fetching directory entries that is).

Signed-off-by: Christian Schoenebeck <qemu_oss@crudebyte.com>
Message-Id: <9a2ddc347e533b0d801866afd9dfac853d2d4106.1596012787.git.qemu_oss@crudebyte.com>
Signed-off-by: Christian Schoenebeck <qemu_oss@crudebyte.com>
2020-08-12 09:17:32 +02:00
Christian Schoenebeck
0c4356ba7d 9pfs: T_readdir latency optimization
Make top half really top half and bottom half really bottom half:

Each T_readdir request handling is hopping between threads (main
I/O thread and background I/O driver threads) several times for
every individual directory entry, which sums up to huge latencies
for handling just a single T_readdir request.

Instead of doing that, collect now all required directory entries
(including all potentially required stat buffers for each entry) in
one rush on a background I/O thread from fs driver by calling the
previously added function v9fs_co_readdir_many() instead of
v9fs_co_readdir(), then assemble the entire resulting network
response message for the readdir request on main I/O thread. The
fs driver is still aborting the directory entry retrieval loop
(on the background I/O thread inside of v9fs_co_readdir_many())
as soon as it would exceed the client's requested maximum R_readdir
response size. So this will not introduce a performance penalty on
another end.

Also: No longer seek initial directory position in v9fs_readdir(),
as this is now handled (more consistently) by
v9fs_co_readdir_many() instead.

Signed-off-by: Christian Schoenebeck <qemu_oss@crudebyte.com>
Message-Id: <c7c3d1cf4e86611538cef44897842819d9359d7a.1596012787.git.qemu_oss@crudebyte.com>
Signed-off-by: Christian Schoenebeck <qemu_oss@crudebyte.com>
2020-08-12 09:17:32 +02:00
Christian Schoenebeck
2149675b19 9pfs: add new function v9fs_co_readdir_many()
The newly added function v9fs_co_readdir_many() retrieves multiple
directory entries with a single fs driver request. It is intended to
replace uses of v9fs_co_readdir(), the latter only retrieves a
single directory entry per fs driver request instead.

The reason for this planned replacement is that for every fs driver
request the coroutine is dispatched from main I/O thread to a
background I/O thread and eventually dispatched back to main I/O
thread. Hopping between threads adds latency. So if a 9pfs Treaddir
request reads a large amount of directory entries, this currently
sums up to huge latencies of several hundred ms or even more. So
using v9fs_co_readdir_many() instead of v9fs_co_readdir() will
provide significant performance improvements.

Signed-off-by: Christian Schoenebeck <qemu_oss@crudebyte.com>
Message-Id: <73dc827a12ef577ae7e644dcf34a5c0e443ab42f.1596012787.git.qemu_oss@crudebyte.com>
Signed-off-by: Christian Schoenebeck <qemu_oss@crudebyte.com>
2020-08-12 09:17:32 +02:00
Christian Schoenebeck
dd8151f4fe 9pfs: split out fs driver core of v9fs_co_readdir()
The implementation of v9fs_co_readdir() has two parts: the outer
part is executed by main I/O thread, whereas the inner part is
executed by fs driver on a background I/O thread.

Move the inner part to its own new, private function do_readdir(),
so it can be shared by another upcoming new function.

This is just a preparatory patch for the subsequent patch, with the
purpose to avoid the next patch to clutter the overall diff.

Signed-off-by: Christian Schoenebeck <qemu_oss@crudebyte.com>
Reviewed-by: Greg Kurz <groug@kaod.org>
Message-Id: <a426ee06e77584fa2d8253ce5d8bea519eb3ffd4.1596012787.git.qemu_oss@crudebyte.com>
Signed-off-by: Christian Schoenebeck <qemu_oss@crudebyte.com>
2020-08-12 09:17:32 +02:00
Christian Schoenebeck
29c9d2ca80 9pfs: make v9fs_readdir_response_size() public
Rename function v9fs_readdir_data_size() -> v9fs_readdir_response_size()
and make it callable from other units. So far this function is only
used by 9p.c, however subsequent patches require the function to be
callable from another 9pfs unit. And as we're at it; also make it clear
for what this function is used for.

Signed-off-by: Christian Schoenebeck <qemu_oss@crudebyte.com>
Reviewed-by: Greg Kurz <groug@kaod.org>
Message-Id: <3668ebc7d5b929a0e4f1357457060d96f50f76f4.1596012787.git.qemu_oss@crudebyte.com>
Signed-off-by: Christian Schoenebeck <qemu_oss@crudebyte.com>
2020-08-12 09:17:32 +02:00
Christian Schoenebeck
46488b627a tests/virtio-9p: added split readdir tests
The previous, already existing 'basic' readdir test simply used a
'count' parameter big enough to retrieve all directory entries with a
single Treaddir request.

In the 3 new 'split' readdir tests added by this patch, directory
entries are retrieved, split over several Treaddir requests by picking
small 'count' parameters which force the server to truncate the
response. So the test client sends as many Treaddir requests as
necessary to get all directory entries.

The following 3 new tests are added (executed in this sequence):

1. Split readdir test with count=512
2. Split readdir test with count=256
3. Split readdir test with count=128

This test case sequence is chosen because the smaller the 'count' value,
the higher the chance of errors in case of implementation bugs on server
side.

Signed-off-by: Christian Schoenebeck <qemu_oss@crudebyte.com>
Reviewed-by: Greg Kurz <groug@kaod.org>
Message-Id: <569b2e05ab1d0223b14a12dfbdf3ad5e8b3ac131.1596012787.git.qemu_oss@crudebyte.com>
Signed-off-by: Christian Schoenebeck <qemu_oss@crudebyte.com>
2020-08-12 09:17:32 +02:00
Greg Kurz
e781139539 spapr/xive: Simplify kvmppc_xive_disconnect()
Since this function begins with:

    /* The KVM XIVE device is not in use */
    if (!xive || xive->fd == -1) {
        return;
    }

we obviously don't need to check xive->fd again.

Signed-off-by: Greg Kurz <groug@kaod.org>
Message-Id: <159673297296.766512.14780055521619233656.stgit@bahia.lan>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2020-08-12 13:16:27 +10:00
Greg Kurz
82f086b5e7 spapr/xive: Fix xive->fd if kvm_create_device() fails
If the creation of the KVM XIVE device fails for some reasons, the
negative errno ends up in xive->fd, but the rest of the code assumes
that xive->fd either contains an open fd, ie. positive value, or -1.

This doesn't cause any misbehavior except kvmppc_xive_disconnect()
that will try to close(xive->fd) during rollback and likely be
rewarded with an EBADF.

Only set xive->fd with a open fd.

Signed-off-by: Greg Kurz <groug@kaod.org>
Message-Id: <159673296585.766512.15404407281299745442.stgit@bahia.lan>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2020-08-12 13:16:27 +10:00
Greg Kurz
c55bcb1f47 spapr: Clarify error and documentation for broken KVM XICS
When starting an L2 KVM guest with `ic-mode=dual,kernel-irqchip=on`,
QEMU fails with:

KVM is too old to support ic-mode=dual,kernel-irqchip=on

This error message was introduced to detect older KVM versions that
didn't allow destruction and re-creation of the XICS KVM device that
we do at reboot. But it is actually the same issue that we get with
nested guests : when running under pseries, KVM currently provides
a genuine XICS device (not the XICS-on-XIVE device that we get
under powernv) which doesn't support destruction/re-creation.

This will eventually be fixed in KVM but in the meantime, update
the error message and documentation to mention the nested case.
While here, mention that in "No XIVE support in KVM" section that
this can also happen with "guest OSes supporting XIVE" since
we check this at init time before starting the guest.

Reported-by: Satheesh Rajendran <sathnaga@linux.vnet.ibm.com>
Buglink: https://bugs.launchpad.net/qemu/+bug/1890290
Signed-off-by: Greg Kurz <groug@kaod.org>
Message-Id: <159664243614.622889.18307368735989783528.stgit@bahia.lan>
Reviewed-by: Cédric Le Goater <clg@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2020-08-12 13:16:27 +10:00
Cédric Le Goater
8d14523b1c docs: Update POWER9 XIVE support for nested guests
It is not yet supported.

Signed-off-by: Cédric Le Goater <clg@kaod.org>
Message-Id: <20200804131639.407049-1-clg@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2020-08-12 13:16:27 +10:00
Daniel Henrique Barboza
61f5e1a34d docs: adding NUMA documentation for pseries
This patch adds a new documentation file, ppc-spapr-numa.rst,
informing what developers and user can expect of the NUMA distance
support for the pseries machine, up to QEMU 5.1.

In the (hopefully soon) future, when we rework the NUMA mechanics
of the pseries machine to at least attempt to contemplate user
choice, this doc will be extended to inform about the new
support.

Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>
Message-Id: <20200803133440.825276-1-danielhb413@gmail.com>
Tested-by: Greg Kurz <groug@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2020-08-12 13:16:27 +10:00
Matthieu Bucchianeri
8dcdb535d7 target/ppc: Fix SPE unavailable exception triggering
When emulating certain floating point instructions or vector instructions on
PowerPC machines, QEMU did not properly generate the SPE/Embedded Floating-
Point Unavailable interrupt. See the buglink further below for references to
the relevant NXP documentation.

This patch fixes the behavior of some evfs* instructions that were
incorrectly emitting the interrupt.

More importantly, this patch fixes the behavior of several efd* and ev*
instructions that were not generating the interrupt. Triggering the
interrupt for these instructions fixes lazy FPU/vector context switching on
some operating systems like Linux.

Without this patch, the result of some double-precision arithmetic could be
corrupted due to the lack of proper saving and restoring of the upper
32-bit part of the general-purpose registers.

Buglink: https://bugs.launchpad.net/qemu/+bug/1888918
Buglink: https://bugs.launchpad.net/qemu/+bug/1611394
Signed-off-by: Matthieu Bucchianeri <matthieu.bucchianeri@leostella.com>
Message-Id: <20200727175553.32276-1-matthieu.bucchianeri@leostella.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2020-08-12 13:16:27 +10:00
Lijun Pan
c4b8b49d68 target/ppc: add vmulh{su}d instructions
vmulhsd: Vector Multiply High Signed Doubleword
vmulhud: Vector Multiply High Unsigned Doubleword

Signed-off-by: Lijun Pan <ljp@linux.ibm.com>
Message-Id: <20200724045845.89976-5-ljp@linux.ibm.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2020-08-12 13:16:27 +10:00
Lijun Pan
f3e0d864ab target/ppc: add vmulh{su}w instructions
vmulhsw: Vector Multiply High Signed Word
vmulhuw: Vector Multiply High Unsigned Word

Signed-off-by: Lijun Pan <ljp@linux.ibm.com>
Message-Id: <20200724045845.89976-4-ljp@linux.ibm.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2020-08-12 13:16:27 +10:00
Lijun Pan
73ebe95e8e target/ppc: add vmulld to INDEX_op_mul_vec case
Group vmuluwm and vmulld. Make vmulld-specific
changes since it belongs to new ISA 3.1.

Signed-off-by: Lijun Pan <ljp@linux.ibm.com>
Message-Id: <20200724045845.89976-3-ljp@linux.ibm.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2020-08-12 13:16:27 +10:00
Lijun Pan
de55d3b381 Update PowerPC AT_HWCAP2 definition
Add PPC2_FEATURE2_ARCH_3_10 to the PowerPC AT_HWCAP2 definitions.

Signed-off-by: Lijun Pan <ljp@linux.ibm.com>
Message-Id: <20200724045845.89976-2-ljp@linux.ibm.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
2020-08-12 13:16:27 +10:00
Gustavo Romero
4b160fad4f ppc/xive: Fix some typos in comments
Fix some typos in comments about code modeling coalescing points in the
XIVE routing engine (IVRE).

Signed-off-by: Gustavo Romero <gromero@linux.ibm.com>
Message-Id: <1595461434-27725-1-git-send-email-gromero@linux.ibm.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2020-08-12 13:16:27 +10:00
Greg Kurz
19d55e2031 spapr: Forbid nested KVM-HV in pre-power9 compat mode
Nested KVM HV only works if the kernel is using the radix MMU mode, ie.
the CPU is POWER9 and it is not running in some pre-power9 compat mode.
Otherwise, the KVM HV module fails to load in the guest with -ENODEV.
It might be painful for a user to discover this late that nested cannot
work with their setup. Erroring out at machine init instead seems to be
the best we can do.

Signed-off-by: Greg Kurz <groug@kaod.org>
Reviewed-by: Laurent Vivier <lvivier@redhat.com>
Message-Id: <159491948127.188975.9621435875869177751.stgit@bahia.lan>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2020-08-12 13:16:27 +10:00
Greg Kurz
d9c5b5fa86 spapr: Use error_append_hint() in spapr_caps.c
We have a dedicated error API for hints. Use it instead of embedding
the hint in the error message, as recommanded in the "qapi/error.h"
header file.

While here, have cap_fwnmi_apply(), which already uses
error_append_hint(), to call ERRP_GUARD() as well.

Signed-off-by: Greg Kurz <groug@kaod.org>
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Reviewed-by: Laurent Vivier <lvivier@redhat.com>
Message-Id: <159594297421.8262.14314530897345809924.stgit@bahia.lan>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2020-08-12 13:16:27 +10:00