Commit Graph

228 Commits

Author SHA1 Message Date
Paolo Bonzini
39dd3e1f55 kvm: i8254: require KVM_CAP_PIT2 and KVM_CAP_PIT_STATE2
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2023-10-25 19:53:38 +02:00
Paolo Bonzini
700766ba60 kvm: i386: require KVM_CAP_ADJUST_CLOCK
This was introduced in KVM in Linux 2.6.33, we can require it
unconditionally.  KVM_CLOCK_TSC_STABLE was only added in Linux 4.9,
for now do not require it (though it would allow the removal of some
pretty yucky code).

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2023-10-25 19:53:38 +02:00
Richard Henderson
b77af26e97 accel/tcg: Replace CPUState.env_ptr with cpu_env()
Reviewed-by: Anton Johansson <anjo@rev.ng>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2023-10-04 11:03:54 -07:00
Michael Tokarev
bad5cfcd60 i386: spelling fixes
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
2023-09-20 07:54:34 +03:00
Stefan Hajnoczi
03a3a62fbd * only build util/async-teardown.c when system build is requested
* target/i386: fix BQL handling of the legacy FERR interrupts
 * target/i386: fix memory operand size for CVTPS2PD
 * target/i386: Add support for AMX-COMPLEX in CPUID enumeration
 * compile plugins on Darwin
 * configure and meson cleanups
 * drop mkvenv support for Python 3.7 and Debian10
 * add wrap file for libblkio
 * tweak KVM stubs
 -----BEGIN PGP SIGNATURE-----
 
 iQFIBAABCAAyFiEE8TM4V0tmI4mGbHaCv/vSX3jHroMFAmT5t6UUHHBib256aW5p
 QHJlZGhhdC5jb20ACgkQv/vSX3jHroMmjwf+MpvVuq+nn+3PqGUXgnzJx5ccA5ne
 O9Xy8+1GdlQPzBw/tPovxXDSKn3HQtBfxObn2CCE1tu/4uHWpBA1Vksn++NHdUf2
 P0yoHxGskJu5iYYTtIcNw5cH2i+AizdiXuEjhfNjqD5Y234cFoHnUApt9e3zBvVO
 cwGD7WpPuSb4g38hHkV6nKcx72o7b4ejDToqUVZJ2N+RkddSqB03fSdrOru0hR7x
 V+lay0DYdFszNDFm05LJzfDbcrHuSryGA91wtty7Fzj6QhR/HBHQCUZJxMB5PI7F
 Zy4Zdpu60zxtSxUqeKgIi7UhNFgMcax2Hf9QEqdc/B4ARoBbboh4q4u8kQ==
 =dH7/
 -----END PGP SIGNATURE-----

Merge tag 'for-upstream' of https://gitlab.com/bonzini/qemu into staging

* only build util/async-teardown.c when system build is requested
* target/i386: fix BQL handling of the legacy FERR interrupts
* target/i386: fix memory operand size for CVTPS2PD
* target/i386: Add support for AMX-COMPLEX in CPUID enumeration
* compile plugins on Darwin
* configure and meson cleanups
* drop mkvenv support for Python 3.7 and Debian10
* add wrap file for libblkio
* tweak KVM stubs

# -----BEGIN PGP SIGNATURE-----
#
# iQFIBAABCAAyFiEE8TM4V0tmI4mGbHaCv/vSX3jHroMFAmT5t6UUHHBib256aW5p
# QHJlZGhhdC5jb20ACgkQv/vSX3jHroMmjwf+MpvVuq+nn+3PqGUXgnzJx5ccA5ne
# O9Xy8+1GdlQPzBw/tPovxXDSKn3HQtBfxObn2CCE1tu/4uHWpBA1Vksn++NHdUf2
# P0yoHxGskJu5iYYTtIcNw5cH2i+AizdiXuEjhfNjqD5Y234cFoHnUApt9e3zBvVO
# cwGD7WpPuSb4g38hHkV6nKcx72o7b4ejDToqUVZJ2N+RkddSqB03fSdrOru0hR7x
# V+lay0DYdFszNDFm05LJzfDbcrHuSryGA91wtty7Fzj6QhR/HBHQCUZJxMB5PI7F
# Zy4Zdpu60zxtSxUqeKgIi7UhNFgMcax2Hf9QEqdc/B4ARoBbboh4q4u8kQ==
# =dH7/
# -----END PGP SIGNATURE-----
# gpg: Signature made Thu 07 Sep 2023 07:44:37 EDT
# gpg:                using RSA key F13338574B662389866C7682BFFBD25F78C7AE83
# gpg:                issuer "pbonzini@redhat.com"
# gpg: Good signature from "Paolo Bonzini <bonzini@gnu.org>" [full]
# gpg:                 aka "Paolo Bonzini <pbonzini@redhat.com>" [full]
# Primary key fingerprint: 46F5 9FBD 57D6 12E7 BFD4  E2F7 7E15 100C CD36 69B1
#      Subkey fingerprint: F133 3857 4B66 2389 866C  7682 BFFB D25F 78C7 AE83

* tag 'for-upstream' of https://gitlab.com/bonzini/qemu: (51 commits)
  docs/system/replay: do not show removed command line option
  subprojects: add wrap file for libblkio
  sysemu/kvm: Restrict kvm_pc_setup_irq_routing() to x86 targets
  sysemu/kvm: Restrict kvm_has_pit_state2() to x86 targets
  sysemu/kvm: Restrict kvm_get_apic_state() to x86 targets
  sysemu/kvm: Restrict kvm_arch_get_supported_cpuid/msr() to x86 targets
  target/i386: Restrict declarations specific to CONFIG_KVM
  target/i386: Allow elision of kvm_hv_vpindex_settable()
  target/i386: Allow elision of kvm_enable_x2apic()
  target/i386: Remove unused KVM stubs
  target/i386/cpu-sysemu: Inline kvm_apic_in_kernel()
  target/i386/helper: Restrict KVM declarations to system emulation
  hw/i386/fw_cfg: Include missing 'cpu.h' header
  hw/i386/pc: Include missing 'cpu.h' header
  hw/i386/pc: Include missing 'sysemu/tcg.h' header
  Revert "mkvenv: work around broken pip installations on Debian 10"
  mkvenv: assume presence of importlib.metadata
  Python: Drop support for Python 3.7
  configure: remove dead code
  meson: list leftover CONFIG_* symbols
  ...

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2023-09-07 10:29:06 -04:00
Philippe Mathieu-Daudé
bb781b947d sysemu/kvm: Restrict kvm_pc_setup_irq_routing() to x86 targets
kvm_pc_setup_irq_routing() is only defined for x86 targets (in
hw/i386/kvm/apic.c). Its declaration is pointless on all
other targets.

Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-ID: <20230904124325.79040-14-philmd@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2023-09-07 13:32:37 +02:00
Philippe Mathieu-Daudé
fc30abf846 sysemu/kvm: Restrict kvm_has_pit_state2() to x86 targets
kvm_has_pit_state2() is only defined for x86 targets (in
target/i386/kvm/kvm.c). Its declaration is pointless on
all other targets. Have it return a boolean.

Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-ID: <20230904124325.79040-13-philmd@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2023-09-07 13:32:37 +02:00
Philippe Mathieu-Daudé
a09ef8ff0a hw/i386: Rename 'hw/kvm/clock.h' -> 'hw/i386/kvm/clock.h'
kvmclock_create() is only implemented in hw/i386/kvm/clock.h.
Restrict the "hw/kvm/clock.h" header to i386 by moving it to
hw/i386/.

Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Message-Id: <20230620083228.88796-3-philmd@linaro.org>
2023-08-31 19:47:43 +02:00
Philippe Mathieu-Daudé
b797c98de4 hw/i386: Remove unuseful kvmclock_create() stub
We shouldn't call kvmclock_create() when KVM is not available
or disabled:
 - check for kvm_enabled() before calling it
 - assert KVM is enabled once called
Since the call is elided when KVM is not available, we can
remove the stub (it is never compiled).

Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Message-Id: <20230620083228.88796-2-philmd@linaro.org>
2023-08-31 19:47:43 +02:00
Richard Henderson
081619e677 Misc patches queue
xen: Fix issues reported by fuzzer / Coverity
 misc: Fix some typos in documentation and comments
 ui/dbus: Build fixes for Clang/win32/!opengl
 linux-user: Semihosting fixes on m68k/nios2
 tests/migration: Disable stack protector when linking without stdlib
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEE+qvnXhKRciHc/Wuy4+MsLN6twN4FAmTJfrQACgkQ4+MsLN6t
 wN4Nqw/+NjoW2jdy9LNAgx7IeH2w+HfvvULpBOTDRRNahuXbGpzl6L57cS92r5a8
 UFJGfxbL2nlxrJbUdAWGONIweCvUb9jnpbT2id1dBp4wp+8aKFvPj1Al34OENNVS
 1lQT0G6mKx9itcXP9lVSBPhEbWIB9ZMaDG0R872bA6Ec3G7PWny+AOhMvJecieol
 2Qyv84ioA3N0xkYUB64KBVDmJOG0Tx+LYZfsXUybLKwfvBDLeVkHuHKtb94kh0G9
 MUsM/p9sHvfrC1bO+DQ9P1bzRI9zw2I2f4xMIs4QCMGPbJUrhv7edOc2PSO5XQoG
 izcV9NSL0tl6LbXZvkE7sJw0tDuR6R9sQ9KJWoltJCGRGOWlC5CeSTUfLbH9HkFc
 CXapKWth6cmOboGZNTlidn41oH7xE/kW6Em1XAD0M0eLUCUMzVjaSs1sIwKnbF7i
 sz7HcgAAuAVhmR0n4zOkphJkek72J7atLNpqU0AdYH46LR92zSdh6YoD5YDBPwY8
 hoy7VFauSkF8+5Wi7CTTjtq+edkuFRcuNMCR0Fd2iolE8KKYvxHnwEGH/5T4s2m7
 8f40AEyQRk0nFn44tqeyb14O8c2lZL3jmDEh+LYT/PPp/rCc/X7Ugplpau+bNZsx
 OOZd0AxujbrK+Xn80Agc+3/vn4/2eAvz7OdGc/SmKuYLyseBQfo=
 =5ZLa
 -----END PGP SIGNATURE-----

Merge tag 'misc-fixes-20230801' of https://github.com/philmd/qemu into staging

Misc patches queue

xen: Fix issues reported by fuzzer / Coverity
misc: Fix some typos in documentation and comments
ui/dbus: Build fixes for Clang/win32/!opengl
linux-user: Semihosting fixes on m68k/nios2
tests/migration: Disable stack protector when linking without stdlib

# -----BEGIN PGP SIGNATURE-----
#
# iQIzBAABCAAdFiEE+qvnXhKRciHc/Wuy4+MsLN6twN4FAmTJfrQACgkQ4+MsLN6t
# wN4Nqw/+NjoW2jdy9LNAgx7IeH2w+HfvvULpBOTDRRNahuXbGpzl6L57cS92r5a8
# UFJGfxbL2nlxrJbUdAWGONIweCvUb9jnpbT2id1dBp4wp+8aKFvPj1Al34OENNVS
# 1lQT0G6mKx9itcXP9lVSBPhEbWIB9ZMaDG0R872bA6Ec3G7PWny+AOhMvJecieol
# 2Qyv84ioA3N0xkYUB64KBVDmJOG0Tx+LYZfsXUybLKwfvBDLeVkHuHKtb94kh0G9
# MUsM/p9sHvfrC1bO+DQ9P1bzRI9zw2I2f4xMIs4QCMGPbJUrhv7edOc2PSO5XQoG
# izcV9NSL0tl6LbXZvkE7sJw0tDuR6R9sQ9KJWoltJCGRGOWlC5CeSTUfLbH9HkFc
# CXapKWth6cmOboGZNTlidn41oH7xE/kW6Em1XAD0M0eLUCUMzVjaSs1sIwKnbF7i
# sz7HcgAAuAVhmR0n4zOkphJkek72J7atLNpqU0AdYH46LR92zSdh6YoD5YDBPwY8
# hoy7VFauSkF8+5Wi7CTTjtq+edkuFRcuNMCR0Fd2iolE8KKYvxHnwEGH/5T4s2m7
# 8f40AEyQRk0nFn44tqeyb14O8c2lZL3jmDEh+LYT/PPp/rCc/X7Ugplpau+bNZsx
# OOZd0AxujbrK+Xn80Agc+3/vn4/2eAvz7OdGc/SmKuYLyseBQfo=
# =5ZLa
# -----END PGP SIGNATURE-----
# gpg: Signature made Tue 01 Aug 2023 02:52:52 PM PDT
# gpg:                using RSA key FAABE75E12917221DCFD6BB2E3E32C2CDEADC0DE
# gpg: Good signature from "Philippe Mathieu-Daudé (F4BUG) <f4bug@amsat.org>" [full]

* tag 'misc-fixes-20230801' of https://github.com/philmd/qemu:
  target/m68k: Fix semihost lseek offset computation
  target/nios2: Fix semihost lseek offset computation
  target/nios2: Pass semihosting arg to exit
  tests/migration: Add -fno-stack-protector
  misc: Fix some typos in documentation and comments
  ui/dbus: fix clang compilation issue
  ui/dbus: fix win32 compilation when !opengl
  hw/xen: prevent guest from binding loopback event channel to itself
  i386/xen: consistent locking around Xen singleshot timers
  hw/xen: fix off-by-one in xen_evtchn_set_gsi()

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2023-08-02 06:51:29 -07:00
David Woodhouse
75a87af9b2 hw/xen: prevent guest from binding loopback event channel to itself
Fuzzing showed that a guest could bind an interdomain port to itself, by
guessing the next port to be allocated and putting that as the 'remote'
port number. By chance, that works because the newly-allocated port has
type EVTCHNSTAT_unbound. It shouldn't.

Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Reviewed-by: Paul Durrant <paul@xen.org>
Message-Id: <20230801175747.145906-4-dwmw2@infradead.org>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
2023-08-01 23:52:23 +02:00
David Woodhouse
cf885b1957 hw/xen: fix off-by-one in xen_evtchn_set_gsi()
Coverity points out (CID 1508128) a bounds checking error. We need to check
for gsi >= IOAPIC_NUM_PINS, not just greater-than.

Also fix up an assert() that has the same problem, that Coverity didn't see.

Fixes: 4f81baa33e ("hw/xen: Support GSI mapping to PIRQ")
Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-Id: <20230801175747.145906-2-dwmw2@infradead.org>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
2023-08-01 23:40:30 +02:00
David Woodhouse
ace33a0e5a hw/xen: Clarify (lack of) error handling in transaction_commit()
Coverity was unhappy (CID 1508359) because we didn't check the return of
init_walk_op() in transaction_commit(), despite doing so at every other
call site.

Strictly speaking, this is a false positive since it can never fail. It
only fails for invalid user input (transaction ID or path), and both of
those are hard-coded to known sane values in this invocation.

But Coverity doesn't know that, and neither does the casual reader of the
code.

Returning an error here would be weird, since the transaction *is*
committed by this point; all the walk_op is doing is firing watches on
the newly-committed changed nodes. So make it a g_assert(!ret), since
it really should never happen.

Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Reviewed-by: Paul Durrant <paul@xen.org>
Message-Id: <20076888f6bdf06a65aafc5cf954260965d45b97.camel@infradead.org>
Signed-off-by: Anthony PERARD <anthony.perard@citrix.com>
2023-08-01 10:22:33 +01:00
Philippe Mathieu-Daudé
c7b64948f8 meson: Replace CONFIG_SOFTMMU -> CONFIG_SYSTEM_ONLY
Since we *might* have user emulation with softmmu,
use the clearer 'CONFIG_SYSTEM_ONLY' key to check
for system emulation.

Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20230613133347.82210-9-philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2023-06-20 10:01:30 +02:00
David Woodhouse
c9bdfe8d58 hw/xen: Fix broken check for invalid state in xs_be_open()
Coverity points out that if (!s && !s->impl) isn't really what we intended
to do here. CID 1508131.

Fixes: 0324751272 ("hw/xen: Add emulated implementation of XenStore operations")
Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Reviewed-by: Paul Durrant <paul@xen.org>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Message-Id: <20230412185102.441523-6-dwmw2@infradead.org>
Signed-off-by: Anthony PERARD <anthony.perard@citrix.com>
2023-06-07 15:07:10 +01:00
David Woodhouse
eeedfe6c63 hw/xen: Simplify emulated Xen platform init
I initially put the basic platform init (overlay pages, grant tables,
event channels) into mc->kvm_type because that was the earliest place
that could sensibly test for xen_mode==XEN_EMULATE.

The intent was to do this early enough that we could then initialise the
XenBus and other parts which would have depended on them, from a generic
location for both Xen and KVM/Xen in the PC-specific code, as seen in
https://lore.kernel.org/qemu-devel/20230116221919.1124201-16-dwmw2@infradead.org/

However, then the Xen on Arm patches came along, and *they* wanted to
do the XenBus init from a 'generic' Xen-specific location instead:
https://lore.kernel.org/qemu-devel/20230210222729.957168-4-sstabellini@kernel.org/

Since there's no generic location that covers all three, I conceded to
do it for XEN_EMULATE mode in pc_basic_devices_init().

And now there's absolutely no point in having some of the platform init
done from pc_machine_kvm_type(); we can move it all up to live in a
single place in pc_basic_devices_init(). This has the added benefit that
we can drop the separate xen_evtchn_connect_gsis() function completely,
and pass just the system GSIs in directly to xen_evtchn_create().

While I'm at it, it does no harm to explicitly pass in the *number* of
said GSIs, because it does make me twitch a bit to pass an array of
impicit size. During the lifetime of the KVM/Xen patchset, that had
already changed (albeit just cosmetically) from GSI_NUM_PINS to
IOAPIC_NUM_PINS.

And document a bit better that this is for the *output* GSI for raising
CPU0's events when the per-CPU vector isn't available. The fact that
we create a whole set of them and then only waggle the one we're told
to, instead of having a single output and only *connecting* it to the
GSI that it should be connected to, is still non-intuitive for me.

Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Reviewed-by: Paul Durrant <paul@xen.org>
Message-Id: <20230412185102.441523-2-dwmw2@infradead.org>
Signed-off-by: Anthony PERARD <anthony.perard@citrix.com>
2023-06-07 15:07:10 +01:00
Bernhard Beschow
02520772ae hw/timer/i8254_common: Share "iobase" property via base class
Both TYPE_KVM_I8254 and TYPE_I8254 have their own but same implementation of
the "iobase" property. The storage for the property already resides in
PITCommonState, so also move the property definition there.

Signed-off-by: Bernhard Beschow <shentey@gmail.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-Id: <20230523195608.125820-2-shentey@gmail.com>
Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
2023-06-05 07:43:23 +01:00
Stefan Hajnoczi
60f782b6b7 aio: remove aio_disable_external() API
All callers now pass is_external=false to aio_set_fd_handler() and
aio_set_event_notifier(). The aio_disable_external() API that
temporarily disables fd handlers that were registered is_external=true
is therefore dead code.

Remove aio_disable_external(), aio_enable_external(), and the
is_external arguments to aio_set_fd_handler() and
aio_set_event_notifier().

The entire test-fdmon-epoll test is removed because its sole purpose was
testing aio_disable_external().

Parts of this patch were generated using the following coccinelle
(https://coccinelle.lip6.fr/) semantic patch:

  @@
  expression ctx, fd, is_external, io_read, io_write, io_poll, io_poll_ready, opaque;
  @@
  - aio_set_fd_handler(ctx, fd, is_external, io_read, io_write, io_poll, io_poll_ready, opaque)
  + aio_set_fd_handler(ctx, fd, io_read, io_write, io_poll, io_poll_ready, opaque)

  @@
  expression ctx, notifier, is_external, io_read, io_poll, io_poll_ready;
  @@
  - aio_set_event_notifier(ctx, notifier, is_external, io_read, io_poll, io_poll_ready)
  + aio_set_event_notifier(ctx, notifier, io_read, io_poll, io_poll_ready)

Reviewed-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-Id: <20230516190238.8401-21-stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2023-05-30 17:37:26 +02:00
Stefan Hajnoczi
9998f70f66 hw/xen: do not use aio_set_fd_handler(is_external=true) in xen_xenstore
There is no need to suspend activity between aio_disable_external() and
aio_enable_external(), which is mainly used for the block layer's drain
operation.

This is part of ongoing work to remove the aio_disable_external() API.

Reviewed-by: David Woodhouse <dwmw@amazon.co.uk>
Reviewed-by: Paul Durrant <paul@xen.org>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Message-Id: <20230516190238.8401-9-stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2023-05-30 17:32:02 +02:00
Richard Henderson
cc37d98bfb *: Add missing includes of qemu/error-report.h
This had been pulled in via qemu/plugin.h from hw/core/cpu.h,
but that will be removed.

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-Id: <20230310195252.210956-5-richard.henderson@linaro.org>
[AJB: add various additional cases shown by CI]
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Message-Id: <20230315174331.2959-15-alex.bennee@linaro.org>
Reviewed-by: Emilio Cota <cota@braap.org>
2023-03-22 15:06:57 +00:00
David Woodhouse
de26b26197 hw/xen: Implement soft reset for emulated gnttab
This is only part of it; we will also need to get the PV back end drivers
to tear down their own mappings (or do it for them, but they kind of need
to stop using the pointers too).

Some more work on the actual PV back ends and xen-bus code is going to be
needed to really make soft reset and migration fully functional, and this
part is the basis for that.

Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Reviewed-by: Paul Durrant <paul@xen.org>
2023-03-07 17:04:30 +00:00
David Woodhouse
d05864d23b hw/xen: Map guest XENSTORE_PFN grant in emulated Xenstore
We don't actually access the guest's page through the grant, because
this isn't real Xen, and we can just use the page we gave it in the
first place. Map the grant anyway, mostly for cosmetic purposes so it
*looks* like it's in use in the guest-visible grant table.

Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Reviewed-by: Paul Durrant <paul@xen.org>
2023-03-07 17:04:30 +00:00
David Woodhouse
0324751272 hw/xen: Add emulated implementation of XenStore operations
Now that we have an internal implementation of XenStore, we can populate
the xenstore_backend_ops to allow PV backends to talk to it.

Watches can't be processed with immediate callbacks because that would
call back into XenBus code recursively. Defer them to a QEMUBH to be run
as appropriate from the main loop. We use a QEMUBH per XS handle, and it
walks all the watches (there shouldn't be many per handle) to fire any
which have pending events. We *could* have done it differently but this
allows us to use the same struct watch_event as we have for the guest
side, and keeps things relatively simple.

Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Reviewed-by: Paul Durrant <paul@xen.org>
2023-03-07 17:04:30 +00:00
David Woodhouse
b08d88e30f hw/xen: Add emulated implementation of grant table operations
This is limited to mapping a single grant at a time, because under Xen the
pages are mapped *contiguously* into qemu's address space, and that's very
hard to do when those pages actually come from anonymous mappings in qemu
in the first place.

Eventually perhaps we can look at using shared mappings of actual objects
for system RAM, and then we can make new mappings of the same backing
store (be it deleted files, shmem, whatever). But for now let's stick to
a page at a time.

Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Reviewed-by: Paul Durrant <paul@xen.org>
2023-03-07 17:04:30 +00:00
David Woodhouse
4dfd5fb178 hw/xen: Hook up emulated implementation for event channel operations
We provided the backend-facing evtchn functions very early on as part of
the core Xen platform support, since things like timers and xenstore need
to use them.

By what may or may not be an astonishing coincidence, those functions
just *happen* all to have exactly the right function prototypes to slot
into the evtchn_backend_ops table and be called by the PV backends.

Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Reviewed-by: Paul Durrant <paul@xen.org>
2023-03-07 17:04:30 +00:00
Paul Durrant
ba2a92db1f hw/xen: Add xenstore operations to allow redirection to internal emulation
Signed-off-by: Paul Durrant <pdurrant@amazon.com>
Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Reviewed-by: Paul Durrant <paul@xen.org>
2023-03-07 17:04:30 +00:00
Paul Durrant
831b0db8ab hw/xen: Create initial XenStore nodes
Signed-off-by: Paul Durrant <pdurrant@amazon.com>
Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Reviewed-by: Paul Durrant <paul@xen.org>
2023-03-07 17:04:30 +00:00
David Woodhouse
766804b101 hw/xen: Implement core serialize/deserialize methods for xenstore_impl
This implements the basic migration support in the back end, with unit
tests that give additional confidence in the node-counting already in
the tree.

However, the existing PV back ends like xen-disk don't support migration
yet. They will reset the ring and fail to continue where they left off.
We will fix that in future, but not in time for the 8.0 release.

Since there's also an open question of whether we want to serialize the
full XenStore or only the guest-owned nodes in /local/domain/${domid},
for now just mark the XenStore device as unmigratable.

Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Reviewed-by: Paul Durrant <paul@xen.org>
2023-03-07 17:04:30 +00:00
Paul Durrant
be1934dfef hw/xen: Implement XenStore permissions
Store perms as a GList of strings, check permissions.

Signed-off-by: Paul Durrant <pdurrant@amazon.com>
Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Reviewed-by: Paul Durrant <paul@xen.org>
2023-03-07 17:04:30 +00:00
David Woodhouse
7cabbdb70d hw/xen: Watches on XenStore transactions
Firing watches on the nodes that still exist is relatively easy; just
walk the tree and look at the nodes with refcount of one.

Firing watches on *deleted* nodes is more fun. We add 'modified_in_tx'
and 'deleted_in_tx' flags to each node. Nodes with those flags cannot
be shared, as they will always be unique to the transaction in which
they were created.

When xs_node_walk would need to *create* a node as scaffolding and it
encounters a deleted_in_tx node, it can resurrect it simply by clearing
its deleted_in_tx flag. If that node originally had any *data*, they're
gone, and the modified_in_tx flag will have been set when it was first
deleted.

We then attempt to send appropriate watches when the transaction is
committed, properly delete the deleted_in_tx nodes, and remove the
modified_in_tx flag from the others.

Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Reviewed-by: Paul Durrant <paul@xen.org>
2023-03-07 17:04:30 +00:00
David Woodhouse
7248b87cb0 hw/xen: Implement XenStore transactions
Given that the whole thing supported copy on write from the beginning,
transactions end up being fairly simple. On starting a transaction, just
take a ref of the existing root; swap it back in on a successful commit.

The main tree has a transaction ID too, and we keep a record of the last
transaction ID given out. if the main tree is ever modified when it isn't
the latest, it gets a new transaction ID.

A commit can only succeed if the main tree hasn't moved on since it was
forked. Strictly speaking, the XenStore protocol allows a transaction to
succeed as long as nothing *it* read or wrote has changed in the interim,
but no implementations do that; *any* change is sufficient to abort a
transaction.

This does not yet fire watches on the changed nodes on a commit. That bit
is more fun and will come in a follow-on commit.

Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Reviewed-by: Paul Durrant <paul@xen.org>
2023-03-07 17:04:30 +00:00
David Woodhouse
6e1330090d hw/xen: Implement XenStore watches
Starts out fairly simple: a hash table of watches based on the path.

Except there can be multiple watches on the same path, so the watch ends
up being a simple linked list, and the head of that list is in the hash
table. Which makes removal a bit of a PITA but it's not so bad; we just
special-case "I had to remove the head of the list and now I have to
replace it in / remove it from the hash table". And if we don't remove
the head, it's a simple linked-list operation.

We do need to fire watches on *deleted* nodes, so instead of just a simple
xs_node_unref() on the topmost victim, we need to recurse down and fire
watches on them all.

Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Reviewed-by: Paul Durrant <paul@xen.org>
2023-03-07 17:04:30 +00:00
David Woodhouse
3ef7ff83ca hw/xen: Add basic XenStore tree walk and write/read/directory support
This is a fairly simple implementation of a copy-on-write tree.

The node walk function starts off at the root, with 'inplace == true'.
If it ever encounters a node with a refcount greater than one (including
the root node), then that node is shared with other trees, and cannot
be modified in place, so the inplace flag is cleared and we copy on
write from there on down.

Xenstore write has 'mkdir -p' semantics and will create the intermediate
nodes if they don't already exist, so in that case we flip the inplace
flag back to true as we populate the newly-created nodes.

We put a copy of the absolute path into the buffer in the struct walk_op,
with *two* NUL terminators at the end. As xs_node_walk() goes down the
tree, it replaces the next '/' separator with a NUL so that it can use
the 'child name' in place. The next recursion down then puts the '/'
back and repeats the exercise for the next path element... if it doesn't
hit that *second* NUL termination which indicates the true end of the
path.

Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Reviewed-by: Paul Durrant <paul@xen.org>
2023-03-07 17:04:30 +00:00
David Woodhouse
0254c4d19d hw/xen: Add xenstore wire implementation and implementation stubs
This implements the basic wire protocol for the XenStore commands, punting
all the actual implementation to xs_impl_* functions which all just return
errors for now.

Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Reviewed-by: Paul Durrant <paul@xen.org>
2023-03-07 17:04:30 +00:00
David Woodhouse
e16aff4cc2 kvm/i386: Add xen-evtchn-max-pirq property
The default number of PIRQs is set to 256 to avoid issues with 32-bit MSI
devices. Allow it to be increased if the user desires.

Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Reviewed-by: Paul Durrant <paul@xen.org>
2023-03-01 09:09:22 +00:00
David Woodhouse
6096cf7877 hw/xen: Support MSI mapping to PIRQ
The way that Xen handles MSI PIRQs is kind of awful.

There is a special MSI message which targets a PIRQ. The vector in the
low bits of data must be zero. The low 8 bits of the PIRQ# are in the
destination ID field, the extended destination ID field is unused, and
instead the high bits of the PIRQ# are in the high 32 bits of the address.

Using the high bits of the address means that we can't intercept and
translate these messages in kvm_send_msi(), because they won't be caught
by the APIC — addresses like 0x1000fee46000 aren't in the APIC's range.

So we catch them in pci_msi_trigger() instead, and deliver the event
channel directly.

That isn't even the worst part. The worst part is that Xen snoops on
writes to devices' MSI vectors while they are *masked*. When a MSI
message is written which looks like it targets a PIRQ, it remembers
the device and vector for later.

When the guest makes a hypercall to bind that PIRQ# (snooped from a
marked MSI vector) to an event channel port, Xen *unmasks* that MSI
vector on the device. Xen guests using PIRQ delivery of MSI don't
ever actually unmask the MSI for themselves.

Now that this is working we can finally enable XENFEAT_hvm_pirqs and
let the guest use it all.

Tested with passthrough igb and emulated e1000e + AHCI.

           CPU0       CPU1
  0:         65          0   IO-APIC   2-edge      timer
  1:          0         14  xen-pirq   1-ioapic-edge  i8042
  4:          0        846  xen-pirq   4-ioapic-edge  ttyS0
  8:          1          0  xen-pirq   8-ioapic-edge  rtc0
  9:          0          0  xen-pirq   9-ioapic-level  acpi
 12:        257          0  xen-pirq  12-ioapic-edge  i8042
 24:       9600          0  xen-percpu    -virq      timer0
 25:       2758          0  xen-percpu    -ipi       resched0
 26:          0          0  xen-percpu    -ipi       callfunc0
 27:          0          0  xen-percpu    -virq      debug0
 28:       1526          0  xen-percpu    -ipi       callfuncsingle0
 29:          0          0  xen-percpu    -ipi       spinlock0
 30:          0       8608  xen-percpu    -virq      timer1
 31:          0        874  xen-percpu    -ipi       resched1
 32:          0          0  xen-percpu    -ipi       callfunc1
 33:          0          0  xen-percpu    -virq      debug1
 34:          0       1617  xen-percpu    -ipi       callfuncsingle1
 35:          0          0  xen-percpu    -ipi       spinlock1
 36:          8          0   xen-dyn    -event     xenbus
 37:          0       6046  xen-pirq    -msi       ahci[0000:00:03.0]
 38:          1          0  xen-pirq    -msi-x     ens4
 39:          0         73  xen-pirq    -msi-x     ens4-rx-0
 40:         14          0  xen-pirq    -msi-x     ens4-rx-1
 41:          0         32  xen-pirq    -msi-x     ens4-tx-0
 42:         47          0  xen-pirq    -msi-x     ens4-tx-1

Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Reviewed-by: Paul Durrant <paul@xen.org>
2023-03-01 09:09:22 +00:00
David Woodhouse
4f81baa33e hw/xen: Support GSI mapping to PIRQ
If I advertise XENFEAT_hvm_pirqs then a guest now boots successfully as
long as I tell it 'pci=nomsi'.

[root@localhost ~]# cat /proc/interrupts
           CPU0
  0:         52   IO-APIC   2-edge      timer
  1:         16  xen-pirq   1-ioapic-edge  i8042
  4:       1534  xen-pirq   4-ioapic-edge  ttyS0
  8:          1  xen-pirq   8-ioapic-edge  rtc0
  9:          0  xen-pirq   9-ioapic-level  acpi
 11:       5648  xen-pirq  11-ioapic-level  ahci[0000:00:04.0]
 12:        257  xen-pirq  12-ioapic-edge  i8042
...

Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Reviewed-by: Paul Durrant <paul@xen.org>
2023-03-01 09:09:20 +00:00
David Woodhouse
aa98ee38a5 hw/xen: Implement emulated PIRQ hypercall support
This wires up the basic infrastructure but the actual interrupts aren't
there yet, so don't advertise it to the guest.

Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Reviewed-by: Paul Durrant <paul@xen.org>
2023-03-01 09:09:01 +00:00
David Woodhouse
799c23548f i386/xen: Implement HYPERVISOR_physdev_op
Just hook up the basic hypercalls to stubs in xen_evtchn.c for now.

Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Reviewed-by: Paul Durrant <paul@xen.org>
2023-03-01 09:08:26 +00:00
David Woodhouse
f3341e7b91 hw/xen: Add basic ring handling to xenstore
Extract requests, return ENOSYS to all of them. This is enough to allow
older Linux guests to boot, as they need *something* back but it doesn't
matter much what.

A full implementation of a single-tentant internal XenStore copy-on-write
tree with transactions and watches is waiting in the wings to be sent in
a subsequent round of patches along with hooking up the actual PV disk
back end in qemu, but this is enough to get guests booting for now.

Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Reviewed-by: Paul Durrant <paul@xen.org>
2023-03-01 09:08:26 +00:00
David Woodhouse
c08f5d0e53 hw/xen: Add xen_xenstore device for xenstore emulation
Just the basic shell, with the event channel hookup. It only dumps the
buffer for now; a real ring implmentation will come in a subsequent patch.

Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Reviewed-by: Paul Durrant <paul@xen.org>
2023-03-01 09:08:26 +00:00
David Woodhouse
794fba23a5 hw/xen: Add backend implementation of interdomain event channel support
The provides the QEMU side of interdomain event channels, allowing events
to be sent to/from the guest.

The API mirrors libxenevtchn, and in time both this and the real Xen one
will be available through ops structures so that the PV backend drivers
can use the correct one as appropriate.

For now, this implementation can be used directly by our XenStore which
will be for emulated mode only.

Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Reviewed-by: Paul Durrant <paul@xen.org>
2023-03-01 09:08:25 +00:00
Joao Martins
b746a77926 i386/xen: handle PV timer hypercalls
Introduce support for one shot and periodic mode of Xen PV timers,
whereby timer interrupts come through a special virq event channel
with deadlines being set through:

1) set_timer_op hypercall (only oneshot)
2) vcpu_op hypercall for {set,stop}_{singleshot,periodic}_timer
hypercalls

Signed-off-by: Joao Martins <joao.m.martins@oracle.com>
Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Reviewed-by: Paul Durrant <paul@xen.org>
2023-03-01 09:07:52 +00:00
David Woodhouse
b46f9745b1 hw/xen: Implement GNTTABOP_query_size
Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Reviewed-by: Paul Durrant <paul@xen.org>
2023-03-01 09:07:52 +00:00
David Woodhouse
28b7ae94a2 i386/xen: Implement HYPERVISOR_grant_table_op and GNTTABOP_[gs]et_verson
Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Reviewed-by: Paul Durrant <paul@xen.org>
2023-03-01 09:07:52 +00:00
David Woodhouse
e33cb789af hw/xen: Support mapping grant frames
Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Reviewed-by: Paul Durrant <paul@xen.org>
2023-03-01 09:07:52 +00:00
David Woodhouse
a28b0fc034 hw/xen: Add xen_gnttab device for grant table emulation
Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Reviewed-by: Paul Durrant <paul@xen.org>
2023-03-01 09:07:52 +00:00
David Woodhouse
2aff696b10 hw/xen: Support HVM_PARAM_CALLBACK_TYPE_PCI_INTX callback
The guest is permitted to specify an arbitrary domain/bus/device/function
and INTX pin from which the callback IRQ shall appear to have come.

In QEMU we can only easily do this for devices that actually exist, and
even that requires us "knowing" that it's a PCMachine in order to find
the PCI root bus — although that's OK really because it's always true.

We also don't get to get notified of INTX routing changes, because we
can't do that as a passive observer; if we try to register a notifier
it will overwrite any existing notifier callback on the device.

But in practice, guests using PCI_INTX will only ever use pin A on the
Xen platform device, and won't swizzle the INTX routing after they set
it up. So this is just fine.

Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Reviewed-by: Paul Durrant <paul@xen.org>
2023-03-01 09:07:50 +00:00
David Woodhouse
ddf0fd9ae1 hw/xen: Support HVM_PARAM_CALLBACK_TYPE_GSI callback
The GSI callback (and later PCI_INTX) is a level triggered interrupt. It
is asserted when an event channel is delivered to vCPU0, and is supposed
to be cleared when the vcpu_info->evtchn_upcall_pending field for vCPU0
is cleared again.

Thankfully, Xen does *not* assert the GSI if the guest sets its own
evtchn_upcall_pending field; we only need to assert the GSI when we
have delivered an event for ourselves. So that's the easy part, kind of.

There's a slight complexity in that we need to hold the BQL before we
can call qemu_set_irq(), and we definitely can't do that while holding
our own port_lock (because we'll need to take that from the qemu-side
functions that the PV backend drivers will call). So if we end up
wanting to set the IRQ in a context where we *don't* already hold the
BQL, defer to a BH.

However, we *do* need to poll for the evtchn_upcall_pending flag being
cleared. In an ideal world we would poll that when the EOI happens on
the PIC/IOAPIC. That's how it works in the kernel with the VFIO eventfd
pairs — one is used to trigger the interrupt, and the other works in the
other direction to 'resample' on EOI, and trigger the first eventfd
again if the line is still active.

However, QEMU doesn't seem to do that. Even VFIO level interrupts seem
to be supported by temporarily unmapping the device's BARs from the
guest when an interrupt happens, then trapping *all* MMIO to the device
and sending the 'resample' event on *every* MMIO access until the IRQ
is cleared! Maybe in future we'll plumb the 'resample' concept through
QEMU's irq framework but for now we'll do what Xen itself does: just
check the flag on every vmexit if the upcall GSI is known to be
asserted.

Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Reviewed-by: Paul Durrant <paul@xen.org>
2023-03-01 09:06:44 +00:00
Joao Martins
507cb64d6e i386/xen: add monitor commands to test event injection
Specifically add listing, injection of event channels.

Signed-off-by: Joao Martins <joao.m.martins@oracle.com>
Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Acked-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Paul Durrant <paul@xen.org>
2023-03-01 08:22:50 +00:00