Commit Graph

107538 Commits

Author SHA1 Message Date
Mark Cave-Ayland
2006904938 mac_via: workaround NetBSD ADB bus enumeration issue
NetBSD assumes it can send its first ADB command after sending the ADB_BUSRESET
command in ADB_STATE_NEW without changing the state back to ADB_STATE_IDLE
first as detailed in the ADB protocol.

Add a workaround to detect this condition at the start of ADB enumeration
and send the next command written to SR after a ADB_BUSRESET onto the bus
regardless, even if we don't detect a state transition to ADB_STATE_NEW.

Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Reviewed-by: Laurent Vivier <laurent@vivier.eu>
Message-ID: <20231004083806.757242-16-mark.cave-ayland@ilande.co.uk>
Signed-off-by: Laurent Vivier <laurent@vivier.eu>
2023-10-06 10:33:43 +02:00
Mark Cave-Ayland
366d27796c mac_via: work around underflow in TimeDBRA timing loop in SETUPTIMEK
The MacOS toolbox ROM calculates the number of branches that can be executed
per millisecond as part of its timer calibration. Since modern hosts are
considerably quicker than original hardware, the negative counter reaches zero
before the calibration completes leading to division by zero later in
CALCULATESLOD.

Instead of trying to fudge the timing loop (which won't work for TimeDBRA/TimeSCCDB
anyhow), use the pattern of access to the VIA1 registers to detect when SETUPTIMEK
has finished executing and write some well-known good timer values to TimeDBRA
and TimeSCCDB taken from real hardware with a suitable scaling factor.

Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Reviewed-by: Laurent Vivier <laurent@vivier.eu>
Message-ID: <20231004083806.757242-15-mark.cave-ayland@ilande.co.uk>
Signed-off-by: Laurent Vivier <laurent@vivier.eu>
2023-10-06 10:33:43 +02:00
Mark Cave-Ayland
994af0b2e0 swim: update IWM/ISM register block decoding
Update the IWM/ISM register block decoding to match the description given in the
"SWIM Chip Users Reference". This allows us to validate the device response to
the guest OS which currently only does just enough to indicate that the floppy
drive is unavailable.

Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Reviewed-by: Laurent Vivier <laurent@vivier.eu>
Message-ID: <20231004083806.757242-14-mark.cave-ayland@ilande.co.uk>
Signed-off-by: Laurent Vivier <laurent@vivier.eu>
2023-10-06 10:33:43 +02:00
Mark Cave-Ayland
5700420417 swim: split into separate IWM and ISM register blocks
The swim chip provides an implementation of both Apple's IWM and ISM floppy disk
controllers. Split the existing implementation into separate register banks for
each controller, whilst also switching the IWM registers from 16-bit to 8-bit
as implemented in real hardware.

Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Reviewed-by: Laurent Vivier <laurent@vivier.eu>
Message-ID: <20231004083806.757242-13-mark.cave-ayland@ilande.co.uk>
Signed-off-by: Laurent Vivier <laurent@vivier.eu>
2023-10-06 10:33:43 +02:00
Mark Cave-Ayland
d05cad2bf6 swim: add trace events for IWM and ISM registers
Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Reviewed-by: Laurent Vivier <laurent@vivier.eu>
Message-ID: <20231004083806.757242-12-mark.cave-ayland@ilande.co.uk>
Signed-off-by: Laurent Vivier <laurent@vivier.eu>
2023-10-06 10:33:43 +02:00
Mark Cave-Ayland
7afc4356c3 q800: add easc bool machine class property to switch between ASC and EASC
This determines whether the Apple Sound Chip (ASC) is set to enhanced mode
(default) or to original mode. The real Q800 hardware used an EASC chip however
a lot of older software only works with the older ASC chip.

Adding this as a machine parameter allows QEMU to be used as an developer aid
for testing and migrating code from ASC to EASC.

Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Laurent Vivier <laurent@vivier.eu>
Message-ID: <20231004083806.757242-11-mark.cave-ayland@ilande.co.uk>
Signed-off-by: Laurent Vivier <laurent@vivier.eu>
2023-10-06 10:33:43 +02:00
Mark Cave-Ayland
9983f6e12e q800: add Apple Sound Chip (ASC) audio to machine
The Quadra 800 has the enhanced ASC (EASC) audio chip which supports both the
legacy IRQ routing through VIA2 and also "A/UX" mode routing direct to the
CPU.

Co-developed-by: Laurent Vivier <laurent@vivier.eu>
Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Message-ID: <20231004083806.757242-10-mark.cave-ayland@ilande.co.uk>
Signed-off-by: Laurent Vivier <laurent@vivier.eu>
2023-10-06 10:33:43 +02:00
Mark Cave-Ayland
5243e5543e asc: generate silence if FIFO empty but engine still running
MacOS (un)helpfully leaves the FIFO engine running even when all the samples have
been written to the hardware, and expects the FIFO status flags and IRQ to be
updated continuously.

There is an additional problem in that not all audio backends guarantee an
all-zero output when there is no FIFO data available, in particular the Windows
dsound backend which re-uses its internal circular buffer causing the last played
sound to loop indefinitely.

Whilst this is effectively a bug in the Windows dsound backend, work around it
for now using a simple heuristic: if the FIFO remains empty for half a cycle
(~23ms) then continuously fill the generated buffer with empty silence.

Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Reviewed-by: Laurent Vivier <laurent@vivier.eu>
Message-ID: <20231004083806.757242-9-mark.cave-ayland@ilande.co.uk>
Signed-off-by: Laurent Vivier <laurent@vivier.eu>
2023-10-06 10:33:43 +02:00
Mark Cave-Ayland
ac13a6b3fd audio: add Apple Sound Chip (ASC) emulation
The Apple Sound Chip was primarily used by the Macintosh II to generate sound
in hardware which was previously handled by the toolbox ROM with software
interrupts.

Implement both the standard ASC and also the enhanced ASC (EASC) functionality
which is used in the Quadra 800.

Note that whilst real ASC hardware uses AUDIO_FORMAT_S8, this implementation uses
AUDIO_FORMAT_U8 instead because AUDIO_FORMAT_S8 is rarely used and not supported
by some audio backends like PulseAudio and DirectSound when played directly with
-audiodev out.mixing-engine=off.

Co-developed-by: Laurent Vivier <laurent@vivier.eu>
Co-developed-by: Volker Rümelin <vr_qemu@t-online.de>
Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Message-ID: <20231004083806.757242-8-mark.cave-ayland@ilande.co.uk>
Signed-off-by: Laurent Vivier <laurent@vivier.eu>
2023-10-06 10:33:43 +02:00
Mark Cave-Ayland
6997f26d0d q800: allow accesses to RAM area even if less memory is available
MacOS attempts a series of writes and reads over the entire RAM area in order
to determine the amount of RAM within the machine. Allow accesses to the
entire RAM area ignoring writes and always reading zero for areas where there
is no physical RAM installed to allow MacOS to detect the memory size without
faulting.

Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Laurent Vivier <laurent@vivier.eu>
Message-ID: <20231004083806.757242-7-mark.cave-ayland@ilande.co.uk>
Signed-off-by: Laurent Vivier <laurent@vivier.eu>
2023-10-06 10:33:43 +02:00
Mark Cave-Ayland
bdc2c77d29 q800: add IOSB subsystem
It is needed because it defines the BIOSConfig area.

Co-developed-by: Laurent Vivier <laurent@vivier.eu>
Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Reviewed-by: BALATON Zoltan <balaton@eik.bme.hu>
Message-ID: <20231004083806.757242-6-mark.cave-ayland@ilande.co.uk>
Signed-off-by: Laurent Vivier <laurent@vivier.eu>
2023-10-06 10:33:43 +02:00
Mark Cave-Ayland
0f03047c1b q800: implement additional machine id bits on VIA1 port A
Co-developed-by: Laurent Vivier <laurent@vivier.eu>
Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-ID: <20231004083806.757242-5-mark.cave-ayland@ilande.co.uk>
Signed-off-by: Laurent Vivier <laurent@vivier.eu>
2023-10-06 10:33:43 +02:00
Mark Cave-Ayland
e993af36a7 q800: add machine id register
MacOS reads this address to identify the hardware.

This is a basic implementation returning the ID of Quadra 800.

Details:

  http://mess.redump.net/mess/driver_info/mac_technical_notes

"There are 3 ID schemes [...]
 The third and most scalable is a machine ID register at 0x5ffffffc.
 The top word must be 0xa55a to be valid. Then bits 15-11 are 0 for
 consumer Macs, 1 for portables, 2 for high-end 68k, and 3 for high-end
 PowerPC. Bit 10 is 1 if additional ID bits appear elsewhere (e.g. in VIA1).
 The rest of the bits are a per-model identifier.

 Model                          Lower 16 bits of ID
...
 Quadra/Centris 610/650/800     0x2BAD"

Co-developed-by: Laurent Vivier <laurent@vivier.eu>
Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-ID: <20231004083806.757242-4-mark.cave-ayland@ilande.co.uk>
Signed-off-by: Laurent Vivier <laurent@vivier.eu>
2023-10-06 10:33:43 +02:00
Mark Cave-Ayland
e2fd695e9d q800: add djMEMC memory controller
The djMEMC controller is used to store information related to the physical memory
configuration.

Co-developed-by: Laurent Vivier <laurent@vivier.eu>
Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-ID: <20231004083806.757242-3-mark.cave-ayland@ilande.co.uk>
Signed-off-by: Laurent Vivier <laurent@vivier.eu>
2023-10-06 10:33:43 +02:00
Mark Cave-Ayland
d43e967f69 q800-glue.c: convert to Resettable interface
Convert the GLUE device to 3-phase reset. The legacy method
doesn't do anything that's invalid in the hold phase, so the
conversion is simple and not a behaviour change.

Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Laurent Vivier <laurent@vivier.eu>
Message-ID: <20231004083806.757242-2-mark.cave-ayland@ilande.co.uk>
Signed-off-by: Laurent Vivier <laurent@vivier.eu>
2023-10-06 10:33:43 +02:00
Jing Liu
eaadba6f9b vfio/pci: enable MSI-X in interrupt restoring on dynamic allocation
During migration restoring, vfio_enable_vectors() is called to restore
enabling MSI-X interrupts for assigned devices. It sets the range from
0 to nr_vectors to kernel to enable MSI-X and the vectors unmasked in
guest. During the MSI-X enabling, all the vectors within the range are
allocated according to the VFIO_DEVICE_SET_IRQS ioctl.

When dynamic MSI-X allocation is supported, we only want the guest
unmasked vectors being allocated and enabled. Use vector 0 with an
invalid fd to get MSI-X enabled, after that, all the vectors can be
allocated in need.

Signed-off-by: Jing Liu <jing2.liu@intel.com>
Reviewed-by: Cédric Le Goater <clg@redhat.com>
Reviewed-by: Alex Williamson <alex.williamson@redhat.com>
Signed-off-by: Cédric Le Goater <clg@redhat.com>
2023-10-05 22:04:51 +02:00
Jing Liu
5ebffa4e87 vfio/pci: use an invalid fd to enable MSI-X
Guests typically enable MSI-X with all of the vectors masked in the MSI-X
vector table. To match the guest state of device, QEMU enables MSI-X by
enabling vector 0 with userspace triggering and immediately release.
However the release function actually does not release it due to already
using userspace mode.

It is no need to enable triggering on host and rely on the mask bit to
avoid spurious interrupts. Use an invalid fd (i.e. fd = -1) is enough
to get MSI-X enabled.

After dynamic MSI-X allocation is supported, the interrupt restoring
also need use such way to enable MSI-X, therefore, create a function
for that.

Suggested-by: Alex Williamson <alex.williamson@redhat.com>
Signed-off-by: Jing Liu <jing2.liu@intel.com>
Reviewed-by: Cédric Le Goater <clg@redhat.com>
Reviewed-by: Alex Williamson <alex.williamson@redhat.com>
Signed-off-by: Cédric Le Goater <clg@redhat.com>
2023-10-05 22:04:51 +02:00
Jing Liu
d9e6710d7d vfio/pci: enable vector on dynamic MSI-X allocation
The vector_use callback is used to enable vector that is unmasked in
guest. The kernel used to only support static MSI-X allocation. When
allocating a new interrupt using "static MSI-X allocation" kernels,
QEMU first disables all previously allocated vectors and then
re-allocates all including the new one. The nr_vectors of VFIOPCIDevice
indicates that all vectors from 0 to nr_vectors are allocated (and may
be enabled), which is used to loop all the possibly used vectors when
e.g., disabling MSI-X interrupts.

Extend the vector_use function to support dynamic MSI-X allocation when
host supports the capability. QEMU therefore can individually allocate
and enable a new interrupt without affecting others or causing interrupts
lost during runtime.

Utilize nr_vectors to calculate the upper bound of enabled vectors in
dynamic MSI-X allocation mode since looping all msix_entries_nr is not
efficient and unnecessary.

Signed-off-by: Jing Liu <jing2.liu@intel.com>
Tested-by: Reinette Chatre <reinette.chatre@intel.com>
Reviewed-by: Alex Williamson <alex.williamson@redhat.com>
Signed-off-by: Cédric Le Goater <clg@redhat.com>
2023-10-05 22:04:51 +02:00
Jing Liu
45d85f6228 vfio/pci: detect the support of dynamic MSI-X allocation
Kernel provides the guidance of dynamic MSI-X allocation support of
passthrough device, by clearing the VFIO_IRQ_INFO_NORESIZE flag to
guide user space.

Fetch the flags from host to determine if dynamic MSI-X allocation is
supported.

Originally-by: Reinette Chatre <reinette.chatre@intel.com>
Signed-off-by: Jing Liu <jing2.liu@intel.com>
Reviewed-by: Cédric Le Goater <clg@redhat.com>
Reviewed-by: Alex Williamson <alex.williamson@redhat.com>
Signed-off-by: Cédric Le Goater <clg@redhat.com>
2023-10-05 22:04:51 +02:00
Zhenzhong Duan
c06327c9db vfio/pci: rename vfio_put_device to vfio_pci_put_device
vfio_put_device() is a VFIO PCI specific function, rename it with
'vfio_pci' prefix to avoid confusing.

No functional change.

Suggested-by: Cédric Le Goater <clg@redhat.com>
Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
Signed-off-by: Cédric Le Goater <clg@redhat.com>
2023-10-05 22:04:51 +02:00
Alex Williamson
931150e56b vfio/display: Fix missing update to set backing fields
The below referenced commit renames scanout_width/height to
backing_width/height, but also promotes these fields in various portions
of the egl interface.  Meanwhile vfio dmabuf support has never used the
previous scanout fields and is therefore missed in the update.  This
results in a black screen when transitioning from ramfb to dmabuf display
when using Intel vGPU with these features.

Resolves: https://gitlab.com/qemu-project/qemu/-/issues/1891
Link: https://lists.gnu.org/archive/html/qemu-devel/2023-08/msg02726.html
Fixes: 9ac06df8b6 ("virtio-gpu-udmabuf: correct naming of QemuDmaBuf size properties")
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Tested-by: Cédric Le Goater <clg@redhat.com>
Signed-off-by: Cédric Le Goater <clg@redhat.com>
2023-10-05 22:04:51 +02:00
Eric Blake
2dcbb11b39 nbd/server: Add FLAG_PAYLOAD support to CMD_BLOCK_STATUS
Allow a client to request a subset of negotiated meta contexts.  For
example, a client may ask to use a single connection to learn about
both block status and dirty bitmaps, but where the dirty bitmap
queries only need to be performed on a subset of the disk; forcing the
server to compute that information on block status queries in the rest
of the disk is wasted effort (both at the server, and on the amount of
traffic sent over the wire to be parsed and ignored by the client).

Qemu as an NBD client never requests to use more than one meta
context, so it has no need to use block status payloads.  Testing this
instead requires support from libnbd, which CAN access multiple meta
contexts in parallel from a single NBD connection; an interop test
submitted to the libnbd project at the same time as this patch
demonstrates the feature working, as well as testing some corner cases
(for example, when the payload length is longer than the export
length), although other corner cases (like passing the same id
duplicated) requires a protocol fuzzer because libnbd is not wired up
to break the protocol that badly.

This also includes tweaks to 'qemu-nbd --list' to show when a server
is advertising the capability, and to the testsuite to reflect the
addition to that output.

Of note: qemu will always advertise the new feature bit during
NBD_OPT_INFO if extended headers have alreay been negotiated
(regardless of whether any NBD_OPT_SET_META_CONTEXT negotiation has
occurred); but for NBD_OPT_GO, qemu only advertises the feature if
block status is also enabled (that is, if the client does not
negotiate any contexts, then NBD_CMD_BLOCK_STATUS cannot be used, so
the feature is not advertised).

Signed-off-by: Eric Blake <eblake@redhat.com>
Message-ID: <20230925192229.3186470-26-eblake@redhat.com>
[eblake: fix logic to reject unnegotiated contexts]
Signed-off-by: Eric Blake <eblake@redhat.com>
2023-10-05 11:02:08 -05:00
Eric Blake
1dec4643d1 nbd/server: Prepare for per-request filtering of BLOCK_STATUS
The next commit will add support for the optional extension
NBD_CMD_FLAG_PAYLOAD during NBD_CMD_BLOCK_STATUS, where the client can
request that the server only return a subset of negotiated contexts,
rather than all contexts.  To make that task easier, this patch
populates the list of contexts to return on a per-command basis (for
now, identical to the full set of negotiated contexts).

Signed-off-by: Eric Blake <eblake@redhat.com>
Message-ID: <20230925192229.3186470-25-eblake@redhat.com>
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
2023-10-05 11:02:08 -05:00
Eric Blake
fd358d8390 nbd/server: Refactor list of negotiated meta contexts
Peform several minor refactorings of how the list of negotiated meta
contexts is managed, to make upcoming patches easier: Promote the
internal type NBDExportMetaContexts to the public opaque type
NBDMetaContexts, and mark exp const.  Use a shorter member name in
NBDClient.  Hoist calls to nbd_check_meta_context() earlier in their
callers, as the number of negotiated contexts may impact the flags
exposed in regards to an export, which in turn requires a new
parameter.  Drop a redundant parameter to nbd_negotiate_meta_queries.
No semantic change intended on the success path; on the failure path,
dropping context in nbd_check_meta_export even when reporting an error
is safer.

Signed-off-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
Message-ID: <20230925192229.3186470-24-eblake@redhat.com>
2023-10-05 11:02:08 -05:00
Eric Blake
56cf9d0471 nbd/client: Request extended headers during negotiation
All the pieces are in place for a client to finally request extended
headers.  Note that we must not request extended headers when qemu-nbd
is used to connect to the kernel module (as nbd.ko does not expect
them, but expects us to do the negotiation in userspace before handing
the socket over to the kernel), but there is no harm in all other
clients requesting them.

Extended headers are not essential to the information collected during
'qemu-nbd --list', but probing for it gives us one more piece of
information in that output.  Update the iotests affected by the new
line of output.

Signed-off-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
Message-ID: <20230925192229.3186470-23-eblake@redhat.com>
2023-10-05 11:02:08 -05:00
Eric Blake
a7c18670b4 nbd/client: Accept 64-bit block status chunks
Once extended mode is enabled, we need to accept 64-bit status replies
(even for replies that don't exceed a 32-bit length).  It is easier to
normalize narrow replies into wide format so that the rest of our code
only has to handle one width.  Although a server is non-compliant if
it sends a 64-bit reply in compact mode, or a 32-bit reply in extended
mode, it is still easy enough to tolerate these mismatches.

In normal execution, we are only requesting "base:allocation" which
never exceeds 32 bits for flag values. But during testing with
x-dirty-bitmap, we can force qemu to connect to some other context
that might have 64-bit status bit; however, we ignore those upper bits
(other than mapping qemu:allocation-depth into something that
'qemu-img map --output=json' can expose), and since that only affects
testing, we really don't bother with checking whether more than the
two least-significant bits are set.

Signed-off-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
Message-ID: <20230925192229.3186470-22-eblake@redhat.com>
2023-10-05 11:02:08 -05:00
Eric Blake
4fc55bf3b0 nbd/client: Initial support for extended headers
Update the client code to be able to send an extended request, and
parse an extended header from the server.  Note that since we reject
any structured reply with a too-large payload, we can always normalize
a valid header back into the compact form, so that the caller need not
deal with two branches of a union.  Still, until a later patch lets
the client negotiate extended headers, the code added here should not
be reached.  Note that because of the different magic numbers, it is
just as easy to trace and then tolerate a non-compliant server sending
the wrong header reply as it would be to insist that the server is
compliant.

Signed-off-by: Eric Blake <eblake@redhat.com>
Message-ID: <20230925192229.3186470-21-eblake@redhat.com>
[eblake: fix trace format]
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
2023-10-05 11:02:08 -05:00
Eric Blake
ed6c996c67 nbd/client: Plumb errp through nbd_receive_replies
Instead of ignoring the low-level error just to refabricate our own
message to pass to the caller, we can just plumb the caller's errp
down to the low level.

Signed-off-by: Eric Blake <eblake@redhat.com>
Message-ID: <20230925192229.3186470-20-eblake@redhat.com>
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
2023-10-05 11:02:08 -05:00
Eric Blake
9c1d261437 nbd/server: Enable initial support for extended headers
Time to start supporting clients that request extended headers.  Now
we can finally reach the code added across several previous patches.

Even though the NBD spec has been altered to allow us to accept
NBD_CMD_READ larger than the max payload size (provided our response
is a hole or broken up over more than one data chunk), we are not
planning to take advantage of that, and continue to cap NBD_CMD_READ
to 32M regardless of header size.

For NBD_CMD_WRITE_ZEROES and NBD_CMD_TRIM, the block layer already
supports 64-bit operations without any effort on our part.  For
NBD_CMD_BLOCK_STATUS, the client's length is a hint, and the previous
patch took care of implementing the required
NBD_REPLY_TYPE_BLOCK_STATUS_EXT.

We do not yet support clients that want to do request payload
filtering of NBD_CMD_BLOCK_STATUS; that will be added in later
patches, but is not essential for qemu as a client since qemu only
requests the single context base:allocation.

Signed-off-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
Message-ID: <20230925192229.3186470-19-eblake@redhat.com>
2023-10-05 11:02:08 -05:00
Eric Blake
bcc16cc19e nbd/server: Support 64-bit block status
The NBD spec states that if the client negotiates extended headers,
the server must avoid NBD_REPLY_TYPE_BLOCK_STATUS and instead use
NBD_REPLY_TYPE_BLOCK_STATUS_EXT which supports 64-bit lengths, even if
the reply does not need more than 32 bits.  As of this patch,
client->mode is still never NBD_MODE_EXTENDED, so the code added here
does not take effect until the next patch enables negotiation.

For now, all metacontexts that we know how to export never populate
more than 32 bits of information, so we don't have to worry about
NBD_REP_ERR_EXT_HEADER_REQD or filtering during handshake, and we
always send all zeroes for the upper 32 bits of status during
NBD_CMD_BLOCK_STATUS.

Note that we previously had some interesting size-juggling on call
chains, such as:

nbd_co_send_block_status(uint32_t length)
-> blockstatus_to_extents(uint32_t bytes)
  -> bdrv_block_status_above(bytes, &uint64_t num)
  -> nbd_extent_array_add(uint64_t num)
    -> store num in 32-bit length

But we were lucky that it never overflowed: bdrv_block_status_above
never sets num larger than bytes, and we had previously been capping
'bytes' at 32 bits (since the protocol does not allow sending a larger
request without extended headers).  This patch adds some assertions
that ensure we continue to avoid overflowing 32 bits for a narrow
client, while fully utilizing 64-bits all the way through when the
client understands that.  Even in 64-bit math, overflow is not an
issue, because all lengths are coming from the block layer, and we
know that the block layer does not support images larger than off_t
(if lengths were coming from the network, the story would be
different).

Signed-off-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
Message-ID: <20230925192229.3186470-18-eblake@redhat.com>
2023-10-05 11:02:08 -05:00
Eric Blake
11d3355fc8 nbd/server: Prepare to send extended header replies
Although extended mode is not yet enabled, once we do turn it on, we
need to reply with extended headers to all messages.  Update the low
level entry points necessary so that all other callers automatically
get the right header based on the current mode.

Signed-off-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
Message-ID: <20230925192229.3186470-17-eblake@redhat.com>
2023-10-05 11:02:08 -05:00
Eric Blake
c8720ca03e nbd/server: Prepare to receive extended header requests
Although extended mode is not yet enabled, once we do turn it on, we
need to accept extended requests for all messages.  Previous patches
have already taken care of supporting 64-bit lengths, now we just need
to read it off the wire.

Note that this implementation will block indefinitely on a buggy
client that sends a non-extended payload (that is, we try to read a
full packet before we ever check the magic number, but a client that
mistakenly sends a simple request after negotiating extended headers
doesn't send us enough bytes), but it's no different from any other
client that stops talking to us partway through a packet and thus not
worth coding around.

Signed-off-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
Message-ID: <20230925192229.3186470-16-eblake@redhat.com>
2023-10-05 11:02:08 -05:00
Eric Blake
009cd86650 nbd/server: Support a request payload
Upcoming additions to support NBD 64-bit effect lengths allow for the
possibility to distinguish between payload length (capped at 32M) and
effect length (64 bits, although we generally assume 63 bits because
of off_t limitations).  Without that extension, only the NBD_CMD_WRITE
request has a payload; but with the extension, it makes sense to allow
at least NBD_CMD_BLOCK_STATUS to have both a payload and effect length
in a future patch (where the payload is a limited-size struct that in
turn gives the real effect length as well as a subset of known ids for
which status is requested).  Other future NBD commands may also have a
request payload, so the 64-bit extension introduces a new
NBD_CMD_FLAG_PAYLOAD_LEN that distinguishes between whether the header
length is a payload length or an effect length, rather than
hard-coding the decision based on the command.

According to the spec, a client should never send a command with a
payload without the negotiation phase proving such extension is
available.  So in the unlikely event the bit is set or cleared
incorrectly, the client is already at fault; if the client then
provides the payload, we can gracefully consume it off the wire and
fail the command with NBD_EINVAL (subsequent checks for magic numbers
ensure we are still in sync), while if the client fails to send
payload we block waiting for it (basically deadlocking our connection
to the bad client, but not negatively impacting our ability to service
other clients, so not a security risk).  Note that we do not support
the payload version of BLOCK_STATUS yet.

This patch also fixes a latent bug introduced in b2578459: once
request->len can be 64 bits, assigning it to a 32-bit payload_len can
cause wraparound to 0 which then sets req->complete prematurely;
thankfully, the bug was not possible back then (it takes this and
later patches to even allow request->len larger than 32 bits; and
since previously the only 'payload_len = request->len' assignment was
in NBD_CMD_WRITE which also sets check_length, which in turn rejects
lengths larger than 32M before relying on any possibly-truncated value
stored in payload_len).

Signed-off-by: Eric Blake <eblake@redhat.com>
Message-ID: <20230925192229.3186470-15-eblake@redhat.com>
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
[eblake: enhance comment on handling client error, fix type bug]
Signed-off-by: Eric Blake <eblake@redhat.com>
2023-10-05 11:02:08 -05:00
Eric Blake
a960a27ed0 mailmap: Fix BALATON Zoltan author email
This fixes authorship of commits 5cbd51a5 and friends, where the
qemu-ppc mailing list rewrote the "From:" field in the corresponding
patches.  See commit 3bd2608db7 ("maint: Add .mailmap entries for
patches claiming list authorship") for explanation.

Signed-off-by: Eric Blake <eblake@redhat.com>
Message-ID: <20230927143815.3397386-8-eblake@redhat.com>
2023-10-05 11:02:08 -05:00
Eric Blake
8a18f28c00 maint: Tweak comments in mailmap regarding SPF
Documenting that we should not add new lines to work around SPF
rewrites sounds foreboding; the intent is instead that new lines here
are okay, but indicate a second problem elsewhere in our build process
that we should also consider fixing at the same time, to keep the
section from growing without bounds.  While we have been doing that
for qemu-devel for a while, we jut recently fixed that for qemu-block:
https://git.linaro.org/people/pmaydell/misc-scripts.git/commit/?id=f9a317392

Mentioning DMARC alongside SPF may also help people grep for this
scenario, as well as documenting the 'git config' workaround that can
be used by submitters to avoid the munging issue in the first place.

Note the subtlety: 'git commit' sets authorship information based on
user.name and user.email (where name is usually unquoted); while 'git
send-email' includes a body 'From:' line only when sendemail.from is
present but differs from authorship information.  Hence the use of
quotes in sendemail.from (not a semantic change to email, but enough
of a difference to add the body 'From:').

Fixes: 3bd2608d ("maint: Add .mailmap entries for patches claiming list authorship")
CC: Andrey Drobyshev <andrey.drobyshev@virtuozzo.com>
Cc: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Eric Blake <eblake@redhat.com>
Message-ID: <20230927143815.3397386-7-eblake@redhat.com>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
2023-10-05 11:02:08 -05:00
Andrey Drobyshev
994679b98a mailmap: Fix Andrey Drobyshev author email
This fixes authorship of commits 2848289168, 52b10c9c0c as the mailing
list rewrote the "From:" field in the corresponding patches.  See commit
3bd2608db7 ("maint: Add .mailmap entries for patches claiming list
authorship") for explanation.

Signed-off-by: Andrey Drobyshev <andrey.drobyshev@virtuozzo.com>
Message-ID: <20230926102801.512107-1-andrey.drobyshev@virtuozzo.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Eric Blake <eblake@redhat.com>
2023-10-05 11:02:07 -05:00
Stefan Hajnoczi
2f3913f4b2 virtio,pci: features, cleanups
vdpa:
       shadow vq vlan support
       net migration with cvq
 cxl:
      support emulating 4 HDM decoders
      serial number extended capability
 virtio:
       hared dma-buf
 
 Fixes, cleanups all over the place.
 
 Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
 -----BEGIN PGP SIGNATURE-----
 
 iQFDBAABCAAtFiEEXQn9CHHI+FuUyooNKB8NuNKNVGkFAmUd4/YPHG1zdEByZWRo
 YXQuY29tAAoJECgfDbjSjVRpyM8H/02cRbJcQOjYt7j68zPW6GaDXxBI/UmdWDyG
 15LZZbGNOPjyjNd3Vz1M7stQ5rhoKcgo/RdI+0E60a78svgW5JvpXoXR3pksc3Dx
 v28B/akXwHUErYFSZQ+2VHNc8OhCd0v2ehxZxbwPEAYIOAj3hcCIVoPGXTnKJmAJ
 imr5hjH0wZUc0+xdsmn8Vfdv5NTzpwfVObbGiMZejeJsaoh0y6Rt8RANBMY67KQD
 S7/HPlVuDYf/y43t4ZEHNYuV9RaCdZZYlLWwV1scdKaYcofgmtJOKbOdCjHRXgj+
 004Afb3rggIoCfnCzOFzhGx+MLDtLjvEn2N4oLEWCLi+k/3huaA=
 =GAvH
 -----END PGP SIGNATURE-----

Merge tag 'for_upstream' of https://git.kernel.org/pub/scm/virt/kvm/mst/qemu into staging

virtio,pci: features, cleanups

vdpa:
      shadow vq vlan support
      net migration with cvq
cxl:
     support emulating 4 HDM decoders
     serial number extended capability
virtio:
      hared dma-buf

Fixes, cleanups all over the place.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>

* tag 'for_upstream' of https://git.kernel.org/pub/scm/virt/kvm/mst/qemu: (53 commits)
  libvhost-user: handle shared_object msg
  vhost-user: add shared_object msg
  hw/display: introduce virtio-dmabuf
  util/uuid: add a hash function
  virtio: remove unused next argument from virtqueue_split_read_next_desc()
  virtio: remove unnecessary thread fence while reading next descriptor
  virtio: use shadow_avail_idx while checking number of heads
  libvhost-user.c: add assertion to vu_message_read_default
  pcie_sriov: unregister_vfs(): fix error path
  hw/i386/pc: improve physical address space bound check for 32-bit x86 systems
  amd_iommu: Fix APIC address check
  vdpa net: follow VirtIO initialization properly at cvq isolation probing
  vdpa net: stop probing if cannot set features
  vdpa net: fix error message setting virtio status
  hw/pci-bridge/cxl-upstream: Add serial number extended capability support
  hw/cxl: Support 4 HDM decoders at all levels of topology
  hw/cxl: Fix and use same calculation for HDM decoder block size everywhere
  hw/cxl: Add utility functions decoder interleave ways and target count.
  hw/cxl: Push cxl_decoder_count_enc() and cxl_decode_ig() into .c
  vdpa net: zero vhost_vdpa iova_tree pointer at cleanup
  ...

Conflicts:
  hw/core/machine.c
  Context conflict with commit 314e0a84cd ("hw/core: remove needless
  includes") because it removed an adjacent #include.
2023-10-05 09:01:01 -04:00
Stefan Hajnoczi
800af0aae1 accel: Introduce AccelClass::cpu_common_[un]realize
accel: Target agnostic code movement
 accel/tcg: Cleanups to use CPUState instead of CPUArchState
 accel/tcg: Move CPUNegativeOffsetState into CPUState
 tcg: Split out tcg init functions to tcg/startup.h
 linux-user/hppa: Fix struct target_sigcontext layout
 build: Remove --enable-gprof
 -----BEGIN PGP SIGNATURE-----
 
 iQFRBAABCgA7FiEEekgeeIaLTbaoWgXAZN846K9+IV8FAmUdsL4dHHJpY2hhcmQu
 aGVuZGVyc29uQGxpbmFyby5vcmcACgkQZN846K9+IV/iYggAvDJEyMCAXSSH97BA
 wZT/2D/MFIhOMk6xrQRnrXfrG70N0iVKz44jl9j7k1D+9BOHcso//DDJH3c96k9A
 MgDb6W2bsWvC15/Qw6BALf5bb/II0MJuCcQvj3CNX5lNkXAWhwIOBhsZx7V9ST1+
 rihN4nowpRWdV5GeCjDGaJW455Y1gc96hICYHy6Eqw1cUgUFt9vm5aYU3FHlat29
 sYRaVYKUL2hRUPPNcPiPq0AaJ8wN6/s8gT+V1UvTzkhHqskoM4ZU89RchuXVoq1h
 SvhKElyULMRzM7thWtpW8qYJPj4mxZsKArESvHjsunGD6KEz3Fh1sy6EKRcdmpG/
 II1vkg==
 =k2Io
 -----END PGP SIGNATURE-----

Merge tag 'pull-tcg-20231004' of https://gitlab.com/rth7680/qemu into staging

accel: Introduce AccelClass::cpu_common_[un]realize
accel: Target agnostic code movement
accel/tcg: Cleanups to use CPUState instead of CPUArchState
accel/tcg: Move CPUNegativeOffsetState into CPUState
tcg: Split out tcg init functions to tcg/startup.h
linux-user/hppa: Fix struct target_sigcontext layout
build: Remove --enable-gprof

# -----BEGIN PGP SIGNATURE-----
#
# iQFRBAABCgA7FiEEekgeeIaLTbaoWgXAZN846K9+IV8FAmUdsL4dHHJpY2hhcmQu
# aGVuZGVyc29uQGxpbmFyby5vcmcACgkQZN846K9+IV/iYggAvDJEyMCAXSSH97BA
# wZT/2D/MFIhOMk6xrQRnrXfrG70N0iVKz44jl9j7k1D+9BOHcso//DDJH3c96k9A
# MgDb6W2bsWvC15/Qw6BALf5bb/II0MJuCcQvj3CNX5lNkXAWhwIOBhsZx7V9ST1+
# rihN4nowpRWdV5GeCjDGaJW455Y1gc96hICYHy6Eqw1cUgUFt9vm5aYU3FHlat29
# sYRaVYKUL2hRUPPNcPiPq0AaJ8wN6/s8gT+V1UvTzkhHqskoM4ZU89RchuXVoq1h
# SvhKElyULMRzM7thWtpW8qYJPj4mxZsKArESvHjsunGD6KEz3Fh1sy6EKRcdmpG/
# II1vkg==
# =k2Io
# -----END PGP SIGNATURE-----
# gpg: Signature made Wed 04 Oct 2023 14:36:46 EDT
# gpg:                using RSA key 7A481E78868B4DB6A85A05C064DF38E8AF7E215F
# gpg:                issuer "richard.henderson@linaro.org"
# gpg: Good signature from "Richard Henderson <richard.henderson@linaro.org>" [full]
# Primary key fingerprint: 7A48 1E78 868B 4DB6 A85A  05C0 64DF 38E8 AF7E 215F

* tag 'pull-tcg-20231004' of https://gitlab.com/rth7680/qemu: (47 commits)
  tcg/loongarch64: Fix buid error
  tests/avocado: Re-enable MIPS Malta tests (GitLab issue #1884 fixed)
  build: Remove --enable-gprof
  linux-user/hppa: Fix struct target_sigcontext layout
  tcg: Split out tcg init functions to tcg/startup.h
  tcg: Remove argument to tcg_prologue_init
  accel/tcg: Make cpu-exec-common.c a target agnostic unit
  accel/tcg: Make icount.o a target agnostic unit
  accel/tcg: Make monitor.c a target-agnostic unit
  accel/tcg: Rename target-specific 'internal.h' -> 'internal-target.h'
  exec: Rename target specific page-vary.c -> page-vary-target.c
  exec: Rename cpu.c -> cpu-target.c
  accel: Rename accel-common.c -> accel-target.c
  accel: Make accel-blocker.o target agnostic
  accel/tcg: Restrict dump_exec_info() declaration
  exec: Move cpu_loop_foo() target agnostic functions to 'cpu-common.h'
  exec: Make EXCP_FOO definitions target agnostic
  accel/tcg: move ld/st helpers to ldst_common.c.inc
  accel/tcg: Unify user and softmmu do_[st|ld]*_mmu()
  accel/tcg: Remove env_tlb()
  ...

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2023-10-05 08:55:34 -04:00
Albert Esteve
ce0f3b032a libvhost-user: handle shared_object msg
In the libvhost-user library we need to
handle VHOST_USER_GET_SHARED_OBJECT requests,
and add helper functions to allow sending messages
to interact with the virtio shared objects
hash table.

Signed-off-by: Albert Esteve <aesteve@redhat.com>
Message-Id: <20231002065706.94707-5-aesteve@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2023-10-04 18:15:06 -04:00
Albert Esteve
1609476662 vhost-user: add shared_object msg
Add three new vhost-user protocol
`VHOST_USER_BACKEND_SHARED_OBJECT_* messages`.
These new messages are sent from vhost-user
back-ends to interact with the virtio-dmabuf
table in order to add or remove themselves as
virtio exporters, or lookup for virtio dma-buf
shared objects.

The action taken in the front-end depends
on the type stored in the virtio shared
object hash table.

When the table holds a pointer to a vhost
backend for a given UUID, the front-end sends
a VHOST_USER_GET_SHARED_OBJECT to the
backend holding the shared object.

The messages can only be sent after successfully
negotiating a new VHOST_USER_PROTOCOL_F_SHARED_OBJECT
vhost-user protocol feature bit.

Finally, refactor code to send response message so
that all common parts both for the common REPLY_ACK
case, and other data responses, can call it and
avoid code repetition.

Signed-off-by: Albert Esteve <aesteve@redhat.com>
Message-Id: <20231002065706.94707-4-aesteve@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2023-10-04 18:15:06 -04:00
Albert Esteve
faefdba847 hw/display: introduce virtio-dmabuf
This API manages objects (in this iteration,
dmabuf fds) that can be shared along different
virtio devices, associated to a UUID.

The API allows the different devices to add,
remove and/or retrieve the objects by simply
invoking the public functions that reside in the
virtio-dmabuf file.

For vhost backends, the API stores the pointer
to the backend holding the object.

Suggested-by: Gerd Hoffmann <kraxel@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Albert Esteve <aesteve@redhat.com>
Message-Id: <20231002065706.94707-3-aesteve@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2023-10-04 18:15:06 -04:00
Albert Esteve
a6ceee591a util/uuid: add a hash function
Add hash function to uuid module using the
djb2 hash algorithm.

Add a couple simple unit tests for the hash
function, checking collisions for similar UUIDs.

Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Albert Esteve <aesteve@redhat.com>
Message-Id: <20231002065706.94707-2-aesteve@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2023-10-04 18:15:06 -04:00
Ilya Maximets
70f88436aa virtio: remove unused next argument from virtqueue_split_read_next_desc()
The 'next' was converted from a local variable to an output parameter
in commit:
  412e0e81b1 ("virtio: handle virtqueue_read_next_desc() errors")

But all the actual uses of the 'i/next' as an output were removed a few
months prior in commit:
  aa570d6fb6 ("virtio: combine the read of a descriptor")

Remove the unused argument to simplify the code.

Also, adding a comment to the function to describe what it is actually
doing, as it is not obvious that the 'desc' is both an input and an
output argument.

Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
Message-Id: <20230927140016.2317404-3-i.maximets@ovn.org>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2023-10-04 18:15:06 -04:00
Ilya Maximets
d501f97d96 virtio: remove unnecessary thread fence while reading next descriptor
It was supposed to be a compiler barrier and it was a compiler barrier
initially called 'wmb' when virtio core support was introduced.
Later all the instances of 'wmb' were switched to smp_wmb to fix memory
ordering issues on non-x86 platforms.  However, this one doesn't need
to be an actual barrier, as its only purpose was to ensure that the
value is not read twice.

And since commit aa570d6fb6 ("virtio: combine the read of a descriptor")
there is no need for a barrier at all, since we're no longer reading
guest memory here, but accessing a local structure.

Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
Message-Id: <20230927140016.2317404-2-i.maximets@ovn.org>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2023-10-04 18:15:06 -04:00
Ilya Maximets
850cd20b07 virtio: use shadow_avail_idx while checking number of heads
We do not need the most up to date number of heads, we only want to
know if there is at least one.

Use shadow variable as long as it is not equal to the last available
index checked.  This avoids expensive qatomic dereference of the
RCU-protected memory region cache as well as the memory access itself.

The change improves performance of the af-xdp network backend by 2-3%.

Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
Message-Id: <20230927135157.2316982-1-i.maximets@ovn.org>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2023-10-04 18:15:06 -04:00
Vladimir Sementsov-Ogievskiy
a6f4d2ec42 libvhost-user.c: add assertion to vu_message_read_default
Explain Coverity that we are not going to overflow vmsg->fds.

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
Message-Id: <20230925194040.68592-5-vsementsov@yandex-team.ru>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2023-10-04 18:15:06 -04:00
Vladimir Sementsov-Ogievskiy
f1a153857a pcie_sriov: unregister_vfs(): fix error path
local_err must be NULL before calling object_property_set_bool(), so we
must clear it on each iteration. Let's also use more convenient
error_reportf_err().

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
Message-Id: <20230925194040.68592-8-vsementsov@yandex-team.ru>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2023-10-04 18:15:06 -04:00
Ani Sinha
cf0386509e hw/i386/pc: improve physical address space bound check for 32-bit x86 systems
32-bit x86 systems do not have a reserved memory for hole64. On those 32-bit
systems without PSE36 or PAE CPU features, hotplugging memory devices are not
supported by QEMU as QEMU always places hotplugged memory above 4 GiB boundary
which is beyond the physical address space of the processor. Linux guests also
does not support memory hotplug on those systems. Please see Linux
kernel commit b59d02ed08690 ("mm/memory_hotplug: disable the functionality
for 32b") for more details.

Therefore, the maximum limit of the guest physical address in the absence of
additional memory devices effectively coincides with the end of
"above 4G memory space" region for 32-bit x86 without PAE/PSE36. When users
configure additional memory devices, after properly accounting for the
additional device memory region to find the maximum value of the guest
physical address, the address will be outside the range of the processor's
physical address space.

This change adds improvements to take above into consideration.

For example, previously this was allowed:

$ ./qemu-system-x86_64 -cpu pentium -m size=10G

With this change now it is no longer allowed:

$ ./qemu-system-x86_64 -cpu pentium -m size=10G
qemu-system-x86_64: Address space limit 0xffffffff < 0x2bfffffff phys-bits too low (32)

However, the following are allowed since on both cases physical address
space of the processor is 36 bits:

$ ./qemu-system-x86_64 -cpu pentium2 -m size=10G
$ ./qemu-system-x86_64 -cpu pentium,pse36=on -m size=10G

For 32-bit, without PAE/PSE36, hotplugging additional memory is no longer allowed.

$ ./qemu-system-i386 -m size=1G,maxmem=3G,slots=2
qemu-system-i386: Address space limit 0xffffffff < 0x1ffffffff phys-bits too low (32)
$ ./qemu-system-i386 -machine q35 -m size=1G,maxmem=3G,slots=2
qemu-system-i386: Address space limit 0xffffffff < 0x1ffffffff phys-bits too low (32)

A new compatibility flag is introduced to make sure pc_max_used_gpa() keeps
returning the old value for machines 8.1 and older.
Therefore, the above is still allowed for older machine types in order to support
compatibility. Hence, the following still works:

$ ./qemu-system-i386 -machine pc-i440fx-8.1 -m size=1G,maxmem=3G,slots=2
$ ./qemu-system-i386 -machine pc-q35-8.1 -m size=1G,maxmem=3G,slots=2

Further, following is also allowed as with PSE36, the processor has 36-bit
address space:

$ ./qemu-system-i386 -cpu 486,pse36=on -m size=1G,maxmem=3G,slots=2

After calling CPUID with EAX=0x80000001, all AMD64 compliant processors
have the longmode-capable-bit turned on in the extended feature flags (bit 29)
in EDX. The absence of CPUID longmode can be used to differentiate between
32-bit and 64-bit processors and is the recommended approach. QEMU takes this
approach elsewhere (for example, please see x86_cpu_realizefn()), With
this change, pc_max_used_gpa() also uses the same method to detect 32-bit
processors.

Unit tests are modified to not run 32-bit x86 tests that use memory hotplug.

Suggested-by: David Hildenbrand <david@redhat.com>
Signed-off-by: Ani Sinha <anisinha@redhat.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Message-Id: <20230922160413.165702-1-anisinha@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2023-10-04 18:15:06 -04:00
Akihiko Odaki
0114c45130 amd_iommu: Fix APIC address check
An MSI from I/O APIC may not exactly equal to APIC_DEFAULT_ADDRESS. In
fact, Windows 17763.3650 configures I/O APIC to set the dest_mode bit.
Cover the range assigned to APIC.

Fixes: 577c470f43 ("x86_iommu/amd: Prepare for interrupt remap support")
Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com>
Message-Id: <20230921114612.40671-1-akihiko.odaki@daynix.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2023-10-04 18:15:06 -04:00
Eugenio Pérez
845ec38ae1 vdpa net: follow VirtIO initialization properly at cvq isolation probing
This patch solves a few issues.  The most obvious is that the feature
set was done previous to ACKNOWLEDGE | DRIVER status bit set.  Current
vdpa devices are permissive with this, but it is better to follow the
standard.

Fixes: 152128d646 ("vdpa: move CVQ isolation check to net_init_vhost_vdpa")
Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
Message-Id: <20230915170836.3078172-4-eperezma@redhat.com>
Tested-by: Lei Yang <leiyang@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2023-10-04 18:15:06 -04:00