Also note that we were missing the qemu_target_list entry
for plain sparc; fix that at the same time.
Reviewed-by: Laurent Vivier <laurent@vivier.eu>
Tested-by: Laurent Vivier <laurent@vivier.eu>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20191106113318.10226-2-richard.henderson@linaro.org>
Signed-off-by: Laurent Vivier <laurent@vivier.eu>
Commit 65f14ab98d ("usb-host: skip reset for untouched devices")
filters out multiple usb device resets in a row. While this improves
the situation for usb some devices it doesn't work for others :-(
So go add a config option to make the behavior configurable.
Buglink: https://bugs.launchpad.net/bugs/1846451
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Message-id: 20191015064426.19454-1-kraxel@redhat.com
Host notifiers are used in several cases:
1. Traditional ioeventfd where virtqueue notifications are handled in
the main loop thread.
2. IOThreads (aio_handle_output) where virtqueue notifications are
handled in an IOThread AioContext.
3. vhost where virtqueue notifications are handled by kernel vhost or
a vhost-user device backend.
Most virtqueue notifications from the guest use the ioeventfd mechanism,
but there are corner cases where QEMU code calls virtio_queue_notify().
This currently honors the host notifier for the IOThreads
aio_handle_output case, but not for the vhost case. The result is that
vhost does not receive virtqueue notifications from QEMU when
virtio_queue_notify() is called.
This patch extends virtio_queue_notify() to set the host notifier
whenever it is enabled instead of calling the vq->(aio_)handle_output()
function directly. We track the host notifier state for each virtqueue
separately since some devices may use it only for certain virtqueues.
This fixes the vhost case although it does add a trip through the
eventfd for the traditional ioeventfd case. I don't think it's worth
adding a fast path for the traditional ioeventfd case because calling
virtio_queue_notify() is rare when ioeventfd is enabled.
Reported-by: Felipe Franciosi <felipe@nutanix.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-Id: <20191105140946.165584-1-stefanha@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
The virtual address that is provided by the guest in post_send and
post_recv operations is related to the guest address space. This address
space is unknown to the HCA resides on host so extra step in these
operations is needed to adjust the address to host virtual address.
This step, which is done in data-path affects performances.
An enhanced verion of MR registration introduced here
https://patchwork.kernel.org/patch/11044467/ can be used so that the
guest virtual address space for this MR is known to the HCA in host.
This will save the data-path adjustment.
Signed-off-by: Yuval Shaia <yuval.shaia@oracle.com>
Reviewed-by: Marcel Apfelbaum <marcel.apfelbaum@gmail.com>
Message-Id: <20190818132107.18181-3-yuval.shaia@oracle.com>
Signed-off-by: Marcel Apfelbaum <marcel.apfelbaum@gmail.com>
The function reg_mr_iova is an enhanced version of ibv_reg_mr function
that can help to easly register and use guest's MRs.
Add check in 'configure' phase to detect if we have libibverbs with this
support.
Signed-off-by: Yuval Shaia <yuval.shaia@oracle.com>
Reviewed-by: Marcel Apfelbaum <marcel.apfelbaum@gmail.com>
Message-Id: <20190818132107.18181-2-yuval.shaia@oracle.com>
Signed-off-by: Marcel Apfelbaum <marcel.apfelbaum@gmail.com>
The PIIX3 is not tied to the i440FX and can even be used without it.
Move its creation to the machine code (pc_piix.c).
We have now removed the last trace of southbridge code in the i440FX
northbridge.
Reviewed-by: Aleksandar Markovic <amarkovic@wavecomp.com>
Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com>
We moved all the PIIX3 southbridge code out of hw/pci-host/piix.c,
it now only contains i440FX northbridge code.
Rename it to match the chipset modelled.
Reviewed-by: Aleksandar Markovic <amarkovic@wavecomp.com>
Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Move all the PIIX3 functions to a new file: hw/isa/piix3.c.
Reviewed-by: Aleksandar Markovic <amarkovic@wavecomp.com>
Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com>
We will move this code, fix its style first.
Reviewed-by: Aleksandar Markovic <amarkovic@wavecomp.com>
Reviewed-by: Esteban Bosse <estebanbosse@gmail.com>
Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com>
The hw/pci-host/piix.c contains a mix of PIIX3 and i440FX chipsets
functions. To be able to split it, we need to export some
declarations first.
Reviewed-by: Aleksandar Markovic <amarkovic@wavecomp.com>
Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com>
The IRQ Route Control registers definitions belong to the PIIX
chipset. We were only defining the 'A' register. Define the other
B, C and D registers, and use them.
Acked-by: Paul Durrant <paul@xen.org>
Reviewed-by: Aleksandar Markovic <amarkovic@wavecomp.com>
Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com>
The RCR_IOPORT register belongs to the PIIX chipset.
Move the definition to "piix.h", and prepend the PIIX prefix.
Reviewed-by: Aleksandar Markovic <amarkovic@wavecomp.com>
Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Extract the PIIX3 creation code from the i440fx_init() function.
Reviewed-by: Aleksandar Markovic <amarkovic@wavecomp.com>
Reviewed-by: Esteban Bosse <estebanbosse@gmail.com>
Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com>
These devices implemented their load_state_old() handler 10 years
ago, previous to QEMU v0.12.
Since commit cc425b5ddf removed the pc-0.10 and pc-0.11 machines,
we can drop this code.
Note: the mips_r4k machine started to use the i8254 device just
after QEMU v0.5.0, but the MIPS machine types are not versioned,
so there is no migration compatibility issue removing this handler.
Suggested-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Aleksandar Markovic <amarkovic@wavecomp.com>
Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Now that we properly refactored the piix4_create() function, let's
move it to hw/isa/piix4.c where it belongs, so it can be reused
on other places.
Reviewed-by: Aleksandar Markovic <amarkovic@wavecomp.com>
Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com>
The Malta board instantiate a PIIX4 chipset doing various
calls. Refactor all those related calls into a single
function: piix4_create().
Reviewed-by: Aleksandar Markovic <amarkovic@wavecomp.com>
Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com>
In the next commit we'll refactor the PIIX4 code out of
mips_malta_init(). As a preliminary step, add the 'ide_drives'
variable and create the drive array dynamically.
Reviewed-by: Aleksandar Markovic <amarkovic@wavecomp.com>
Reviewed-by: Li Qiang <liq3ea@gmail.com>
Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Remove mc146818rtc instanciated in malta board, to not have it twice.
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Acked-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Hervé Poussineau <hpoussin@reactos.org>
Message-Id: <20171216090228.28505-13-hpoussin@reactos.org>
[PMD: rebased, set RTC base_year to 2000]
Reviewed-by: Aleksandar Markovic <amarkovic@wavecomp.com>
Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Remove i8254 instanciated in malta board, to not have it twice.
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Acked-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Hervé Poussineau <hpoussin@reactos.org>
Message-Id: <20171216090228.28505-10-hpoussin@reactos.org>
Reviewed-by: Aleksandar Markovic <amarkovic@wavecomp.com>
Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com>
The i8257 is not a chipset on the Malta board, but is part of
the PIIX4 chipset.
Create the i8257 in the PIIX4 code, remove the one instantiated
in malta board, to not have it twice.
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Acked-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Hervé Poussineau <hpoussin@reactos.org>
Message-Id: <20171216090228.28505-9-hpoussin@reactos.org>
Reviewed-by: Aleksandar Markovic <amarkovic@wavecomp.com>
Reviewed-by: Esteban Bosse <estebanbosse@gmail.com>
[PMD: rebased, reworded description]
Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Other piix4 parts are already named piix4-ide and piix4-usb-uhci.
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Acked-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Hervé Poussineau <hpoussin@reactos.org>
Message-Id: <20171216090228.28505-15-hpoussin@reactos.org>
Reviewed-by: Aleksandar Markovic <amarkovic@wavecomp.com>
Reviewed-by: Li Qiang <liq3ea@gmail.com>
Reviewed-by: Esteban Bosse <estebanbosse@gmail.com>
[PMD: rebased]
Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com>
This function isn't used anymore.
This reverts commit 22ec3283ef.
Reviewed-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Li Qiang <liq3ea@gmail.com>
Reviewed-by: Esteban Bosse <estebanbosse@gmail.com>
Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Add ISA irqs as piix4 gpio in, and CPU interrupt request as piix4 gpio out.
Remove i8259 instanciated in malta board, to not have it twice.
We can also remove the now unused piix4_init() function.
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Acked-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Hervé Poussineau <hpoussin@reactos.org>
Message-Id: <20171216090228.28505-8-hpoussin@reactos.org>
Reviewed-by: Aleksandar Markovic <amarkovic@wavecomp.com>
[PMD: rebased, updated includes, use ISA_NUM_IRQS in for loop]
Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com>
The RCR I/O port (0xcf9) is used to generate a hard reset or a soft reset.
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Acked-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Hervé Poussineau <hpoussin@reactos.org>
Message-Id: <20171216090228.28505-7-hpoussin@reactos.org>
Reviewed-by: Aleksandar Markovic <amarkovic@wavecomp.com>
Reviewed-by: Li Qiang <liq3ea@gmail.com>
[PMD: rebased, updated includes]
Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com>
The PIIX4 Southbridge is not used by the PC machine,
but by the Malta board (MIPS). Add a new section to
keep it covered.
Suggested-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Aleksandar Markovic <amarkovic@wavecomp.com>
Reviewed-by: Li Qiang <liq3ea@163.com>
Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
When hw/$DIR/Kconfig is changed, the corresponding generated
hw/$DIR/config-devices.mak is not being updated.
Fix this by including all the hw/*/Kconfig files to the prerequisite
names of the rule generating the config-devices.mak files.
Fixes: e0e312f352 (build: switch to Kconfig)
Reported-by: Peter Maydell <peter.maydell@linaro.org>
Suggested-by: Daniel P. Berrangé <berrange@redhat.com>
Reviewed-by: Laurent Vivier <lvivier@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com>
-----BEGIN PGP SIGNATURE-----
iQIzBAABCAAdFiEEicHnj2Ae6GyGdJXLoqP9bt6twN4FAl2/Us4ACgkQoqP9bt6t
wN7dLhAAxlnmK9M7DeraiyCCVwgIzYL4sFssIAcIt5wbDg8p6od+hgeKcRwPKJbP
mg2a2Ehc1hoIJD3xUUVwJIkV12CONPnx+H3LEkJXY+w3ZqwbkNNJHy+5FBuC+h8P
1lTZDK6ulXkNR7OfeKys8Mzf6Ukf0TJsEuXHZWxC/e3I3rpf0+/FqP5QwHbi18Q2
4rCSy/59eaBiMBphlZHBncVOo1Kv1hKqpqSc9ddGj3uwyDpcjUThz4NEDhdXFE+r
0tKTPbv0f+z8CG9jAOgbmbFNFxwFb7D4uouwflFtNXleb5cdVGxQsAsJmYDvTmjP
3Qnvqiuw1BYRGpG1+54l0F82AThV1jlmlWVt/34PfwmGJRgMoTtDUJmW+SfjC3YT
MB+r78v4a6lavAxj780YIWVQzdvO4pG6fKKRbtMXNw4hyVxSWCBv1xC9ioKe1Xn1
LNI+rAY7ohYt/dN1aNipdFwk33NYHOOGDP2eU9GGL76tMrY5jK8i5KSqo1SNvxak
zpAJggMZXaG4BtGj7qCmMngQeC2yJwK1lou4P/S5A5OYDlHWjeszKBHa+hzQa/XH
3U8hEgRMjyeMzyyvh1OLDjWnxAMMeoPqb/hajmuZ1qM0oeBcKXbYtijNdAWIOK2q
AJeTeHWJ9WTkFvThwSSVDu6I6Snfhecvk96tKft0pr0rXYM6bsg=
=0aXg
-----END PGP SIGNATURE-----
Merge remote-tracking branch 'remotes/philmd-gitlab/tags/fw_cfg-next-pull-request' into staging
Fix the fw_cfg reboot-timeout=-1 special value, add a test for it.
# gpg: Signature made Sun 03 Nov 2019 22:21:02 GMT
# gpg: using RSA key 89C1E78F601EE86C867495CBA2A3FD6EDEADC0DE
# gpg: Good signature from "Philippe Mathieu-Daudé (Phil) <philmd@redhat.com>" [marginal]
# gpg: WARNING: This key is not certified with sufficiently trusted signatures!
# gpg: It is not certain that the signature belongs to the owner.
# Primary key fingerprint: 89C1 E78F 601E E86C 8674 95CB A2A3 FD6E DEAD C0DE
* remotes/philmd-gitlab/tags/fw_cfg-next-pull-request:
tests/fw_cfg: Test 'reboot-timeout=-1' special value
fw_cfg: Allow reboot-timeout=-1 again
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Linux kernel 5.4 will introduce a new memory map for SWIM device.
(aee6bff1c325 ("m68k: mac: Revisit floppy disc controller base addresses"))
Until this release all MMIO are mapped between 0x50f00000 and 0x50f40000,
but it appears that for real hardware 0x50f00000 is not the base address:
the MMIO region spans 0x50000000 through 0x60000000, and 0x50040000 through
0x54000000 is repeated images of 0x50000000 to 0x50040000.
Fixed: 04e7ca8d0f ("hw/m68k: define Macintosh Quadra 800")
Signed-off-by: Laurent Vivier <laurent@vivier.eu>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20191104101513.29518-1-laurent@vivier.eu>
Signed-off-by: Laurent Vivier <laurent@vivier.eu>
OSError can't be used like a tuple on Python 3, so change the
code to use `e.sterror` instead of `e[1]`.
Reported-by: John Snow <jsnow@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Reviewed-by: John Snow <jsnow@redhat.com>
Message-id: 20191021214117.18091-1-ehabkost@redhat.com
Message-Id: <20191021214117.18091-1-ehabkost@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Instead of manually encoding stderr and stdout output, use
`errors` parameter of subprocess.Popen(). This will make
process.communicate() return unicode strings instead of bytes
objects.
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Reviewed-by: John Snow <jsnow@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Message-id: 20191016192430.25098-11-ehabkost@redhat.com
Message-Id: <20191016192430.25098-11-ehabkost@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
image-fuzzer is now supposed to be ready to run using Python 3.
Remove the __future__ imports and change the interpreter line to
"#!/usr/bin/env python3".
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Reviewed-by: John Snow <jsnow@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Message-id: 20191016192430.25098-10-ehabkost@redhat.com
Message-Id: <20191016192430.25098-10-ehabkost@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Callers of create_image() will pass strings as arguments, but the
Image class will expect bytes objects to be provided. Encode
them inside create_image().
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Reviewed-by: John Snow <jsnow@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Message-id: 20191016192430.25098-9-ehabkost@redhat.com
Message-Id: <20191016192430.25098-9-ehabkost@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Field values are supposed to be bytes objects, not unicode
strings. Change two constants that were declared as strings.
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Reviewed-by: John Snow <jsnow@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Message-id: 20191016192430.25098-8-ehabkost@redhat.com
Message-Id: <20191016192430.25098-8-ehabkost@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
No caller of fuzzer functions is interested in unicode string values,
so replace them with bytes sequences.
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Reviewed-by: John Snow <jsnow@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Message-id: 20191016192430.25098-7-ehabkost@redhat.com
Message-Id: <20191016192430.25098-7-ehabkost@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
This makes the formatting code simpler, and safer if we change
the type of self.value from str to bytes.
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Reviewed-by: John Snow <jsnow@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Message-id: 20191016192430.25098-6-ehabkost@redhat.com
Message-Id: <20191016192430.25098-6-ehabkost@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
StringIO.StringIO is not available on Python 3, but io.StringIO
is available on both Python 2 and 3. io.StringIO is slightly
different from the Python 2 StringIO module, though, so we need
bytes coming from subprocess.Popen() to be explicitly decoded.
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Reviewed-by: John Snow <jsnow@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Message-id: 20191016192430.25098-5-ehabkost@redhat.com
Message-Id: <20191016192430.25098-5-ehabkost@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Most of the division expressions in image-fuzzer assume integer
division. Use the // operator to keep the same behavior when we
move to Python 3.
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Reviewed-by: John Snow <jsnow@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Message-id: 20191016192430.25098-4-ehabkost@redhat.com
Message-Id: <20191016192430.25098-4-ehabkost@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
This is necessary for Python 3 compatibility.
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Reviewed-by: John Snow <jsnow@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Message-id: 20191016192430.25098-3-ehabkost@redhat.com
Message-Id: <20191016192430.25098-3-ehabkost@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
This probably never caused problems because on Linux there's no
actual newline conversion happening, but on Python 3 the
binary/text distinction is stronger and we must explicitly open
the image file in binary mode.
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Reviewed-by: John Snow <jsnow@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Message-id: 20191016192430.25098-2-ehabkost@redhat.com
Message-Id: <20191016192430.25098-2-ehabkost@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
The Plug & Play region of the AHB/APB bridge can be accessed
by various word size, however the implementation is clearly
restricted to 32-bit:
static uint64_t grlib_apb_pnp_read(void *opaque, hwaddr offset, unsigned size)
{
APBPnp *apb_pnp = GRLIB_APB_PNP(opaque);
return apb_pnp->regs[offset >> 2];
}
Set the MemoryRegionOps::impl min/max fields to 32-bit, so
memory.c::access_with_adjusted_size() can adjust when the
access is not 32-bit.
This is required to run RTEMS on leon3, the grlib scanning
functions do byte accesses.
Reported-by: Jiri Gaisler <jiri@gaisler.se>
Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Reviewed-by: KONRAD Frederic <frederic.konrad@adacore.com>
Message-Id: <20191025110114.27091-3-philmd@redhat.com>
Signed-off-by: Laurent Vivier <laurent@vivier.eu>
Improve the help text of the "-display" option:
- Only print the options that we have enabled in the binary
(similar to what we do for other options like -netdev already)
- The "frame=on|off" from "-display sdl" has been removed in commit
09bd7ba9f5 ("Remove deprecated -no-frame option"), so we should
not show this in the help text anymore
- The "-display egl-headless" line was missing a "\n" at the end
- Indent the default display text in a nicer way
Signed-off-by: Thomas Huth <huth@tuxfamily.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Message-id: 20191023120129.13721-1-huth@tuxfamily.org
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
This change includes support for all AF_NETLINK socket options up to about
kernel version 5.4 (5.4 is not formally released at the time of writing).
Socket options that were introduced in kernel versions before the oldest
currently stable kernel version are guarded by kernel version macros.
This change has been built under gcc 8.3, and clang 9.0, and it passes
`make check`. The netlink options have been tested by emulating some
non-trival software that uses NETLINK socket options, but they have
not been exaustively verified.
Signed-off-by: Josh Kunz <jkz@google.com>
Message-Id: <20191029224310.164025-1-jkz@google.com>
[lv: updated patch according to CODING_STYLE]
Signed-off-by: Laurent Vivier <laurent@vivier.eu>
PCIe requester IDs are used by modern IOMMUs to differentiate devices
in order to provide a unique IOVA address space per device. These
requester IDs are composed of the bus/device/function (BDF) of the
requesting device. Conventional PCI pre-dates this concept and is
simply a shared parallel bus where transactions are claimed by
decoding target ranges rather than the packetized, point-to-point
mechanisms of PCI-express. In order to interface conventional PCI
to PCIe, the PCIe-to-PCI bridge creates and accepts packetized
transactions on behalf of all downstream devices, using one of two
potential forms of a requester ID relating to the bridge itself or its
subordinate bus. All downstream devices are therefore aliased by the
bridge's requester ID and it's not possible for the IOMMU to create
unique IOVA spaces for devices downstream of such buses.
At least that's how it works on bare metal. Until now point we've
ignored this nuance of vIOMMU support in QEMU, creating a unique
AddressSpace per device regardless of the virtual bus topology.
Aside from simply being true to bare metal behavior, there are aspects
of a shared address space that we can use to our advantage when
designing a VM. For instance, a PCI device assignment scenario where
we have the following IOMMU group on the host system:
$ ls /sys/kernel/iommu_groups/1/devices/
0000:00:01.0 0000:01:00.0 0000:01:00.1
An IOMMU group is considered the smallest set of devices which are
fully DMA isolated from other devices by the IOMMU. In this case the
root port at 00:01.0 does not guarantee that it prevents peer to peer
traffic between the endpoints on bus 01: and the devices are therefore
grouped together. VFIO considers an IOMMU group to be the smallest
unit of device ownership and allows only a single shared IOVA space
per group due to the limitations of the isolation.
Therefore, if we attempt to create the following VM, we get an error:
qemu-system-x86_64 -machine q35... \
-device intel-iommu,intremap=on \
-device pcie-root-port,addr=1e.0,id=pcie.1 \
-device vfio-pci,host=1:00.0,bus=pcie.1,addr=0.0,multifunction=on \
-device vfio-pci,host=1:00.1,bus=pcie.1,addr=0.1
qemu-system-x86_64: -device vfio-pci,host=1:00.1,bus=pcie.1,addr=0.1: vfio \
0000:01:00.1: group 1 used in multiple address spaces
VFIO only allows a single IOVA space (AddressSpace) for both devices,
but we've placed them into a topology where the vIOMMU expects a
separate AddressSpace for each device. On bare metal we know that
a conventional PCI bus would provide the sort of aliasing we need
here, forcing the IOMMU to consider these devices to be part of a
single shared IOVA space. The support provided here does the same
for QEMU, such that we can create a conventional PCI topology to
expose equivalent AddressSpace sharing requirements to the VM:
qemu-system-x86_64 -machine q35... \
-device intel-iommu,intremap=on \
-device pcie-pci-bridge,addr=1e.0,id=pci.1 \
-device vfio-pci,host=1:00.0,bus=pci.1,addr=1.0,multifunction=on \
-device vfio-pci,host=1:00.1,bus=pci.1,addr=1.1
There are pros and cons to this configuration; it's not necessarily
recommended, it's simply a tool we can use to create configurations
which may provide additional functionality in spite of host hardware
limitations or as a benefit to the guest configuration or resource
usage. An incomplete list of pros and cons:
Cons:
a) Extended PCI configuration space is unavailable to devices
downstream of a conventional PCI bus. The degree to which this
is a drawback depends on the device and guest drivers.
b) Applying this topology to devices which are already isolated by
the host IOMMU (singleton IOMMU groups) will result in devices
which appear to be non-isolated to the VM (non-singleton groups).
This can limit configurations within the guest, such as userspace
drivers or nested device assignment.
Pros:
a) QEMU better emulates bare metal.
b) Configurations as above are now possible.
c) Host IOMMU resources and VM locked memory requirements are reduced
in vIOMMU configurations due to shared IOMMU domains on the host
and avoidance of duplicate locked memory accounting.
Reviewed-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Message-Id: <157187083548.5439.14747141504058604843.stgit@gimli.home>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Memory block commands are only supported for linux with sysfs,
"guest-get-memory-block-info" was not in blacklist for other
cases.
Reported on:
https://bugzilla.redhat.com/show_bug.cgi?id=1751431
Signed-off-by: Basil Salman <bsalman@redhat.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>