vIOMMU support works already with RamDiscardManager as long as guests only
map populated memory. Both, populated and discarded memory is mapped
into &address_space_memory, where vfio_get_xlat_addr() will find that
memory, to create the vfio mapping.
Sane guests will never map discarded memory (e.g., unplugged memory
blocks in virtio-mem) into an IOMMU - or keep it mapped into an IOMMU while
memory is getting discarded. However, there are two cases where a malicious
guests could trigger pinning of more memory than intended.
One case is easy to handle: the guest trying to map discarded memory
into an IOMMU.
The other case is harder to handle: the guest keeping memory mapped in
the IOMMU while it is getting discarded. We would have to walk over all
mappings when discarding memory and identify if any mapping would be a
violation. Let's keep it simple for now and print a warning, indicating
that setting RLIMIT_MEMLOCK can mitigate such attacks.
We have to take care of incoming migration: at the point the
IOMMUs get restored and start creating mappings in vfio, RamDiscardManager
implementations might not be back up and running yet: let's add runstate
priorities to enforce the order when restoring.
Acked-by: Alex Williamson <alex.williamson@redhat.com>
Reviewed-by: Alex Williamson <alex.williamson@redhat.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: "Michael S. Tsirkin" <mst@redhat.com>
Cc: Alex Williamson <alex.williamson@redhat.com>
Cc: Dr. David Alan Gilbert <dgilbert@redhat.com>
Cc: Igor Mammedov <imammedo@redhat.com>
Cc: Pankaj Gupta <pankaj.gupta.linux@gmail.com>
Cc: Peter Xu <peterx@redhat.com>
Cc: Auger Eric <eric.auger@redhat.com>
Cc: Wei Yang <richard.weiyang@linux.alibaba.com>
Cc: teawater <teawaterz@linux.alibaba.com>
Cc: Marek Kedzierski <mkedzier@redhat.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20210413095531.25603-10-david@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Let's query the maximum number of possible DMA mappings by querying the
available mappings when creating the container (before any mappings are
created). We'll use this informaton soon to perform some sanity checks
and warn the user.
Reviewed-by: Alex Williamson <alex.williamson@redhat.com>
Acked-by: Alex Williamson <alex.williamson@redhat.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: "Michael S. Tsirkin" <mst@redhat.com>
Cc: Alex Williamson <alex.williamson@redhat.com>
Cc: Dr. David Alan Gilbert <dgilbert@redhat.com>
Cc: Igor Mammedov <imammedo@redhat.com>
Cc: Pankaj Gupta <pankaj.gupta.linux@gmail.com>
Cc: Peter Xu <peterx@redhat.com>
Cc: Auger Eric <eric.auger@redhat.com>
Cc: Wei Yang <richard.weiyang@linux.alibaba.com>
Cc: teawater <teawaterz@linux.alibaba.com>
Cc: Marek Kedzierski <mkedzier@redhat.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20210413095531.25603-8-david@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Implement support for RamDiscardManager, to prepare for virtio-mem
support. Instead of mapping the whole memory section, we only map
"populated" parts and update the mapping when notified about
discarding/population of memory via the RamDiscardListener. Similarly, when
syncing the dirty bitmaps, sync only the actually mapped (populated) parts
by replaying via the notifier.
Using virtio-mem with vfio is still blocked via
ram_block_discard_disable()/ram_block_discard_require() after this patch.
Reviewed-by: Alex Williamson <alex.williamson@redhat.com>
Acked-by: Alex Williamson <alex.williamson@redhat.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: "Michael S. Tsirkin" <mst@redhat.com>
Cc: Alex Williamson <alex.williamson@redhat.com>
Cc: Dr. David Alan Gilbert <dgilbert@redhat.com>
Cc: Igor Mammedov <imammedo@redhat.com>
Cc: Pankaj Gupta <pankaj.gupta.linux@gmail.com>
Cc: Peter Xu <peterx@redhat.com>
Cc: Auger Eric <eric.auger@redhat.com>
Cc: Wei Yang <richard.weiyang@linux.alibaba.com>
Cc: teawater <teawaterz@linux.alibaba.com>
Cc: Marek Kedzierski <mkedzier@redhat.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20210413095531.25603-7-david@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Let's properly notify when (un)plugging blocks, after discarding memory
and before allowing the guest to consume memory. Handle errors from
notifiers gracefully (e.g., no remaining VFIO mappings) when plugging,
rolling back the change and telling the guest that the VM is busy.
One special case to take care of is replaying all notifications after
restoring the vmstate. The device starts out with all memory discarded,
so after loading the vmstate, we have to notify about all plugged
blocks.
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: "Michael S. Tsirkin" <mst@redhat.com>
Cc: Alex Williamson <alex.williamson@redhat.com>
Cc: Dr. David Alan Gilbert <dgilbert@redhat.com>
Cc: Igor Mammedov <imammedo@redhat.com>
Cc: Pankaj Gupta <pankaj.gupta.linux@gmail.com>
Cc: Peter Xu <peterx@redhat.com>
Cc: Auger Eric <eric.auger@redhat.com>
Cc: Wei Yang <richard.weiyang@linux.alibaba.com>
Cc: teawater <teawaterz@linux.alibaba.com>
Cc: Marek Kedzierski <mkedzier@redhat.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20210413095531.25603-6-david@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
In case one wants to create a permanent copy of a MemoryRegionSections,
one needs access to flatview_ref()/flatview_unref(). Instead of exposing
these, let's just add helpers to copy/free a MemoryRegionSection and
properly adjust references.
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: "Michael S. Tsirkin" <mst@redhat.com>
Cc: Alex Williamson <alex.williamson@redhat.com>
Cc: Dr. David Alan Gilbert <dgilbert@redhat.com>
Cc: Igor Mammedov <imammedo@redhat.com>
Cc: Pankaj Gupta <pankaj.gupta.linux@gmail.com>
Cc: Peter Xu <peterx@redhat.com>
Cc: Auger Eric <eric.auger@redhat.com>
Cc: Wei Yang <richard.weiyang@linux.alibaba.com>
Cc: teawater <teawaterz@linux.alibaba.com>
Cc: Marek Kedzierski <mkedzier@redhat.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20210413095531.25603-3-david@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
We have some special RAM memory regions (managed by virtio-mem), whereby
the guest agreed to only use selected memory ranges. "unused" parts are
discarded so they won't consume memory - to logically unplug these memory
ranges. Before the VM is allowed to use such logically unplugged memory
again, coordination with the hypervisor is required.
This results in "sparse" mmaps/RAMBlocks/memory regions, whereby only
coordinated parts are valid to be used/accessed by the VM.
In most cases, we don't care about that - e.g., in KVM, we simply have a
single KVM memory slot. However, in case of vfio, registering the
whole region with the kernel results in all pages getting pinned, and
therefore an unexpected high memory consumption - discarding of RAM in
that context is broken.
Let's introduce a way to coordinate discarding/populating memory within a
RAM memory region with such special consumers of RAM memory regions: they
can register as listeners and get updates on memory getting discarded and
populated. Using this machinery, vfio will be able to map only the
currently populated parts, resulting in discarded parts not getting pinned
and not consuming memory.
A RamDiscardManager has to be set for a memory region before it is getting
mapped, and cannot change while the memory region is mapped.
Note: At some point, we might want to let RAMBlock users (esp. vfio used
for nvme://) consume this interface as well. We'll need RAMBlock notifier
calls when a RAMBlock is getting mapped/unmapped (via the corresponding
memory region), so we can properly register a listener there as well.
Reviewed-by: Pankaj Gupta <pankaj.gupta@cloud.ionos.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: "Michael S. Tsirkin" <mst@redhat.com>
Cc: Alex Williamson <alex.williamson@redhat.com>
Cc: Dr. David Alan Gilbert <dgilbert@redhat.com>
Cc: Igor Mammedov <imammedo@redhat.com>
Cc: Pankaj Gupta <pankaj.gupta.linux@gmail.com>
Cc: Peter Xu <peterx@redhat.com>
Cc: Auger Eric <eric.auger@redhat.com>
Cc: Wei Yang <richard.weiyang@linux.alibaba.com>
Cc: teawater <teawaterz@linux.alibaba.com>
Cc: Marek Kedzierski <mkedzier@redhat.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20210413095531.25603-2-david@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
QEMU has support for SMBus devices, and PMBus is a more specific
implementation of SMBus. The additions made in this commit makes it easier to
add new PMBus devices to QEMU.
https://pmbus.org/specification-archives/
Reviewed-by: Joel Stanley <joel@jms.id.au>
Reviewed-by: Hao Wu <wuhaotsh@google.com>
Signed-off-by: Titus Rwantare <titusr@google.com>
Message-Id: <20210708172556.1868139-2-titusr@google.com>
Signed-off-by: Corey Minyard <cminyard@mvista.com>
To ease reviewing code using the I2C bus API, introduce the
i2c_start_recv() and i2c_start_send() helpers which don't
take the confusing 'is_recv' boolean argument.
Use these new helpers in the SMBus / AUX bus models.
Suggested-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Corey Minyard <cminyard@mvista.com>
Make the argument representing the direction of the transfer a
boolean type.
Rename the boolean argument as 'is_recv' to match i2c_recv_send().
Document the function prototype.
Signed-off-by: BALATON Zoltan <balaton@eik.bme.hu>
Message-Id: <20200621145235.9E241745712@zero.eik.bme.hu>
[PMD: Split patch, added docstring]
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Acked-by: Corey Minyard <cminyard@mvista.com>
Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Corey Minyard <cminyard@mvista.com>
Other functions from I2C slave API are named "i2c_slave_XXX()".
Follow that pattern with set_address(). Add docstring along.
No logical change.
Patch created mechanically using:
$ sed -i s/i2c_set_slave_address/i2c_slave_set_address/ \
$(git grep -l i2c_set_slave_address)
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Acked-by: Corey Minyard <cminyard@mvista.com>
Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Corey Minyard <cminyard@mvista.com>
We replaced all the i2c_send_recv() calls by the clearer i2c_recv()
and i2c_send(), so we can remove this confusing API.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Acked-by: Corey Minyard <cminyard@mvista.com>
Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Corey Minyard <cminyard@mvista.com>
Define TYPE_LM8323 in the public "hw/input/lm832x.h"
header and use it in hw/arm/nseries.c.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Acked-by: Corey Minyard <cminyard@mvista.com>
Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Corey Minyard <cminyard@mvista.com>
lm832x_key_event() is specific go LM832x devices, not to the
I2C bus API. Move it out of "i2c.h" to a new header.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Acked-by: Corey Minyard <cminyard@mvista.com>
Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Corey Minyard <cminyard@mvista.com>
move everything related to translate, as well as HELPER code in tcg/
mmu_helper.c stays put for now, as it contains both TCG and KVM code.
After the reshuffling, update MAINTAINERS accordingly.
Make use of the new directory:
target/s390x/tcg/
Signed-off-by: Claudio Fontana <cfontana@suse.de>
Signed-off-by: Cho, Yu-Chen <acho@suse.com>
Acked-by: David Hildenbrand <david@redhat.com>
Acked-by: Cornelia Huck <cohuck@redhat.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Message-Id: <20210707105324.23400-8-acho@suse.com>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
backend_defaults property allow users to control if default block
properties should be decided with backend information.
If it is off, any backend information will be discarded, which is
suitable if you plan to perform live migration to a different disk backend.
If it is on, a block device may utilize backend information more
aggressively.
By default, it is auto, which uses backend information for block
sizes and ignores the others, which is consistent with the older
versions.
Signed-off-by: Akihiko Odaki <akihiko.odaki@gmail.com>
Message-id: 20210705130458.97642-2-akihiko.odaki@gmail.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
-M was the sole user of qemu_opts_set and qemu_opts_set_defaults,
remove them and the arguments that they used.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Make -smp syntactic sugar for a compound property "-machine
smp.{cores,threads,cpu,...}". machine_smp_parse is replaced by the
setter for the property.
numa-test will now cover the new syntax, while other tests
still use -smp.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Allow parsing multiple keyval sequences into the same dictionary.
This will be used to simplify the parsing of the -M command line
option, which is currently a .merge_lists = true QemuOpts group.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
This patch introduces a function that merges two keyval-produced
(or keyval-like) QDicts. It can be used to emulate the behavior of
.merge_lists = true QemuOpts groups, merging -readconfig sections and
command-line options in a single QDict, and also to implement -set.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Machines and accelerators are not user-creatable but they are going
to share similar command-line parsing machinery. Export functions
that will be used with -machine and -accel in softmmu/vl.c.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
It can be difficult to debug issues with BHs in production environments.
Although BHs can usually be identified by looking up their ->cb()
function pointer, this requires debug information for the program. It is
also not possible to print human-readable diagnostics about BHs because
they have no identifier.
This patch adds a name to each BH. The name is not unique per instance
but differentiates between cb() functions, which is usually enough. It's
done by changing aio_bh_new() and friends to macros that stringify cb.
The next patch will use the name field when reporting leaked BHs.
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Message-Id: <20210414200247.917496-2-stefanha@redhat.com>
- Extract nanoMIPS, microMIPS, Code Compaction from translate.c
- Allow PCI config accesses smaller than 32-bit on Bonito64 device
- Fix migration of g364fb device on Jazz Magnum
- Fix dp8393x PROM checksum on Jazz Magnum and Quadra 800
- Map the UART devices unconditionally on Jazz Magnum
- Add functional test booting Linux on the Fuloong 2E
-----BEGIN PGP SIGNATURE-----
iQIzBAABCAAdFiEE+qvnXhKRciHc/Wuy4+MsLN6twN4FAmDfMnMACgkQ4+MsLN6t
wN71nhAArpyoJ5mTkt54wAxZwxvqjWAyesvogV4pLIvOyNJmQcExY/Ly8Qb5dbDg
2PEhCpDU7GlT7oCfgh7O5KrEjnqVfmHQzzbvQ0Ygq9kL5hdjaSSlHB/yeirU7CR1
cMQXfj9kvRVa5Oayt3L+kiKgTA0f1vbGmnveiFxJKJupyVDtursstD3nSZCThSVL
FWdFJbqLzvTY3cQqLBVq7jEnN7LzSYeYnq8Tvri0nuwoBwLJY382IljqsQZrGHzr
Bbya3KFInMrQK5VAM0pOkfvPYXZmtJ8i8VuR6S+IdICiZ+61sknKRUq5z09/4NXA
HaxarWO/fv07qd7q6Z2+i5Q6fiDrV4p2qfHeddM4Xwqvu8O98EPhaBE3veLOiNgr
VxgkJFslI1gstje31qqvNjFxB+cOIBYjWTlIVu1xOuOKGWMjMT+9XcVLseweA2rT
H/nTKnWTAiJ/mxT4KIv59SS0ZQa4QJ3CjYr26AcQ9YrJ9vCMay8rLPEn0iVlhB2H
ZyW4Be84ynEvuAvvuWt1AXvFjT41Zqj4Px1M6Pa15e9eV6guiW1KV13Thb45gJqt
LTQtoME83r3gUhQOcoZaowYh6ffCbjtW3eAcTZP3Zu7fOeBjSye+CvS9NJMtK3Vj
0/YjtqGPUvjNy4sR9VqkRRPMZpHftMTRILxaFm8Y5tlThkAydw0=
=rR+B
-----END PGP SIGNATURE-----
Merge remote-tracking branch 'remotes/philmd/tags/mips-20210702' into staging
MIPS patches queue
- Extract nanoMIPS, microMIPS, Code Compaction from translate.c
- Allow PCI config accesses smaller than 32-bit on Bonito64 device
- Fix migration of g364fb device on Jazz Magnum
- Fix dp8393x PROM checksum on Jazz Magnum and Quadra 800
- Map the UART devices unconditionally on Jazz Magnum
- Add functional test booting Linux on the Fuloong 2E
# gpg: Signature made Fri 02 Jul 2021 16:36:19 BST
# gpg: using RSA key FAABE75E12917221DCFD6BB2E3E32C2CDEADC0DE
# gpg: Good signature from "Philippe Mathieu-Daudé (F4BUG) <f4bug@amsat.org>" [full]
# Primary key fingerprint: FAAB E75E 1291 7221 DCFD 6BB2 E3E3 2C2C DEAD C0DE
* remotes/philmd/tags/mips-20210702:
hw/mips/jazz: Map the UART devices unconditionally
hw/mips/jazz: specify correct endian for dp8393x device
hw/m68k/q800: fix PROM checksum and MAC address storage
qemu/bitops.h: add bitrev8 implementation
dp8393x: remove onboard PROM containing MAC address and checksum
hw/m68k/q800: move PROM and checksum calculation from dp8393x device to board
hw/mips/jazz: move PROM and checksum calculation from dp8393x device to board
dp8393x: convert to trace-events
dp8393x: checkpatch fixes
g364fb: add VMStateDescription for G364SysBusState
g364fb: use RAM memory region for framebuffer
tests/acceptance: Test Linux on the Fuloong 2E machine
hw/pci-host/bonito: Allow PCI config accesses smaller than 32-bit
hw/pci-host/bonito: Trace PCI config accesses smaller than 32-bit
target/mips: Extract nanoMIPS ISA translation routines
target/mips: Extract the microMIPS ISA translation routines
target/mips: Extract Code Compaction ASE translation routines
target/mips: Add declarations for generic TCG helpers
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
This will be required for an upcoming checksum calculation.
Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Tested-by: Finn Thain <fthain@linux-m68k.org>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Message-Id: <20210625065401.30170-7-mark.cave-ayland@ilande.co.uk>
Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
This is just enough to make reboot and poweroff work. Works for
linux, u-boot, and the arm trusted firmware. Not tested, but should
work for plan9, and bare-metal/hobby OSes, since they seem to generally
do what linux does for reset.
The watchdog timer functionality is not yet implemented.
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/64
Signed-off-by: Nolan Leake <nolan@sigbus.net>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Tested-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Message-id: 20210625210209.1870217-1-nolan@sigbus.net
[PMM: tweaked commit title; fixed region size to 0x200;
moved header file to include/]
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
* namespace eui64 support (Heinrich)
* aiocb refactoring (Klaus)
* controller parameter for auto zone transitioning (Niklas)
* misc fixes and additions (Gollu, Klaus, Keith)
-----BEGIN PGP SIGNATURE-----
iQEzBAABCAAdFiEEUigzqnXi3OaiR2bATeGvMW1PDekFAmDbap8ACgkQTeGvMW1P
Denz9wf9GtU4z7G0pQpFlv1smWPvQfv9qlkGiRvGcIQjEgPjcKjEsJC1CqtG6nZ5
BGYzey706iM7NAs6wPkyUyHgWXl+N/Ku5nNBvNJuzTN5J9aHgnqehe8civvS6ONU
0WtAQEEZCLu0Qbvw6KawCShXrJ6Jpmu+bYGggYEy5+baumLF0Qj50l2wRiDbImNu
rE3iZuhEenwnNNSTTkyt2RT8lSzizAtth4bAqzl3Zw1Q3NO+qDCL0wPOWD7YuD8p
LTZ8/ojB1+YUI3JrMM4MlzbJyGstdk2Y5zq6QpAcf86agaaFEmS0iqjOic52ImpL
VzmZZZ0W7kmhJVc5H/Q5y/KAFIHnFA==
=efjI
-----END PGP SIGNATURE-----
Merge remote-tracking branch 'remotes/nvme/tags/nvme-next-pull-request' into staging
hw/nvme patches
* namespace eui64 support (Heinrich)
* aiocb refactoring (Klaus)
* controller parameter for auto zone transitioning (Niklas)
* misc fixes and additions (Gollu, Klaus, Keith)
# gpg: Signature made Tue 29 Jun 2021 19:46:55 BST
# gpg: using RSA key 522833AA75E2DCE6A24766C04DE1AF316D4F0DE9
# gpg: Good signature from "Klaus Jensen <its@irrelevant.dk>" [unknown]
# gpg: aka "Klaus Jensen <k.jensen@samsung.com>" [unknown]
# gpg: WARNING: This key is not certified with a trusted signature!
# gpg: There is no indication that the signature belongs to the owner.
# Primary key fingerprint: DDCA 4D9C 9EF9 31CC 3468 4272 63D5 6FC5 E55D A838
# Subkey fingerprint: 5228 33AA 75E2 DCE6 A247 66C0 4DE1 AF31 6D4F 0DE9
* remotes/nvme/tags/nvme-next-pull-request: (23 commits)
hw/nvme: add 'zoned.zasl' to documentation
hw/nvme: fix pin-based interrupt behavior (again)
hw/nvme: fix missing check for PMR capability
hw/nvme: documentation fix
hw/nvme: fix endianess conversion and add controller list
Partially revert "hw/block/nvme: drain namespaces on sq deletion"
hw/nvme: reimplement format nvm to allow cancellation
hw/nvme: reimplement zone reset to allow cancellation
hw/nvme: reimplement the copy command to allow aio cancellation
hw/nvme: add dw0/1 to the req completion trace event
hw/nvme: use prinfo directly in nvme_check_prinfo and nvme_dif_check
hw/nvme: remove assert from nvme_get_zone_by_slba
hw/nvme: save reftag when generating pi
hw/nvme: reimplement dsm to allow cancellation
hw/nvme: add nvme_block_status_all helper
hw/nvme: reimplement flush to allow cancellation
hw/nvme: default for namespace EUI-64
hw/nvme: namespace parameter for EUI-64
hw/nvme: fix csi field for cns 0x00 and 0x11
hw/nvme: add param to control auto zone transitioning to zone state closed
...
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Instead of just returning 0/-1 and letting the caller make up a
meaningless error message, add an Error parameter to allow reporting the
real error and switch to 0/-errno so that different kind of errors can
be distinguished in the caller.
config_len in vhost_user_get_config() is defined by the device, so if
it's larger than VHOST_USER_MAX_CONFIG_SIZE, this is a programming
error. Turn the corresponding check into an assertion.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Message-Id: <20210609154658.350308-6-kwolf@redhat.com>
Reviewed-by: Stefano Garzarella <sgarzare@redhat.com>
Reviewed-by: Raphael Norwitz <raphael.norwitz@nutanix.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Instead of just returning 0/-1 and letting the caller make up a
meaningless error message, add an Error parameter to allow reporting the
real error and switch to 0/-errno so that different kind of errors can
be distinguished in the caller.
Specifically, in vhost-user, EPROTO is used for all errors that relate
to the connection itself, whereas other error codes are used for errors
relating to the content of the connection. This will allow us later to
automatically reconnect when the connection goes away, without ending up
in an endless loop if it's a permanent error in the configuration.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Message-Id: <20210609154658.350308-3-kwolf@redhat.com>
Reviewed-by: Stefano Garzarella <sgarzare@redhat.com>
Reviewed-by: Raphael Norwitz <raphael.norwitz@nutanix.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
This allows callers to return better error messages instead of making
one up while the real error ends up on stderr. Most callers can
immediately make use of this because they already have an Error
parameter themselves. The others just keep printing the error with
error_report_err().
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Message-Id: <20210609154658.350308-2-kwolf@redhat.com>
Reviewed-by: Stefano Garzarella <sgarzare@redhat.com>
Reviewed-by: Raphael Norwitz <raphael.norwitz@nutanix.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Code consuming the "crypto/tlscreds*.h" APIs doesn't need
to access its internals. Move the structure definitions to
the "tlscredspriv.h" private header (only accessible by
implementations). The public headers (in include/) still
forward-declare the structures typedef.
Note, tlscreds.c and 3 of the 5 modified source files already
include "tlscredspriv.h", so only add it to tls-cipher-suites.c
and tlssession.c.
Removing the internals from the public header solves a bug
introduced by commit 7de2e85653 ("yank: Unregister function
when using TLS migration") which made migration/qemu-file-channel.c
include "io/channel-tls.h", itself sometime depends on GNUTLS,
leading to a build failure on OSX:
[2/35] Compiling C object libmigration.fa.p/migration_qemu-file-channel.c.o
FAILED: libmigration.fa.p/migration_qemu-file-channel.c.o
cc -Ilibmigration.fa.p -I. -I.. -Iqapi [ ... ] -o libmigration.fa.p/migration_qemu-file-channel.c.o -c ../migration/qemu-file-channel.c
In file included from ../migration/qemu-file-channel.c:29:
In file included from include/io/channel-tls.h:26:
In file included from include/crypto/tlssession.h:24:
include/crypto/tlscreds.h:28:10: fatal error: 'gnutls/gnutls.h' file not found
#include <gnutls/gnutls.h>
^~~~~~~~~~~~~~~~~
1 error generated.
Reported-by: Stefan Weil <sw@weilnetz.de>
Suggested-by: Daniel P. Berrangé <berrange@redhat.com>
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/407
Fixes: 7de2e85653 ("yank: Unregister function when using TLS migration")
Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
Introduce the qcrypto_tls_creds_check_endpoint() helper
to access QCryptoTLSCreds internal 'endpoint' field.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
Implement the new semantics in the fallback expansion.
Change all callers to supply the flags that keep the
semantics unchanged locally.
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
This will eventually simplify front-end usage, and will allow
backends to unset TCG_TARGET_HAS_MEMORY_BSWAP without loss of
optimization.
The argument is added during expansion, not currently exposed to the
front end translators. The backends currently only support a flags
value of either TCG_BSWAP_IZ, or (TCG_BSWAP_IZ | TCG_BSWAP_OZ),
since they all require zero top bytes and leave them that way.
At the existing call sites we pass in (TCG_BSWAP_IZ | TCG_BSWAP_OZ),
except for the flags-ignored cases of a 32-bit swap of a 32-bit
value and or a 64-bit swap of a 64-bit value, where we pass 0.
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: LIU Zhiwei <zhiwei_liu@c-sky.com>
Message-Id: <20210624105023.3852-6-zhiwei_liu@c-sky.com>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Introduce a convenient macro, that works for qemu_memalign() like
g_autofree works with g_malloc.
Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Message-Id: <20210628121133.193984-2-vsementsov@virtuozzo.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
When the x-blockdev-reopen was added it allowed reconfiguring the
graph by replacing backing files, but changing the 'file' option was
forbidden. Because of this restriction some operations are not
possible, notably inserting and removing block filters.
This patch adds support for replacing the 'file' option. This is
similar to replacing the backing file and the user is likewise
responsible for the correctness of the resulting graph, otherwise this
can lead to data corruption.
Signed-off-by: Alberto Garcia <berto@igalia.com>
[vsementsov: bdrv_reopen_parse_file_or_backing() is modified a lot]
Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Message-Id: <20210610120537.196183-9-vsementsov@virtuozzo.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
It's used only in bdrv_reopen_commit(). "backing" is covered by the
loop through all children except for case when we removed backing child
during reopen.
Make it more obvious and drop extra boolean field: qdict_del will not
fail if there is no such entry.
Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Message-Id: <20210610120537.196183-8-vsementsov@virtuozzo.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Add the controller identifiers list CNS 0x13, available list of ctrls
in NVM Subsystem that may or may not be attached to namespaces.
In Identify Ctrl List of the CNS 0x12 and 0x13 no endian conversion
for the nsid field.
These two CNS values shows affect when there exists a Subsystem.
Added condition if there is no Subsystem return invalid field in
command.
Signed-off-by: Gollu Appalanaidu <anaidu.gollu@samsung.com>
Reviewed-by: Klaus Jensen <k.jensen@samsung.com>
Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
The nvme_check_prinfo() and nvme_dif_check() functions operate on the
16 bit "control" member of the NvmeCmd. These functions do not otherwise
operate on an NvmeCmd or an NvmeRequest, so change them to expect the
actual 4 bit PRINFO field and add constants that work on this field as
well.
Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
Reviewed-by: Keith Busch <kbusch@kernel.org>
Add enums for the Identify Namespace FLBAS and MC fields.
Signed-off-by: Gollu Appalanaidu <anaidu.gollu@samsung.com>
[k.jensen: squashed separate flbas/mc commits into one]
Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
-----BEGIN PGP SIGNATURE-----
iQIzBAABCAAdFiEEi5wmzbL9FHyIDoahVh8kwfGfefsFAmDVzrgACgkQVh8kwfGf
efuoKRAAsHqE46P2xwjjPROtwZC6HP/Ny5xfbF2CglfC4eX4ff89dwWSagjHfBig
1TQjHemOo5Fs8gyBGec0WPn0HaBVFthq75VGObt+QkHy7+owGDmtVghkXnEO8z2c
gldPoVeOXAbU4DZXgITQYfN1ljCOdrdKQMrMYmKqXLmw/vMzAfKlBKqsOFQiyVys
4egn25QxyiOh+T29zyVwmGVABaH2TIuJqkDr+iMh4IypYZf0BlqJRw++kLhGdxGq
RIpXQiXExy3lRm8htuh+GDAigSXNz93XKU3ZHe8RPBtPfdS0siGWwVTW2nLsH+Rc
vfUvfkqBSGcFaFjGHg/eNGvoMTYAZKHx8yq72voqrfFIPtz3NAoS2ahz/hIU/NiL
YLOmOcRZVx+xJ5lxjaxi/SvbTHVtZoBym/Aje/YiT3v4A8rziG6BeBUP9ec/bk+D
YYxMZNfzW2jVq0Vl4TZyYmV8e/H8Ha3HbLLJip3tiLXsBIjajfNti9iJ/xac5NzD
jV5pR27yIXilYHPR7GCYaMRp5LJv8uGiW704yAt2dwBizqonm7cgTg5Fcx0HFDyk
+HyPwu/TGI3cEF7dl+8V+AmwKug+jFj3VFVh5UMtf4bqNeliYNx+QpCUuWPuPMFc
U1nXWYBtcklD4JNPERUgUW7x2tmAJN5dGDY0LAa2brrOnKVTTwU=
=JXBO
-----END PGP SIGNATURE-----
Merge remote-tracking branch 'remotes/vsementsov/tags/pull-jobs-2021-06-25' into staging
block: Make block-copy API thread-safe
# gpg: Signature made Fri 25 Jun 2021 13:40:24 BST
# gpg: using RSA key 8B9C26CDB2FD147C880E86A1561F24C1F19F79FB
# gpg: Good signature from "Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>" [unknown]
# gpg: WARNING: This key is not certified with a trusted signature!
# gpg: There is no indication that the signature belongs to the owner.
# Primary key fingerprint: 8B9C 26CD B2FD 147C 880E 86A1 561F 24C1 F19F 79FB
* remotes/vsementsov/tags/pull-jobs-2021-06-25:
block-copy: atomic .cancelled and .finished fields in BlockCopyCallState
block-copy: add CoMutex lock
block-copy: move progress_set_remaining in block_copy_task_end
block-copy: streamline choice of copy_range vs. read/write
block-copy: small refactor in block_copy_task_entry and block_copy_common
co-shared-resource: protect with a mutex
progressmeter: protect with a mutex
blockjob: let ratelimit handle a speed of 0
block-copy: let ratelimit handle a speed of 0
ratelimit: treat zero speed as unlimited
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
- Fix MISA in the DisasContext
- Fix GDB CSR XML generation
- QOMify the SiFive UART
- Add support for the OpenTitan timer
-----BEGIN PGP SIGNATURE-----
iQEzBAABCAAdFiEE9sSsRtSTSGjTuM6PIeENKd+XcFQFAmDUc9oACgkQIeENKd+X
cFQnJQf/YJ1DcCc5HKnJD7dKOO7auWGrjcBydVLZpCKT/sBYO2m4+LcUoCkndJst
z2awR2sL6zgTqkpKTFJzENBKcXf0NOAvGvuvAznPQosvW26NhY20EsWHgRxn79DF
2CvFChD4J/aBZa/JwP7232CebsD2IqKn89gP5u6ldFNH36EGpzBRjFOroXLu98x3
arhr7AoyhTTpxcWkWuLW9YVwqZQ8xKKCVTMuqMC8SRI48FUB5+ndy3pTQqIjdoCg
U0wfJIrmPBakw3ik0nbNd47Lu/yxCQMU/O4M/flSbbC1GpomiUotlap9O3LlvNYo
7VeF8eS3/7Okn2/5jEwuFES+MmtUSQ==
=zVjG
-----END PGP SIGNATURE-----
Merge remote-tracking branch 'remotes/alistair/tags/pull-riscv-to-apply-20210624-2' into staging
Third RISC-V PR for 6.1 release
- Fix MISA in the DisasContext
- Fix GDB CSR XML generation
- QOMify the SiFive UART
- Add support for the OpenTitan timer
# gpg: Signature made Thu 24 Jun 2021 13:00:26 BST
# gpg: using RSA key F6C4AC46D4934868D3B8CE8F21E10D29DF977054
# gpg: Good signature from "Alistair Francis <alistair@alistair23.me>" [full]
# Primary key fingerprint: F6C4 AC46 D493 4868 D3B8 CE8F 21E1 0D29 DF97 7054
* remotes/alistair/tags/pull-riscv-to-apply-20210624-2:
hw/riscv: OpenTitan: Connect the mtime and mtimecmp timer
hw/timer: Initial commit of Ibex Timer
hw/char/ibex_uart: Make the register layout private
hw/char: QOMify sifive_uart
hw/char: Consistent function names for sifive_uart
target/riscv: gdbstub: Fix dynamic CSR XML generation
target/riscv: Use target_ulong for the DisasContext misa
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
As part of converting -smp to a property with a QAPI type, define
the struct and use it to do the actual parsing. machine_smp_parse
takes care of doing the QemuOpts->QAPI conversion by hand, for now.
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Message-Id: <20210617155308.928754-10-pbonzini@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Clean up the smp_parse functions to use Error** instead of exiting.
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Message-Id: <20210617155308.928754-9-pbonzini@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
In order to make SMP configuration a Machine property, we need a getter as
well as a setter. To simplify the implementation put everything that the
getter needs in the CpuTopology struct.
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Message-Id: <20210617155308.928754-7-pbonzini@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
By adding acquire/release pairs, we ensure that .ret and .error_is_read
fields are written by block_copy_dirty_clusters before .finished is true,
and that they are read by API user after .finished is true.
The atomic here are necessary because the fields are concurrently modified
in coroutines, and read outside.
Signed-off-by: Emanuele Giuseppe Esposito <eesposit@redhat.com>
Message-Id: <20210624072043.180494-6-eesposit@redhat.com>
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
co-shared-resource is currently not thread-safe, as also reported
in co-shared-resource.h. Add a QemuMutex because co_try_get_from_shres
can also be invoked from non-coroutine context.
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Signed-off-by: Emanuele Giuseppe Esposito <eesposit@redhat.com>
Message-Id: <20210614081130.22134-6-eesposit@redhat.com>
Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Progressmeter is protected by the AioContext mutex, which
is taken by the block jobs and their caller (like blockdev).
We would like to remove the dependency of block layer code on the
AioContext mutex, since most drivers and the core I/O code are already
not relying on it.
Create a new C file to implement the ProgressMeter API, but keep the
struct as public, to avoid forcing allocation on the heap.
Also add a mutex to be able to provide an accurate snapshot of the
progress values to the caller.
Signed-off-by: Emanuele Giuseppe Esposito <eesposit@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-Id: <20210614081130.22134-5-eesposit@redhat.com>
Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Both users of RateLimit, block-copy.c and blockjob.c, treat
a speed of zero as unlimited, while RateLimit treats it as
"as slow as possible". The latter is nicer from the code
point of view but pretty useless, so disable rate limiting
if a speed of zero is provided.
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Emanuele Giuseppe Esposito <eesposit@redhat.com>
Message-Id: <20210614081130.22134-2-eesposit@redhat.com>
Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
- tweak tcg/kvm based GIC tests
- add header to MTTCG docs
- cleanup checkpatch handling
- GitLab feature and bug request templates
- symbol resolution helper for plugin API
- skip hppa/s390x signals test until fixes arrive
-----BEGIN PGP SIGNATURE-----
iQEzBAABCgAdFiEEZoWumedRZ7yvyN81+9DbCVqeKkQFAmDVnaEACgkQ+9DbCVqe
KkTf1QgAgJmf4NdDriGqYfRf0xQRlgQvvVUlbpMcIRsDhCrOxX2HGDNehxbFaluB
mUCQuZrqBAQAAUfr8UbCmRrlV+1Ba4M4kUN5WFDSiIWawsZjp9qxXWiepaBv4jgO
zdosUtAmOJSb6xdtIu51GexrdQu28os6dsvnfzMTgjakRqF360gnvfvIpxe2eh7V
2n/LqOIpRAYlEyVxa2sjOIIxxZBsmix//cMuHokaTp1UenyFZhkanMd8IZvRNfsu
4tTWv3LmyMPsyIYyLsrPHPv+nzJLE7KHMKS1oL27yHDb7QWdcLIBYzW9Zon8Ai8k
71sCQlDM/iZ1sF3vITVIcYy/fKRR2Q==
=0kZ6
-----END PGP SIGNATURE-----
Merge remote-tracking branch 'remotes/stsquad/tags/pull-testing-updates-250621-1' into staging
A few miscellaneous fixes
- tweak tcg/kvm based GIC tests
- add header to MTTCG docs
- cleanup checkpatch handling
- GitLab feature and bug request templates
- symbol resolution helper for plugin API
- skip hppa/s390x signals test until fixes arrive
# gpg: Signature made Fri 25 Jun 2021 10:10:57 BST
# gpg: using RSA key 6685AE99E75167BCAFC8DF35FBD0DB095A9E2A44
# gpg: Good signature from "Alex Bennée (Master Work Key) <alex.bennee@linaro.org>" [full]
# Primary key fingerprint: 6685 AE99 E751 67BC AFC8 DF35 FBD0 DB09 5A9E 2A44
* remotes/stsquad/tags/pull-testing-updates-250621-1:
plugins/api: expose symbol lookup to plugins
tests/tcg: skip the signals test for hppa/s390x for now
GitLab: Add "Feature Request" issue template.
GitLab: Add "Bug" issue reporting template
scripts/checkpatch: roll diff tweaking into checkpatch itself
docs/devel: Add a single top-level header to MTTCG's doc
tests/acceptance: tweak the tcg/kvm tests for virt
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
This is a quality of life helper for plugins so they don't need to
re-implement symbol lookup when dumping an address. The strings are
constant so don't need to be duplicated. One minor tweak is to return
NULL instead of a zero length string to show lookup failed.
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Mahmoud Mandour <ma.mandourr@gmail.com>
Message-Id: <20210608040532.56449-2-ma.mandourr@gmail.com>
Message-Id: <20210623102749.25686-8-alex.bennee@linaro.org>
For block host devices, I/O can happen through either the kernel file
descriptor I/O system calls (preadv/pwritev, io_submit, io_uring)
or the SCSI passthrough ioctl SG_IO.
In the latter case, the size of each transfer can be limited by the
HBA, while for file descriptor I/O the kernel is able to split and
merge I/O in smaller pieces as needed. Applying the HBA limits to
file descriptor I/O results in more system calls and suboptimal
performance, so this patch splits the max_transfer limit in two:
max_transfer remains valid and is used in general, while max_hw_transfer
is limited to the maximum hardware size. max_hw_transfer can then be
included by the scsi-generic driver in the block limits page, to ensure
that the stricter hardware limit is used.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
osdep.h provides a ROUND_UP macro to hide bitwise operations for the
purpose of rounding a number up to a power of two; add a ROUND_DOWN
macro that does the same with truncation towards zero.
While at it, change the formatting of some comments.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Connect the Ibex timer to the OpenTitan machine. The timer can trigger
the RISC-V MIE interrupt as well as a custom device interrupt.
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
Reviewed-by: Bin Meng <bmeng.cn@gmail.com>
Message-id: 5e7f4e9b4537f863bcb8db1264b840b56ef2a929.1624001156.git.alistair.francis@wdc.com
Add support for the Ibex timer. This is used with the RISC-V
mtime/mtimecmp similar to the SiFive CLINT.
We currently don't support changing the prescale or the timervalue.
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
Reviewed-by: Bin Meng <bmeng.cn@gmail.com>
Message-id: 716fdea2244515ce86a2c46fe69467d013c03147.1624001156.git.alistair.francis@wdc.com
This QOMifies the SiFive UART model. Migration and reset have been
implemented.
Signed-off-by: Lukas Jünger <lukas.juenger@greensocs.com>
Reviewed-by: Bin Meng <bmeng.cn@gmail.com>
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Message-id: 20210616092326.59639-3-lukas.juenger@greensocs.com
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
- tcg: implement the vector enhancements facility and bump the
'qemu' cpu model to a stripped-down z14 GA2
- fix psw.mask handling in signals
- fix vfio-ccw sense data handling
-----BEGIN PGP SIGNATURE-----
iQJGBAABCAAwFiEEw9DWbcNiT/aowBjO3s9rk8bwL68FAmDQYXwSHGNvaHVja0By
ZWRoYXQuY29tAAoJEN7Pa5PG8C+vrDAQALLimgzAm6br4SHP49T7ZsDkuhZwyYpP
Fg09vxjMmKWgLOIQNp7Xd1vQJ5voGc5D0KleVMuX2feycfmon0yVeIBMan6DTcfr
lygtiBYrgPWVAs36OXQ/rJUHt2ZUZaQsS57lTgn+Jtn7p+AMjiMrDlam+iqoAnU+
o5RtTFG+bhqa72WI7mCG54hRfXS1b/K8Ts1qs0oJJVDrDWlmLWfpjuJU3ehvhepA
hJYnhIRQgbFHPsaJI47s25aa6KC+cGTGGRMl4YmFPACMh1KNXqmGP1XbxyEhz5tl
LUdU9JRikbBsErcItHfZabGktLtBi7B9Vyh9KhxG9vK7ol8GUD4pomHrLNH2UUtH
MyhTcuVCcEakuhgRr8GwA+7KO2Y3quqHDC3/kCkIarrE4X+YJl9Glv76we6XvbYL
4SAPE87Ub465C3J3tUjLDtfq8LpCIUh7zCYLBfk2Yf4pIbjnWMzUBu3UD2XYrKGF
+g+J8ZVjE/WFsZzWUJL54lLuT8+FjPLgNOsth1WTENGUEK+JgWGpHbpR1XdqQFj8
f+bk1nrL94iq3KRdmCaO1w6vc+5xDG0tSY4tVJ1Nip1w3ZxmJFOUWGWLs04hz4mn
WLx3vHbq3g1HZn9dtqwh6BvAP9fdLw2xkBcRtepR7vPM0ydqp2dvBbu99MrZN3H1
3Sa6lIpr4bII
=bMQp
-----END PGP SIGNATURE-----
Merge remote-tracking branch 'remotes/cohuck-gitlab/tags/s390x-20210621' into staging
s390x update:
- tcg: implement the vector enhancements facility and bump the
'qemu' cpu model to a stripped-down z14 GA2
- fix psw.mask handling in signals
- fix vfio-ccw sense data handling
# gpg: Signature made Mon 21 Jun 2021 10:53:00 BST
# gpg: using RSA key C3D0D66DC3624FF6A8C018CEDECF6B93C6F02FAF
# gpg: issuer "cohuck@redhat.com"
# gpg: Good signature from "Cornelia Huck <conny@cornelia-huck.de>" [unknown]
# gpg: aka "Cornelia Huck <huckc@linux.vnet.ibm.com>" [full]
# gpg: aka "Cornelia Huck <cornelia.huck@de.ibm.com>" [full]
# gpg: aka "Cornelia Huck <cohuck@kernel.org>" [unknown]
# gpg: aka "Cornelia Huck <cohuck@redhat.com>" [unknown]
# Primary key fingerprint: C3D0 D66D C362 4FF6 A8C0 18CE DECF 6B93 C6F0 2FAF
* remotes/cohuck-gitlab/tags/s390x-20210621: (37 commits)
s390x/css: Add passthrough IRB
s390x/css: Refactor IRB construction
s390x/css: Split out the IRB sense data
s390x/css: Introduce an ESW struct
linux-user/s390x: Save and restore psw.mask properly
target/s390x: Use s390_cpu_{set_psw, get_psw_mask} in gdbstub
target/s390x: Improve s390_cpu_dump_state vs cc_op
target/s390x: Do not modify cpu state in s390_cpu_get_psw_mask
target/s390x: Expose load_psw and get_psw_mask to cpu.h
configure: Check whether we can compile the s390-ccw bios with -msoft-float
s390x/cpumodel: Bump up QEMU model to a stripped-down IBM z14 GA2
s390x/tcg: We support Vector enhancements facility
linux-user: elf: s390x: Prepare for Vector enhancements facility
s390x/tcg: Implement VECTOR FP (MAXIMUM|MINIMUM)
s390x/tcg: Implement VECTOR FP NEGATIVE MULTIPLY AND (ADD|SUBTRACT)
s390x/tcg: Implement 32/128 bit for VECTOR FP MULTIPLY AND (ADD|SUBTRACT)
s390x/tcg: Implement 32/128 bit for VECTOR FP TEST DATA CLASS IMMEDIATE
s390x/tcg: Implement 32/128 bit for VECTOR FP PERFORM SIGN OPERATION
s390x/tcg: Implement 128 bit for VECTOR FP LOAD ROUNDED
s390x/tcg: Implement 64 bit for VECTOR FP LOAD LENGTHENED
...
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
The Arm MVE VDUP implementation would like to be able to emit code to
duplicate a byte or halfword value into an i32. We have code to do
this already in tcg-op-gvec.c, so all we need to do is make the
functions global.
For consistency with other functions made available to the frontends:
* we rename to tcg_gen_dup_*
* we expose both the _i32 and _i64 forms
* we provide the #define for a _tl form
Suggested-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 20210617121628.20116-10-peter.maydell@linaro.org
Allow code elsewhere in the system to check whether the ACPI GHES
table is present, so it can determine whether it is OK to try to
record an error by calling acpi_ghes_record_errors().
(We don't need to migrate the new 'present' field in AcpiGhesState,
because it is set once at system initialization and doesn't change.)
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Dongjiu Geng <gengdongjiu1@gmail.com>
Message-id: 20210603171259.27962-3-peter.maydell@linaro.org
Features:
* Add ratelimit for bus locks acquired in guest (Chenyi Qiang)
Documentation:
* SEV documentation updates (Tom Lendacky)
* Add a table showing x86-64 ABI compatibility levels (Daniel P. Berrangé)
Automated changes:
* Update Linux headers to 5.13-rc4 (Eduardo Habkost)
-----BEGIN PGP SIGNATURE-----
iQJIBAABCAAyFiEEWjIv1avE09usz9GqKAeTb5hNxaYFAmDM+T4UHGVoYWJrb3N0
QHJlZGhhdC5jb20ACgkQKAeTb5hNxabUrQ/+PtiJjd1cW9nhA0kWu8dVGq3xXJb4
Nbma86tRPKBauTeQCLccXEvUjLqgFejeQlArhq4QKErLisXu4TDuQ+GeAfdR7h5P
MTMSo0C665cT2/NbrwQizSPQdrNEgZAYRaDRafZLQTJ1TStzWDB1Vg79rzpWPcn0
76XjIfSdGZUa4B1OvjNvUFq/SXf+0hW75soCwRhDNh5tfzfyct0XCSRF/wTXqyR/
7yxDtfTzUAvT+6l3qb8ky+wqUTIY58BgjbdIGhyAUr5/N8y5YystF41TUVoy772k
pmCXHniMmgmhH7HVwGujtc6mPe5y1VFJVaA08Pzb7KwSfdO9F/3Gk3DHpKW8/whi
tCGluBqz0qlyhsnP9wDRJb6BzCBl2hVqu50DL+uSNsJOSIW60LLMJV4ANlDYdDM3
s33S5NrM0DsRAjrtczPdvKPWwaVE4NB2bYX1I3yYGgflwzQYOjBmswM/UgymhlZk
5dxtF9CX2p+Vre6UoLDKum1DJDCcWjHouJAAqZzxxEko56yWgTUSzTcK4GVOlsAc
qX4gJbFpOzDlSdpDTG/fcnQlCnwc1jxCzsB8Wy2KJiBif3Sa3Wh1s00Cp7oGNQt+
P/z2Fp1agl8u83bbvlIjZnsv0O2g5Ks4r5tBhXmqI36aiU26F/x39SUfp7/7OAUd
CQBGBGXqpnOmUcE=
=YWGX
-----END PGP SIGNATURE-----
Merge remote-tracking branch 'remotes/ehabkost-gl/tags/x86-next-pull-request' into staging
x86 queue, 2021-06-18
Features:
* Add ratelimit for bus locks acquired in guest (Chenyi Qiang)
Documentation:
* SEV documentation updates (Tom Lendacky)
* Add a table showing x86-64 ABI compatibility levels (Daniel P. Berrangé)
Automated changes:
* Update Linux headers to 5.13-rc4 (Eduardo Habkost)
# gpg: Signature made Fri 18 Jun 2021 20:51:26 BST
# gpg: using RSA key 5A322FD5ABC4D3DBACCFD1AA2807936F984DC5A6
# gpg: issuer "ehabkost@redhat.com"
# gpg: Good signature from "Eduardo Habkost <ehabkost@redhat.com>" [full]
# Primary key fingerprint: 5A32 2FD5 ABC4 D3DB ACCF D1AA 2807 936F 984D C5A6
* remotes/ehabkost-gl/tags/x86-next-pull-request:
scripts: helper to generate x86_64 CPU ABI compat info
docs: add a table showing x86-64 ABI compatibility levels
docs/interop/firmware.json: Add SEV-ES support
docs: Add SEV-ES documentation to amd-memory-encryption.txt
doc: Fix some mistakes in the SEV documentation
i386: Add ratelimit for bus locks acquired in guest
Update Linux headers to 5.13-rc4
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Wire in the subchannel callback for building the IRB
ESW and ECW space for passthrough devices, and copy
the hardware's ESW into the IRB we are building.
If the hardware presented concurrent sense, then copy
that sense data into the IRB's ECW space.
Signed-off-by: Eric Farman <farman@linux.ibm.com>
Message-Id: <20210617232537.1337506-5-farman@linux.ibm.com>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
Currently, all subchannel types have "sense data" copied into
the IRB.ECW space, and a couple flags enabled in the IRB.SCSW
and IRB.ESW. But for passthrough (vfio-ccw) subchannels,
this data isn't populated in the first place, so enabling
those flags leads to unexpected behavior if the guest tries to
process the sense data (zeros) in the IRB.ECW.
Let's add a subchannel callback that builds these portions of
the IRB, and move the existing code into a routine for those
virtual subchannels. The passthrough subchannels will be able
to piggy-back onto this later.
Signed-off-by: Eric Farman <farman@linux.ibm.com>
Message-Id: <20210617232537.1337506-4-farman@linux.ibm.com>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
The Interrupt Response Block is comprised of several other
structures concatenated together, but only the 12-byte
Subchannel-Status Word (SCSW) is defined as a proper struct.
Everything else is a simple array of 32-bit words.
Let's define a proper struct for the 20-byte Extended-Status
Word (ESW) so that we can make good decisions about the sense
data that would go into the ECW area for virtual vs
passthrough devices.
[CH: adapted ESW definition to build with mingw, as discussed]
Signed-off-by: Eric Farman <farman@linux.ibm.com>
Message-Id: <20210617232537.1337506-2-farman@linux.ibm.com>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
Let's check for S390_FEAT_VECTOR_ENH and set HWCAP_S390_VXRS_EXT
accordingly. Add all missing HWCAP defined in upstream Linux.
Cc: Laurent Vivier <laurent@vivier.eu>
Acked-by: Laurent Vivier <laurent@vivier.eu>
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20210608092337.12221-25-david@redhat.com>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
Leading underscores followed by a capital letter or underscore are
reserved by the C standard.
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/369
Signed-off-by: Ahmed Abouzied <email@aabouzied.com>
Message-Id: <20210605174938.13782-1-email@aabouzied.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
This commit moves into a separate file routines used to manipulate
TCGCond. These will be employed by the idef-parser.
Signed-off-by: Alessandro Di Federico <ale@rev.ng>
Signed-off-by: Paolo Montesel <babush@rev.ng>
Message-Id: <20210619093713.1845446-2-ale.qemu@rev.ng>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
This removes all of the problems with unaligned accesses
to the bytecode stream.
With an 8-bit opcode at the bottom, we have 24 bits remaining,
which are generally split into 6 4-bit slots. This fits well
with the maximum length opcodes, e.g. INDEX_op_add2_i32, which
have 6 register operands.
We have, in previous patches, rearranged things such that there
are no operations with a label which have more than one other
operand. Which leaves us with a 20-bit field in which to encode
a label, giving us a maximum TB size of 512k -- easily large.
Change the INDEX_op_tci_movi_{i32,i64} opcodes to tci_mov[il].
The former puts the immediate in the upper 20 bits of the insn,
like we do for the label displacement. The later uses a label
to reference an entry in the constant pool. Thus, in the worst
case we still have a single memory reference for any constant,
but now the constants are out-of-line of the bytecode and can
be shared between different moves saving space.
Change INDEX_op_call to use a label to reference a pair of
pointers in the constant pool. This removes the only slightly
dodgy link with the layout of struct TCGHelperInfo.
The re-encode cannot be done in pieces.
Tested-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
This requires adjusting where arguments are stored.
Place them on the stack at left-aligned positions.
Adjust the stack frame to be at entirely positive offsets.
Tested-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
As noted by qemu-plugins.h, enum qemu_plugin_cb_flags is
currently unused -- plugins can neither read nor write
guest registers.
Tested-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
We will shortly be interested in distinguishing pointers
from integers in the helper's declaration, as well as a
true void return. We currently have two parallel 1 bit
fields; merge them and expand to a 3 bit field.
Our current maximum is 7 helper arguments, plus the return
makes 8 * 3 = 24 bits used within the uint32_t typemask.
Tested-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
We'll need a possibility of non-blocking nbd_co_establish_connection(),
so that it returns immediately, and it returns success only if a
connections was previously established in background.
Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Message-Id: <20210610100802.5888-30-vsementsov@virtuozzo.com>
Signed-off-by: Eric Blake <eblake@redhat.com>
block/nbd doesn't need underlying sioc channel anymore. So, we can
update nbd/client-connection interface to return only one top-most io
channel, which is more straight forward.
Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Message-Id: <20210610100802.5888-27-vsementsov@virtuozzo.com>
[eblake: squash in Vladimir's fixes for uninit usage caught by clang]
Signed-off-by: Eric Blake <eblake@redhat.com>
Add an option for a thread to retry connecting until it succeeds. We'll
use nbd/client-connection both for reconnect and for initial connection
in nbd_open(), so we need a possibility to use same NBDClientConnection
instance to connect once in nbd_open() and then use retry semantics for
reconnect.
Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Message-Id: <20210610100802.5888-21-vsementsov@virtuozzo.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
[eblake: grammar tweak]
Signed-off-by: Eric Blake <eblake@redhat.com>
Add arguments and logic to support nbd negotiation in the same thread
after successful connection.
Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Message-Id: <20210610100802.5888-20-vsementsov@virtuozzo.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Eric Blake <eblake@redhat.com>
We now have bs-independent connection API, which consists of four
functions:
nbd_client_connection_new()
nbd_client_connection_release()
nbd_co_establish_connection()
nbd_co_establish_connection_cancel()
Move them to a separate file together with NBDClientConnection
structure which becomes private to the new API.
Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Message-Id: <20210610100802.5888-18-vsementsov@virtuozzo.com>
[eblake: comment tweaks]
Signed-off-by: Eric Blake <eblake@redhat.com>
Add function that transforms named fd inside SocketAddress structure
into number representation. This way it may be then used in a context
where current monitor is not available.
Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Message-Id: <20210610100802.5888-6-vsementsov@virtuozzo.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
[eblake: comment tweak]
Signed-off-by: Eric Blake <eblake@redhat.com>
qemu_co_queue_next() and qemu_co_queue_restart_all() just call
aio_co_wake() which works well in non-coroutine context. So these
functions can be called from non-coroutine context as well. And
actually qemu_co_queue_restart_all() is called from
nbd_cancel_in_flight(), which is called from non-coroutine context.
Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Message-Id: <20210610100802.5888-2-vsementsov@virtuozzo.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Eric Blake <eblake@redhat.com>
If we want to wake up a coroutine from a worker thread, aio_co_wake()
currently does not work. In that scenario, aio_co_wake() calls
aio_co_enter(), but there is no current AioContext and therefore
qemu_get_current_aio_context() returns the main thread. aio_co_wake()
then attempts to call aio_context_acquire() instead of going through
aio_co_schedule().
The default case of qemu_get_current_aio_context() was added to cover
synchronous I/O started from the vCPU thread, but the main and vCPU
threads are quite different. The main thread is an I/O thread itself,
only running a more complicated event loop; the vCPU thread instead
is essentially a worker thread that occasionally calls
qemu_mutex_lock_iothread(). It is only in those critical sections
that it acts as if it were the home thread of the main AioContext.
Therefore, this patch detaches qemu_get_current_aio_context() from
iothreads, which is a useless complication. The AioContext pointer
is stored directly in the thread-local variable, including for the
main loop. Worker threads (including vCPU threads) optionally behave
as temporary home threads if they have taken the big QEMU lock,
but if that is not the case they will always schedule coroutines
on remote threads via aio_co_schedule().
With this change, the stub qemu_mutex_iothread_locked() must be changed
from true to false. The previous value of true was needed because the
main thread did not have an AioContext in the thread-local variable,
but now it does have one.
Reported-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Message-Id: <20210609122234.544153-1-pbonzini@redhat.com>
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Tested-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
[eblake: tweak commit message per Vladimir's review]
Signed-off-by: Eric Blake <eblake@redhat.com>
A bus lock is acquired through either split locked access to writeback
(WB) memory or any locked access to non-WB memory. It is typically >1000
cycles slower than an atomic operation within a cache and can also
disrupts performance on other cores.
Virtual Machines can exploit bus locks to degrade the performance of
system. To address this kind of performance DOS attack coming from the
VMs, bus lock VM exit is introduced in KVM and it can report the bus
locks detected in guest. If enabled in KVM, it would exit to the
userspace to let the user enforce throttling policies once bus locks
acquired in VMs.
The availability of bus lock VM exit can be detected through the
KVM_CAP_X86_BUS_LOCK_EXIT. The returned bitmap contains the potential
policies supported by KVM. The field KVM_BUS_LOCK_DETECTION_EXIT in
bitmap is the only supported strategy at present. It indicates that KVM
will exit to userspace to handle the bus locks.
This patch adds a ratelimit on the bus locks acquired in guest as a
mitigation policy.
Introduce a new field "bus_lock_ratelimit" to record the limited speed
of bus locks in the target VM. The user can specify it through the
"bus-lock-ratelimit" as a machine property. In current implementation,
the default value of the speed is 0 per second, which means no
restrictions on the bus locks.
As for ratelimit on detected bus locks, simply set the ratelimit
interval to 1s and restrict the quota of bus lock occurence to the value
of "bus_lock_ratelimit". A potential alternative is to introduce the
time slice as a property which can help the user achieve more precise
control.
The detail of bus lock VM exit can be found in spec:
https://software.intel.com/content/www/us/en/develop/download/intel-architecture-instruction-set-extensions-programming-reference.html
Signed-off-by: Chenyi Qiang <chenyi.qiang@intel.com>
Message-Id: <20210521043820.29678-1-chenyi.qiang@intel.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Lots of this are expected to be coming in, create a directory for them.
Also move the tmp105.h file into the include directory where it
should be.
Cc: Cédric Le Goater <clg@kaod.org>
Cc: Peter Maydell <peter.maydell@linaro.org>
Cc: Andrew Jeffery <andrew@aj.id.au>
Cc: Joel Stanley <joel@jms.id.au>
Cc: Andrzej Zaborowski <balrogg@gmail.com>
Cc: qemu-arm@nongnu.org
Signed-off-by: Corey Minyard <cminyard@mvista.com>
Acked-by: Cédric Le Goater <clg@kaod.org>
It's an adc, put it where it belongs.
Cc: Andrzej Zaborowski <balrogg@gmail.com>
Cc: Peter Maydell <peter.maydell@linaro.org>
Cc: qemu-arm@nongnu.org
Signed-off-by: Corey Minyard <cminyard@mvista.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Tested-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
It's an ADC, put it where it belongs.
Cc: Edgar E. Iglesias <edgar.iglesias@gmail.com>
Cc: Alistair Francis <alistair@alistair23.me>
Cc: Peter Maydell <peter.maydell@linaro.org>
Cc: qemu-arm@nongnu.org
Signed-off-by: Corey Minyard <cminyard@mvista.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Tested-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Edgar E. Iglesias <edgar.iglesias@xilinx.com>
int128_make64() creates an Int128 from an unsigned 64 bit value; add
a function int128_makes64() creating an Int128 from a signed 64 bit
value.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Message-id: 20210614151007.4545-34-peter.maydell@linaro.org
Currently the ARM SVE helper code defines locally some utility
functions for swapping 16-bit halfwords within 32-bit or 64-bit
values and for swapping 32-bit words within 64-bit values,
parallel to the byte-swapping bswap16/32/64 functions.
We want these also for the ARM MVE code, and they're potentially
generally useful for other targets, so move them to bitops.h.
(We don't put them in bswap.h with the bswap* functions because
they are implemented in terms of the rotate operations also
defined in bitops.h, and including bitops.h from bswap.h seems
better avoided.)
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Message-id: 20210614151007.4545-17-peter.maydell@linaro.org
_Static_assert is part of C11, which is now required.
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Message-Id: <20210614233143.1221879-9-richard.henderson@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
All previous users now use C11 _Generic.
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Message-Id: <20210614233143.1221879-8-richard.henderson@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
This is both more and less complicated than our expansion
using __builtin_choose_expr and __builtin_types_compatible_p.
The expansion through QEMU_MAKE_LOCKABLE_ doesn't work because
we're not emumerating all of the types within the same _Generic,
which results in errors about unhandled cases. We must also
handle void* explicitly, so that the NULL constant can be used.
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Tested-by: Alex Bennée <alex.bennee@linaro.org>
Message-Id: <20210614233143.1221879-7-richard.henderson@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
We will shortly convert lockable.h to _Generic, and we cannot
have two compatible types in the same expansion. Wrap QemuMutex
in a struct, and unwrap in qemu-thread-posix.c.
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Message-Id: <20210614233143.1221879-6-richard.henderson@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Create macros for file+line expansion in qemu_rec_mutex_unlock
like we have for qemu_mutex_unlock.
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Message-Id: <20210614233143.1221879-5-richard.henderson@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Move the declarations from thread-win32.h into thread.h
and remove the macro redirection from thread-posix.h.
This will be required by following cleanups.
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Message-Id: <20210614233143.1221879-4-richard.henderson@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
_Static_assert is part of C11, which is now required.
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Message-Id: <20210614233143.1221879-9-richard.henderson@linaro.org>
All previous users now use C11 _Generic.
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Message-Id: <20210614233143.1221879-8-richard.henderson@linaro.org>
This is both more and less complicated than our expansion
using __builtin_choose_expr and __builtin_types_compatible_p.
The expansion through QEMU_MAKE_LOCKABLE_ doesn't work because
we're not emumerating all of the types within the same _Generic,
which results in errors about unhandled cases. We must also
handle void* explicitly, so that the NULL constant can be used.
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Tested-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Message-Id: <20210614233143.1221879-7-richard.henderson@linaro.org>
We will shortly convert lockable.h to _Generic, and we cannot
have two compatible types in the same expansion. Wrap QemuMutex
in a struct, and unwrap in qemu-thread-posix.c.
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Message-Id: <20210614233143.1221879-6-richard.henderson@linaro.org>
Create macros for file+line expansion in qemu_rec_mutex_unlock
like we have for qemu_mutex_unlock.
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Message-Id: <20210614233143.1221879-5-richard.henderson@linaro.org>
Move the declarations from thread-win32.h into thread.h
and remove the macro redirection from thread-posix.h.
This will be required by following cleanups.
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Message-Id: <20210614233143.1221879-4-richard.henderson@linaro.org>
Let's provide a way to control the use of RAM_NORESERVE via memory
backends using the "reserve" property which defaults to true (old
behavior).
Only Linux currently supports clearing the flag (and support is checked at
runtime, depending on the setting of "/proc/sys/vm/overcommit_memory").
Windows and other POSIX systems will bail out with "reserve=false".
The target use case is virtio-mem, which dynamically exposes memory
inside a large, sparse memory area to the VM. This essentially allows
avoiding to set "/proc/sys/vm/overcommit_memory == 0") when using
virtio-mem and also supporting hugetlbfs in the future.
As really only Linux implements RAM_NORESERVE right now, let's expose
the property only with CONFIG_LINUX. Setting the property to "false"
will then only fail in corner cases -- for example on very old kernels
or when memory overcommit was completely disabled by the admin.
Reviewed-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Eduardo Habkost <ehabkost@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Acked-by: Eduardo Habkost <ehabkost@redhat.com> for memory backend and machine core
Cc: Markus Armbruster <armbru@redhat.com>
Cc: Eric Blake <eblake@redhat.com>
Cc: Igor Mammedov <imammedo@redhat.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20210510114328.21835-11-david@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Let's support RAM_NORESERVE via MAP_NORESERVE on Linux. The flag has no
effect on most shared mappings - except for hugetlbfs and anonymous memory.
Linux man page:
"MAP_NORESERVE: Do not reserve swap space for this mapping. When swap
space is reserved, one has the guarantee that it is possible to modify
the mapping. When swap space is not reserved one might get SIGSEGV
upon a write if no physical memory is available. See also the discussion
of the file /proc/sys/vm/overcommit_memory in proc(5). In kernels before
2.6, this flag had effect only for private writable mappings."
Note that the "guarantee" part is wrong with memory overcommit in Linux.
Also, in Linux hugetlbfs is treated differently - we configure reservation
of huge pages from the pool, not reservation of swap space (huge pages
cannot be swapped).
The rough behavior is [1]:
a) !Hugetlbfs:
1) Without MAP_NORESERVE *or* with memory overcommit under Linux
disabled ("/proc/sys/vm/overcommit_memory == 2"), the following
accounting/reservation happens:
For a file backed map
SHARED or READ-only - 0 cost (the file is the map not swap)
PRIVATE WRITABLE - size of mapping per instance
For an anonymous or /dev/zero map
SHARED - size of mapping
PRIVATE READ-only - 0 cost (but of little use)
PRIVATE WRITABLE - size of mapping per instance
2) With MAP_NORESERVE, no accounting/reservation happens.
b) Hugetlbfs:
1) Without MAP_NORESERVE, huge pages are reserved.
2) With MAP_NORESERVE, no huge pages are reserved.
Note: With "/proc/sys/vm/overcommit_memory == 0", we were already able
to configure it for !hugetlbfs globally; this toggle now allows
configuring it more fine-grained, not for the whole system.
The target use case is virtio-mem, which dynamically exposes memory
inside a large, sparse memory area to the VM.
[1] https://www.kernel.org/doc/Documentation/vm/overcommit-accounting
Reviewed-by: Peter Xu <peterx@redhat.com>
Acked-by: Eduardo Habkost <ehabkost@redhat.com> for memory backend and machine core
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20210510114328.21835-10-david@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Let's introduce RAM_NORESERVE, allowing mmap'ing with MAP_NORESERVE. The
new flag has the following semantics:
"
RAM is mmap-ed with MAP_NORESERVE. When set, reserving swap space (or huge
pages if applicable) is skipped: will bail out if not supported. When not
set, the OS will do the reservation, if supported for the memory type.
"
Allow passing it into:
- memory_region_init_ram_nomigrate()
- memory_region_init_resizeable_ram()
- memory_region_init_ram_from_file()
... and teach qemu_ram_mmap() and qemu_anon_ram_alloc() about the flag.
Bail out if the flag is not supported, which is the case right now for
both, POSIX and win32. We will add Linux support next and allow specifying
RAM_NORESERVE via memory backends.
The target use case is virtio-mem, which dynamically exposes memory
inside a large, sparse memory area to the VM.
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Acked-by: Eduardo Habkost <ehabkost@redhat.com> for memory backend and machine core
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20210510114328.21835-9-david@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Let's pass flags instead of bools to prepare for passing other flags and
update the documentation of qemu_ram_mmap(). Introduce new QEMU_MAP_
flags that abstract the mmap() PROT_ and MAP_ flag handling and simplify
it.
We expose only flags that are currently supported by qemu_ram_mmap().
Maybe, we'll see qemu_mmap() in the future as well that can implement these
flags.
Note: We don't use MAP_ flags as some flags (e.g., MAP_SYNC) are only
defined for some systems and we want to always be able to identify
these flags reliably inside qemu_ram_mmap() -- for example, to properly
warn when some future flags are not available or effective on a system.
Also, this way we can simplify PROT_ handling as well.
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Acked-by: Eduardo Habkost <ehabkost@redhat.com> for memory backend and machine core
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20210510114328.21835-8-david@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Let's pass ram_flags to qemu_ram_alloc() and qemu_ram_alloc_internal(),
preparing for passing additional flags.
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Acked-by: Eduardo Habkost <ehabkost@redhat.com> for memory backend and machine core
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20210510114328.21835-7-david@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Let's pass in ram flags just like we do with qemu_ram_alloc_from_file(),
to clean up and prepare for more flags.
Simplify the documentation of passed ram flags: Looking at our
documentation of RAM_SHARED and RAM_PMEM is sufficient, no need to be
repetitive.
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Acked-by: Eduardo Habkost <ehabkost@redhat.com> for memory backend and machine core
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20210510114328.21835-5-david@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
We can create shared anonymous memory via
"-object memory-backend-ram,share=on,..."
which is, for example, required by PVRDMA for mremap() to work.
Shared anonymous memory is weird, though. Instead of MADV_DONTNEED, we
have to use MADV_REMOVE: MADV_DONTNEED will only remove / zap all
relevant page table entries of the current process, the backend storage
will not get removed, resulting in no reduced memory consumption and
a repopulation of previous content on next access.
Shared anonymous memory is internally really just shmem, but without a
fd exposed. As we cannot use fallocate() without the fd to discard the
backing storage, MADV_REMOVE gets the same job done without a fd as
documented in "man 2 madvise". Removing backing storage implicitly
invalidates all page table entries with relevant mappings - an additional
MADV_DONTNEED is not required.
Fixes: 06329ccecf ("mem: add share parameter to memory-backend-ram")
Reviewed-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20210406080126.24010-3-david@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
The LUN is selected with an IDENTIFY message, and persists
until the next message out phase. Instead of passing it to
do_busid_cmd, store it in ESPState. Because do_cmd can simply
skip the message out phase if cmdfifo_cdb_offset is zero, it
can now be used for the S without ATN cases as well.
Reviewed-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Tested-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Cc: M: Michael S. Tsirkin <mst@redhat.com>
Cc: Igor Mammedov <imammedo@redhat.com>
Signed-off-by: Stefan Berger <stefanb@linux.ibm.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Tested-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Message-Id: <20210614191335.1968807-4-stefanb@linux.ibm.com>
Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com>
At some point during the development of tcg_constant_*, I changed
my mind about whether such temps should be able to be passed to
tcg_temp_free_*. The final version committed allows this, but the
commentary was not updated to match.
Fixes: c0522136ad
Reported-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Luis Pires <luis.pires@eldorado.org.br>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Introduce a function to remove everything emitted
since a given point.
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
These variables belong to the jit side, not the user side.
Since tcg_init_ctx is no longer used outside of tcg/, move
the declaration to tcg-internal.h.
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Luis Pires <luis.pires@eldorado.org.br>
Suggested-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
For --enable-tcg-interpreter on Windows, we will need this.
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Luis Pires <luis.pires@eldorado.org.br>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Shortly, the full code_gen_buffer will only be visible
to region.c, so move in_code_gen_buffer out-of-line.
Move the debugging versions of tcg_splitwx_to_{rx,rw}
to region.c as well, so that the compiler gets to see
the implementation of in_code_gen_buffer.
This leaves exactly one use of in_code_gen_buffer outside
of region.c, in cpu_restore_state. Which, being on the
exception path, is not performance critical.
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Luis Pires <luis.pires@eldorado.org.br>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Start removing the include of hw/boards.h from tcg/.
Pass down the max_cpus value from tcg_init_machine,
where we have the MachineState already.
Reviewed-by: Luis Pires <luis.pires@eldorado.org.br>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
There is only one caller, and shortly we will need access
to the MachineState, which tcg_init_machine already has.
Reviewed-by: Luis Pires <luis.pires@eldorado.org.br>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Perform both tcg_context_init and tcg_region_init.
Do not leave this split to the caller.
Reviewed-by: Luis Pires <luis.pires@eldorado.org.br>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Buffer management is integral to tcg. Do not leave the allocation
to code outside of tcg/. This is code movement, with further
cleanups to follow.
Reviewed-by: Luis Pires <luis.pires@eldorado.org.br>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
This patch implements the vq notification mapping support for
vhost-vDPA. This is simply done by using mmap()/munmap() for the
vhost-vDPA fd during device start/stop. For the device without
notification mapping support, we fall back to eventfd based
notification gracefully.
Reviewed-by: Si-Wei Liu <si-wei.liu@oracle.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
These two commands are missing when adding the QMP sister commands.
Add them, so developers can play with them easier.
Signed-off-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Hyman Huang(黄勇) <huangy81@chinatelecom.cn>
Message-Id: <4cc0039fc3ad6145136770cf3b0f056c09a2910b.1623027729.git.huangy81@chinatelecom.cn>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
The OpenSBI BIOS image names are used by many RISC-V machines.
Let's define macros for them.
Signed-off-by: Bin Meng <bin.meng@windriver.com>
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Message-id: 20210430071302.1489082-7-bmeng.cn@gmail.com
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
The symbol address_space_memory are already declared in
include/exec/address-spaces.h. So let's add this header file
and remove the redundant declaration in include/hw/virtio/vhost-vdpa.h.
Signed-off-by: Xie Yongji <xieyongji@bytedance.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Reviewed-by: Stefano Garzarella <sgarzare@redhat.com>
Message-Id: <20210517123246.999-1-xieyongji@bytedance.com>
Signed-off-by: Laurent Vivier <laurent@vivier.eu>
Long story short, we need a space here for the reference to work
correctly.
Longer story:
Without the space, kerneldoc generates a line like this:
one of :c:type:`MemoryListener.region_add\(\) <MemoryListener>`,:c:type:`MemoryListener.region_del\(\)
Sphinx does not process the role information correctly, so we get this
(my pseudo-notation) construct:
<text>,:c:type:</text>
<reference target="MemoryListener">MemoryListener.region_del()</reference>
which does not reference the desired entity, and leaves some extra junk
in the rendered output. See
https://qemu-project.gitlab.io/qemu/devel/memory.html#c.MemoryListener
member log_start for an example of the broken output as it looks today.
Signed-off-by: John Snow <jsnow@redhat.com>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Message-Id: <20210511192950.2061326-1-jsnow@redhat.com>
Signed-off-by: Laurent Vivier <laurent@vivier.eu>
Headers should be included from the 'include/' directory,
not from the root directory.
Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Bin Meng <bmeng.cn@gmail.com>
Message-Id: <20210516205034.694788-1-f4bug@amsat.org>
Signed-off-by: Laurent Vivier <laurent@vivier.eu>
Let -readconfig support parsing command line options into QDict or
QemuOpts. This will be used to add back support for objects in
-readconfig.
Cc: Markus Armbruster <armbru@redhat.com>
Cc: qemu-stable@nongnu.org
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Message-Id: <20210524105752.3318299-3-pbonzini@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Change the parser to put the values into a QDict and pass them
to a callback. qemu_config_parse's QemuOpts creation is
itself turned into a callback function.
This is useful for -readconfig to support keyval-based options;
getting a QDict from the parser removes a roundtrip from
QDict to QemuOpts and then back to QDict.
Unfortunately there is a disadvantage in that semantic errors will
point to the last line of the group, because the entries of the QDict
do not have a location attached.
Cc: Kevin Wolf <kwolf@redhat.com>
Cc: Markus Armbruster <armbru@redhat.com>
Cc: qemu-stable@nongnu.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Message-Id: <20210524105752.3318299-2-pbonzini@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
When RSS is enabled the device tries to load the eBPF program
to select RX virtqueue in the TUN. If eBPF can be loaded
the RSS will function also with vhost (works with kernel 5.8 and later).
Software RSS is used as a fallback with vhost=off when eBPF can't be loaded
or when hash population requested by the guest.
Signed-off-by: Yuri Benditovich <yuri.benditovich@daynix.com>
Signed-off-by: Andrew Melnychenko <andrew@daynix.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
For now, that method supported only by Linux TAP.
Linux TAP uses TUNSETSTEERINGEBPF ioctl.
Signed-off-by: Andrew Melnychenko <andrew@daynix.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
Rename to parts$N_modrem. This was the last use of a lot
of the legacy infrastructure, so remove it as required.
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Use an enumeration instead of raw 32/64/80 values.
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>