Add support for the Ibex timer. This is used with the RISC-V
mtime/mtimecmp similar to the SiFive CLINT.
We currently don't support changing the prescale or the timervalue.
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
Reviewed-by: Bin Meng <bmeng.cn@gmail.com>
Message-id: 716fdea2244515ce86a2c46fe69467d013c03147.1624001156.git.alistair.francis@wdc.com
This QOMifies the SiFive UART model. Migration and reset have been
implemented.
Signed-off-by: Lukas Jünger <lukas.juenger@greensocs.com>
Reviewed-by: Bin Meng <bmeng.cn@gmail.com>
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Message-id: 20210616092326.59639-3-lukas.juenger@greensocs.com
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
- tcg: implement the vector enhancements facility and bump the
'qemu' cpu model to a stripped-down z14 GA2
- fix psw.mask handling in signals
- fix vfio-ccw sense data handling
-----BEGIN PGP SIGNATURE-----
iQJGBAABCAAwFiEEw9DWbcNiT/aowBjO3s9rk8bwL68FAmDQYXwSHGNvaHVja0By
ZWRoYXQuY29tAAoJEN7Pa5PG8C+vrDAQALLimgzAm6br4SHP49T7ZsDkuhZwyYpP
Fg09vxjMmKWgLOIQNp7Xd1vQJ5voGc5D0KleVMuX2feycfmon0yVeIBMan6DTcfr
lygtiBYrgPWVAs36OXQ/rJUHt2ZUZaQsS57lTgn+Jtn7p+AMjiMrDlam+iqoAnU+
o5RtTFG+bhqa72WI7mCG54hRfXS1b/K8Ts1qs0oJJVDrDWlmLWfpjuJU3ehvhepA
hJYnhIRQgbFHPsaJI47s25aa6KC+cGTGGRMl4YmFPACMh1KNXqmGP1XbxyEhz5tl
LUdU9JRikbBsErcItHfZabGktLtBi7B9Vyh9KhxG9vK7ol8GUD4pomHrLNH2UUtH
MyhTcuVCcEakuhgRr8GwA+7KO2Y3quqHDC3/kCkIarrE4X+YJl9Glv76we6XvbYL
4SAPE87Ub465C3J3tUjLDtfq8LpCIUh7zCYLBfk2Yf4pIbjnWMzUBu3UD2XYrKGF
+g+J8ZVjE/WFsZzWUJL54lLuT8+FjPLgNOsth1WTENGUEK+JgWGpHbpR1XdqQFj8
f+bk1nrL94iq3KRdmCaO1w6vc+5xDG0tSY4tVJ1Nip1w3ZxmJFOUWGWLs04hz4mn
WLx3vHbq3g1HZn9dtqwh6BvAP9fdLw2xkBcRtepR7vPM0ydqp2dvBbu99MrZN3H1
3Sa6lIpr4bII
=bMQp
-----END PGP SIGNATURE-----
Merge remote-tracking branch 'remotes/cohuck-gitlab/tags/s390x-20210621' into staging
s390x update:
- tcg: implement the vector enhancements facility and bump the
'qemu' cpu model to a stripped-down z14 GA2
- fix psw.mask handling in signals
- fix vfio-ccw sense data handling
# gpg: Signature made Mon 21 Jun 2021 10:53:00 BST
# gpg: using RSA key C3D0D66DC3624FF6A8C018CEDECF6B93C6F02FAF
# gpg: issuer "cohuck@redhat.com"
# gpg: Good signature from "Cornelia Huck <conny@cornelia-huck.de>" [unknown]
# gpg: aka "Cornelia Huck <huckc@linux.vnet.ibm.com>" [full]
# gpg: aka "Cornelia Huck <cornelia.huck@de.ibm.com>" [full]
# gpg: aka "Cornelia Huck <cohuck@kernel.org>" [unknown]
# gpg: aka "Cornelia Huck <cohuck@redhat.com>" [unknown]
# Primary key fingerprint: C3D0 D66D C362 4FF6 A8C0 18CE DECF 6B93 C6F0 2FAF
* remotes/cohuck-gitlab/tags/s390x-20210621: (37 commits)
s390x/css: Add passthrough IRB
s390x/css: Refactor IRB construction
s390x/css: Split out the IRB sense data
s390x/css: Introduce an ESW struct
linux-user/s390x: Save and restore psw.mask properly
target/s390x: Use s390_cpu_{set_psw, get_psw_mask} in gdbstub
target/s390x: Improve s390_cpu_dump_state vs cc_op
target/s390x: Do not modify cpu state in s390_cpu_get_psw_mask
target/s390x: Expose load_psw and get_psw_mask to cpu.h
configure: Check whether we can compile the s390-ccw bios with -msoft-float
s390x/cpumodel: Bump up QEMU model to a stripped-down IBM z14 GA2
s390x/tcg: We support Vector enhancements facility
linux-user: elf: s390x: Prepare for Vector enhancements facility
s390x/tcg: Implement VECTOR FP (MAXIMUM|MINIMUM)
s390x/tcg: Implement VECTOR FP NEGATIVE MULTIPLY AND (ADD|SUBTRACT)
s390x/tcg: Implement 32/128 bit for VECTOR FP MULTIPLY AND (ADD|SUBTRACT)
s390x/tcg: Implement 32/128 bit for VECTOR FP TEST DATA CLASS IMMEDIATE
s390x/tcg: Implement 32/128 bit for VECTOR FP PERFORM SIGN OPERATION
s390x/tcg: Implement 128 bit for VECTOR FP LOAD ROUNDED
s390x/tcg: Implement 64 bit for VECTOR FP LOAD LENGTHENED
...
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
The Arm MVE VDUP implementation would like to be able to emit code to
duplicate a byte or halfword value into an i32. We have code to do
this already in tcg-op-gvec.c, so all we need to do is make the
functions global.
For consistency with other functions made available to the frontends:
* we rename to tcg_gen_dup_*
* we expose both the _i32 and _i64 forms
* we provide the #define for a _tl form
Suggested-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 20210617121628.20116-10-peter.maydell@linaro.org
Allow code elsewhere in the system to check whether the ACPI GHES
table is present, so it can determine whether it is OK to try to
record an error by calling acpi_ghes_record_errors().
(We don't need to migrate the new 'present' field in AcpiGhesState,
because it is set once at system initialization and doesn't change.)
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Dongjiu Geng <gengdongjiu1@gmail.com>
Message-id: 20210603171259.27962-3-peter.maydell@linaro.org
Features:
* Add ratelimit for bus locks acquired in guest (Chenyi Qiang)
Documentation:
* SEV documentation updates (Tom Lendacky)
* Add a table showing x86-64 ABI compatibility levels (Daniel P. Berrangé)
Automated changes:
* Update Linux headers to 5.13-rc4 (Eduardo Habkost)
-----BEGIN PGP SIGNATURE-----
iQJIBAABCAAyFiEEWjIv1avE09usz9GqKAeTb5hNxaYFAmDM+T4UHGVoYWJrb3N0
QHJlZGhhdC5jb20ACgkQKAeTb5hNxabUrQ/+PtiJjd1cW9nhA0kWu8dVGq3xXJb4
Nbma86tRPKBauTeQCLccXEvUjLqgFejeQlArhq4QKErLisXu4TDuQ+GeAfdR7h5P
MTMSo0C665cT2/NbrwQizSPQdrNEgZAYRaDRafZLQTJ1TStzWDB1Vg79rzpWPcn0
76XjIfSdGZUa4B1OvjNvUFq/SXf+0hW75soCwRhDNh5tfzfyct0XCSRF/wTXqyR/
7yxDtfTzUAvT+6l3qb8ky+wqUTIY58BgjbdIGhyAUr5/N8y5YystF41TUVoy772k
pmCXHniMmgmhH7HVwGujtc6mPe5y1VFJVaA08Pzb7KwSfdO9F/3Gk3DHpKW8/whi
tCGluBqz0qlyhsnP9wDRJb6BzCBl2hVqu50DL+uSNsJOSIW60LLMJV4ANlDYdDM3
s33S5NrM0DsRAjrtczPdvKPWwaVE4NB2bYX1I3yYGgflwzQYOjBmswM/UgymhlZk
5dxtF9CX2p+Vre6UoLDKum1DJDCcWjHouJAAqZzxxEko56yWgTUSzTcK4GVOlsAc
qX4gJbFpOzDlSdpDTG/fcnQlCnwc1jxCzsB8Wy2KJiBif3Sa3Wh1s00Cp7oGNQt+
P/z2Fp1agl8u83bbvlIjZnsv0O2g5Ks4r5tBhXmqI36aiU26F/x39SUfp7/7OAUd
CQBGBGXqpnOmUcE=
=YWGX
-----END PGP SIGNATURE-----
Merge remote-tracking branch 'remotes/ehabkost-gl/tags/x86-next-pull-request' into staging
x86 queue, 2021-06-18
Features:
* Add ratelimit for bus locks acquired in guest (Chenyi Qiang)
Documentation:
* SEV documentation updates (Tom Lendacky)
* Add a table showing x86-64 ABI compatibility levels (Daniel P. Berrangé)
Automated changes:
* Update Linux headers to 5.13-rc4 (Eduardo Habkost)
# gpg: Signature made Fri 18 Jun 2021 20:51:26 BST
# gpg: using RSA key 5A322FD5ABC4D3DBACCFD1AA2807936F984DC5A6
# gpg: issuer "ehabkost@redhat.com"
# gpg: Good signature from "Eduardo Habkost <ehabkost@redhat.com>" [full]
# Primary key fingerprint: 5A32 2FD5 ABC4 D3DB ACCF D1AA 2807 936F 984D C5A6
* remotes/ehabkost-gl/tags/x86-next-pull-request:
scripts: helper to generate x86_64 CPU ABI compat info
docs: add a table showing x86-64 ABI compatibility levels
docs/interop/firmware.json: Add SEV-ES support
docs: Add SEV-ES documentation to amd-memory-encryption.txt
doc: Fix some mistakes in the SEV documentation
i386: Add ratelimit for bus locks acquired in guest
Update Linux headers to 5.13-rc4
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Wire in the subchannel callback for building the IRB
ESW and ECW space for passthrough devices, and copy
the hardware's ESW into the IRB we are building.
If the hardware presented concurrent sense, then copy
that sense data into the IRB's ECW space.
Signed-off-by: Eric Farman <farman@linux.ibm.com>
Message-Id: <20210617232537.1337506-5-farman@linux.ibm.com>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
Currently, all subchannel types have "sense data" copied into
the IRB.ECW space, and a couple flags enabled in the IRB.SCSW
and IRB.ESW. But for passthrough (vfio-ccw) subchannels,
this data isn't populated in the first place, so enabling
those flags leads to unexpected behavior if the guest tries to
process the sense data (zeros) in the IRB.ECW.
Let's add a subchannel callback that builds these portions of
the IRB, and move the existing code into a routine for those
virtual subchannels. The passthrough subchannels will be able
to piggy-back onto this later.
Signed-off-by: Eric Farman <farman@linux.ibm.com>
Message-Id: <20210617232537.1337506-4-farman@linux.ibm.com>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
The Interrupt Response Block is comprised of several other
structures concatenated together, but only the 12-byte
Subchannel-Status Word (SCSW) is defined as a proper struct.
Everything else is a simple array of 32-bit words.
Let's define a proper struct for the 20-byte Extended-Status
Word (ESW) so that we can make good decisions about the sense
data that would go into the ECW area for virtual vs
passthrough devices.
[CH: adapted ESW definition to build with mingw, as discussed]
Signed-off-by: Eric Farman <farman@linux.ibm.com>
Message-Id: <20210617232537.1337506-2-farman@linux.ibm.com>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
Let's check for S390_FEAT_VECTOR_ENH and set HWCAP_S390_VXRS_EXT
accordingly. Add all missing HWCAP defined in upstream Linux.
Cc: Laurent Vivier <laurent@vivier.eu>
Acked-by: Laurent Vivier <laurent@vivier.eu>
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20210608092337.12221-25-david@redhat.com>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
Leading underscores followed by a capital letter or underscore are
reserved by the C standard.
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/369
Signed-off-by: Ahmed Abouzied <email@aabouzied.com>
Message-Id: <20210605174938.13782-1-email@aabouzied.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
This commit moves into a separate file routines used to manipulate
TCGCond. These will be employed by the idef-parser.
Signed-off-by: Alessandro Di Federico <ale@rev.ng>
Signed-off-by: Paolo Montesel <babush@rev.ng>
Message-Id: <20210619093713.1845446-2-ale.qemu@rev.ng>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
This removes all of the problems with unaligned accesses
to the bytecode stream.
With an 8-bit opcode at the bottom, we have 24 bits remaining,
which are generally split into 6 4-bit slots. This fits well
with the maximum length opcodes, e.g. INDEX_op_add2_i32, which
have 6 register operands.
We have, in previous patches, rearranged things such that there
are no operations with a label which have more than one other
operand. Which leaves us with a 20-bit field in which to encode
a label, giving us a maximum TB size of 512k -- easily large.
Change the INDEX_op_tci_movi_{i32,i64} opcodes to tci_mov[il].
The former puts the immediate in the upper 20 bits of the insn,
like we do for the label displacement. The later uses a label
to reference an entry in the constant pool. Thus, in the worst
case we still have a single memory reference for any constant,
but now the constants are out-of-line of the bytecode and can
be shared between different moves saving space.
Change INDEX_op_call to use a label to reference a pair of
pointers in the constant pool. This removes the only slightly
dodgy link with the layout of struct TCGHelperInfo.
The re-encode cannot be done in pieces.
Tested-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
This requires adjusting where arguments are stored.
Place them on the stack at left-aligned positions.
Adjust the stack frame to be at entirely positive offsets.
Tested-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
As noted by qemu-plugins.h, enum qemu_plugin_cb_flags is
currently unused -- plugins can neither read nor write
guest registers.
Tested-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
We will shortly be interested in distinguishing pointers
from integers in the helper's declaration, as well as a
true void return. We currently have two parallel 1 bit
fields; merge them and expand to a 3 bit field.
Our current maximum is 7 helper arguments, plus the return
makes 8 * 3 = 24 bits used within the uint32_t typemask.
Tested-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
We'll need a possibility of non-blocking nbd_co_establish_connection(),
so that it returns immediately, and it returns success only if a
connections was previously established in background.
Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Message-Id: <20210610100802.5888-30-vsementsov@virtuozzo.com>
Signed-off-by: Eric Blake <eblake@redhat.com>
block/nbd doesn't need underlying sioc channel anymore. So, we can
update nbd/client-connection interface to return only one top-most io
channel, which is more straight forward.
Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Message-Id: <20210610100802.5888-27-vsementsov@virtuozzo.com>
[eblake: squash in Vladimir's fixes for uninit usage caught by clang]
Signed-off-by: Eric Blake <eblake@redhat.com>
Add an option for a thread to retry connecting until it succeeds. We'll
use nbd/client-connection both for reconnect and for initial connection
in nbd_open(), so we need a possibility to use same NBDClientConnection
instance to connect once in nbd_open() and then use retry semantics for
reconnect.
Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Message-Id: <20210610100802.5888-21-vsementsov@virtuozzo.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
[eblake: grammar tweak]
Signed-off-by: Eric Blake <eblake@redhat.com>
Add arguments and logic to support nbd negotiation in the same thread
after successful connection.
Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Message-Id: <20210610100802.5888-20-vsementsov@virtuozzo.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Eric Blake <eblake@redhat.com>
We now have bs-independent connection API, which consists of four
functions:
nbd_client_connection_new()
nbd_client_connection_release()
nbd_co_establish_connection()
nbd_co_establish_connection_cancel()
Move them to a separate file together with NBDClientConnection
structure which becomes private to the new API.
Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Message-Id: <20210610100802.5888-18-vsementsov@virtuozzo.com>
[eblake: comment tweaks]
Signed-off-by: Eric Blake <eblake@redhat.com>
Add function that transforms named fd inside SocketAddress structure
into number representation. This way it may be then used in a context
where current monitor is not available.
Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Message-Id: <20210610100802.5888-6-vsementsov@virtuozzo.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
[eblake: comment tweak]
Signed-off-by: Eric Blake <eblake@redhat.com>
qemu_co_queue_next() and qemu_co_queue_restart_all() just call
aio_co_wake() which works well in non-coroutine context. So these
functions can be called from non-coroutine context as well. And
actually qemu_co_queue_restart_all() is called from
nbd_cancel_in_flight(), which is called from non-coroutine context.
Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Message-Id: <20210610100802.5888-2-vsementsov@virtuozzo.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Eric Blake <eblake@redhat.com>
If we want to wake up a coroutine from a worker thread, aio_co_wake()
currently does not work. In that scenario, aio_co_wake() calls
aio_co_enter(), but there is no current AioContext and therefore
qemu_get_current_aio_context() returns the main thread. aio_co_wake()
then attempts to call aio_context_acquire() instead of going through
aio_co_schedule().
The default case of qemu_get_current_aio_context() was added to cover
synchronous I/O started from the vCPU thread, but the main and vCPU
threads are quite different. The main thread is an I/O thread itself,
only running a more complicated event loop; the vCPU thread instead
is essentially a worker thread that occasionally calls
qemu_mutex_lock_iothread(). It is only in those critical sections
that it acts as if it were the home thread of the main AioContext.
Therefore, this patch detaches qemu_get_current_aio_context() from
iothreads, which is a useless complication. The AioContext pointer
is stored directly in the thread-local variable, including for the
main loop. Worker threads (including vCPU threads) optionally behave
as temporary home threads if they have taken the big QEMU lock,
but if that is not the case they will always schedule coroutines
on remote threads via aio_co_schedule().
With this change, the stub qemu_mutex_iothread_locked() must be changed
from true to false. The previous value of true was needed because the
main thread did not have an AioContext in the thread-local variable,
but now it does have one.
Reported-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Message-Id: <20210609122234.544153-1-pbonzini@redhat.com>
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Tested-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
[eblake: tweak commit message per Vladimir's review]
Signed-off-by: Eric Blake <eblake@redhat.com>
A bus lock is acquired through either split locked access to writeback
(WB) memory or any locked access to non-WB memory. It is typically >1000
cycles slower than an atomic operation within a cache and can also
disrupts performance on other cores.
Virtual Machines can exploit bus locks to degrade the performance of
system. To address this kind of performance DOS attack coming from the
VMs, bus lock VM exit is introduced in KVM and it can report the bus
locks detected in guest. If enabled in KVM, it would exit to the
userspace to let the user enforce throttling policies once bus locks
acquired in VMs.
The availability of bus lock VM exit can be detected through the
KVM_CAP_X86_BUS_LOCK_EXIT. The returned bitmap contains the potential
policies supported by KVM. The field KVM_BUS_LOCK_DETECTION_EXIT in
bitmap is the only supported strategy at present. It indicates that KVM
will exit to userspace to handle the bus locks.
This patch adds a ratelimit on the bus locks acquired in guest as a
mitigation policy.
Introduce a new field "bus_lock_ratelimit" to record the limited speed
of bus locks in the target VM. The user can specify it through the
"bus-lock-ratelimit" as a machine property. In current implementation,
the default value of the speed is 0 per second, which means no
restrictions on the bus locks.
As for ratelimit on detected bus locks, simply set the ratelimit
interval to 1s and restrict the quota of bus lock occurence to the value
of "bus_lock_ratelimit". A potential alternative is to introduce the
time slice as a property which can help the user achieve more precise
control.
The detail of bus lock VM exit can be found in spec:
https://software.intel.com/content/www/us/en/develop/download/intel-architecture-instruction-set-extensions-programming-reference.html
Signed-off-by: Chenyi Qiang <chenyi.qiang@intel.com>
Message-Id: <20210521043820.29678-1-chenyi.qiang@intel.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
int128_make64() creates an Int128 from an unsigned 64 bit value; add
a function int128_makes64() creating an Int128 from a signed 64 bit
value.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Message-id: 20210614151007.4545-34-peter.maydell@linaro.org
Currently the ARM SVE helper code defines locally some utility
functions for swapping 16-bit halfwords within 32-bit or 64-bit
values and for swapping 32-bit words within 64-bit values,
parallel to the byte-swapping bswap16/32/64 functions.
We want these also for the ARM MVE code, and they're potentially
generally useful for other targets, so move them to bitops.h.
(We don't put them in bswap.h with the bswap* functions because
they are implemented in terms of the rotate operations also
defined in bitops.h, and including bitops.h from bswap.h seems
better avoided.)
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Message-id: 20210614151007.4545-17-peter.maydell@linaro.org
_Static_assert is part of C11, which is now required.
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Message-Id: <20210614233143.1221879-9-richard.henderson@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
All previous users now use C11 _Generic.
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Message-Id: <20210614233143.1221879-8-richard.henderson@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
This is both more and less complicated than our expansion
using __builtin_choose_expr and __builtin_types_compatible_p.
The expansion through QEMU_MAKE_LOCKABLE_ doesn't work because
we're not emumerating all of the types within the same _Generic,
which results in errors about unhandled cases. We must also
handle void* explicitly, so that the NULL constant can be used.
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Tested-by: Alex Bennée <alex.bennee@linaro.org>
Message-Id: <20210614233143.1221879-7-richard.henderson@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
We will shortly convert lockable.h to _Generic, and we cannot
have two compatible types in the same expansion. Wrap QemuMutex
in a struct, and unwrap in qemu-thread-posix.c.
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Message-Id: <20210614233143.1221879-6-richard.henderson@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Create macros for file+line expansion in qemu_rec_mutex_unlock
like we have for qemu_mutex_unlock.
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Message-Id: <20210614233143.1221879-5-richard.henderson@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Move the declarations from thread-win32.h into thread.h
and remove the macro redirection from thread-posix.h.
This will be required by following cleanups.
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Message-Id: <20210614233143.1221879-4-richard.henderson@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
_Static_assert is part of C11, which is now required.
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Message-Id: <20210614233143.1221879-9-richard.henderson@linaro.org>
All previous users now use C11 _Generic.
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Message-Id: <20210614233143.1221879-8-richard.henderson@linaro.org>
This is both more and less complicated than our expansion
using __builtin_choose_expr and __builtin_types_compatible_p.
The expansion through QEMU_MAKE_LOCKABLE_ doesn't work because
we're not emumerating all of the types within the same _Generic,
which results in errors about unhandled cases. We must also
handle void* explicitly, so that the NULL constant can be used.
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Tested-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Message-Id: <20210614233143.1221879-7-richard.henderson@linaro.org>
We will shortly convert lockable.h to _Generic, and we cannot
have two compatible types in the same expansion. Wrap QemuMutex
in a struct, and unwrap in qemu-thread-posix.c.
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Message-Id: <20210614233143.1221879-6-richard.henderson@linaro.org>
Create macros for file+line expansion in qemu_rec_mutex_unlock
like we have for qemu_mutex_unlock.
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Message-Id: <20210614233143.1221879-5-richard.henderson@linaro.org>
Move the declarations from thread-win32.h into thread.h
and remove the macro redirection from thread-posix.h.
This will be required by following cleanups.
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Message-Id: <20210614233143.1221879-4-richard.henderson@linaro.org>
Let's provide a way to control the use of RAM_NORESERVE via memory
backends using the "reserve" property which defaults to true (old
behavior).
Only Linux currently supports clearing the flag (and support is checked at
runtime, depending on the setting of "/proc/sys/vm/overcommit_memory").
Windows and other POSIX systems will bail out with "reserve=false".
The target use case is virtio-mem, which dynamically exposes memory
inside a large, sparse memory area to the VM. This essentially allows
avoiding to set "/proc/sys/vm/overcommit_memory == 0") when using
virtio-mem and also supporting hugetlbfs in the future.
As really only Linux implements RAM_NORESERVE right now, let's expose
the property only with CONFIG_LINUX. Setting the property to "false"
will then only fail in corner cases -- for example on very old kernels
or when memory overcommit was completely disabled by the admin.
Reviewed-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Eduardo Habkost <ehabkost@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Acked-by: Eduardo Habkost <ehabkost@redhat.com> for memory backend and machine core
Cc: Markus Armbruster <armbru@redhat.com>
Cc: Eric Blake <eblake@redhat.com>
Cc: Igor Mammedov <imammedo@redhat.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20210510114328.21835-11-david@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Let's support RAM_NORESERVE via MAP_NORESERVE on Linux. The flag has no
effect on most shared mappings - except for hugetlbfs and anonymous memory.
Linux man page:
"MAP_NORESERVE: Do not reserve swap space for this mapping. When swap
space is reserved, one has the guarantee that it is possible to modify
the mapping. When swap space is not reserved one might get SIGSEGV
upon a write if no physical memory is available. See also the discussion
of the file /proc/sys/vm/overcommit_memory in proc(5). In kernels before
2.6, this flag had effect only for private writable mappings."
Note that the "guarantee" part is wrong with memory overcommit in Linux.
Also, in Linux hugetlbfs is treated differently - we configure reservation
of huge pages from the pool, not reservation of swap space (huge pages
cannot be swapped).
The rough behavior is [1]:
a) !Hugetlbfs:
1) Without MAP_NORESERVE *or* with memory overcommit under Linux
disabled ("/proc/sys/vm/overcommit_memory == 2"), the following
accounting/reservation happens:
For a file backed map
SHARED or READ-only - 0 cost (the file is the map not swap)
PRIVATE WRITABLE - size of mapping per instance
For an anonymous or /dev/zero map
SHARED - size of mapping
PRIVATE READ-only - 0 cost (but of little use)
PRIVATE WRITABLE - size of mapping per instance
2) With MAP_NORESERVE, no accounting/reservation happens.
b) Hugetlbfs:
1) Without MAP_NORESERVE, huge pages are reserved.
2) With MAP_NORESERVE, no huge pages are reserved.
Note: With "/proc/sys/vm/overcommit_memory == 0", we were already able
to configure it for !hugetlbfs globally; this toggle now allows
configuring it more fine-grained, not for the whole system.
The target use case is virtio-mem, which dynamically exposes memory
inside a large, sparse memory area to the VM.
[1] https://www.kernel.org/doc/Documentation/vm/overcommit-accounting
Reviewed-by: Peter Xu <peterx@redhat.com>
Acked-by: Eduardo Habkost <ehabkost@redhat.com> for memory backend and machine core
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20210510114328.21835-10-david@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Let's introduce RAM_NORESERVE, allowing mmap'ing with MAP_NORESERVE. The
new flag has the following semantics:
"
RAM is mmap-ed with MAP_NORESERVE. When set, reserving swap space (or huge
pages if applicable) is skipped: will bail out if not supported. When not
set, the OS will do the reservation, if supported for the memory type.
"
Allow passing it into:
- memory_region_init_ram_nomigrate()
- memory_region_init_resizeable_ram()
- memory_region_init_ram_from_file()
... and teach qemu_ram_mmap() and qemu_anon_ram_alloc() about the flag.
Bail out if the flag is not supported, which is the case right now for
both, POSIX and win32. We will add Linux support next and allow specifying
RAM_NORESERVE via memory backends.
The target use case is virtio-mem, which dynamically exposes memory
inside a large, sparse memory area to the VM.
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Acked-by: Eduardo Habkost <ehabkost@redhat.com> for memory backend and machine core
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20210510114328.21835-9-david@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>