Use variable length suffixes for inode remapping instead of the fixed
16 bit size prefixes before. With this change the inode numbers on guest
will typically be much smaller (e.g. around >2^1 .. >2^7 instead of >2^48
with the previous fixed size inode remapping.
Additionally this solution is more efficient, since inode numbers in
practice can take almost their entire 64 bit range on guest as well, so
there is less likely a need for generating and tracking additional suffixes,
which might also be beneficial for nested virtualization where each level of
virtualization would shift up the inode bits and increase the chance of
expensive remapping actions.
The "Exponential Golomb" algorithm is used as basis for generating the
variable length suffixes. The algorithm has a parameter k which controls the
distribution of bits on increasing indeces (minimum bits at low index vs.
maximum bits at high index). With k=0 the generated suffixes look like:
Index Dec/Bin -> Generated Suffix Bin
1 [1] -> [1] (1 bits)
2 [10] -> [010] (3 bits)
3 [11] -> [110] (3 bits)
4 [100] -> [00100] (5 bits)
5 [101] -> [10100] (5 bits)
6 [110] -> [01100] (5 bits)
7 [111] -> [11100] (5 bits)
8 [1000] -> [0001000] (7 bits)
9 [1001] -> [1001000] (7 bits)
10 [1010] -> [0101000] (7 bits)
11 [1011] -> [1101000] (7 bits)
12 [1100] -> [0011000] (7 bits)
...
65533 [1111111111111101] -> [1011111111111111000000000000000] (31 bits)
65534 [1111111111111110] -> [0111111111111111000000000000000] (31 bits)
65535 [1111111111111111] -> [1111111111111111000000000000000] (31 bits)
Hence minBits=1 maxBits=31
And with k=5 they would look like:
Index Dec/Bin -> Generated Suffix Bin
1 [1] -> [000001] (6 bits)
2 [10] -> [100001] (6 bits)
3 [11] -> [010001] (6 bits)
4 [100] -> [110001] (6 bits)
5 [101] -> [001001] (6 bits)
6 [110] -> [101001] (6 bits)
7 [111] -> [011001] (6 bits)
8 [1000] -> [111001] (6 bits)
9 [1001] -> [000101] (6 bits)
10 [1010] -> [100101] (6 bits)
11 [1011] -> [010101] (6 bits)
12 [1100] -> [110101] (6 bits)
...
65533 [1111111111111101] -> [0011100000000000100000000000] (28 bits)
65534 [1111111111111110] -> [1011100000000000100000000000] (28 bits)
65535 [1111111111111111] -> [0111100000000000100000000000] (28 bits)
Hence minBits=6 maxBits=28
Signed-off-by: Christian Schoenebeck <qemu_oss@crudebyte.com>
Signed-off-by: Greg Kurz <groug@kaod.org>
stat_to_qid attempts via qid_path_prefixmap to map unique files (which are
identified by 64 bit inode nr and 32 bit device id) to a 64 QID path value.
However this implementation makes some assumptions about inode number
generation on the host.
If qid_path_prefixmap fails, we still have 48 bits available in the QID
path to fall back to a less memory efficient full mapping.
Signed-off-by: Antonios Motakis <antonios.motakis@huawei.com>
[CS: - Rebased to https://github.com/gkurz/qemu/commits/9p-next
(SHA1 7fc4c49e91).
- Updated hash calls to new xxhash API.
- Removed unnecessary parantheses in qpf_lookup_func().
- Removed unnecessary g_malloc0() result checks.
- Log error message when running out of prefixes in
qid_path_fullmap().
- Log warning message about potential degraded performance in
qid_path_prefixmap().
- Wrapped qpf_table initialization to dedicated qpf_table_init()
function.
- Fixed typo in comment. ]
Signed-off-by: Christian Schoenebeck <qemu_oss@crudebyte.com>
Signed-off-by: Greg Kurz <groug@kaod.org>
'warn' (default): Only log an error message (once) on host if more than one
device is shared by same export, except of that just ignore this config
error though. This is the default behaviour for not breaking existing
installations implying that they really know what they are doing.
'forbid': Like 'warn', but except of just logging an error this
also denies access of guest to additional devices.
'remap': Allows to share more than one device per export by remapping
inodes from host to guest appropriately. To support multiple devices on the
9p share, and avoid qid path collisions we take the device id as input to
generate a unique QID path. The lowest 48 bits of the path will be set
equal to the file inode, and the top bits will be uniquely assigned based
on the top 16 bits of the inode and the device id.
Signed-off-by: Antonios Motakis <antonios.motakis@huawei.com>
[CS: - Rebased to https://github.com/gkurz/qemu/commits/9p-next
(SHA1 7fc4c49e91).
- Added virtfs option 'multidevs', original patch simply did the inode
remapping without being asked.
- Updated hash calls to new xxhash API.
- Updated docs for new option 'multidevs'.
- Fixed v9fs_do_readdir() not having remapped inodes.
- Log error message when running out of prefixes in
qid_path_prefixmap().
- Fixed definition of QPATH_INO_MASK.
- Wrapped qpp_table initialization to dedicated qpp_table_init()
function.
- Dropped unnecessary parantheses in qpp_lookup_func().
- Dropped unnecessary g_malloc0() result checks. ]
Signed-off-by: Christian Schoenebeck <qemu_oss@crudebyte.com>
[groug: - Moved "multidevs" parsing to the local backend.
- Added hint to invalid multidevs option error.
- Turn "remap" into "x-remap". ]
Signed-off-by: Greg Kurz <groug@kaod.org>
The QID path should uniquely identify a file. However, the
inode of a file is currently used as the QID path, which
on its own only uniquely identifies files within a device.
Here we track the device hosting the 9pfs share, in order
to prevent security issues with QID path collisions from
other devices.
We only print a warning for now but a subsequent patch will
allow users to have finer control over the desired behaviour.
Failing the I/O will be one the proposed behaviour, so we
also change stat_to_qid() to return an error here in order to
keep other patches simpler.
Signed-off-by: Antonios Motakis <antonios.motakis@huawei.com>
[CS: - Assign dev_id to export root's device already in
v9fs_device_realize_common(), not postponed in
stat_to_qid().
- error_report_once() if more than one device was
shared by export.
- Return -ENODEV instead of -ENOSYS in stat_to_qid().
- Fixed typo in log comment. ]
Signed-off-by: Christian Schoenebeck <qemu_oss@crudebyte.com>
[groug, changed to warning, updated message and changelog]
Signed-off-by: Greg Kurz <groug@kaod.org>
It is more convenient to use the return value of the function to notify
errors, rather than to be tied up setting up the &local_err boilerplate.
Signed-off-by: Greg Kurz <groug@kaod.org>
There is no need for signedness on these QID fields for 9p.
Signed-off-by: Antonios Motakis <antonios.motakis@huawei.com>
[CS: - Also make QID type unsigned.
- Adjust donttouch_stat() to new types.
- Adjust trace-events to new types. ]
Signed-off-by: Christian Schoenebeck <qemu_oss@crudebyte.com>
Signed-off-by: Greg Kurz <groug@kaod.org>
This pull request also contains the two commits from the previous pull request
that was dropped due to a mingw compilation error. The compilation should now
be fixed.
-----BEGIN PGP SIGNATURE-----
iQEzBAABCAAdFiEEhpWov9P5fNqsNXdanKSrs4Grc8gFAl2coyIACgkQnKSrs4Gr
c8iE1AgAh1ACKXmuH+ml3jkKniLVpr4KEIWB5CmlMSQtmv2Hteyi+cI+T/WvhFGB
Xw8yW1yxVeV98lPWZJeb06s/Ry/ZXef+D63i6xcQzc/3a6pTJKeLdi8acbJuSRso
A3VjQaczjrP9GOPvOoy3XkXoLr5nuD4NI8TYMmhWgwb6eSETUHGYgQo1uoZkqkuF
4/K856dtN2wb3WqChdwBhg+1o8WRpxEZ74pVC0SBGtKcBDjiY7J/BEXfNJo5M4gJ
s1x3FVpSN19LYljgo4WCJt/hzMkOSy0uV4mSzCx6U/dEB+w+VUvuZ7wTt69Kweji
vTMsFDN67Xzq8xWqeiG5uGBM8NRIwQ==
=zUDj
-----END PGP SIGNATURE-----
Merge remote-tracking branch 'remotes/stefanha/tags/block-pull-request' into staging
Pull request
This pull request also contains the two commits from the previous pull request
that was dropped due to a mingw compilation error. The compilation should now
be fixed.
# gpg: Signature made Tue 08 Oct 2019 15:54:26 BST
# gpg: using RSA key 8695A8BFD3F97CDAAC35775A9CA4ABB381AB73C8
# gpg: Good signature from "Stefan Hajnoczi <stefanha@redhat.com>" [full]
# gpg: aka "Stefan Hajnoczi <stefanha@gmail.com>" [full]
# Primary key fingerprint: 8695 A8BF D3F9 7CDA AC35 775A 9CA4 ABB3 81AB 73C8
* remotes/stefanha/tags/block-pull-request:
iotests/262: Switch source/dest VM launch order
block: Skip COR for inactive nodes
virtio-blk: schedule virtio_notify_config to run on main context
util/ioc.c: try to reassure Coverity about qemu_iovec_init_extended
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Launching the destination VM before the source VM gives us a regression
test for HEAD^:
The guest device causes a read from the disk image through
guess_disk_lchs(). This will not work if the first sector (containing
the partition table) is yet unallocated, we use COR, and the node is
inactive.
By launching the source VM before the destination, however, the COR
filter on the source will allocate that area in the image shared between
both VMs, thus the problem will not become apparent.
Switching the launch order causes the sector to still be unallocated
when guess_disk_lchs() runs on the inactive node in the destination VM,
and thus we get our test case.
Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Message-id: 20191001174827.11081-3-mreitz@redhat.com
Message-Id: <20191001174827.11081-3-mreitz@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
We must not write data to inactive nodes, and a COR is certainly
something we can simply not do without upsetting anyone. So skip COR
operations on inactive nodes.
Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Message-id: 20191001174827.11081-2-mreitz@redhat.com
Message-Id: <20191001174827.11081-2-mreitz@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
virtio_notify_config() needs to acquire the global mutex, which isn't
allowed from an iothread, and may lead to a deadlock like this:
- main thead
* Has acquired: qemu_global_mutex.
* Is trying the acquire: iothread AioContext lock via
AIO_WAIT_WHILE (after aio_poll).
- iothread
* Has acquired: AioContext lock.
* Is trying to acquire: qemu_global_mutex (via
virtio_notify_config->prepare_mmio_access).
If virtio_blk_resize() is called from an iothread, schedule
virtio_notify_config() to be run in the main context BH.
[Removed unnecessary newline as suggested by Kevin Wolf
<kwolf@redhat.com>.
--Stefan]
Signed-off-by: Sergio Lopez <slp@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Message-id: 20190916112411.21636-1-slp@redhat.com
Message-Id: <20190916112411.21636-1-slp@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Make it more obvious, that filling qiov corresponds to qiov allocation,
which in turn corresponds to total_niov calculation, based on mid_niov
(not mid_len). Still add an assertion to show that there should be no
difference.
[Added mingw "error: 'mid_iov' may be used uninitialized in this
function" compiler error fix suggested by Vladimir.
--Stefan]
Reported-by: Coverity (CID 1405302)
Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Message-id: 20190910090310.14032-1-vsementsov@virtuozzo.com
Suggested-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Message-Id: <20190910090310.14032-1-vsementsov@virtuozzo.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
fixup! util/ioc.c: try to reassure Coverity about qemu_iovec_init_extended
drop Python2 dependency in EDK2 build scripts.
-----BEGIN PGP SIGNATURE-----
iQIzBAABCAAdFiEEicHnj2Ae6GyGdJXLoqP9bt6twN4FAl2bPjoACgkQoqP9bt6t
wN6fOhAAzJm5IDx6w8HFqLyMhdgfBh0q/6xCNKCpbiVIpkmZeetUPb8WsKizCQQ+
uhsl6KcKErrGTorBLPUcSDaeCdaCCPOoLEZqxbCYQ1xNDjiK6pXStm+7Ztyp9KBG
e92qQEXZMjKel0toWMNMLlVVR3yY6zNkKTqBXmIYIBb+qVGFoZdwCo/9f/8KlGv1
iyN1Bl4hGRaIw/qQ9ptjctuWeE1dOm1UawXdvAi7yBUs4kuJ1nLk7MaBEueOKZHh
3KdcroJw55N5kTbHiUonRbmGHikjxicOlmpKGi7x415iec4o3pGiXR5AGcQuoJNX
lhopEG/+Eiq2IzmVUp6mYFDmvCKzt4sTnFGFkPWVuPDmJ8ym8x+D12ESrCn/tbwj
eaYBPhSi2BcEkaIblk4pj7Bma7RntVt5ga2q/2chqhSWI5X85rmW/avMgBj9y4Xo
hQVhyWKIqTuw2cFdigL9OQFzRxhPaltFQO6e/5iyjbwNZnFKwzN1OoZ2PAOK1BtZ
veoyDWsD/6YjhcOogABVlUI13rSj+M9soR++fjmMe/n+DTqpvfjPUOOmZ6PvWS9Q
zodyctKkKZau0ecq5ZoE3JfMLMSPO0XIKk46bO/cRphn5VoxC4bnEjqQ+rMbnEp+
1arIEW3m7e3Ngmvb6zUrHIL7dTtvzri9kq4vFkAskwrg8t2R0Kg=
=BE2l
-----END PGP SIGNATURE-----
Merge remote-tracking branch 'remotes/philmd-gitlab/tags/edk2-next-20191007' into staging
Improve scripts relying on the EDK2 submodule,
drop Python2 dependency in EDK2 build scripts.
# gpg: Signature made Mon 07 Oct 2019 14:31:38 BST
# gpg: using RSA key 89C1E78F601EE86C867495CBA2A3FD6EDEADC0DE
# gpg: Good signature from "Philippe Mathieu-Daudé (Phil) <philmd@redhat.com>" [marginal]
# gpg: WARNING: This key is not certified with sufficiently trusted signatures!
# gpg: It is not certain that the signature belongs to the owner.
# Primary key fingerprint: 89C1 E78F 601E E86C 8674 95CB A2A3 FD6E DEAD C0DE
* remotes/philmd-gitlab/tags/edk2-next-20191007:
edk2 build scripts: work around TianoCore#1607 without forcing Python 2
edk2 build scripts: honor external BaseTools flags with uefi-test-tools
roms: Add a 'make help' target alias
roms/Makefile.edk2: don't pull in submodules when building from tarball
make-release: pull in edk2 submodules so we can build it from tarballs
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
It turns out that forcing python2 for running the edk2 "build" utility is
neither necessary nor sufficient.
Forcing python2 is not sufficient for two reasons:
- QEMU is moving away from python2, with python2 nearing EOL,
- according to my most recent testing, the lacking dependency information
in the makefiles that are generated by edk2's "build" utility can cause
parallel build failures even when "build" is executed by python2.
And forcing python2 is not necessary because we can still return to the
original idea of filtering out jobserver-related options from MAKEFLAGS.
So do that.
While at it, cut short edk2's auto-detection of the python3.* minor
version, by setting PYTHON_COMMAND to "python3" (which we expect to be
available wherever we intend to build edk2).
With this patch, the guest UEFI binaries that are used as part of the BIOS
tables test, and the OVMF and ArmVirtQemu platform firmwares, will be
built strictly in a single job, regardless of an outermost "-jN" make
option. Alas, there appears to be no reliable way to build edk2 in an
(outer make, inner make) environment, with a jobserver enabled.
Cc: Eduardo Habkost <ehabkost@redhat.com>
Cc: John Snow <jsnow@redhat.com>
Cc: Philippe Mathieu-Daudé <philmd@redhat.com>
Reported-by: John Snow <jsnow@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Tested-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Reviewed-by: John Snow <jsnow@redhat.com>
Signed-off-by: Laszlo Ersek <lersek@redhat.com>
Message-Id: <20190920083808.21399-3-lersek@redhat.com>
Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Unify the recipe for "build-edk2-tools" in
"tests/uefi-test-tools/Makefile" with the recipe for "edk2-basetools" in
"roms/Makefile".
Cc: Philippe Mathieu-Daudé <philmd@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Tested-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Reviewed-by: John Snow <jsnow@redhat.com>
Signed-off-by: Laszlo Ersek <lersek@redhat.com>
Message-Id: <20190920083808.21399-2-lersek@redhat.com>
Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Various C projects provide a 'make help' target. Our root directory
does so. The roms/ directory lacks a such rule, but already displays
a help output when the default target is called.
Add a 'help' target aliased to the default one, to avoid:
$ make -C roms help
make: *** No rule to make target 'help'. Stop.
Reviewed-by: John Snow <jsnow@redhat.com>
Reviewed-by: Laszlo Ersek <lersek@redhat.com>
Message-Id: <20190920171159.18633-1-philmd@redhat.com>
Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Currently the `make efi` target pulls submodules nested under the
roms/edk2 submodule as dependencies. However, when we attempt to build
from a tarball this fails since we are no longer in a git tree.
A preceding patch will pre-populate these submodules in the tarball,
so assume this build dependency is only needed when building from a
git tree.
Cc: Laszlo Ersek <lersek@redhat.com>
Cc: Bruce Rogers <brogers@suse.com>
Cc: qemu-stable@nongnu.org # v4.1.0
Reported-by: Bruce Rogers <brogers@suse.com>
Reviewed-by: Laszlo Ersek <lersek@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Tested-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
Message-Id: <20190912231202.12327-3-mdroth@linux.vnet.ibm.com>
Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com>
The `make efi` target added by 536d2173 is built from the roms/edk2
submodule, which in turn relies on additional submodules nested under
roms/edk2.
The make-release script currently only pulls in top-level submodules,
so these nested submodules are missing in the resulting tarball.
We could try to address this situation more generally by recursively
pulling in all submodules, but this doesn't necessarily ensure the
end-result will build properly (this case also required other changes).
Additionally, due to the nature of submodules, we may not always have
control over how these sorts of things are dealt with, so for now we
continue to handle it on a case-by-case in the make-release script.
Cc: Laszlo Ersek <lersek@redhat.com>
Cc: Bruce Rogers <brogers@suse.com>
Cc: qemu-stable@nongnu.org # v4.1.0
Reported-by: Bruce Rogers <brogers@suse.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Tested-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
Message-Id: <20190912231202.12327-2-mdroth@linux.vnet.ibm.com>
Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Here's the next batch of ppc and spapr patches. Includes:
* Fist part of a large cleanup to irq infrastructure
* Recreate the full FDT at CAS time, instead of making a difficult
to follow set of updates. This will help us move towards
eliminating CAS reboots altogether
* No longer provide RTAS blob to SLOF - SLOF can include it just as
well itself, since guests will generally need to relocate it with
a call to instantiate-rtas
* A number of DFP fixes and cleanups from Mark Cave-Ayland
* Assorted bugfixes
* Several new small devices for powernv
-----BEGIN PGP SIGNATURE-----
iQIzBAABCAAdFiEEdfRlhq5hpmzETofcbDjKyiDZs5IFAl2XEn0ACgkQbDjKyiDZ
s5I6bA/7B5sjY/QxuE8axm5KupoAnE8zf205hN8mbYASwtDfFwgaeNreVaOSJUpr
fgcx/g9G3rAryGZv3O6i02+wcRgNw1DnJ3ynCthIrExZEcfbTYJiS4s9apwPEQy8
HFmBNdPDqrhFI0aFvXEUauiOp1aapPUUklm34eFscs94lJXxphRUEfa3XT5uEhUh
xrIZwYq20A+ih4UHwk3Onyx/cvFpl6BRB2nVEllQFqzwF5eTTfz9t8+JGTebxD/7
8qqt8ti0KM3wxSDTQnmyMUmpgy+C1iCvNYvv6nWFg+07QuGs48EHlQUUVVni4r9j
kUrDwKS2eC+8e8gP/xdIXEq3R2DsAMq+wFIswXZ3X6x4DoUV0OAJSHc9iMD4l+pr
LyWnVpDprc6XhJHWKpuHZ5w9EuBnZFbIXdlZGFno+8UvXtusnbbuwAZzHTrRJRqe
/AWVpFwGAoOF4KxIOFlPVBI8m4vFad/soVojC0vzIbRqaogOFZAjiL/yD5GwLmMa
tywOEMBUJ/j2lgudTCyKn5uCa/Ew3DS1TSdenJjyqRi/gZM0IaORIhJhyFYW/eO1
U7Uh8BnbC+4J11wwvFR5+W789dgM2+EEtAX9uI08VcE/R2ASabZlN4Zwrl0w4cb/
VRybMT4bgmjzHRpfrqYPxpn8wqPcIw0BCeipSOjY3QU1Q25TEYQ=
=PXXe
-----END PGP SIGNATURE-----
Merge remote-tracking branch 'remotes/dgibson/tags/ppc-for-4.2-20191004' into staging
ppc patch queue 2019-10-04
Here's the next batch of ppc and spapr patches. Includes:
* Fist part of a large cleanup to irq infrastructure
* Recreate the full FDT at CAS time, instead of making a difficult
to follow set of updates. This will help us move towards
eliminating CAS reboots altogether
* No longer provide RTAS blob to SLOF - SLOF can include it just as
well itself, since guests will generally need to relocate it with
a call to instantiate-rtas
* A number of DFP fixes and cleanups from Mark Cave-Ayland
* Assorted bugfixes
* Several new small devices for powernv
# gpg: Signature made Fri 04 Oct 2019 10:35:57 BST
# gpg: using RSA key 75F46586AE61A66CC44E87DC6C38CACA20D9B392
# gpg: Good signature from "David Gibson <david@gibson.dropbear.id.au>" [full]
# gpg: aka "David Gibson (Red Hat) <dgibson@redhat.com>" [full]
# gpg: aka "David Gibson (ozlabs.org) <dgibson@ozlabs.org>" [full]
# gpg: aka "David Gibson (kernel.org) <dwg@kernel.org>" [unknown]
# Primary key fingerprint: 75F4 6586 AE61 A66C C44E 87DC 6C38 CACA 20D9 B392
* remotes/dgibson/tags/ppc-for-4.2-20191004: (53 commits)
ppc/pnv: Remove the XICSFabric Interface from the POWER9 machine
spapr: Eliminate SpaprIrq::init hook
spapr: Add return value to spapr_irq_check()
spapr: Use less cryptic representation of which irq backends are supported
xive: Improve irq claim/free path
spapr, xics, xive: Better use of assert()s on irq claim/free paths
spapr: Handle freeing of multiple irqs in frontend only
spapr: Remove unhelpful tracepoints from spapr_irq_free_xics()
spapr: Eliminate SpaprIrq:get_nodename method
spapr: Simplify spapr_qirq() handling
spapr: Fix indexing of XICS irqs
spapr: Eliminate nr_irqs parameter to SpaprIrq::init
spapr: Clarify and fix handling of nr_irqs
spapr: Replace spapr_vio_qirq() helper with spapr_vio_irq_pulse() helper
spapr: Fold spapr_phb_lsi_qirq() into its single caller
xics: Create sPAPR specific ICS subtype
xics: Merge TYPE_ICS_BASE and TYPE_ICS_SIMPLE classes
xics: Eliminate reset hook
xics: Rename misleading ics_simple_*() functions
xics: Eliminate 'reject', 'resend' and 'eoi' class hooks
...
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
When I run QEMU with KVM under Valgrind, I currently get this warning:
Syscall param ioctl(generic) points to uninitialised byte(s)
at 0x95BA45B: ioctl (in /usr/lib64/libc-2.28.so)
by 0x429DC3: kvm_ioctl (kvm-all.c:2365)
by 0x51B249: kvm_arch_get_supported_msr_feature (kvm.c:469)
by 0x4C2A49: x86_cpu_get_supported_feature_word (cpu.c:3765)
by 0x4C4116: x86_cpu_expand_features (cpu.c:5065)
by 0x4C7F8D: x86_cpu_realizefn (cpu.c:5242)
by 0x5961F3: device_set_realized (qdev.c:835)
by 0x7038F6: property_set_bool (object.c:2080)
by 0x707EFE: object_property_set_qobject (qom-qobject.c:26)
by 0x705814: object_property_set_bool (object.c:1338)
by 0x498435: pc_new_cpu (pc.c:1549)
by 0x49C67D: pc_cpus_init (pc.c:1681)
Address 0x1ffeffee74 is on thread 1's stack
in frame #2, created by kvm_arch_get_supported_msr_feature (kvm.c:445)
It's harmless, but a little bit annoying, so silence it by properly
initializing the whole structure with zeroes.
Signed-off-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Some secondary controls are automatically enabled/disabled based on the CPUID
values that are set for the guest. However, they are still available at a
global level and therefore should be present when KVM_GET_MSRS is sent to
/dev/kvm.
Unfortunately KVM forgot to include those, so fix that.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Add code to convert the VMX feature words back into MSR values,
allowing the user to enable/disable VMX features as they wish. The same
infrastructure enables support for limiting VMX features in named
CPU models.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
The low bits are 1 if the control must be one, the high bits
are 1 if the control can be one. Correct the variable names
as they are very confusing.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
These will be used to compile the list of VMX features for named
CPU models, and/or by the code that sets up the VMX MSRs.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
VMX requires 64-bit feature words for the IA32_VMX_EPT_VPID_CAP
and IA32_VMX_BASIC MSRs. (The VMX control MSRs are 64-bit wide but
actually have only 32 bits of information).
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Sometimes a CPU feature does not make sense unless another is
present. In the case of VMX features, KVM does not even allow
setting the VMX controls to some invalid combinations.
Therefore, this patch adds a generic mechanism that looks for bits
that the user explicitly cleared, and uses them to remove other bits
from the expanded CPU definition. If these dependent bits were also
explicitly *set* by the user, this will be a warning for "-cpu check"
and an error for "-cpu enforce". If not, then the dependent bits are
cleared silently, for convenience.
With VMX features, this will be used so that for example
"-cpu host,-rdrand" will also hide support for RDRAND exiting.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
The next patch will add a different reason for filtering features, unrelated
to host feature support. Extract a new function that takes care of disabling
the features and optionally reporting them.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
-fsanitize=undefined is not the same thing as --enable-sanitizers. After
commit 47c823e ("tests/docker: add sanitizers back to clang build", 2019-09-11)
test-clang is almost duplicating the asan (test-debug) test, so
partly revert commit 47c823e5b while leaving ubsan enabled.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Commit 05e514b1d4 introduced an AIO
context optimization to avoid calling event_notifier_test_and_clear() on
ctx->notifier. On Windows, the same notifier is being used to wakeup the
wait on socket events (see commit
d3385eb448).
The ctx->notifier event is added to the gpoll sources in
aio_set_event_notifier(), aio_ctx_check() should clear the event
regardless of ctx->notified, since Windows sets the event by itself,
bypassing the aio->notified. This fixes qemu not clearing the event
resulting in a busy loop.
Paolo suggested to me on irc to call event_notifier_test_and_clear()
after select() >0 from aio-win32.c's aio_prepare. Unfortunately, not all
fds associated with ctx->notifiers are in AIO fd handlers set.
(qemu_set_nonblock() in util/oslib-win32.c calls qemu_fd_register()).
This is essentially a v2 of a patch that was sent earlier:
https://lists.gnu.org/archive/html/qemu-devel/2017-01/msg00420.html
that resurfaced when James investigated Spice performance issues on Windows:
https://gitlab.freedesktop.org/spice/spice/issues/36
In order to test that patch, I simply tried running test-char on
win32, and it hangs. Applying that patch solves it. QIO idle sources
are not dispatched. I haven't investigated much further, I suspect
source priorities and busy looping still come into play.
This version keeps the "notified" field, so event_notifier_poll()
should still work as expected.
Cc: James Le Cuirot <chewi@gentoo.org>
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Serial test is currently hard-coded to /dev/null.
On Windows, serial chardev expect a COM: device, which may not be
availble.
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
In general, WSAEWOULDBLOCK can be mapped to EAGAIN as done by
socket_error() (or EWOULDBLOCK). But for connect() with non-blocking
sockets, it actually means the operation is in progress:
https://docs.microsoft.com/en-us/windows/win32/api/winsock2/nf-winsock2-connect
"The socket is marked as nonblocking and the connection cannot be completed immediately."
(this is also the behaviour implemented by GLib GSocket)
This fixes socket_can_bind_connect() test on win32.
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
There is a problem, that you don't have access to the data using cpu_memory_rw_debug() function when in SMM. You can't remotely debug SMM mode program because of that for example.
Likely attrs version of get_phys_page_debug should be used to get correct asidx at the end to handle access properly.
Here the patch to fix it.
Signed-off-by: Dmitry Poletaev <poletaev@ispras.ru>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Currently, when a notifier is attempted to be registered and its
flags are not supported (especially the MAP one) by the IOMMU MR,
we generally abruptly exit in the IOMMU code. The failure could be
handled more nicely in the caller and especially in the VFIO code.
So let's allow memory_region_register_iommu_notifier() to fail as
well as notify_flag_changed() callback.
All sites implementing the callback are updated. This patch does
not yet remove the exit(1) in the amd_iommu code.
in SMMUv3 we turn the warning message into an error message saying
that the assigned device would not work properly.
Signed-off-by: Eric Auger <eric.auger@redhat.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
The container error integer field is currently used to store
the first error potentially encountered during any
vfio_listener_region_add() call. However this fails to propagate
detailed error messages up to the vfio_connect_container caller.
Instead of using an integer, let's use an Error handle.
Messages are slightly reworded to accomodate the propagation.
Signed-off-by: Eric Auger <eric.auger@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
The CPUID bits CLZERO and XSAVEERPTR are availble on AMD's ZEN platform
and could be passed to the guest.
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
There are just too many leaks in device-introspect-test (especially for
the plethora of arm and aarch64 boards) to make LeakSanitizer useful;
disable it for now.
Whoever is interested in debugging leaks can also use valgrind like this:
QTEST_QEMU_BINARY=aarch64-softmmu/qemu-system-aarch64 \
QTEST_QEMU_IMG=qemu-img \
valgrind --trace-children=yes --leak-check=full \
tests/device-introspect-test -p /aarch64/device/introspect/concrete/defaults/none
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Bottom halves and ptimers are malloced, but nothing in these
files is freeing memory allocated by instance_init. Since
these are sysctl devices that are never unrealized, just moving
the allocations to realize is enough to avoid the leak in
practice (and also to avoid upsetting asan when running
device-introspect-test).
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
memory_region_init_* takes care of copying the name into memory it owns.
Free it in the caller.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
The array returned by qemu_allocate_irqs is malloced, free it.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
The device tree blob returned by load_device_tree is malloced.
Free it before returning.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
The array returned by qemu_allocate_irqs is malloced, free it.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Currently, isa-superio.c is always compiled as soon as CONFIG_ISA_BUS
is enabled. But there are also machines that have an ISA BUS without
any of the superio chips attached to it, so we should not compile
isa-superio.c in case we only compile a QEMU for such a machine.
Thus add a proper CONFIG_ISA_SUPERIO switch so that this file only gets
compiled when we really, really need it.
Signed-off-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Some scripts check the Python version number and have two code paths to
accomodate both Python 2 and 3. Remove the code specific to Python 2 and
assert the minimum version of 3.6 instead (check skips Python tests in
this case, so the assertion would only ever trigger if a Python script
is executed manually).
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eduardo Habkost <ehabkost@redhat.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Reviewed-by: John Snow <jsnow@redhat.com>
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Running iotests is not required to build QEMU, so we can have stricter
version requirements for Python here and can make use of new features
and drop compatibility code earlier.
This makes qemu-iotests skip all Python tests if a Python version before
3.6 is used for the build.
Suggested-by: Eduardo Habkost <ehabkost@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eduardo Habkost <ehabkost@redhat.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>