A vdpa net device must initialize with SVQ in order to be migratable at
this moment, and initialization code verifies some conditions. If the
device is not initialized with the x-svq parameter, it will not expose
_F_LOG so the vhost subsystem will block VM migration from its
initialization.
Next patches change this, so we need to verify migration conditions
differently.
QEMU only supports a subset of net features in SVQ, and it cannot
migrate state that cannot track or restore in the destination. Add a
migration blocker if the device offers an unsupported feature.
Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
Message-Id: <20230303172445.1089785-12-eperezma@redhat.com>
Tested-by: Lei Yang <leiyang@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Devices with CVQ need to migrate state beyond vq state. Leaving this to
future series.
Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
Message-Id: <20230303172445.1089785-11-eperezma@redhat.com>
Tested-by: Lei Yang <leiyang@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
This allows net to restart the device backend to configure SVQ on it.
Ideally, these changes should not be net specific and they could be done
in:
* vhost_vdpa_set_features (with VHOST_F_LOG_ALL)
* vhost_vdpa_set_vring_addr (with .enable_log)
* vhost_vdpa_set_log_base.
However, the vdpa net backend is the one with enough knowledge to
configure everything because of some reasons:
* Queues might need to be shadowed or not depending on its kind (control
vs data).
* Queues need to share the same map translations (iova tree).
Also, there are other problems that may have solutions but complicates
the implementation at this stage:
* We're basically duplicating vhost_dev_start and vhost_dev_stop, and
they could go out of sync. If we want to reuse them, we need a way to
skip some function calls to avoid recursiveness (either vhost_ops ->
vhost_set_features, vhost_set_vring_addr, ...).
* We need to traverse all vhost_dev of a given net device twice: one to
stop and get the vq state and another one after the reset to
configure properties like address, fd, etc.
Because of that it is cleaner to restart the whole net backend and
configure again as expected, similar to how vhost-kernel moves between
userspace and passthrough.
If more kinds of devices need dynamic switching to SVQ we can:
* Create a callback struct like VhostOps and move most of the code
there. VhostOps cannot be reused since all vdpa backend share them,
and to personalize just for networking would be too heavy.
* Add a parent struct or link all the vhost_vdpa or vhost_dev structs so
we can traverse them.
Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
Message-Id: <20230303172445.1089785-9-eperezma@redhat.com>
Tested-by: Lei Yang <leiyang@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Only create iova_tree if and when it is needed.
The cleanup keeps being responsible for the last VQ but this change
allows it to merge both cleanup functions.
Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Message-Id: <20230303172445.1089785-2-eperezma@redhat.com>
Tested-by: Lei Yang <leiyang@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
VHOST_BACKEND_F_IOTLB_ASID is the feature bit, not the bitmask. Since
the device under test also provided VHOST_BACKEND_F_IOTLB_MSG_V2 and
VHOST_BACKEND_F_IOTLB_BATCH, this went unnoticed.
Fixes: c1a1008685 ("vdpa: always start CVQ in SVQ mode if possible")
Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
In stream mode, if the server shuts down there is currently
no way to reconnect the client to a new server without removing
the NIC device and the netdev backend (or to reboot).
This patch introduces a reconnect option that specifies a delay
to try to reconnect with the same parameters.
Add a new test in qtest to test the reconnect option and the
connect/disconnect events.
Signed-off-by: Laurent Vivier <lvivier@redhat.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
When the VM is stopped using the HMP command "stop", soon the handler will
stop reading from the vmnet interface. This causes a flood of
`VMNET_INTERFACE_PACKETS_AVAILABLE` events to arrive and puts the host CPU
at 100%. We fix this by removing the event handler from vmnet when the VM
is no longer in a running state and restore it when we return to a running
state.
Signed-off-by: Joelle van Dyne <j@getutm.app>
Signed-off-by: Jason Wang <jasowang@redhat.com>
Increase the allocated buffer size to fit larger packets.
Given that jumboframes can commonly be up to 9000 bytes the closest suitable
value seems to be 16 KiB.
Tested by running qemu towards a Linux L2TPv3 endpoint and pushing
jumboframe traffic through the interfaces.
Signed-off-by: Christian Svensson <blue@cmd.nu>
Signed-off-by: Jason Wang <jasowang@redhat.com>
Just because a NIC model is compiled into the QEMU binary does not
necessary mean that it can be used with each and every machine.
So let's rather talk about "available" models instead of "supported"
models, just to avoid confusion.
Reviewed-by: Claudio Fontana <cfontana@suse.de>
Signed-off-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
Running QEMU with "-nic help" used to work in QEMU 5.2 and earlier versions
(it showed the available netdev backends), but this feature got broken during
some refactoring in version 6.0. Let's restore the old behavior, and while
we're at it, let's also print the available NIC models here now since this
option can be used to configure both, netdev backend and model in one go.
Fixes: ad6f932fe8 ("net: do not exit on "netdev_add help" monitor command")
Signed-off-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
The code that collects the available NIC models is not really specific
to PCI anymore and will be required in the next patch, too, so let's
move this into a new separate function in net.c instead.
Signed-off-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
This commit was created with scripts/clean-includes.
All .c should include qemu/osdep.h first. The script performs three
related cleanups:
* Ensure .c files include qemu/osdep.h first.
* Including it in a .h is redundant, since the .c already includes
it. Drop such inclusions.
* Likewise, including headers qemu/osdep.h includes is redundant.
Drop these, too.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Message-Id: <20230202133830.2152150-13-armbru@redhat.com>
This moves these commands from MAINTAINERS sections "Human
Monitor (HMP)" and "QMP" to "Network device backends".
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-Id: <20230124121946.1139465-16-armbru@redhat.com>
* Rework and improvements of the EINTR handling by Nikita
* Deprecate the -no-hpet command line option
* Disable the qtests in the 32-bit Windows CI job again
* Some other misc fixes here and there
-----BEGIN PGP SIGNATURE-----
iQJFBAABCAAvFiEEJ7iIR+7gJQEY8+q5LtnXdP5wLbUFAmO8It8RHHRodXRoQHJl
ZGhhdC5jb20ACgkQLtnXdP5wLbUwbA//dXgfHy95C1r2nTMDekk09+KkmNB1f6M8
3HK4ROmmrMT/aP9FwfqMBT7JHM/m4bwOGw0Sula8vfjg9NYGPWuSYjdObWKnrIq/
YORoTxqak9c98Co06EQbAfWn3Pj0ifQkX+FIyzcNGhu4856FWdBsMuyq52VLi36q
Z8ruSOmclzluoIB3mVYY/s5J7ED2A3K0h39frKLE9FGsKObX10KWj+MZyDHi9oGZ
ucTHai12OXgNghjlrwI0BqJziih4NxfIWs0JovSo3cN0at7m57G5JChjR38zTMNT
2Q46tDKoIXesY1GUmVuIgJ5F1Uoshc8Pz5qBSQ5mUbZUQMpivhFrEB666wsYmPd1
M/YwnZ+PFhWjem7p28fKmnmkeATvE0S+vMDifTVZ880nmAbyUm1vFKfqV6r2mBrT
p4iXfh/9easFfJWHueU4fBwyMndDGRaCRJnP8KQ5I9yb0WZbt+/0k/y8CQD8Oxr7
dNFFFoY3KnIO9DCRO5Wr+3OqUgtSAQyhBDf5V2wSMCFrwPHKsvWKSbdiWR3Qe4ck
41InWgawB3xx57+vXraDUA10+nBZ1VrM92ObqfLPTFqjLCom6Fm85cG4YFRLIvRt
rdlOC+ScpeVpec7MwcHrScGL0HmUgPnShDAo07pRy4oKK+c89sXzdAFf2nYJTAWS
WCuChrn7VFM=
=D+Yw
-----END PGP SIGNATURE-----
Merge tag 'pull-request-2023-01-09' of https://gitlab.com/thuth/qemu into staging
* s390x header clean-ups from Philippe
* Rework and improvements of the EINTR handling by Nikita
* Deprecate the -no-hpet command line option
* Disable the qtests in the 32-bit Windows CI job again
* Some other misc fixes here and there
# gpg: Signature made Mon 09 Jan 2023 14:21:19 GMT
# gpg: using RSA key 27B88847EEE0250118F3EAB92ED9D774FE702DB5
# gpg: issuer "thuth@redhat.com"
# gpg: Good signature from "Thomas Huth <th.huth@gmx.de>" [full]
# gpg: aka "Thomas Huth <thuth@redhat.com>" [full]
# gpg: aka "Thomas Huth <huth@tuxfamily.org>" [full]
# gpg: aka "Thomas Huth <th.huth@posteo.de>" [unknown]
# Primary key fingerprint: 27B8 8847 EEE0 2501 18F3 EAB9 2ED9 D774 FE70 2DB5
* tag 'pull-request-2023-01-09' of https://gitlab.com/thuth/qemu:
.gitlab-ci.d/windows: Do not run the qtests in the msys2-32bit job
error handling: Use RETRY_ON_EINTR() macro where applicable
Refactoring: refactor TFR() macro to RETRY_ON_EINTR()
docs/interop: Change the vnc-ledstate-Pseudo-encoding doc into .rst
i386: Deprecate the -no-hpet QEMU command line option
tests/qtest/bios-tables-test: Replace -no-hpet with hpet=off machine parameter
tests/readconfig: spice doesn't support unix socket on windows yet
target/s390x: Restrict sysemu/reset.h to system emulation
target/s390x/tcg/excp_helper: Restrict system headers to sysemu
target/s390x/tcg/misc_helper: Remove unused "memory.h" include
hw/s390x/pv: Restrict Protected Virtualization to sysemu
exec/memory: Expose memory_region_access_valid()
MAINTAINERS: Add MIPS-related docs and configs to the MIPS architecture section
tests/vm: Update get_default_jobs() to work on non-x86_64 non-KVM hosts
qemu-iotests/stream-under-throttle: do not shutdown QEMU
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
There is a defined RETRY_ON_EINTR() macro in qemu/osdep.h
which handles the same while loop.
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/415
Signed-off-by: Nikita Ivanov <nivanov@cloudlinux.com>
Message-Id: <20221023090422.242617-3-nivanov@cloudlinux.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
[thuth: Dropped the hunk that changed socket_accept() in libqtest.c]
Signed-off-by: Thomas Huth <thuth@redhat.com>
Rename macro name to more transparent one and refactor
it to expression.
Signed-off-by: Nikita Ivanov <nivanov@cloudlinux.com>
Message-Id: <20221023090422.242617-2-nivanov@cloudlinux.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Bin Meng <bmeng.cn@gmail.com>
Reviewed-by: Christian Schoenebeck <qemu_oss@crudebyte.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
We should stop if the GET_IOVA_RANGE ioctl failed.
Signed-off-by: Longpeng <longpeng2@huawei.com>
Message-Id: <20221224114848.3062-3-longpeng2@huawei.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Acked-by: Jason Wang <jasowang@redhat.com>
In commit a585fad26b ("vdpa: request iova_range only once") we remove
GET_IOVA_RANGE form vhost_vdpa_init, the generic vdpa device will start
without iova_range populated, so the device won't work. Let's call
GET_IOVA_RANGE ioctl explicitly.
Fixes: a585fad26b ("vdpa: request iova_range only once")
Signed-off-by: Longpeng <longpeng2@huawei.com>
Message-Id: <20221224114848.3062-2-longpeng2@huawei.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Use vhost_user_save_acked_features to implemente acked features
saving.
Signed-off-by: Hyman Huang(黄勇) <huangy81@chinatelecom.cn>
Signed-off-by: Guoyi Tu <tugy@chinatelecom.cn>
Message-Id: <6699ee88687b62fb8152fe021e576cd2f468d7ca.1671627406.git.huangy81@chinatelecom.cn>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Abstract vhost acked features saving into
vhost_user_save_acked_features, export it as util function.
Signed-off-by: Hyman Huang(黄勇) <huangy81@chinatelecom.cn>
Signed-off-by: Guoyi Tu <tugy@chinatelecom.cn>
Message-Id: <50dc9b09b0635e3052551efcc1046c2a85332fcb.1671627406.git.huangy81@chinatelecom.cn>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
So qemu emulates it even in case the device does not support it.
Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Message-Id: <20221221115015.1400889-5-eperezma@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Since this capability is emulated by qemu shadowed CVQ cannot forward it
to the device. Process all that command within qemu.
Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
Message-Id: <20221221115015.1400889-4-eperezma@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Isolate control virtqueue in its own group, allowing to intercept control
commands but letting dataplane run totally passthrough to the guest.
Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
Message-Id: <20221215113144.322011-13-eperezma@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Acked-by: Jason Wang <jasowang@redhat.com>
The memory listener that thells the device how to convert GPA to qemu's
va is registered against CVQ vhost_vdpa. memory listener translations
are always ASID 0, CVQ ones are ASID 1 if supported.
Let's tell the listener if it needs to register them on iova tree or
not.
Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Message-Id: <20221215113144.322011-12-eperezma@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
CVQ can be shadowed two ways:
- Device has x-svq=on parameter (current way)
- The device can isolate CVQ in its own vq group
QEMU needs to check for the second condition dynamically, because CVQ
index is not known before the driver ack the features. Since this is
dynamic, the CVQ isolation could vary with different conditions, making
it possible to go from "not isolated group" to "isolated".
Saving the cmdline parameter in an extra field so we never disable CVQ
SVQ in case the device was started with x-svq cmdline.
Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Message-Id: <20221215113144.322011-11-eperezma@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
So the caller can choose which ASID is destined.
No need to update the batch functions as they will always be called from
memory listener updates at the moment. Memory listener updates will
always update ASID 0, as it's the passthrough ASID.
All vhost devices's ASID are 0 at this moment.
Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Message-Id: <20221215113144.322011-10-eperezma@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
The next patches will start control SVQ if possible. However, we don't
know if that will be possible at qemu boot anymore.
Since the moved checks will be already evaluated at net/ to know if it
is ok to shadow CVQ, move them.
Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Message-Id: <20221215113144.322011-8-eperezma@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Currently iova range is requested once per queue pair in the case of
net. Reduce the number of ioctls asking it once at initialization and
reusing that value for each vhost_vdpa.
Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
Message-Id: <20221215113144.322011-7-eperezma@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Acked-by: Jason Wang <jasonwang@redhat.com>
It will be reused at vdpa device start so let's extract in its own
function.
Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Message-Id: <20221215113144.322011-6-eperezma@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
-----BEGIN PGP SIGNATURE-----
iQJGBAABCAAwFiEENUvIs9frKmtoZ05fOHC0AOuRhlMFAmOZ6lYSHGFybWJydUBy
ZWRoYXQuY29tAAoJEDhwtADrkYZT6VEQAKynjWh3AIZ4/qOgrVqsP0oRspevLmfH
BbuGoldjYpEE7RbwuCaZalZ7iy7TcSySxnPfUDVsFHd7NWffJVjwKHifGC0D/Ez0
+Ggyb1CBebN+mS7t+BNFUHdMM+wxFIlHwg4f4aTFbn2o0HKgj2a8tcNzNRonZbfa
xURnvbD4G4u0VZEc3Jak+x193xbOJFsuuWq0BZnDuNk+XqjyW2RwfpXLPJVk+82a
4uy/YgYuqXUqBeULwcJj+shBL4SXR9GyajTFMS64przSUle0ADUmXkPtaS2agV7e
Pym/UQuAcxvNyw34fJsiMZxx6rZI9YU30jQUMRLoYcPRR/Q/aiPeiiHtiD6Kaid7
IfOeH/EArXaQRFpD89xj4YcaTnRLQOEj0NXgXvAbQf6eD8JYyao/S/0lCsPZEoA2
nibLqEQ25ncDNXoSomuwtfjVff3w68lODFbhwqfA0gf3cPtCgVZ6xQ8P/McNY6K6
wqFHXMWTDHk1LOCTucjYz1z2TGzTnSG4iWi5Yt6FSxAc958AO+v5ALn/1pcYun+E
azM/MF0AInKj2aJCT530zT0tpCs/Jo07YKC8k6ubi77S0ZdmGS1XLeXkRXfk1+yI
OhuUgiVlSTHxD69DagT2vbnx1mDMM9X+OBIMvEi5nwvD9A/ghaCgkDeGFvbA1ud0
t0mxPBZJ+tiZ
=JJjG
-----END PGP SIGNATURE-----
Merge tag 'pull-misc-2022-12-14' of https://repo.or.cz/qemu/armbru into staging
Miscellaneous patches for 2022-12-14
# gpg: Signature made Wed 14 Dec 2022 15:23:02 GMT
# gpg: using RSA key 354BC8B3D7EB2A6B68674E5F3870B400EB918653
# gpg: issuer "armbru@redhat.com"
# gpg: Good signature from "Markus Armbruster <armbru@redhat.com>" [full]
# gpg: aka "Markus Armbruster <armbru@pond.sub.org>" [full]
# Primary key fingerprint: 354B C8B3 D7EB 2A6B 6867 4E5F 3870 B400 EB91 8653
* tag 'pull-misc-2022-12-14' of https://repo.or.cz/qemu/armbru:
ppc4xx_sdram: Simplify sdram_ddr_size() to return
block/vmdk: Simplify vmdk_co_create() to return directly
cleanup: Tweak and re-run return_directly.cocci
io: Tidy up fat-fingered parameter name
qapi: Use returned bool to check for failure (again)
sockets: Use ERRP_GUARD() where obviously appropriate
qemu-config: Use ERRP_GUARD() where obviously appropriate
qemu-config: Make config_parse_qdict() return bool
monitor: Use ERRP_GUARD() in monitor_init()
monitor: Simplify monitor_fd_param()'s error handling
error: Move ERRP_GUARD() to the beginning of the function
error: Drop a few superfluous ERRP_GUARD()
error: Drop some obviously superfluous error_propagate()
Drop more useless casts from void * to pointer
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
The has_FOO for pointer-valued FOO are redundant, except for arrays.
They are also a nuisance to work with. Recent commit "qapi: Start to
elide redundant has_FOO in generated C" provided the means to elide
them step by step. This is the step for qapi/net.json.
Said commit explains the transformation in more detail. The invariant
violations mentioned there do not occur here.
Cc: Jason Wang <jasowang@redhat.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-Id: <20221104160712.3005652-19-armbru@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
[Fixes for MacOS squashed in]
Commit 012d4c96e2 changed the visitor functions taking Error ** to
return bool instead of void, and the commits following it used the new
return value to simplify error checking. Since then a few more uses
in need of the same treatment crept in. Do that. All pretty
mechanical except for
* balloon_stats_get_all()
This is basically the same transformation commit 012d4c96e2 applied
to the virtual walk example in include/qapi/visitor.h.
* set_max_queue_size()
Additionally replace "goto end of function" by return.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-Id: <20221121085054.683122-10-armbru@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
With the G_GNUC_PRINTF function attribute the compiler detects
two potential insecure format strings:
../../../net/stream.c:248:31: warning: format string is not a string literal (potentially insecure) [-Wformat-security]
qemu_set_info_str(&s->nc, uri);
^~~
../../../net/stream.c:322:31: warning: format string is not a string literal (potentially insecure) [-Wformat-security]
qemu_set_info_str(&s->nc, uri);
^~~
There are also two other warnings:
../../../net/socket.c:182:35: warning: zero-length gnu_printf format string [-Wformat-zero-length]
182 | qemu_set_info_str(&s->nc, "");
| ^~
../../../net/stream.c:170:35: warning: zero-length gnu_printf format string [-Wformat-zero-length]
170 | qemu_set_info_str(&s->nc, "");
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Stefan Weil <sw@weilnetz.de>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-Id: <20221126152507.283271-7-sw@weilnetz.de>
Commit 69e1c14aa2 ("virtio: core: vq reset feature negotation support")
enabled VIRTIO_F_RING_RESET by default for all virtio devices.
This feature is not currently emulated by QEMU, so for vhost and
vhost-user devices we need to make sure it is supported by the offloaded
device emulation (in-kernel or in another process).
To do this we need to add VIRTIO_F_RING_RESET to the features bitmap
passed to vhost_get_features(). This way it will be masked if the device
does not support it.
This issue was initially discovered with vhost-vsock and vhost-user-vsock,
and then also tested with vhost-user-rng which confirmed the same issue.
They fail when sending features through VHOST_SET_FEATURES ioctl or
VHOST_USER_SET_FEATURES message, since VIRTIO_F_RING_RESET is negotiated
by the guest (Linux >= v6.0), but not supported by the device.
Fixes: 69e1c14aa2 ("virtio: core: vq reset feature negotation support")
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/1318
Signed-off-by: Stefano Garzarella <sgarzare@redhat.com>
Message-Id: <20221121101101.29400-1-sgarzare@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Acked-by: Raphael Norwitz <raphael.norwitz@nutanix.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Replaces TABs with spaces, making sure to have a consistent coding style
of 4 space indentations in the net subsystem.
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/377
Signed-off-by: Ahmed Abouzied <email@aabouzied.com>
Message-Id: <20210614183849.20622-1-email@aabouzied.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
[thuth: Fixed mis-aligned indentation in some of the files]
Signed-off-by: Thomas Huth <thuth@redhat.com>
The citing commit has incorrect code in vhost_vdpa_receive() that returns
zero instead of full packet size to the caller. This renders pending packets
unable to be freed so then get clogged in the tx queue forever. When device
is being reset later on, below assertion failure ensues:
0 0x00007f86d53bb387 in raise () from /lib64/libc.so.6
1 0x00007f86d53bca78 in abort () from /lib64/libc.so.6
2 0x00007f86d53b41a6 in __assert_fail_base () from /lib64/libc.so.6
3 0x00007f86d53b4252 in __assert_fail () from /lib64/libc.so.6
4 0x000055b8f6ff6fcc in virtio_net_reset (vdev=<optimized out>) at /usr/src/debug/qemu/hw/net/virtio-net.c:563
5 0x000055b8f7012fcf in virtio_reset (opaque=0x55b8faf881f0) at /usr/src/debug/qemu/hw/virtio/virtio.c:1993
6 0x000055b8f71f0086 in virtio_bus_reset (bus=bus@entry=0x55b8faf88178) at /usr/src/debug/qemu/hw/virtio/virtio-bus.c:102
7 0x000055b8f71f1620 in virtio_pci_reset (qdev=<optimized out>) at /usr/src/debug/qemu/hw/virtio/virtio-pci.c:1845
8 0x000055b8f6fafc6c in memory_region_write_accessor (mr=<optimized out>, addr=<optimized out>, value=<optimized out>,
size=<optimized out>, shift=<optimized out>, mask=<optimized out>, attrs=...) at /usr/src/debug/qemu/memory.c:483
9 0x000055b8f6fadce9 in access_with_adjusted_size (addr=addr@entry=20, value=value@entry=0x7f867e7fb7e8, size=size@entry=1,
access_size_min=<optimized out>, access_size_max=<optimized out>, access_fn=0x55b8f6fafc20 <memory_region_write_accessor>,
mr=0x55b8faf80a50, attrs=...) at /usr/src/debug/qemu/memory.c:544
10 0x000055b8f6fb1d0b in memory_region_dispatch_write (mr=mr@entry=0x55b8faf80a50, addr=addr@entry=20, data=0, op=<optimized out>,
attrs=attrs@entry=...) at /usr/src/debug/qemu/memory.c:1470
11 0x000055b8f6f62ada in flatview_write_continue (fv=fv@entry=0x7f86ac04cd20, addr=addr@entry=549755813908, attrs=...,
attrs@entry=..., buf=buf@entry=0x7f86d0223028 <Address 0x7f86d0223028 out of bounds>, len=len@entry=1, addr1=20, l=1,
mr=0x55b8faf80a50) at /usr/src/debug/qemu/exec.c:3266
12 0x000055b8f6f62c8f in flatview_write (fv=0x7f86ac04cd20, addr=549755813908, attrs=...,
buf=0x7f86d0223028 <Address 0x7f86d0223028 out of bounds>, len=1) at /usr/src/debug/qemu/exec.c:3306
13 0x000055b8f6f674cb in address_space_write (as=<optimized out>, addr=<optimized out>, attrs=..., buf=<optimized out>,
len=<optimized out>) at /usr/src/debug/qemu/exec.c:3396
14 0x000055b8f6f67575 in address_space_rw (as=<optimized out>, addr=<optimized out>, attrs=..., attrs@entry=...,
buf=buf@entry=0x7f86d0223028 <Address 0x7f86d0223028 out of bounds>, len=<optimized out>, is_write=<optimized out>)
at /usr/src/debug/qemu/exec.c:3406
15 0x000055b8f6fc1cc8 in kvm_cpu_exec (cpu=cpu@entry=0x55b8f9aa0e10) at /usr/src/debug/qemu/accel/kvm/kvm-all.c:2410
16 0x000055b8f6fa5f5e in qemu_kvm_cpu_thread_fn (arg=0x55b8f9aa0e10) at /usr/src/debug/qemu/cpus.c:1318
17 0x000055b8f7336e16 in qemu_thread_start (args=0x55b8f9ac8480) at /usr/src/debug/qemu/util/qemu-thread-posix.c:519
18 0x00007f86d575aea5 in start_thread () from /lib64/libpthread.so.0
19 0x00007f86d5483b2d in clone () from /lib64/libc.so.6
Make vhost_vdpa_receive() return the size passed in as is, so that the
caller qemu_deliver_packet_iov() would eventually propagate it back to
virtio_net_flush_tx() to release pending packets from the async_tx queue.
Which corresponds to the drop path where qemu_sendv_packet_async() returns
non-zero in virtio_net_flush_tx().
Fixes: 846a1e85da ("vdpa: Add dummy receive callback")
Cc: Eugenio Perez Martin <eperezma@redhat.com>
Signed-off-by: Si-Wei Liu <si-wei.liu@oracle.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-Id: <20221108041929.18417-2-jasowang@redhat.com>
Commit 8801ccd050 introduced a compilation failure with clang
version 10.0.0-4ubuntu1:
../../net/vhost-vdpa.c:654:16: error: variable 'vdpa_device_fd' is
used uninitialized whenever 'if' condition is false
[-Werror,-Wsometimes-uninitialized]
} else if (opts->has_vhostfd) {
^~~~~~~~~~~~~~~~~
../../net/vhost-vdpa.c:662:33: note: uninitialized use occurs here
r = vhost_vdpa_get_features(vdpa_device_fd, &features, errp);
^~~~~~~~~~~~~~
../../net/vhost-vdpa.c:654:12: note: remove the 'if' if its condition
is always true
} else if (opts->has_vhostfd) {
^~~~~~~~~~~~~~~~~~~~~~~
../../net/vhost-vdpa.c:629:23: note: initialize the variable
'vdpa_device_fd' to silence this warning
int vdpa_device_fd;
^
= 0
1 error generated.
It's a false positive -- the compiler doesn't manage to figure out
that the error checks further up mean that there's no code path where
vdpa_device_fd isn't initialized. Put another way, the problem is
that we check "if (opts->has_vhostfd)" when in fact that condition
must always be true. A cleverer static analyser would probably warn
that we were checking an always-true condition.
Fix the compilation failure by removing the unnecessary if().
Fixes: 8801ccd050 ("vhost-vdpa: allow passing opened vhostfd to vhost-vdpa")
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Message-Id: <20221031132901.1277150-1-peter.maydell@linaro.org>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
The netdev reports NETDEV_STREAM_CONNECTED event when the backend
is connected, and NETDEV_STREAM_DISCONNECTED when it is disconnected.
The NETDEV_STREAM_CONNECTED event includes the destination address.
This allows a system manager like libvirt to detect when the server
fails.
For instance with passt:
{ 'execute': 'qmp_capabilities' }
{ "return": { } }
{ "timestamp": { "seconds": 1666341395, "microseconds": 505347 },
"event": "NETDEV_STREAM_CONNECTED",
"data": { "netdev-id": "netdev0",
"addr": { "path": "/tmp/passt_1.socket", "type": "unix" } } }
[killing passt here]
{ "timestamp": { "seconds": 1666341430, "microseconds": 968694 },
"event": "NETDEV_STREAM_DISCONNECTED",
"data": { "netdev-id": "netdev0" } }
Signed-off-by: Laurent Vivier <lvivier@redhat.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
Use QIOChannel, QIOChannelSocket and QIONetListener.
This allows net/stream to use all the available parameters provided by
SocketAddress.
Signed-off-by: Laurent Vivier <lvivier@redhat.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: Laurent Vivier <lvivier@redhat.com>
Reviewed-by: Stefano Brivio <sbrivio@redhat.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Acked-by: Markus Armbruster <armbru@redhat.com> (QAPI schema)
Signed-off-by: Jason Wang <jasowang@redhat.com>
It is less complex to manage special cases directly in
net_dgram_mcast_init() and net_dgram_udp_init().
Signed-off-by: Laurent Vivier <lvivier@redhat.com>
Reviewed-by: Stefano Brivio <sbrivio@redhat.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
dgram_dst is a sockaddr_in structure. To be able to use it with
unix socket, use a pointer to a generic sockaddr structure.
Rename it dest_addr, and store socket length in dest_len.
Signed-off-by: Laurent Vivier <lvivier@redhat.com>
Reviewed-by: Stefano Brivio <sbrivio@redhat.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: Laurent Vivier <lvivier@redhat.com>
Reviewed-by: Stefano Brivio <sbrivio@redhat.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Acked-by: Markus Armbruster <armbru@redhat.com> (QAPI schema)
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Jason Wang <jasowang@redhat.com>
Other errors are treated as failure by net_stream_client_init(),
but if connect() returns EINVAL, we'll fail silently. Remove the
related exception.
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
[lvivier: applied to net/stream.c]
Signed-off-by: Laurent Vivier <lvivier@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
Other errors are treated as failure by net_socket_connect_init(),
but if connect() returns EINVAL, we'll fail silently. Remove the
related exception.
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
Signed-off-by: Laurent Vivier <lvivier@redhat.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
Copied from socket netdev file and modified to use SocketAddress
to be able to introduce new features like unix socket.
"udp" and "mcast" are squashed into dgram netdev, multicast is detected
according to the IP address type.
"listen" and "connect" modes are managed by stream netdev. An optional
parameter "server" defines the mode (off by default)
The two new types need to be parsed the modern way with -netdev, because
with the traditional way, the "type" field of netdev structure collides with
the "type" field of SocketAddress and prevents the correct evaluation of the
command line option. Moreover the traditional way doesn't allow to use
the same type (SocketAddress) several times with the -netdev option
(needed to specify "local" and "remote" addresses).
The previous commit paved the way for parsing the modern way, but
omitted one detail: how to pick modern vs. traditional, in
netdev_is_modern().
We want to pick based on the value of parameter "type". But how to
extract it from the option argument?
Parsing the option argument, either the modern or the traditional way,
extracts it for us, but only if parsing succeeds.
If parsing fails, there is no good option. No matter which parser we
pick, it'll be the wrong one for some arguments, and the error
reporting will be confusing.
Fortunately, the traditional parser accepts *anything* when called in
a certain way. This maximizes our chance to extract the value of
"type", and in turn minimizes the risk of confusing error reporting.
Signed-off-by: Laurent Vivier <lvivier@redhat.com>
Reviewed-by: Stefano Brivio <sbrivio@redhat.com>
Acked-by: Markus Armbruster <armbru@redhat.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
Embed the setting of info_str in a function.
Signed-off-by: Laurent Vivier <lvivier@redhat.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Jason Wang <jasowang@redhat.com>
As qemu_opts_parse_noisily() flattens the QAPI structures ("type" field
of Netdev structure can collides with "type" field of SocketAddress),
we introduce a way to bypass qemu_opts_parse_noisily() and use directly
visit_type_Netdev() to parse the backend parameters.
More details from Markus:
qemu_init() passes the argument of -netdev, -nic, and -net to
net_client_parse().
net_client_parse() parses with qemu_opts_parse_noisily(), passing
QemuOptsList qemu_netdev_opts for -netdev, qemu_nic_opts for -nic, and
qemu_net_opts for -net. Their desc[] are all empty, which means any
keys are accepted. The result of the parse (a QemuOpts) is stored in
the QemuOptsList.
Note that QemuOpts is flat by design. In some places, we layer non-flat
on top using dotted keys convention, but not here.
net_init_clients() iterates over the stored QemuOpts, and passes them to
net_init_netdev(), net_param_nic(), or net_init_client(), respectively.
These functions pass the QemuOpts to net_client_init(). They also do
other things with the QemuOpts, which we can ignore here.
net_client_init() uses the opts visitor to convert the (flat) QemOpts to
a (non-flat) QAPI object Netdev. Netdev is also the argument of QMP
command netdev_add.
The opts visitor was an early attempt to support QAPI in
(QemuOpts-based) CLI. It restricts QAPI types to a certain shape; see
commit eb7ee2cbeb "qapi: introduce OptsVisitor".
A more modern way to support QAPI is qobject_input_visitor_new_str().
It uses keyval_parse() instead of QemuOpts for KEY=VALUE,... syntax, and
it also supports JSON syntax. The former isn't quite as expressive as
JSON, but it's a lot closer than QemuOpts + opts visitor.
This commit paves the way to use of the modern way instead.
Signed-off-by: Laurent Vivier <lvivier@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
All net_client_parse() callers exit in case of error.
Move exit(1) to net_client_parse() and remove error checking from
the callers.
Suggested-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Laurent Vivier <lvivier@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
The only caller passes &error_fatal, so use this directly in the function.
It's what we do for -blockdev, -device, and -object.
Suggested-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Laurent Vivier <lvivier@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: Laurent Vivier <lvivier@redhat.com>
Reviewed-by: Stefano Brivio <sbrivio@redhat.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
The current message when using '-net user...' with SLIRP disabled at
compile time is:
qemu-system-x86_64: -net user: Parameter 'type' expects a net backend type (maybe it is not compiled into this binary)
An observation is that we're using the 'netdev->type' field here which
is an enum value, produced after QAPI has converted from its string
form.
IOW, at this point in the code, we know that the user's specified
type name was a valid network backend. The only possible scenario that
can make the backend init function be NULL, is if support for that
backend was disabled at build time. Given this, we don't need to caveat
our error message with a 'maybe' hint, we can be totally explicit.
The use of QERR_INVALID_PARAMETER_VALUE doesn't really lend itself to
user friendly error message text. Since this is not used to set a
specific QAPI error class, we can simply stop using this pre-formatted
error text and provide something better.
Thus the new message is:
qemu-system-x86_64: -net user: network backend 'user' is not compiled into this binary
The case of passing 'hubport' for -net is also given a message reminding
people they should have used -netdev/-nic instead, as this backend type
is only valid for the modern syntax.
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
Similar to other vhost backends, vhostfd can be passed to vhost-vdpa
backend as another parameter to instantiate vhost-vdpa net client.
This would benefit the use case where only open file descriptors, as
opposed to raw vhost-vdpa device paths, are accessible from the QEMU
process.
(qemu) netdev_add type=vhost-vdpa,vhostfd=61,id=vhost-vdpa1
Signed-off-by: Si-Wei Liu <si-wei.liu@oracle.com>
Acked-by: Eugenio Pérez <eperezma@redhat.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
The guest will see undefined behavior if it issue not negotiate
commands, bit it is expected somehow.
Simplify code deleting this check.
Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
This entry was duplicated on referenced commit. Removing it.
Fixes: 402378407d ("vhost-vdpa: multiqueue support")
Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
If the net tap initializes successful, but failed during
network card hot-plugging, the net-tap will remains,
so cleanup.
Signed-off-by: lu zhipeng <luzhipeng@cestc.cn>
Signed-off-by: Jason Wang <jasowang@redhat.com>
So we are sure we can update the device model properly before sending to
the device.
Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
Same way as with the MAC, restore the expected number of queues at
device's start.
Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
Since there may be many commands we need to issue to load the NIC
state, let's split them in individual functions
Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
This allows to simplify the code. Rename to status while we're at it.
Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
When enabled the virtio-net-pci, guest network packet will
load the vnet_hdr. In COLO status, the primary VM's network
packet maybe redirect to another VM, it needs filter-redirect
enable the vnet_hdr flag at the same time, COLO-proxy will
correctly parse the original network packet. If have any
misconfiguration here, the vnet_hdr_len is wrong for parse
the packet, the data+offset will point to wrong place.
Signed-off-by: Zhang Chen <chen.zhang@intel.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
We can restore the device state in the destination via CVQ now. Remove
the migration blocker.
Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
This is needed so the destination vdpa device see the same state a the
guest set in the source.
Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
So we can reuse it to inject state messages.
Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
Acked-by: Jason Wang <jasowang@redhat.com>
--
v7:
* Remove double free error
v6:
* Do not assume in buffer sent to the device is sizeof(virtio_net_ctrl_ack)
v5:
* Do not use an artificial !NULL VirtQueueElement
* Use only out size instead of iovec dev_buffers for these functions.
Signed-off-by: Jason Wang <jasowang@redhat.com>
As this series will reuse them to restore the device state at the end of
a migration (or a device start), let's allocate only once at the device
start so we don't duplicate their map and unmap.
Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
Next patches will add a new info callback to restore NIC status through
CVQ. Since only the CVQ vhost device is needed, create it with a new
NetClientInfo.
Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
It's convenient to call iova_tree_remove from a map returned from
iova_tree_find or iova_tree_find_iova. With the current code this is not
possible, since we will free it, and then we will try to search for it
again.
Fix it making accepting the map by value, forcing a copy of the
argument. Not applying a fixes tag, since there is no use like that at
the moment.
Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
File descriptor vdpa_device_fd is not free in the case of returning
error from vhost_vdpa_get_features. Fixing it by making all errors go to
the same error path.
Resolves: Coverity CID 1490785
Fixes: 8170ab3f43 ("vdpa: Extract get features part from vhost_vdpa_get_max_queue_pairs")
Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
Reviewed-by: Laurent Vivier <lvivier@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Message-Id: <20220802112447.249436-2-eperezma@redhat.com>
Signed-off-by: Laurent Vivier <laurent@vivier.eu>
When COLO use only one vnet_hdr_support parameter between
filter-redirector and filter-mirror(or colo-compare), COLO will crash
with segmentation fault. Back track as follow:
Thread 1 "qemu-system-x86" received signal SIGSEGV, Segmentation fault.
0x0000555555cb200b in eth_get_l2_hdr_length (p=0x0)
at /home/tao/project/COLO/colo-qemu/include/net/eth.h:296
296 uint16_t proto = be16_to_cpu(PKT_GET_ETH_HDR(p)->h_proto);
(gdb) bt
0 0x0000555555cb200b in eth_get_l2_hdr_length (p=0x0)
at /home/tao/project/COLO/colo-qemu/include/net/eth.h:296
1 0x0000555555cb22b4 in parse_packet_early (pkt=0x555556a44840) at
net/colo.c:49
2 0x0000555555cb2b91 in is_tcp_packet (pkt=0x555556a44840) at
net/filter-rewriter.c:63
So wrong vnet_hdr_len will cause pkt->data become NULL. Add check to
raise error and add trace-events to track vnet_hdr_len.
Signed-off-by: Tao Xu <tao3.xu@intel.com>
Signed-off-by: Zhang Chen <chen.zhang@intel.com>
Reviewed-by: Li Zhijian <lizhijian@fujitsu.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
Filter-rewriter no need to track connection in conn_list.
This patch fix the glib g_queue_is_empty assertion when COLO guest
keep a lot of network connection.
Signed-off-by: Zhang Chen <chen.zhang@intel.com>
Reviewed-by: Li Zhijian <lizhijian@fujitsu.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
We notice the QEMU may crash when the guest has too many
incoming network connections with the following log:
15197@1593578622.668573:colo_proxy_main : colo proxy connection hashtable full, clear it
free(): invalid pointer
[1] 15195 abort (core dumped) qemu-system-x86_64 ....
This is because we create the s->connection_track_table with
g_hash_table_new_full() which is defined as:
GHashTable * g_hash_table_new_full (GHashFunc hash_func,
GEqualFunc key_equal_func,
GDestroyNotify key_destroy_func,
GDestroyNotify value_destroy_func);
The fourth parameter connection_destroy() will be called to free the
memory allocated for all 'Connection' values in the hashtable when
we call g_hash_table_remove_all() in the connection_hashtable_reset().
But both connection_track_table and conn_list reference to the same
conn instance. It will trigger double free in conn_list clear. So this
patch remove free action on hash table side to avoid double free the
conn.
Signed-off-by: Like Xu <like.xu@linux.intel.com>
Signed-off-by: Zhang Chen <chen.zhang@intel.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
Finally offering the possibility to enable SVQ from the command line.
Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
Acked-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
To know the device features is needed for CVQ SVQ, so SVQ knows if it
can handle all commands or not. Extract from
vhost_vdpa_get_max_queue_pairs so we can reuse it.
Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
Introduce the control virtqueue support for vDPA shadow virtqueue. This
is needed for advanced networking features like rx filtering.
Virtio-net control VQ copies the descriptors to qemu's VA, so we avoid
TOCTOU with the guest's or device's memory every time there is a device
model change. Otherwise, the guest could change the memory content in
the time between qemu and the device read it.
To demonstrate command handling, VIRTIO_NET_F_CTRL_MACADDR is
implemented. If the virtio-net driver changes MAC the virtio-net device
model will be updated with the new one, and a rx filtering change event
will be raised.
More cvq commands could be added here straightforwardly but they have
not been tested.
Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
Do a simple forwarding of CVQ buffers, the same work SVQ could do but
through callbacks. No functional change intended.
Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
net/vhost-vdpa.c will need functions that are declared in
vhost-shadow-virtqueue.c, that needs functions of virtio-net.c.
Copy the vhost-vdpa-stub.c code so
only the constructor net_init_vhost_vdpa needs to be defined.
Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
Interaction with vmnet.framework in different modes
differs only on configuration stage, so we can create
common `send`, `receive`, etc. procedures and reuse them.
Reviewed-by: Akihiko Odaki <akihiko.odaki@gmail.com>
Tested-by: Akihiko Odaki <akihiko.odaki@gmail.com>
Signed-off-by: Phillip Tennen <phillip@axleos.com>
Signed-off-by: Vladislav Yaroshchuk <Vladislav.Yaroshchuk@jetbrains.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
... such that no memory leaks on dangling net clients in case of
error.
Signed-off-by: Si-Wei Liu <si-wei.liu@oracle.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Message-Id: <1651890498-24478-4-git-send-email-si-wei.liu@oracle.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
libslirp 4.7 introduces a CFI-friendly version of the .timer_new callback.
The new callback replaces the function pointer with an enum; invoking the
callback is done with a new function slirp_handle_timer.
Support the new API so that CFI can be made compatible with using a system
libslirp.
Reviewed-by: Samuel Thibault <samuel.thibault@ens-lyon.org>
Reviewed-by: Marc-André Lureau <malureau@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Replace slirp_init with slirp_new, so that a more recent cfg.version
can be specified. The function appeared in version 4.1.0.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
This struct will be extended in the next few patches to support the
new slirp_handle_timer() call. For that we need to store an additional
"int" for each SLIRP timer, in addition to the cb_opaque.
Reviewed-by: Samuel Thibault <samuel.thibault@ens-lyon.org>
Reviewed-by: Marc-André Lureau <malureau@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
The qemu_*block() functions are meant to be be used with sockets (the
win32 implementation expects SOCKET)
Over time, those functions where used with Win32 SOCKET or
file-descriptors interchangeably. But for portability, they must only be
used with socket-like file-descriptors. FDs can use
g_unix_set_fd_nonblocking() instead.
Rename the functions with "socket" in the name to prevent bad usages.
This is effectively reverting commit f9e8cacc55 ("oslib-posix:
rename socket_set_nonblock() to qemu_set_nonblock()").
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Those calls are POSIX-specific. Use the dedicated GLib
API. (qemu_set_nonblock() is for socket-like)
(this is a preliminary patch before renaming qemu_set_nonblock())
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Suggested-by: Daniel P. Berrangé <berrange@redhat.com>
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
All uses flush output immediately before or after qemu_log_unlock.
Instead of a separate call, move the flush into qemu_log_unlock.
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20220417183019.755276-20-richard.henderson@linaro.org>
Inside qemu_log, we perform qemu_log_trylock/unlock, which need
not be done if we have already performed the lock beforehand.
Always check the result of qemu_log_trylock -- only checking
qemu_loglevel_mask races with the acquisition of the lock on
the logfile.
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20220417183019.755276-10-richard.henderson@linaro.org>
This function can fail, which makes it more like ftrylockfile
or pthread_mutex_trylock than flockfile or pthread_mutex_lock,
so rename it.
To closer match the other trylock functions, release rcu_read_lock
along the failure path, so that qemu_log_unlock need not be called
on failure.
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20220417183019.755276-8-richard.henderson@linaro.org>
Replace a config-time define with a compile time condition
define (compatible with clang and gcc) that must be declared prior to
its usage. This avoids having a global configure time define, but also
prevents from bad usage, if the config header wasn't included before.
This can help to make some code independent from qemu too.
gcc supports __BYTE_ORDER__ from about 4.6 and clang from 3.2.
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
[ For the s390x parts I'm involved in ]
Acked-by: Halil Pasic <pasic@linux.ibm.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20220323155743.1585078-7-marcandre.lureau@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Move qemu_ether_ntoa() which is only needed in net/.
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
The socket API wrappers were initially introduced in commit
00aa0040 ("Wrap recv to avoid warnings"), but made redundant with
commit a2d96af4 ("osdep: add wrappers for socket functions") which fixes
the win32 declarations and thus removed the earlier warnings.
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Move the various memalign-related functions out of osdep.h and into
their own header, which we include only where they are used.
While we're doing this, add some brief documentation comments.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Message-id: 20220226180723.1706285-10-peter.maydell@linaro.org
The IPv6 option headers all have in common that they start with some
common fields, in particular the type of the next header followed by the
extention header length. This is used to traverse the list of the
options. The ESP header does not follow that format, which can break the
IPv6 option header traversal code in eth_parse_ipv6_hdr().
The effect of that is that network interfaces such as vmxnet3 that use
the following call chain
eth_is_ip6_extension_header_type
eth_parse_ipv6_hdr
net_tx_pkt_parse_headers
net_tx_pkt_parse
vmxnet3_process_tx_queue
to send packets from the VM out to the host will drop packets of the
following structure:
Ethernet-Header(IPv6-Header(ESP(encrypted data)))
Note that not all types of network interfaces use the net_tx_pkt_parse
function though, leading to inconsistent behavior regarding sending
those packets. The e1000 network interface for example does not suffer
from this limitation.
By not considering ESP to be an IPv6 header we can allow sending those
packets out to the host on all types of network interfaces.
Fixes: 75020a7021 ("Common definitions for VMWARE devices")
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/149
Buglink: https://bugs.launchpad.net/qemu/+bug/1758091
Signed-off-by: Thomas Jansen <mithi@mithi.net>
Signed-off-by: Jason Wang <jasowang@redhat.com>
This patch is to improve the logic of QEMU main thread sleep code in
qemu_chr_write_buffer() where it can be blocked and can't run other
coroutines during COLO IO stress test.
Our approach is to put filter_send() in a coroutine. In this way,
filter_send() will call qemu_coroutine_yield() in qemu_co_sleep_ns(),
so that it can be scheduled out and QEMU main thread has opportunity to
run other tasks.
Signed-off-by: Lei Rao <lei.rao@intel.com>
Signed-off-by: Zhang Chen <chen.zhang@intel.com>
Reviewed-by: Li Zhijian <lizhijian@fujitsu.com>
Reviewed-by: Zhang Chen <chen.zhang@intel.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
Make the comments consistent with the REGULAR_PACKET_CHECK_MS.
Signed-off-by: Zhang Chen <chen.zhang@intel.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
COLO-compare use the glib function g_queue_find_custom to dump
another VM's networking packet to compare. But this function always
start find from the queue->head(here is the newest packet), It will
reduce the success rate of comparison. So this patch reversed
the order of the queues for performance.
Signed-off-by: Zhang Chen <chen.zhang@intel.com>
Reported-by: leirao <lei.rao@intel.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
Match the other error handling in this function.
Fixes: e7b347d0bf ("net: detect errors from probing vnet hdr flag for TAP devices")
Reviewed-by: Patrick Venture <venture@google.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Peter Foley <pefoley@google.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
softmmu/rtc.c defines two public functions: qemu_get_timedate() and
qemu_timedate_diff(). Currently we keep the prototypes for these in
qemu-common.h, but most files don't need them. Move them to their
own header, a new include/sysemu/rtc.h.
Since the C files using these two functions did not need to include
qemu-common.h for any other reason, we can remove those include lines
when we add the include of the new rtc.h.
The license for the .h file follows that of the softmmu/rtc.c
where both the functions are defined.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Qemu falls back on userland handlers even if vhost-user and vhost-vdpa
cases. These assumes a tap device can handle the packets.
If a vdpa device fail to start, it can trigger a sigsegv because of
that. Add dummy receiver that returns no progress so it can keep
running.
Fixes: 1e0a84ea49 ("vhost-vdpa: introduce vhost-vdpa net client")
Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
Message-Id: <20211125101614.76927-2-eperezma@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Acked-by: Jason Wang <jasowang@redhat.com>
The TCP protocol ACK maybe bigger than uint32_t MAX.
At this time, the ACK will reverse to 0. This patch
fix the max_ack and min_ack track issue.
Signed-off-by: Zhang Chen <chen.zhang@intel.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
Since net_init_vhost_vdpa is trying to open it. Not specifying it in the
command line crash qemu.
Fixes: 7327813d17 ("vhost-vdpa: open device fd in net_init_vhost_vdpa()")
Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
Message-Id: <20211112193431.2379298-3-eperezma@redhat.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
There is no reason to keep using the old one, since we neither use the
variadics arguments nor open it with O_DIRECT.
Also, net_client_init1, the caller of net_init_vhost_vdpa, wants all
net_client_init_fun to use Error API, so it's a good step in that
direction.
Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
Message-Id: <20211112193431.2379298-2-eperezma@redhat.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Use g_autofree to ensure that `config` is freed when
vhost_vdpa_get_max_queue_pairs() returns.
Reported-by: Coverity (CID 1465228: RESOURCE_LEAK)
Fixes: 402378407d ("vhost-vdpa: multiqueue support")
Signed-off-by: Stefano Garzarella <sgarzare@redhat.com>
Message-Id: <20211102155157.241034-1-sgarzare@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Hi
This are the pending migration patches on the list:
- Provide an error message for migration_cancel by Laurent
- Don't dump colo cache when a guest core is requested by Lukas
- Initialise Compression_conters for new migration by Yuxiating
On top of that I added another missing initialization
- Colo optimizations and crash improvements by Rao.
Please, apply.
-----BEGIN PGP SIGNATURE-----
iQIzBAABCAAdFiEEGJn/jt6/WMzuA0uC9IfvGFhy1yMFAmGCTC8ACgkQ9IfvGFhy
1yMz3A//YNORIjbnvV9zhJPIBWBXf9vo/Cdzh8RshI7QyuW/a61poM3/yxwErTRG
IFJagmwVvCm7lXqvNBY2XDLkS0DURZmCC6HiUTnomgCOSpLgpLwkJTYqsQRVel4Q
Ow1CuEhmIa/XWKxvhUS2KfQx3iy4pK+5Z+DmamW/VnakqoviGvLkwvgALAA6bP0l
1Ddbg6i9BRLJM5Jeo1UJRmDJfldJTr4k962lrX4VRBKqxv/PQRLuVkALXfBCjmY8
uqiPPfDlVx7XE2ralyMmP5qgvJT88jGeNDcpuCWAKxh3OsKwPtm3qWXkEvP7uuDj
6FfWulUT7QNduU1DWnmQUwvz1lRX16EhjDDgRyvgVv1Vn21z+8C751bLlLR0M+rt
Ip2PZfv3cgOLhUpyY3+jf0uTGFVb9q4kYijvAOoOcFPuLbL1u1N+ljDz5vB/OzuR
qxIajFq9uvR9uBP8JL8WIkDQa5laPdwPyLhuMMvNfmUsUoSlgV1r8hwqUfLUr6SN
vaCUWHAwmik8BiLAurlahsxoc76v6tRga6OTYJxOon7MrKcxab0BO5ndTfMRLncN
cffxdh1R6D1jb4uYR0f5+HI0nl86IlS2ReE+pd1mPA3xyv3ZyUzP4txJkV8k/eHs
6hVG6ia5MJLcvlfZyNatAgmK4Y3L98+JiaBIkUoLibu2tu/DsEg=
=msYG
-----END PGP SIGNATURE-----
Merge remote-tracking branch 'remotes/juanquintela/tags/migration-20211102-pull-request' into staging
Migration Pull request
Hi
This are the pending migration patches on the list:
- Provide an error message for migration_cancel by Laurent
- Don't dump colo cache when a guest core is requested by Lukas
- Initialise Compression_conters for new migration by Yuxiating
On top of that I added another missing initialization
- Colo optimizations and crash improvements by Rao.
Please, apply.
# gpg: Signature made Wed 03 Nov 2021 04:45:35 AM EDT
# gpg: using RSA key 1899FF8EDEBF58CCEE034B82F487EF185872D723
# gpg: Good signature from "Juan Quintela <quintela@redhat.com>" [full]
# gpg: aka "Juan Quintela <quintela@trasno.org>" [full]
* remotes/juanquintela/tags/migration-20211102-pull-request:
Optimized the function of fill_connection_key.
colo: Don't dump colo cache if dump-guest-core=off
Changed the last-mode to none of first start COLO
Removed the qemu_fclose() in colo_process_incoming_thread
colo: fixed 'Segmentation fault' when the simplex mode PVM poweroff
Fixed SVM hang when do failover before PVM crash
Fixed qemu crash when guest power off in COLO mode
Some minor optimizations for COLO
migration: Zero migration compression counters
migration: initialise compression_counters for a new migration
migration: provide an error message to migration_cancel()
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Remove some unnecessary code to improve the performance of
the filter-rewriter module.
Signed-off-by: Lei Rao <lei.rao@intel.com>
Reviewed-by: Zhang Chen <chen.zhang@intel.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@trasno.org>
Signed-off-by: Lei Rao <lei.rao@intel.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
And while we're at it, also provide a proper entry for this feature
in meson_options.txt, so that people who don't need it have a knob
to disable this feature.
Signed-off-by: Thomas Huth <thuth@redhat.com>
Message-Id: <20211028185910.1729744-3-thuth@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
This patch implements the multiqueue support for vhost-vdpa. This is
done simply by reading the number of queue pairs from the config space
and initialize the datapath and control path net client.
Signed-off-by: Jason Wang <jasowang@redhat.com>
Message-Id: <20211020045600.16082-11-jasowang@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
This patch introduces a boolean for the device has control queue which
can accepts control command via network queue.
The first user would be the control virtqueue support for vhost.
Signed-off-by: Jason Wang <jasowang@redhat.com>
Message-Id: <20211020045600.16082-6-jasowang@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
This patch switches to let net_vhost_vdpa_init() to return
NetClientState *. This is used for the callers to allocate multiqueue
NetClientState for multiqueue support.
Signed-off-by: Jason Wang <jasowang@redhat.com>
Message-Id: <20211020045600.16082-5-jasowang@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
This patch switches to open device fd in net_init_vhost_vpda(). This is
used to prepare for the multiqueue support.
Reviewed-by: Stefano Garzarella <sgarzare@redhat.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
Message-Id: <20211020045600.16082-2-jasowang@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
vhost-vdpa works only with specific devices. At startup, it second
guesses what the command line option handling will do and error out if
it thinks a non-virtio device will attach to them.
This second guessing is not only ugly, it can lead to wrong error
messages ('-device floppy,netdev=foo' should complain about an unknown
property, not about the wrong kind of network device being attached) and
completely ignores hotplugging.
Drop the old checks and implement .check_peer_type() instead to fix
this. As a nice side effect, it also removes one more dependency on the
legacy QemuOpts infrastructure and even reduces the code size.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Message-Id: <20211008133442.141332-4-kwolf@redhat.com>
Reviewed-by: Damien Hedde <damien.hedde@greensocs.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Tested-by: Peter Krempa <pkrempa@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
vhost-user works only with specific devices. At startup, it second
guesses what the command line option handling will do and error out if
it thinks a non-virtio device will attach to them.
This second guessing is not only ugly, it can lead to wrong error
messages ('-device floppy,netdev=foo' should complain about an unknown
property, not about the wrong kind of network device being attached) and
completely ignores hotplugging.
Drop the old checks and implement .check_peer_type() instead to fix
this. As a nice side effect, it also removes one more dependency on the
legacy QemuOpts infrastructure and even reduces the code size.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Message-Id: <20211008133442.141332-3-kwolf@redhat.com>
Reviewed-by: Damien Hedde <damien.hedde@greensocs.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Tested-by: Peter Krempa <pkrempa@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
The queue_index of NetClientState should be assigned in set_netdev()
afterwards, so trying to net_vhost_vdpa_init() is meaningless. This
patch removes this.
Reviewed-by: Stefano Garzarella <sgarzare@redhat.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
Message-Id: <20210903091031.47303-12-jasowang@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Vhost_vdpa_add() can fail for various reasons, so the assertion of the
succeed is wrong. Instead, we should free the NetClientState and
propagate the error to the caller
Reviewed-by: Stefano Garzarella <sgarzare@redhat.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
Message-Id: <20210903091031.47303-11-jasowang@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Introduce new error label to avoid the unnecessary checking of net
pointer.
Fixes: 1e0a84ea49 ("vhost-vdpa: introduce vhost-vdpa net client")
Signed-off-by: Jason Wang <jasowang@redhat.com>
Message-Id: <20210903091031.47303-10-jasowang@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Fixes: 1e0a84ea49 ("vhost-vdpa: introduce vhost-vdpa net client")
Reviewed-by: Stefano Garzarella <sgarzare@redhat.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
Message-Id: <20210903091031.47303-9-jasowang@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
The previous vhost_net_cleanup is sufficient for freeing, calling
vhost_vdpa_del() in this case will lead an extra round of free. Note
that this kind of "double free" is safe since vhost_dev_cleanup() zero
the whole structure.
Reviewed-by: Stefano Garzarella <sgarzare@redhat.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
Message-Id: <20210903091031.47303-8-jasowang@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
The VhostVDPAState is just allocated by qemu_new_net_client() via
g_malloc0() in net_vhost_vdpa_init(). So s->vhost_net is NULL for
sure, let's remove this unnecessary check in vhost_vdpa_add().
Signed-off-by: Jason Wang <jasowang@redhat.com>
Message-Id: <20210903091031.47303-7-jasowang@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
This patch switches to initialize dev.nvqs from the VhostNetOptions
instead of assuming it was 2. This is useful for implementing control
virtqueue support which will be a single vhost_net structure with a
single cvq.
Note that nvqs is still set to 2 for all users and this patch does not
change functionality.
Reviewed-by: Stefano Garzarella <sgarzare@redhat.com>
Reviewed-by: Eli Cohen <elic@nvidia.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
Message-Id: <20210903091031.47303-6-jasowang@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
"acked_features" is unused, let's remove that.
Signed-off-by: Jason Wang <jasowang@redhat.com>
Message-Id: <20210903091031.47303-2-jasowang@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
We don't currently zero-initialize the 'struct sockaddr_in' that
parse_host_port() fills in, so any fields we don't explicitly
initialize might be left as random garbage. POSIX states that
implementations may define extensions in sockaddr_in, and that those
extensions must not trigger if zero-initialized. So not zero
initializing might result in inadvertently triggering an impdef
extension.
memset() the sockaddr_in before we start to fill it in.
Fixes: Coverity CID 1005338
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Eric Blake <eblake@redhat.com>
Message-id: 20210813150506.7768-2-peter.maydell@linaro.org
Since commit 9894dc0cdc "char: convert
from GIOChannel to QIOChannel", the first argument to the watch callback
can actually be a QIOChannel, which is not a GIOChannel (but a QEMU
Object).
Even though we never used that pointer, change the callback type to warn
the users. Possibly a better fix later, we may want to store the
callback and call it from intermediary functions.
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
From clang-13:
../qemu/net/checksum.c:189:23: error: variable 'buf_off' set but not used \
[-Werror,-Wunused-but-set-variable]
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
The data pointer has skipped vnet_hdr_len in the function of
parse_packet_early().So, we can not subtract vnet_hdr_len again
when calculating pkt->header_size in fill_pkt_tcp_info(). Otherwise,
it will cause network packet comparsion errors and greatly increase
the frequency of checkpoints.
Signed-off-by: Lei Rao <lei.rao@intel.com>
Signed-off-by: Zhang Chen <chen.zhang@intel.com>
Reviewed-by: Li Zhijian <lizhijian@fujitsu.com>
Reviewed-by: Zhang Chen <chen.zhang@intel.com>
Reviewed-by: Lukas Straub <lukasstraub2@web.de>
Tested-by: Lukas Straub <lukasstraub2@web.de>
Signed-off-by: Jason Wang <jasowang@redhat.com>
This patch fixes the following:
#0 __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:50
#1 0x00007f6ae4559859 in __GI_abort () at abort.c:79
#2 0x0000559aaa386720 in error_exit (err=16, msg=0x559aaa5973d0 <__func__.16227> "qemu_mutex_destroy") at util/qemu-thread-posix.c:36
#3 0x0000559aaa3868c5 in qemu_mutex_destroy (mutex=0x559aabffe828) at util/qemu-thread-posix.c:69
#4 0x0000559aaa2f93a8 in char_finalize (obj=0x559aabffe800) at chardev/char.c:285
#5 0x0000559aaa23318a in object_deinit (obj=0x559aabffe800, type=0x559aabfd7d20) at qom/object.c:606
#6 0x0000559aaa2331b8 in object_deinit (obj=0x559aabffe800, type=0x559aabfd9060) at qom/object.c:610
#7 0x0000559aaa233200 in object_finalize (data=0x559aabffe800) at qom/object.c:620
#8 0x0000559aaa234202 in object_unref (obj=0x559aabffe800) at qom/object.c:1074
#9 0x0000559aaa2356b6 in object_finalize_child_property (obj=0x559aac0dac10, name=0x559aac778760 "compare0-0", opaque=0x559aabffe800) at qom/object.c:1584
#10 0x0000559aaa232f70 in object_property_del_all (obj=0x559aac0dac10) at qom/object.c:557
#11 0x0000559aaa2331ed in object_finalize (data=0x559aac0dac10) at qom/object.c:619
#12 0x0000559aaa234202 in object_unref (obj=0x559aac0dac10) at qom/object.c:1074
#13 0x0000559aaa2356b6 in object_finalize_child_property (obj=0x559aac0c75c0, name=0x559aac0dadc0 "chardevs", opaque=0x559aac0dac10) at qom/object.c:1584
#14 0x0000559aaa233071 in object_property_del_child (obj=0x559aac0c75c0, child=0x559aac0dac10, errp=0x0) at qom/object.c:580
#15 0x0000559aaa233155 in object_unparent (obj=0x559aac0dac10) at qom/object.c:599
#16 0x0000559aaa2fb721 in qemu_chr_cleanup () at chardev/char.c:1159
#17 0x0000559aa9f9b110 in main (argc=54, argv=0x7ffeb62fa998, envp=0x7ffeb62fab50) at vl.c:4539
When chardev is cleaned up, chr_write_lock needs to be destroyed. But
the colo-compare module is not cleaned up normally before it when the
guest poweroff. It is holding chr_write_lock at this time. This will
cause qemu crash.So we add the function of colo_compare_cleanup() before
qemu_chr_cleanup() to fix the bug.
Signed-off-by: Lei Rao <lei.rao@intel.com>
Reviewed-by: Zhang Chen <chen.zhang@intel.com>
Reviewed-by: Lukas Straub <lukasstraub2@web.de>
Tested-by: Lukas Straub <lukasstraub2@web.de>
Signed-off-by: Zhang Chen <chen.zhang@intel.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
Use the packet_new_nocopy instead of packet_new in the
filter-rewriter module. There will be one less memory
copy in the processing of each network packet.
Signed-off-by: Lei Rao <lei.rao@intel.com>
Signed-off-by: Zhang Chen <chen.zhang@intel.com>
Reviewed-by: Zhang Chen <chen.zhang@intel.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
The iov_size has been calculated in filter_send(). we can directly
return the size.In this way, this is no need to repeat calculations
in filter_redirector_receive_iov();
Signed-off-by: Lei Rao <lei.rao@intel.com>
Reviewed-by: Li Zhijian <lizhijian@fujitsu.com>
Reviewed-by: Zhang Chen <chen.zhang@intel.com>
Reviewed-by: Lukas Straub <lukasstraub2@web.de>
Tested-by: Lukas Straub <lukasstraub2@web.de>
Signed-off-by: Zhang Chen <chen.zhang@intel.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
There is the same trace code in the colo_compare_packet_payload.
Signed-off-by: Lei Rao <lei.rao@intel.com>
Reviewed-by: Li Zhijian <lizhijian@fujitsu.com>
Reviewed-by: Zhang Chen <chen.zhang@intel.com>
Reviewed-by: Lukas Straub <lukasstraub2@web.de>
Tested-by: Lukas Straub <lukasstraub2@web.de>
Signed-off-by: Zhang Chen <chen.zhang@intel.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
When RSS is enabled the device tries to load the eBPF program
to select RX virtqueue in the TUN. If eBPF can be loaded
the RSS will function also with vhost (works with kernel 5.8 and later).
Software RSS is used as a fallback with vhost=off when eBPF can't be loaded
or when hash population requested by the guest.
Signed-off-by: Yuri Benditovich <yuri.benditovich@daynix.com>
Signed-off-by: Andrew Melnychenko <andrew@daynix.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
For now, that method supported only by Linux TAP.
Linux TAP uses TUNSETSTEERINGEBPF ioctl.
Signed-off-by: Andrew Melnychenko <andrew@daynix.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
Additional code that will be used for eBPF setting steering routine.
Signed-off-by: Andrew Melnychenko <andrew@daynix.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
Commit e50caf4a5c ("tracing: convert documentation to rST")
converted docs/devel/tracing.txt to docs/devel/tracing.rst.
We still have several references to the old file, so let's fix them
with the following command:
sed -i s/tracing.txt/tracing.rst/ $(git grep -l docs/devel/tracing.txt)
Signed-off-by: Stefano Garzarella <sgarzare@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Message-Id: <20210517151702.109066-2-sgarzare@redhat.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
Version: GnuPG v1
iQEcBAABAgAGBQJgrwypAAoJEO8Ells5jWIRDfsIAJyrDUbgDoeg4dM9TEvE+3xr
5Fd52wVozlQhn6zz2ZfEcqVfWNVbVHsVcYnpJQMXOCda5CIl5qZbl2EzS/lBrdkm
f5XUD4lFhwH1pUUugT20lEFYct0dv2UddEmLrEctI+CB5d9bA+SqeDdLAHYbrDIR
jfc/vOrP/1sfvWCsXmZzJ+0UhoGoS8qU2eLx9nwSvan3uLMmLWfWSTUltvwMNvZD
oA6a1O9poV00W89W25tSeMOKKPaS+RZYVFvDd0HzQYRI0t496VxK/Jw0VqfF+iCl
3KBzjhUi7QyFWgQY5xFQcMWwob8McOJOUGX4EQERm8EbHy8VhpugCHinqLPqf38=
=fRpo
-----END PGP SIGNATURE-----
Merge remote-tracking branch 'remotes/jasowang/tags/net-pull-request' into staging
# gpg: Signature made Thu 27 May 2021 04:06:17 BST
# gpg: using RSA key EF04965B398D6211
# gpg: Good signature from "Jason Wang (Jason Wang on RedHat) <jasowang@redhat.com>" [marginal]
# gpg: WARNING: This key is not certified with sufficiently trusted signatures!
# gpg: It is not certain that the signature belongs to the owner.
# Primary key fingerprint: 215D 46F4 8246 689E C77F 3562 EF04 965B 398D 6211
* remotes/jasowang/tags/net-pull-request:
tap-bsd: Remove special casing for older OpenBSD releases
virtio-net: failover: add missing remove_migration_state_change_notifier()
hw/net/imx_fec: return 0xffff when accessing non-existing PHY
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
OpenBSD added support for tap(4) 10 releases ago.
Remove the special casing for older releases.
Signed-off-by: Brad Smith <brad@comstyle.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
Replace Windows specific macro with a more generic feature detection
macro. Allows slirp smb feature to be disabled manually as well.
Acked-by: Samuel Thibault <samuel.thibault@ens-lyon.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Signed-off-by: Joelle van Dyne <j@getutm.app>
Message-Id: <20210315180341.31638-5-j@getutm.app>
[Use $default_feature as the default. - Paolo]
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Stop including sysemu/sysemu.h in files that don't need it.
Signed-off-by: Thomas Huth <thuth@redhat.com>
Message-Id: <20210416171314.2074665-2-thuth@redhat.com>
Signed-off-by: Laurent Vivier <laurent@vivier.eu>
Many files include qemu/log.h without needing it. Remove the superfluous
include statements.
Signed-off-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Message-Id: <20210328054833.2351597-1-thuth@redhat.com>
Signed-off-by: Laurent Vivier <laurent@vivier.eu>
There could be case that peer is NULL. This can happen when during
network device hot-add where net device needs to be added first. So
the patch check the existence of peer before trying to do the pad.
Fixes: 969e50b61a ("net: Pad short frames to minimum size before sending from SLiRP/TAP")
Signed-off-by: Jason Wang <jasowang@redhat.com>
Reviewed-by: Bin Meng <bmeng.cn@gmail.com>
Reviewed-by: Stefan Weil <sw@weilnetz.de>
Message-id: 20210423031803.1479-1-jasowang@redhat.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Commit 969e50b61a ("net: Pad short frames to minimum size before
sending from SLiRP/TAP") tries to pad frames but try to recyle the
local array that is used for padding to tap thread. This patch fixes
this by recyling the original buffer.
Fixes: 969e50b61a ("net: Pad short frames to minimum size before sending from SLiRP/TAP")
Tested-by: Howard Spoelstra <hsp.cat7@gmail.com>
Tested-by: Bin Meng <bmeng.cn@gmail.com>
Reviewed-by: Bin Meng <bmeng.cn@gmail.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
Several issues has been reported for query-netdev series. Consider
it's late in the rc, this reverts commit
d32ad10a14.
Signed-off-by: Jason Wang <jasowang@redhat.com>
Several issues has been reported for query-netdev info
series. Consider it's late in the rc, this reverts commit
commit 59b5437eb7.
Signed-off-by: Jason Wang <jasowang@redhat.com>
Several issues has been reported for query-netdev info
series. Consider it's late in the rc, this reverts commit
a0724776c5.
Signed-off-by: Jason Wang <jasowang@redhat.com>
Several issues has been reported for query-netdev info
series. Consider it's late in the rc, this reverts commit
f2e8319d45.
Signed-off-by: Jason Wang <jasowang@redhat.com>
To simplify the function body, invert the if() statement, returning
earlier.
Since we already checked there is enough data in the iovec buffer,
simply add an assert() call to consume the bytes_read variable.
Reviewed-by: Stefano Garzarella <sgarzare@redhat.com>
Reviewed-by: Miroslav Rezanina <mrezanin@redhat.com>
Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
We want to check fields from ip6_ext_hdr_routing structure
and if correct read the full in6_address. Let's directly check
if our iovec contains enough data for everything, else return
early.
Suggested-by: Stefano Garzarella <sgarzare@redhat.com>
Reviewed-by: Miroslav Rezanina <mrezanin@redhat.com>
Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
Reviewed-by: Stefano Garzarella <sgarzare@redhat.com>
Reviewed-by: Miroslav Rezanina <mrezanin@redhat.com>
Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
The 'offset' argument represents the offset to the ip6_ext_hdr
header, rename it as 'ext_hdr_offset'.
Reviewed-by: Stefano Garzarella <sgarzare@redhat.com>
Reviewed-by: Miroslav Rezanina <mrezanin@redhat.com>
Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
The length field is already contained in the ip6_ext_hdr structure.
Check it direcly in eth_parse_ipv6_hdr() before calling
_eth_get_rss_ex_dst_addr(), which gets a bit simplified.
Reviewed-by: Miroslav Rezanina <mrezanin@redhat.com>
Reviewed-by: Stefano Garzarella <sgarzare@redhat.com>
Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
The in6_address comes after the ip6_ext_hdr_routing header,
not after the ip6_ext_hdr one. Fix the offset.
Cc: qemu-stable@nongnu.org
Reported-by: Stefano Garzarella <sgarzare@redhat.com>
Fixes: eb700029c7 ("net_pkt: Extend packet abstraction as required by e1000e functionality")
Reviewed-by: Miroslav Rezanina <mrezanin@redhat.com>
Reviewed-by: Stefano Garzarella <sgarzare@redhat.com>
Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
g_queue_remove needs to look up the list entry first, but we
already have it as result and can remove it directly with
g_queue_delete_link.
Signed-off-by: Lukas Straub <lukasstraub2@web.de>
Signed-off-by: Zhang Chen <chen.zhang@intel.com>
Reviewed-by: Zhang Chen <chen.zhang@intel.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
Additional to removing the packet from the secondary queue,
we also need to free it.
Signed-off-by: Lukas Straub <lukasstraub2@web.de>
Signed-off-by: Zhang Chen <chen.zhang@intel.com>
Reviewed-by: Zhang Chen <chen.zhang@intel.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
The minimum Ethernet frame length is 60 bytes. For short frames with
smaller length like ARP packets (only 42 bytes), on a real world NIC
it can choose either padding its length to the minimum required 60
bytes, or sending it out directly to the wire. Such behavior can be
hardcoded or controled by a register bit. Similarly on the receive
path, NICs can choose either dropping such short frames directly or
handing them over to software to handle.
On the other hand, for the network backends like SLiRP/TAP, they
don't expose a way to control the short frame behavior. As of today
they just send/receive data from/to the other end connected to them,
which means any sized packet is acceptable. So they can send and
receive short frames without any problem. It is observed that ARP
packets sent from SLiRP/TAP are 42 bytes, and SLiRP/TAP just send
these ARP packets to the other end which might be a NIC model that
does not allow short frames to pass through.
To provide better compatibility, for packets sent from QEMU network
backends like SLiRP/TAP, we change to pad short frames before sending
it out to the other end, if the other end does not forbid it via the
nc->do_not_pad flag. This ensures a backend as an Ethernet sender
does not violate the spec. But with this change, the behavior of
dropping short frames from SLiRP/TAP interfaces in the NIC model
cannot be emulated because it always receives a packet that is spec
complaint. The capability of sending short frames from NIC models is
still supported and short frames can still pass through SLiRP/TAP.
This commit should be able to fix the issue as reported with some
NIC models before, that ARP requests get dropped, preventing the
guest from becoming visible on the network. It was workarounded in
these NIC models on the receive path, that when a short frame is
received, it is padded up to 60 bytes.
The following 2 commits seem to be the one to workaround this issue
in e1000 and vmxenet3 before, and should probably be reverted.
commit 78aeb23ede ("e1000: Pad short frames to minimum size (60 bytes)")
commit 40a87c6c9b ("vmxnet3: Pad short frames to minimum size (60 bytes)")
Signed-off-by: Bin Meng <bmeng.cn@gmail.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
Add a helper to pad a short Ethernet frame to the minimum required
length, which can be used by backends' code.
Signed-off-by: Bin Meng <bmeng.cn@gmail.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
As we use QAPI NetClientState->stored_config to store and get information
about backend network devices, we can drop fill of legacy field info_str
for them.
We still use info_str field for NIC and hubports, so we can not completely
remove it.
Signed-off-by: Alexey Kirillov <lekiravi@yandex-team.ru>
Signed-off-by: Jason Wang <jasowang@redhat.com>
Replace usage of legacy field info_str of NetClientState for backend
network devices with QAPI NetdevInfo stored_config that already used
in QMP query-netdev.
This change increases the detail of the "info network" output and takes
a more general approach to composing the output.
NIC and hubports still use legacy info_str field.
Signed-off-by: Alexey Kirillov <lekiravi@yandex-team.ru>
Signed-off-by: Jason Wang <jasowang@redhat.com>
The info_str field of the NetClientState structure is static and has a size
of 256 bytes. This amount is often unclaimed, and the field itself is used
exclusively for HMP "info network".
The patch translates info_str to dynamic memory allocation.
This action is also allows us to painlessly discard usage of this field
for backend devices.
Signed-off-by: Alexey Kirillov <lekiravi@yandex-team.ru>
Signed-off-by: Jason Wang <jasowang@redhat.com>
The query-netdev command is used to get the configuration of the current
network device backends (netdevs).
This is the QMP analog of the HMP command "info network" but only for
netdevs (i.e. excluding NIC and hubports).
The query-netdev command returns an array of objects of the NetdevInfo
type, which are an extension of Netdev type. It means that response can
be used for netdev-add after small modification. This can be useful for
recreate the same netdev configuration.
Information about the network device is filled in when it is created or
modified and is available through the NetClientState->stored_config.
Signed-off-by: Alexey Kirillov <lekiravi@yandex-team.ru>
Acked-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
Some NIC supports loopback mode and this is done by calling
nc->info->receive() directly which in fact suppresses the effort of
reentrancy check that is done in qemu_net_queue_send().
Unfortunately we can't use qemu_net_queue_send() here since for
loopback there's no sender as peer, so this patch introduce a
qemu_receive_packet() which is used for implementing loopback mode
for a NIC with this check.
NIC that supports loopback mode will be converted to this helper.
This is intended to address CVE-2021-3416.
Cc: Prasad J Pandit <ppandit@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Cc: qemu-stable@nongnu.org
Signed-off-by: Jason Wang <jasowang@redhat.com>
When a network or network device is created from the command line or HMP,
QemuOpts ensures that the id passes the id_wellformed check. However,
QMP skips this:
$ qemu-system-x86_64 -qmp stdio -S -nic user,id=123/456
qemu-system-x86_64: -nic user,id=123/456: Parameter id expects an identifier
Identifiers consist of letters, digits, -, ., _, starting with a letter.
$ qemu-system-x86_64 -qmp stdio -S
{"execute":"qmp_capabilities"}
{"return": {}}
{"execute":"netdev_add", "arguments": {"type": "user", "id": "123/456"}}
{"return": {}}
After:
$ qemu-system-x86_64 -qmp stdio -S
{"execute":"qmp_capabilities"}
{"return": {}}
{"execute":"netdev_add", "arguments": {"type": "user", "id": "123/456"}}
{"error": {"class": "GenericError", "desc": "Parameter "id" expects an identifier"}}
Validity checks should be performed always at the bottom of the call chain,
because QMP skips all the steps above. At the same time we know that every
call chain should go through either QMP or (for legacy) through QemuOpts.
Because the id for -net and -nic is automatically generated and not
well-formed by design, just add the check to QMP.
Cc: Jason Wang <jasowang@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
"qemu-common.h" should be included to provide the forward declaration
of qemu_hexdump() when DEBUG_NET is on.
Signed-off-by: Bin Meng <bin.meng@windriver.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
The 'running' argument from VMChangeStateHandler does not require
other value than 0 / 1. Make it a plain boolean.
Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Acked-by: David Gibson <david@gibson.dropbear.id.au>
Message-Id: <20210111152020.1422021-3-philmd@redhat.com>
Signed-off-by: Laurent Vivier <laurent@vivier.eu>
We already got a global function called id_generate() to create unique
IDs within QEMU. Let's use it in the network subsytem, too, instead of
inventing our own ID scheme here.
Signed-off-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-Id: <20210215090225.1046239-1-thuth@redhat.com>
Signed-off-by: Laurent Vivier <laurent@vivier.eu>
There are 23 files that include the "sysemu/qtest.h",
but they do not use any qtest functions.
Signed-off-by: Chen Qun <kuhn.chenqun@huawei.com>
Acked-by: Markus Armbruster <armbru@redhat.com>
Message-Id: <20210226081414.205946-1-kuhn.chenqun@huawei.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
DNS should be DHCP
Signed-off-by: Doug Evans <dje@google.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Reviewed-by: Samuel Thibault <samuel.thibault@ens-lyon.org>
Message-Id: <20210122004251.843837-1-dje@google.com>
Signed-off-by: Laurent Vivier <laurent@vivier.eu>
These cases require a bit more thought to review; in each case, the
code was appending to a list, but not with a FOOList **tail variable.
Signed-off-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Message-Id: <20210113221013.390592-6-eblake@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
[Flawed change to qmp_guest_network_get_interfaces() dropped]
Signed-off-by: Markus Armbruster <armbru@redhat.com>
On first glance, the loop in qmp_query_rx_filter() has early return
paths that could leak any allocation of filter_list from a previous
iteration. But on closer inspection, it is obvious that all of the
early exits are guarded by has_name, and that the bulk of the loop
body can be executed at most once if the user is filtering by name,
thus, any early exit coincides with an empty list. Add asserts to
make this obvious.
Signed-off-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Message-Id: <20210113221013.390592-2-eblake@redhat.com>
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
At present net_checksum_calculate() blindly calculates all types of
checksums (IP, TCP, UDP). Some NICs may have a per type setting in
their BDs to control what checksum should be offloaded. To support
such hardware behavior, introduce a 'csum_flag' parameter to the
net_checksum_calculate() API to allow fine control over what type
checksum is calculated.
Existing users of this API are updated accordingly.
Signed-off-by: Bin Meng <bin.meng@windriver.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
At present net_checksum_calculate() only calculates TCP/UDP checksum
in an IP packet, but assumes the IP header checksum to be provided
by the software, e.g.: Linux kernel always calculates the IP header
checksum. However this might not always be the case, e.g.: for an IP
checksum offload enabled stack like VxWorks, the IP header checksum
can be zero.
This adds the checksum calculation of the IP header.
Signed-off-by: Guishan Qin <guishan.qin@windriver.com>
Signed-off-by: Yabing Liu <yabing.liu@windriver.com>
Signed-off-by: Bin Meng <bin.meng@windriver.com>
Reviewed-by: Cédric Le Goater <clg@kaod.org>
Signed-off-by: Jason Wang <jasowang@redhat.com>
To calculate the TCP/UDP checksum we need the whole datagram. Unless
the hardware has some logic to collect all fragments before sending
the whole datagram first, it can only be done by the network stack,
which is normally the case for the NICs we have seen so far.
Skip these fragmented IP packets to avoid checksum corruption.
Signed-off-by: Markus Carlstedt <markus.carlstedt@windriver.com>
Signed-off-by: Bin Meng <bin.meng@windriver.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
CLI -netdev accumulates in option group "netdev".
Before commit 08712fcb85 "net: Track netdevs in NetClientState rather
than QemuOpt", netdev_add added to the option group, and netdev_del
removed from it, both HMP and QMP. Thus, every netdev had a
corresponding QemuOpts in this option group.
Commit 08712fcb85 dropped this for QMP netdev_add and both netdev_del.
Now a netdev has a corresponding QemuOpts only when it was created
with CLI or HMP. Two issues:
* QMP and HMP netdev_del can leave QemuOpts behind, breaking HMP
netdev_add. Reproducer:
$ qemu-system-x86_64 -S -display none -nodefaults -monitor stdio
QEMU 5.1.92 monitor - type 'help' for more information
(qemu) netdev_add user,id=net0
(qemu) info network
net0: index=0,type=user,net=10.0.2.0,restrict=off
(qemu) netdev_del net0
(qemu) info network
(qemu) netdev_add user,id=net0
upstream-qemu: Duplicate ID 'net0' for netdev
Try "help netdev_add" for more information
Fix by restoring the QemuOpts deletion in qmp_netdev_del(), but with
a guard, because the QemuOpts need not exist.
* QMP netdev_add loses its "no duplicate ID" check. Reproducer:
$ qemu-system-x86_64 -S -display none -qmp stdio
{"QMP": {"version": {"qemu": {"micro": 92, "minor": 1, "major": 5}, "package": "v5.2.0-rc2-1-g02c1f0142c"}, "capabilities": ["oob"]}}
{"execute": "qmp_capabilities"}
{"return": {}}
{"execute": "netdev_add", "arguments": {"type": "user", "id":"net0"}}
{"return": {}}
{"execute": "netdev_add", "arguments": {"type": "user", "id":"net0"}}
{"return": {}}
Fix by adding a duplicate ID check to net_client_init1() to replace
the lost one. The check is redundant for callers where QemuOpts
still checks, i.e. for CLI and HMP.
Reported-by: Andrew Melnichenko <andrew@daynix.com>
Fixes: 08712fcb85
Cc: qemu-stable@nongnu.org
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
This commit is the result of running the timer-del-timer-free.cocci
script on the whole source tree.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Acked-by: Corey Minyard <cminyard@mvista.com>
Acked-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20201215154107.3255-4-peter.maydell@linaro.org
Instance properties make introspection hard and are not shown by
"-object ...,help". Convert them to class properties.
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Message-Id: <20201111183823.283752-12-ehabkost@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Trivial code reordering in some filter backends, to make the next
changes easier to review.
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Message-Id: <20201111183823.283752-11-ehabkost@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Instance properties make introspection hard and are not shown by
"-object ...,help". Convert them to class properties.
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Tested-by: Pavel Pisa <pisa@cmp.felk.cvut.cz>
Message-Id: <20201111183823.283752-9-ehabkost@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Providing the 'if' property, but not 'canbus' segfaults like this:
#0 0x0000555555b0f14d in can_bus_insert_client (bus=0x0, client=0x555556aa9af0) at ../net/can/can_core.c:88
#1 0x00005555559c3803 in can_host_connect (ch=0x555556aa9ac0, errp=0x7fffffffd568) at ../net/can/can_host.c:62
#2 0x00005555559c386a in can_host_complete (uc=0x555556aa9ac0, errp=0x7fffffffd568) at ../net/can/can_host.c:72
#3 0x0000555555d52de9 in user_creatable_complete (uc=0x555556aa9ac0, errp=0x7fffffffd5c8) at ../qom/object_interfaces.c:23
Add the missing NULL check.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Message-Id: <20201130105615.21799-5-kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Fixes: 63c4db4c2e (net: relocate paths to helpers and scripts)
Signed-off-by: Keqian Zhu <zhukeqian1@huawei.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
Close fd before returning.
Buglink: https://bugs.launchpad.net/qemu/+bug/1904486
Signed-off-by: yuanjungong <ruc_gongyuanjun@163.com>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Jason Wang <jasowang@redhat.com>
https://bugzilla.redhat.com/show_bug.cgi?id=1829272
When deleting queue pair, purge pending RX packets if any.
Example of problematic flow:
1. Bring up q35 VM with tap (vhost off) and virtio-net or e1000e
2. Run ping flood to the VM NIC ( 1 ms interval)
3. Hot unplug the NIC device (device_del)
During unplug process one or more packets come, the NIC
can't receive, tap disables read_poll
4. Hot plug the device (device_add) with the same netdev
The tap stays with read_poll disabled and does not receive
any packets anymore (tap_send never triggered)
Signed-off-by: Yuri Benditovich <yuri.benditovich@daynix.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>