Commit Graph

30 Commits

Author SHA1 Message Date
Anthony Harivel
95fa0c79a0 qio: add support for SO_PEERCRED for socket channel
The function qio_channel_get_peercred() returns a pointer to the
credentials of the peer process connected to this socket.

This credentials structure is defined in <sys/socket.h> as follows:

struct ucred {
	pid_t pid;    /* Process ID of the sending process */
	uid_t uid;    /* User ID of the sending process */
	gid_t gid;    /* Group ID of the sending process */
};

The use of this function is possible only for connected AF_UNIX stream
sockets and for AF_UNIX stream and datagram socket pairs.

On platform other than Linux, the function return 0.

Signed-off-by: Anthony Harivel <aharivel@redhat.com>
Link: https://lore.kernel.org/r/20240522153453.1230389-2-aharivel@redhat.com
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2024-07-22 13:47:41 +02:00
Nikolay Borisov
f1cfe39418 io: Add generic pwritev/preadv interface
Introduce basic pwritev/preadv support in the generic channel layer.
Specific implementation will follow for the file channel as this is
required in order to support migration streams with fixed location of
each ram page.

Signed-off-by: Nikolay Borisov <nborisov@suse.com>
Reviewed-by: "Daniel P. Berrangé" <berrange@redhat.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Fabiano Rosas <farosas@suse.de>
Link: https://lore.kernel.org/r/20240229153017.2221-4-farosas@suse.de
Signed-off-by: Peter Xu <peterx@redhat.com>
2024-03-01 15:42:04 +08:00
Stefan Hajnoczi
06e0f098d6 io: follow coroutine AioContext in qio_channel_yield()
The ongoing QEMU multi-queue block layer effort makes it possible for multiple
threads to process I/O in parallel. The nbd block driver is not compatible with
the multi-queue block layer yet because QIOChannel cannot be used easily from
coroutines running in multiple threads. This series changes the QIOChannel API
to make that possible.

In the current API, calling qio_channel_attach_aio_context() sets the
AioContext where qio_channel_yield() installs an fd handler prior to yielding:

  qio_channel_attach_aio_context(ioc, my_ctx);
  ...
  qio_channel_yield(ioc); // my_ctx is used here
  ...
  qio_channel_detach_aio_context(ioc);

This API design has limitations: reading and writing must be done in the same
AioContext and moving between AioContexts involves a cumbersome sequence of API
calls that is not suitable for doing on a per-request basis.

There is no fundamental reason why a QIOChannel needs to run within the
same AioContext every time qio_channel_yield() is called. QIOChannel
only uses the AioContext while inside qio_channel_yield(). The rest of
the time, QIOChannel is independent of any AioContext.

In the new API, qio_channel_yield() queries the AioContext from the current
coroutine using qemu_coroutine_get_aio_context(). There is no need to
explicitly attach/detach AioContexts anymore and
qio_channel_attach_aio_context() and qio_channel_detach_aio_context() are gone.
One coroutine can read from the QIOChannel while another coroutine writes from
a different AioContext.

This API change allows the nbd block driver to use QIOChannel from any thread.
It's important to keep in mind that the block driver already synchronizes
QIOChannel access and ensures that two coroutines never read simultaneously or
write simultaneously.

This patch updates all users of qio_channel_attach_aio_context() to the
new API. Most conversions are simple, but vhost-user-server requires a
new qemu_coroutine_yield() call to quiesce the vu_client_trip()
coroutine when not attached to any AioContext.

While the API is has become simpler, there is one wart: QIOChannel has a
special case for the iohandler AioContext (used for handlers that must not run
in nested event loops). I didn't find an elegant way preserve that behavior, so
I added a new API called qio_channel_set_follow_coroutine_ctx(ioc, true|false)
for opting in to the new AioContext model. By default QIOChannel uses the
iohandler AioHandler. Code that formerly called
qio_channel_attach_aio_context() now calls
qio_channel_set_follow_coroutine_ctx(ioc, true) once after the QIOChannel is
created.

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Acked-by: Daniel P. Berrangé <berrange@redhat.com>
Message-ID: <20230830224802.493686-5-stefanha@redhat.com>
[eblake: also fix migration/rdma.c]
Signed-off-by: Eric Blake <eblake@redhat.com>
2023-09-07 20:32:11 -05:00
Stefan Hajnoczi
acd4be64b8 io: check there are no qio_channel_yield() coroutines during ->finalize()
Callers must clean up their coroutines before calling
object_unref(OBJECT(ioc)) to prevent an fd handler leak. Add an
assertion to check this.

This patch is preparation for the fd handler changes that follow.

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Message-ID: <20230830224802.493686-4-stefanha@redhat.com>
Signed-off-by: Eric Blake <eblake@redhat.com>
2023-09-07 20:32:11 -05:00
Kevin Wolf
7c1f51bf38 nbd/server: Fix drained_poll to wake coroutine in right AioContext
nbd_drained_poll() generally runs in the main thread, not whatever
iothread the NBD server coroutine is meant to run in, so it can't
directly reenter the coroutines to wake them up.

The code seems to have the right intention, it specifies the correct
AioContext when it calls qemu_aio_coroutine_enter(). However, this
functions doesn't schedule the coroutine to run in that AioContext, but
it assumes it is already called in the home thread of the AioContext.

To fix this, add a new thread-safe qio_channel_wake_read() that can be
called in the main thread to wake up the coroutine in its AioContext,
and use this in nbd_drained_poll().

Cc: qemu-stable@nongnu.org
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Message-Id: <20230517152834.277483-3-kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2023-05-19 19:16:53 +02:00
Paolo Bonzini
1dd91b22a6 io: mark mixed functions that can suspend
There should be no paths from a coroutine_fn to aio_poll, however in
practice coroutine_mixed_fn will call aio_poll in the !qemu_in_coroutine()
path.  By marking mixed functions, we can track accurately the call paths
that execute entirely in coroutine context, and find more missing
coroutine_fn markers.  This results in more accurate checks that
coroutine code does not end up blocking.

If the marking were extended transitively to all functions that call
these ones, static analysis could be done much more efficiently.
However, this is a start and makes it possible to use vrc's path-based
searches to find potential bugs where coroutine_fns call blocking functions.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2023-04-20 11:17:35 +02:00
manish.mishra
84615a19dd io: Add support for MSG_PEEK for socket channel
MSG_PEEK peeks at the channel, The data is treated as unread and
the next read shall still return this data. This support is
currently added only for socket class. Extra parameter 'flags'
is added to io_readv calls to pass extra read flags like MSG_PEEK.

Reviewed-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Daniel P. Berrange <berrange@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Suggested-by: Daniel P. Berrange <berrange@redhat.com>
Signed-off-by: manish.mishra <manish.mishra@nutanix.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
2023-02-06 19:22:56 +01:00
Leonardo Bras
b88651cb4d QIOChannel: Add flags on io_writev and introduce io_flush callback
Add flags to io_writev and introduce io_flush as optional callback to
QIOChannelClass, allowing the implementation of zero copy writes by
subclasses.

How to use them:
- Write data using qio_channel_writev*(...,QIO_CHANNEL_WRITE_FLAG_ZERO_COPY),
- Wait write completion with qio_channel_flush().

Notes:
As some zero copy write implementations work asynchronously, it's
recommended to keep the write buffer untouched until the return of
qio_channel_flush(), to avoid the risk of sending an updated buffer
instead of the buffer state during write.

As io_flush callback is optional, if a subclass does not implement it, then:
- io_flush will return 0 without changing anything.

Also, some functions like qio_channel_writev_full_all() were adapted to
receive a flag parameter. That allows shared code between zero copy and
non-zero copy writev, and also an easier implementation on new flags.

Signed-off-by: Leonardo Bras <leobras@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Message-Id: <20220513062836.965425-3-leobras@redhat.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
2022-05-16 13:56:24 +01:00
Jagannathan Raman
c90e3512a4 io: error_prepend() in qio_channel_readv_full_all() causes segfault
Using error_prepend() in qio_channel_readv_full_all() causes a segfault
as errp is not set when ret is 0. This results in the failure of iotest
83. Replacing with error_setg() fixes the problem.

Additionally, removes a full stop at the end of error message

Reported-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Jagannathan Raman <jag.raman@oracle.com>
Fixes: bebab91ebd
  (io: add qio_channel_readv_full_all_eof & qio_channel_readv_full_all helpers)
Message-Id: <be476bcdb99e820fec0fa09fe8f04c9dd3e62473.1613128220.git.jag.raman@oracle.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Signed-off-by: Eric Blake <eblake@redhat.com>
2021-02-12 07:50:59 -06:00
Elena Ufimtseva
bebab91ebd io: add qio_channel_readv_full_all_eof & qio_channel_readv_full_all helpers
Adds qio_channel_readv_full_all_eof() and qio_channel_readv_full_all()
to read both data and FDs. Refactors existing code to use these helpers.

Signed-off-by: Elena Ufimtseva <elena.ufimtseva@oracle.com>
Signed-off-by: John G Johnson <john.g.johnson@oracle.com>
Signed-off-by: Jagannathan Raman <jag.raman@oracle.com>
Acked-by: Daniel P. Berrangé <berrange@redhat.com>
Message-id: b059c4cc0fb741e794d644c144cc21372cad877d.1611938319.git.jag.raman@oracle.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2021-02-10 09:23:28 +00:00
Elena Ufimtseva
bfa4238750 io: add qio_channel_writev_full_all helper
Adds qio_channel_writev_full_all() to transmit both data and FDs.
Refactors existing code to use this helper.

Signed-off-by: Elena Ufimtseva <elena.ufimtseva@oracle.com>
Signed-off-by: John G Johnson <john.g.johnson@oracle.com>
Signed-off-by: Jagannathan Raman <jag.raman@oracle.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Acked-by: Daniel P. Berrangé <berrange@redhat.com>
Message-id: 480fbf1fe4152495d60596c9b665124549b426a5.1611938319.git.jag.raman@oracle.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2021-02-10 09:23:28 +00:00
Chetan Pant
c8198bd5f9 io: Fix Lesser GPL version number
There is no "version 2" of the "Lesser" General Public License.
It is either "GPL version 2.0" or "Lesser GPL version 2.1".
This patch replaces all occurrences of "Lesser GPL version 2" with
"Lesser GPL version 2.1" in comment section.

Signed-off-by: Chetan Pant <chetan4windows@gmail.com>
Acked-by: Daniel P. Berrangé <berrange@redhat.com>
Message-Id: <20201014134033.14095-1-chetan4windows@gmail.com>
Signed-off-by: Laurent Vivier <laurent@vivier.eu>
2020-10-27 16:48:49 +01:00
Markus Armbruster
0b8fa32f55 Include qemu/module.h where needed, drop it from qemu-common.h
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-Id: <20190523143508.25387-4-armbru@redhat.com>
[Rebased with conflicts resolved automatically, except for
hw/usb/dev-hub.c hw/misc/exynos4210_rng.c hw/misc/bcm2835_rng.c
hw/misc/aspeed_scu.c hw/display/virtio-vga.c hw/arm/stm32f205_soc.c;
ui/cocoa.m fixed up]
2019-06-12 13:18:33 +02:00
Kevin Wolf
2a239e6e03 io: Remove redundant read/write_coroutine assignments
qio_channel_yield() now updates ioc->read_write/coroutine and calls
qio_channel_set_aio_fd_handlers(), so the code in the handlers has
become redundant and can be removed.

This does not make a difference in intermediate states because
aio_co_wake() really enters the coroutine immediately here: These
handlers are never run in coroutine context, and we're in the right
AioContext because qio_channel_attach_aio_context() asserts that the
handlers are inactive.

To make these conditions more obvious, assert the right AioContext.

Suggested-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2019-02-25 15:03:19 +01:00
Kevin Wolf
6886ceaf61 io: Make qio_channel_yield() interruptible
Similar to how qemu_co_sleep_ns() allows preemption from an external
coroutine entry, allow reentering qio_channel_yield() early.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2019-02-25 15:03:19 +01:00
Peter Xu
315409c711 qio: introduce qio_channel_add_watch_{full|source}
Firstly, introduce an internal qio_channel_add_watch_full(), which
enhances qio_channel_add_watch() that context can be specified.

Then add a new API wrapper qio_channel_add_watch_source() to return a
GSource pointer rather than a tag ID.

Note that the _source() call will keep a reference of GSource so that
callers need to unref them explicitly when finished using the GSource.

Signed-off-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
2018-03-06 10:19:04 +00:00
Eric Blake
e8ffaa3110 io: Add new qio_channel_read{, v}_all_eof functions
Some callers want to distinguish between clean EOF (no bytes read)
vs. a short read (at least one byte read, but EOF encountered
before reaching the desired length), as it allows clients the
ability to do a graceful shutdown when a server shuts down at
defined safe points in the protocol, rather than treating all
shutdown scenarios as an error due to EOF.  However, we don't want
to require all callers to have to check for early EOF.  So add
another wrapper function that can be used by the callers that care
about the distinction.

Signed-off-by: Eric Blake <eblake@redhat.com>
Message-Id: <20170905191114.5959-3-eblake@redhat.com>
Acked-by: Daniel P. Berrange <berrange@redhat.com>
2017-09-06 10:11:54 -05:00
Eric Blake
9ffb827020 io: Yield rather than wait when already in coroutine
The new qio_channel_{read,write}{,v}_all functions are documented
as yielding until data is available.  When used on a blocking
channel, this yield is done via qio_channel_wait() which spawns
a nested event loop under the hood (so it is that secondary loop
which yields as needed); but if we are already in a coroutine (at
which point QIO_CHANNEL_ERR_BLOCK is only possible if we are a
non-blocking channel), we want to yield the current coroutine
instead of spawning a nested event loop.

Signed-off-by: Eric Blake <eblake@redhat.com>
Message-Id: <20170905191114.5959-2-eblake@redhat.com>
Acked-by: Daniel P. Berrange <berrange@redhat.com>
[commit message updated]
Signed-off-by: Eric Blake <eblake@redhat.com>
2017-09-06 10:10:58 -05:00
Daniel P. Berrange
d4622e5588 io: add new qio_channel_{readv, writev, read, write}_all functions
These functions wait until they are able to read / write the full
requested data buffer(s).

Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
2017-09-05 13:21:58 +01:00
Paolo Bonzini
8f7168b343 io: simplify qio_channel_attach_aio_context
If properly preceded by qio_channel_detach_aio_context, this function really
has nothing to do except setting ioc->ctx.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
2017-05-26 10:38:08 +01:00
Paolo Bonzini
c4c497d27f io: make qio_channel_yield aware of AioContexts
Support separate coroutines for reading and writing, and place the
read/write handlers on the AioContext that the QIOChannel is registered
with.

Reviewed-by: Daniel P. Berrange <berrange@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Message-id: 20170213135235.12274-7-pbonzini@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2017-02-21 11:14:07 +00:00
Paolo Bonzini
bf88c1247f io: add methods to set I/O handlers on AioContext
This is in preparation for making qio_channel_yield work on
AioContexts other than the main one.

Reviewed-by: Daniel P. Berrange <berrange@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Message-id: 20170213135235.12274-6-pbonzini@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2017-02-21 11:14:07 +00:00
Daniel P. Berrange
20f4aa265e io: add ability to set a name for IO channels
The GSource object has ability to have a name, which is useful
when debugging performance problems with the mainloop event
callbacks that take too long. By associating a name with a
QIOChannel object, we can then set the name on any GSource
associated with the channel.

Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
2016-10-27 09:13:10 +02:00
Felipe Franciosi
d8d3c7cc67 io: Introduce a qio_channel_set_feature() helper
Testing QIOChannel feature support can be done with a helper called
qio_channel_has_feature(). Setting feature support, however, was
done manually with a logical OR. This patch introduces a new helper
called qio_channel_set_feature() and makes use of it where applicable.

Signed-off-by: Felipe Franciosi <felipe@nutanix.com>
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
2016-10-26 18:19:53 +02:00
Felipe Franciosi
e413ae0c04 io: Use qio_channel_has_feature() where applicable
Parts of the code have been testing QIOChannel features directly with a
logical AND. This patch makes it all consistent by using the
qio_channel_has_feature() function to test if a feature is present.

Signed-off-by: Felipe Franciosi <felipe@nutanix.com>
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
2016-10-26 18:19:53 +02:00
Paolo Bonzini
0b8b8753e4 coroutine: move entry argument to qemu_coroutine_create
In practice the entry argument is always known at creation time, and
it is confusing that sometimes qemu_coroutine_enter is used with a
non-NULL argument to re-enter a coroutine (this happens in
block/sheepdog.c and tests/test-coroutine.c).  So pass the opaque value
at creation time, for consistency with e.g. aio_bh_new.

Mostly done with the following semantic patch:

@ entry1 @
expression entry, arg, co;
@@
- co = qemu_coroutine_create(entry);
+ co = qemu_coroutine_create(entry, arg);
  ...
- qemu_coroutine_enter(co, arg);
+ qemu_coroutine_enter(co);

@ entry2 @
expression entry, arg;
identifier co;
@@
- Coroutine *co = qemu_coroutine_create(entry);
+ Coroutine *co = qemu_coroutine_create(entry, arg);
  ...
- qemu_coroutine_enter(co, arg);
+ qemu_coroutine_enter(co);

@ entry3 @
expression entry, arg;
@@
- qemu_coroutine_enter(qemu_coroutine_create(entry), arg);
+ qemu_coroutine_enter(qemu_coroutine_create(entry, arg));

@ reentry @
expression co;
@@
- qemu_coroutine_enter(co, NULL);
+ qemu_coroutine_enter(co);

except for the aforementioned few places where the semantic patch
stumbled (as expected) and for test_co_queue, which would otherwise
produce an uninitialized variable warning.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2016-07-13 13:26:02 +02:00
Markus Armbruster
da34e65cb4 include/qemu/osdep.h: Don't include qapi/error.h
Commit 57cb38b included qapi/error.h into qemu/osdep.h to get the
Error typedef.  Since then, we've moved to include qemu/osdep.h
everywhere.  Its file comment explains: "To avoid getting into
possible circular include dependencies, this file should not include
any other QEMU headers, with the exceptions of config-host.h,
compiler.h, os-posix.h and os-win32.h, all of which are doing a
similar job to this file and are under similar constraints."
qapi/error.h doesn't do a similar job, and it doesn't adhere to
similar constraints: it includes qapi-types.h.  That's in excess of
100KiB of crap most .c files don't actually need.

Add the typedef to qemu/typedefs.h, and include that instead of
qapi/error.h.  Include qapi/error.h in .c files that need it and don't
get it now.  Include qapi-types.h in qom/object.h for uint16List.

Update scripts/clean-includes accordingly.  Update it further to match
reality: replace config.h by config-target.h, add sysemu/os-posix.h,
sysemu/os-win32.h.  Update the list of includes in the qemu/osdep.h
comment quoted above similarly.

This reduces the number of objects depending on qapi/error.h from "all
of them" to less than a third.  Unfortunately, the number depending on
qapi-types.h shrinks only a little.  More work is needed for that one.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
[Fix compilation without the spice devel packages. - Paolo]
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-03-22 22:20:15 +01:00
Paolo Bonzini
a589720567 io: implement socket watch for win32 using WSAEventSelect+select
On Win32 we cannot directly poll on socket handles. Instead we
create a Win32 event object and associate the socket handle with
the event. When the event signals readyness we then have to
use select to determine which events are ready. Creating Win32
events is moderately heavyweight, so we don't want todo it
every time we create a GSource, so this associates a single
event with a QIOChannel.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
2016-03-10 17:19:07 +00:00
Peter Maydell
cae9fc567d io: Clean up includes
Clean up includes so that osdep.h is included first and headers
which it implies are not included manually.

This commit was created with scripts/clean-includes.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 1454089805-5470-14-git-send-email-peter.maydell@linaro.org
2016-02-04 17:41:30 +00:00
Daniel P. Berrange
666a3af9c8 io: add abstract QIOChannel classes
Start the new generic I/O channel framework by defining a
QIOChannel abstract base class. This is designed to feel
similar to GLib's GIOChannel, but with the addition of
support for using iovecs, qemu error reporting, file
descriptor passing, coroutine integration and use of
the QOM framework for easier sub-classing.

The intention is that anywhere in QEMU that almost
anywhere that deals with sockets will use this new I/O
infrastructure, so that it becomes trivial to then layer
in support for TLS encryption. This will at least include
the VNC server, char device backend and migration code.

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
2015-12-18 12:18:05 +00:00