Current code uses some fields combinatorially to indicate the state of
a s390 pci device. This patch introduces device states in order to make
the code more readable and more logical.
Signed-off-by: Yi Min Zhao <zyimin@linux.vnet.ibm.com>
Reviewed-by: Pierre Morel <pmorel@linux.vnet.ibm.com>
Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Currently, common base layers virtual css bridge and bus are
defined in hw/s390x/virtio-ccw.c(h). In order to support
multiple types of devices in the virtual channel subsystem,
especially non virtio-ccw, refactoring work needs to be done.
This work is just a pure code move without any functional change
except dropping an empty function virtual_css_bridge_init() and
virtio_ccw_busdev_unplug() changing. virtio_ccw_busdev_unplug()
is specific to virtio-ccw but gets referenced from the common
virtual css bridge code. To keep the functional changes to a
minimum we export this function from virtio-ccw.c and continue
to reference it inside virtual_css_bridge_class_init()
(now living in hw/s390x/css-bridge.c). A follow-up patch will
clean this up.
Signed-off-by: Jing Liu <liujbjl@linux.vnet.ibm.com>
Reviewed-by: Sascha Silbe <silbe@linux.vnet.ibm.com>
Reviewed-by: Dong Jia Shi <bjsdjshi@linux.vnet.ibm.com>
Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
A lot of what virtio_ccw_device_realize() does isn't specific to
virtio; it would apply to emulated CCW as well. Factor it out to make
it easier to implement emulated CCW devices later on.
Signed-off-by: Sascha Silbe <silbe@linux.vnet.ibm.com>
Reviewed-by: Dong Jia Shi <bjsdjshi@linux.vnet.ibm.com>
Reviewed-by: Halil Pasic <pasic@linux.vnet.ibm.com>
Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
In various Freescale SOCs, the GPT timers can be configured to select
its input clock.
Depending on the SOC the set of available input clocks may vary.
The actual single GPT definition was no good enough and because of it
booting the sabrelite board with a i.MX6DL device tree would fail
because of an incorrect input clock definition for the i.MX6DL SOC.
This patch fixes the i.MX6DL boot failure by adding the ability to
define a different set of input clocks depending on the considered SOC.
A different class has been defined for i.MX25, i.MX31 and i.MX6 each with
its specific set of input clocks.
The patch has been tested by booting KZM, i.MX25 PDK, i.MX6Q sabrelite
and i.MX6DL sabrelite.
Signed-off-by: Jean-Christophe Dubois <jcd@tribudubois.net>
Message-id: 1467325619-8374-1-git-send-email-jcd@tribudubois.net
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
[PMM: fixed spacing round '/' operator]
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
On Windows 'aux.*' is a reserved name and cannot be used for
filenames; see
https://msdn.microsoft.com/en-gb/library/windows/desktop/aa365247(v=vs.85).aspx
This prevents cloning the QEMU git repo on Windows:
C:\Java\sources\kvm> git clone https://github.com/qemu/qemu.git
Cloning into 'qemu'...
remote: Counting objects: 279563, done.
remote: Total 279563 (delta 0), reused 0 (delta 0), pack-reused 279563R
Receiving objects: 100% (279563/279563), 122.45 MiB | 3.52 MiB/s, done.
Resolving deltas: 100% (221942/221942), done.
Checking connectivity... done.
error: unable to create file hw/misc/aux.c (No such file or directory)
error: unable to create file include/hw/misc/aux.h (No such file or directory)
Checking out files: 100% (4795/4795), done.
fatal: unable to checkout working tree
warning: Clone succeeded, but checkout failed.
You can inspect what was checked out with 'git status'
and retry the checkout with 'git checkout -f HEAD'
(bug https://bugs.launchpad.net/bugs/1595240)
Rename the offending files for the benefit of Windows.
Reported-by: Алексей Курган <akurgan@yandex.ru>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Wei Huang <wei@redhat.com>
Tested-by: KONRAD Frederic <fred.konrad@greensocs.com>
Message-id: 1467377145-32385-1-git-send-email-peter.maydell@linaro.org
This patch add the capability of basic vhost net busy polling which is
supported by recent kernel. User could configure the maximum number of
us that could be spent on busy polling through a new property of tap
"poll-us".
Cc: Greg Kurz <groug@kaod.org>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1
iQIcBAABAgAGBQJXfMjDAAoJEDhwtADrkYZTyKwP/i5Lva3qhDk9AJhvPNSbnps/
sxVGVkUTeUM0UoXdKmDChkk2zScv2AOVxL3oibb22PmU1MvDU+XoKYokae/VucVS
WL16t/Z77QxSJ1yKhXT3i0P7+9sR9Gq2eDHchSSITLreVO8/jNb7u1JjnV4nPyQP
Scg6sCCnfgSw8W4EOz0wDhwXHbRv7obSY0lGUD1R6FifETHQ98rTpII0BS1AyvbF
HIJMcGpirAe62BGNQeJv933mrD1EgRfBytQAkEL/l6DDlz/xk9AE26bsjdP5Xa9A
MwEBje9CPm0ICleqq1ZVKNQQqng2PVfXXXdJYKQKVnIHL+REzRDKbTGTlS4njdVi
v2i4E7MwDPIf86zRUXJ6eUkWDHgld4of2WZmoxNPZmWWmHMb31iqGcb0//mYmP06
QOPd/uxpxg9pze4JeQMhzHCbFum2/s43cyqaH24lrwhTw3EsI45t4sgJQTG1NIFt
ZdPXVwNBfa6ur50dsRWHYonQimpofW94YYupWLggiVGKTzvfSWVQJGc9Ho65P23y
ZJrkbN3bWNnwVFzIbWFI5TKnE/TsnWvTPwk+9cGSP4MKf1sy6t5nIycoTiP3vC9x
PesENYYKy5ZkOkKQpC6lGwjkMl24ii3lVl5g4vWXXjtZHwGNmq/7/UJ7Dm4htqIk
B5gm5gQ/2Tul4Dihlteb
=BgpV
-----END PGP SIGNATURE-----
Merge remote-tracking branch 'remotes/armbru/tags/pull-qapi-2016-07-06' into staging
QAPI patches for 2016-07-06
# gpg: Signature made Wed 06 Jul 2016 10:00:51 BST
# gpg: using RSA key 0x3870B400EB918653
# gpg: Good signature from "Markus Armbruster <armbru@redhat.com>"
# gpg: aka "Markus Armbruster <armbru@pond.sub.org>"
# Primary key fingerprint: 354B C8B3 D7EB 2A6B 6867 4E5F 3870 B400 EB91 8653
* remotes/armbru/tags/pull-qapi-2016-07-06:
replay: Use new QAPI cloning
sockets: Use new QAPI cloning
qapi: Add new clone visitor
qapi: Add new visit_complete() function
tests: Factor out common code in qapi output tests
tests: Clean up test-string-output-visitor
qmp-output-visitor: Favor new visit_free() function
string-output-visitor: Favor new visit_free() function
qmp-input-visitor: Favor new visit_free() function
string-input-visitor: Favor new visit_free() function
opts-visitor: Favor new visit_free() function
qapi: Add new visit_free() function
qapi: Add parameter to visit_end_*
qemu-img: Don't leak errors when outputting JSON
qapi: Improve use of qmp/types.h
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Rather than rolling our own clone via an expensive conversion
in and back out of QObject, use the new clone visitor.
Signed-off-by: Eric Blake <eblake@redhat.com>
Message-Id: <1465490926-28625-15-git-send-email-eblake@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
We have a couple places in the code base that want to deep-clone
one QAPI object into another, and they were resorting to serializing
the struct out to QObject then reparsing it. A much more efficient
version can be done by adding a new clone visitor.
Since cloning is still relatively uncommon, expose the use of the
new visitor via a QAPI_CLONE() macro that takes care of type-punning
the underlying function pointer, rather than generating lots of
unused functions for types that won't be cloned. And yes, we're
relying on the compiler treating all pointers equally, even though
a strict C program cannot portably do so - but we're not the first
one in the qemu code base to expect it to work (hello, glib!).
The choice of adding a fourth visitor type deserves some explanation.
On the surface, the clone visitor is mostly an input visitor (it
takes arbitrary input - in this case, another QAPI object - and
creates a new QAPI object during the course of the visit). But
ever since commit da72ab0 consolidated enum visits based on the
visitor type, using VISITOR_INPUT would cause us to run
visit_type_str(), even though for cloning there is nothing to do
(we just copy the enum value across, without regards to its mapping
to strings). Also, since our input happens to be a QAPI object,
we can also satisfy the internal checks for VISITOR_OUTPUT. So in
the end, I settled with a new VISITOR_CLONE, and chose its value
such that many internal checks can use 'v->type & mask', sticking
to 'v->type == value' where the difference matters.
Note that we can only clone objects (including alternates) and lists,
not built-ins or enums. The visitor core hides integer width from
the actual visitor (since commit 04e070d), and as long as that's the
case, we can't clone top-level integers. Then again, those can
always be cloned by direct copy, since they are not objects with
deep pointers, so it's no real loss. And restricting cloning to
just objects and lists is cleaner than restricting it to non-integers.
As such, I documented that the clone visitor is for direct use only
by code internal to QAPI, and should not be used on incomplete objects
(other than a hack to work around the fact that we allow NULL in place
of "" in visit_type_str() in other output visitors). Note that as
written, the clone visitor will never fail on a complete object.
Scalars (including enums) not at the root of the clone copy just fine
with no additional effort while visiting the scalar, by virtue of a
g_memdup() each time we push another struct onto the stack. Cloning
a string requires deduplication of a pointer, which means it can also
provide the guarantee of an input visitor of never producing NULL
even when still accepting NULL in place of "" the way the QMP output
visitor does.
Cloning an 'any' type could be possible by incrementing the QObject
refcnt, but it's not obvious whether that is better than implementing
a QObject deep clone. So for now, we document it as unsupported,
and intentionally omit the .type_any() callback to let a developer
know their usage needs implementation.
Add testsuite coverage for several different clone situations, to
ensure that the code is working. I also tested that valgrind was
happy with the test.
Signed-off-by: Eric Blake <eblake@redhat.com>
Message-Id: <1465490926-28625-14-git-send-email-eblake@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Making each output visitor provide its own output collection
function was the only remaining reason for exposing visitor
sub-types to the rest of the code base. Add a polymorphic
visit_complete() function which is a no-op for input visitors,
and which populates an opaque pointer for output visitors. For
maximum type-safety, also add a parameter to the output visitor
constructors with a type-correct version of the output pointer,
and assert that the two uses match.
This approach was considered superior to either passing the
output parameter only during construction (action at a distance
during visit_free() feels awkward) or only during visit_complete()
(defeating type safety makes it easier to use incorrectly).
Most callers were function-local, and therefore a mechanical
conversion; the testsuite was a bit trickier, but the previous
cleanup patch minimized the churn here.
The visit_complete() function may be called at most once; doing
so lets us use transfer semantics rather than duplication or
ref-count semantics to get the just-built output back to the
caller, even though it means our behavior is not idempotent.
Generated code is simplified as follows for events:
|@@ -26,7 +26,7 @@ void qapi_event_send_acpi_device_ost(ACP
| QDict *qmp;
| Error *err = NULL;
| QMPEventFuncEmit emit;
|- QmpOutputVisitor *qov;
|+ QObject *obj;
| Visitor *v;
| q_obj_ACPI_DEVICE_OST_arg param = {
| info
|@@ -39,8 +39,7 @@ void qapi_event_send_acpi_device_ost(ACP
|
| qmp = qmp_event_build_dict("ACPI_DEVICE_OST");
|
|- qov = qmp_output_visitor_new();
|- v = qmp_output_get_visitor(qov);
|+ v = qmp_output_visitor_new(&obj);
|
| visit_start_struct(v, "ACPI_DEVICE_OST", NULL, 0, &err);
| if (err) {
|@@ -55,7 +54,8 @@ void qapi_event_send_acpi_device_ost(ACP
| goto out;
| }
|
|- qdict_put_obj(qmp, "data", qmp_output_get_qobject(qov));
|+ visit_complete(v, &obj);
|+ qdict_put_obj(qmp, "data", obj);
| emit(QAPI_EVENT_ACPI_DEVICE_OST, qmp, &err);
and for commands:
| {
| Error *err = NULL;
|- QmpOutputVisitor *qov = qmp_output_visitor_new();
| Visitor *v;
|
|- v = qmp_output_get_visitor(qov);
|+ v = qmp_output_visitor_new(ret_out);
| visit_type_AddfdInfo(v, "unused", &ret_in, &err);
|- if (err) {
|- goto out;
|+ if (!err) {
|+ visit_complete(v, ret_out);
| }
|- *ret_out = qmp_output_get_qobject(qov);
|-
|-out:
| error_propagate(errp, err);
Signed-off-by: Eric Blake <eblake@redhat.com>
Message-Id: <1465490926-28625-13-git-send-email-eblake@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Now that we have a polymorphic visit_free(), we no longer need
qmp_output_visitor_cleanup(); however, we still need to
expose the subtype for qmp_output_get_qobject().
Signed-off-by: Eric Blake <eblake@redhat.com>
Message-Id: <1465490926-28625-10-git-send-email-eblake@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Now that we have a polymorphic visit_free(), we no longer need
string_output_visitor_cleanup(); however, we still need to
expose the subtype for string_output_get_string().
Signed-off-by: Eric Blake <eblake@redhat.com>
Message-Id: <1465490926-28625-9-git-send-email-eblake@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Now that we have a polymorphic visit_free(), we no longer need
qmp_input_visitor_cleanup(); which in turn means we no longer
need to return a subtype from qmp_input_visitor_new() nor a
public upcast function.
Generated code changes to qmp-marshal.c look like:
|@@ -52,11 +52,10 @@ void qmp_marshal_add_fd(QDict *args, QOb
| {
| Error *err = NULL;
| AddfdInfo *retval;
|- QmpInputVisitor *qiv = qmp_input_visitor_new(QOBJECT(args), true);
| Visitor *v;
| q_obj_add_fd_arg arg = {0};
|
|- v = qmp_input_get_visitor(qiv);
|+ v = qmp_input_visitor_new(QOBJECT(args), true);
| visit_start_struct(v, NULL, NULL, 0, &err);
| if (err) {
| goto out;
Signed-off-by: Eric Blake <eblake@redhat.com>
Message-Id: <1465490926-28625-8-git-send-email-eblake@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Now that we have a polymorphic visit_free(), we no longer need
string_input_visitor_cleanup(); which in turn means we no longer
need to return a subtype from string_input_visitor_new() nor a
public upcast function.
Signed-off-by: Eric Blake <eblake@redhat.com>
Message-Id: <1465490926-28625-7-git-send-email-eblake@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Now that we have a polymorphic visit_free(), we no longer need
opts_visitor_cleanup(); which in turn means we no longer need
to return a subtype from opts_visitor_new() nor a public upcast
function.
Signed-off-by: Eric Blake <eblake@redhat.com>
Message-Id: <1465490926-28625-6-git-send-email-eblake@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Making each visitor provide its own (awkwardly-named) FOO_cleanup()
is unusual, when we can instead have a polymorphic visit_free()
interface. Over the next few patches, we can use the polymorphic
functions to eliminate the need for a FOO_get_visitor() function
for accessing specific visitor functionality, once everything can
be accessed directly through the Visitor* interfaces.
The dealloc visitor is the first one converted to completely use
the new entry point, since qapi_dealloc_visitor_cleanup() was the
only reason that qapi_dealloc_get_visitor() existed, and only
generated and testsuite code was even using it. With the new
visit_free() entry point in place, we no longer need to expose
the QapiDeallocVisitor subtype through qapi_dealloc_visitor_new(),
and can get by with less generated code, with diffs that look like:
| void qapi_free_ACPIOSTInfo(ACPIOSTInfo *obj)
| {
|- QapiDeallocVisitor *qdv;
| Visitor *v;
|
| if (!obj) {
| return;
| }
|
|- qdv = qapi_dealloc_visitor_new();
|- v = qapi_dealloc_get_visitor(qdv);
|+ v = qapi_dealloc_visitor_new();
| visit_type_ACPIOSTInfo(v, NULL, &obj, NULL);
|- qapi_dealloc_visitor_cleanup(qdv);
|+ visit_free(v);
|}
Signed-off-by: Eric Blake <eblake@redhat.com>
Message-Id: <1465490926-28625-5-git-send-email-eblake@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Rather than making the dealloc visitor track of stack of pointers
remembered during visit_start_* in order to free them during
visit_end_*, it's a lot easier to just make all callers pass the
same pointer to visit_end_*. The generated code has access to the
same pointer, while all other users are doing virtual walks and
can pass NULL. The dealloc visitor is then greatly simplified.
All three visit_end_*() functions intentionally take a void**,
even though the visit_start_*() functions differ between void**,
GenericList**, and GenericAlternate**. This is done for several
reasons: when doing a virtual walk, passing NULL doesn't care
what the type is, but when doing a generated walk, we already
have to cast the caller's specific FOO* to call visit_start,
while using void** lets us use visit_end without a cast. Also,
an upcoming patch will add a clone visitor that wants to use
the same implementation for all three visit_end callbacks,
which is made easier if all three share the same signature.
For visitors with already track per-object state (the QMP visitors
via a stack, and the string visitors which do not allow nesting),
add an assertion that the caller is indeed passing the same
pointer to paired calls.
Signed-off-by: Eric Blake <eblake@redhat.com>
Message-Id: <1465490926-28625-4-git-send-email-eblake@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
'qjson.h' is not a QObject subtype; include this file directly in
.c files that are using it, rather than abusing qmp/types.h for
that purpose.
Meanwhile, for files that include a list of individual QObject
subtypes, it's easier to just use qmp/types.h for that purpose.
Signed-off-by: Eric Blake <eblake@redhat.com>
Message-Id: <1465490926-28625-2-git-send-email-eblake@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Spice client needs the whole GL texture dimension to be able to show a
scanout with a monitor offset (different than +0+0).
Furthermore, this fixes a crash when calling surface_{width,height}()
after dpy_gfx_replace_surface(con, NULL) was called in
virgl_cmd_set_scanout()
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-id: 1465911849-30423-4-git-send-email-marcandre.lureau@redhat.com
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
In virgl_cmd_resource_flush(), when several consoles are updated, it
needs to keep blocking until all spice gl draws are done. This fixes an
assert() in spice when using multiple monitors with virgl.
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-id: 1465911849-30423-2-git-send-email-marcandre.lureau@redhat.com
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Spice deprecated this callback in 0.12.6.
It's not a problem yet, but it will cause Clang to fail in a -Werror
build due to the deprecated tag.
Signed-off-by: John Snow <jsnow@redhat.com>
Message-id: 1467240095-12507-2-git-send-email-jsnow@redhat.com
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Some architectures (e.g. ARMv8) need the address which is aligned
to a size more than the size of the memory access.
To support such check it's enough the current costless alignment
check implementation in QEMU, but we need to support
an alignment size specifying.
Signed-off-by: Sergey Sorokin <afarallax@yandex.ru>
Message-Id: <1466705806-679898-1-git-send-email-afarallax@yandex.ru>
Signed-off-by: Richard Henderson <rth@twiddle.net>
[rth: Assert in tcg_canonicalize_memop. Leave get_alignment_bits
available for, though unused by, user-mode. Retain logging difference
based on ALIGNED_ONLY.]
iommus can not be added with -device.
cleanups and fixes all over the place
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1
iQEcBAABAgAGBQJXe4l4AAoJECgfDbjSjVRpIz4IALye7mKG61/POA4Gqmhalc3d
HnlNSZ2YcKAuvPg7WWkBuRacrQvVY/MbW1mLloG1lY0tdFgZG8Cy+CY6wJg1NE4c
cXd+77vHkIyrnl+Nil+QOgTFiAsMnD+mXHHsnCDw2jGn3JbgVNuCMi7V34fGkQd2
PDkZyYfwTqO3HytuG0/j2Somc9du1gjYdn+9qigfZVgP96jGDojBuJWuuU5flCB3
Kj5xrOuI01XlbdTk71tVjBJBektQurWr6r7GECDqZIpUfc+BI70FU9jPh+OlLTO/
92yi29ncjyStz4tRnf18xoQ8uBgH/tD1xigEUPRtnm1+0i/tgONBL8cAdBF9FBE=
=ABGE
-----END PGP SIGNATURE-----
Merge remote-tracking branch 'remotes/mst/tags/for_upstream' into staging
pc, pci, virtio: new features, cleanups, fixes
iommus can not be added with -device.
cleanups and fixes all over the place
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
# gpg: Signature made Tue 05 Jul 2016 11:18:32 BST
# gpg: using RSA key 0x281F0DB8D28D5469
# gpg: Good signature from "Michael S. Tsirkin <mst@kernel.org>"
# gpg: aka "Michael S. Tsirkin <mst@redhat.com>"
# Primary key fingerprint: 0270 606B 6F3C DF3D 0B17 0970 C350 3912 AFBE 8E67
# Subkey fingerprint: 5D09 FD08 71C8 F85B 94CA 8A0D 281F 0DB8 D28D 5469
* remotes/mst/tags/for_upstream: (30 commits)
vmw_pvscsi: remove unnecessary internal msi state flag
e1000e: remove unnecessary internal msi state flag
vmxnet3: remove unnecessary internal msi state flag
mptsas: remove unnecessary internal msi state flag
megasas: remove unnecessary megasas_use_msi()
pci: Convert msi_init() to Error and fix callers to check it
pci bridge dev: change msi property type
megasas: change msi/msix property type
mptsas: change msi property type
intel-hda: change msi property type
usb xhci: change msi/msix property type
change pvscsi_init_msi() type to void
tests: add APIC.cphp and DSDT.cphp blobs
tests: acpi: add CPU hotplug testcase
log: Permit -dfilter 0..0xffffffffffffffff
range: Replace internal representation of Range
range: Eliminate direct Range member access
log: Clean up misuse of Range for -dfilter
pci_register_bar: cleanup
Revert "virtio-net: unbreak self announcement and guest offloads after migration"
...
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
This is the final patch for converting the common I/O path to take
a BdrvChild parameter instead of BlockDriverState.
The completion of this conversion means that all users that perform I/O
on an image need to actually hold a reference (in the form of BdrvChild,
possible as part of a BlockBackend) to that image. This also protects
against inconsistent use of BlockBackend vs. BlockDriverState functions
because direct use of a BlockDriverState isn't possible any more and
blk->root is private for block-backends.c.
In addition, we can now distinguish different users in the I/O path,
and the future op blockers work is going to add assertions based on
permissions stored in BdrvChild.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Acked-by: Stefan Hajnoczi <stefanha@redhat.com>
Using int for values that are only used as booleans is confusing.
While at it, rearrange a couple of members so that all the bools
are contiguous.
Signed-off-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
It makes more sense to have ALL block size limit constraints
in the same struct. Improve the documentation while at it.
Simplify a couple of conditionals, now that we have audited and
documented that request_alignment is always non-zero.
Signed-off-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Sector-based limits are awkward to think about; in our on-going
quest to move to byte-based interfaces, convert max_discard and
discard_alignment. Rename them, using 'pdiscard' as an aid to
track which remaining discard interfaces need conversion, and so
that the compiler will help us catch the change in semantics
across any rebased code. The BlockLimits type is now completely
byte-based; and in iscsi.c, sector_limits_lun2qemu() is no
longer needed.
pdiscard_alignment is made unsigned (we use power-of-2 alignments
as bitmasks, where unsigned is easier to think about) while
leaving max_pdiscard signed (since we still have an 'int'
interface); this is comparable to what commit cf081fc did for
write zeroes limits. We may later want to make everything an
unsigned 64-bit limit - but that requires a bigger code audit.
Signed-off-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Improve the documentation of the write zeroes limits, to mention
additional constraints that drivers should observe. Worth squashing
into commit cf081fca, if that hadn't been pushed already :)
Signed-off-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Sector-based limits are awkward to think about; in our on-going
quest to move to byte-based interfaces, convert max_transfer_length
and opt_transfer_length. Rename them (dropping the _length suffix)
so that the compiler will help us catch the change in semantics
across any rebased code, and improve the documentation. Use unsigned
values, so that we don't have to worry about negative values and
so that bit-twiddling is easier; however, we are still constrained
by 2^31 of signed int in most APIs.
When a value comes from an external source (iscsi and raw-posix),
sanitize the results to ensure that opt_transfer is a power of 2.
Signed-off-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
The NBD layer was breaking up request at a limit of 2040 sectors
(just under 1M) to cater to old qemu-nbd. But the server limit
was raised to 32M in commit 2d8214885 to match the kernel, more
than three years ago; and the upstream NBD Protocol is proposing
documentation that without any explicit communication to state
otherwise, a client should be able to safely assume that a 32M
transaction will work. It is time to rely on the larger sizing,
and any downstream distro that cares about maximum
interoperability to older qemu-nbd servers can just tweak the
value of #define NBD_MAX_SECTORS.
Signed-off-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Acked-by: Paolo Bonzini <pbonzini@redhat.com>
Cc: qemu-stable@nongnu.org
Reviewed-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
msi_init() reports errors with error_report(), which is wrong
when it's used in realize().
Fix by converting it to Error.
Fix its callers to handle failure instead of ignoring it.
For those callers who don't handle the failure, it might happen:
when user want msi on, but he doesn't get what he want because of
msi_init fails silently.
cc: Gerd Hoffmann <kraxel@redhat.com>
cc: John Snow <jsnow@redhat.com>
cc: Dmitry Fleytman <dmitry@daynix.com>
cc: Jason Wang <jasowang@redhat.com>
cc: Michael S. Tsirkin <mst@redhat.com>
cc: Hannes Reinecke <hare@suse.de>
cc: Paolo Bonzini <pbonzini@redhat.com>
cc: Alex Williamson <alex.williamson@redhat.com>
cc: Markus Armbruster <armbru@redhat.com>
cc: Marcel Apfelbaum <marcel@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Cao jin <caoj.fnst@cn.fujitsu.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Hannes Reinecke <hare@suse.com>
This adds support for Dynamic DMA Windows (DDW) option defined by
the SPAPR specification which allows to have additional DMA window(s)
The "ddw" property is enabled by default on a PHB but for compatibility
the pseries-2.6 machine and older disable it.
This also creates a single DMA window for the older machines to
maintain backward migration.
This implements DDW for PHB with emulated and VFIO devices. The host
kernel support is required. The advertised IOMMU page sizes are 4K and
64K; 16M pages are supported but not advertised by default, in order to
enable them, the user has to specify "pgsz" property for PHB and
enable huge pages for RAM.
The existing linux guests try creating one additional huge DMA window
with 64K or 16MB pages and map the entire guest RAM to. If succeeded,
the guest switches to dma_direct_ops and never calls TCE hypercalls
(H_PUT_TCE,...) again. This enables VFIO devices to use the entire RAM
and not waste time on map/unmap later. This adds a "dma64_win_addr"
property which is a bus address for the 64bit window and by default
set to 0x800.0000.0000.0000 as this is what the modern POWER8 hardware
uses and this allows having emulated and VFIO devices on the same bus.
This adds 4 RTAS handlers:
* ibm,query-pe-dma-window
* ibm,create-pe-dma-window
* ibm,remove-pe-dma-window
* ibm,reset-pe-dma-window
These are registered from type_init() callback.
These RTAS handlers are implemented in a separate file to avoid polluting
spapr_iommu.c with PCI.
This changes sPAPRPHBState::dma_liobn to an array to allow 2 LIOBNs
and updates all references to dma_liobn. However this does not add
64bit LIOBN to the migration stream as in fact even 32bit LIOBN is
rather pointless there (as it is a PHB property and the management
software can/should pass LIOBNs via CLI) but we keep it for the backward
migration support.
Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
New VFIO_SPAPR_TCE_v2_IOMMU type supports dynamic DMA window management.
This adds ability to VFIO common code to dynamically allocate/remove
DMA windows in the host kernel when new VFIO container is added/removed.
This adds a helper to vfio_listener_region_add which makes
VFIO_IOMMU_SPAPR_TCE_CREATE ioctl and adds just created IOMMU into
the host IOMMU list; the opposite action is taken in
vfio_listener_region_del.
When creating a new window, this uses heuristic to decide on the TCE table
levels number.
This should cause no guest visible change in behavior.
Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
[dwg: Added some casts to prevent printf() warnings on certain targets
where the kernel headers' __u64 doesn't match uint64_t or PRIx64]
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
There are going to be multiple IOMMUs per a container. This moves
the single host IOMMU parameter set to a list of VFIOHostDMAWindow.
This should cause no behavioral change and will be used later by
the SPAPR TCE IOMMU v2 which will also add a vfio_host_win_del() helper.
Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
This makes use of the new "memory registering" feature. The idea is
to provide the userspace ability to notify the host kernel about pages
which are going to be used for DMA. Having this information, the host
kernel can pin them all once per user process, do locked pages
accounting (once) and not spent time on doing that in real time with
possible failures which cannot be handled nicely in some cases.
This adds a prereg memory listener which listens on address_space_memory
and notifies a VFIO container about memory which needs to be
pinned/unpinned. VFIO MMIO regions (i.e. "skip dump" regions) are skipped.
The feature is only enabled for SPAPR IOMMU v2. The host kernel changes
are required. Since v2 does not need/support VFIO_IOMMU_ENABLE, this does
not call it when v2 is detected and enabled.
This enforces guest RAM blocks to be host page size aligned; however
this is not new as KVM already requires memory slots to be host page
size aligned.
Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
[dwg: Fix compile error on 32-bit host]
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>