Commit Graph

4062 Commits

Author SHA1 Message Date
Eric Blake
68ab47e4b4 qapi: Change visit_type_FOO() to no longer return partial objects
Returning a partial object on error is an invitation for a careless
caller to leak memory.  We already fixed things in an earlier
patch to guarantee NULL if visit_start fails ("qapi: Guarantee
NULL obj on input visitor callback error"), but that does not
help the case where visit_start succeeds but some other failure
happens before visit_end, such that we leak a partially constructed
object outside visit_type_FOO(). As no one outside the testsuite
was actually relying on these semantics, it is cleaner to just
document and guarantee that ALL pointer-based visit_type_FOO()
functions always leave a safe value in *obj during an input visitor
(either the new object on success, or NULL if an error is
encountered), so callers can now unconditionally use
qapi_free_FOO() to clean up regardless of whether an error occurred.

The decision is done by adding visit_is_input(), then updating the
generated code to check if additional cleanup is needed based on
the type of visitor in use.

Note that we still leave *obj unchanged after a scalar-based
visit_type_FOO(); I did not feel like auditing all uses of
visit_type_Enum() to see if the callers would tolerate a specific
sentinel value (not to mention having to decide whether it would
be better to use 0 or ENUM__MAX as that sentinel).

Signed-off-by: Eric Blake <eblake@redhat.com>
Message-Id: <1461879932-9020-25-git-send-email-eblake@redhat.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
2016-05-12 09:47:55 +02:00
Eric Blake
d9f62dde13 qapi: Simplify semantics of visit_next_list()
The semantics of the list visit are somewhat baroque, with the
following pseudocode when FooList is used:

start()
for (prev = head; cur = next(prev); prev = &cur) {
    visit(&cur->value)
}

Note that these semantics (advance before visit) requires that
the first call to next() return the list head, while all other
calls return the next element of the list; that is, every visitor
implementation is required to track extra state to decide whether
to return the input as-is, or to advance.  It also requires an
argument of 'GenericList **' to next(), solely because the first
iteration might need to modify the caller's GenericList head, so
that all other calls have to do a layer of dereferencing.

Thankfully, we only have two uses of list visits in the entire
code base: one in spapr_drc (which completely avoids
visit_next_list(), feeding in integers from a different source
than uint8List), and one in qapi-visit.py.  That is, all other
list visitors are generated in qapi-visit.c, and share the same
paradigm based on a qapi FooList type, so we can refactor how
lists are laid out with minimal churn among clients.

We can greatly simplify things by hoisting the special case
into the start() routine, and flipping the order in the loop
to visit before advance:

start(head)
for (tail = *head; tail; tail = next(tail)) {
    visit(&tail->value)
}

With the simpler semantics, visitors have less state to track,
the argument to next() is reduced to 'GenericList *', and it
also becomes obvious whether an input visitor is allocating a
FooList during visit_start_list() (rather than the old way of
not knowing if an allocation happened until the first
visit_next_list()).  As a minor drawback, we now allocate in
two functions instead of one, and have to pass the size to
both functions (unless we were to tweak the input visitors to
cache the size to start_list for reuse during next_list, but
that defeats the goal of less visitor state).

The signature of visit_start_list() is chosen to match
visit_start_struct(), with the new parameters after 'name'.

The spapr_drc case is a virtual visit, done by passing NULL for
list, similarly to how NULL is passed to visit_start_struct()
when a qapi type is not used in those visits.  It was easy to
provide these semantics for qmp-output and dealloc visitors,
and a bit harder for qmp-input (several prerequisite patches
refactored things to make this patch straightforward).  But it
turned out that the string and opts visitors munge enough other
state during visit_next_list() to make it easier to just
document and require a GenericList visit for now; an assertion
will remind us to adjust things if we need the semantics in the
future.

Several pre-requisite cleanup patches made the reshuffling of
the various visitors easier; particularly the qmp input visitor.

Signed-off-by: Eric Blake <eblake@redhat.com>
Message-Id: <1461879932-9020-24-git-send-email-eblake@redhat.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
2016-05-12 09:47:55 +02:00
Eric Blake
15c2f669e3 qapi: Split visit_end_struct() into pieces
As mentioned in previous patches, we want to call visit_end_struct()
functions unconditionally, so that visitors can release resources
tied up since the matching visit_start_struct() without also having
to worry about error priority if more than one error occurs.

Even though error_propagate() can be safely used to ignore a second
error during cleanup caused by a first error, it is simpler if the
cleanup cannot set an error.  So, split out the error checking
portion (basically, input visitors checking for unvisited keys) into
a new function visit_check_struct(), which can be safely skipped if
any earlier errors are encountered, and leave the cleanup portion
(which never fails, but must be called unconditionally if
visit_start_struct() succeeded) in visit_end_struct().

Generated code in qapi-visit.c has diffs resembling:

|@@ -59,10 +59,12 @@ void visit_type_ACPIOSTInfo(Visitor *v,
|         goto out_obj;
|     }
|     visit_type_ACPIOSTInfo_members(v, obj, &err);
|-    error_propagate(errp, err);
|-    err = NULL;
|+    if (err) {
|+        goto out_obj;
|+    }
|+    visit_check_struct(v, &err);
| out_obj:
|-    visit_end_struct(v, &err);
|+    visit_end_struct(v);
| out:

and in qapi-event.c:

@@ -47,7 +47,10 @@ void qapi_event_send_acpi_device_ost(ACP
|         goto out;
|     }
|     visit_type_q_obj_ACPI_DEVICE_OST_arg_members(v, &param, &err);
|-    visit_end_struct(v, err ? NULL : &err);
|+    if (!err) {
|+        visit_check_struct(v, &err);
|+    }
|+    visit_end_struct(v);
|     if (err) {
|         goto out;

Signed-off-by: Eric Blake <eblake@redhat.com>
Message-Id: <1461879932-9020-20-git-send-email-eblake@redhat.com>
[Conflict with a doc fixup resolved]
Signed-off-by: Markus Armbruster <armbru@redhat.com>
2016-05-12 09:47:55 +02:00
Eric Blake
3bc97fd592 qapi: Add visit_type_null() visitor
Right now, qmp-output-visitor happens to produce a QNull result
if nothing is actually visited between the creation of the visitor
and the request for the resulting QObject.  A stronger protocol
would require that a QMP output visit MUST visit something.  But
to still be able to produce a JSON 'null' output, we need a new
visitor function that states our intentions.  Yes, we could say
that such a visit must go through visit_type_any(), but that
feels clunky.

So this patch introduces the new visit_type_null() interface and
its no-op interface in the dealloc visitor, and stubs in the
qmp visitors (the next patch will finish the implementation).
For the visitors that will not implement the callback, document
the situation. The code in qapi-visit-core unconditionally
dereferences the callback pointer, so that a segfault will inform
a developer if they need to implement the callback for their
choice of visitor.

Note that JSON has a primitive null type, with the single value
null; likewise with the QNull type for QObject; but for QAPI,
we just have the 'null' value without a null type.  We may
eventually want to add more support in QAPI for null (most likely,
we'd use it via an alternate type that permits 'null' or an
object); but we'll create that usage when we need it.

Signed-off-by: Eric Blake <eblake@redhat.com>
Message-Id: <1461879932-9020-15-git-send-email-eblake@redhat.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
2016-05-12 09:47:54 +02:00
Eric Blake
adfb264c9e qapi: Document visitor interfaces, add assertions
The visitor interface for mapping between QObject/QemuOpts/string
and QAPI is scandalously under-documented, making changes to visitor
core, individual visitors, and users of visitors difficult to
coordinate.  Among other questions: when is it safe to pass NULL,
vs. when a string must be provided; which visitors implement which
callbacks; the difference between concrete and virtual visits.

Correct this by retrofitting proper contracts, and document where some
of the interface warts remain (for example, we may want to modify
visit_end_* to require the same 'obj' as the visit_start counterpart,
so the dealloc visitor can be simplified).  Later patches in this
series will tackle some, but not all, of these warts.

Add assertions to (partially) enforce the contract.  Some of these
were only made possible by recent cleanup commits.

Signed-off-by: Eric Blake <eblake@redhat.com>
Message-Id: <1461879932-9020-13-git-send-email-eblake@redhat.com>
[Doc fix from Eric squashed in]
Signed-off-by: Markus Armbruster <armbru@redhat.com>
2016-05-12 09:47:54 +02:00
Eric Blake
fc471c18d5 qapi: Consolidate QMP input visitor creation
Rather than having two separate ways to create a QMP input
visitor, where the safer approach has the more verbose name,
it is better to consolidate things into a single function
where the caller must explicitly choose whether to be strict
or to ignore excess input.  This patch is the strictly
mechanical conversion; the next patch will then audit which
uses can be made stricter.

Signed-off-by: Eric Blake <eblake@redhat.com>
Message-Id: <1461879932-9020-6-git-send-email-eblake@redhat.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
2016-05-12 09:47:54 +02:00
Eric Blake
42a502a7a6 qmp: Drop dead command->type
Ever since QMP was first added back in commit 43c20a43, we have
never had any QmpCommandType other than QCT_NORMAL.  It's
pointless to carry around the cruft.

Signed-off-by: Eric Blake <eblake@redhat.com>
Message-Id: <1461879932-9020-4-git-send-email-eblake@redhat.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
2016-05-12 09:47:54 +02:00
Eric Blake
983f52d4b3 qapi-visit: Add visitor.type classification
We have three classes of QAPI visitors: input, output, and dealloc.
Currently, all implementations of these visitors have one thing in
common based on their visitor type: the implementation used for the
visit_type_enum() callback.  But since we plan to add more such
common behavior, in relation to documenting and further refining
the semantics, it makes more sense to have the visitor
implementations advertise which class they belong to, so the common
qapi-visit-core code can use that information in multiple places.

A later patch will better document the types of visitors directly
in visitor.h.

For this patch, knowing the class of a visitor implementation lets
us make input_type_enum() and output_type_enum() become static
functions, by replacing the callback function Visitor.type_enum()
with the simpler enum member Visitor.type.  Share a common
assertion in qapi-visit-core as part of the refactoring.

Move comments in opts-visitor.c to match the refactored layout.

Signed-off-by: Eric Blake <eblake@redhat.com>
Message-Id: <1461879932-9020-2-git-send-email-eblake@redhat.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
2016-05-12 09:47:54 +02:00
Markus Armbruster
51b9b478cc qom: -object error messages lost location, restore it
qemu_opts_foreach() runs its callback with the error location set to
the option's location.  Any errors the callback reports use the
option's location automatically.

Commit 90998d5 moved the actual error reporting from "inside"
qemu_opts_foreach() to after it.  Here's a typical hunk:

	 if (qemu_opts_foreach(qemu_find_opts("object"),
    -                          object_create,
    -                          object_create_initial, NULL)) {
    +                          user_creatable_add_opts_foreach,
    +                          object_create_initial, &err)) {
    +        error_report_err(err);
	     exit(1);
	 }

Before, object_create() reports from within qemu_opts_foreach(), using
the option's location.  Afterwards, we do it after
qemu_opts_foreach(), using whatever location happens to be current
there.  Commonly a "none" location.

This is because Error objects don't have location information.
Problematic.

Reproducer:

    $ qemu-system-x86_64 -nodefaults -display none -object secret,id=foo,foo=bar
    qemu-system-x86_64: Property '.foo' not found

Note no location.  This commit restores it:

    qemu-system-x86_64: -object secret,id=foo,foo=bar: Property '.foo' not found

Note that the qemu_opts_foreach() bug just fixed could mask the bug
here: if the location it leaves dangling hasn't been clobbered, yet,
it's the correct one.

Reported-by: Eric Blake <eblake@redhat.com>
Cc: Daniel P. Berrange <berrange@redhat.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-Id: <1461767349-15329-4-git-send-email-armbru@redhat.com>
Reviewed-by: Daniel P. Berrange <berrange@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
[Paragraph on Error added to commit message]
2016-04-28 08:19:36 +02:00
Fam Zheng
54e18d35e4 event-notifier: Add "is_external" parameter
All callers pass "false" keeping the old semantics. The windows
implementation doesn't distinguish the flag yet. On posix, it is passed
down to the underlying aio context.

Signed-off-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2016-04-22 16:43:56 +02:00
Fam Zheng
bcd82a968f iohandler: Introduce iohandler_get_aio_context
Signed-off-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2016-04-22 16:43:42 +02:00
Ladi Prosek
b065e275a8 virtio-input: support absolute axis config in pass-through
VIRTIO_INPUT_CFG_ABS_INFO was not implemented for pass-through input
devices. This patch follows the existing design and pre-fetches the
config for all absolute axes using EVIOCGABS at realize time.

Signed-off-by: Ladi Prosek <lprosek@redhat.com>
Message-id: 1460558603-18331-1-git-send-email-lprosek@redhat.com
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
2016-04-13 17:26:12 +02:00
Gerd Hoffmann
441330f714 move const_le{16, 23} to qemu/bswap.h, add comment
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Message-id: 1460441239-867-1-git-send-email-kraxel@redhat.com
2016-04-13 15:52:28 +02:00
Gerd Hoffmann
a263bac192 virtio-input: add parenthesis to const_le{16, 32}
"_x" must be "(_x)" otherwise things fail if you pass in expressions.

Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Message-id: 1460440299-26654-1-git-send-email-kraxel@redhat.com
2016-04-13 15:52:28 +02:00
Fam Zheng
a77fd4bb29 block: Fix bdrv_drain in coroutine
Using the nested aio_poll() in coroutine is a bad idea. This patch
replaces the aio_poll loop in bdrv_drain with a BH, if called in
coroutine.

For example, the bdrv_drain() in mirror.c can hang when a guest issued
request is pending on it in qemu_co_mutex_lock().

Mirror coroutine in this case has just finished a request, and the block
job is about to complete. It calls bdrv_drain() which waits for the
other coroutine to complete. The other coroutine is a scsi-disk request.
The deadlock happens when the latter is in turn pending on the former to
yield/terminate, in qemu_co_mutex_lock(). The state flow is as below
(assuming a qcow2 image):

  mirror coroutine               scsi-disk coroutine
  -------------------------------------------------------------
  do last write

    qcow2:qemu_co_mutex_lock()
    ...
                                 scsi disk read

                                   tracked request begin

                                   qcow2:qemu_co_mutex_lock.enter

    qcow2:qemu_co_mutex_unlock()

  bdrv_drain
    while (has tracked request)
      aio_poll()

In the scsi-disk coroutine, the qemu_co_mutex_lock() will never return
because the mirror coroutine is blocked in the aio_poll(blocking=true).

With this patch, the added qemu_coroutine_yield() allows the scsi-disk
coroutine to make progress as expected:

  mirror coroutine               scsi-disk coroutine
  -------------------------------------------------------------
  do last write

    qcow2:qemu_co_mutex_lock()
    ...
                                 scsi disk read

                                   tracked request begin

                                   qcow2:qemu_co_mutex_lock.enter

    qcow2:qemu_co_mutex_unlock()

  bdrv_drain.enter
>   schedule BH
>   qemu_coroutine_yield()
>                                  qcow2:qemu_co_mutex_lock.return
>                                  ...
                                   tracked request end
    ...
    (resumed from BH callback)
  bdrv_drain.return
  ...

Reported-by: Laurent Vivier <lvivier@redhat.com>
Signed-off-by: Fam Zheng <famz@redhat.com>
Message-id: 1459855253-5378-2-git-send-email-famz@redhat.com
Suggested-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2016-04-11 16:59:09 +01:00
Gerd Hoffmann
ca58b45fbe ui/virtio-gpu: add and use qemu_create_displaysurface_pixman
Add a the new qemu_create_displaysurface_pixman function, to create
a DisplaySurface backed by an existing pixman image.  In that case
there is no need to create a new pixman image pointing to the same
backing storage.  We can just use the existing image directly.

This does not only simplify things a bit, but most importantly it
gets the reference counting right, so the backing storage for the
pixman image wouldn't be released underneath us.

Use new function in virtio-gpu, where using it actually fixes
use-after-free crashes.

Cc: qemu-stable@nongnu.org
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Message-id: 1459499240-742-1-git-send-email-kraxel@redhat.com
2016-04-11 12:32:01 +02:00
Paolo Bonzini
a378b49a43 virtio: merge virtio_queue_aio_set_host_notifier_handler with virtio_queue_set_aio
Eliminating the reentrancy is actually a nice thing that we can do
with the API that Michael proposed, so let's make it first class.
This also hides the complex assign/set_handler conventions from
callers of virtio_queue_aio_set_host_notifier_handler, which in
fact was always called with assign=true.

Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2016-04-07 19:57:33 +03:00
Paolo Bonzini
a8f2e5c8ff virtio-scsi: use aio handler for data plane
In addition to handling IO in vcpu thread and in io thread, dataplane
introduces yet another mode: handling it by AioContext.

This reuses the same handler as previous modes, which triggers races as
these were not designed to be reentrant.  Use a separate handler just
for aio, and disable regular handlers when dataplane is active.

Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2016-04-07 19:57:33 +03:00
Michael S. Tsirkin
8a2fad57eb virtio-blk: use aio handler for data plane
In addition to handling IO in vcpu thread and in io thread, dataplane
introduces yet another mode: handling it by AioContext.

This reuses the same handler as previous modes, which triggers races as
these were not designed to be reentrant.  Use a separate handler just
for aio, and disable regular handlers when dataplane is active.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2016-04-07 19:57:33 +03:00
Michael S. Tsirkin
344dc16fae virtio: add aio handler
In addition to handling IO in vcpu thread and in io thread, blk dataplane
introduces yet another mode: handling it by AioContext.

Currently, this reuses the same handler as previous modes,
which triggers races as these were not designed to be reentrant.
Add instead a separate handler just for aio; this will make
it possible to disable regular handlers when dataplane is active.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2016-04-07 19:57:33 +03:00
Paolo Bonzini
43c696a298 virtio-scsi: fix disabled mode
Add two missing checks for s->dataplane_fenced.  In one case, QEMU
would skip injecting an IRQ due to a write to an uninitialized
EventNotifier's file descriptor.

In the second case, the dataplane_disabled field was used by mistake;
in fact after fixing this occurrence it is completely unused.

Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2016-04-07 19:57:33 +03:00
Paolo Bonzini
eb41cf78fc virtio-blk: fix disabled mode
We must not call virtio_blk_data_plane_notify if dataplane is
disabled: we would hit a segmentation fault in notify_guest_bh as
s->guest_notifier has not been setup and is NULL.

Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2016-04-07 19:57:33 +03:00
Paolo Bonzini
2b2cbcadc1 virtio: make virtio_queue_notify_vq static
Acked-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2016-04-07 19:57:33 +03:00
Gerd Hoffmann
bab47d9a75 Sort the fw_cfg file list
Entries are inserted in filename order instead of being
appended to the end in case sorting is enabled.

This will avoid any future issues of moving the file creation
around, it doesn't matter what order they are created now,
the will always be in filename order.

Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>

Added machine type handling for compatibility.  This was
a fairly complex change, this will preserve the order of fw_cfg
for older versions no matter what order the firmware files
actually come in.  A list is kept of the correct legacy order
and the entries will be inserted based upon their order in
the list.  Except that some entries are ordered (in a specific
area of the list) based upon what order they appear on the
command line.  Special handling is added for those entries.

Signed-off-by: Corey Minyard <cminyard@mvista.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2016-04-07 19:57:33 +03:00
Stefan Weil
8d0ac88e23 acpi: Add missing GCC_FMT_ATTR
This fixes a compiler warning when compiling with -Wextra.

Signed-off-by: Stefan Weil <sw@weilnetz.de>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2016-04-07 19:57:33 +03:00
Peter Maydell
cc621a9838 * FreeBSD build fixes (atomics, qapi/error.h)
* x86 KVM fixes (SynIC, KVM_GET/SET_MSRS)
 * Memory API doc fix
 * checkpatch fix
 * Chardev and socket fixes
 * NBD fixes
 * exec.c SEGV fix
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2
 
 iQEcBAABCAAGBQJXA4nFAAoJEL/70l94x66DKqQIAIR+0iID6hXUDTtqa/D8ZgfY
 kGrRyFjyhihsHAM+pLg4YaXGpdYFOBZTW0ZA2qjUoM7u/6uigpbTkQTC25wpMSnd
 OpyApB0oEIv5vuwku1AayF43Meq9PuTl7baxM5gqqo8xzqkzbvlrfvX+62GYGai6
 NATpAEMQAB7usKcTdUElcKczaiUlGDfail+LnKQoq+ih5xDH4LYwpkD9p5EQCTK1
 pkF9LxAbRomFxAxar5m20zPFMMX+33QduEIvcUelTeZJN545R6di1eXMLpu5OGgu
 21zZ8o1ahgrBNI9nQZkeaSaFvFQr+n5T6pIEaoPES5rrMyAg77o0Zv47fpCZFiI=
 =ZB1f
 -----END PGP SIGNATURE-----

Merge remote-tracking branch 'remotes/bonzini/tags/for-upstream' into staging

* FreeBSD build fixes (atomics, qapi/error.h)
* x86 KVM fixes (SynIC, KVM_GET/SET_MSRS)
* Memory API doc fix
* checkpatch fix
* Chardev and socket fixes
* NBD fixes
* exec.c SEGV fix

# gpg: Signature made Tue 05 Apr 2016 10:47:49 BST using RSA key ID 78C7AE83
# gpg: Good signature from "Paolo Bonzini <bonzini@gnu.org>"
# gpg:                 aka "Paolo Bonzini <pbonzini@redhat.com>"

* remotes/bonzini/tags/for-upstream:
  net: fix missing include of qapi/error.h in netmap.c
  nbd: Fix poor debug message
  include/qemu/atomic: add compile time asserts
  cpus: don't use atomic_read for vm_clock_warp_start
  nbd: don't request FUA on FLUSH
  doc/memory: update MMIO section
  char: ensure all clients are in non-blocking mode
  char: fix broken EAGAIN retry on OS-X due to errno clobbering
  util: retry getaddrinfo if getting EAI_BADFLAGS with AI_V4MAPPED
  checkpatch: add target_ulong to typelist
  target-i386: assert that KVM_GET/SET_MSRS can set all requested MSRs
  target-i386: do not pass MSR_TSC_AUX to KVM ioctls if CPUID bit is not set
  memory: fix segv on qemu_ram_free(block=0x0)
  target-i386/kvm: Hyper-V VMBus hypercalls blank handlers
  update Linux headers to 4.6

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2016-04-05 11:03:18 +01:00
Alex Bennée
ca47a926ad include/qemu/atomic: add compile time asserts
To be safely portable no atomic access should be trying to do more than
the natural word width of the host. The most common abuse is trying to
atomically access 64 bit values on a 32 bit host.

This patch adds some QEMU_BUILD_BUG_ON to the __atomic instrinsic paths
to create a build failure if (sizeof(*ptr) > sizeof(void *)).

Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Message-Id: <1459780549-12942-3-git-send-email-alex.bennee@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-04-05 11:46:52 +02:00
Paolo Bonzini
b89485a52e update Linux headers to 4.6
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-04-05 11:46:52 +02:00
Michael Roth
f40eb921da spapr_drc: enable immediate detach for unsignalled devices
Currently spapr doesn't support "aborting" hotplug of PCI
devices by allowing device_del to immediately remove the
device if we haven't signalled the presence of the device
to the guest.

In the past this wasn't an issue, since we always immediately
signalled device attach and simply relied on full guest-aware
add->remove path for device removal. However, as of 788d259,
we now defer signalling for PCI functions until function 0
is attached, so now we need to deal with these "abort" operations
for cases where a user hotplugs a non-0 function, then opts to
remove it prior hotplugging function 0. Currently they'd have to
reboot before the unplug completed. PCIe multifunction hotplug
does not have this requirement however, so from a management
implementation perspective it would be good to address this within
the same release as 788d259.

We accomplish this by simply adding a 'signalled' flag to track
whether a device hotplug event has been sent to the guest. If it
hasn't, we allow immediate removal under the assumption that the
guest will not be using the device. Devices present at boot/reset
time are also assumed to be 'signalled'.

For CPU/memory/etc, signalling will still happen immediately
as part of device_add, so only PCI functions should be affected.

Cc: bharata@linux.vnet.ibm.com
Cc: david@gibson.dropbear.id.au
Cc: sbhat@linux.vnet.ibm.com
Cc: qemu-ppc@nongnu.org
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
[dwg: This fixes a regression where an incorrect hot-add of a non-zero
      function can no longer be backed out until function 0 is added]
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2016-04-05 10:47:03 +10:00
Cédric Le Goater
5c94b2a5e5 ppc: Rework POWER7 & POWER8 exception model
From: Benjamin Herrenschmidt <benh@kernel.crashing.org>

This patch fixes the current AIL implementation for POWER8. The
interrupt vector address can be calculated directly from LPCR when the
exception is handled. The excp_prefix update becomes useless and we
can cleanup the H_SET_MODE hcall.

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
[clg: Removed LPES0/1 handling for HV vs. !HV
      Fixed LPCR_ILE case for POWERPC_EXCP_POWER8 ]
Signed-off-by: Cédric Le Goater <clg@fr.ibm.com>
[dwg: This was written as a cleanup, but it also fixes a real bug
      where setting an alternative interrupt location would not be
      correctly migrated]
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2016-04-05 10:38:24 +10:00
Denis V. Lunev
99affd1d5b log: move qemu_log_close/qemu_log_flush from header to log.c
There is no particular reason to keep these functions in the header.
Suggested by Paolo.

Signed-off-by: Denis V. Lunev <den@openvz.org>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Message-id: 1458128212-4197-3-git-send-email-den@openvz.org
CC: Stefan Hajnoczi <stefanha@redhat.com>
CC: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2016-03-31 09:58:32 +01:00
Peter Xu
29039acf58 kvm: add kvm_device_supported() helper function
This can be used when probing whether KVM support specific device. Here,
a raw vmfd is used.

Signed-off-by: Peter Xu <peterx@redhat.com>
Acked-by: Sergey Fedorov <serge.fdrv@gmail.com>
Message-id: 1458788142-17509-4-git-send-email-peterx@redhat.com
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2016-03-30 17:27:24 +01:00
Peter Maydell
489ef4c810 MIPS patches 2016-03-29
Changes:
 * add initial MIPS CPS support
 * implement ITU block
 * implement MAAR
 -----BEGIN PGP SIGNATURE-----
 
 iQEcBAABAgAGBQJW+43VAAoJEFIRjjwLKdprLmoH/1iWT4WsUJDF+9KX7PpFANbQ
 DT+QSDBJr6K+jCenLlqfvB30txS+NDRFzmW65J8hlawVOwhamg1X+pcQTbAYy0sm
 Du3Wexye0uw5YKUmqK2oCrgLJCKm3AqsmraaITE8q1URlkrQpuOuzazlIx5UA+RW
 RgF/DsPAlit8TkZMHwaVIOeXUl8vl8152fU26QvwOGAT6J3lV+lQJ+gMPGRSAWOw
 dcuVGNOTV0g3+kzOWisiqZc/V0Wp2Yu5IPezEVkFjZ4iyTTpnR8gkzuTNebzoRZo
 Zmws4mqaZAX6ijveevd+ueh5sR+AX+mFBunoXQVSSBvRFlqBcEEL2nrwb9wOUX4=
 =Ns6n
 -----END PGP SIGNATURE-----

Merge remote-tracking branch 'remotes/lalrae/tags/mips-20160329-2' into staging

MIPS patches 2016-03-29

Changes:
* add initial MIPS CPS support
* implement ITU block
* implement MAAR

# gpg: Signature made Wed 30 Mar 2016 09:27:01 BST using RSA key ID 0B29DA6B
# gpg: Good signature from "Leon Alrae <leon.alrae@imgtec.com>"

* remotes/lalrae/tags/mips-20160329-2: (21 commits)
  target-mips: add MAAR, MAARI register
  target-mips: use CP0_CHECK for gen_m{f|t}hc0
  hw/mips/cps: enable ITU for multithreading processors
  target-mips: make ITC Configuration Tags accessible to the CPU
  target-mips: check CP0 enabled for CACHE instruction also in R6
  hw/mips: implement ITC Storage - Bypass View
  hw/mips: implement ITC Storage - P/V Sync and Try Views
  hw/mips: implement ITC Storage - Empty/Full Sync and Try Views
  hw/mips: implement ITC Storage - Control View
  hw/mips: implement ITC Configuration Tags and Storage Cells
  target-mips: enable CM GCR in MIPS64R6-generic CPU
  hw/mips_malta: add CPS to Malta board
  hw/mips_malta: move CPU creation to a separate function
  hw/mips_malta: remove redundant irq and clock init
  hw/mips_malta: remove CPUMIPSState from the write_bootloader()
  hw/mips/cps: create CPC block inside CPS
  hw/mips: add initial Cluster Power Controller support
  hw/mips/cps: create GCR block inside CPS
  hw/mips: add initial Global Config Register support
  target-mips: add CMGCRBase register
  ...

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2016-03-30 16:06:45 +01:00
Kevin Wolf
09cf9db1bc block: Remove bdrv_(set_)enable_write_cache()
The only remaining users were block jobs (mirror and backup) which
unconditionally enabled WCE on the BlockBackend of the target image. As
these block jobs don't go through BlockBackend for their I/O requests,
they aren't affected by this setting anyway but always get a writeback
mode, so that call can be removed.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
2016-03-30 12:16:03 +02:00
Kevin Wolf
61de4c6808 block: Remove BDRV_O_CACHE_WB
The previous patches have successively made blk->enable_write_cache the
true source for the information whether a writethrough mode must be
implemented. The corresponding BDRV_O_CACHE_WB is only useless baggage
we're carrying around, so now's the time to remove it.

At the same time, we remove the 'cache.writeback' option parsing on the
BDS level as the only effect was setting the BDRV_O_CACHE_WB flag.

This change requires test cases that explicitly enabled the option to
drop it. Other than that and the change of the error message when
writethrough is enabled on the BDS level (from "Can't set writethrough
mode" to "doesn't support the option"), there should be no change in
behaviour.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
2016-03-30 12:16:03 +02:00
Kevin Wolf
53e8ae0100 block: Remove bdrv_parse_cache_flags()
All users are converted to bdrv_parse_cache_mode() now.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
2016-03-30 12:16:03 +02:00
Kevin Wolf
93f5e6d88a block: Introduce bdrv_co_writev_flags()
This function will allow drivers to implement BDRV_REQ_FUA natively
instead of sending a separate flush after the write.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
2016-03-30 12:16:02 +02:00
Kevin Wolf
c83f9fba2a block/qapi: Use blk_enable_write_cache()
Now that WCE is handled on the BlockBackend level, the flag is
meaningless for BDSes. As the schema requires us to fill the field,
we return an enabled write cache for them.

Note that this means that querying the BlockBackend name may return
writethrough as the cache information, whereas querying the node-name of
the root of that same BlockBackend will return writeback.

This may appear odd at first, but it actually makes sense because it
correctly repesents the layer that implements the WCE handling. This
becomes more apparent when you consider nodes that are the root node of
multiple BlockBackends, where each BB can have its own WCE setting.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
2016-03-30 12:16:02 +02:00
Kevin Wolf
bfd18d1e0b block: Move enable_write_cache to BB level
Whether a write cache is used or not is a decision that concerns the
user (e.g. the guest device) rather than the backend. It was already
logically part of the BB level as bdrv_move_feature_fields() always kept
it on top of the BDS tree; with this patch, the core of it (the actual
flag and the additional flushes) is also implemented there.

Direct callers of bdrv_open() must pass BDRV_O_CACHE_WB now if bs
doesn't have a BlockBackend attached.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
2016-03-30 12:16:02 +02:00
Kevin Wolf
baf5602ed9 block: Add bdrv_parse_cache_mode()
It's like bdrv_parse_cache_flags(), except that writethrough mode isn't
included in the flags, but returned as a separate bool.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
2016-03-30 12:16:00 +02:00
Pavel Dovgalyuk
63785678f3 replay: introduce block devices record/replay
This patch introduces block driver that implement recording
and replaying of block devices' operations.
All block completion operations are added to the queue.
Queue is flushed at checkpoints and information about processed requests
is recorded to the log. In replay phase the queue is matched with
events read from the log. Therefore block devices requests are processed
deterministically.

Signed-off-by: Pavel Dovgalyuk <pavel.dovgaluk@ispras.ru>
[ kwolf: Rebased onto modified and already applied part of the series ]
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2016-03-30 12:15:57 +02:00
Pavel Dovgalyuk
c32b82afaf block: add flush callback
This patch adds callback for flush request. This callback is responsible
for flushing whole block devices stack. bdrv_flush function does not
proceed to underlying devices. It should be performed by this callback
function, if needed.

Signed-off-by: Pavel Dovgalyuk <pavel.dovgaluk@ispras.ru>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2016-03-30 12:12:15 +02:00
Daniel P. Berrange
e6ff69bf5e block: move encryption deprecation warning into qcow code
For a couple of releases we have been warning

  Encrypted images are deprecated
  Support for them will be removed in a future release.
  You can use 'qemu-img convert' to convert your image to an unencrypted one.

This warning was issued by system emulators, qemu-img, qemu-nbd
and qemu-io. Such a broad warning was issued because the original
intention was to rip out all the code for dealing with encryption
inside the QEMU block layer APIs.

The new block encryption framework used for the LUKS driver does
not rely on the unloved block layer API for encryption keys,
instead using the QOM 'secret' object type. It is thus no longer
appropriate to warn about encryption unconditionally.

When the qcow/qcow2 drivers are converted to use the new encryption
framework too, it will be practical to keep AES-CBC support present
for use in qemu-img, qemu-io & qemu-nbd to allow for interoperability
with older QEMU versions and liberation of data from existing encrypted
qcow2 files.

This change moves the warning out of the generic block code and
into the qcow/qcow2 drivers. Further, the warning is set to only
appear when running the system emulators, since qemu-img, qemu-io,
qemu-nbd are expected to support qcow2 encryption long term now that
the maint burden has been eliminated.

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2016-03-30 12:12:15 +02:00
Daniel P. Berrange
abb06c5ac1 block: add flag to indicate that no I/O will be performed
When opening an image it is useful to know whether the caller
intends to perform I/O on the image or not. In the case of
encrypted images this will allow the block driver to avoid
having to prompt for decryption keys when we merely want to
query header metadata about the image. eg qemu-img info

This flag is enforced at the top level only, since even if
we don't want todo I/O on the 'qcow2' file payload, the
underlying 'file' driver will still need todo I/O to read
the qcow2 header, for example.

Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2016-03-30 11:59:32 +02:00
Kevin Wolf
72f41b6fbd block: Remove blk_set_bs()
The function is unused since commit f21d96d0 ('block: Use BdrvChild in
BlockBackend').

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
2016-03-30 11:59:32 +02:00
Kevin Wolf
63eaaae08c block: Remove bdrv_make_anon()
The call in hmp_drive_del() is dead code because blk_remove_bs() is
called a few lines above. The only other remaining user is
bdrv_delete(), which only abuses bdrv_make_anon() to remove it from the
named nodes list. This path inlines the list entry removal into
bdrv_delete() and removes bdrv_make_anon().

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
2016-03-30 11:59:32 +02:00
Leon Alrae
408294352a hw/mips/cps: enable ITU for multithreading processors
Make ITU available in the system if CPU supports multithreading
and is part of CPS.

Signed-off-by: Leon Alrae <leon.alrae@imgtec.com>
2016-03-30 09:14:00 +01:00
Leon Alrae
34fa7e83e1 hw/mips: implement ITC Configuration Tags and Storage Cells
Implement ITC as a single object consisting of two memory regions:

1) tag_io: ITC Configuration Tags (i.e. ITCAddressMap{0,1} registers) which
are accessible by the CPU via CACHE instruction. Also adding
MemoryRegion *itc_tag to the CPUMIPSState so that CACHE instruction will
dispatch reads/writes directly.

2) storage_io: memory-mapped ITC Storage whose address space is configurable
(i.e. enabled/remapped/resized) by writing to ITCAddressMap{0,1} registers.

ITC Storage contains FIFO and Semaphore cells. Read-only FIFO bit in the
ITC cell tag indicates the type of the cell. If the ITC Storage contains
both types of cells then FIFOs are located before Semaphores.

Since issuing thread can get blocked on the access to a cell (in E/F
Synchronized and P/V Synchronized Views) each cell has a bitmap to track
which threads are currently blocked.

Signed-off-by: Leon Alrae <leon.alrae@imgtec.com>
2016-03-30 09:14:00 +01:00
Leon Alrae
2edd5261ff hw/mips/cps: create CPC block inside CPS
Create Cluster Power Controller and add a link to the CPC MemoryRegion
in GCR. Guest can enable / map CPC to any physical address by writing to
the memory-mapped GCR_CPC_BASE register.

Set vp-start-reset property to 1 to allow only first VP to run from reset.
Others are brought up by the guest via CPC memory-mapped registers.

Signed-off-by: Leon Alrae <leon.alrae@imgtec.com>
2016-03-30 09:13:59 +01:00
Leon Alrae
1f93a6e4f3 hw/mips: add initial Cluster Power Controller support
Cluster Power Controller (CPC) is responsible for power management in
multiprocessing system. It provides registers to control the power and the
clock frequency of the individual elements in the system.

This patch implements only three registers that are used to control the
power state of each VP on a single core:
* VP Run is a write-only register used to set each VP to the run state
* VP Stop is a write-only register used to set each VP to the suspend state
* VP Running is a read-only register indicating the run state of each VP

Signed-off-by: Leon Alrae <leon.alrae@imgtec.com>
2016-03-30 09:13:59 +01:00