Commit Graph

436 Commits

Author SHA1 Message Date
Anton Nefedov
29cd0403f1 qapi: remove empty flat union branches and types
Flat unions may now have uncovered branches, so it is possible to get rid
of empty types defined for that purpose only.

Signed-off-by: Anton Nefedov <anton.nefedov@virtuozzo.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Message-Id: <1529311206-76847-3-git-send-email-anton.nefedov@virtuozzo.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
2018-06-22 16:33:46 +02:00
Cédric Le Goater
54961aac19 cpus: tcg: fix never exiting loop on unplug
Commit 9b0605f983 ("cpus: tcg: unregister thread with RCU, fix
exiting of loop on unplug") changed the exit condition of the loop in
the vCPU thread function but forgot to remove the beginning 'while (1)'
statement. The resulting code :

	while (1) {
	...
	} while (!cpu->unplug || cpu_can_run(cpu));

is a sequence of two distinct two while() loops, the first not exiting
in case of an unplug event.

Remove the first while (1) to fix CPU unplug.

Signed-off-by: Cédric Le Goater <clg@kaod.org>
Message-Id: <20180425131828.15604-1-clg@kaod.org>
Cc: qemu-stable@nongnu.org
Fixes: 9b0605f983
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Cédric Le Goater <clg@kaod.org>
2018-05-09 00:13:37 +02:00
Markus Armbruster
f056158d69 cpus: Fix event order on resume of stopped guest
When resume of a stopped guest immediately runs into block device
errors, the BLOCK_IO_ERROR event is sent before the RESUME event.

Reproducer:

1. Create a scratch image
   $ dd if=/dev/zero of=scratch.img bs=1M count=100

   Size doesn't actually matter.

2. Prepare blkdebug configuration:

   $ cat >blkdebug.conf <<EOF
   [inject-error]
   event = "write_aio"
   errno = "5"
   EOF

   Note that errno 5 is EIO.

3. Run a guest with an additional scratch disk, i.e. with additional
   arguments
   -drive if=none,id=scratch-drive,format=raw,werror=stop,file=blkdebug:blkdebug.conf:scratch.img
   -device virtio-blk-pci,id=scratch,drive=scratch-drive

   The blkdebug part makes all writes to the scratch drive fail with
   EIO.  The werror=stop pauses the guest on write errors.

4. Connect to the QMP socket e.g. like this:
   $ socat UNIX:/your/qmp/socket READLINE,history=$HOME/.qmp_history,prompt='QMP> '

   Issue QMP command 'qmp_capabilities':
   QMP> { "execute": "qmp_capabilities" }

5. Boot the guest.

6. In the guest, write to the scratch disk, e.g. like this:

   # dd if=/dev/zero of=/dev/vdb count=1

   Do double-check the device specified with of= is actually the
   scratch device!

7. Issue QMP command 'cont':
   QMP> { "execute": "cont" }

After step 6, I get a BLOCK_IO_ERROR event followed by a STOP event.  Good.

After step 7, I get BLOCK_IO_ERROR, then RESUME, then STOP.  Not so
good; I'd expect RESUME, then BLOCK_IO_ERROR, then STOP.

The funny event order confuses libvirt: virsh -r domstate DOMAIN
--reason reports "paused (unknown)" rather than "paused (I/O error)".

The culprit is vm_prepare_start().

    /* Ensure that a STOP/RESUME pair of events is emitted if a
     * vmstop request was pending.  The BLOCK_IO_ERROR event, for
     * example, according to documentation is always followed by
     * the STOP event.
     */
    if (runstate_is_running()) {
        qapi_event_send_stop(&error_abort);
        res = -1;
    } else {
        replay_enable_events();
        cpu_enable_ticks();
        runstate_set(RUN_STATE_RUNNING);
        vm_state_notify(1, RUN_STATE_RUNNING);
    }

    /* We are sending this now, but the CPUs will be resumed shortly later */
    qapi_event_send_resume(&error_abort);
    return res;

When resuming a stopped guest, we take the else branch before we get
to sending RESUME.  vm_state_notify() runs virtio_vmstate_change(),
among other things.  This restarts I/O, triggering the BLOCK_IO_ERROR
event.

Reshuffle vm_prepare_start() to send the RESUME event earlier.

Fixes RHBZ 1566153.

Cc: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-Id: <20180423084518.2426-1-armbru@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2018-05-09 00:13:37 +02:00
Laszlo Ersek
daa9d2bc6d qapi: discriminate CpuInfoFast on SysEmuTarget, not CpuInfoArch
Add a new field @target (of type @SysEmuTarget) to the output of the
@query-cpus-fast command, which provides more information about the
emulation target than the field @arch (of type @CpuInfoArch). Make @target
the new discriminator for the @CpuInfoFast return structure. Keep @arch
for compatibility.

Cc: "Daniel P. Berrange" <berrange@redhat.com>
Cc: Eric Blake <eblake@redhat.com>
Cc: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Laszlo Ersek <lersek@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Message-Id: <20180427192852.15013-5-lersek@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
2018-05-04 08:27:53 +02:00
Laszlo Ersek
96054f5639 qapi: fill in CpuInfoFast.arch in query-cpus-fast
* Commit ca230ff33f added the @arch field to @CpuInfoFast, but it failed
  to set the new field in qmp_query_cpus_fast(), when TARGET_S390X was not
  defined. The updated @query-cpus-fast example in "qapi-schema.json"
  showed "arch":"x86" only because qmp_query_cpus_fast() calls g_malloc0()
  to allocate @CpuInfoFast, and the CPU_INFO_ARCH_X86 enum constant is
  generated with value 0.

  All @arch values other than @s390 implied the @CpuInfoOther sub-struct
  for @CpuInfoFast -- at the time of writing the patch --, thus no fields
  other than @arch needed to be set when TARGET_S390X was not defined. Set
  @arch now, by copying the corresponding assignments from
  qmp_query_cpus().

* Commit 25fa194b7b added the @riscv enum constant to @CpuInfoArch (used
  in both @CpuInfo and @CpuInfoFast -- the return types of the @query-cpus
  and @query-cpus-fast commands, respectively), and assigned, in both
  return structures, the @CpuInfoRISCV sub-structure to the new enum
  value.

  However, qmp_query_cpus_fast() would not populate either the @arch field
  or the @CpuInfoRISCV sub-structure, when TARGET_RISCV was defined; only
  qmp_query_cpus() would.

  Assign @CpuInfoOther to the @riscv enum constant in @CpuInfoFast, and
  populate only the @arch field in qmp_query_cpus_fast(). Getting CPU
  state without interrupting KVM is an exceptional thing that only S390X
  does currently. Quoting Cornelia Huck <cohuck@redhat.com>, "s390x is
  exceptional in that it has state in QEMU that is actually interesting
  for upper layers and can be retrieved without performance penalty". See
  also
  <https://www.redhat.com/archives/libvir-list/2018-February/msg00121.html>.

Cc: Cornelia Huck <cohuck@redhat.com>
Cc: Eric Blake <eblake@redhat.com>
Cc: Markus Armbruster <armbru@redhat.com>
Cc: Viktor VM Mihajlovski <mihajlov@linux.vnet.ibm.com>
Cc: qemu-stable@nongnu.org
Fixes: ca230ff33f
Fixes: 25fa194b7b
Signed-off-by: Laszlo Ersek <lersek@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Cornelia Huck <cohuck@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Message-Id: <20180427192852.15013-2-lersek@redhat.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
2018-05-04 08:27:53 +02:00
Peter Maydell
c52e7132d7 cpus.c: ensure running CPU recalculates icount deadlines on timer expiry
When we run in TCG icount mode, we calculate the number of instructions
to execute using tcg_get_icount_limit(), which ensures that we stop
execution at the next timer deadline. However there is a bug where
currently we do not recalculate that limit if the guest reprograms
a timer so that the next deadline moves closer, and so we will
continue execution until the original limit and fire the timer
later than we should.

Fix this bug in qemu_timer_notify_cb(): if we are currently running
a VCPU in icount mode, we simply need to kick it out of the main
loop and back to tcg_cpu_exec(), where it will recalculate the
icount limit. If we are not currently running a VCPU, then we
retain the existing logic for waking up a halted CPU.

Cc: qemu-stable@nongnu.org
Fixes: https://bugs.launchpad.net/qemu/+bug/1754038
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Message-id: 20180406123838.21249-1-peter.maydell@linaro.org
2018-04-10 13:02:25 +01:00
Alex Bennée
d759c951f3 replay: push replay_mutex_lock up the call tree
Now instead of using the replay_lock to guard the output of the log we
now use it to protect the whole execution section. This replaces what
the BQL used to do when it was held during TCG execution.

We also introduce some rules for locking order - mainly that you
cannot take the replay_mutex while holding the BQL. This leads to some
slight sophistry during start-up and extending the
replay_mutex_destroy function to unlock the mutex without checking
for the BQL condition so it can be cleanly dropped in the non-replay
case.

Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Pavel Dovgalyuk <pavel.dovgaluk@ispras.ru>
Tested-by: Pavel Dovgalyuk <pavel.dovgaluk@ispras.ru>
Message-Id: <20180227095248.1060.40374.stgit@pasha-VirtualBox>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
2018-03-12 17:10:36 +01:00
Peter Maydell
e4ae62b802 -----BEGIN PGP SIGNATURE-----
iQEcBAABAgAGBQJaoonGAAoJEJykq7OBq3PIuvIH/1sU3LokJ9KroaKaqYyAQnOX
 V9ow3x4z3CQ8qOUpFWXA3l3lMLWE3YzGLvSMLsUVXafobX6qmK/LhtmLk3oNrg4j
 Q5T+d/JFZFZx+MsO4yqD29yJFi2BN1paZ1dpjo6uY5BtABg3zi/cKHOcwkCQDvBA
 XNHCSATt0neew51zZ7xKf2ja8tCPbaeshGY56FW1N118LTCNxIU42JKvK3sCZ8KL
 bgWRqg3FDZEF5MY0xZwCuCMwskIpu1nw6xgwXe5UdB42p2QntzGGfd9xzlmAcy2O
 nYjBqlL7ACN0kbKcPtTNPsikP7O4huoT+62s4cRkFuIUNssot3NSv+iV+HJ3ESs=
 =zmof
 -----END PGP SIGNATURE-----

Merge remote-tracking branch 'remotes/stefanha/tags/block-pull-request' into staging

# gpg: Signature made Fri 09 Mar 2018 13:19:02 GMT
# gpg:                using RSA key 9CA4ABB381AB73C8
# gpg: Good signature from "Stefan Hajnoczi <stefanha@redhat.com>"
# gpg:                 aka "Stefan Hajnoczi <stefanha@gmail.com>"
# Primary key fingerprint: 8695 A8BF D3F9 7CDA AC35  775A 9CA4 ABB3 81AB 73C8

* remotes/stefanha/tags/block-pull-request:
  vl: introduce vm_shutdown()
  virtio-scsi: fix race between .ioeventfd_stop() and vq handler
  virtio-blk: fix race between .ioeventfd_stop() and vq handler
  block: add aio_wait_bh_oneshot()
  virtio-blk: dataplane: Don't batch notifications if EVENT_IDX is present
  README: Fix typo 'git-publish'
  block: Fix qemu crash when using scsi-block

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2018-03-09 17:28:16 +00:00
Stefan Hajnoczi
4486e89c21 vl: introduce vm_shutdown()
Commit 00d09fdbba ("vl: pause vcpus before
stopping iothreads") and commit dce8921b2b
("iothread: Stop threads before main() quits") tried to work around the
fact that emulation was still active during termination by stopping
iothreads.  They suffer from race conditions:
1. virtio_scsi_handle_cmd_vq() racing with iothread_stop_all() hits the
   virtio_scsi_ctx_check() assertion failure because the BDS AioContext
   has been modified by iothread_stop_all().
2. Guest vq kick racing with main loop termination leaves a readable
   ioeventfd that is handled by the next aio_poll() when external
   clients are enabled again, resulting in unwanted emulation activity.

This patch obsoletes those commits by fully disabling emulation activity
when vcpus are stopped.

Use the new vm_shutdown() function instead of pause_all_vcpus() so that
vm change state handlers are invoked too.  Virtio devices will now stop
their ioeventfds, preventing further emulation activity after vm_stop().

Note that vm_stop(RUN_STATE_SHUTDOWN) cannot be used because it emits a
QMP STOP event that may affect existing clients.

It is no longer necessary to call replay_disable_events() directly since
vm_shutdown() does so already.

Drop iothread_stop_all() since it is no longer used.

Cc: Fam Zheng <famz@redhat.com>
Cc: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Acked-by: Paolo Bonzini <pbonzini@redhat.com>
Message-id: 20180307144205.20619-5-stefanha@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2018-03-08 17:38:51 +00:00
Michael Clark
25fa194b7b
RISC-V Build Infrastructure
This adds RISC-V into the build system enabling the following targets:

- riscv32-softmmu
- riscv64-softmmu
- riscv32-linux-user
- riscv64-linux-user

This adds defaults configs for RISC-V, enables the build for the RISC-V
CPU core, hardware, and Linux User Emulation. The 'qemu-binfmt-conf.sh'
script is updated to add the RISC-V ELF magic.

Expected checkpatch errors for consistency reasons:

ERROR: line over 90 characters
FILE: scripts/qemu-binfmt-conf.sh

Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Sagar Karandikar <sagark@eecs.berkeley.edu>
Signed-off-by: Michael Clark <mjc@sifive.com>
2018-03-07 08:30:28 +13:00
David Hildenbrand
5a9c973b6c cpus: CPU threads are always created initially for one CPU only
It can never happen for single-threaded TCG that we have more than one
CPU in the list, while the first one has not been marked as "created".

Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20180209195239.16048-4-david@redhat.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2018-03-06 14:00:59 +01:00
David Hildenbrand
81e9631168 cpus: wait for CPU creation at central place
We can now also wait for the CPU creation for single-threaded TCG, so we
can move the waiting bits further out.

Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20180209195239.16048-3-david@redhat.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2018-03-06 14:00:59 +01:00
David Hildenbrand
a342173ab7 cpus: properly inititalize CPU > 1 under single-threaded TCG
All but the first CPU are currently not fully inititalized (e.g.
cpu->created is never set).

Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20180209195239.16048-2-david@redhat.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2018-03-06 14:00:59 +01:00
Markus Armbruster
112ed241f5 qapi: Empty out qapi-schema.json
The previous commit improved compile time by including less of the
generated QAPI headers.  This is impossible for stuff defined directly
in qapi-schema.json, because that ends up in headers that that pull in
everything.

Move everything but include directives from qapi-schema.json to new
sub-module qapi/misc.json, then include just the "misc" shard where
possible.

It's possible everywhere, except:

* monitor.c needs qmp-command.h to get qmp_init_marshal()

* monitor.c, ui/vnc.c and the generated qapi-event-FOO.c need
  qapi-event.h to get enum QAPIEvent

Perhaps we'll get rid of those some other day.

Adding a type to qapi/migration.json now recompiles some 120 instead
of 2300 out of 5100 objects.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-Id: <20180211093607.27351-25-armbru@redhat.com>
[eblake: rebase to master]
Signed-off-by: Eric Blake <eblake@redhat.com>
2018-03-02 13:45:50 -06:00
Markus Armbruster
9af2398977 Include less of the generated modular QAPI headers
In my "build everything" tree, a change to the types in
qapi-schema.json triggers a recompile of about 4800 out of 5100
objects.

The previous commit split up qmp-commands.h, qmp-event.h, qmp-visit.h,
qapi-types.h.  Each of these headers still includes all its shards.
Reduce compile time by including just the shards we actually need.

To illustrate the benefits: adding a type to qapi/migration.json now
recompiles some 2300 instead of 4800 objects.  The next commit will
improve it further.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-Id: <20180211093607.27351-24-armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
[eblake: rebase to master]
Signed-off-by: Eric Blake <eblake@redhat.com>
2018-03-02 13:45:50 -06:00
Viktor Mihajlovski
ca230ff33f qmp: add architecture specific cpu data for query-cpus-fast
The s390 CPU state can be retrieved without interrupting the
VM execution. Extendend the CpuInfoFast union with architecture
specific data and an implementation for s390.

Return data looks like this:
 [
   {"thread-id":64301,"props":{"core-id":0},
    "arch":"s390","cpu-state":"operating",
    "qom-path":"/machine/unattached/device[0]","cpu-index":0},
   {"thread-id":64302,"props":{"core-id":1},
    "arch":"s390","cpu-state":"operating",
    "qom-path":"/machine/unattached/device[1]","cpu-index":1}
]

Signed-off-by: Viktor Mihajlovski <mihajlov@linux.vnet.ibm.com>
Reviewed-by: Cornelia Huck <cohuck@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Message-Id: <1518797321-28356-4-git-send-email-mihajlov@linux.vnet.ibm.com>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
2018-02-26 12:55:26 +01:00
Luiz Capitulino
ce74ee3dea qmp: add query-cpus-fast
The query-cpus command has an extremely serious side effect:
it always interrupts all running vCPUs so that they can run
ioctl calls. This can cause a huge performance degradation for
some workloads. And most of the information retrieved by the
ioctl calls are not even used by query-cpus.

This commit introduces a replacement for query-cpus called
query-cpus-fast, which has the following features:

 o Never interrupt vCPUs threads. query-cpus-fast only returns
   vCPU information maintained by QEMU itself, which should be
   sufficient for most management software needs

 o Drop "halted" field as it can not be retrieved in a fast
   way on most architectures

 o Drop irrelevant fields such as "current", "pc" and "arch"

 o Rename some fields for better clarification & proper naming
   standard

Signed-off-by: Luiz Capitulino <lcapitulino@redhat.com>
Signed-off-by: Viktor Mihajlovski <mihajlov@linux.vnet.ibm.com>
Message-Id: <1518797321-28356-3-git-send-email-mihajlov@linux.vnet.ibm.com>
Reviewed-by: Cornelia Huck <cohuck@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
2018-02-26 12:55:26 +01:00
Viktor Mihajlovski
9d0306dfdf qmp: expose s390-specific CPU info
Presently s390x is the only architecture not exposing specific
CPU information via QMP query-cpus. Upstream discussion has shown
that it could make sense to report the architecture specific CPU
state, e.g. to detect that a CPU has been stopped.

With this change the output of query-cpus will look like this on
s390:

   [
     {"arch": "s390", "current": true,
      "props": {"core-id": 0}, "cpu-state": "operating", "CPU": 0,
      "qom_path": "/machine/unattached/device[0]",
      "halted": false, "thread_id": 63115},
     {"arch": "s390", "current": false,
      "props": {"core-id": 1}, "cpu-state": "stopped", "CPU": 1,
      "qom_path": "/machine/unattached/device[1]",
      "halted": true, "thread_id": 63116}
   ]

This change doesn't add the s390-specific data to HMP 'info cpus'.
A follow-on patch will remove all architecture specific information
from there.

Signed-off-by: Viktor Mihajlovski <mihajlov@linux.vnet.ibm.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Reviewed-by: Christian Borntraeger <borntraeger@de.ibm.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Message-Id: <1518797321-28356-2-git-send-email-mihajlov@linux.vnet.ibm.com>
Reviewed-by: Cornelia Huck <cohuck@redhat.com>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
2018-02-26 12:55:26 +01:00
Markus Armbruster
922a01a013 Move include qemu/option.h from qemu-common.h to actual users
qemu-common.h includes qemu/option.h, but most places that include the
former don't actually need the latter.  Drop the include, and add it
to the places that actually need it.

While there, drop superfluous includes of both headers, and
separate #include from file comment with a blank line.

This cleanup makes the number of objects depending on qemu/option.h
drop from 4545 (out of 4743) to 284 in my "build everything" tree.

Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-Id: <20180201111846.21846-20-armbru@redhat.com>
[Semantic conflict with commit bdd6a90a9e in block/nvme.c resolved]
2018-02-09 13:52:16 +01:00
Markus Armbruster
e688df6bc4 Include qapi/error.h exactly where needed
This cleanup makes the number of objects depending on qapi/error.h
drop from 1910 (out of 4743) to 1612 in my "build everything" tree.

While there, separate #include from file comment with a blank line,
and drop a useless comment on why qemu/osdep.h is included first.

Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-Id: <20180201111846.21846-5-armbru@redhat.com>
[Semantic conflict with commit 34e304e975 resolved, OSX breakage fixed]
2018-02-09 13:50:17 +01:00
Peter Maydell
7b213bb475 * socket option parsing fix (Daniel)
* SCSI fixes (Fam)
 * Readline double-free fix (Greg)
 * More HVF attribution fixes (Izik)
 * WHPX (Windows Hypervisor Platform Extensions) support (Justin)
 * POLLHUP handler (Klim)
 * ivshmem fixes (Ladi)
 * memfd memory backend (Marc-André)
 * improved error message (Marcelo)
 * Memory fixes (Peter Xu, Zhecheng)
 * Remove obsolete code and comments (Peter M.)
 * qdev API improvements (Philippe)
 * Add CONFIG_I2C switch (Thomas)
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2.0.22 (GNU/Linux)
 
 iQEcBAABAgAGBQJaexoYAAoJEL/70l94x66DVL0IAJC//aZCwwgyN9CRNDcOo10/
 UPtzprfezERkur77r1KvEYVNIfslRF6iTBou2+suOWkzoNL2LJ0XZ+wi+2u2sFIF
 ikvbQVk4dOWqJJQj7e1cmv5A2EZy2dcxjAoD1IG6CRy76+HzYqwjHVw+HkYY5CUS
 qwnUWjQddP6WtH9MsUHpX7p7atWo7T1tzkx4v8H+CIHBO3uUJQSZLkGYflvcstpj
 Fo04bZzSkDj2rnlqqBo/6UgJQXD8++Rs64vmiX2xwcK47TWO31Vbuwu+r8V9osWm
 LHFmRpL8ZkZfL0yqf0bpjmd688dirjVpHIJ5KE043Lo6AdI+K5xBfoBjXxtPiKE=
 =o90D
 -----END PGP SIGNATURE-----

Merge remote-tracking branch 'remotes/bonzini/tags/for-upstream' into staging

* socket option parsing fix (Daniel)
* SCSI fixes (Fam)
* Readline double-free fix (Greg)
* More HVF attribution fixes (Izik)
* WHPX (Windows Hypervisor Platform Extensions) support (Justin)
* POLLHUP handler (Klim)
* ivshmem fixes (Ladi)
* memfd memory backend (Marc-André)
* improved error message (Marcelo)
* Memory fixes (Peter Xu, Zhecheng)
* Remove obsolete code and comments (Peter M.)
* qdev API improvements (Philippe)
* Add CONFIG_I2C switch (Thomas)

# gpg: Signature made Wed 07 Feb 2018 15:24:08 GMT
# gpg:                using RSA key BFFBD25F78C7AE83
# gpg: Good signature from "Paolo Bonzini <bonzini@gnu.org>"
# gpg:                 aka "Paolo Bonzini <pbonzini@redhat.com>"
# Primary key fingerprint: 46F5 9FBD 57D6 12E7 BFD4  E2F7 7E15 100C CD36 69B1
#      Subkey fingerprint: F133 3857 4B66 2389 866C  7682 BFFB D25F 78C7 AE83

* remotes/bonzini/tags/for-upstream: (47 commits)
  Add the WHPX acceleration enlightenments
  Introduce the WHPX impl
  Add the WHPX vcpu API
  Add the Windows Hypervisor Platform accelerator.
  tests/test-filter-redirector: move close()
  tests: use memfd in vhost-user-test
  vhost-user-test: make read-guest-mem setup its own qemu
  tests: keep compiling failing vhost-user tests
  Add memfd based hostmem
  memfd: add hugetlbsize argument
  memfd: add hugetlb support
  memfd: add error argument, instead of perror()
  cpus: join thread when removing a vCPU
  cpus: hvf: unregister thread with RCU
  cpus: tcg: unregister thread with RCU, fix exiting of loop on unplug
  cpus: dummy: unregister thread with RCU, exit loop on unplug
  cpus: kvm: unregister thread with RCU
  cpus: hax: register/unregister thread with RCU, exit loop on unplug
  ivshmem: Disable irqfd on device reset
  ivshmem: Improve MSI irqfd error handling
  ...

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>

# Conflicts:
#	cpus.c
2018-02-07 20:40:36 +00:00
Justin Terry (VM)
19306806ae Add the WHPX acceleration enlightenments
Implements the WHPX accelerator cpu enlightenments to actually use the whpx-all
accelerator on Windows platforms.

Signed-off-by: Justin Terry (VM) <juterry@microsoft.com>
Message-Id: <1516655269-1785-5-git-send-email-juterry@microsoft.com>
[Register/unregister VCPU thread with RCU. - Paolo]
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2018-02-07 14:09:26 +01:00
Paolo Bonzini
dbadee4ff4 cpus: join thread when removing a vCPU
If no one joins the thread, its associated memory is leaked.

Reported-by: CheneyLin <linzc@zju.edu.cn>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2018-02-07 14:09:25 +01:00
Paolo Bonzini
8178e6376f cpus: hvf: unregister thread with RCU
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2018-02-07 14:09:25 +01:00
Paolo Bonzini
9b0605f983 cpus: tcg: unregister thread with RCU, fix exiting of loop on unplug
Keep running until cpu_can_run(cpu) becomes false, for consistency
with other acceslerators.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2018-02-07 14:09:25 +01:00
Paolo Bonzini
d2831ab065 cpus: dummy: unregister thread with RCU, exit loop on unplug
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2018-02-07 14:09:25 +01:00
Paolo Bonzini
57615ed56c cpus: kvm: unregister thread with RCU
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2018-02-07 14:09:25 +01:00
Paolo Bonzini
9857c2d2f7 cpus: hax: register/unregister thread with RCU, exit loop on unplug
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2018-02-07 14:09:25 +01:00
Alistair Francis
493d89bf74 tcg: Replace fprintf(stderr, "*\n" with error_report()
Replace a large number of the fprintf(stderr, "*\n" calls with
error_report(). The functions were renamed with these commands and then
compiler issues where manually fixed.

find ./* -type f -exec sed -i \
    'N;N;N;N;N;N;N;N;N;N;N;N; {s|fprintf(stderr, "\(.*\)\\n"\(.*\));|error_report("\1"\2);|Ig}' \
    {} +
find ./* -type f -exec sed -i \
    'N;N;N;N;N;N;N;N;N;N;N; {s|fprintf(stderr, "\(.*\)\\n"\(.*\));|error_report("\1"\2);|Ig}' \
    {} +
find ./* -type f -exec sed -i \
    'N;N;N;N;N;N;N;N;N; {s|fprintf(stderr, "\(.*\)\\n"\(.*\));|error_report("\1"\2);|Ig}' \
    {} +
find ./* -type f -exec sed -i \
    'N;N;N;N;N;N;N;N; {s|fprintf(stderr, "\(.*\)\\n"\(.*\));|error_report("\1"\2);|Ig}' \
    {} +
find ./* -type f -exec sed -i \
    'N;N;N;N;N;N;N; {s|fprintf(stderr, "\(.*\)\\n"\(.*\));|error_report("\1"\2);|Ig}' \
    {} +
find ./* -type f -exec sed -i \
    'N;N;N;N;N;N; {s|fprintf(stderr, "\(.*\)\\n"\(.*\));|error_report("\1"\2);|Ig}' \
    {} +
find ./* -type f -exec sed -i \
    'N;N;N;N;N; {s|fprintf(stderr, "\(.*\)\\n"\(.*\));|error_report("\1"\2);|Ig}' \
    {} +
find ./* -type f -exec sed -i \
    'N;N;N;N; {s|fprintf(stderr, "\(.*\)\\n"\(.*\));|error_report("\1"\2);|Ig}' \
    {} +
find ./* -type f -exec sed -i \
    'N;N;N; {s|fprintf(stderr, "\(.*\)\\n"\(.*\));|error_report("\1"\2);|Ig}' \
    {} +
find ./* -type f -exec sed -i \
    'N;N; {s|fprintf(stderr, "\(.*\)\\n"\(.*\));|error_report("\1"\2);|Ig}' \
    {} +
find ./* -type f -exec sed -i \
    'N; {s|fprintf(stderr, "\(.*\)\\n"\(.*\));|error_report("\1"\2);|Ig}' \
    {} +

Signed-off-by: Alistair Francis <alistair.francis@xilinx.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Peter Crosthwaite <crosthwaite.peter@gmail.com>
Cc: Richard Henderson <rth@twiddle.net>
Cc: Stefan Weil <sw@weilnetz.de>

Conversions that aren't followed by exit() dropped, because they might
be inappropriate.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Message-Id: <20180203084315.20497-14-armbru@redhat.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
2018-02-06 18:29:46 +01:00
Paolo Bonzini
db08b687cd cpus: unify qemu_*_wait_io_event
Except for round-robin TCG, every other accelerator is using more or
less the same code around qemu_wait_io_event_common.  The exception
is HAX, which also has to eat the dummy APC that is queued by
qemu_cpu_kick_thread.

We can add the SleepEx call to qemu_wait_io_event under "if
(!tcg_enabled())", since that is the condition that is used in
qemu_cpu_kick_thread, and unify the function for KVM, HAX, HVF and
multi-threaded TCG.  Single-threaded TCG code can also be simplified
since it is only used in the round-robin, sleep-if-all-CPUs-idle case.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2018-01-16 14:54:51 +01:00
Pavel Dovgalyuk
b39e3f34c9 icount: fixed saving/restoring of icount warp timers
This patch adds saving and restoring of the icount warp
timers in the vmstate.
It is needed because there timers affect the virtual clock value.
Therefore determinism of the execution in icount record/replay mode
depends on determinism of the timers.

Signed-off-by: Pavel Dovgalyuk <pavel.dovgaluk@ispras.ru>
Acked-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Pavel Dovgalyuk <Pavel.Dovgaluk@ispras.ru>
2018-01-16 14:54:51 +01:00
Sergio Andres Gomez Del Real
c97d6d2cdf i386: hvf: add code base from Google's QEMU repository
This file begins tracking the files that will be the code base for HVF
support in QEMU. This code base is part of Google's QEMU version of
their Android emulator, and can be found at
https://android.googlesource.com/platform/external/qemu/+/emu-master-dev

This code is based on Veertu Inc's vdhh (Veertu Desktop Hosted
Hypervisor), found at https://github.com/veertuinc/vdhh. Everything is
appropriately licensed under GPL v2-or-later, except for the code inside
x86_task.c and x86_task.h, which, deriving from KVM (the Linux kernel),
is licensed GPL v2-only.

This code base already implements a very great deal of functionality,
although Google's version removed from Vertuu's the support for APIC
page and hyperv-related stuff. According to the Android Emulator Release
Notes, Revision 26.1.3 (August 2017), "Hypervisor.framework is now
enabled by default on macOS for 32-bit x86 images to improve performance
and macOS compatibility", although we better use with caution for, as the
same Revision warns us, "If you experience issues with it specifically,
please file a bug report...". The code hasn't seen much update in the
last 5 months, so I think that we can further develop the code with
occasional visiting Google's repository to see if there has been any
update.

On top of Google's code, the following changes were made:

- add code to the configure script to support the --enable-hvf argument.
If the OS is Darwin, it checks for presence of HVF in the system. The
patch also adds strings related to HVF in the file qemu-options.hx.
QEMU will only support the modern syntax style '-M accel=hvf' no enable
hvf; the legacy '-enable-hvf' will not be supported.

- fix styling issues

- add glue code to cpus.c

- move HVFX86EmulatorState field to CPUX86State, changing the
the emulation functions to have a parameter with signature 'CPUX86State *'
instead of 'CPUState *' so we don't have to get the 'env'.

Signed-off-by: Sergio Andres Gomez Del Real <Sergio.G.DelReal@gmail.com>
Message-Id: <20170913090522.4022-2-Sergio.G.DelReal@gmail.com>
Message-Id: <20170913090522.4022-3-Sergio.G.DelReal@gmail.com>
Message-Id: <20170913090522.4022-5-Sergio.G.DelReal@gmail.com>
Message-Id: <20170913090522.4022-6-Sergio.G.DelReal@gmail.com>
Message-Id: <20170905035457.3753-7-Sergio.G.DelReal@gmail.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2017-12-22 15:01:20 +01:00
Peter Xu
80ceb07a83 cpu: refactor cpu_address_space_init()
Normally we create an address space for that CPU and pass that address
space into the function.  Let's just do it inside to unify address space
creations.  It'll simplify my next patch to rename those address spaces.

Signed-off-by: Peter Xu <peterx@redhat.com>
Message-Id: <20171123092333.16085-3-peterx@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2017-12-21 09:30:31 +01:00
David Hildenbrand
ebd05fea9b cpus: make pause_all_cpus() play with SMP on single threaded TCG
pause_all_cpus() is sometimes called from a VCPU thread (e.g. s390x
during special reset). It cannot deal with multiple VCPUs per Thread
(single threaded TCG) yet.

Booting an s390x guest with -smp 2 and single threaded TCG from disk
currently fails. The DIAG 308 will issue a pause_all_cpus() and wait
forever for the CPUs to actually stop. But it is waiting for itself.

So let's stop all VCPUs belonging to the current thread. Factor out
stopping of a VCPU.

Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20171129191215.11323-1-david@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2017-12-21 09:22:44 +01:00
Emilio G. Cota
3468b59e18 tcg: enable multiple TCG contexts in softmmu
This enables parallel TCG code generation. However, we do not take
advantage of it yet since tb_lock is still held during tb_gen_code.

In user-mode we use a single TCG context; see the documentation
added to tcg_region_init for the rationale.

Note that targets do not need any conversion: targets initialize a
TCGContext (e.g. defining TCG globals), and after this initialization
has finished, the context is cloned by the vCPU threads, each of
them keeping a separate copy.

TCG threads claim one entry in tcg_ctxs[] by atomically increasing
n_tcg_ctxs. Do not be too annoyed by the subsequent atomic_read's
of that variable and tcg_ctxs; they are there just to play nice with
analysis tools such as thread sanitizer.

Note that we do not allocate an array of contexts (we allocate
an array of pointers instead) because when tcg_context_init
is called, we do not know yet how many contexts we'll use since
the bool behind qemu_tcg_mttcg_enabled() isn't set yet.

Previous patches folded some TCG globals into TCGContext. The non-const
globals remaining are only set at init time, i.e. before the TCG
threads are spawned. Here is a list of these set-at-init-time globals
under tcg/:

Only written by tcg_context_init:
- indirect_reg_alloc_order
- tcg_op_defs
Only written by tcg_target_init (called from tcg_context_init):
- tcg_target_available_regs
- tcg_target_call_clobber_regs
- arm: arm_arch, use_idiv_instructions
- i386: have_cmov, have_bmi1, have_bmi2, have_lzcnt,
        have_movbe, have_popcnt
- mips: use_movnz_instructions, use_mips32_instructions,
        use_mips32r2_instructions, got_sigill (tcg_target_detect_isa)
- ppc: have_isa_2_06, have_isa_3_00, tb_ret_addr
- s390: tb_ret_addr, s390_facilities
- sparc: qemu_ld_trampoline, qemu_st_trampoline (build_trampolines),
         use_vis3_instructions

Only written by tcg_prologue_init:
- 'struct jit_code_entry one_entry'
- aarch64: tb_ret_addr
- arm: tb_ret_addr
- i386: tb_ret_addr, guest_base_flags
- ia64: tb_ret_addr
- mips: tb_ret_addr, bswap32_addr, bswap32u_addr, bswap64_addr

Reviewed-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Emilio G. Cota <cota@braap.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2017-10-24 13:53:42 -07:00
Emilio G. Cota
e8feb96fcc tcg: introduce regions to split code_gen_buffer
This is groundwork for supporting multiple TCG contexts.

The naive solution here is to split code_gen_buffer statically
among the TCG threads; this however results in poor utilization
if translation needs are different across TCG threads.

What we do here is to add an extra layer of indirection, assigning
regions that act just like pages do in virtual memory allocation.
(BTW if you are wondering about the chosen naming, I did not want
to use blocks or pages because those are already heavily used in QEMU).

We use a global lock to serialize allocations as well as statistics
reporting (we now export the size of the used code_gen_buffer with
tcg_code_size()). Note that for the allocator we could just use
a counter and atomic_inc; however, that would complicate the gathering
of tcg_code_size()-like stats. So given that the region operations are
not a fast path, a lock seems the most reasonable choice.

The effectiveness of this approach is clear after seeing some numbers.
I used the bootup+shutdown of debian-arm with '-tb-size 80' as a benchmark.
Note that I'm evaluating this after enabling per-thread TCG (which
is done by a subsequent commit).

* -smp 1, 1 region (entire buffer):
    qemu: flush code_size=83885014 nb_tbs=154739 avg_tb_size=357
    qemu: flush code_size=83884902 nb_tbs=153136 avg_tb_size=363
    qemu: flush code_size=83885014 nb_tbs=152777 avg_tb_size=364
    qemu: flush code_size=83884950 nb_tbs=150057 avg_tb_size=373
    qemu: flush code_size=83884998 nb_tbs=150234 avg_tb_size=373
    qemu: flush code_size=83885014 nb_tbs=154009 avg_tb_size=360
    qemu: flush code_size=83885014 nb_tbs=151007 avg_tb_size=370
    qemu: flush code_size=83885014 nb_tbs=151816 avg_tb_size=367

That is, 8 flushes.

* -smp 8, 32 regions (80/32 MB per region) [i.e. this patch]:

    qemu: flush code_size=76328008 nb_tbs=141040 avg_tb_size=356
    qemu: flush code_size=75366534 nb_tbs=138000 avg_tb_size=361
    qemu: flush code_size=76864546 nb_tbs=140653 avg_tb_size=361
    qemu: flush code_size=76309084 nb_tbs=135945 avg_tb_size=375
    qemu: flush code_size=74581856 nb_tbs=132909 avg_tb_size=375
    qemu: flush code_size=73927256 nb_tbs=135616 avg_tb_size=360
    qemu: flush code_size=78629426 nb_tbs=142896 avg_tb_size=365
    qemu: flush code_size=76667052 nb_tbs=138508 avg_tb_size=368

Again, 8 flushes. Note how buffer utilization is not 100%, but it
is close. Smaller region sizes would yield higher utilization,
but we want region allocation to be rare (it acquires a lock), so
we do not want to go too small.

* -smp 8, static partitioning of 8 regions (10 MB per region):
    qemu: flush code_size=21936504 nb_tbs=40570 avg_tb_size=354
    qemu: flush code_size=11472174 nb_tbs=20633 avg_tb_size=370
    qemu: flush code_size=11603976 nb_tbs=21059 avg_tb_size=365
    qemu: flush code_size=23254872 nb_tbs=41243 avg_tb_size=377
    qemu: flush code_size=28289496 nb_tbs=52057 avg_tb_size=358
    qemu: flush code_size=43605160 nb_tbs=78896 avg_tb_size=367
    qemu: flush code_size=45166552 nb_tbs=82158 avg_tb_size=364
    qemu: flush code_size=63289640 nb_tbs=116494 avg_tb_size=358
    qemu: flush code_size=51389960 nb_tbs=93937 avg_tb_size=362
    qemu: flush code_size=59665928 nb_tbs=107063 avg_tb_size=372
    qemu: flush code_size=38380824 nb_tbs=68597 avg_tb_size=374
    qemu: flush code_size=44884568 nb_tbs=79901 avg_tb_size=376
    qemu: flush code_size=50782632 nb_tbs=90681 avg_tb_size=374
    qemu: flush code_size=39848888 nb_tbs=71433 avg_tb_size=372
    qemu: flush code_size=64708840 nb_tbs=119052 avg_tb_size=359
    qemu: flush code_size=49830008 nb_tbs=90992 avg_tb_size=362
    qemu: flush code_size=68372408 nb_tbs=123442 avg_tb_size=368
    qemu: flush code_size=33555560 nb_tbs=59514 avg_tb_size=378
    qemu: flush code_size=44748344 nb_tbs=80974 avg_tb_size=367
    qemu: flush code_size=37104248 nb_tbs=67609 avg_tb_size=364

That is, 20 flushes. Note how a static partitioning approach uses
the code buffer poorly, leading to many unnecessary flushes.

Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Emilio G. Cota <cota@braap.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2017-10-24 13:53:42 -07:00
Alexey Kardashevskiy
b516572f31 memory: Get rid of address_space_init_shareable
Since FlatViews are shared now and ASes not, this gets rid of
address_space_init_shareable().

This should cause no behavioural change.

Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Message-Id: <20170921085110.25598-17-aik@ozlabs.ru>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2017-09-22 01:06:51 +02:00
Alistair Francis
3dc6f86936 Convert error_report() to warn_report()
Convert all uses of error_report("warning:"... to use warn_report()
instead. This helps standardise on a single method of printing warnings
to the user.

All of the warnings were changed using these two commands:
    find ./* -type f -exec sed -i \
      's|error_report(".*warning[,:] |warn_report("|Ig' {} +

Indentation fixed up manually afterwards.

The test-qdev-global-props test case was manually updated to ensure that
this patch passes make check (as the test cases are case sensitive).

Signed-off-by: Alistair Francis <alistair.francis@xilinx.com>
Suggested-by: Thomas Huth <thuth@redhat.com>
Cc: Jeff Cody <jcody@redhat.com>
Cc: Kevin Wolf <kwolf@redhat.com>
Cc: Max Reitz <mreitz@redhat.com>
Cc: Ronnie Sahlberg <ronniesahlberg@gmail.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Peter Lieven <pl@kamp.de>
Cc: Josh Durgin <jdurgin@redhat.com>
Cc: "Richard W.M. Jones" <rjones@redhat.com>
Cc: Markus Armbruster <armbru@redhat.com>
Cc: Peter Crosthwaite <crosthwaite.peter@gmail.com>
Cc: Richard Henderson <rth@twiddle.net>
Cc: "Aneesh Kumar K.V" <aneesh.kumar@linux.vnet.ibm.com>
Cc: Greg Kurz <groug@kaod.org>
Cc: Rob Herring <robh@kernel.org>
Cc: Peter Maydell <peter.maydell@linaro.org>
Cc: Peter Chubb <peter.chubb@nicta.com.au>
Cc: Eduardo Habkost <ehabkost@redhat.com>
Cc: Marcel Apfelbaum <marcel@redhat.com>
Cc: "Michael S. Tsirkin" <mst@redhat.com>
Cc: Igor Mammedov <imammedo@redhat.com>
Cc: David Gibson <david@gibson.dropbear.id.au>
Cc: Alexander Graf <agraf@suse.de>
Cc: Gerd Hoffmann <kraxel@redhat.com>
Cc: Jason Wang <jasowang@redhat.com>
Cc: Marcelo Tosatti <mtosatti@redhat.com>
Cc: Christian Borntraeger <borntraeger@de.ibm.com>
Cc: Cornelia Huck <cohuck@redhat.com>
Cc: Stefan Hajnoczi <stefanha@redhat.com>
Acked-by: David Gibson <david@gibson.dropbear.id.au>
Acked-by: Greg Kurz <groug@kaod.org>
Acked-by: Cornelia Huck <cohuck@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed by: Peter Chubb <peter.chubb@data61.csiro.au>
Acked-by: Max Reitz <mreitz@redhat.com>
Acked-by: Marcel Apfelbaum <marcel@redhat.com>
Message-Id: <e1cfa2cd47087c248dd24caca9c33d9af0c499b0.1499866456.git.alistair.francis@xilinx.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
2017-07-13 13:49:58 +02:00
Felipe Franciosi
90bb0c0421 cpus: reset throttle_thread_scheduled after sleep
Currently, the throttle_thread_scheduled flag is reset back to 0 before
sleeping (as part of the throttling logic). Given that throttle_timer
(well, any timer) may tick with a slight delay, it so happens that under
heavy throttling (ie. close or on CPU_THROTTLE_PCT_MAX) the tick may
schedule a further cpu_throttle_thread() work item after the flag reset,
but before the previous sleep completed. This results on the vCPU thread
sleeping continuously for potentially several seconds in a row.

The chances of that happening can be drastically minimised by resetting
the flag after the sleep.

Signed-off-by: Felipe Franciosi <felipe@nutanix.com>
Signed-off-by: Malcolm Crossley <malcolm@nutanix.com>
Message-Id: <1495229390-18909-1-git-send-email-felipe@nutanix.com>
Acked-by: Jason J. Herne <jjherne@linux.vnet.ibm.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2017-06-07 18:22:03 +02:00
David Gibson
75e972dab5 migration: Mark CPU states dirty before incoming migration/loadvm
As a rule, CPU internal state should never be updated when
!cpu->kvm_vcpu_dirty (or the HAX equivalent).  If that is done, then
subsequent calls to cpu_synchronize_state() - usually safe and idempotent -
will clobber state.

However, we routinely do this during a loadvm or incoming migration.
Usually this is called shortly after a reset, which will clear all the cpu
dirty flags with cpu_synchronize_all_post_reset().  Nothing is expected
to set the dirty flags again before the cpu state is loaded from the
incoming stream.

This means that it isn't safe to call cpu_synchronize_state() from a
post_load handler, which is non-obvious and potentially inconvenient.

We could cpu_synchronize_all_state() before the loadvm, but that would be
overkill since a) we expect the state to already be synchronized from the
reset and b) we expect to completely rewrite the state with a call to
cpu_synchronize_all_post_init() at the end of qemu_loadvm_state().

To clear this up, this patch introduces cpu_synchronize_pre_loadvm() and
associated helpers, which simply marks the cpu state as dirty without
actually changing anything.  i.e. it says we want to discard any existing
KVM (or HAX) state and replace it with what we're going to load.

Cc: Juan Quintela <quintela@redhat.com>
Cc: Dave Gilbert <dgilbert@redhat.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Juan Quintela <quintela@redhat.com>
2017-06-06 08:53:24 +10:00
Stefan Hajnoczi
ba9915e1f8 x86 and machine queue, 2017-05-11
Highlights:
 * New "-numa cpu" option
 * NUMA distance configuration
 * migration/i386 vmstatification
 -----BEGIN PGP SIGNATURE-----
 
 iQIcBAABCAAGBQJZFLh3AAoJECgHk2+YTcWmppQQAJe9Y5a3VwHXqvHbwBHX2ysn
 RZDAUPd9DWpbM+UUydyKOVIZ7u5RXbbVq4E0NeCD8VYYd+grZB5Wo1cAzy3b4U2j
 2s+MDqaPMtZtGoqxTsyQOVoVxazT5Kf1zglK+iUEzik44J7LGdro+ty2Z7Ut2c11
 q9rE/GNS78czBm7c4lxgkxXW4N95K/tEGlLtDQ7uct//3U/ZimF+mO6GcbVFlOWT
 4iEbOz2sqvBVv22nLJRufiPgFNIW4hizAz5KBWxwGFCCKvT3N6yYNKKjzEpCw+jE
 lpjIRODU02yIZZZY841fLRtyrk7p4zORS8jRaHTdEJgb5bGc/YazxxVL8nzRQT1W
 VxFwAMd+UNrDkV24hpN++Ln2O+b3kwcGZ7uA/qu9d5WvSYUKXlHqcMJ35q6zuhAI
 /ecfYO7EZfVP86VjIt5IH04iV8RChA9Q6de+kQEFa6wHUxufeCOwCFqukGo8zj07
 plX8NcjnzYmSXKnYjHOHao4rKT+DiJhRB60rFiMeKP/qvKbZPjtgsIeonhHm53qZ
 /QwkhowahHKkpAnetIl0QHm8KS4YudAofMi/Fl+he4gRkEbSQVAo6iQb2L4cjcLC
 LNSDDsIVWGem4gCR+vcsFqB3lggRDfltHXm15JKh92UMpOr6RI6s8pD55T7EdnPC
 CfdxWB5kYM6/lLbOHj94
 =48wH
 -----END PGP SIGNATURE-----

Merge remote-tracking branch 'ehabkost/tags/x86-and-machine-pull-request' into staging

x86 and machine queue, 2017-05-11

Highlights:
* New "-numa cpu" option
* NUMA distance configuration
* migration/i386 vmstatification

# gpg: Signature made Thu 11 May 2017 08:16:07 PM BST
# gpg:                using RSA key 0x2807936F984DC5A6
# gpg: Good signature from "Eduardo Habkost <ehabkost@redhat.com>"
# gpg: Note: This key has expired!
# Primary key fingerprint: 5A32 2FD5 ABC4 D3DB ACCF  D1AA 2807 936F 984D C5A6

* ehabkost/tags/x86-and-machine-pull-request: (29 commits)
  migration/i386: Remove support for pre-0.12 formats
  vmstatification: i386 FPReg
  migration/i386: Remove old non-softfloat 64bit FP support
  tests: check -numa node,cpu=props_list usecase
  numa: add '-numa cpu,...' option for property based node mapping
  numa: remove node_cpu bitmaps as they are no longer used
  numa: use possible_cpus for not mapped CPUs check
  machine: call machine init from wrapper
  numa: remove no longer need numa_post_machine_init()
  tests: numa: add case for QMP command query-cpus
  QMP: include CpuInstanceProperties into query_cpus output output
  virt-arm: get numa node mapping from possible_cpus instead of numa_get_node_for_cpu()
  spapr: get numa node mapping from possible_cpus instead of numa_get_node_for_cpu()
  pc: get numa node mapping from possible_cpus instead of numa_get_node_for_cpu()
  numa: do default mapping based on possible_cpus instead of node_cpu bitmaps
  numa: mirror cpu to node mapping in MachineState::possible_cpus
  numa: add check that board supports cpu_index to node mapping
  virt-arm: add node-id property to CPU
  pc: add node-id property to CPU
  spapr: add node-id property to sPAPR core
  ...

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2017-05-15 14:12:03 +01:00
Igor Mammedov
afed5a5a70 QMP: include CpuInstanceProperties into query_cpus output output
if board supports CpuInstanceProperties, report them for
each CPU thread listed. Main motivation for this is to
provide these properties introspection via QMP interface
for using in test cases to verify numa node to cpu mapping,
which includes not only boards that support cpu hotplug
and have this info in query-hotpluggable-cpus (pc/spapr)
but also for boards that don't not support hotpluggable-cpus
but support numa mapping (virt-arm).

Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Message-Id: <1494415802-227633-12-git-send-email-imammedo@redhat.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
2017-05-11 16:08:49 -03:00
Bharata B Rao
a3e53273ad cpus: Fix CPU unplug for MTTCG
Ensure that the unplugged CPU thread is destroyed and the waiting
thread is notified about it. This is needed for CPU unplug to work
correctly in MTTCG mode.

Signed-off-by: Bharata B Rao <bharata@linux.vnet.ibm.com>
Signed-off-by: Nikunj A Dadhania <nikunj@linux.vnet.ibm.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2017-05-11 09:45:15 +10:00
Alex Bennée
1d05906b95 cpus: call cpu_update_icount on read
This ensures each time the vCPU thread reads the icount we update the
master timer_state.qemu_icount field. This way as long as updates are
in BQL protected sections (which they should be) the main-loop can
never come to update the log and find time has gone backwards.

Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
2017-04-10 10:23:38 +01:00
Alex Bennée
eda5f7c6a1 cpu-exec: update icount after each TB_EXIT
There is no particular reason we shouldn't update the global system
icount time as we exit each TranslationBlock run. This ensures the
main-loop doesn't have to wait until we exit to the outer loop for
executed instructions to be credited to timer_state.

The prepare_icount_for_run function is slightly tweaked to match the
logic we run in cpu_loop_exec_tb.

Based on Paolo's original suggestion.

Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
2017-04-10 10:23:38 +01:00
Alex Bennée
512d3c8071 cpus: introduce cpu_update_icount helper
By holding off updates to timer_state.qemu_icount we can run into
trouble when the non-vCPU thread needs to know the time. This helper
ensures we atomically update timers_state.qemu_icount based on what
has been currently executed.

Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
2017-04-10 10:23:38 +01:00
Alex Bennée
e4cd96571f cpus: don't credit executed instructions before they have run
Outside of the vCPU thread icount time will only be tracked against
timers_state.qemu_icount. We no longer credit cycles until they have
completed the run. Inside the vCPU thread we adjust for passage of
time by looking at how many have run so far. This is only valid inside
the vCPU thread while it is running.

Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
2017-04-10 10:23:38 +01:00
Alex Bennée
0524838225 cpus: move icount preparation out of tcg_exec_cpu
As icount is only supported for single-threaded execution due to the
requirement for determinism let's remove it from the common
tcg_exec_cpu path.

Also remove the additional fiddling which shouldn't be required as the
icount counters should all be rectified as you enter the loop.

Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
2017-04-10 10:23:33 +01:00
Alex Bennée
243c5f77f6 cpus: check cpu->running in cpu_get_icount_raw()
The lifetime of current_cpu is now the lifetime of the vCPU thread.
However get_icount_raw() can apply a fudge factor if called while code
is running to take into account the current executed instruction
count.

To ensure this is always the case we also check cpu->running.

Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Richard Henderson <rth@twiddle.net>
2017-04-10 10:22:28 +01:00
Alex Bennée
bf51c7206f cpus: remove icount handling from qemu_tcg_cpu_thread_fn
We should never be running in multi-threaded mode with icount enabled.
There is no point calling handle_icount_deadline here so remove it and
assert !use_icount.

Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Richard Henderson <rth@twiddle.net>
2017-04-10 10:14:50 +01:00
Nikunj A Dadhania
8695350357 cpus: fix wrong define name
While the configure script generates TARGET_SUPPORTS_MTTCG define, one
of the define is cpus.c is checking wrong name: TARGET_SUPPORT_MTTCG

Signed-off-by: Nikunj A Dadhania <nikunj@linux.vnet.ibm.com>
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
2017-04-10 10:14:28 +01:00
Pranith Kumar
8cfef89271 tcg: Add a new line after incompatibility warning
Signed-off-by: Pranith Kumar <bobby.prani@gmail.com>
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
2017-03-28 10:52:50 +01:00
Vincent Palatin
b3d3a426da hax: fix breakage in locking
use qemu_mutex_lock_iothread consistently in qemu_hax_cpu_thread_fn() as
done in other _thread_fn functions, instead of grabbing directly the
BQL. This way we ensure that iothread_locked is properly set.

On v2.9.0-rc0, QEMU was dying in an assertion in the mutex code when
running with '--enable-hax' either on OSX or Windows. This bug was triggered
since the code modification for multithreading added new usages of
qemu_mutex_iothread_locked.
This fixes the breakage on both platforms, I can now run again a full
Chromium OS image with HAX kernel acceleration.

Signed-off-by: Vincent Palatin <vpalatin@chromium.org>
Message-Id: <20170320101549.150076-1-vpalatin@chromium.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2017-03-20 12:24:43 +01:00
Paolo Bonzini
6b8f0187a4 icount: process QEMU_CLOCK_VIRTUAL timers in vCPU thread
icount has become much slower after tcg_cpu_exec has stopped
using the BQL.  There is also a latent bug that is masked by
the slowness.

The slowness happens because every occurrence of a QEMU_CLOCK_VIRTUAL
timer now has to wake up the I/O thread and wait for it.  The rendez-vous
is mediated by the BQL QemuMutex:

- handle_icount_deadline wakes up the I/O thread with BQL taken
- the I/O thread wakes up and waits on the BQL
- the VCPU thread releases the BQL a little later
- the I/O thread raises an interrupt, which calls qemu_cpu_kick
- the VCPU thread notices the interrupt, takes the BQL to
  process it and waits on it

All this back and forth is extremely expensive, causing a 6 to 8-fold
slowdown when icount is turned on.

One may think that the issue is that the VCPU thread is too dependent
on the BQL, but then the latent bug comes in.  I first tried removing
the BQL completely from the x86 cpu_exec, only to see everything break.
The only way to fix it (and make everything slow again) was to add a dummy
BQL lock/unlock pair.

This is because in -icount mode you really have to process the events
before the CPU restarts executing the next instruction.  Therefore, this
series moves the processing of QEMU_CLOCK_VIRTUAL timers straight in
the vCPU thread when running in icount mode.

The required changes include:

- make the timer notification callback wake up TCG's single vCPU thread
  when run from another thread.  By using async_run_on_cpu, the callback
  can override all_cpu_threads_idle() when the CPU is halted.

- move handle_icount_deadline after qemu_tcg_wait_io_event, so that
  the timer notification callback is invoked after the dummy work item
  wakes up the vCPU thread

- make handle_icount_deadline run the timers instead of just waking the
  I/O thread.

- stop processing the timers in the main loop

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2017-03-14 13:51:34 +01:00
Paolo Bonzini
3f53bc61a4 cpus: define QEMUTimerListNotifyCB for QEMU system emulation
There is no change for now, because the callback just invokes
qemu_notify_event.

Reviewed-by: Edgar E. Iglesias <edgar.iglesias@xilinx.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2017-03-14 13:28:29 +01:00
Alex Bennée
c34c762015 cpus.c: add additional error_report when !TARGET_SUPPORT_MTTCG
While we may fail the memory ordering check later that can be
confusing. So in cases where TARGET_SUPPORT_MTTCG has yet to be
defined we should say so specifically.

Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Richard Henderson <rth@twiddle.net>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
2017-03-09 10:38:16 +00:00
Alex Bennée
83fd9629a3 vl/cpus: be smarter with icount and MTTCG
The sense of the test was inverted. Make it simple, if icount is
enabled then we disabled MTTCG by default. If the user tries to force
MTTCG upon us then we tell them "no".

Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Richard Henderson <rth@twiddle.net>
2017-03-09 10:38:02 +00:00
Paolo Bonzini
18268b6016 KVM: move SIG_IPI handling to kvm-all.c
This lets us remove a bunch of CONFIG_LINUX defines.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2017-03-03 16:40:02 +01:00
Paolo Bonzini
2ae41db262 KVM: do not use sigtimedwait to catch SIGBUS
Call kvm_on_sigbus_vcpu asynchronously from the VCPU thread.
Information for the SIGBUS can be stored in thread-local variables
and processed later in kvm_cpu_exec.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2017-03-03 16:40:02 +01:00
Paolo Bonzini
a16fc07ebd cpus: reorganize signal handling code
Move the KVM "eat signals" code under CONFIG_LINUX, in preparation
for moving it to kvm-all.c; reraise non-MCE SIGBUS immediately,
without passing it to KVM.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2017-03-03 16:40:02 +01:00
Paolo Bonzini
d98d407234 cpus: remove ugly cast on sigbus_handler
The cast is there because sigbus_handler is invoked via sigfd_handler.
But it feels just wrong to use struct qemu_signalfd_siginfo in the
prototype of a function that is passed to sigaction.

Instead, do a simple-minded conversion of qemu_signalfd_siginfo to
siginfo_t.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2017-03-03 16:40:02 +01:00
Pranith Kumar
08e73c48b0 tcg: handle EXCP_ATOMIC exception for system emulation
The patch enables handling atomic code in the guest. This should be
preferably done in cpu_handle_exception(), but the current assumptions
regarding when we can execute atomic sections cause a deadlock.

The current mechanism discards the flags which were set in atomic
execution. We ensure they are properly saved by calling the
cc->cpu_exec_enter/leave() functions around the loop.

As we are running cpu_exec_step_atomic() from the outermost loop we
need to avoid an abort() when single stepping over atomic code since
debug exception longjmp will point to the the setlongjmp in
cpu_exec(). We do this by setting a new jmp_env so that it jumps back
here on an exception.

Signed-off-by: Pranith Kumar <bobby.prani@gmail.com>
[AJB: tweak title, merge with new patches, add mmap_lock]
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Richard Henderson <rth@twiddle.net>
CC: Paolo Bonzini <pbonzini@redhat.com>
2017-02-24 10:32:45 +00:00
Alex Bennée
372579427a tcg: enable thread-per-vCPU
There are a couple of changes that occur at the same time here:

  - introduce a single vCPU qemu_tcg_cpu_thread_fn

  One of these is spawned per vCPU with its own Thread and Condition
  variables. qemu_tcg_rr_cpu_thread_fn is the new name for the old
  single threaded function.

  - the TLS current_cpu variable is now live for the lifetime of MTTCG
    vCPU threads. This is for future work where async jobs need to know
    the vCPU context they are operating in.

The user to switch on multi-thread behaviour and spawn a thread
per-vCPU. For a simple test kvm-unit-test like:

  ./arm/run ./arm/locking-test.flat -smp 4 -accel tcg,thread=multi

Will now use 4 vCPU threads and have an expected FAIL (instead of the
unexpected PASS) as the default mode of the test has no protection when
incrementing a shared variable.

We enable the parallel_cpus flag to ensure we generate correct barrier
and atomic code if supported by the front and backends. This doesn't
automatically enable MTTCG until default_mttcg_enabled() is updated to
check the configuration is supported.

Signed-off-by: KONRAD Frederic <fred.konrad@greensocs.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
[AJB: Some fixes, conditionally, commit rewording]
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Richard Henderson <rth@twiddle.net>
2017-02-24 10:32:45 +00:00
Alex Bennée
e5143e30fb tcg: remove global exit_request
There are now only two uses of the global exit_request left.

The first ensures we exit the run_loop when we first start to process
pending work and in the kick handler. This is just as easily done by
setting the first_cpu->exit_request flag.

The second use is in the round robin kick routine. The global
exit_request ensured every vCPU would set its local exit_request and
cause a full exit of the loop. Now the iothread isn't being held while
running we can just rely on the kick handler to push us out as intended.

We lightly re-factor the main vCPU thread to ensure cpu->exit_requests
cause us to exit the main loop and process any IO requests that might
come along. As an cpu->exit_request may legitimately get squashed
while processing the EXCP_INTERRUPT exception we also check
cpu->queued_work_first to ensure queued work is expedited as soon as
possible.

Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Richard Henderson <rth@twiddle.net>
2017-02-24 10:32:45 +00:00
Jan Kiszka
8d04fb55de tcg: drop global lock during TCG code execution
This finally allows TCG to benefit from the iothread introduction: Drop
the global mutex while running pure TCG CPU code. Reacquire the lock
when entering MMIO or PIO emulation, or when leaving the TCG loop.

We have to revert a few optimization for the current TCG threading
model, namely kicking the TCG thread in qemu_mutex_lock_iothread and not
kicking it in qemu_cpu_kick. We also need to disable RAM block
reordering until we have a more efficient locking mechanism at hand.

Still, a Linux x86 UP guest and my Musicpal ARM model boot fine here.
These numbers demonstrate where we gain something:

20338 jan       20   0  331m  75m 6904 R   99  0.9   0:50.95 qemu-system-arm
20337 jan       20   0  331m  75m 6904 S   20  0.9   0:26.50 qemu-system-arm

The guest CPU was fully loaded, but the iothread could still run mostly
independent on a second core. Without the patch we don't get beyond

32206 jan       20   0  330m  73m 7036 R   82  0.9   1:06.00 qemu-system-arm
32204 jan       20   0  330m  73m 7036 S   21  0.9   0:17.03 qemu-system-arm

We don't benefit significantly, though, when the guest is not fully
loading a host CPU.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Message-Id: <1439220437-23957-10-git-send-email-fred.konrad@greensocs.com>
[FK: Rebase, fix qemu_devices_reset deadlock, rm address_space_* mutex]
Signed-off-by: KONRAD Frederic <fred.konrad@greensocs.com>
[EGC: fixed iothread lock for cpu-exec IRQ handling]
Signed-off-by: Emilio G. Cota <cota@braap.org>
[AJB: -smp single-threaded fix, clean commit msg, BQL fixes]
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Richard Henderson <rth@twiddle.net>
Reviewed-by: Pranith Kumar <bobby.prani@gmail.com>
[PM: target-arm changes]
Acked-by: Peter Maydell <peter.maydell@linaro.org>
2017-02-24 10:32:45 +00:00
Alex Bennée
791158d93b tcg: rename tcg_current_cpu to tcg_current_rr_cpu
..and make the definition local to cpus. In preparation for MTTCG the
concept of a global tcg_current_cpu will no longer make sense. However
we still need to keep track of it in the single-threaded case to be able
to exit quickly when required.

qemu_cpu_kick_no_halt() moves and becomes qemu_cpu_kick_rr_cpu() to
emphasise its use-case. qemu_cpu_kick now kicks the relevant cpu as
well as qemu_kick_rr_cpu() which will become a no-op in MTTCG.

For the time being the setting of the global exit_request remains.

Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Richard Henderson <rth@twiddle.net>
Reviewed-by: Pranith Kumar <bobby.prani@gmail.com>
2017-02-24 10:32:45 +00:00
Alex Bennée
6546706d28 tcg: add kick timer for single-threaded vCPU emulation
Currently we rely on the side effect of the main loop grabbing the
iothread_mutex to give any long running basic block chains a kick to
ensure the next vCPU is scheduled. As this code is being re-factored and
rationalised we now do it explicitly here.

Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Richard Henderson <rth@twiddle.net>
Reviewed-by: Pranith Kumar <bobby.prani@gmail.com>
2017-02-24 10:32:45 +00:00
KONRAD Frederic
8d4e9146b3 tcg: add options for enabling MTTCG
We know there will be cases where MTTCG won't work until additional work
is done in the front/back ends to support. It will however be useful to
be able to turn it on.

As a result MTTCG will default to off unless the combination is
supported. However the user can turn it on for the sake of testing.

Signed-off-by: KONRAD Frederic <fred.konrad@greensocs.com>
[AJB: move to -accel tcg,thread=multi|single, defaults]
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Richard Henderson <rth@twiddle.net>
2017-02-24 10:32:45 +00:00
Claudio Imbrenda
2d76e82395 move vm_start to cpus.c
This patch:

* moves vm_start to cpus.c.
* exports qemu_vmstop_requested, since it's needed by vm_start.
* extracts vm_prepare_start from vm_start; it does what vm_start did,
  except restarting the cpus.
* vm_start now calls vm_prepare_start and then restarts the cpus.

Signed-off-by: Claudio Imbrenda <imbrenda@linux.vnet.ibm.com>
Message-Id: <1487092068-16562-2-git-send-email-imbrenda@linux.vnet.ibm.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2017-02-16 14:06:55 +01:00
Vincent Palatin
b0cb0a66d6 Plumb the HAXM-based hardware acceleration support
Use the Intel HAX is kernel-based hardware acceleration module for
Windows (similar to KVM on Linux).

Based on the "target/i386: Add Intel HAX to android emulator" patch
from David Chou <david.j.chou@intel.com>

Signed-off-by: Vincent Palatin <vpalatin@chromium.org>
Message-Id: <7b9cae28a0c379ab459c7a8545c9a39762bd394f.1484045952.git.vpalatin@chromium.org>
[Drop hax_populate_ram stub. - Paolo]
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2017-01-19 22:07:46 +01:00
Vincent Palatin
b39466269b kvm: move cpu synchronization code
Move the generic cpu_synchronize_ functions to the common hw_accel.h header,
in order to prepare for the addition of a second hardware accelerator.

Signed-off-by: Stefan Weil <sw@weilnetz.de>
Signed-off-by: Vincent Palatin <vpalatin@chromium.org>
Message-Id: <f5c3cffe8d520011df1c2e5437bb814989b48332.1484045952.git.vpalatin@chromium.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2017-01-19 22:07:46 +01:00
Paolo Bonzini
14e6fe12a7 *_run_on_cpu: introduce run_on_cpu_data type
This changes the *_run_on_cpu APIs (and helpers) to pass data in a
run_on_cpu_data type instead of a plain void *. This is because we
sometimes want to pass a target address (target_ulong) and this fails on
32 bit hosts emulating 64 bit guests.

Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Message-Id: <20161027151030.20863-24-alex.bennee@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-10-31 15:00:25 +01:00
Alex Bennée
12e9700d7a cpus: re-factor out handle_icount_deadline
In preparation for adding a MTTCG thread we re-factor out a bit of what
will be common code to handle the QEMU_CLOCK_VIRTUAL expiration.

Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Richard Henderson <rth@twiddle.net>
Message-Id: <20161027151030.20863-18-alex.bennee@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-10-31 10:51:17 +01:00
Alex Bennée
c93bbbefca tcg: cpus rm tcg_exec_all()
In preparation for multi-threaded TCG we remove tcg_exec_all and move
all the CPU cycling into the main thread function. When MTTCG is enabled
we shall use a separate thread function which only handles one vCPU.

Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Sergey Fedorov <sergey.fedorov@linaro.org>
Reviewed-by: Richard Henderson <rth@twiddle.net>

Message-Id: <20161027151030.20863-13-alex.bennee@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-10-31 10:51:17 +01:00
Alex Bennée
1be7fcb8aa tcg: move tcg_exec_all and helpers above thread fn
This is a pure mechanical change in preparation for up-coming
re-factoring. Instead of a forward declaration for tcg_exec_all it and
the associated helper functions are moved in front of the call from
qemu_tcg_cpu_thread_fn.

Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Richard Henderson <rth@twiddle.net>
Message-Id: <20161027151030.20863-12-alex.bennee@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-10-31 10:51:16 +01:00
Alex Bennée
e8faee06f3 cpus: make all_vcpus_paused() return bool
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Sergey Fedorov <sergey.fedorov@linaro.org>
Reviewed-by: Richard Henderson <rth@twiddle.net>

Message-Id: <20161027151030.20863-2-alex.bennee@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-10-31 10:24:45 +01:00
Richard Henderson
fdbc2b5722 tcg: Add EXCP_ATOMIC
When we cannot emulate an atomic operation within a parallel
context, this exception allows us to stop the world and try
again in a serial context.

Reviewed-by: Emilio G. Cota <cota@braap.org>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <rth@twiddle.net>
2016-10-26 08:29:00 -07:00
John Snow
22af08eacf qemu: use bdrv_flush_all for vm_stop et al
Reimplement bdrv_flush_all for vm_stop. In contrast to blk_flush_all,
bdrv_flush_all does not have device model restrictions. This allows
us to flush and halt unconditionally without error.

This allows us to do things like migrate when we have a device with
an open tray, but has a node that may need to be flushed, or nodes
that aren't currently attached to any device and need to be flushed.

Specifically, this allows us to migrate when we have a CDROM with
an open tray.

Signed-off-by: John Snow <jsnow@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Acked-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2016-09-29 14:13:37 +02:00
Pavel Dovgalyuk
6d0ceb80ff replay: allow replay stopping and restarting
This patch fixes bug with stopping and restarting replay
through monitor.

Signed-off-by: Pavel Dovgalyuk <pavel.dovgaluk@ispras.ru>
Message-Id: <20160926080815.6992.71818.stgit@PASHA-ISP>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-09-27 11:57:30 +02:00
Paolo Bonzini
ab129972c8 cpus-common: move exclusive work infrastructure from linux-user
This will serve as the base for async_safe_run_on_cpu.  Because
start_exclusive uses CPU_FOREACH, merge exclusive_lock with
qemu_cpu_list_lock: together with a call to exclusive_idle (via
cpu_exec_start/end) in cpu_list_add, this protects exclusive work
against concurrent CPU addition and removal.

Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-09-27 11:57:30 +02:00
Sergey Fedorov
d148d90ee8 cpus-common: move CPU work item management to common code
Make CPU work core functions common between system and user-mode
emulation. User-mode does not use run_on_cpu, so do not implement it.

Signed-off-by: Sergey Fedorov <serge.fdrv@gmail.com>
Signed-off-by: Sergey Fedorov <sergey.fedorov@linaro.org>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Message-Id: <1470158864-17651-10-git-send-email-alex.bennee@linaro.org>
Reviewed-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-09-27 11:57:30 +02:00
Sergey Fedorov
a5403c69fc cpus: Rename flush_queued_work()
To avoid possible confusion, rename flush_queued_work() to
process_queued_cpu_work().

Signed-off-by: Sergey Fedorov <serge.fdrv@gmail.com>
Signed-off-by: Sergey Fedorov <sergey.fedorov@linaro.org>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Message-Id: <1470158864-17651-6-git-send-email-alex.bennee@linaro.org>
Reviewed-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-09-27 11:57:29 +02:00
Sergey Fedorov
fd38b25103 cpus: Move common code out of {async_, }run_on_cpu()
Move the code common between run_on_cpu() and async_run_on_cpu() into a
new function queue_work_on_cpu().

Signed-off-by: Sergey Fedorov <serge.fdrv@gmail.com>
Signed-off-by: Sergey Fedorov <sergey.fedorov@linaro.org>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Message-Id: <1470158864-17651-4-git-send-email-alex.bennee@linaro.org>
Reviewed-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-09-27 11:57:29 +02:00
Alex Bennée
e0eeb4a21a cpus: pass CPUState to run_on_cpu helpers
CPUState is a fairly common pointer to pass to these helpers. This means
if you need other arguments for the async_run_on_cpu case you end up
having to do a g_malloc to stuff additional data into the routine. For
the current users this isn't a massive deal but for MTTCG this gets
cumbersome when the only other parameter is often an address.

This adds the typedef run_on_cpu_func for helper functions which has an
explicit CPUState * passed as the first parameter. All the users of
run_on_cpu and async_run_on_cpu have had their helpers updated to use
CPUState where available.

Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
[Sergey Fedorov:
 - eliminate more CPUState in user data;
 - remove unnecessary user data passing;
 - fix target-s390x/kvm.c and target-s390x/misc_helper.c]
Signed-off-by: Sergey Fedorov <sergey.fedorov@linaro.org>
Acked-by: David Gibson <david@gibson.dropbear.id.au> (ppc parts)
Reviewed-by: Christian Borntraeger <borntraeger@de.ibm.com> (s390 parts)
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Message-Id: <1470158864-17651-3-git-send-email-alex.bennee@linaro.org>
Reviewed-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-09-27 11:57:29 +02:00
Peter Maydell
8212ff86f4 * minor patches here and there
* MTTCG: lock-free TB lookup
 * SCSI: bugfixes for MPTSAS, MegaSAS, LSI53c, vmw_pvscsi
 * buffer_is_zero rewrite (except for one patch)
 * chardev: qemu_chr_fe_write checks
 * checkpatch improvement for markdown preformatted text
 * default-configs cleanups
 * atomics cleanups
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2.0.22 (GNU/Linux)
 
 iQEcBAABAgAGBQJX2DP2AAoJEL/70l94x66DIBYH/2pW+/HYexCobNn9eVD0Wm08
 im0mRHIU0vjfTaeZSasJPXvA2FyYQLl9KnSFvUFcRiLILpp+hE3QdZ8o0QGlfAmE
 +5MWsPJDXMbOaCOfMKZpZvPfJ6q/lSTg6eiJTPiRgyU7fQgjMDAot1s44ETYGVRu
 myeheEvjSwm/aT9sRIUK6KC7LWXGHFYRYzYJDnvoN6svHZ10DcEDhve8bdmixFk0
 0zUY4RmPk8n46SntDG65tgAlKlzfSuPOesvbpcQIYe1H+r+uJt9BST7MjKdbdDQv
 b/LDzMx8CTbd2tDPL6JWgjBGBZ6SZ4Q6x0a45kzJRtkS+BPtNeGGzBVwULVN4RY=
 =eAJS
 -----END PGP SIGNATURE-----

Merge remote-tracking branch 'remotes/bonzini/tags/for-upstream' into staging

* minor patches here and there
* MTTCG: lock-free TB lookup
* SCSI: bugfixes for MPTSAS, MegaSAS, LSI53c, vmw_pvscsi
* buffer_is_zero rewrite (except for one patch)
* chardev: qemu_chr_fe_write checks
* checkpatch improvement for markdown preformatted text
* default-configs cleanups
* atomics cleanups

# gpg: Signature made Tue 13 Sep 2016 18:14:30 BST
# gpg:                using RSA key 0xBFFBD25F78C7AE83
# gpg: Good signature from "Paolo Bonzini <bonzini@gnu.org>"
# gpg:                 aka "Paolo Bonzini <pbonzini@redhat.com>"
# Primary key fingerprint: 46F5 9FBD 57D6 12E7 BFD4  E2F7 7E15 100C CD36 69B1
#      Subkey fingerprint: F133 3857 4B66 2389 866C  7682 BFFB D25F 78C7 AE83

* remotes/bonzini/tags/for-upstream: (58 commits)
  cutils: Add generic prefetch
  cutils: Add SSE4 version
  cutils: Add test for buffer_is_zero
  cutils: Remove ppc buffer zero checking
  cutils: Remove aarch64 buffer zero checking
  cutils: Rearrange buffer_is_zero acceleration
  cutils: Export only buffer_is_zero
  cutils: Remove SPLAT macro
  cutils: Move buffer_is_zero and subroutines to a new file
  ppc: do not redefine CPUPPCState
  x86/lapic: Load LAPIC state at post_load
  optionrom: do not rely on compiler's bswap optimization
  checkpatch: Fix whitespace checks for documentation code blocks
  atomics: Use __atomic_*_n() variant primitives
  atomics: Remove redundant barrier()'s
  kvm-all: drop kvm_setup_guest_memory
  i8257: Make device "i8257" unavailable with -device
  Revert "megasas: remove useless check for cmd->frame"
  char: convert qemu_chr_fe_write to qemu_chr_fe_write_all
  hw: replace most use of qemu_chr_fe_write with qemu_chr_fe_write_all
  ...

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>

 Conflicts:
	cpus.c
	tests/Makefile.include
2016-09-15 10:24:22 +01:00
Cao jin
d90f3cca87 cpus: update comments
The returned value of cpu_get_clock() is plused with the offset,
so it is the time elapsed in virtual machine when vm is active.

Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc  Peter Crosthwaite <crosthwaite.peter@gmail.com>
Cc: Richard Henderson <rth@twiddle.net>
Signed-off-by: Cao jin <caoj.fnst@cn.fujitsu.com>
Message-Id: <1469790338-28990-4-git-send-email-caoj.fnst@cn.fujitsu.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-09-13 19:08:44 +02:00
Cao jin
1d45cea549 cpus: rename local variable to meaningful one
The function actually returns monotonic time value in nanosecond,
the "ticks" is not suitable.

Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc  Peter Crosthwaite <crosthwaite.peter@gmail.com>
Cc: Richard Henderson <rth@twiddle.net>
Signed-off-by: Cao jin <caoj.fnst@cn.fujitsu.com>
Message-Id: <1469790338-28990-3-git-send-email-caoj.fnst@cn.fujitsu.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-09-13 19:08:44 +02:00
Cao jin
3224e8786f timer/cpus: fix some typos and update some comments
Signed-off-by: Cao jin <caoj.fnst@cn.fujitsu.com>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2016-09-13 18:12:34 +03:00
Emilio G. Cota
03719e44b6 seqlock: rename write_lock/unlock to write_begin/end
It is a more appropriate name, now that the mutex embedded
in the seqlock is gone.

Reviewed-by: Sergey Fedorov <sergey.fedorov@linaro.org>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Emilio G. Cota <cota@braap.org>
Message-Id: <1465412133-3029-4-git-send-email-cota@braap.org>
Signed-off-by: Richard Henderson <rth@twiddle.net>
2016-06-11 22:59:34 +00:00
Emilio G. Cota
ccdb3c1fc8 seqlock: remove optional mutex
This option is unused; besides, it bloats the struct when not needed.
Let's just let writers define their own locks elsewhere.

Reviewed-by: Sergey Fedorov <sergey.fedorov@linaro.org>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Emilio G. Cota <cota@braap.org>
Message-Id: <1465412133-3029-3-git-send-email-cota@braap.org>
Signed-off-by: Richard Henderson <rth@twiddle.net>
2016-06-11 22:59:34 +00:00
Bharata B Rao
2c579042e3 cpu: Add a sync version of cpu_remove()
This sync API will be used by the CPU hotplug code to wait for the CPU to
completely get removed before flagging the failure to the device_add
command.

Sync version of this call is needed to correctly recover from CPU
realization failures when ->plug() handler fails.

Signed-off-by: Bharata B Rao <bharata@linux.vnet.ibm.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Acked-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2016-05-30 14:17:05 +10:00
Gu Zheng
4c055ab54f cpu: Reclaim vCPU objects
In order to deal well with the kvm vcpus (which can not be removed without any
protection), we do not close KVM vcpu fd, just record and mark it as stopped
into a list, so that we can reuse it for the appending cpu hot-add request if
possible. It is also the approach that kvm guys suggested:
https://www.mail-archive.com/kvm@vger.kernel.org/msg102839.html

Signed-off-by: Chen Fan <chen.fan.fnst@cn.fujitsu.com>
Signed-off-by: Gu Zheng <guz.fnst@cn.fujitsu.com>
Signed-off-by: Zhu Guihua <zhugh.fnst@cn.fujitsu.com>
Signed-off-by: Bharata B Rao <bharata@linux.vnet.ibm.com>
               [- Explicit CPU_REMOVE() from qemu_kvm/tcg_destroy_vcpu()
                  isn't needed as it is done from cpu_exec_exit()
                - Use iothread mutex instead of global mutex during
                  destroy
                - Don't cleanup vCPU object from vCPU thread context
                  but leave it to the callers (device_add/device_del)]
Reviewed-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2016-05-30 14:03:59 +10:00
Bandan Das
1453e6627d cpus: call the core nmi injection function
We can call the common function here directly since
x86 specific actions will be taken care of by the arch
specific nmi handler

Signed-off-by: Bandan Das <bsd@redhat.com>
Message-Id: <1463761717-26558-4-git-send-email-bsd@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-05-23 16:53:47 +02:00
Peter Maydell
a2d1761da1 cpus.c: Use pthread_sigmask() rather than sigprocmask()
On Linux, sigprocmask() and pthread_sigmask() are in practice the
same thing (they only set the signal mask for the calling thread),
but the documentation states that the behaviour of sigprocmask() in a
multithreaded process is undefined. Use pthread_sigmask() instead
(which is what we do in almost all places in QEMU that alter the
signal mask already).

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Message-Id: <1463420039-29761-1-git-send-email-peter.maydell@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-05-23 16:53:45 +02:00
Paolo Bonzini
63c915526d cpu: move exec-all.h inclusion out of cpu.h
exec-all.h contains TCG-specific definitions.  It is not needed outside
TCG-specific files such as translate.c, exec.c or *helper.c.

One generic function had snuck into include/exec/exec-all.h; move it to
include/qom/cpu.h.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-05-19 16:42:29 +02:00
Paolo Bonzini
33c11879fd qemu-common: push cpu.h inclusion out of qemu-common.h
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-05-19 16:42:29 +02:00
Alex Bennée
ccffff48c9 cpus: don't use atomic_read for vm_clock_warp_start
As vm_clock_warp_start is a 64 bit value this causes problems for the
compiler trying to come up with a suitable atomic operation on 32 bit
hosts. Because the variable is protected by vm_clock_seqlock, we check its
value inside a seqlock critical section.

Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Message-Id: <1459780549-12942-2-git-send-email-alex.bennee@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-04-05 11:46:52 +02:00
Rutuja Shah
73bcb24d93 Replaced get_tick_per_sec() by NANOSECONDS_PER_SECOND
This patch replaces get_ticks_per_sec() calls with the macro
NANOSECONDS_PER_SECOND. Also, as there are no callers, get_ticks_per_sec()
is then removed.  This replacement improves the readability and
understandability of code.

For example,

    timer_mod(fdctrl->result_timer,
	      qemu_clock_get_ns(QEMU_CLOCK_VIRTUAL) + (get_ticks_per_sec() / 50));

NANOSECONDS_PER_SECOND makes it obvious that qemu_clock_get_ns
matches the unit of the expression on the right side of the plus.

Signed-off-by: Rutuja Shah <rutu.shah.26@gmail.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-03-22 22:20:17 +01:00
Max Reitz
da31d594cf block: Use blk_{commit,flush}_all() consistently
Replace bdrv_commmit_all() and bdrv_flush_all() by their BlockBackend
equivalents.

Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2016-03-17 15:47:56 +01:00
Pavel Dovgalyuk
e76d1798fa icount: decouple warp calls
qemu_clock_warp function is called to update virtual clock when CPU
is sleeping. This function includes replay checkpoint to make execution
deterministic in icount mode.
Record/replay module flushes async event queue at checkpoints.
Some of the events (e.g., block devices operations) include interaction
with hardware. E.g., APIC polled by block devices sets one of IRQ flags.
Flag to be set depends on currently executed thread (CPU or iothread).
Therefore in replay mode we have to process the checkpoints in the same thread
as they were recorded.
qemu_clock_warp function (and its checkpoint) may be called from different
thread. This patch decouples two different execution cases of this function:
call when CPU is sleeping from iothread and call from cpu thread to update
virtual clock.
First task is performed by qemu_start_warp_timer function. It sets warp
timer event to the moment of nearest pending virtual timer.
Second function (qemu_account_warp_timer) is called from cpu thread
before execution of the code. It advances virtual clock by adding the length
of period while CPU was sleeping.

Signed-off-by: Pavel Dovgalyuk <pavel.dovgaluk@ispras.ru>
Message-Id: <20160310115609.4812.44986.stgit@PASHA-ISP>
[Update docs. - Paolo]
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-03-15 18:23:45 +01:00