Some functions that were called from the dataplane code are now only used
locally:
virtio_blk_init_request()
virtio_blk_handle_request()
virtio_blk_submit_multireq()
since commit "03de2f527499 virtio-blk: do not use vring in dataplane", and
virtio_blk_free_request()
since commit "6aa46d8ff1ee virtio: move VirtQueueElement at the beginning
of the structs".
This patch converts them to static.
Signed-off-by: Greg Kurz <groug@kaod.org>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
During device reset or similar situations a VirtQueueElement needs to be
freed without pushing it onto the used ring or rewinding the virtqueue.
Extract a new function to do this.
Later patches add virtio_detach_element() calls to existing device so
that scatter-gather lists are unmapped and vq->inuse goes back to zero
during device reset. Currently some devices don't bother and simply
call g_free(elem) which is not a clean way to throw away a
VirtQueueElement.
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Acked-by: Greg Kurz <groug@kaod.org>
Reviewed-by: Ladi Prosek <lprosek@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Replace repeated pattern
for (i = 0; i < nb_numa_nodes; i++) {
if (test_bit(idx, numa_info[i].node_cpu)) {
...
break;
with a helper function to lookup numa node index for cpu.
Suggested-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Shannon Zhao <shannon.zhao@linaro.org>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Add support for enabling the virtio 1.0 "emergency write"
(VIRTIO_CONSOLE_F_EMERG_WRITE) feature. The previous patch introduced
the plumbing required for this; now we expose the virtio feature to
the guest. The feature is disabled for compatibility machines to avoid
exposing a new feature to existing guests.
As required by the virtio 1.0 spec, the emergency write functionality
is available to the guest even if the guest doesn't negotatiate the
feature, as well as before feature negotation.
Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: Sascha Silbe <silbe@linux.vnet.ibm.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
According to linux kernel commit <89c1e79eb30> ("linux/bitmap.h: improve
BITMAP_{LAST,FIRST}_WORD_MASK"), these two macro could be improved.
This patch takes this change and also move them all in header file.
Signed-off-by: Wei Yang <richard.weiyang@gmail.com>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
Almost all block devices are qdevified by now. This allows us to go back
from the BlockBackend to the DeviceState. xen_disk is the last device
that is missing. We'll remember in the BlockBackend if a xen_disk is
attached and can then disable any features that require going from a BB
to the DeviceState.
While at it, clearly mark the function used by xen_disk as legacy even
in its name, not just in TODO comments.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Recently we moved a few options from QemuOptsLists in blockdev.c to
bdrv_runtime_opts in block.c in order to make them accissble using
blockdev-add. However, this has the side effect that these options are
missing from query-command-line-options now, and libvirt consequently
disables the corresponding feature.
This problem was reported as a regression for the 'discard' option,
introduced in commit 818584a4. However, it is more general than that.
Fix it by adding bdrv_runtime_opts to the list of QemuOptsLists that are
returned in query-command-line-options. For the future, libvirt is
advised to use QMP schema introspection for block device options.
Reported-by: Michal Privoznik <mprivozn@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Tested-by: Michal Privoznik <mprivozn@redhat.com>
Tested-by: Gerd Hoffmann <kraxel@redhat.com>
qemu_bh_delete is already clearing bh->scheduled at the same time
as it's setting bh->deleted. Since it's not using any memory
barriers, there is no synchronization going on for bh->deleted,
and this makes the bh->deleted checks superfluous in aio_compute_timeout,
aio_bh_poll and aio_ctx_check.
Just remove them, and put the (bh->scheduled && bh->deleted) combo
to work in a new function aio_bh_schedule_oneshot. The new function
removes the need to save the QEMUBH pointer between the creation
and the execution of the bottom half.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
A couple of distributors are compiling their distributions
with "-mcpu=power8" for ppc64le these days, so the user sooner
or later runs into a crash there when not explicitely specifying
the "-cpu POWER8" option to QEMU (which is currently using POWER7
for the "pseries" machine by default). Due to this reason, the
linux-user target already switched to POWER8 a while ago (see commit
de3f1b9841). Since the softmmu target
of course has the same problem, we should switch there to POWER8 for
the newer machine types, too.
Signed-off-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Add a generic loader to QEMU which can be used to load images or set
memory values.
Internally inside QEMU this is a device. It is a strange device that
provides no hardware interface but allows QEMU to monkey patch memory
specified when it is created. To be able to do this it has a reset
callback that does the memory operations.
This device allows the user to monkey patch memory. To be able to do
this it needs a backend to manage the datas, the same as other
memory-related devices. In this case as the backend is so trivial we
have merged it with the frontend instead of creating and maintaining a
seperate backend.
Signed-off-by: Alistair Francis <alistair.francis@xilinx.com>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Acked-by: Markus Armbruster <armbru@redhat.com>
Message-id: 10f2a9dce5e5e11b6c6d959415b0ad6ee22bcba5.1475195078.git.alistair.francis@xilinx.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
ACPI Spec 6.0 introduces GIC Interrupt Translation Service Structure.
Here we add the definition of the Structure.
Signed-off-by: Shannon Zhao <shannon.zhao@linaro.org>
Signed-off-by: Eric Auger <eric.auger@redhat.com>
Message-id: 1474616617-366-8-git-send-email-eric.auger@redhat.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Introduce global kvm_msi_use_devid flag plus associated
kvm_msi_devid_required() macro. Passes the device ID,
if needed, while building the MSI route entry. Device IDs are
required by the ARM GICv3 ITS (IRQ remapping function is based on
this information).
Signed-off-by: Pavel Fedin <p.fedin@samsung.com>
Signed-off-by: Eric Auger <eric.auger@redhat.com>
Message-id: 1474616617-366-5-git-send-email-eric.auger@redhat.com
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
This is the basic skeleton for both KVM and software-emulated ITS.
Since we already prepare status structure, we also introduce complete
VMState description. But, because we currently have no migratable
implementations, we also set unmigratable flag.
Signed-off-by: Pavel Fedin <p.fedin@samsung.com>
Signed-off-by: Eric Auger <eric.auger@redhat.com>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 1474616617-366-3-git-send-email-eric.auger@redhat.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Add the STM32F2xx ADC device. This device randomly
generates values on each read.
This also includes creating a hw/adc directory.
Signed-off-by: Alistair Francis <alistair@alistair23.me>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 3240e660adaf537f55a63ce06096e844aece8cda.1474742262.git.alistair@alistair23.me
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
This is a small helper that tries to fetch binary name for given
PID.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Message-Id: <4d75d475c1884f8e94ee8b1e57273ddf3ed68bf7.1474987617.git.mprivozn@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
mux_chr_update_read_handler() is adding a new mux_cnt each time
mux_chr_update_read_handler() is called, it's not possible to actually
update the "child" chr callbacks that were set previously. This may lead
to crashes if the "child" chr is destroyed:
valgrind x86_64-softmmu/qemu-system-x86_64 -chardev
stdio,mux=on,id=char0 -mon chardev=char0,mode=control,default
when quitting:
==4306== Invalid read of size 8
==4306== at 0x8061D3: json_lexer_destroy (json-lexer.c:385)
==4306== by 0x7E39F8: json_message_parser_destroy (json-streamer.c:134)
==4306== by 0x3447F6: monitor_qmp_event (monitor.c:3908)
==4306== by 0x480153: mux_chr_send_event (qemu-char.c:630)
==4306== by 0x480694: mux_chr_event (qemu-char.c:734)
==4306== by 0x47F1E9: qemu_chr_be_event (qemu-char.c:205)
==4306== by 0x481207: fd_chr_close (qemu-char.c:1114)
==4306== by 0x481659: qemu_chr_close_stdio (qemu-char.c:1221)
==4306== by 0x486F07: qemu_chr_free (qemu-char.c:4146)
==4306== by 0x486F97: qemu_chr_delete (qemu-char.c:4154)
==4306== by 0x487E66: qemu_chr_cleanup (qemu-char.c:4678)
==4306== by 0x495A98: main (vl.c:4675)
==4306== Address 0x28439e90 is 112 bytes inside a block of size 240 free'd
==4306== at 0x4C2CD5A: free (vg_replace_malloc.c:530)
==4306== by 0x1E4CBF2D: g_free (in /usr/lib64/libglib-2.0.so.0.4800.2)
==4306== by 0x344DE9: monitor_cleanup (monitor.c:4058)
==4306== by 0x495A93: main (vl.c:4674)
==4306== Block was alloc'd at
==4306== at 0x4C2BBAD: malloc (vg_replace_malloc.c:299)
==4306== by 0x1E4CBE18: g_malloc (in /usr/lib64/libglib-2.0.so.0.4800.2)
==4306== by 0x344BF8: monitor_init (monitor.c:4021)
==4306== by 0x49063C: mon_init_func (vl.c:2417)
==4306== by 0x7FC6DE: qemu_opts_foreach (qemu-option.c:1116)
==4306== by 0x4954E0: main (vl.c:4473)
Instead, keep the "child" chr associated with a particular idx so its
handlers can be updated and removed to avoid the crash.
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-Id: <20161003094704.18087-3-marcandre.lureau@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
There is a data race if the sequence is written concurrently to the
read. In C11 this has undefined behavior. Use atomic_set; the
read side is already using atomic_read.
Reported-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Message-Id: <20160930213106.20186-6-alex.bennee@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Add some notes on the use of the relaxed atomic access helpers and their
importance for defined behaviour in C11's multi-threaded memory model.
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Message-Id: <20160930213106.20186-3-alex.bennee@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Only very modern GCC's actually set this define when building with the
ThreadSanitizer so this little typo slipped though.
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Message-Id: <20160930213106.20186-2-alex.bennee@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
This interface will be used by HMP commands 'info irq' and 'info pic'.
Signed-off-by: Hervé Poussineau <hpoussin@reactos.org>
Message-Id: <1474921408-24710-2-git-send-email-hpoussin@reactos.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Current CPU definition for AMD Opteron third generation includes
features like SSE4a and LAHF_LM support in emulated CPUID. These
features are present in K8 rev.E or K10 CPUs and later. However,
current G3 family and model describe 2nd generation K8 cores instead.
This is incorrect but was considered harmless until our tests found a
problem with linux kernels >= 3.10 (and maybe earlier) which specifically
check for Opteron K8 model when parsing CPUID leaf 0x80000001:
http://lxr.free-electrons.com/source/arch/x86/kernel/cpu/amd.c?v=3.16#L552
This code will disable LAHF_LM feature in /proc/cpuinfo if model number
is inconsistent.
This change sets Opteron_G3 family/model/stepping to 16/2/3 which is
a proper Opteron 3rd generation 2350 CPU.
Signed-off-by: Evgeny Yakovlev <eyakovlev@virtuozzo.com>
Signed-off-by: Denis V. Lunev <den@openvz.org>
CC: Paolo Bonzini <pbonzini@redhat.com>
CC: Richard Henderson <rth@twiddle.net>
CC: Eduardo Habkost <ehabkost@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Fix a memory leak in ide_register_restart_cb() in hw/ide/core.c and add
idebus_unrealize() in hw/ide/qdev.c to have calls to
qemu_del_vm_change_state_handler() to deal with the dangling change
state handler during hot-unplugging ide devices which might lead to a
crash.
Signed-off-by: Ashijeet Acharya <ashijeetacharya@gmail.com>
Reviewed-by: John Snow <jsnow@redhat.com>
Message-id: 1474995212-10580-1-git-send-email-ashijeetacharya@gmail.com
[Minor whitespace fix --js]
Signed-off-by: John Snow <jsnow@redhat.com>
Signed-off-by: Peter Lieven <pl@kamp.de>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
the allocated stack will be adjusted to the minimum supported stack size
by the OS and rounded up to be a multiple of the system pagesize.
Additionally an architecture dependent guard page is added to the stack
to catch stack overflows.
Signed-off-by: Peter Lieven <pl@kamp.de>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
This enables its use for nested child nodes. The compatibility
between the 'discard' and 'detect-zeroes' setting is checked in
bdrv_open_common() now as the former setting isn't available before
calling bdrv_open() any more.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Instead of modifying the new BDS after it has been opened, use the newly
supported 'detect-zeroes' option in bdrv_open_common() so that all
requirements are checked (detect-zeroes=unmap requires discard=unmap).
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
We can teach Xen to drain and flush each device as it needs to, instead
of trying to flush ALL devices. This removes the last user of
blk_flush_all.
The function is therefore removed under the premise that any new uses
of blk_flush_all would be the wrong paradigm: either flush the single
device that requires flushing, or use an appropriate flush_all mechanism
from outside of the BlkBackend layer.
Signed-off-by: John Snow <jsnow@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Acked-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Commit fe1a9cbc moved the flush_all routine from the bdrv layer to the
block-backend layer. In doing so, however, the semantics of the routine
changed slightly such that flush_all now used blk_flush instead of
bdrv_flush.
blk_flush can fail if the attached device model reports that it is not
"available," (i.e. the tray is open.) This changed the semantics of
flush_all such that it can now fail for e.g. open CDROM drives.
Reintroduce bdrv_flush_all to regain the old semantics without having to
alter the behavior of blk_flush or blk_flush_all, which are already
'doing the right thing.'
Signed-off-by: John Snow <jsnow@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Acked-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
See the doc comments for a description of this new coroutine API.
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Message-id: 1474989516-18255-2-git-send-email-stefanha@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
This switches over spice (in opengl mode) to render DisplaySurface
updates into a opengl texture, using the helper functions in
ui/console-gl.c. With this patch applied spice (with gl=on) will
stop using qxl rendering ops, it will use dma-buf passing all the
time, i.e. for bios/bootloader (before virtio-gpu driver is loaded)
too.
This should improve performance even using spice (with gl=on) with
non-accelerated stdvga because we stop squeezing all display updates
through a unix/tcp socket and basically using a shared memory transport
instead.
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Message-id: 1474617028-3979-3-git-send-email-kraxel@redhat.com
Keep track of gl_block state (added in bba19b8 console: block rendering
until client is done) in QemuConsole and allow to query it. This way
we can avoid state inconsistencies in case different code paths make use
of this.
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-id: 1474617028-3979-2-git-send-email-kraxel@redhat.com
Obviously, we should write to '@target'.
Signed-off-by: Yaowei Bai <baiyaowei@cmss.chinamobile.com>
Reviewed-by: Xiubo Li <lixiubo@cmss.chinamobile.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Message-id: 1473851019-7005-2-git-send-email-baiyaowei@cmss.chinamobile.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Copy data operated on during request from/to local buffers to/from
the grant references.
Before grant copy operation local buffers must be allocated what is
done by calling ioreq_init_copy_buffers. For the 'read' operation,
first, the qemu device invokes the read operation on local buffers
and on the completion grant copy is called and buffers are freed.
For the 'write' operation grant copy is performed before invoking
write by qemu device.
A new value 'feature_grant_copy' is added to recognize when the
grant copy operation is supported by a guest.
Signed-off-by: Paulina Szubarczyk <paulinaszubarczyk@gmail.com>
Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
Acked-by: Anthony PERARD <anthony.perard@citrix.com>
Acked-by: Roger Pau Monné <roger.pau@citrix.com>
Functions of type FindSysbusDeviceFunc currently return an integer.
However, this return value is always ignored by the caller in
find_sysbus_device().
This changes the function type to return void, to avoid confusion over
the function semantics.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Eduardo Habkost <ehabkost@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Those are unneeded now that CPUState nr_{cores,threads} is always
initialized.
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Eduardo Habkost <ehabkost@redhat.com>
Acked-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Instead of requiring users and management software to be aware of
required CPUID level/xlevel/xlevel2 values for each feature,
automatically increase those values when features need them.
This was already done for CPUID[7].EBX, and is now made generic
for all CPUID feature flags. Unit test included, to make sure we
don't break ABI on older machine-types and don't mess with the
CPUID level values if they are explicitly set by the user.
Reviewed-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
This patch fixes bug with stopping and restarting replay
through monitor.
Signed-off-by: Pavel Dovgalyuk <pavel.dovgaluk@ispras.ru>
Message-Id: <20160926080815.6992.71818.stgit@PASHA-ISP>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Set cpu->running without taking the cpu_list lock, only requiring it if
there is a concurrent exclusive section. This requires adding a new
field to CPUState, which records whether a running CPU is being counted
in pending_cpus.
When an exclusive section is started concurrently with cpu_exec_start,
cpu_exec_start can use the new field to determine if it has to wait for
the end of the exclusive section. Likewise, cpu_exec_end can use it to
see if start_exclusive is waiting for that CPU.
This a separate patch for easier bisection of issues.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Use async_safe_run_on_cpu() to make tb_flush() thread safe. This is
possible now that code generation does not happen in the middle of
execution.
It can happen that multiple threads schedule a safe work to flush the
translation buffer. To keep statistics and debugging output sane, always
check if the translation buffer has already been flushed.
Signed-off-by: Sergey Fedorov <serge.fdrv@gmail.com>
Signed-off-by: Sergey Fedorov <sergey.fedorov@linaro.org>
[AJB: minor re-base fixes]
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Message-Id: <1470158864-17651-13-git-send-email-alex.bennee@linaro.org>
Reviewed-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
It is not necessary to hold qemu_cpu_list_mutex throughout the
exclusive section, because no other exclusive section can run
while pending_cpus != 0.
exclusive_idle() is called in cpu_exec_start(), and that prevents
any CPUs created after start_exclusive() from entering cpu_exec()
during an exclusive section.
Reviewed-by: Richard Henderson <rth@twiddle.net>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
This will serve as the base for async_safe_run_on_cpu. Because
start_exclusive uses CPU_FOREACH, merge exclusive_lock with
qemu_cpu_list_lock: together with a call to exclusive_idle (via
cpu_exec_start/end) in cpu_list_add, this protects exclusive work
against concurrent CPU addition and removal.
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Make CPU work core functions common between system and user-mode
emulation. User-mode does not use run_on_cpu, so do not implement it.
Signed-off-by: Sergey Fedorov <serge.fdrv@gmail.com>
Signed-off-by: Sergey Fedorov <sergey.fedorov@linaro.org>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Message-Id: <1470158864-17651-10-git-send-email-alex.bennee@linaro.org>
Reviewed-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Add a mutex for the CPU list to system emulation, as it will be used to
manage safe work. Abstract manipulation of the CPU list in new functions
cpu_list_add and cpu_list_remove.
Reviewed-by: Richard Henderson <rth@twiddle.net>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
CPUState is a fairly common pointer to pass to these helpers. This means
if you need other arguments for the async_run_on_cpu case you end up
having to do a g_malloc to stuff additional data into the routine. For
the current users this isn't a massive deal but for MTTCG this gets
cumbersome when the only other parameter is often an address.
This adds the typedef run_on_cpu_func for helper functions which has an
explicit CPUState * passed as the first parameter. All the users of
run_on_cpu and async_run_on_cpu have had their helpers updated to use
CPUState where available.
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
[Sergey Fedorov:
- eliminate more CPUState in user data;
- remove unnecessary user data passing;
- fix target-s390x/kvm.c and target-s390x/misc_helper.c]
Signed-off-by: Sergey Fedorov <sergey.fedorov@linaro.org>
Acked-by: David Gibson <david@gibson.dropbear.id.au> (ppc parts)
Reviewed-by: Christian Borntraeger <borntraeger@de.ibm.com> (s390 parts)
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Message-Id: <1470158864-17651-3-git-send-email-alex.bennee@linaro.org>
Reviewed-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Migrating a VM during reboot sometimes results in differences
between the source and destination in the SMRAM area.
This is because migration_bitmap_sync() only fetches from KVM
the dirty log of address_space_memory. SMRAM memory slots
are ignored and the modifications to SMRAM are not sent to the
destination.
Reported-by: He Rongguang <herongguang.he@huawei.com>
Reviewed-by: He Rongguang <herongguang.he@huawei.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
As discussed on the list [1], having a comment stating that this file
is "public domain" is arguably wrong and not legally binding. This patch
replaces that comment with a clear GPLv2+ license as proposed in [2].
[1] http://lists.nongnu.org/archive/html/qemu-devel/2016-09/msg06151.html
[2] http://lists.nongnu.org/archive/html/qemu-devel/2016-09/msg06217.html
Worth noting, compiler.h was originally created on 5c026320 by splitting
qemu-common.h. At the time, qemu-common.h was already GPLv2+.
Signed-off-by: Felipe Franciosi <felipe@nutanix.com>
Message-Id: <1474642971-11866-1-git-send-email-felipe@nutanix.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
It's 2.8 now, and maybe it's time to switch IOAPIC default version to
0x20.
Signed-off-by: Peter Xu <peterx@redhat.com>
Message-Id: <1474608795-23058-1-git-send-email-peterx@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Jhash will be used by colo-compare and filter-rewriter
to save and lookup net connection info
Signed-off-by: Zhang Chen <zhangchen.fnst@cn.fujitsu.com>
Signed-off-by: Li Zhijian <lizhijian@cn.fujitsu.com>
Signed-off-by: Wen Congyang <wency@cn.fujitsu.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
Add qemu_chr_add_handlers_full() API, we can use
this API pass in a GMainContext,make handler run
in the context rather than main_loop.
This comments from Daniel P . Berrange.
Signed-off-by: Zhang Chen <zhangchen.fnst@cn.fujitsu.com>
Signed-off-by: Li Zhijian <lizhijian@cn.fujitsu.com>
Signed-off-by: Wen Congyang <wency@cn.fujitsu.com>
Reviewed-by: Daniel P. Berrange <berrange@redhat.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
This allows increasing the rx queue size up to 1024: unlike with tx,
guests don't put in huge S/G lists into RX so the risk of running into
the max 1024 limitation due to some off-by-one seems small.
It's helpful for users like OVS-DPDK which don't do any buffering on the
host - 1K roughly matches 500 entries in tun + 256 in the current rx
queue, which seems to work reasonably well. We could probably make do
with ~750 entries but virtio spec limits us to powers of two.
It might be a good idea to specify an s/g size limit in a future
version.
It also might be possible to make the queue size smaller down the road, 64
seems like the minimal value which will still work (as guests seem to
assume a queue full of 1.5K buffers is enough to process the largest
incoming packet, which is ~64K). No one actually asked for this, and
with virtio 1 guests can reduce ring size without need for host
configuration, so don't bother with this for now.
Cc: Cornelia Huck <cornelia.huck@de.ibm.com>
Cc: Jason Wang <jasowang@redhat.com>
Suggested-by: Patrik Hermansson <phermansson@gmail.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
The new interface can be used to replace the old notify_started() and
notify_stopped(). Meanwhile it provides explicit flags so that IOMMUs
can know what kind of notifications it is requested for.
Acked-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Peter Xu <peterx@redhat.com>
Message-Id: <1474606948-14391-3-git-send-email-peterx@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
IOMMU Notifier list is used for notifying IO address mapping changes.
Currently VFIO is the only user.
However it is possible that future consumer like vhost would like to
only listen to part of its notifications (e.g., cache invalidations).
This patch introduced IOMMUNotifier and IOMMUNotfierFlag bits for a
finer grained control of it.
IOMMUNotifier contains a bitfield for the notify consumer describing
what kind of notification it is interested in. Currently two kinds of
notifications are defined:
- IOMMU_NOTIFIER_MAP: for newly mapped entries (additions)
- IOMMU_NOTIFIER_UNMAP: for entries to be removed (cache invalidates)
When registering the IOMMU notifier, we need to specify one or multiple
types of messages to listen to.
When notifications are triggered, its type will be checked against the
notifier's type bits, and only notifiers with registered bits will be
notified.
(For any IOMMU implementation, an in-place mapping change should be
notified with an UNMAP followed by a MAP.)
Signed-off-by: Peter Xu <peterx@redhat.com>
Message-Id: <1474606948-14391-2-git-send-email-peterx@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Add IVRS table for AMD IOMMU. Generate IVRS or DMAR
depending on emulated IOMMU.
Signed-off-by: David Kiarie <davidkiarie4@gmail.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Introduce PCI macros from for use by AMD IOMMU
Signed-off-by: David Kiarie <davidkiarie4@gmail.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
QEMU prints an error message and exits when the device enters an invalid
state. Terminating the process is heavy-handed. The guest may still be
able to function even if there is a bug in a virtio guest driver.
Moreover, exiting is a bug in nested virtualization where a nested guest
could DoS other nested guests by killing a pass-through virtio device.
I don't think this configuration is possible today but it is likely in
the future.
If the broken flag is set, do not process virtqueues or write back used
descriptors. The broken flag can be cleared again by resetting the
device.
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com>
commit (14c985cff target-i386: present virtual L3 cache info for vcpus)
misplaced compat property putting it in new 2.8 machine type
which would effectively to disable feature until 2.9 is released.
Intent of commit probably should be to disable feature for 2.7
and older while allowing not yet released 2.8 to have feature
enabled by default.
Cc: qemu-stable@nongnu.org
Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Marcel Apfelbaum <marcel@redhat.com>
Reviewed-by: Eduardo Habkost <ehabkost@redhat.com>
Since commit
bacc344c ("machine: add properties to compat_props incrementaly")
there is no need to chain per machine type compat macro.
Clean up places where it was done anyway so it will be
consistent and won't confuse contributors during addtion
of new machine types.
Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Eduardo Habkost <ehabkost@redhat.com>
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2.0.22 (GNU/Linux)
iQIcBAABAgAGBQJX5RkyAAoJEH8JsnLIjy/WMnYP/i03Ii6nt8VY2PMX10NK3z+H
5QskSnGzBgQg0OVFPQkvV1nW8bXPN07n/L2Q0H9FzQXTUzCbAkQC4MkzBZTMgaUp
63XVRG151+AjqctvIncWQktUgMg6ywKBCug8G6gwMQ1BVsLerUJmE2tM0JGZF6WL
q+F6uTtVMLNKR6miaTsJtFJA+ts9gGEecGNITRByEWbtD7ESF8TOXhCfpV/Ni45s
5pMVOQT0o9jwgjekrUbJ9PrlxGSLfe/bi3ypqy6boe8EXQS6DImAiDKWkTQjGj2P
cGlZ0oxWHq0g/15sgZOzqDtOXqE02+RK9C1UcASgObfdUos6gFpi6G7cAffzP9aq
wzhi8u8Beth2WZP8tLLAJtoQjJhcjutWSw60HPhEcHZNkkaZF3KMXGDk8j1xp5EN
8VVkWVZr+1ZOP+HsHCNrpvag+IP49DJ5j50W+oVqSKSzG37AC6SxNo3PEsg1urAi
m5APFJbCgxMfDSnb8yw6ClBIe1IH624VMoELVzY1C9AqOnTxfJ08o4MbE8t1DEoa
dQV/R3Mrqt3W+YflWNPXzVhRxD/mZW+RXPlBARYvFzHW12fh1PcwFWrB+ivdqu+4
x0VvmZebJcEP+CxmRQOat80Zhos1Hs+ZyxUHyA5a0+Nt3YhK7k3sr4CQXxr5MIHA
1c/D8znmWf9VksmW33U5
=JG9f
-----END PGP SIGNATURE-----
Merge remote-tracking branch 'remotes/kevin/tags/for-upstream' into staging
Block layer patches
# gpg: Signature made Fri 23 Sep 2016 12:59:46 BST
# gpg: using RSA key 0x7F09B272C88F2FD6
# gpg: Good signature from "Kevin Wolf <kwolf@redhat.com>"
# Primary key fingerprint: DC3D EB15 9A9A F95D 3D74 56FE 7F09 B272 C88F 2FD6
* remotes/kevin/tags/for-upstream: (33 commits)
block: Remove BB interface from blockdev-add/del
qemu-iotests/141: Avoid blockdev-add with id
block: Avoid printing NULL string in error messages
qemu-iotests/139: Avoid blockdev-add with id
qemu-iotests/124: Avoid blockdev-add with id
qemu-iotests/118: Avoid blockdev-add with id
qemu-iotests/117: Avoid blockdev-add with id
qemu-iotests/087: Avoid blockdev-add with id
qemu-iotests/081: Avoid blockdev-add with id
qemu-iotests/071: Avoid blockdev-add with id
qemu-iotests/067: Avoid blockdev-add with id
qemu-iotests/041: Avoid blockdev-add with id
qemu-iotests/118: Test media change with qdev name
block: Accept device model name for block_set_io_throttle
block: Accept device model name for blockdev-change-medium
block: Accept device model name for eject
block: Accept device model name for x-blockdev-remove-medium
block: Accept device model name for x-blockdev-insert-medium
block: Accept device model name for blockdev-open/close-tray
qdev-monitor: Add blk_by_qdev_id()
...
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
This pull request supersedes ppc-for-2.8-20160922. There was a clang
build error in that, and I've also added one extra patch in the new pull.
Included in this set of ppc and spapr patches are:
* TCG implementations for more POWER9 instructions
* Some preliminary XICS fixes in preparataion for the pnv machine type
* A significant ADB (Macintosh kbd/mouse) cleanup
* Some conversions to use trace instead of debug macros
* Fixes to correctly handle global TLB flush synchronization in
TCG. This is already a bug, but it will have much more impact
when we get MTTCG
* Add more qtest testcases for Power
* Some MAINTAINERS updates
* Assorted bugfixes
* Add the basics of NUMA associativity to the spapr PCI host bridge
This touches some test files and monitor.c which are technically
outside the ppc code, but coming through this tree because the changes
are primarily of interest to ppc.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2
iQIcBAABCAAGBQJX5NZnAAoJEGw4ysog2bOSoLEP/1YpRFG/6gmiT+T+Btz1QYcd
eqrJkV63/rY/lvgZOvUBdqA/YKaBSWDOEByNFRZ+Grqz9h5zKrRcmM7IWdRWg+vG
gyrZUm1pscFG20iGNcenxB8mD0VMk7C77gnUlv12bo+mK+1D1i8eUfKLFqxb0kOx
JGIRQNG5orF5vZxsyjRPVpvMS9gNG90vrPIypux4ryozCVMWbrjXRZNsPQKz8wb9
UGcJIFB6R6JVbmBGchi434PEJkcdZzP/a0HvVSO51oGsFBnwYwQ7XVc3PyA4KCD7
tTbm6T2Rpdak3Pcd/nuzoXCMBCkh48XGKxZ+yPuLXGG5ZGIZ6rzlHPqBsEqqiLz5
DLzbsxKyLHX2Af87js4J9OXkoNQI4rVGurvNbkQ7IMQ2/Xt97kgUEgr3W0Vj+r82
bqIqWm4OdJ9cDzTGVlQ7l2vLv6RMe7DrkeWRNEKZZgfir7Hgj1gr79BOe96ETKBd
7r/1z0fBkZoWSq2OdjX8RouXMwd1Nq3FnqYv2BQ99rvM/AqpkY0HYsPIfUilHq6T
ZXhvm/4LIEev0F/GiJvV5jHHg637QS4QqdyglF8ODC8vSMvOThhL9Gj7EMgJs7hj
Ywt1B5y88//Zq4+IGVda98J5ynOZO1CArvzoYR5UMnWiq2K0Lxpq7wemE/finyIK
0jWLqlmCmYRzsS+oQEg/
=et1C
-----END PGP SIGNATURE-----
Merge remote-tracking branch 'remotes/dgibson/tags/ppc-for-2.8-20160923' into staging
ppc patch queue 2016-09-23
This pull request supersedes ppc-for-2.8-20160922. There was a clang
build error in that, and I've also added one extra patch in the new pull.
Included in this set of ppc and spapr patches are:
* TCG implementations for more POWER9 instructions
* Some preliminary XICS fixes in preparataion for the pnv machine type
* A significant ADB (Macintosh kbd/mouse) cleanup
* Some conversions to use trace instead of debug macros
* Fixes to correctly handle global TLB flush synchronization in
TCG. This is already a bug, but it will have much more impact
when we get MTTCG
* Add more qtest testcases for Power
* Some MAINTAINERS updates
* Assorted bugfixes
* Add the basics of NUMA associativity to the spapr PCI host bridge
This touches some test files and monitor.c which are technically
outside the ppc code, but coming through this tree because the changes
are primarily of interest to ppc.
# gpg: Signature made Fri 23 Sep 2016 08:14:47 BST
# gpg: using RSA key 0x6C38CACA20D9B392
# gpg: Good signature from "David Gibson <david@gibson.dropbear.id.au>"
# gpg: aka "David Gibson (Red Hat) <dgibson@redhat.com>"
# gpg: aka "David Gibson (ozlabs.org) <dgibson@ozlabs.org>"
# gpg: aka "David Gibson (kernel.org) <dwg@kernel.org>"
# Primary key fingerprint: 75F4 6586 AE61 A66C C44E 87DC 6C38 CACA 20D9 B392
* remotes/dgibson/tags/ppc-for-2.8-20160923: (45 commits)
spapr_pci: Add numa node id
monitor: fix crash for platforms without a CPU 0
linux-user: ppc64: fix ARCH_206 bit in AT_HWCAP
ppc/kvm: Mark 64kB page size support as disabled if not available
ppc/xics: An ICS with offset 0 is assumed to be uninitialized
ppc/xics: account correct irq status
Enable H_CLEAR_MOD and H_CLEAR_REF hypercalls on KVM/PPC64.
target-ppc: tlbie/tlbivax should have global effect
target-ppc: add flag in check_tlb_flush()
target-ppc: add TLB_NEED_LOCAL_FLUSH flag
spapr: Introduce sPAPRCPUCoreClass
target-ppc: implement darn instruction
target-ppc: add stxsi[bh]x instruction
target-ppc: add lxsi[bw]zx instruction
target-ppc: add xxspltib instruction
target-ppc: consolidate store conditional
target-ppc: move out stqcx impementation
target-ppc: consolidate load with reservation
target-ppc: convert st[16,32,64]r to use new macro
target-ppc: convert st64 to use new macro
...
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Version: GnuPG v2
iQEcBAABCAAGBQJX5LZ0AAoJEMo1YkxqkXHGmgAIAKeDTKx0sA76ewIvH9fKdmUu
NNatJw59XnVX8lpfOU5yISkJ4BD6oBdN7tbWaOW8yzcAeYu1Ff5iUu4LBEUFb7eW
g6zqUCV58XjaCTLmTiAfa19Exfnh6pXZlZMRP4Hr3vUVSCHFmC0EyTEllfHxU/jW
aPHtAEge/p6EDAHygHJBTSQzsaXRdyJNyt/AKPreDtblNRT8VgapCDzZQPcCVGH1
F9grWVu0B/VVDS0mfgSRhT0UeF/vtiikuRW92sC4woVVB+brJyG4VwGT8oeUN8RU
30/tGo5p9fpqef3iP669uUrloLfmWcKcIJuPfQ4ZUlZh8kIV+lWK9kZuTVgocGw=
=xLJw
-----END PGP SIGNATURE-----
Merge remote-tracking branch 'remotes/famz/tags/various-pull-request' into staging
# gpg: Signature made Fri 23 Sep 2016 05:58:28 BST
# gpg: using RSA key 0xCA35624C6A9171C6
# gpg: Good signature from "Fam Zheng <famz@redhat.com>"
# gpg: WARNING: This key is not certified with a trusted signature!
# gpg: There is no indication that the signature belongs to the owner.
# Primary key fingerprint: 5003 7CB7 9706 0F76 F021 AD56 CA35 624C 6A91 71C6
* remotes/famz/tags/various-pull-request: (23 commits)
docker: exec $CMD
docker: Terminate instances at SIGTERM and SIGHUP
docker: Support showing environment information
docker: Print used options before doing configure
docker: Flatten default target list in test-quick
docker: Update fedora image to latest
docker: Generate /packages.txt in ubuntu image
docker: Generate /packages.txt in fedora image
docker: Generate /packages.txt in centos6 image
tests: Ignore test-uuid
Add UUID files to MAINTAINERS
tests: Add uuid tests
uuid: Tighten uuid parse
vl: Switch qemu_uuid to QemuUUID
configure: Remove detection code for UUID
tests: No longer dependent on CONFIG_UUID
crypto: Switch to QEMU UUID API
vpc: Use QEMU UUID API
vdi: Use QEMU UUID API
vhdx: Use QEMU UUID API
...
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
# Conflicts:
# tests/Makefile.include
This finds the BlockBackend attached to the device model identified by
its qdev ID.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
This finds a BlockBackend given the device model that is attached to it.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
This adds the "read-only" option to the QDict. One important effect of
this change is that when a child inherits options from its parent, the
existing "read-only" mode can be preserved if it was explicitly set
previously.
This addresses scenarios like this:
[E] <- [D] <- [C] <- [B] <- [A]
In this case, if we reopen [D] with read-only=off, and later reopen
[B], then [D] will not inherit read-only=on from its parent during the
bdrv_reopen_queue_child() stage.
The BDRV_O_RDWR flag is not removed yet, but its keep in sync with the
value of the "read-only" option.
Signed-off-by: Alberto Garcia <berto@igalia.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
This is unnecessary and has been unused since 5433c24f0f.
Signed-off-by: Alberto Garcia <berto@igalia.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Update all qemu_uuid users as well, especially get rid of the duplicated
low level g_strdup_printf, sscanf and snprintf calls with QEMU UUID API.
Since qemu_uuid_parse is quite tangled with qemu_uuid, its switching to
QemuUUID is done here too to keep everything in sync and avoid code
churn.
Signed-off-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Jeff Cody <jcody@redhat.com>
Message-Id: <1474432046-325-10-git-send-email-famz@redhat.com>
A number of different places across the code base use CONFIG_UUID. Some
of them are soft dependency, some are not built if libuuid is not
available, some come with dummy fallback, some throws runtime error.
It is hard to maintain, and hard to reason for users.
Since UUID is a simple standard with only a small number of operations,
it is cleaner to have a central support in libqemuutil. This patch adds
qemu_uuid_* functions that all uuid users in the code base can
rely on. Except for qemu_uuid_generate which is new code, all other
functions are just copy from existing fallbacks from other files.
Note that qemu_uuid_parse is moved without updating the function
signature to use QemuUUID, to keep this patch simple.
Signed-off-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Jeff Cody <jcody@redhat.com>
Message-Id: <1474432046-325-2-git-send-email-famz@redhat.com>
This adds a numa id property to a PHB to allow linking passed PCI device
to CPU/memory. It is up to the management stack to do CPU/memory pinning
to the node with the actual PCI device.
Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
[dwg: Renamed property from "node" to "numa_node" to match the similar
one in the pxb device]
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
This will make life easier for dealing with dynamically configured
ICSes such as PHB3
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Nikunj A Dadhania <nikunj@linux.vnet.ibm.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Each spapr cpu core type defines an instance_init routine which just
populates the CPU class name. This can be done in the class_init
commonly for all core types which simplifies the registration.
This is inspired by how PowerNV core types are registered.
Certain types of spapr cpu cores ('host' and generic type based on host
CPU) are initialized in target-ppc/kvm.c. To convert these type
registrations to use class_init, we need to expose
spapr_cpu_core_class_init() outside of spapr_cpu_core.c.
Commit d11b268e17 added a generic sPAPR CPU core family
type to support cases like POWER8 CPU type on POWER8E host CPU.
Switching to class_init would fix such scenarios to use the right
CPU thread type instead of defaulting to host-powerpc64-cpu.
In an unrelated cleanup, fix a typo in .get_hotplug_handler routine.
Signed-off-by: Bharata B Rao <bharata@linux.vnet.ibm.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Add the adb-keys.h file. It maps ADB transition key codes with values.
Signed-off-by: John Arbuckle <programmingkidx@gmail.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Add a first test to validate the protocol:
- rtas/get-time-of-day compares the time
from the guest with the time from the host.
Signed-off-by: Laurent Vivier <lvivier@redhat.com>
Reviewed-by: Greg Kurz <groug@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Whilst according to the Zynq TRM this device covers a register region of
0x000 - 0x120. The register region is also shared with XADCIF prefix
registers at 0x100 and above. Due to how the devcfg and the xadc devices
are implemented in QEMU these are separate models with individual mmio
regions. As such the region registered by the devcfg overlaps with the
xadc when initialized in a machine model (e.g. xilinx-zynq-a9).
This patch fixes up the incorrect region size, where
XLNX_ZYNQ_DEVCFG_R_MAX is missing its '/ 4' causing it to be 0x460 in
size. As well as setting the region size to the 0x0 - 0x100 region so
that an xadc device instance can be registered in the correct region to
pair with the devcfg device instance.
Mapping with XLNX_ZYNQ_DEVCFG_R_MAX = 0x118:
dev: xlnx.ps7-dev-cfg, id ""
mmio 00000000f8007000/0000000000000460
dev: xlnx,zynq-xadc, id ""
mmio 00000000f8007100/0000000000000020
Mapping with XLNX_ZYNQ_DEVCFG_R_MAX = 0x100 / 4:
dev: xlnx.ps7-dev-cfg, id ""
mmio 00000000f8007000/0000000000000100
dev: xlnx,zynq-xadc, id ""
mmio 00000000f8007100/0000000000000020
Signed-off-by: Nathan Rossi <nathan@nathanrossi.com>
Reviewed-by: Alistair Francis <alistair.francis@xilinx.com>
Message-id: 20160921180911.32289-1-nathan@nathanrossi.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Add a new function load_image_targphys_as() that allows the caller
to specify an AddressSpace to use when loading a targphys. The
original load_image_targphys() function doesn't have any change in
functionality.
Signed-off-by: Alistair Francis <alistair.francis@xilinx.com>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 87de45de7acf02cbe6bae9d6c4d6fb8f3aba4f61.1474331683.git.alistair.francis@xilinx.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Add a new function load_uimage_as() that allows the caller to
specify an AddressSpace to use when loading the uImage. The
original load_uimage() function doesn't have any change in
functionality.
Signed-off-by: Alistair Francis <alistair.francis@xilinx.com>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 1254092e6b80d3cd3cfabafe165d56a96c54c0b5.1474331683.git.alistair.francis@xilinx.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Add a new function load_elf_as() that allows the caller to specify an
AddressSpace to use when loading the ELF. The original load_elf()
function doesn't have any change in functionality.
Signed-off-by: Alistair Francis <alistair.francis@xilinx.com>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 8b5cefecdf56fba4ccdff2db880f0b6b264cf16f.1474331683.git.alistair.francis@xilinx.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
When loading ROMs allow the caller to specify an AddressSpace to use for
the load.
Signed-off-by: Alistair Francis <alistair.francis@xilinx.com>
Message-id: 85f86b94ea94879e7ce8b12e85ac8de26658f7eb.1474331683.git.alistair.francis@xilinx.com
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
If the caller didn't specify an architecture for the ELF machine
the load_elf() function will auto detect it based on the ELF file.
Signed-off-by: Alistair Francis <alistair.francis@xilinx.com>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: f2d70b47fcad31445f947f8817a0e146d80a046b.1474331683.git.alistair.francis@xilinx.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
The Cadence GEM hardware allows incoming data to be 'screened' based on some
register values. Add support for these screens.
We also need to increase the max regs to avoid compilation failures. These new
registers are implemented in the next patch.
Signed-off-by: Alistair Francis <alistair.francis@xilinx.com>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 73e69a8ad9fa2763e9f68f71eaf2469dd5744fcc.1469727764.git.alistair.francis@xilinx.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
The Cadence GEM hardware supports N number priority queues, this patch is a
step towards that by adding the property to set the queues. At the moment
behaviour doesn't change as we only use queue 0.
Signed-off-by: Alistair Francis <alistair.francis@xilinx.com>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 6543ec0d0c4bfd2678d0ed683efb197e91b17733.1469727764.git.alistair.francis@xilinx.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Some of the timer devices may behave differently from what ptimer
provides. Introduce ptimer policy feature that allows ptimer users to
change default and wrong timer behaviour, for example to continuously
trigger periodic timer when load value is equal to "0".
Signed-off-by: Dmitry Osipenko <digetx@gmail.com>
Message-id: 994cd608ec392da6e58f0643800dda595edb9d97.1473252818.git.digetx@gmail.com
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Configure the size of the RAM of the SOC using a property to propagate
the value down to the memory controller from the board level.
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Reviewed-by: Andrew Jeffery <andrew@aj.id.au>
Message-id: 1473438177-26079-14-git-send-email-clg@kaod.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
There is no need to do this at each reset as the RAM size will not
change.
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Reviewed-by: Andrew Jeffery <andrew@aj.id.au>
Message-id: 1473438177-26079-12-git-send-email-clg@kaod.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Based on previous work done by Andrew Jeffery <andrew@aj.id.au>.
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Reviewed-by: Andrew Jeffery <andrew@aj.id.au>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 1473438177-26079-9-git-send-email-clg@kaod.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
This gives some explanation behind the magic number 0x120CE416.
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Reviewed-by: Andrew Jeffery <andrew@aj.id.au>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 1473438177-26079-8-git-send-email-clg@kaod.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Let's define an object class for each Aspeed SoC we support. A
AspeedSoCInfo struct gathers the SoC specifications which can later be
used by an instance of the class or by a board using the SoC.
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Reviewed-by: Andrew Jeffery <andrew@aj.id.au>
Message-id: 1473438177-26079-4-git-send-email-clg@kaod.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
This is a name replacement to prepare ground for other SoCs.
Let's also remove the AST2400_SMC_BASE definition from the address
space mappings, as it is not used. This controller was removed from
the Aspeed SoC AST2500, so this provides us a better common base for
the address space mapping on both SoCs.
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Reviewed-by: Andrew Jeffery <andrew@aj.id.au>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 1473438177-26079-3-git-send-email-clg@kaod.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Let's prepare for new Aspeed SoCs and rename the ast2400 file to a
more generic one. There are no changes in the code apart from the
header file include.
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Reviewed-by: Andrew Jeffery <andrew@aj.id.au>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 1473438177-26079-2-git-send-email-clg@kaod.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Extend the current module interface to allow for block drivers to be
loaded dynamically on request. The only block drivers that can be
converted into modules are the drivers that don't perform any init
operation except for registering themselves.
In addition, only the protocol drivers are being modularized, as they
are the only ones which see significant performance benefits. The format
drivers do not generally link to external libraries, so modularizing
them is of no benefit from a performance perspective.
All the necessary module information is located in a new structure found
in module_block.h
This spoils the purpose of 5505e8b76f (block/dmg: make it modular).
Before this patch, if module build is enabled, block-dmg.so is linked to
libbz2, whereas the main binary is not. In downstream, theoretically, it
means only the qemu-block-extra package depends on libbz2, while the
main QEMU package needn't to. With this patch, we (temporarily) change
the case so that the main QEMU depends on libbz2 again.
Signed-off-by: Marc Marí <markmb@redhat.com>
Signed-off-by: Colin Lord <clord@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-id: 1471008424-16465-4-git-send-email-clord@redhat.com
Reviewed-by: Max Reitz <mreitz@redhat.com>
[mreitz: Do a signed comparison against the length of
block_driver_modules[], so it will not cause a compile error when
empty]
Signed-off-by: Max Reitz <mreitz@redhat.com>
Currently when timing the pbkdf algorithm a fixed key
size of 32 bytes is used. This results in inaccurate
timings for certain hashes depending on their digest
size. For example when using sha1 with aes-256, this
causes us to measure time for the master key digest
doing 2 sha1 operations per iteration, instead of 1.
Instead we should pass in the desired key size to the
timing routine that matches the key size that will be
used for real later.
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
The qcrypto_pbkdf_count_iters method uses a 64 bit int
but then checks its value against INT32_MAX before
returning it. This bounds check is premature, because
the calling code may well scale the iteration count
by some value. It is thus better to return a 64-bit
integer and let the caller do range checking.
For consistency the qcrypto_pbkdf method is also changed
to accept a 64bit int, though this is somewhat academic
since nettle is limited to taking an 'int' while gcrypt
is limited to taking a 'long int'.
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
cpu model was merged with 2.8, it is wrong to abuse ri_allowed which
was enabled with 2.7.
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
The return address argument to the softmmu template helpers was
confused. In the legacy case, we wanted to indicate that there
is no return address, and so passed in NULL. However, we then
immediately subtracted GETPC_ADJ from NULL, resulting in a non-zero
value, indicating the presence of an (invalid) return address.
Push the GETPC_ADJ subtraction down to the only point it's required:
immediately before use within cpu_restore_state_from_tb, after all
NULL pointer checks have been completed.
This makes GETPC and GETRA identical. Remove GETRA as the lesser
used macro, replacing all uses with GETPC.
Signed-off-by: Richard Henderson <rth@twiddle.net>
Currently, devices are plugged before features are negotiated.
If the backend doesn't support VIRTIO_F_VERSION_1, the transport
needs to rewind some settings.
This is the case for CCW, for which a post_plugged callback had
been introduced, where max_rev field is just updated if
VIRTIO_F_VERSION_1 is not supported by the backend.
For PCI, implementing post_plugged would be much more
complicated, so it needs to know whether the backend supports
VIRTIO_F_VERSION_1 at plug time.
Currently, nothing is done for PCI. Modern capabilities get
exposed to the guest even if VIRTIO_F_VERSION_1 is not supported
by the backend, which confuses the guest.
This patch replaces existing post_plugged solution with an
approach that fits with both transports.
Features negotiation is performed before ->device_plugged() call.
A pre_plugged callback is introduced so that the transports can
set their supported features.
Cc: Michael S. Tsirkin <mst@redhat.com>
Cc: qemu-stable@nongnu.org
Tested-by: Cornelia Huck <cornelia.huck@de.ibm.com> [ccw]
Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Reviewed-by: Marcel Apfelbaum <marcel@redhat.com>
Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Unused function declarations were found using a simple gcc plugin and
manually verified by grepping the sources.
Signed-off-by: Ladi Prosek <lprosek@redhat.com>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
All operations that take a floatx80 as an operand need to have their
inputs checked for malformed encodings. In all of these cases, use the
function floatx80_invalid_encoding to perform the check. If an invalid
operand is found, raise an invalid operation exception, and then return
either NaN (for fp-typed results) or the integer indefinite value (the
minimum representable signed integer value, for int-typed results).
For the non-quiet comparison operations, this touches adjacent code in
order to pass style checks.
Signed-off-by: Andrew Dutcher <andrew@andrewdutcher.com>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 1471392895-17324-1-git-send-email-andrew@andrewdutcher.com
[PMM: changed "1 << 63" to "1ULL << 63" to fix compile errors]
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <rth@twiddle.net>
Message-Id: <1472496380-19706-6-git-send-email-rth@twiddle.net>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Since the two users don't make use of the returned offset,
beyond ensuring that the entire buffer is zero, consider the
can_use_buffer_find_nonzero_offset and buffer_find_nonzero_offset
functions internal.
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Signed-off-by: Richard Henderson <rth@twiddle.net>
Message-Id: <1472496380-19706-4-git-send-email-rth@twiddle.net>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Load the LAPIC state during post_load (rather than when the CPU
starts).
This allows an interrupt to be delivered from the ioapic to
the lapic prior to cpu loading, in particular the RTC that starts
ticking as soon as we load it's state.
Fixes a case where Windows hangs after migration due to RTC interrupts
disappearing.
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Suggested-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Use the __atomic_*_n() primitives which take the value as argument. It
is not necessary to store the value locally before calling the
primitive, hence saving us a stack store and load.
Signed-off-by: Pranith Kumar <bobby.prani@gmail.com>
Message-Id: <20160829171701.14025-1-bobby.prani@gmail.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Remove the redundant barrier() after the fence as agreed in previous
discussion here:
https://lists.gnu.org/archive/html/qemu-devel/2016-04/msg00489.html
Signed-off-by: Pranith Kumar <bobby.prani@gmail.com>
Message-Id: <20160824204424.14041-3-bobby.prani@gmail.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
kvm_setup_guest_memory only does "madvise to QEMU_MADV_DONTFORK" and
is only called by ram_block_add, which actually is duplicate code.
Bonus: add simple comment for kvm_has_sync_mmu to make life easier.
Suggested-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Cao jin <caoj.fnst@cn.fujitsu.com>
Message-Id: <1473662096-32598-1-git-send-email-caoj.fnst@cn.fujitsu.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
The comments is outdated. The patch has following changes:
1. tense correction.
2. all clock time value is returned in nanoseconds, so, they are same in
precision.
3. virtual clock doesn't use cpu cycles.
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Cao jin <caoj.fnst@cn.fujitsu.com>
Message-Id: <1469790338-28990-2-git-send-email-caoj.fnst@cn.fujitsu.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
When invalidating a translation block, set an invalid flag into the
TranslationBlock structure first. It is also necessary to check whether
the target TB is still valid after acquiring 'tb_lock' but before calling
tb_add_jump() since TB lookup is to be performed out of 'tb_lock' in
future. Note that we don't have to check 'last_tb'; an already invalidated
TB will not be executed anyway and it is thus safe to patch it.
Suggested-by: Sergey Fedorov <serge.fdrv@gmail.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
instead of accessing tqe_prev field dircetly outside
of queue.h use macros to check if element is in list
and make sure that afer element is removed from list
tqe_prev field could be used to do the same check.
Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Message-Id: <1469450832-84343-1-git-send-email-imammedo@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Cao jin <caoj.fnst@cn.fujitsu.com>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
Right after main_loop ends, we release various things but keep iothread
alive. The latter is not prepared to the sudden change of resources.
Specifically, after bdrv_close_all(), virtio-scsi dataplane get a
surprise at the empty BlockBackend:
(gdb) bt
at /usr/src/debug/qemu-2.6.0/hw/scsi/virtio-scsi.c:543
at /usr/src/debug/qemu-2.6.0/hw/scsi/virtio-scsi.c:577
It is because the d->conf.blk->root is set to NULL, then
blk_get_aio_context() returns qemu_aio_context, whereas s->ctx is still
pointing to the iothread:
hw/scsi/virtio-scsi.c:543:
if (s->dataplane_started) {
assert(blk_get_aio_context(d->conf.blk) == s->ctx);
}
To fix this, let's stop iothreads before doing bdrv_close_all().
Cc: qemu-stable@nongnu.org
Signed-off-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Message-id: 1473326931-9699-1-git-send-email-famz@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Auto complete mirror job in background to prevent from
blocking synchronously
Signed-off-by: Wen Congyang <wency@cn.fujitsu.com>
Signed-off-by: Changlong Xie <xiecl.fnst@cn.fujitsu.com>
Signed-off-by: Wang WeiWei <wangww.fnst@cn.fujitsu.com>
Message-id: 1469602913-20979-7-git-send-email-xiecl.fnst@cn.fujitsu.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Normal backup(sync='none') workflow:
step 1. NBD peformance I/O write from client to server
qcow2_co_writev
bdrv_co_writev
...
bdrv_aligned_pwritev
notifier_with_return_list_notify -> backup_do_cow
bdrv_driver_pwritev // write new contents
step 2. drive-backup sync=none
backup_do_cow
{
wait_for_overlapping_requests
cow_request_begin
for(; start < end; start++) {
bdrv_co_readv_no_serialising //read old contents from Secondary disk
bdrv_co_writev // write old contents to hidden-disk
}
cow_request_end
}
step 3. Then roll back to "step 1" to write new contents to Secondary disk.
And for replication, we must make sure that we only read the old contents from
Secondary disk in order to keep contents consistent.
1) Replication workflow of Secondary
virtio-blk
^
-------> 1 NBD |
|| server 3 replication
|| ^ ^
|| | backing backing |
|| Secondary disk 6<-------- hidden-disk 5 <-------- active-disk 4
|| | ^
|| '-------------------------'
|| drive-backup sync=none 2
Hence, we need these interfaces to implement coarse-grained serialization between
COW of Secondary disk and the read operation of replication.
Example codes about how to use them:
*#include "block/block_backup.h"
static coroutine_fn int xxx_co_readv()
{
CowRequest req;
BlockJob *job = secondary_disk->bs->job;
if (job) {
backup_wait_for_overlapping_requests(job, start, end);
backup_cow_request_begin(&req, job, start, end);
ret = bdrv_co_readv();
backup_cow_request_end(&req);
goto out;
}
ret = bdrv_co_readv();
out:
return ret;
}
Signed-off-by: Changlong Xie <xiecl.fnst@cn.fujitsu.com>
Signed-off-by: Wen Congyang <wency@cn.fujitsu.com>
Signed-off-by: Wang WeiWei <wangww.fnst@cn.fujitsu.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-id: 1469602913-20979-4-git-send-email-xiecl.fnst@cn.fujitsu.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Implement the new virtio sockets device for host<->guest communication
using the Sockets API. Most of the work is done in a vhost kernel
driver so that virtio-vsock can hook into the AF_VSOCK address family.
The QEMU vhost-vsock device handles configuration and live migration
while the rx/tx happens in the vhost_vsock.ko Linux kernel driver.
The vsock device must be given a CID (host-wide unique address):
# qemu -device vhost-vsock-pci,id=vhost-vsock-pci0,guest-cid=3 ...
For more information see:
http://qemu-project.org/Features/VirtioVsock
[Endianness fixes and virtio-ccw support by Claudio Imbrenda
<imbrenda@linux.vnet.ibm.com>]
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
[mst: rebase to master]
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
virtqueue_discard() requires a VirtQueueElement but virtio-balloon does
not migrate its in-use element. Introduce a new function that is
similar to virtqueue_discard() but doesn't require a VirtQueueElement.
This will allow virtio-balloon to access element again after migration
with the usual proviso that the guest may have modified the vring since
last time.
Cc: Michael S. Tsirkin <mst@redhat.com>
Cc: Roman Kagan <rkagan@virtuozzo.com>
Cc: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Ladi Prosek <lprosek@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Currently each VQ Notification Virtio Capability is allocated
on a different page. The idea is to enable split drivers within
guests, however there are no known plans to do that.
The allocation will result in a 8MB BAR, more than various
guest firmwares pre-allocates for PCI Bridges hotplug process.
Reserve 4 bytes per VQ by default and add a new parameter
"page-per-vq" to be used with split drivers.
Signed-off-by: Marcel Apfelbaum <marcel@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Some software algorithms are based on the hardware's cache info, for example,
for x86 linux kernel, when cpu1 want to wakeup a task on cpu2, cpu1 will trigger
a resched IPI and told cpu2 to do the wakeup if they don't share low level
cache. Oppositely, cpu1 will access cpu2's runqueue directly if they share llc.
The relevant linux-kernel code as bellow:
static void ttwu_queue(struct task_struct *p, int cpu)
{
struct rq *rq = cpu_rq(cpu);
......
if (... && !cpus_share_cache(smp_processor_id(), cpu)) {
......
ttwu_queue_remote(p, cpu); /* will trigger RES IPI */
return;
}
......
ttwu_do_activate(rq, p, 0); /* access target's rq directly */
......
}
In real hardware, the cpus on the same socket share L3 cache, so one won't
trigger a resched IPIs when wakeup a task on others. But QEMU doesn't present a
virtual L3 cache info for VM, then the linux guest will trigger lots of RES IPIs
under some workloads even if the virtual cpus belongs to the same virtual socket.
For KVM, there will be lots of vmexit due to guest send IPIs.
The workload is a SAP HANA's testsuite, we run it one round(about 40 minuates)
and observe the (Suse11sp3)Guest's amounts of RES IPIs which triggering during
the period:
No-L3 With-L3(applied this patch)
cpu0: 363890 44582
cpu1: 373405 43109
cpu2: 340783 43797
cpu3: 333854 43409
cpu4: 327170 40038
cpu5: 325491 39922
cpu6: 319129 42391
cpu7: 306480 41035
cpu8: 161139 32188
cpu9: 164649 31024
cpu10: 149823 30398
cpu11: 149823 32455
cpu12: 164830 35143
cpu13: 172269 35805
cpu14: 179979 33898
cpu15: 194505 32754
avg: 268963.6 40129.8
The VM's topology is "1*socket 8*cores 2*threads".
After present virtual L3 cache info for VM, the amounts of RES IPIs in guest
reduce 85%.
For KVM, vcpus send IPIs will cause vmexit which is expensive, so it can cause
severe performance degradation. We had tested the overall system performance if
vcpus actually run on sparate physical socket. With L3 cache, the performance
improves 7.2%~33.1%(avg:15.7%).
Signed-off-by: Longpeng(Mike) <longpeng2@huawei.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
This will used by the next patch.
Signed-off-by: Longpeng(Mike) <longpeng2@huawei.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Simplify a bit the code by using g_strdup_printf() and store it in a
non-const value so casting is no longer needed, and ownership is
clearer.
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Further cleanup would need to call qemu_free_irq() at the appropriate
time, but for now this silences ASAN about direct leaks.
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
machine_class_base_init() member name is allocated by
machine_class_base_init(), but not freed by
machine_class_finalize(). Simply freeing there doesn't work,
because DEFINE_PC_MACHINE() overwrites it with a literal string.
Fix DEFINE_PC_MACHINE() not to overwrite it, and add the missing
free to machine_class_finalize().
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
qemu_irq is already a pointer, no need to have an extra pointer level.
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
The isa_register_portio_list() function allocates ioports
data/state. Let's keep the reference to this data on some owner. This
isn't enough to fix leaks, but at least, ASAN stops complaining of
direct leaks. Further cleanup would require calling
portio_list_del/destroy().
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Those functions are only available since glib 2.28.
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
This is my first pull request for the newly opened qemu-2.8 tree. It
contains a heap of things that were too late for 2.7 and have been
queued for a while. In particular:
* A number of preliminary patches for the powernv machine type
* A substantial cleanup of exception handling which will be
necessary to support running a TCG with hypervisor
facilities
* A start on support for POWER9
* Some TCG implementations for new POWER9 instructions
* Some TCG and related cleanups in preparation for POWER9
* Some assorted TCG optimizations
* An implementation of the H_CHANGE_LOGICAL_LAN_MAC hypercall
which allows the MAC address to be changed on the PAPR virtual
NIC.
* Add some extra test cases for several machines (this isn't
strictly in the ppc code, but is most value to ppc)
NOTE: This pull request supersedes ppc-for-2.8-20160906, which had
some problems. Changes:
* Dropped BenH's lmw/stmw speedups, which break for
qemu-system-ppc64 on BE hosts
* A small fix to Thomas' serial output test to avoid a warning on
the isapc machine type.
* Some trivial checkpatch fixes
Note that some of the patches in this series still have large numbers
of checkpatch warnings. This is because they're moving existing code
that predates most of the checkpatch style conventions.
-----BEGIN PGP SIGNATURE-----
iQIcBAABAgAGBQJXz68XAAoJEGw4ysog2bOS4TgQAObm4jNLFSbYEWyy3ZEY9p7Y
K98+l3UAtzjDRirv5msBIdtH4j0DDRqUmCCHKKwHSIf6CAgC9z00Yjd2KZy6DFiO
5gHIQnJbvSqB/0HrDmpvOCSBboSm3CW0ZRex6pvTD/7OFHGd5KzhZ8Q+KxITePEi
aYjD6qubYWr22GwXiip+a7EgJ46vtEh/R6WS0Cp1FGQmtJeMCFtsyHtZaZy8t73f
UXzUktMjoMV7RT74EcLdkbl672MWJ6tTQRJAg4C5YCW9yclioXQqk9ARZyWkFpXB
cgJyEC3l4sjFU5VLYVxm0OLqS4QpMb2B3Cg2HtUuBdUnBsZ4NH4oKcI9GH/EVj5t
rt5Xx1u04H/tZVJkCXc6/QKrUh82MnYPylbHXDsv8Xo3Bdy7h8bgQYLscA6w7r53
PAgVfrdIWbXmbz7VjpPmZsuk0d2B5CiA2jhDRwv0LTv2LZxwd+AxymwaaSbZiMCP
A6U7aMt4fjBFnK5FsTFMwGz/8wr4QfPvmtPUSVBJhvPfeq6CrieK8xaqR/RuQgcm
lwGDo9RCk66MoM325+1xzQnzzrOubn4RGIMCoCJAIMWQ4n4eYBp04y0FB4UAkjSU
h2c0SnnMoyUnb/BICezGTnA8St9EDYDx5/e2HLo9wkXJAvlV3C+RRtFOvuPDpmDq
3qSELiaINaNhJWGdMFHt
=T1rz
-----END PGP SIGNATURE-----
Merge remote-tracking branch 'remotes/dgibson/tags/ppc-for-2.8-20160907' into staging
ppc patch queue for 2016-Sep-7
This is my first pull request for the newly opened qemu-2.8 tree. It
contains a heap of things that were too late for 2.7 and have been
queued for a while. In particular:
* A number of preliminary patches for the powernv machine type
* A substantial cleanup of exception handling which will be
necessary to support running a TCG with hypervisor
facilities
* A start on support for POWER9
* Some TCG implementations for new POWER9 instructions
* Some TCG and related cleanups in preparation for POWER9
* Some assorted TCG optimizations
* An implementation of the H_CHANGE_LOGICAL_LAN_MAC hypercall
which allows the MAC address to be changed on the PAPR virtual
NIC.
* Add some extra test cases for several machines (this isn't
strictly in the ppc code, but is most value to ppc)
NOTE: This pull request supersedes ppc-for-2.8-20160906, which had
some problems. Changes:
* Dropped BenH's lmw/stmw speedups, which break for
qemu-system-ppc64 on BE hosts
* A small fix to Thomas' serial output test to avoid a warning on
the isapc machine type.
* Some trivial checkpatch fixes
Note that some of the patches in this series still have large numbers
of checkpatch warnings. This is because they're moving existing code
that predates most of the checkpatch style conventions.
# gpg: Signature made Wed 07 Sep 2016 07:09:27 BST
# gpg: using RSA key 0x6C38CACA20D9B392
# gpg: Good signature from "David Gibson <david@gibson.dropbear.id.au>"
# gpg: aka "David Gibson (Red Hat) <dgibson@redhat.com>"
# gpg: aka "David Gibson (ozlabs.org) <dgibson@ozlabs.org>"
# gpg: aka "David Gibson (kernel.org) <dwg@kernel.org>"
# Primary key fingerprint: 75F4 6586 AE61 A66C C44E 87DC 6C38 CACA 20D9 B392
* remotes/dgibson/tags/ppc-for-2.8-20160907: (64 commits)
tests: Check serial output of firmware boot of some machines
tests: Resort check-qtest entries in Makefile.include
spapr: implement H_CHANGE_LOGICAL_LAN_MAC h_call
ppc: Improve a few more helper flags
ppc: Improve the exception helpers flags
ppc: Improve flags for helpers loading/writing the time facilities
ppc: Don't generate dead code on unconditional branches
ppc: Stop dumping state on all exceptions in linux-user
ppc: Fix catching some segfaults in user mode
ppc: Fix macio ESCC legacy mapping
hw/ppc: add a ppc_create_page_sizes_prop() helper routine
hw/ppc: use error_report instead of fprintf
ppc: Rename #include'd .c files to .inc.c
target-ppc: add extswsli[.] instruction
target-ppc: add vsrv instruction
target-ppc: add vslv instruction
target-ppc: add vcmpnez[b,h,w][.] instructions
target-ppc: add vabsdu[b,h,w] instructions
target-ppc: add dtstsfi[q] instructions
target-ppc: implement branch-less divd[o][.]
...
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
The exact same routine will be used in PowerNV.
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
spapr_pci would also be a good candidate but the macro _FDT is
slightly different. It returns and does not exit.
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
The uboot in the previous release of the SDK was using a hardcoded
value for memory size. This is not true anymore, the value is now
retrieved from the memory controller.
Below is a model for this device, only supporting unlock and
configuration. Without it, we endup running a guest with 64MB, which
is a bit low nowdays. It uses a 'silicon-rev' property and ram_size to
build a default value. Some bits should be linked to SCU strapping
registers but it seems a bit complex to add for the current need.
The model is ready for the AST2500 SOC.
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2.0.22 (GNU/Linux)
iQIcBAABAgAGBQJXzpyJAAoJEH8JsnLIjy/WwToQAJ29bQ8dbVxybQtApZn0l3DH
aBcguj822Vqa+KaxOLAfzkxmG5MurIvWzRQD1BvjxaprRykB+hDh4oAJCmVjfedP
B28h24TUF+w8WIbpxf9weQFNpsT2Ire8ZySc0JZhpYMqxXCqy6NzDs98sjedDC0O
jNbfic1L+yEpZumVE0Fzr4/YgPumt7wP0X42nb6G8R+VlChm3nweNCFF7hNQvTuB
GNNbd9ckUS0BTcQazm04yRR/WzXW6uFqa00QeWsNGGd1mmZ0kUxiqxVgx/fuBMrL
yC4LxFit7eNRoeVqu/nu8GsG+2Ol5zsalfJKFcoWmpg8pygOayc5SXecRUZRw7tg
3oB7ZijbrBUFlr4y6cNVCGPtRluQshpLGHlgo68ulEIlHprqECwgPIdoOPr0bs+v
Gb8ho2Y+lrISPIsjYWK5UFSmZf0SIBGILZUSD3lzQ+oOHXGKbdPAaFvSUqXENHSN
xjtMYjr5t+NjrNNd2Q+VUJPlimHGw5jAowjsQSTk3ndcvJYeIVs+AwLqNTKc3dY4
Oxx1IZ2RngDC63PmZUgh2Bs8pwFg7HaZJejmtq5jY8eHJZM/QMkCJX9TSRCZsMRB
n0GxfCYabX526h3Yo94d74s5xRnHaC+Lem8PU/VEGR3/dMn21jf/PI9e+/0BnBTC
iET2gkU70c4lunWkFHMd
=LZpU
-----END PGP SIGNATURE-----
Merge remote-tracking branch 'remotes/kevin/tags/for-upstream' into staging
Block layer patches
# gpg: Signature made Tue 06 Sep 2016 11:38:01 BST
# gpg: using RSA key 0x7F09B272C88F2FD6
# gpg: Good signature from "Kevin Wolf <kwolf@redhat.com>"
# Primary key fingerprint: DC3D EB15 9A9A F95D 3D74 56FE 7F09 B272 C88F 2FD6
* remotes/kevin/tags/for-upstream: (36 commits)
block: Allow node name for 'qemu-io' HMP command
qemu-iotests: Log QMP traffic in debug mode
block jobs: Improve error message for missing job ID
coroutine: Assert that no locks are held on termination
coroutine: Let CoMutex remember who holds it
qcow2: fix iovec size at qcow2_co_pwritev_compressed
test-coroutine: Fix coroutine pool corruption
qemu-iotests: add vmdk for test backup compression in 055
qemu-iotests: test backup compression in 055
blockdev-backup: added support for data compression
drive-backup: added support for data compression
block: simplify blockdev-backup
block: simplify drive-backup
block/io: turn on dirty_bitmaps for the compressed writes
block: remove BlockDriver.bdrv_write_compressed
qcow: cleanup qcow_co_pwritev_compressed to avoid the recursion
qcow: add qcow_co_pwritev_compressed
vmdk: add vmdk_co_pwritev_compressed
qcow2: cleanup qcow2_co_pwritev_compressed to avoid the recursion
qcow2: add qcow2_co_pwritev_compressed
...
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Let's provide a standardized interface to baseline two CPU models, to
create a third, compatible one. This is especially helpful when two
CPU models are not identical, but a CPU model is required that is
guaranteed to run under both configurations, where the original models run.
"query-cpu-model-baseline" takes two CPU models and returns a third,
compatible model. The result will always be a static CPU model.
Acked-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Reviewed-by: Eduardo Habkost <ehabkost@redhat.com>
Signed-off-by: David Hildenbrand <dahi@linux.vnet.ibm.com>
Message-Id: <20160905085244.99980-28-dahi@linux.vnet.ibm.com>
Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>