Commit Graph

388 Commits

Author SHA1 Message Date
Paolo Bonzini
de7ea885c5 kvm: Switch to unlocked MMIO
Do not take the BQL before dispatching MMIO requests of KVM VCPUs.
Instead, address_space_rw will do it if necessary. This enables completely
BQL-free MMIO handling in KVM mode for upcoming devices with fine-grained
locking.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Message-Id: <1434646046-27150-10-git-send-email-pbonzini@redhat.com>
2015-07-01 15:45:51 +02:00
Jan Kiszka
80b7d2efb6 kvm: Switch to unlocked PIO
Do not take the BQL before dispatching PIO requests of KVM VCPUs.
Instead, address_space_rw will do it if necessary. This enables
completely BQL-free PIO handling in KVM mode for upcoming devices with
fine-grained locking.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Message-Id: <1434646046-27150-8-git-send-email-pbonzini@redhat.com>
2015-07-01 15:45:51 +02:00
Jan Kiszka
4b8523ee89 kvm: First step to push iothread lock out of inner run loop
This opens the path to get rid of the iothread lock on vmexits in KVM
mode. On x86, the in-kernel irqchips has to be used because we otherwise
need to synchronize APIC and other per-cpu state accesses that could be
changed concurrently.

Regarding pre/post-run callbacks, s390x and ARM should be fine without
specific locking as the callbacks are empty. MIPS and POWER require
locking for the pre-run callback.

For the handle_exit callback, it is non-empty in x86, POWER and s390.
Some POWER cases could do without the locking, but it is left in
place for now.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Message-Id: <1434646046-27150-7-git-send-email-pbonzini@redhat.com>
2015-07-01 15:45:51 +02:00
马文霜
bdf026317d Fix irq route entries exceeding KVM_MAX_IRQ_ROUTES
Last month, we experienced several guests crash(6cores-8cores), qemu logs
display the following messages:

qemu-system-x86_64: /build/qemu-2.1.2/kvm-all.c:976:
kvm_irqchip_commit_routes: Assertion `ret == 0' failed.

After analysis and verification, we can confirm it's irq-balance
daemon(in guest) leads to the assertion failure. Start a 8 core guest with
two disks, execute the following scripts will reproduce the BUG quickly:

irq_affinity.sh
========================================================================

vda_irq_num=25
vdb_irq_num=27
while [ 1 ]
do
    for irq in {1,2,4,8,10,20,40,80}
        do
            echo $irq > /proc/irq/$vda_irq_num/smp_affinity
            echo $irq > /proc/irq/$vdb_irq_num/smp_affinity
            dd if=/dev/vda of=/dev/zero bs=4K count=100 iflag=direct
            dd if=/dev/vdb of=/dev/zero bs=4K count=100 iflag=direct
        done
done
========================================================================

QEMU setup static irq route entries in kvm_pc_setup_irq_routing(), PIC and
IOAPIC share the first 15 GSI numbers, take up 23 GSI numbers, but take up
38 irq route entries. When change irq smp_affinity in guest, a dynamic route
entry may be setup, the current logic is: if allocate GSI number succeeds,
a new route entry can be added. The available dynamic GSI numbers is
1021(KVM_MAX_IRQ_ROUTES-23), but available irq route entries is only
986(KVM_MAX_IRQ_ROUTES-38), GSI numbers greater than route entries.
irq-balance's behavior will eventually leads to total irq route entries
exceed KVM_MAX_IRQ_ROUTES, ioctl(KVM_SET_GSI_ROUTING) fail and
kvm_irqchip_commit_routes() trigger assertion failure.

This patch fix the BUG.

Signed-off-by: Wenshuang Ma <kevinnma@tencent.com>
Cc: qemu-stable@nongnu.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2015-07-01 15:45:50 +02:00
Paolo Bonzini
1bfbac4ee1 kvm: remove special handling of DIRTY_MEMORY_MIGRATION in the dirty log mask
One recent example is commit 4cc856f (kvm-all: Sync dirty-bitmap from
kvm before kvm destroy the corresponding dirty_bitmap, 2015-04-02).
Another performance problem is that KVM keeps tracking dirty pages
after a failed live migration, which causes bad performance due to
disallowing huge page mapping.

Thanks to the previous patch, KVM can now stop hooking into
log_global_start/stop.  This simplifies the KVM code noticeably.

Reported-by: Wanpeng Li <wanpeng.li@linux.intel.com>
Reported-by: Xiao Guangrong <guangrong.xiao@linux.intel.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2015-06-05 17:09:59 +02:00
Paolo Bonzini
ea8cb1a8d9 kvm: accept non-mapped memory in kvm_dirty_pages_log_change
It is okay if memory is not mapped into the guest but has dirty logging
enabled.  When this happens, KVM will not do anything and only accesses
from the host will be logged.

This can be triggered by iofuzz.

Reviewed-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2015-06-05 17:09:59 +02:00
Paolo Bonzini
b2dfd71c48 memory: prepare for multiple bits in the dirty log mask
When the dirty log mask will also cover other bits than DIRTY_MEMORY_VGA,
some listeners may be interested in the overall zero/non-zero value of
the dirty log mask; others may be interested in the value of single bits.

For this reason, always call log_start/log_stop if bits have respectively
appeared or disappeared, and pass the old and new values of the dirty log
mask so that listeners can distinguish the kinds of change.

For example, KVM checks if dirty logging used to be completely disabled
(in log_start) or is now completely disabled (in log_stop).  On the
other hand, Xen has to check manually if DIRTY_MEMORY_VGA changed,
since that is the only bit it cares about.

Reviewed-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2015-06-05 17:09:59 +02:00
Paolo Bonzini
2d1a35bef0 memory: differentiate memory_region_is_logging and memory_region_get_dirty_log_mask
For now memory regions only track DIRTY_MEMORY_VGA individually, but
this will change soon.  To support this, split memory_region_is_logging
in two functions: one that returns a given bit from dirty_log_mask,
and one that returns the entire mask.  memory_region_is_logging gets an
extra parameter so that the compiler flags misuse.

While VGA-specific users (including the Xen listener!) will want to keep
checking that bit, KVM and vhost check for "any bit except migration"
(because migration is handled via the global start/stop listener
callbacks).

Reviewed-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2015-06-05 17:09:58 +02:00
Laurent Vivier
dae02ba55a ppc: add helpful message when KVM fails to start VCPU
On POWER8 systems, KVM checks if VCPU is running on primary threads,
and that secondary threads are offline. If this is not the case,
ioctl() fails with errno set to EBUSY.

QEMU aborts with a non explicit error message:
$ ./qemu-system-ppc64 --nographic -machine pseries,accel=kvm
error: kvm run failed Device or resource busy

To help user to diagnose the problem, this patch adds an informative
error message.

There is no easy way to check if SMT is enabled before starting the VCPU,
and as this case is the only one setting errno to EBUSY, we just check
the errno value to display a message.

Signed-off-by: Laurent Vivier <lvivier@redhat.com>
Message-Id: <1431976007-20503-1-git-send-email-lvivier@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2015-06-05 17:09:58 +02:00
Eric Auger
1850b6b7d0 kvm: introduce kvm_arch_msi_data_to_gsi
On ARM the MSI data corresponds to the shared peripheral interrupt (SPI)
ID. This latter equals to the SPI index + 32. to retrieve the SPI index,
matching the gsi, an architecture specific function is introduced.

Signed-off-by: Eric Auger <eric.auger@linaro.org>
Acked-by: Christoffer Dall <christoffer.dall@linaro.org>
Acked-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2015-06-02 14:56:25 +01:00
Paolo Bonzini
4c66375252 kvm: add support for memory transaction attributes
Let kvm_arch_post_run convert fields in the kvm_run struct to MemTxAttrs.
These are then passed to address_space_rw.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2015-04-30 16:55:32 +02:00
Thomas Huth
03a96b83b5 kvm: Silence warning from valgrind
valgrind complains here about uninitialized bytes with the following message:

==17814== Syscall param ioctl(generic) points to uninitialised byte(s)
==17814==    at 0x466A780: ioctl (in /usr/lib64/power8/libc-2.17.so)
==17814==    by 0x100735B7: kvm_vm_ioctl (kvm-all.c:1920)
==17814==    by 0x10074583: kvm_set_ioeventfd_mmio (kvm-all.c:574)

Let's fix it by using a proper struct initializer in kvm_set_ioeventfd_mmio().

Signed-off-by: Thomas Huth <thuth@redhat.com>
Message-Id: <1430153944-24368-1-git-send-email-thuth@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2015-04-30 16:55:16 +02:00
Cornelia Huck
2c80e996e4 kvm: better advice for failed s390x startup
If KVM_CREATE failed on s390x, we print a hint to enable the switch_amode
kernel parameter. This only applies to old kernels, and only if the
error was -EINVAL. Moreover, with new kernels, the most likely reason
for -EINVAL is that pgstes were not enabled.

Let's update the error message to give a better hint on where things
may need fixing.

Acked-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
2015-04-30 13:21:42 +02:00
Stefan Hajnoczi
bd2a88840e Convert ffs() != 0 callers to ctz32()
There are a number of ffs(3) callers that do roughly:

  bit = ffs(val);
  if (bit) {
      do_something(bit - 1);
  }

This pattern can be converted to ctz32() like this:

  zeroes = ctz32(val);
  if (zeroes != 32) {
      do_something(zeroes);
  }

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-id: 1427124571-28598-6-git-send-email-stefanha@redhat.com
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2015-04-28 15:36:08 +02:00
Peter Maydell
5c9eb0286c exec.c: Make address_space_rw take transaction attributes
Make address_space_rw take transaction attributes, rather
than always using the 'unspecified' attributes.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Edgar E. Iglesias <edgar.iglesias@xilinx.com>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
2015-04-26 16:49:24 +01:00
zhanghailiang
4cc856faba kvm-all: Sync dirty-bitmap from kvm before kvm destroy the corresponding dirty_bitmap
Sometimes, we destroy the dirty_bitmap in kvm_memory_slot before any sync action
occur, this bit in dirty_bitmap will be missed, and which will lead the corresponding
dirty pages to be missed in migration.

This usually happens when do migration during VM's Start-up or Reboot.

Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com>
[Use s->migration_log instead of exec.c's in_migration. - Paolo]
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2015-04-02 15:58:37 +02:00
Greg Kurz
b680c5ba54 kvm: fix ioeventfd endianness on bi-endian architectures
KVM expects host endian values. Hosts that don't use the default endianness
need to negate the swap performed in adjust_endianness().

Suggested-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Greg Kurz <gkurz@linux.vnet.ibm.com>
Message-Id: <20150313212337.31142.3991.stgit@bahia.local>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2015-03-18 12:07:30 +01:00
Dominik Dingel
d0a073a186 kvm: encapsulate HAS_DEVICE for vm attrs
More and more virtual machine specifics between kvm and qemu will be
transferred with vm attributes.
So we encapsulate the common logic in a generic function.

Additionally we need only to check during initialization if kvm supports
virtual machine attributes.

Cc: Paolo Bonzini <pbonzini@redhat.com>
Suggested-by: Thomas Huth <thuth@linux.vnet.ibm.com>
Reviewed-by: Thomas Huth <thuth@linux.vnet.ibm.com>
Signed-off-by: Dominik Dingel <dingel@linux.vnet.ibm.com>
Signed-off-by: Jens Freimann <jfrei@linux.vnet.ibm.com>
Message-Id: <1426164834-38648-2-git-send-email-jfrei@linux.vnet.ibm.com>
Acked-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
2015-03-16 10:15:45 +01:00
Marcel Apfelbaum
b16565b396 kvm: add machine state to kvm_arch_init
Needed to query machine's properties.

Signed-off-by: Marcel Apfelbaum <marcel@redhat.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
2015-03-11 18:16:17 +01:00
Marcel Apfelbaum
446f16a690 machine: query kernel-irqchip property
Running
    x86_64-softmmu/qemu-system-x86_64 -machine pc,kernel_irqchip=on -enable-kvm
leads to crash:
    qemu-system-x86_64: qemu/util/qemu-option.c:387: qemu_opt_get_bool_helper:
    Assertion `opt->desc && opt->desc->type == QEMU_OPT_BOOL' failed.  Aborted
    (core dumped)

This happens because the commit e79d5a6 ("machine: remove qemu_machine_opts
global list") removed the global option descriptions and moved them to
MachineState's QOM properties.

Fix this by querying machine properties through designated wrappers.

Signed-off-by: Marcel Apfelbaum <marcel@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2015-03-11 18:14:23 +01:00
Radim Krčmář
8c1ac475e3 fix GCC 5.0.0 logical-not-parentheses warnings
man gcc:
  Warn about logical not used on the left hand side operand of a
  comparison.  This option does not warn if the RHS operand is of a
  boolean type.

By preferring bool over int where sensible, but without modifying any
depending code, make GCC happy in cases like this,
  qemu-img.c: In function ‘compare_sectors’:
  qemu-img.c:992:39: error: logical not is only applied to the left hand
  side of comparison [-Werror=logical-not-parentheses]
           if (!!memcmp(buf1, buf2, 512) != res) {

hw/ide/core.c:1836 doesn't throw an error,
  assert(!!s->error == !!(s->status & ERR_STAT));
even thought the second operand is int (and first hunk of this patch has
a very similar case), maybe GCC developers still have a little faith in
C programmers.

Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2015-03-10 08:15:34 +03:00
Markus Armbruster
9834fe02b0 kvm: g_malloc() can't fail, bury dead error handling
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Gonglei <arei.gonglei@huawei.com>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2015-02-10 09:27:20 +03:00
Frank Blaschka
9e03a0405d kvm: extend kvm_irqchip_add_msi_route to work on s390
on s390 MSI-X irqs are presented as thin or adapter interrupts
for this we have to reorganize the routing entry to contain
valid information for the adapter interrupt code on s390.
To minimize impact on existing code we introduce an architecture
function to fixup the routing entry.

Signed-off-by: Frank Blaschka <frank.blaschka@de.ibm.com>
Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
2015-01-12 10:14:04 +01:00
Christian Borntraeger
e9af2fef24 coverity/s390x: avoid false positive in kvm_irqchip_add_adapter_route
Paolo Bonzini reported that Coverity reports an uninitialized pad value.
Let's use a designated initializer for kvm_irq_routing_entry to avoid
this false positive. This is similar to kvm_irqchip_add_msi_route and
other users of kvm_irq_routing_entry.

Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-12-15 12:21:01 +01:00
Christian Borntraeger
d229b985b5 valgrind: avoid false positives in KVM_GET_DIRTY_LOG ioctl
struct kvm_dirty_log contains padding fields that trigger false
positives in valgrind. Let's use a designated initializer to avoid
false positives from valgrind/memcheck.

Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-12-15 12:21:01 +01:00
Eric Auger
f41389ae3c KVM_CAP_IRQFD and KVM_CAP_IRQFD_RESAMPLE checks
Compute kvm_irqfds_allowed by checking the KVM_CAP_IRQFD extension.
Remove direct settings in architecture specific files.

Add a new kvm_resamplefds_allowed variable, initialized by
checking the KVM_CAP_IRQFD_RESAMPLE extension. Add a corresponding
kvm_resamplefds_enabled() function.

A special notice for s390 where KVM_CAP_IRQFD was not immediatly
advirtised when irqfd capability was introduced in the kernel.
KVM_CAP_IRQ_ROUTING was advertised instead.

This was fixed in "KVM: s390: announce irqfd capability",
ebc3226202d5956a5963185222982d435378b899 whereas irqfd support
was brought in 84223598778ba08041f4297fda485df83414d57e,
"KVM: s390: irq routing for adapter interrupts".  Both commits
first appear in 3.15 so there should not be any kernel
version impacted by this QEMU modification.

Signed-off-by: Eric Auger <eric.auger@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-12-15 12:21:01 +01:00
Igor Mammedov
b8865591d4 pc: kvm: check if KVM has free memory slots to avoid abort()
When more memory devices are used than available
KVM memory slots, QEMU crashes with:

kvm_alloc_slot: no free slot available
Aborted (core dumped)

Fix this by checking that KVM has a free slot before
attempting to map memory in guest address space.

Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Acked-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2014-11-23 12:11:29 +02:00
Alexander Graf
f2a64032a1 kvm: Fix memory slot page alignment logic
Memory slots have to be page aligned to get entered into KVM. There
is existing logic that tries to ensure that we pad memory slots that
are not page aligned to the biggest region that would still fit in the
alignment requirements.

Unfortunately, that logic is broken. It tries to calculate the start
offset based on the region size.

Fix up the logic to do the thing it was intended to do and document it
properly in the comment above it.

With this patch applied, I can successfully run an e500 guest with more
than 3GB RAM (at which point RAM starts overlapping subpage memory regions).

Cc: qemu-stable@nongnu.org
Signed-off-by: Alexander Graf <agraf@suse.de>
2014-11-20 14:52:01 +01:00
Paolo Bonzini
9d1c35dfc9 kvm fix compilation with GCC 4.3.4
As usual, SLES11's GCC complained about double typedefs:

/home/cohuck/git/qemu/kvm-all.c:110: error: redefinition of typedef ‘KVMState’
/home/cohuck/git/qemu/include/sysemu/kvm.h:161: error: previous declaration of ‘KVMState’ was here

Reported-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Tested-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-10-10 14:07:08 +01:00
Eduardo Habkost
fc02086b5a kvm: Make KVMState be the TYPE_KVM_ACCEL instance struct
Now that we create an accel object before calling machine_init, we can
simply use the accel object to save all KVMState data, instead of
allocationg KVMState manually.

Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-10-09 15:36:15 +02:00
Eduardo Habkost
f6a1ef6440 accel: Pass MachineState object to accel init functions
Most of the machine options and machine state information is in the
MachineState object, not on the MachineClass. This will allow init
functions to use the MachineState object directly instead of
qemu_get_machine_opts() or the current_machine global.

Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-10-09 12:57:10 +02:00
Eduardo Habkost
0d15da8e6f accel: Rename 'init' method to 'init_machine'
Today, all accelerator init functions affect some global state:
* tcg_init() calls tcg_exec_init() and affects globals such as tcg_tcx,
  page size globals, and possibly others;
* kvm_init() changes the kvm_state global, cpu_interrupt_handler, and possibly
  others;
* xen_init() changes the xen_xc global, and registers a change state handler.

With the new accelerator QOM classes, initialization may now be split in two
steps:
* instance_init() will do basic initialization that doesn't affect any global
  state and don't need MachineState or MachineClass data. This will allow
  probing code to safely create multiple accelerator objects on the fly just
  for reporting host/accelerator capabilities, for example.
* accel_init_machine()/init_machine() will save the accelerator object in
  MachineState, and do initialization steps which still affect global state,
  machine state, or that need data from MachineClass or MachineState.

To clarify the difference between those two steps, rename init() to
init_machine().

Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-10-04 08:59:16 +02:00
Eduardo Habkost
782c3f2939 accel: Move KVM accel registration to kvm-all.c
Note that this has an user-visible side-effect: instead of reporting
"KVM is not supported for this target", QEMU binaries not supporting KVM
will report "kvm accelerator does not exist".

As kvm_availble() always return 1 when CONFIG_KVM is enabled, we don't
need to set AccelClass.available anymore. kvm_enabled() is not being
completely removed yet only because qmp_query_kvm() still uses it.

This also allows us to make kvm_init() static.

Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-10-04 08:59:15 +02:00
Christian Borntraeger
541be9274e kvm/valgrind: don't mark memory as initialized
since commit 7dda5dc82a ("migration: initialize RAM to zero") the
guest memory is defined zero. No need to call valgrind on guest memory.
This reverts commit 62fe83318d ("qemu: Use valgrind annotations to
mark kvm guest memory as defined") thus speeding up kvm start if
<includedir>/valgrind/valgrind.h is available.

Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-09-26 13:35:08 +02:00
Marcelo Tosatti
de9d61e83d Introduce cpu_clean_all_dirty
Introduce cpu_clean_all_dirty, to force subsequent cpu_synchronize_all_states
to read in-kernel register state.

Cc: qemu-stable@nongnu.org
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-09-16 11:04:09 +02:00
Peter Maydell
2b31cd4e08 - Memory: improve error reporting and avoid crashes on hotplug
- Build: fixing block/iscsi.so and ranlib warnings on Mac OS X
 - Migration fixes for x86
 - The odd KVM patch.
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2
 
 iQIcBAABAgAGBQJUEXeWAAoJEBvWZb6bTYby4AwP/0Hh55A7QzkkzZ66y65zM+G5
 dsgRcLjufHSRQHoNQqm6LOcicV3Ygc/X644EY6jnZCZxFh/fsWuTPqUDGxLAnxEc
 2V0PkLRIScAMOPezzxvRy6/9hkG+UYM3ZOL5D9yxA9pGuBtttw7tkts19Vqf9WZc
 NYG5TBDuEGM1c596Zpo7t10m+Oiw+Jyi5luLXsb4lh5ikdFPDrtJaf0AnFvR+ym0
 HXlj2K/0vHNowUeLoo+oWnZsW8mLE6OyJhgfo1tJtsH1BR+lQJnBnQ4moq4Sl/Wz
 +iht/4gtz34XwLILokFR6yiNrPe+MIryyv+FYxOD5loIdGVDtKMx30UkIE2/D933
 6/n5i3GBLi9JapeT9gkKTxk/UVRPzJ1PK07RWevgNZNQyTGKAUGp+p48nSzMYX7V
 7GFSy3Q8uqOR8g9n+t+RURxkoMNbhhw7v53Z3PPXPCALCMDzg9RARlW/nkfiExcZ
 oThUjE/8xfMTQlN1SO5HTyQXEkYjtknZhfC7/KFvkWYMbCG0KBTf212Md0zlTNkj
 +C6r8Gq4ZWVIc07QyKkoCMxB+a9Uhvy4T1PKuSlm6iu94zUgZRhdf/PlOXimhFqH
 9GL67Tv15kpj05xCS6jDXjeMZ416/UKw91OcsiT1UUHcq7/rc+GBycd0ngV1UgnQ
 di5V12IVt8JwdzFxMeCT
 =GIKW
 -----END PGP SIGNATURE-----

Merge remote-tracking branch 'remotes/bonzini/tags/for-upstream' into staging

- Memory: improve error reporting and avoid crashes on hotplug
- Build: fixing block/iscsi.so and ranlib warnings on Mac OS X
- Migration fixes for x86
- The odd KVM patch.

# gpg: Signature made Thu 11 Sep 2014 11:21:10 BST using RSA key ID 9B4D86F2
# gpg: Good signature from "Paolo Bonzini <pbonzini@redhat.com>"
# gpg:                 aka "Paolo Bonzini <bonzini@gnu.org>"

* remotes/bonzini/tags/for-upstream: (21 commits)
  gdbstub: init mon_chr through qemu_chr_alloc
  pckbd: adding new fields to vmstate
  mc146818rtc: add missed field to vmstate
  piix: do not set irq while loading vmstate
  serial: fixing vmstate for save/restore
  parallel: adding vmstate for save/restore
  fdc: adding vmstate for save/restore
  cpu: init vmstate for ticks and clock offset
  apic_common: vapic_paddr synchronization fix
  vl: use QLIST_FOREACH_SAFE to visit change state handlers
  exec: add parameter errp to gethugepagesize
  exec: report error when memory < hpagesize
  hostmem-ram: don't exit qemu if size of memory-backend-ram is way too big
  memory: add parameter errp to memory_region_init_rom_device
  memory: add parameter errp to memory_region_init_ram
  exec: add parameter errp to qemu_ram_alloc and qemu_ram_alloc_from_ptr
  rules.mak: Fix DSO build by pulling in archive symbols
  util: Don't link host-utils.o if it's empty
  util: Move general qemu_getauxval to util/getauxval.c
  trace: Only link generated-tracers.o with "simple" backend
  ...

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-09-12 16:55:49 +01:00
Paolo Bonzini
a85e130e01 kvm: do not abort if KVM_RUN fails
Just go to the internal error runstate.  This lets you use the "x",
"dump-guest-memory" or "info register" commands.

Reviewed-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-09-09 13:13:05 +02:00
Alexander Graf
7d0a07fa92 KVM: Add helper to run KVM_CHECK_EXTENSION on vm fd
We now can call KVM_CHECK_EXTENSION on the kvm fd or on the vm fd, whereas
the vm version is more accurate when it comes to PPC KVM.

Add a helper to make the vm version available that falls back to the non-vm
variant if the vm one is not available yet to stay compatible.

Signed-off-by: Alexander Graf <agraf@suse.de>
2014-09-08 12:50:51 +02:00
David Hildenbrand
c8e2085d8e kvm: run cpu state synchronization on target vcpu thread
As already done for kvm_cpu_synchronize_state(), let's trigger
kvm_arch_put_registers() via run_on_cpu() for kvm_cpu_synchronize_post_reset()
and kvm_cpu_synchronize_post_init().

This way, we make sure that the register synchronizing ioctls are
called from the proper vcpu thread; this avoids calls to
synchronize_rcu() in the kernel.

Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: David Hildenbrand <dahi@linux.vnet.ibm.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-08-20 15:21:00 +02:00
Chen Gang
dc54e25253 kvm-all: Use 'tmpcpu' instead of 'cpu' in sub-looping to avoid 'cpu' be NULL
If kvm_arch_remove_sw_breakpoint() in CPU_FOREACH() always be fail, it
will let 'cpu' NULL. And the next kvm_arch_remove_sw_breakpoint() in
QTAILQ_FOREACH_SAFE() will get NULL parameter for 'cpu'.

And kvm_arch_remove_sw_breakpoint() can assumes 'cpu' must never be NULL,
so need define additional temporary variable for 'cpu' to avoid the case.

Cc: qemu-stable@nongnu.org
Signed-off-by: Chen Gang <gang.chen.5i5j@gmail.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-07-22 12:38:17 +02:00
Peter Maydell
d70a319b8d Merge remote-tracking branch 'remotes/kvm/uq/master' into staging
* remotes/kvm/uq/master:
  hw/mips: malta: Don't boot from flash with KVM T&E
  MAINTAINERS: Add entry for MIPS KVM
  target-mips: Enable KVM support in build system
  hw/mips: malta: Add KVM support
  hw/mips: In KVM mode, inject IRQ2 (I/O) interrupts via ioctls
  target-mips: Call kvm_mips_reset_vcpu() from mips_cpu_reset()
  target-mips: kvm: Add main KVM support for MIPS
  kvm: Allow arch to set sigmask length
  target-mips: get_physical_address: Add KVM awareness
  target-mips: get_physical_address: Add defines for segment bases
  hw/mips: Add API to convert KVM guest KSEG0 <-> GPA
  hw/mips/cputimer: Don't start periodic timer in KVM mode
  target-mips: Reset CPU timer consistently
  KVM: Fix GSI number space limit

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-06-20 19:25:18 +01:00
Peter Maydell
0a99aae5fa pc,pci,virtio,hotplug fixes, enhancements
numa work by Hu Tao and others
 memory hotplug by Igor
 vhost-user by Nikolay, Antonios and others
 guest virtio announcements by Jason
 qtest fixes by Sergey
 qdev hotplug fixes by Paolo
 misc other fixes mostly by myself
 
 Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQEcBAABAgAGBQJTowW+AAoJECgfDbjSjVRpnMMH/jp3sKGzJumYLbi5ihjmYyND
 jYd6ySXoVAjUTgaCvdje5srisOap8pbc783kQvQS4CeWsjZ5Vvh+PZjkBPIqF1pD
 celxGQ43CY7QSUWq+02Dg9VIUwLwZqdKlxNsV01FligQn+ZBQ6sQ6ksWx7oGzqRt
 5/HMZykbwUvSk/4xGUaMn2+/4uhQ0Wz5EsCkv9L/u8kS72k6ldc/tCGZMzBUNHTM
 rW5FPYwMQP0MXgGTXnlLEQjJ7Lozc66IaMZoHw/a/aGSIxdag9Otj0ADuXq6yZaV
 Xi4O/EOJWd1JpSG7w8LOyIZNakpHkU43fmJCLzBjDAupHeRp57TcW5ox4PJYAtg=
 =Oxdt
 -----END PGP SIGNATURE-----

Merge remote-tracking branch 'remotes/mst/tags/for_upstream' into staging

pc,pci,virtio,hotplug fixes, enhancements

numa work by Hu Tao and others
memory hotplug by Igor
vhost-user by Nikolay, Antonios and others
guest virtio announcements by Jason
qtest fixes by Sergey
qdev hotplug fixes by Paolo
misc other fixes mostly by myself

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>

* remotes/mst/tags/for_upstream: (109 commits)
  numa: use RAM_ADDR_FMT with ram_addr_t
  qapi/string-output-visitor: fix bugs
  tests: simplify code
  qapi: fix input visitor bugs
  acpi: rephrase comment
  qmp: add ACPI_DEVICE_OST event handling
  qmp: add query-acpi-ospm-status command
  acpi: implement ospm_status() method for PIIX4/ICH9_LPC devices
  acpi: introduce TYPE_ACPI_DEVICE_IF interface
  qmp: add query-memory-devices command
  numa: handle mmaped memory allocation failure correctly
  pc: acpi: do not hardcode preprocessor
  qmp: clean out whitespace
  qdev: recursively unrealize devices when unrealizing bus
  qdev: reorganize error reporting in bus_set_realized
  qapi: fix build on glib < 2.28
  qapi: make string output visitor parse int list
  qapi: make string input visitor parse int list
  tests: fix memory leak in test of string input visitor
  hmp: add info memdev
  ...

Conflicts:
	include/hw/i386/pc.h
[PMM: fixed minor conflict in pc.h]
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-06-20 18:01:24 +01:00
Pranavkumar Sawargaonkar
99040447ce kvm: Handle exit reason KVM_EXIT_SYSTEM_EVENT
In-kernel PSCI v0.2 emulation of KVM ARM/ARM64 forwards SYSTEM_OFF
and SYSTEM_RESET function calls to QEMU using KVM_EXIT_SYSTEM_EVENT
exit reason.

This patch updates kvm_cpu_exec() to handle KVM_SYSTEM_EVENT_SHUTDOWN
and KVM_SYSTEM_EVENT_RESET system-level events from QEMU-side.

Signed-off-by: Pranavkumar Sawargaonkar <pranavkumar@linaro.org>
Signed-off-by: Anup Patel <anup.patel@linaro.org>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 1402901605-24551-4-git-send-email-pranavkumar@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-06-19 18:06:25 +01:00
Nikolay Nikolaev
69e03ae64b Add kvm_eventfds_enabled function
Add a function to check if the eventfd capability is present in KVM in
the host kernel.

Signed-off-by: Antonios Motakis <a.motakis@virtualopensystems.com>
Signed-off-by: Nikolay Nikolaev <n.nikolaev@virtualopensystems.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Acked-by: Paolo Bonzini <pbonzini@redhat.com>
2014-06-19 16:41:54 +03:00
James Hogan
aed6efb90c kvm: Allow arch to set sigmask length
MIPS/Linux is unusual in having 128 signals rather than just 64 like
most other architectures. This means its sigmask is 16 bytes instead of
8, so allow arches to override the sigmask->len value passed to the
KVM_SET_SIGNAL_MASK ioctl in kvm_set_signal_mask() by calling
kvm_set_sigmask_len() from kvm_arch_init(). Otherwise default to 8
bytes.

Signed-off-by: James Hogan <james.hogan@imgtec.com>
Cc: Aurelien Jarno <aurelien@aurel32.net>
Cc: Sanjay Lal <sanjayl@kymasys.com>
Cc: Gleb Natapov <gleb@redhat.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-06-18 16:58:43 +02:00
Alexander Graf
00008418aa KVM: Fix GSI number space limit
KVM tells us the number of GSIs it can handle inside the kernel. That value is
basically KVM_MAX_IRQ_ROUTES. However when we try to set the GSI mapping table,
it checks for

    r = -EINVAL;
    if (routing.nr >= KVM_MAX_IRQ_ROUTES)
        goto out;

erroring out even when we're only using all of the GSIs. To make sure we never
hit that limit, let's reduce the number of GSIs we get from KVM by one.

Cc: qemu-stable@nongnu.org
Signed-off-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-06-18 16:54:18 +02:00
Peter Maydell
9f0355b590 Merge remote-tracking branch 'remotes/kvm/uq/master' into staging
* remotes/kvm/uq/master:
  kvm: Fix eax for cpuid leaf 0x40000000
  kvmclock: Ensure proper env->tsc value for kvmclock_current_nsec calculation
  kvm: Enable -cpu option to hide KVM
  kvm: Ensure negative return value on kvm_init() error handling path
  target-i386: set CC_OP to CC_OP_EFLAGS in cpu_load_eflags
  target-i386: get CPL from SS.DPL
  target-i386: rework CPL checks during task switch, preparing for next patch
  target-i386: fix segment flags for SMM and VM86 mode
  target-i386: Fix vm86 mode regression introduced in fd460606fd.
  kvm_stat: allow choosing between tracepoints and old stats
  kvmclock: Ensure time in migration never goes backward

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-06-05 19:16:28 +01:00
Eduardo Habkost
0e1dac6c41 kvm: Ensure negative return value on kvm_init() error handling path
We need to ensure ret < 0 when going through the error path, or QEMU may
try to run the half-initialized VM and crash.

Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-05-30 22:28:55 +02:00
Cornelia Huck
d426d9fba8 s390x/virtio-ccw: wire up irq routing and irqfds
Make use of the new s390 adapter irq routing support to enable real
in-kernel irqfds for virtio-ccw with adapter interrupts.

Note that s390 doesn't provide the common KVM_CAP_IRQCHIP capability, but
rather needs KVM_CAP_S390_IRQCHIP to be enabled. This is to ensure backward
compatibility.

Reviewed-by: Thomas Huth <thuth@linux.vnet.ibm.com>
Reviewed-by: Christian Borntraeger <borntraeger@de.ibm.com>
Acked-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
2014-05-20 13:05:58 +02:00
Paolo Bonzini
50a2c6e55f kvm: reset state from the CPU's reset method
Now that we have a CPU object with a reset method, it is better to
keep the KVM reset close to the CPU reset.  Using qemu_register_reset
as we do now keeps them far apart.

With this patch, PPC no longer calls the kvm_arch_ function, so
it can get removed there.  Other arches call it from their CPU
reset handler, and the function gets an ARMCPU/X86CPU/S390CPU.

Note that ARM- and s390-specific functions are called kvm_arm_*
and kvm_s390_*, while x86-specific functions are called kvm_arch_*.
That follows the convention used by the different architectures.
Changing that is the topic of a separate patch.

Reviewed-by: Gleb Natapov <gnatapov@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-05-13 13:12:40 +02:00
Cornelia Huck
ada4135f84 kvm: make one_reg helpers available for everyone
s390x introduced helper functions for getting/setting one_regs with
commit 860643bc. However, nothing about these is s390-specific.

Alexey Kardashevskiy had already posted a general version, so let's
merge the two patches and massage the code a bit.

CC: Alexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-05-13 13:11:08 +02:00
Marcel Apfelbaum
f1e298794d machine: Replace QEMUMachine by MachineClass in accelerator configuration
This minimizes QEMUMachine usage, as part of machine QOM-ification.

Signed-off-by: Marcel Apfelbaum <marcel.a@redhat.com>
Signed-off-by: Andreas Färber <afaerber@suse.de>
2014-05-05 19:08:49 +02:00
Michael Tokarev
50212d6346 Revert "fix return check for KVM_GET_DIRTY_LOG ioctl"
This reverts commit b533f658a9.

The original code was wrong, because effectively it ignored errors
from kernel, because kernel does not return -1 on error case but
returns -errno, and does not return -EPERM for this particular ioctl.
But in some cases kernel actually returned unsuccessful result,
namely, when the dirty bitmap in requested slot does not exist
it returns -ENOENT.  With new code this condition becomes an
error when it shouldn't be.

Revert that patch instead of fixing it properly this late in the
release process.  I disagree with this approach, but let's make
things move _somewhere_, instead of arguing endlessly whch of
the 2 proposed fixes is better.

Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
Message-id: 1397477644-902-1-git-send-email-mjt@msgid.tls.msk.ru
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-04-14 15:40:02 +01:00
Mario Smarduch
b533f658a9 fix return check for KVM_GET_DIRTY_LOG ioctl
Fix return condition check from kvm_vm_ioctl(s, KVM_GET_DIRTY_LOG, &d) to
handle internal failures or no support for memory slot dirty bitmap.
Otherwise the ioctl succeeds and continues with migration.
Addresses BUG# 1294227

Signed-off-by: Mario Smarduch <m.smarduch@samsung.com>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2014-03-27 19:22:48 +04:00
Peter Maydell
2396187076 Merge remote-tracking branch 'remotes/kvm/uq/master' into staging
* remotes/kvm/uq/master:
  target-i386: bugfix of Intel MPX
  file_ram_alloc: unify mem-path,mem-prealloc error handling
  kvm-all: exit in case max vcpus exceeded

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-03-11 19:39:17 +00:00
Aneesh Kumar K.V
135a129a1c kvm: Add a new machine option kvm-type
Targets like ppc64 support different types of KVM, one which use
hypervisor mode and the other which doesn't. Add a new machine
option kvm-type that helps in selecting the respective ones
We also add a new QEMUMachine callback get_vm_type that helps
in mapping the string representation of kvm type specified.

Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
[agraf: spelling fixes, use error_report(), use qemumachine.h]
Signed-off-by: Alexander Graf <agraf@suse.de>
2014-03-05 03:06:24 +01:00
Marcelo Tosatti
9ba3cf540f kvm-all: exit in case max vcpus exceeded
Rather than fall back to TCG (so the user has to discover
whats happening, in case of no access to qemu stdout/stderr).

Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-02-27 19:30:51 +01:00
Peter Maydell
9fbee91a13 Merge remote-tracking branch 'remotes/kvm/uq/master' into staging
* remotes/kvm/uq/master:
  KVM: Use return value for error print

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-02-27 16:00:31 +00:00
Christoffer Dall
0a6a7ccaae kvm: Common device control API functions
Introduces two simple functions:
    int kvm_device_ioctl(int fd, int type, ...);
    int kvm_create_device(KVMState *s, uint64_t type, bool test);

These functions wrap the basic ioctl-based interactions with KVM in a
way similar to other KVM ioctl wrappers.

Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
Message-id: 1392687720-26806-4-git-send-email-christoffer.dall@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-02-26 17:20:00 +00:00
Christoffer Dall
d6032e06d1 kvm: Introduce kvm_arch_irqchip_create
Introduce kvm_arch_irqchip_create an arch-specific hook in preparation
for architecture-specific use of the device control API to create IRQ
chips.

Following patches will implement the ARM irqchip create method to prefer
the device control API over the older KVM_CREATE_IRQCHIP API.

Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
Message-id: 1392687720-26806-3-git-send-email-christoffer.dall@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-02-26 17:20:00 +00:00
Alexander Graf
521f438e36 KVM: Use return value for error print
Commit 94ccff13 introduced a more verbose failure message and retry
operations on KVM VM creation. However, it ended up using a variable
for its failure message that hasn't been initialized yet.

Fix it to use the value it meant to set.

Cc: qemu-stable@nongnu.org
Signed-off-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-02-21 11:19:34 +01:00
Peter Maydell
bc1c72171c Merge remote-tracking branch 'remotes/kvm/uq/master' into staging
* remotes/kvm/uq/master:
  target-i386: Move KVM default-vendor hack to instance_init
  target-i386: Don't change x86_def_t struct on cpu_x86_register()
  target-i386: Eliminate CONFIG_KVM #ifdefs
  kvm: add support for hyper-v timers
  kvm: make hyperv vapic assist page migratable
  kvm: make hyperv hypercall and guest os id MSRs migratable.
  kvm: make availability of Hyper-V enlightenments dependent on KVM_CAP_HYPERV
  KVM: fix coexistence of KVM and Hyper-V leaves
  kvm: print suberror on all internal errors
  target-i386: kvm_check_features_against_host(): Kill feature word array
  target-i386: kvm_cpu_fill_host(): Fill feature words in a loop
  target-i386: kvm_cpu_fill_host(): Set all feature words at end of function
  target-i386: kvm_cpu_fill_host(): No need to check xlevel2
  target-i386: kvm_cpu_fill_host(): No need to check CPU vendor
  target-i386: kvm_cpu_fill_host(): No need to check level
  target-i386: kvm_cpu_fill_host(): Kill unused code

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-02-06 23:51:24 +00:00
Radim Krčmář
977c7b6d89 kvm: print suberror on all internal errors
KVM introduced internal error exit reason and suberror at the same time,
and later extended it with internal error data.
QEMU does not report suberror on hosts between these two events because
we check for the extension. (half a year in 2009, but it is misleading)

Fix by removing KVM_CAP_INTERNAL_ERROR_DATA condition on printf.

(partially improved by bb44e0d12d and ba4047cf84 in the past)

Reviewed-by: Laszlo Ersek <lersek@redhat.com>
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-02-03 17:33:54 +01:00
Peter Maydell
850bbe1b94 vfio-pci updates include:
- Destroy MemoryRegions on device teardown
  - Print warnings around PCI option ROM failures
  - Skip bogus mappings from 64bit BAR sizing
  - Act on DMA mapping failures
  - Fix alignment to avoid MSI-X table mapping
  - Fix debug macro typo
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQIcBAABAgAGBQJS58xzAAoJECObm247sIsiuqIP/24fhvWwRRwgc/gFpQ/7KbeP
 nUkQdx6TEOK6f0+HOQOD9HtQAiSytJhB6jZQVYfN8MWrp3QzizGH56enR6FOj86C
 9/bnpWYvjDpO/CkdxOdl7aEX/S4LMjoz/2+UmulTxrpt5cpJrsnBTqEuBiMfqwDP
 A50DHHk89uEZWaDkZvdUOc8cOFNAw4p+c20UY6DWIi/SSJlD2BPdH2FhxWE2sT9Z
 60qShybMBTYBYdom3PURzIX8g+DyY9Kzn0d2g108ereaVvYjjwOLcB3HM4W/dSJ6
 rKNGQH+lZHmL4LpWflXlqkZPn31OhOkkks6xncLnfGztRz8ghBs/wj9M+wd8hFKT
 tTf+iwmTCMFF/Lg8RIK4zUMmWgeGq0KN+9bgtJ02nuXe2q782xghxjowJPkbkuHr
 Z+ja+xmL84MyU0UD66oOubxO1RmkBPAzagBjzkJEzj/Kl0JgTqFLYhdo1bQJ1KlV
 3AbnhyR+7p+eHh7AayymhLbwdW1p6qNnEKEKc2yofsV58u4j5teed6WL/QOQQsR2
 lAdR2o6/BjELVQ/ibsbR6yqQxGoG2Y/FdPwiA3IiN+g5dw/oiYOEOHvTHAVB3f4C
 gMPCny+qO4v27Am0iuNrlA6udpc8zNMcgwrVFU+ZmCvePU08+6Jvu8PwgJLruSTx
 l29PsKGSk8fa4a2d0DJO
 =J3ON
 -----END PGP SIGNATURE-----

Merge remote-tracking branch 'remotes/awilliam/tags/vfio-pci-for-qemu-20140128.0' into staging

vfio-pci updates include:
 - Destroy MemoryRegions on device teardown
 - Print warnings around PCI option ROM failures
 - Skip bogus mappings from 64bit BAR sizing
 - Act on DMA mapping failures
 - Fix alignment to avoid MSI-X table mapping
 - Fix debug macro typo

# gpg: Signature made Tue 28 Jan 2014 15:27:47 GMT using RSA key ID 3BB08B22
# gpg: Can't check signature: public key not found

* remotes/awilliam/tags/vfio-pci-for-qemu-20140128.0:
  vfio: correct debug macro typo
  vfio: fix mapping of MSIX bar
  kvm: initialize qemu_host_page_size
  vfio-pci: Fail initfn on DMA mapping errors
  vfio: Filter out bogus mappings
  vfio: Do not reattempt a failed rom read
  vfio: warn if host device rom can't be read
  vfio: Destroy memory regions

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-02-01 21:08:06 +00:00
Anthony Liguori
0169c51155 Merge remote-tracking branch 'qemu-kvm/uq/master' into staging
* qemu-kvm/uq/master:
  kvm: always update the MPX model specific register
  KVM: fix addr type for KVM_IOEVENTFD
  KVM: Retry KVM_CREATE_VM on EINTR
  mempath prefault: fix off-by-one error
  kvm: x86: Separately write feature control MSR on reset
  roms: Flush icache when writing roms to guest memory
  target-i386: clear guest TSC on reset
  target-i386: do not special case TSC writeback
  target-i386: Intel MPX

Conflicts:
	exec.c

aliguori: fix trivial merge conflict in exec.c

Signed-off-by: Anthony Liguori <aliguori@amazon.com>
2014-01-24 15:52:44 -08:00
Alexey Kardashevskiy
47c16ed56a kvm: initialize qemu_host_page_size
There is a HOST_PAGE_ALIGN macro which makes sense for KVM accelerator
but it uses qemu_host_page_size/qemu_host_page_mask which initialized
for TCG only.

This moves qemu_host_page_size/qemu_host_page_mask initialization from
TCG's page_init() and adds a call for it from kvm_init().

Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Acked-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2014-01-17 11:12:07 -07:00
Alexey Kardashevskiy
584f2be79d KVM: fix addr type for KVM_IOEVENTFD
The @addr here is a guest physical address and can easily be bigger
than 4G.

This changes uint32_t to hwaddr.

Cc: qemu-stable@nongnu.org
Cc: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-01-15 12:58:48 +01:00
thomas knych
94ccff1338 KVM: Retry KVM_CREATE_VM on EINTR
Upstreaming this change from Android (https://android-review.googlesource.com/54211).

On heavily loaded machines with many VM instances we see KVM_CREATE_VM
failing with EINTR on this path:

kvm_dev_ioctl_create_vm -> kvm_create_vm -> kvm_init_mmu_notifier -> mmu_notifier_register ->  do_mmu_notifier_register -> mm_take_all_locks

which checks if any signals have been raised while it was attaining locks
and returns EINTR.  Retrying the system call greatly improves reliability.

Cc: qemu-stable@nongnu.org
Signed-off-by: thomas knych <thomaswk@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-01-15 12:58:41 +01:00
Juan Quintela
5ff7fb77b3 memory: move bitmap synchronization to its own function
We want to have all the functions that handle directly the dirty
bitmap near.  We will change it later.

Signed-off-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Orit Wasserman <owasserm@redhat.com>
2014-01-13 14:04:55 +01:00
Juan Quintela
c9dd46fc0d kvm: refactor start address calculation
Signed-off-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Orit Wasserman <owasserm@redhat.com>
2014-01-13 14:04:55 +01:00
Juan Quintela
747afd5bcd kvm: use directly cpu_physical_memory_* api for tracking dirty pages
Performance is important in this function, and we want to optimize even further.

Signed-off-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Orit Wasserman <owasserm@redhat.com>
2014-01-13 14:04:55 +01:00
Alex Williamson
fb541ca59c kvm: Query KVM for available memory slots
KVM reports the number of available memory slots (KVM_CAP_NR_MEMSLOTS)
using the extension interface.  Both x86 and s390 implement this, ARM
and powerpc do not yet enable it.  Convert the static slots array to
be dynamically allocated, supporting more slots when available.
Default to 32 when KVM_CAP_NR_MEMSLOTS is not implemented.  The
motivation for this change is to support more assigned devices, where
memory mapped PCI MMIO BARs typically take one slot each.

Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Reviewed-by: Thomas Huth <thuth@linux.vnet.ibm.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2013-11-25 11:28:34 +01:00
Anthony Liguori
3e4be9c297 Merge remote-tracking branch 'qemu-kvm/uq/master' into staging
# By Alexey Kardashevskiy (3) and others
# Via Paolo Bonzini
* qemu-kvm/uq/master:
  target-i386: add feature kvm_pv_unhalt
  linux-headers: update to 3.12-rc1
  target-i386: forward CPUID cache leaves when -cpu host is used
  linux-headers: update to 3.11
  kvm: fix traces to use %x instead of %d
  kvmvapic: Clear also physical ROM address when entering INACTIVE state
  kvmvapic: Enter inactive state on hardware reset
  kvmvapic: Catch invalid ROM size
  kvm irqfd: support direct msimessage to irq translation
  fix steal time MSR vmsd callback to proper opaque type
  kvm: warn if num cpus is greater than num recommended
  cpu: Move cpu state syncs up into cpu_dump_state()
  exec: always use MADV_DONTFORK

Message-id: 1379694292-1601-1-git-send-email-pbonzini@redhat.com
2013-09-23 11:52:49 -05:00
Alexey Kardashevskiy
76fe21deda kvm irqfd: support direct msimessage to irq translation
On PPC64 systems MSI Messages are translated to system IRQ in a PCI
host bridge. This is already supported for emulated MSI/MSIX but
not for irqfd where the current QEMU allocates IRQ numbers from
irqchip and maps MSIMessages to IRQ in the host kernel.

This adds a new direct mapping flag which tells
the kvm_irqchip_add_msi_route() function that a new VIRQ
should not be allocated, instead the value from MSIMessage::data
should be used. It is up to the platform code to make sure that
this contains a valid IRQ number as sPAPR does in spapr_pci.c.

Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2013-09-20 12:37:52 +02:00
Andrew Jones
670436ced0 kvm: warn if num cpus is greater than num recommended
The comment in kvm_max_vcpus() states that it's using the recommended
procedure from the kernel API documentation to get the max number
of vcpus that kvm supports. It is, but by always returning the
maximum number supported. The maximum number should only be used
for development purposes. qemu should check KVM_CAP_NR_VCPUS for
the recommended number of vcpus. This patch adds a warning if a user
specifies a number of cpus between the recommended and max.

Signed-off-by: Andrew Jones <drjones@redhat.com>
Acked-by: Marcelo Tosatti <mtosatti@redhat.com>
Signed-off-by: Gleb Natapov <gleb@redhat.com>
2013-09-20 12:37:52 +02:00
Markus Armbruster
91138037cb exec: Simplify the guest physical memory allocation hook
Make it a generic hook rather than a KVM hook.  Less code and
ifdeffery.

Since the only user of the hook is old S390 KVM, there's hope we can
get rid of it some day.

Acked-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Acked-by: Laszlo Ersek <lersek@redhat.com>
Acked-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
Message-id: 1375276272-15988-5-git-send-email-armbru@redhat.com
Signed-off-by: Anthony Liguori <anthony@codemonkey.ws>
2013-09-12 11:45:31 -05:00
Andreas Färber
bdc44640cb cpu: Use QTAILQ for CPU list
Introduce CPU_FOREACH(), CPU_FOREACH_SAFE() and CPU_NEXT() shorthand
macros.

Signed-off-by: Andreas Färber <afaerber@suse.de>
2013-09-03 12:25:55 +02:00
Anthony Liguori
b5d54bd421 Merge remote-tracking branch 'qemu-kvm/uq/master' into stable-1.5
* qemu-kvm/uq/master:
  kvm-stub: fix compilation
  kvm: shorten the parameter list for get_real_device()
  kvm: i386: fix LAPIC TSC deadline timer save/restore
  kvm-all.c: max_cpus should not exceed KVM vcpu limit
  kvm: Simplify kvm_handle_io
  kvm: x86: fix setting IA32_FEATURE_CONTROL with nested VMX disabled
  kvm: add KVM_IRQFD_FLAG_RESAMPLE support
  kvm: migrate vPMU state
  target-i386: remove tabs from target-i386/cpu.h
  Initialize IA32_FEATURE_CONTROL MSR in reset and migration

Conflicts:
	target-i386/cpu.h
	target-i386/kvm.c

aliguori: fixup trivial conflicts due to whitespace and added cpu
          argument

Signed-off-by: Anthony Liguori <anthony@codemonkey.ws>
2013-08-29 17:21:51 -05:00
Marcelo Tosatti
7dc5252685 kvm-all.c: max_cpus should not exceed KVM vcpu limit
maxcpus, which specifies the maximum number of hotpluggable CPUs,
should not exceed KVM's vcpu limit.

Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
[Reword message. - Paolo]
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2013-08-20 18:38:35 +02:00
Jan Kiszka
354678c5ce kvm: Simplify kvm_handle_io
Now that cpu_in/out is just a wrapper around address_space_rw, we can
also call the latter directly. As host endianness == guest endianness,
there is no need for the memory access helpers st*_p/ld*_p as well.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2013-08-20 18:37:17 +02:00
Vincenzo Maffione
ca916d3729 kvm: add KVM_IRQFD_FLAG_RESAMPLE support
Added an EventNotifier* parameter to
kvm-all.c:kvm_irqchip_add_irqfd_notifier(), in order to give KVM
another eventfd to be used as "resamplefd". See the documentation
in the linux kernel sources in Documentation/virtual/kvm/api.txt
(section 4.75) for more details.
When the added parameter is passed NULL, the behaviour of the
function is unchanged with respect to the previous versions.

Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Vincenzo Maffione <v.maffione@gmail.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2013-08-09 21:19:54 +02:00
Stefan Weil
38e478eccf kvm: Change prototype of kvm_update_guest_debug()
Passing a CPUState pointer instead of a CPUArchState pointer eliminates
the last target dependent data type in sysemu/kvm.h.

It also simplifies the code.

Signed-off-by: Stefan Weil <sw@weilnetz.de>
Acked-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Andreas Färber <afaerber@suse.de>
2013-07-26 23:05:31 +02:00
Andreas Färber
6227881415 kvm: Change kvm_{insert,remove}_breakpoint() argument to CPUState
CPUArchState is no longer directly used since converting CPU loops to
CPUState.

Prepares for changing GDBState::c_cpu to CPUState.

Signed-off-by: Andreas Färber <afaerber@suse.de>
2013-07-23 02:41:32 +02:00
Andreas Färber
ed2803da58 cpu: Move singlestep_enabled field from CPU_COMMON to CPUState
Prepares for changing cpu_single_step() argument to CPUState.

Acked-by: Michael Walle <michael@walle.cc> (for lm32)
Signed-off-by: Andreas Färber <afaerber@suse.de>
2013-07-23 02:41:32 +02:00
Anthony Liguori
51455c59dd QOM CPUState refactorings
* Fix for OpenRISCCPU subclasses
 * Fix for gdbstub CPU selection
 * Move linux-user CPU functions into new header
 * CPUState part 10 refactoring: first_cpu, next_cpu, cpu_single_env et al.
 * Fix some targets to consistently inline TCG code generation
 * Centrally log CPU reset
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2.0.19 (GNU/Linux)
 
 iQIcBAABAgAGBQJR3VkXAAoJEPou0S0+fgE/KFQP/3eUyCzZ6QmUG3gmrnfYRDMH
 uwMstD1JRUc5kTEC2bMtld8zZKwx2kxMJpe5fizig8GaLka0J5U2wyvwskkX27ag
 7ouNwFdD/dOmvaKfcqHYKbA3CTuIrbnMm7nzrXpLnWXCiMlW1XmXttQsb3hoAjjt
 asFxQIHONNIgqpcJBrz/C6XX2bEkLra4s2QlXPE5Bl3QkKTtK9+NYahHtgIk3Y7Y
 fqbAxebNGh9eZ9PKjPExhNBZ17Yi4ciM7UB7yrXFYOfwKSpmmTsJdu/m776b1oAK
 c/zWO0uea+sLsMnibnSD1foeeZJItDQDRid+PjC44zB5kS8pkPcT5+TVB04Zilap
 rhNF2Fox+fe8eIc/2WuY3ZGchVjrD/EPbFFCCRQ/qI3Nb98WfLCDu3pAP1hRdo+p
 P6qCH5JmWYcR+2gp8MHY0NtqcklL8A2HpQTRvX1mUliMJbE+unanT4nmKolOTYrm
 +6jvp72GkmqqaLQDQ0d8ig/GmcI9QeftSFD5Y8p5prPsMkQbOAbOUSBlPgwY+Syl
 QmP8xNNzbj00UF8GvRL/m9O75geis/I+op5E7hJqaO5U1yd+ww5Z1EFvDEkUOeYu
 BclqCg1jTnzBzE/FaRP0NWFAUDR+4Z0tumdRES1cDfaMJr3+pYT7y8tjVZn7PEvn
 Ljq+/pyyiunG3Mbvw2o8
 =lFBU
 -----END PGP SIGNATURE-----

Merge remote-tracking branch 'afaerber/tags/qom-cpu-for-anthony' into staging

QOM CPUState refactorings

* Fix for OpenRISCCPU subclasses
* Fix for gdbstub CPU selection
* Move linux-user CPU functions into new header
* CPUState part 10 refactoring: first_cpu, next_cpu, cpu_single_env et al.
* Fix some targets to consistently inline TCG code generation
* Centrally log CPU reset

# gpg: Signature made Wed 10 Jul 2013 07:52:39 AM CDT using RSA key ID 3E7E013F
# gpg: Can't check signature: public key not found

# By Andreas Färber (41) and others
# Via Andreas Färber
* afaerber/tags/qom-cpu-for-anthony: (43 commits)
  cpu: Move reset logging to CPUState
  target-ppc: Change LOG_MMU_STATE() argument to CPUState
  target-i386: Change LOG_PCALL_STATE() argument to CPUState
  log: Change log_cpu_state[_mask]() argument to CPUState
  target-i386: Change do_smm_enter() argument to X86CPU
  target-i386: Change do_interrupt_all() argument to X86CPU
  target-xtensa: Change gen_intermediate_code_internal() arg to XtensaCPU
  target-unicore32: Change gen_intermediate_code_internal() signature
  target-sparc: Change gen_intermediate_code_internal() argument to SPARCCPU
  target-sh4: Change gen_intermediate_code_internal() argument to SuperHCPU
  target-s390x: Change gen_intermediate_code_internal() argument to S390CPU
  target-ppc: Change gen_intermediate_code_internal() argument to PowerPCCPU
  target-mips: Change gen_intermediate_code_internal() argument to MIPSCPU
  target-microblaze: Change gen_intermediate_code_internal() argument types
  target-m68k: Change gen_intermediate_code_internal() argument to M68kCPU
  target-lm32: Change gen_intermediate_code_internal() argument to LM32CPU
  target-i386: Change gen_intermediate_code_internal() argument to X86CPU
  target-cris: Change gen_intermediate_code_internal() argument to CRISCPU
  target-arm: Change gen_intermediate_code_internal() argument to ARMCPU
  target-alpha: Change gen_intermediate_code_internal() argument to AlphaCPU
  ...
2013-07-10 10:54:16 -05:00
Andreas Färber
182735efaf cpu: Make first_cpu and next_cpu CPUState
Move next_cpu from CPU_COMMON to CPUState.
Move first_cpu variable to qom/cpu.h.

gdbstub needs to use CPUState::env_ptr for now.
cpu_copy() no longer needs to save and restore cpu_next.

Acked-by: Paolo Bonzini <pbonzini@redhat.com>
[AF: Rebased, simplified cpu_copy()]
Signed-off-by: Andreas Färber <afaerber@suse.de>
2013-07-09 21:32:54 +02:00
Andreas Färber
1d5791f4a6 kvm: Change kvm_remove_all_breakpoints() argument to CPUState
Acked-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Andreas Färber <afaerber@suse.de>
2013-07-09 21:20:28 +02:00
Andreas Färber
80b7cd7354 kvm: Free current_cpu identifier
Since CPU loops are done as last step in kvm_{insert,remove}_breakpoint()
and kvm_remove_all_breakpoints(), we do not need to distinguish between
invoking CPU and iterated CPUs and can thereby free the identifier for
use as a global variable.

Acked-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Andreas Färber <afaerber@suse.de>
2013-07-09 21:20:28 +02:00
Markus Armbruster
36ad0e948e Fix -machine options accel, kernel_irqchip, kvm_shadow_mem
Multiple -machine options with the same ID are merged.  All but the
one without an ID are to be silently ignored.

In most places, we query these options with a null ID.  This is
correct.

In some places, we instead query whatever options come first in the
list.  This is wrong.  When the -machine processed first happens to
have an ID, options are taken from that ID, and the ones specified
without ID are silently ignored.

Example:

    $ upstream-qemu -nodefaults -S -display none -monitor stdio -machine id=foo -machine accel=kvm,usb=on
    $ upstream-qemu -nodefaults -S -display none -monitor stdio -machine id=foo,accel=kvm,usb=on -machine accel=xen
    $ upstream-qemu -nodefaults -S -display none -monitor stdio -machine accel=xen -machine id=foo,accel=kvm,usb=on

    $ qemu-system-x86_64 -nodefaults -S -display none -monitor stdio -machine accel=kvm,usb=on
    QEMU 1.5.50 monitor - type 'help' for more information
    (qemu) info kvm
    kvm support: enabled
    (qemu) info usb
    (qemu) q
    $ qemu-system-x86_64 -nodefaults -S -display none -monitor stdio -machine id=foo -machine accel=kvm,usb=on
    QEMU 1.5.50 monitor - type 'help' for more information
    (qemu) info kvm
    kvm support: disabled
    (qemu) info usb
    (qemu) q
    $ qemu-system-x86_64 -nodefaults -S -display none -monitor stdio -machine id=foo,accel=kvm,usb=on -machine accel=xen
    QEMU 1.5.50 monitor - type 'help' for more information
    (qemu) info kvm
    kvm support: enabled
    (qemu) info usb
    USB support not enabled
    (qemu) q
    $ qemu-system-x86_64 -nodefaults -S -display none -monitor stdio -machine accel=xen -machine id=foo,accel=kvm,usb=on
    xc: error: Could not obtain handle on privileged command interface (2 = No such file or directory): Internal error
    xen be core: can't open xen interface
    failed to initialize Xen: Operation not permitted

Option usb is queried correctly, and the one without an ID wins,
regardless of option order.

Option accel is queried incorrectly, and which one wins depends on
option order and ID.

Affected options are accel (and its sugared forms -enable-kvm and
-no-kvm), kernel_irqchip, kvm_shadow_mem.

Additionally, option kernel_irqchip is normally on by default, except
it's off when no -machine options are given.  Bug can't bite, because
kernel_irqchip is used only when KVM is enabled, KVM is off by
default, and enabling always creates -machine options.  Downstreams
that enable KVM by default do get bitten, though.

Use qemu_get_machine_opts() to fix these bugs.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-id: 1372943363-24081-5-git-send-email-armbru@redhat.com
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
2013-07-09 13:38:58 -05:00
Anthony Liguori
c3ab4c9cf2 Merge remote-tracking branch 'bonzini/iommu-for-anthony' into staging
# By Paolo Bonzini (50) and others
# Via Paolo Bonzini
* bonzini/iommu-for-anthony: (66 commits)
  exec: change some APIs to take AddressSpaceDispatch
  exec: remove cur_map
  exec: put memory map in AddressSpaceDispatch
  exec: separate current radix tree from the one being built
  exec: move listener from AddressSpaceDispatch to AddressSpace
  memory: move MemoryListener declaration earlier
  exec: separate current memory map from the one being built
  exec: change well-known physical sections to macros
  qom: Use atomics for object refcounting
  memory: add reference counting to FlatView
  memory: use a new FlatView pointer on every topology update
  memory: access FlatView from a local variable
  add a header file for atomic operations
  hw/[u-x]*: pass owner to memory_region_init* functions
  hw/t*: pass owner to memory_region_init* functions
  hw/s*: pass owner to memory_region_init* functions
  hw/p*: pass owner to memory_region_init* functions
  hw/n*: pass owner to memory_region_init* functions
  hw/m*: pass owner to memory_region_init* functions
  hw/i*: pass owner to memory_region_init* functions
  ...

Message-id: 1372950842-32422-1-git-send-email-pbonzini@redhat.com
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
2013-07-07 11:19:28 -05:00
Paolo Bonzini
dfde4e6e1a memory: add ref/unref calls
Add ref/unref calls at the following places:

- places where memory regions are stashed by a listener and
  used outside the BQL (including in Xen or KVM).

- memory_region_find callsites

- creation of aliases and containers (only the aliased/contained
  region gets a reference to avoid loops)

- around calls to del_subregion/add_subregion, where the region
  could disappear after the first call

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2013-07-04 17:42:45 +02:00
Michael S. Tsirkin
40509f7f52 kvm: skip system call when msi route is unchanged
Some guests do a large number of mask/unmask
calls which currently trigger expensive route update
system calls.
Detect that route in unchanged and skip the system call.

Reported-by: "Zhanghaoyu (A)" <haoyu.zhang@huawei.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Gleb Natapov <gleb@redhat.com>
2013-07-03 10:38:20 +02:00
Michael S. Tsirkin
0fbc207403 kvm: zero-initialize KVM_SET_GSI_ROUTING input
kvm_add_routing_entry makes an attempt to
zero-initialize any new routing entry.
However, it fails to initialize padding
within the u field of the structure
kvm_irq_routing_entry.

Other functions like kvm_irqchip_update_msi_route
also fail to initialize the padding field in
kvm_irq_routing_entry.

It's better to just make sure all input is initialized.

Once it is, we can also drop complex field by field assignment and just
do the simple *a = *b to update a route entry.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Gleb Natapov <gleb@redhat.com>
2013-07-03 10:38:19 +02:00
Amos Kong
fa4ba923bd kvm: add detail error message when fail to add ioeventfd
I try to hotplug 28 * 8 multiple-function devices to guest with
old host kernel, ioeventfds in host kernel will be exhausted, then
qemu fails to allocate ioeventfds for blk/nic devices.

It's better to add detail error here.

Signed-off-by: Amos Kong <akong@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Gleb Natapov <gleb@redhat.com>
2013-07-03 10:38:19 +02:00
Alexander Graf
cb925cf923 KVM: PIC: Only commit irq routing when necessary
The current logic updates KVM's view of our interrupt map every time we
change it. While this is nice and bullet proof, it slows things down
badly for me. QEMU spends about 3 seconds on every start telling KVM what
news it has on its routing maps.

Instead, let's just synchronize the whole irq routing map as a whole when
we're done constructing it. For things that change during runtime, we can
still update the routing table on demand.

Signed-off-by: Alexander Graf <agraf@suse.de>
2013-07-01 01:11:14 +02:00
Alexander Graf
d07cc1f12d KVM: MSI: Swap payload to native endianness
The usual MSI injection mechanism writes msi.data into memory using an
le32 wrapper. So on big endian guests, this swaps msg.data into the
expected byte order.

For irqfd however, we don't swap the payload right now, rendering
in-kernel MPIC emulation broken on PowerPC.

Swap msg.data to the correct endianness whenever we touch it.

Signed-off-by: Alexander Graf <agraf@suse.de>
2013-07-01 01:11:14 +02:00
Alexander Graf
7b7745930b KVM: Export kvm_init_irq_routing
On PPC, we can have different types of interrupt controllers, so we really
only know that we are going to use one when we created it.

Export kvm_init_irq_routing() to common code, so that we don't have to call
kvm_irqchip_create().

Signed-off-by: Alexander Graf <agraf@suse.de>
2013-07-01 01:11:14 +02:00
Alexander Graf
215e79c01c KVM: Don't assume that mpstate exists with in-kernel PIC always
On PPC, we don't support MP state. So far it's not necessary and I'm
not convinced yet that we really need to support it ever.

However, the current idle logic in QEMU assumes that an in-kernel PIC
also means we support MP state. This assumption is not true anymore.

Let's split up the two cases into two different variables. That way
PPC can expose an in-kernel PIC, while not implementing MP state.

Signed-off-by: Alexander Graf <agraf@suse.de>
CC: Jan Kiszka <jan.kiszka@siemens.com>
2013-07-01 01:11:13 +02:00
Andreas Färber
1458c36313 kvm: Change kvm_cpu_exec() argument to CPUState
It no longer uses CPUArchState.

Prepares for changing qemu_kvm_cpu_thread_fn() opaque to CPUState.

Acked-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Andreas Färber <afaerber@suse.de>
2013-06-28 13:25:12 +02:00
Andreas Färber
5326ab5567 kvm: Change kvm_handle_internal_error() argument to CPUState
It no longer uses CPUArchState.

Prepares for changing kvm_cpu_exec() argument to CPUState.

Acked-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Andreas Färber <afaerber@suse.de>
2013-06-28 13:25:12 +02:00