Commit Graph

1142 Commits

Author SHA1 Message Date
Peter Xu
15c3850325 migration: move skip_section_footers
Move it into MigrationState, revert its meaning and renaming it to
send_section_footer, with a property bound to it. Same trick is played
like previous patches.

Removing savevm_skip_section_footers().

Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Peter Xu <peterx@redhat.com>
Message-Id: <1498536619-14548-9-git-send-email-peterx@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
2017-06-28 11:18:39 +02:00
Peter Xu
71dd4c1a56 migration: move skip_configuration out
It was in SaveState but now moved to MigrationState altogether, reverted
its meaning, then renamed to "send_configuration". Again, using
HW_COMPAT_2_3 for old PC/SPAPR machines, and accel_register_prop() for
xen_init().

Removing savevm_skip_configuration().

Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Peter Xu <peterx@redhat.com>
Message-Id: <1498536619-14548-8-git-send-email-peterx@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
2017-06-28 11:18:38 +02:00
Peter Xu
5272298c48 migration: move global_state.optional out
Put it into MigrationState then we can use the properties to specify
whether to enable storing global state.

Removing global_state_set_optional() since now we can use HW_COMPAT_2_3
for x86/power, and AccelClass.global_props for Xen.

Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Peter Xu <peterx@redhat.com>
Message-Id: <1498536619-14548-6-git-send-email-peterx@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
2017-06-28 11:18:38 +02:00
Peter Maydell
84e3d0725b QAPI patches for 2017-06-09
-----BEGIN PGP SIGNATURE-----
 
 iQIcBAABAgAGBQJZSRWrAAoJEDhwtADrkYZTVOQP/RK8br2A1Cn7LeVG6jnKz5hJ
 OqyII77x8I2RachWvnwxQeHPMEVDuz3y2WjL+80s6peXlpR6y/13w7oX0f6aiJuo
 +T9khTqMv2I7HsM5UCsXAJpFPHT7r90b4x8nstY80YLGe7lA7L6yk6PGyCxHThwA
 mOiTKDw6/Xb/yZGrS2Favrun7juNpAs0Ec1IAkaA8xsEgVkd6tDv281rmHqvibl/
 //90VfJp3nHFZ12FCQ1HzA42Eigtmo/fIk9LnAzBoYG0zw0cnzjuv0BNzs/JwuUZ
 /VskeD1cViQ4yzFnPpjOavjYjTN854/JTJzm7gZ7dTQ6/l3ykoY6NDE8p1BLuHlC
 p2RKkg20EeZlpOEtMQ4g6iyG6EUxaKcEiXmQ31LqN/LJwxTYbo5B5nCHMjrt4gxe
 MqFBJQSNsJ7QjZ7Qa7pADMCi/G0m7/0dN8vBqSr4vcbLVvdbw/yb/9s33wXGrUj1
 PyXM2ymi+vvSqcXtNXKshsJLxJSJxO1tm2tRIANDTabQ00yxs8dOYnQnbQFR94fp
 6nrE2PnjZqgqk69aNDJEbngj6Tgx44nyTr1+Q17juZf9nTCE5QmBE1J0IRoykCJn
 E8+T63ZxtIxVV2yLi5xBjmZaZtPyJRGGeUXunA10SuWrHzupEcBuhFhFYd2MFM5L
 fsojALN2K3Gdx2+CmAo2
 =O9Vv
 -----END PGP SIGNATURE-----

Merge remote-tracking branch 'remotes/armbru/tags/pull-qapi-2017-06-09-v2' into staging

QAPI patches for 2017-06-09

# gpg: Signature made Tue 20 Jun 2017 13:31:39 BST
# gpg:                using RSA key 0x3870B400EB918653
# gpg: Good signature from "Markus Armbruster <armbru@redhat.com>"
# gpg:                 aka "Markus Armbruster <armbru@pond.sub.org>"
# Primary key fingerprint: 354B C8B3 D7EB 2A6B 6867  4E5F 3870 B400 EB91 8653

* remotes/armbru/tags/pull-qapi-2017-06-09-v2: (41 commits)
  tests/qdict: check more get_try_int() cases
  console: use get_uint() for "head" property
  i386/cpu: use get_uint() for "min-level"/"min-xlevel" properties
  numa: use get_uint() for "size" property
  pnv-core: use get_uint() for "core-pir" property
  pvpanic: use get_uint() for "ioport" property
  auxbus: use get_uint() for "addr" property
  arm: use get_uint() for "mp-affinity" property
  xen: use get_uint() for "max-ram-below-4g" property
  pc: use get_uint() for "hpet-intcap" property
  pc: use get_uint() for "apic-id" property
  pc: use get_uint() for "iobase" property
  acpi: use get_uint() for "pci-hole*" properties
  acpi: use get_uint() for various acpi properties
  acpi: use get_uint() for "acpi-pcihp-io*" properties
  platform-bus: use get_uint() for "addr" property
  bcm2835_fb: use {get, set}_uint() for "vcram-size" and "vcram-base"
  aspeed: use {set, get}_uint() for "ram-size" property
  pcihp: use get_uint() for "bsel" property
  pc-dimm: make "size" property uint64
  ...

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2017-06-22 11:34:39 +01:00
Marc-André Lureau
4ccd89d294 xen: use get_uint() for "max-ram-below-4g" property
TYPE_PC_MACHINE's property PC_MACHINE_MAX_RAM_BELOW_4G's getter and
setter pc_machine_get_max_ram_below_4g() and
pc_machine_set_max_ram_below_4g() use visit_type_size()

Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-Id: <20170607163635.17635-34-marcandre.lureau@redhat.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
2017-06-20 14:31:33 +02:00
Marc-André Lureau
5d7fb0f254 pc: use get_uint() for "hpet-intcap" property
TYPE_HPET's property HPET_INTCAP is defined with DEFINE_PROP_UINT32().

Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-Id: <20170607163635.17635-33-marcandre.lureau@redhat.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
2017-06-20 14:31:33 +02:00
Marc-André Lureau
c7b4efb4a0 pc: use get_uint() for "apic-id" property
TYPE_X86_CPU's property "apic-id" is defined with DEFINE_PROP_UINT32().

Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-Id: <20170607163635.17635-32-marcandre.lureau@redhat.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
2017-06-20 14:31:33 +02:00
Marc-André Lureau
1ea1572adf pc: use get_uint() for "iobase" property
TYPE_ISA_FDC's property "iobase" is defined with DEFINE_PROP_UINT32().

Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-Id: <20170607163635.17635-31-marcandre.lureau@redhat.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
2017-06-20 14:31:33 +02:00
Marc-André Lureau
605553654f acpi: use get_uint() for "pci-hole*" properties
Those properties use visit_type_uint*()

Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-Id: <20170607163635.17635-30-marcandre.lureau@redhat.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
2017-06-20 14:31:33 +02:00
Marc-André Lureau
b81bdbf3c7 acpi: use get_uint() for various acpi properties
PIIX4: piix4_pm_add_propeties() defines these with
object_property_add_uint*_ptr().

Q35: ich9_lpc_add_properties() and ich9_pm_add_properties() define them
similarly, except for ACPI_PM_PROP_GPE0_BLK().  That one's getter
ich9_pm_get_gpe0_blk() uses visit_type_uint32().

Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-Id: <20170607163635.17635-29-marcandre.lureau@redhat.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
2017-06-20 14:31:33 +02:00
Marc-André Lureau
35f91e5069 acpi: use get_uint() for "acpi-pcihp-io*" properties
Those are defined with object_property_add_uint16_ptr()

Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-Id: <20170607163635.17635-28-marcandre.lureau@redhat.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
2017-06-20 14:31:32 +02:00
Marc-André Lureau
446de8b68a qdev: Use appropriate getter/setters type
Based on the underlying type of the data accessed, use the appropriate
getters/setters:

* AcpiPmInfo members s3_disabled, s4_disabled are bool, member s4_val is
  an uint8_t

* Property ACPI_PCIHP_IO_PROP is defined with
  object_property_add_uint32_ptr()

* Property PCIE_HOST_MCFG_SIZE is implemented with visit_type_uint64()

* PCIDevice property "addr" is backed by PCIDevice member devfn, which
  is an int32_t

Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-Id: <20170607163635.17635-20-marcandre.lureau@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
[More verbose commit message]
Signed-off-by: Markus Armbruster <armbru@redhat.com>
2017-06-20 14:31:32 +02:00
Marc-André Lureau
5923f85fb8 qapi: update the qobject visitor to use QNUM_U64
Switch to use QNum/uint where appropriate to remove i64 limitation.

The input visitor will cast i64 input to u64 for compatibility
reasons (existing json QMP client already use negative i64 for large
u64, and expect an implicit cast in qemu).

Note: before the patch, uint64_t values above INT64_MAX are sent over
json QMP as negative values, e.g. UINT64_MAX is sent as -1. After the
patch, they are sent unmodified.  Clearly a bug fix, but we have to
consider compatibility issues anyway.  libvirt should cope fine,
because its parsing of unsigned integers accepts negative values
modulo 2^64.  There's hope that other clients will, too.

Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Message-Id: <20170607163635.17635-12-marcandre.lureau@redhat.com>
[check_native_list() tweaked for consistency with signed case]
Signed-off-by: Markus Armbruster <armbru@redhat.com>
2017-06-20 14:31:31 +02:00
Marc-André Lureau
01b2ffcedd qapi: merge QInt and QFloat in QNum
We would like to use a same QObject type to represent numbers, whether
they are int, uint, or floats. Getters will allow some compatibility
between the various types if the number fits other representations.

Add a few more tests while at it.

Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-Id: <20170607163635.17635-7-marcandre.lureau@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
[parse_stats_intervals() simplified a bit, comment in
test_visitor_in_int_overflow() tidied up, suppress bogus warnings]
Signed-off-by: Markus Armbruster <armbru@redhat.com>
2017-06-20 14:31:31 +02:00
Stefan Hajnoczi
7f3cf2d6e7 hw/i386: fix nvdimm check error path
Commit e987c37aee ("hw/i386: check if
nvdimm is enabled before plugging") introduced a check to reject nvdimm
hotplug if -machine pc,nvdimm=on was not given.

This check executes after pc_dimm_memory_plug() has already completed
and does not reverse the effect of this function in the case of failure.

Perform the check before calling pc_dimm_memory_plug().  This fixes the
following abort:

  $ qemu -M accel=kvm -m 1G,slots=4,maxmem=8G \
         -object memory-backend-file,id=mem1,share=on,mem-path=nvdimm.dat,size=1G
  (qemu) device_add nvdimm,memdev=mem1
  nvdimm is not enabled: missing 'nvdimm' in '-M'
  (qemu) device_add nvdimm,memdev=mem1
  Core dumped

The backtrace is:

  #0  0x00007fffdb5b191f in raise () at /lib64/libc.so.6
  #1  0x00007fffdb5b351a in abort () at /lib64/libc.so.6
  #2  0x00007fffdb5a9da7 in __assert_fail_base () at /lib64/libc.so.6
  #3  0x00007fffdb5a9e52 in  () at /lib64/libc.so.6
  #4  0x000055555577a5fa in qemu_ram_set_idstr (new_block=0x555556747a00, name=<optimized out>, dev=dev@entry=0x555556705590) at qemu/exec.c:1709
  #5  0x0000555555a0fe86 in vmstate_register_ram (mr=mr@entry=0x55555673a0e0, dev=dev@entry=0x555556705590) at migration/savevm.c:2293
  #6  0x0000555555965088 in pc_dimm_memory_plug (dev=dev@entry=0x555556705590, hpms=hpms@entry=0x5555566bb0e0, mr=mr@entry=0x555556705630, align=<optimized out>, errp=errp@entry=0x7fffffffc660)
      at hw/mem/pc-dimm.c:110
  #7  0x000055555581d89b in pc_dimm_plug (errp=0x7fffffffc6c0, dev=0x555556705590, hotplug_dev=<optimized out>) at qemu/hw/i386/pc.c:1713
  #8  0x000055555581d89b in pc_machine_device_plug_cb (hotplug_dev=<optimized out>, dev=0x555556705590, errp=0x7fffffffc6c0) at qemu/hw/i386/pc.c:2004
  #9  0x0000555555914da6 in device_set_realized (obj=<optimized out>, value=<optimized out>, errp=0x7fffffffc7e8) at hw/core/qdev.c:926

Cc: Haozhong Zhang <haozhong.zhang@intel.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Eduardo Habkost <ehabkost@redhat.com>
Reviewed-by: Haozhong Zhang <haozhong.zhang@intel.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2017-06-16 18:44:56 +03:00
Peter Xu
e7a3b91fdf intel_iommu: cleanup vtd_interrupt_remap_msi()
Move the memcpy upper into where needed, then share the trace so that we
trace every correct remapping.

Signed-off-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2017-06-16 18:44:56 +03:00
Peter Xu
b9313021f3 intel_iommu: cleanup vtd_{do_}iommu_translate()
First, let vtd_do_iommu_translate() return a status, so that we
explicitly knows whether error occured. Meanwhile, we make sure that
IOMMUTLBEntry is filled in in that.

Then, cleanup vtd_iommu_translate a bit. So even with PT we'll get a log
now. Also, remove useless assignments.

Signed-off-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2017-06-16 18:44:55 +03:00
Peter Xu
7feb51b709 intel_iommu: switching the rest DPRINTF to trace
We have converted many of the DPRINTF() into traces. This patch does the
last 100+ ones.

To debug VT-d when error happens, let's try enable:

  -trace enable="vtd_err*"

This should works just like the old GENERAL but of course better, since
we don't need to recompile.

Similar rules apply to the other modules. I was trying to make the
prefix good enough for sub-module debugging.

Signed-off-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2017-06-16 18:44:55 +03:00
Juan Quintela
c4b63b7cc5 migration: Move remaining exported functions to migration/misc.h
Signed-off-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Peter Xu <peterx@redhat.com>
2017-06-13 11:00:45 +02:00
Juan Quintela
84a899de8c migration: create global_state.c
It don't belong anywhere else, just the global state where everybody
can stick other things.

Signed-off-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Laurent Vivier <lvivier@redhat.com>
2017-06-13 11:00:45 +02:00
Denis Plotnikov
e2b6c1712e kvmclock: update system_time_msr address forcibly
Do an update of system_time_msr address every time before reading
the value of tsc_timestamp from guest's kvmclock page.

There is no other code paths which ensure that qemu has an up-to-date
value of system_time_msr. So, force this update on guest's tsc_timestamp
reading.

This bug causes effect on those nested setups which turn off TPR access
interception for L2 guests and that access being intercepted by L0 doesn't
show up in L1.
Linux bootstrap initiate kvmclock before APIC initializing causing TPR access.
That's why on L1 guests, having TPR interception turned on for L2, the effect
of the bug is not revealed.

This patch fixes this problem by making sure it knows the correct
system_time_msr address every time it is needed.

Signed-off-by: Denis Plotnikov <dplotnikov@virtuozzo.com>
Message-Id: <1496054944-25623-1-git-send-email-dplotnikov@virtuozzo.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2017-06-07 18:22:02 +02:00
Igor Mammedov
d41f3e750d numa: make sure that all cpus have has_node_id set if numa is enabled
It fixes/add missing _PXM object for non mapped CPU (x86)
and missing fdt node (virt-arm).

It ensures that possible_cpus contains complete mapping if
numa is enabled by the time machine_init() is executed.

As result non completely mapped CPUs:
 1) appear in ACPI/fdt blobs
 2) QMP query-hotpluggable-cpus command shows bound nodes for such CPUs
 3) allows to drop checks for has_node_id in numa only code,
   reducing number of invariants incomplete mapping could produce
 4) moves fixup/implicit node init from runtime numa_cpu_pre_plug()
   (when CPU object is created) to machine_numa_finish_init() which
   helps to fix [1, 2] and make possible_cpus complete source
   of numa mapping available even before CPUs are created.

Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Message-Id: <1496161442-96665-4-git-send-email-imammedo@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
2017-06-05 14:59:08 -03:00
Igor Mammedov
a0ceb640d0 numa: consolidate cpu_preplug fixups/checks for pc/arm/spapr
Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Message-Id: <1496161442-96665-2-git-send-email-imammedo@redhat.com>
[ehabkost: Fix indentation]
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
2017-06-05 14:59:08 -03:00
Marc-André Lureau
f664b88247 Remove/replace sysemu/char.h inclusion
Those are apparently unnecessary includes.

Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
2017-06-02 11:33:52 +04:00
Stefan Hajnoczi
a3203e7dd3 pci, virtio, vhost: fixes
A bunch of fixes all over the place. Most notably this fixes
 the new MTU feature when using vhost.
 
 Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
 -----BEGIN PGP SIGNATURE-----
 
 iQEcBAABAgAGBQJZK2bwAAoJECgfDbjSjVRpNBgIALmNG7VaixhNUlnfX1n1JBnh
 +HBP2zNfvi0q5roBuPFmlziKa3IBHb2Fcte4nb6QxmPg+uoaj39AOzfrrvz210kR
 h2j5Qk2bCdMeWBpxI+xDDScwi/Im23Y6KN1eZyMekFr2CaSGiqOHZPPdbsyEcHPB
 VylM0uHqSTZL5JAAzEuYlH+LLfPu91HoxMsIAdNuQX+qKyM2DZ4eICBQ0zA73USt
 OduZltcRMk7UpvQMqY+2iaEXapXQQEUGrP2Mo8ZyqeIl2ItC33GspqBQIKjuZdrr
 tpr/T1VWsLdZnURZXyELrFqrErDXvKaP9HROwvyLyYPXZF+pJ3LA7TopS5UmfNQ=
 =Z4xG
 -----END PGP SIGNATURE-----

Merge remote-tracking branch 'mst/tags/for_upstream' into staging

pci, virtio, vhost: fixes

A bunch of fixes all over the place. Most notably this fixes
the new MTU feature when using vhost.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>

# gpg: Signature made Mon 29 May 2017 01:10:24 AM BST
# gpg:                using RSA key 0x281F0DB8D28D5469
# gpg: Good signature from "Michael S. Tsirkin <mst@kernel.org>"
# gpg:                 aka "Michael S. Tsirkin <mst@redhat.com>"
# Primary key fingerprint: 0270 606B 6F3C DF3D 0B17  0970 C350 3912 AFBE 8E67
#      Subkey fingerprint: 5D09 FD08 71C8 F85B 94CA  8A0D 281F 0DB8 D28D 5469

* mst/tags/for_upstream:
  acpi-test: update expected files
  pc: ACPI BIOS: use highest NUMA node for hotplug mem hole SRAT entry
  vhost-user: pass message as a pointer to process_message_reply()
  virtio_net: Bypass backends for MTU feature negotiation
  intel_iommu: turn off pt before 2.9
  intel_iommu: support passthrough (PT)
  intel_iommu: allow dev-iotlb context entry conditionally
  intel_iommu: use IOMMU_ACCESS_FLAG()
  intel_iommu: provide vtd_ce_get_type()
  intel_iommu: renaming context entry helpers
  x86-iommu: use DeviceClass properties
  memory: remove the last param in memory_region_iommu_replay()
  memory: tune last param of iommu_ops.translate()

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2017-05-30 14:15:04 +01:00
Stefan Hajnoczi
d0eda02938 QAPI patches for 2017-05-23
-----BEGIN PGP SIGNATURE-----
 
 iQIcBAABAgAGBQJZJB4MAAoJEDhwtADrkYZTjhkP+wRaiZj9h4IJvcWoNEzfyuA1
 kd7+Kx6QgfCmZE9vL2/mlOFddWL0fPtPffL/ZRu5UNgIILaCSPFsGkOGvXLZhaUW
 he5sqLCqMc2mxgB98HpbT0dzt0cOSCjdM5BxkFXeq/yPoDa0IiZiD8cpvj+FVwKi
 D0qGdrKKGCR3RteL4gr/kaXY/LXAZfuEjbAtylQx1aMHJ6CKmdSIVVVU2JJVIYhQ
 +dT/Xst0PSkJYk90wgmwpzPCqKR/N5zHFe8CyUoE67FxBhegdw19O3wlzU9DJ3N5
 8Az+fbEjifWoMytTZR4H3snPJGwl6wxsh2UVj9SMCvebc0y278UPlGqiszvWBepa
 1iZHHULH+yygHyUmX6CxjHOUW498ES2KGHx7qJJe8ebeJ4XuU7JcE+Sf4GQEAm8Q
 p6P5s3qXpuVjekCjmerUAtybr+hxEQC9fbAGqPq+r489jwjvUiETrMLbmEHyy/Xa
 fSUaW+f5kGI0GJS9FYcbcMy9w2130lTK2k4bZM0mSVlSsHA7W0GBDnzxUDtxo6uH
 oqMQgKIFWOBU5GkRUiL43vpiTIpiLCuG6PbQlgefQRPWdoODVxykuu2bq5hVaax8
 8XMkkq7isG/J5esFc55L1qEUyrUDtVYx/LiHj0XXJikkGirXtp7b7l/TmFLZGsex
 UWWzFRbZnCVf2CKwdV6h
 =DNqn
 -----END PGP SIGNATURE-----

Merge remote-tracking branch 'armbru/tags/pull-qapi-2017-05-23' into staging

QAPI patches for 2017-05-23

# gpg: Signature made Tue 23 May 2017 12:33:32 PM BST
# gpg:                using RSA key 0x3870B400EB918653
# gpg: Good signature from "Markus Armbruster <armbru@redhat.com>"
# gpg:                 aka "Markus Armbruster <armbru@pond.sub.org>"
# Primary key fingerprint: 354B C8B3 D7EB 2A6B 6867  4E5F 3870 B400 EB91 8653

* armbru/tags/pull-qapi-2017-05-23:
  qapi-schema: Remove obsolete note from ObjectTypeInfo
  block: Use QDict helpers for --force-share
  shutdown: Expose bool cause in SHUTDOWN and RESET events
  shutdown: Add source information to SHUTDOWN and RESET
  shutdown: Preserve shutdown cause through replay
  shutdown: Prepare for use of an enum in reset/shutdown_request
  shutdown: Simplify shutdown_signal
  sockets: Plug memory leak in socket_address_flatten()
  scripts/qmp/qom-set: fix the value argument passed to srv.command()

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2017-05-30 09:33:40 +01:00
Ladi Prosek
ede24a0264 pc: ACPI BIOS: use highest NUMA node for hotplug mem hole SRAT entry
For reasons unknown, Windows won't online all memory, both at command
line and hot-plugged later, unless the hotplug mem hole SRAT entry
specifies a node greater than or equal to the ones where memory is
added.

Using the highest node on the machine makes recent versions of Windows
happy.

With this example command line:
  ... \
  -m 1024,slots=4,maxmem=32G \
  -numa node,nodeid=0 \
  -numa node,nodeid=1 \
  -numa node,nodeid=2 \
  -numa node,nodeid=3 \
  -object memory-backend-ram,size=1G,id=mem-mem1 \
  -device pc-dimm,id=dimm-mem1,memdev=mem-mem1,node=1

Windows reports a total of 1G of RAM without this commit and the expected
2G with this commit.

Signed-off-by: Ladi Prosek <lprosek@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Acked-by: Laszlo Ersek <lersek@redhat.com>
2017-05-29 03:07:57 +03:00
Peter Xu
dbaabb25f4 intel_iommu: support passthrough (PT)
Hardware support for VT-d device passthrough. Although current Linux can
live with iommu=pt even without this, but this is faster than when using
software passthrough.

Signed-off-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Liu, Yi L <yi.l.liu@linux.intel.com>
Reviewed-by: Jason Wang <jasowang@redhat.com>
2017-05-25 21:25:27 +03:00
Peter Xu
f80c98740e intel_iommu: allow dev-iotlb context entry conditionally
When device-iotlb is not specified, we should fail this check. A new
function vtd_ce_type_check() is introduced.

While I'm at it, clean up the vtd_dev_to_context_entry() a bit - replace
many "else if" usage into direct if check. That'll make the logic more
clear.

Signed-off-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Jason Wang <jasowang@redhat.com>
2017-05-25 21:25:27 +03:00
Peter Xu
5a38cb5940 intel_iommu: use IOMMU_ACCESS_FLAG()
We have that now, so why not use it.

Signed-off-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Jason Wang <jasowang@redhat.com>
2017-05-25 21:25:27 +03:00
Peter Xu
127ff5c356 intel_iommu: provide vtd_ce_get_type()
Helper to fetch VT-d context entry type.

Signed-off-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Jason Wang <jasowang@redhat.com>
2017-05-25 21:25:27 +03:00
Peter Xu
8f7d7161dd intel_iommu: renaming context entry helpers
The old names are too long and less ordered. Let's start to use
vtd_ce_*() as a pattern.

Signed-off-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Jason Wang <jasowang@redhat.com>
2017-05-25 21:25:27 +03:00
Peter Xu
0b77d30a43 x86-iommu: use DeviceClass properties
No reason to keep tens of lines if we can do it actually far shorter.

Signed-off-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Jason Wang <jasowang@redhat.com>
2017-05-25 21:25:27 +03:00
Peter Xu
bf55b7afce memory: tune last param of iommu_ops.translate()
This patch converts the old "is_write" bool into IOMMUAccessFlags. The
difference is that "is_write" can only express either read/write, but
sometimes what we really want is "none" here (neither read nor write).
Replay is an good example - during replay, we should not check any RW
permission bits since thats not an actual IO at all.

CC: Paolo Bonzini <pbonzini@redhat.com>
CC: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Acked-by: David Gibson <david@gibson.dropbear.id.au>
Acked-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Jason Wang <jasowang@redhat.com>
2017-05-25 21:25:27 +03:00
Eric Blake
cf83f14005 shutdown: Add source information to SHUTDOWN and RESET
Time to wire up all the call sites that request a shutdown or
reset to use the enum added in the previous patch.

It would have been less churn to keep the common case with no
arguments as meaning guest-triggered, and only modified the
host-triggered code paths, via a wrapper function, but then we'd
still have to audit that I didn't miss any host-triggered spots;
changing the signature forces us to double-check that I correctly
categorized all callers.

Since command line options can change whether a guest reset request
causes an actual reset vs. a shutdown, it's easy to also add the
information to reset requests.

Signed-off-by: Eric Blake <eblake@redhat.com>
Acked-by: David Gibson <david@gibson.dropbear.id.au> [ppc parts]
Reviewed-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk> [SPARC part]
Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com> [s390x parts]
Message-Id: <20170515214114.15442-5-eblake@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
2017-05-23 13:28:17 +02:00
Eric Blake
aedbe19297 shutdown: Prepare for use of an enum in reset/shutdown_request
We want to track why a guest was shutdown; in particular, being able
to tell the difference between a guest request (such as ACPI request)
and host request (such as SIGINT) will prove useful to libvirt.
Since all requests eventually end up changing shutdown_requested in
vl.c, the logical change is to make that value track the reason,
rather than its current 0/1 contents.

Since command-line options control whether a reset request is turned
into a shutdown request instead, the same treatment is given to
reset_requested.

This patch adds an internal enum ShutdownCause that describes reasons
that a shutdown can be requested, and changes qemu_system_reset() to
pass the reason through, although for now nothing is actually changed
with regards to what gets reported.  The enum could be exported via
QAPI at a later date, if deemed necessary, but for now, there has not
been a request to expose that much detail to end clients.

For the most part, we turn 0 into SHUTDOWN_CAUSE_NONE, and 1 into
SHUTDOWN_CAUSE_HOST_ERROR; the only specific case where we have enough
information right now to use a different value is when we are reacting
to a host signal.  It will take a further patch to edit all call-sites
that can trigger a reset or shutdown request to properly pass in any
other reasons; this patch includes TODOs to point such places out.

qemu_system_reset() trades its 'bool report' parameter for a
'ShutdownCause reason', with all non-zero values having the same
effect; this lets us get rid of the weird #defines for VMRESET_*
as synonyms for bools.

Signed-off-by: Eric Blake <eblake@redhat.com>
Message-Id: <20170515214114.15442-3-eblake@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
2017-05-23 13:28:17 +02:00
Juan Quintela
68ba3b0743 migration: migration.h was not needed
This files don't use any function from migration.h, so drop it.

Signed-off-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
2017-05-18 19:20:59 +02:00
Stefan Hajnoczi
adb354dd1e pci, virtio, vhost: fixes
A bunch of fixes that missed the release.
 Most notably we are reverting shpc back to enabled by default state
 as guests uses that as an indicator that hotplug is supported
 (even though it's unused). Unfortunately we can't fix this
 on the stable branch since that would break migration.
 
 Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
 -----BEGIN PGP SIGNATURE-----
 
 iQEcBAABAgAGBQJZHMOuAAoJECgfDbjSjVRp/5IH/3kOa7yV3KUi4QVbQV7WwBH3
 LK+/jwIz4UhOZn+bS4qi+gjN6aFhNoBNDFmYsRTWKKdLMvZvkRBMDcv8DMIKeAyl
 kG/ispv8VI+GY/CRKnqzPm0FSulv8WPRryxkdGzK4oHiMv+4FpFR0v/n9NRHjwTA
 XNJ4k33IqBldXyZwwAzP5dT019EMvbn4bNrkLzlcF2w8mTWPf43eX/kIkRX0cAys
 5IVTQVGEOwpnyV0jxJDP+aoVMrqv8xl88LLuRpTgWUo0UnxXL5/GZQOCCUN6DQ7M
 BOLmyyP9mT9k8iUI+fQsDxAtY7cL9torq+p985nQdH0nxmI3GCoufn9aJG0J9yc=
 =d34x
 -----END PGP SIGNATURE-----

Merge remote-tracking branch 'mst/tags/for_upstream' into staging

pci, virtio, vhost: fixes

A bunch of fixes that missed the release.
Most notably we are reverting shpc back to enabled by default state
as guests uses that as an indicator that hotplug is supported
(even though it's unused). Unfortunately we can't fix this
on the stable branch since that would break migration.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>

# gpg: Signature made Wed 17 May 2017 10:42:06 PM BST
# gpg:                using RSA key 0x281F0DB8D28D5469
# gpg: Good signature from "Michael S. Tsirkin <mst@kernel.org>"
# gpg:                 aka "Michael S. Tsirkin <mst@redhat.com>"
# Primary key fingerprint: 0270 606B 6F3C DF3D 0B17  0970 C350 3912 AFBE 8E67
#      Subkey fingerprint: 5D09 FD08 71C8 F85B 94CA  8A0D 281F 0DB8 D28D 5469

* mst/tags/for_upstream:
  exec: abstract address_space_do_translate()
  pci: deassert intx when pci device unrealize
  virtio: allow broken device to notify guest
  Revert "hw/pci: disable pci-bridge's shpc by default"
  acpi-defs: clean up open brace usage
  ACPI: don't call acpi_pcihp_device_plug_cb on xen
  iommu: Don't crash if machine is not PC_MACHINE
  pc: add 2.10 machine type
  pc/fwcfg: unbreak migration from qemu-2.5 and qemu-2.6 during firmware boot
  libvhost-user: fix crash when rings aren't ready
  hw/virtio: fix vhost user fails to startup when MQ
  hw/arm/virt: generate 64-bit addressable ACPI objects
  hw/acpi-defs: replace leading X with x_ in FADT field names

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2017-05-18 10:01:08 +01:00
Eduardo Habkost
040e99686a kvmvapic: Remove user_creatable flag
The kvmvapic device is only usable when created by
apic_common_realize(), not using -device. Remove the
user_creatable flag from the device class.

Cc: Igor Mammedov <imammedo@redhat.com>
Cc: Marcel Apfelbaum <marcel@redhat.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Richard Henderson <rth@twiddle.net>
Cc: Eduardo Habkost <ehabkost@redhat.com>
Cc: "Michael S. Tsirkin" <mst@redhat.com>
Acked-by: Marcel Apfelbaum <marcel@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Message-Id: <20170503203604.31462-10-ehabkost@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
2017-05-17 10:37:01 -03:00
Eduardo Habkost
6c4672cae7 ioapic: Remove user_creatable flag
An ioapic device is already created by the q35 initialization
code, and using "-device ioapic" or "-device kvm-ioapic" will
always fail with "Only 1 ioapics allowed". Remove the
user_creatable flag from the ioapic device classes.

Cc: Igor Mammedov <imammedo@redhat.com>
Cc: Marcel Apfelbaum <marcel@redhat.com>
Cc: "Michael S. Tsirkin" <mst@redhat.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Richard Henderson <rth@twiddle.net>
Cc: Eduardo Habkost <ehabkost@redhat.com>
Acked-by: Marcel Apfelbaum <marcel@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Message-Id: <20170503203604.31462-9-ehabkost@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
2017-05-17 10:37:01 -03:00
Eduardo Habkost
642c1e0546 kvmclock: Remove user_creatable flag
kvmclock should be used by guests only when the appropriate CPUID
feature flags are set on the VCPU, and it is automatically
created by kvmclock_create() when those feature flags are set.
This means creating a kvmclock device using -device is useless.
Remove user_creatable from its device class.

Cc: Marcel Apfelbaum <marcel@redhat.com>
Cc: "Michael S. Tsirkin" <mst@redhat.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Richard Henderson <rth@twiddle.net>
Cc: Thomas Huth <thuth@redhat.com>
Acked-by: Marcel Apfelbaum <marcel@redhat.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Message-Id: <20170503203604.31462-8-ehabkost@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
2017-05-17 10:37:01 -03:00
Eduardo Habkost
8ab5700ca0 iommu: Remove FIXME comment about user_creatable=true
amd-iommu and intel-iommu are really meant to be used with
-device, so they need user_creatable=true. Remove the FIXME
comment.

Cc: Marcel Apfelbaum <marcel@redhat.com>
Cc: "Michael S. Tsirkin" <mst@redhat.com>
Reviewed-by: Marcel Apfelbaum <marcel@redhat.com>
Acked-by: Marcel Apfelbaum <marcel@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Message-Id: <20170503203604.31462-5-ehabkost@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
2017-05-17 10:37:01 -03:00
Eduardo Habkost
e4f4fb1eca sysbus: Set user_creatable=false by default on TYPE_SYS_BUS_DEVICE
commit 33cd52b5d7 unset
cannot_instantiate_with_device_add_yet in TYPE_SYSBUS, making all
sysbus devices appear on "-device help" and lack the "no-user"
flag in "info qdm".

To fix this, we can set user_creatable=false by default on
TYPE_SYS_BUS_DEVICE, but this requires setting
user_creatable=true explicitly on the sysbus devices that
actually work with -device.

Fortunately today we have just a few has_dynamic_sysbus=1
machines: virt, pc-q35-*, ppce500, and spapr.

virt, ppce500, and spapr have extra checks to ensure just a few
device types can be instantiated:

* virt supports only TYPE_VFIO_CALXEDA_XGMAC, TYPE_VFIO_AMD_XGBE.
* ppce500 supports only TYPE_ETSEC_COMMON.
* spapr supports only TYPE_SPAPR_PCI_HOST_BRIDGE.

This patch sets user_creatable=true explicitly on those 4 device
classes.

Now, the more complex cases:

pc-q35-*: q35 has no sysbus device whitelist yet (which is a
separate bug). We are in the process of fixing it and building a
sysbus whitelist on q35, but in the meantime we can fix the
"-device help" and "info qdm" bugs mentioned above. Also, despite
not being strictly necessary for fixing the q35 bug, reducing the
list of user_creatable=true devices will help us be more
confident when building the q35 whitelist.

xen: We also have a hack at xen_set_dynamic_sysbus(), that sets
has_dynamic_sysbus=true at runtime when using the Xen
accelerator. This hack is only used to allow xen-backend devices
to be dynamically plugged/unplugged.

This means today we can use -device with the following 22 device
types, that are the ones compiled into the qemu-system-x86_64 and
qemu-system-i386 binaries:

* allwinner-ahci
* amd-iommu
* cfi.pflash01
* esp
* fw_cfg_io
* fw_cfg_mem
* generic-sdhci
* hpet
* intel-iommu
* ioapic
* isabus-bridge
* kvmclock
* kvm-ioapic
* kvmvapic
* SUNW,fdtwo
* sysbus-ahci
* sysbus-fdc
* sysbus-ohci
* unimplemented-device
* virtio-mmio
* xen-backend
* xen-sysdev

This patch adds user_creatable=true explicitly to those devices,
temporarily, just to keep 100% compatibility with existing
behavior of q35. Subsequent patches will remove
user_creatable=true from the devices that are really not meant to
user-creatable on any machine, and remove the FIXME comment from
the ones that are really supposed to be user-creatable. This is
being done in separate patches because we still don't have an
obvious list of devices that will be whitelisted by q35, and I
would like to get each device reviewed individually.

Cc: Alexander Graf <agraf@suse.de>
Cc: Alex Williamson <alex.williamson@redhat.com>
Cc: Alistair Francis <alistair.francis@xilinx.com>
Cc: Beniamino Galvani <b.galvani@gmail.com>
Cc: Christian Borntraeger <borntraeger@de.ibm.com>
Cc: Cornelia Huck <cornelia.huck@de.ibm.com>
Cc: David Gibson <david@gibson.dropbear.id.au>
Cc: "Edgar E. Iglesias" <edgar.iglesias@gmail.com>
Cc: Eduardo Habkost <ehabkost@redhat.com>
Cc: Frank Blaschka <frank.blaschka@de.ibm.com>
Cc: Gabriel L. Somlo <somlo@cmu.edu>
Cc: Gerd Hoffmann <kraxel@redhat.com>
Cc: Igor Mammedov <imammedo@redhat.com>
Cc: Jason Wang <jasowang@redhat.com>
Cc: John Snow <jsnow@redhat.com>
Cc: Juergen Gross <jgross@suse.com>
Cc: Kevin Wolf <kwolf@redhat.com>
Cc: Laszlo Ersek <lersek@redhat.com>
Cc: Marcel Apfelbaum <marcel@redhat.com>
Cc: Markus Armbruster <armbru@redhat.com>
Cc: Max Reitz <mreitz@redhat.com>
Cc: "Michael S. Tsirkin" <mst@redhat.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Peter Maydell <peter.maydell@linaro.org>
Cc: Pierre Morel <pmorel@linux.vnet.ibm.com>
Cc: Prasad J Pandit <pjp@fedoraproject.org>
Cc: qemu-arm@nongnu.org
Cc: qemu-block@nongnu.org
Cc: qemu-ppc@nongnu.org
Cc: Richard Henderson <rth@twiddle.net>
Cc: Rob Herring <robh@kernel.org>
Cc: Shannon Zhao <zhaoshenglong@huawei.com>
Cc: sstabellini@kernel.org
Cc: Thomas Huth <thuth@redhat.com>
Cc: Yi Min Zhao <zyimin@linux.vnet.ibm.com>
Acked-by: John Snow <jsnow@redhat.com>
Acked-by: Juergen Gross <jgross@suse.com>
Acked-by: Marcel Apfelbaum <marcel@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Message-Id: <20170503203604.31462-3-ehabkost@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
[ehabkost: Small changes at sysbus_device_class_init() comments]
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
2017-05-17 10:37:01 -03:00
Eduardo Habkost
e90f2a8c3e qdev: Replace cannot_instantiate_with_device_add_yet with !user_creatable
cannot_instantiate_with_device_add_yet was introduced by commit
efec3dd631 to replace no_user. It was
supposed to be a temporary measure.

When it was introduced, we had 54
cannot_instantiate_with_device_add_yet=true lines in the code.
Today (3 years later) this number has not shrunk: we now have
57 cannot_instantiate_with_device_add_yet=true lines. I think it
is safe to say it is not a temporary measure, and we won't see
the flag go away soon.

Instead of a long field name that misleads people to believe it
is temporary, replace it a shorter and less misleading field:
user_creatable.

Except for code comments, changes were generated using the
following Coccinelle patch:

  @@
  expression DC;
  @@
  (
  -DC->cannot_instantiate_with_device_add_yet = false;
  +DC->user_creatable = true;
  |
  -DC->cannot_instantiate_with_device_add_yet = true;
  +DC->user_creatable = false;
  )

  @@
  typedef ObjectClass;
  expression dc;
  identifier class, data;
  @@
   static void device_class_init(ObjectClass *class, void *data)
   {
   ...
   dc->hotpluggable = true;
  +dc->user_creatable = true;
   ...
   }

  @@
  @@
   struct DeviceClass {
   ...
  -bool cannot_instantiate_with_device_add_yet;
  +bool user_creatable;
   ...
  }

  @@
  expression DC;
  @@
  (
  -!DC->cannot_instantiate_with_device_add_yet
  +DC->user_creatable
  |
  -DC->cannot_instantiate_with_device_add_yet
  +!DC->user_creatable
  )

Cc: Alistair Francis <alistair.francis@xilinx.com>
Cc: Laszlo Ersek <lersek@redhat.com>
Cc: Marcel Apfelbaum <marcel@redhat.com>
Cc: Markus Armbruster <armbru@redhat.com>
Cc: Peter Maydell <peter.maydell@linaro.org>
Cc: Thomas Huth <thuth@redhat.com>
Acked-by: Alistair Francis <alistair.francis@xilinx.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Marcel Apfelbaum <marcel@redhat.com>
Acked-by: Marcel Apfelbaum <marcel@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Message-Id: <20170503203604.31462-2-ehabkost@redhat.com>
[ehabkost: kept "TODO remove once we're there" comment]
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
2017-05-17 10:37:00 -03:00
Stefano Stabellini
1ff7c5986a xen/mapcache: store dma information in revmapcache entries for debugging
The Xen mapcache is able to create long term mappings, they are called
"locked" mappings. The third parameter of the xen_map_cache call
specifies if a mapping is a "locked" mapping.

>From the QEMU point of view there are two kinds of long term mappings:

[a] device memory mappings, such as option roms and video memory
[b] dma mappings, created by dma_memory_map & friends

After certain operations, ballooning a VM in particular, Xen asks QEMU
kindly to destroy all mappings. However, certainly [a] mappings are
present and cannot be removed. That's not a problem as they are not
affected by balloonning. The *real* problem is that if there are any
mappings of type [b], any outstanding dma operations could fail. This is
a known shortcoming. In other words, when Xen asks QEMU to destroy all
mappings, it is an error if any [b] mappings exist.

However today we have no way of distinguishing [a] from [b]. Because of
that, we cannot even print a decent warning.

This patch introduces a new "dma" bool field to MapCacheRev entires, to
remember if a given mapping is for dma or is a long term device memory
mapping. When xen_invalidate_map_cache is called, we print a warning if
any [b] mappings exist. We ignore [a] mappings.

Mappings created by qemu_map_ram_ptr are assumed to be [a], while
mappings created by address_space_map->qemu_ram_ptr_length are assumed
to be [b].

The goal of the patch is to make debugging and system understanding
easier.

Signed-off-by: Stefano Stabellini <sstabellini@kernel.org>
Acked-by: Paolo Bonzini <pbonzini@redhat.com>
Acked-by: Anthony PERARD <anthony.perard@citrix.com>
2017-05-16 11:49:09 -07:00
Igor Mammedov
ea2650724c pc: get numa node mapping from possible_cpus instead of numa_get_node_for_cpu()
Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Andrew Jones <drjones@redhat.com>
Message-Id: <1494415802-227633-9-git-send-email-imammedo@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
2017-05-11 16:08:49 -03:00
Igor Mammedov
93b2a8cb0b pc: add node-id property to CPU
it will allow switching from cpu_index to property based
numa mapping in follow up patches.

PS:
patch changes default value of CPUState::numa_node from 0
to CPU_UNSET_NUMA_NODE_ID. The only place for x86 that
would affected is monitor's 'infor numa' command which
uses that field. However legacy 0 value is still preserved
by pc_cpu_pre_plug() in this patch if user/numa.c hasn't
set it explicitly, so there is no change in behavior.

Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Message-Id: <1494415802-227633-4-git-send-email-imammedo@redhat.com>
Reviewed-by: Eduardo Habkost <ehabkost@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
2017-05-11 16:08:48 -03:00
Igor Mammedov
ea089eebbd numa: move source of default CPUs to NUMA node mapping into boards
Originally CPU threads were by default assigned in
round-robin fashion. However it was causing issues in
guest since CPU threads from the same socket/core could
be placed on different NUMA nodes.
Commit fb43b73b (pc: fix default VCPU to NUMA node mapping)
fixed it by grouping threads within a socket on the same node
introducing cpu_index_to_socket_id() callback and commit
20bb648d (spapr: Fix default NUMA node allocation for threads)
reused callback to fix similar issues for SPAPR machine
even though socket doesn't make much sense there.

As result QEMU ended up having 3 default distribution rules
used by 3 targets /virt-arm, spapr, pc/.

In effort of moving NUMA mapping for CPUs into possible_cpus,
generalize default mapping in numa.c by making boards decide
on default mapping and let them explicitly tell generic
numa code to which node a CPU thread belongs to by replacing
cpu_index_to_socket_id() with @cpu_index_to_instance_props()
which provides default node_id assigned by board to specified
cpu_index.

Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Eduardo Habkost <ehabkost@redhat.com>
Message-Id: <1494415802-227633-2-git-send-email-imammedo@redhat.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
2017-05-11 16:08:48 -03:00
Laurent Vivier
3bfe57165b numa: equally distribute memory on nodes
When there are more nodes than available memory to put the minimum
allowed memory by node, all the memory is put on the last node.

This is because we put (ram_size / nb_numa_nodes) &
~((1 << mc->numa_mem_align_shift) - 1); on each node, and in this
case the value is 0. This is particularly true with pseries,
as the memory must be aligned to 256MB.

To avoid this problem, this patch uses an error diffusion algorithm [1]
to distribute equally the memory on nodes.

We introduce numa_auto_assign_ram() function in MachineClass
to keep compatibility between machine type versions.
The legacy function is used with pseries-2.9, pc-q35-2.9 and
pc-i440fx-2.9 (and previous), the new one with all others.

Example:

qemu-system-ppc64 -S -nographic  -nodefaults -monitor stdio -m 1G -smp 8 \
                  -numa node -numa node -numa node \
                  -numa node -numa node -numa node

Before:

(qemu) info numa
6 nodes
node 0 cpus: 0 6
node 0 size: 0 MB
node 1 cpus: 1 7
node 1 size: 0 MB
node 2 cpus: 2
node 2 size: 0 MB
node 3 cpus: 3
node 3 size: 0 MB
node 4 cpus: 4
node 4 size: 0 MB
node 5 cpus: 5
node 5 size: 1024 MB

After:
(qemu) info numa
6 nodes
node 0 cpus: 0 6
node 0 size: 0 MB
node 1 cpus: 1 7
node 1 size: 256 MB
node 2 cpus: 2
node 2 size: 0 MB
node 3 cpus: 3
node 3 size: 256 MB
node 4 cpus: 4
node 4 size: 256 MB
node 5 cpus: 5
node 5 size: 256 MB

[1] https://en.wikipedia.org/wiki/Error_diffusion

Signed-off-by: Laurent Vivier <lvivier@redhat.com>
Message-Id: <20170502162955.1610-2-lvivier@redhat.com>
Reviewed-by: Eduardo Habkost <ehabkost@redhat.com>
[ehabkost: s/ram_size/size/ at numa_default_auto_assign_ram()]
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
2017-05-11 16:08:47 -03:00
He Chen
0f203430dd numa: Allow setting NUMA distance for different NUMA nodes
This patch is going to add SLIT table support in QEMU, and provides
additional option `dist` for command `-numa` to allow user set vNUMA
distance by QEMU command.

With this patch, when a user wants to create a guest that contains
several vNUMA nodes and also wants to set distance among those nodes,
the QEMU command would like:

```
-numa node,nodeid=0,cpus=0 \
-numa node,nodeid=1,cpus=1 \
-numa node,nodeid=2,cpus=2 \
-numa node,nodeid=3,cpus=3 \
-numa dist,src=0,dst=1,val=21 \
-numa dist,src=0,dst=2,val=31 \
-numa dist,src=0,dst=3,val=41 \
-numa dist,src=1,dst=2,val=21 \
-numa dist,src=1,dst=3,val=31 \
-numa dist,src=2,dst=3,val=21 \
```

Signed-off-by: He Chen <he.chen@linux.intel.com>
Message-Id: <1493260558-20728-1-git-send-email-he.chen@linux.intel.com>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Andrew Jones <drjones@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
2017-05-11 16:08:37 -03:00