Commit Graph

14548 Commits

Author SHA1 Message Date
Igor Mammedov
df0acded19 memhp: extend address auto assignment to support gaps
setting gap to TRUE will make sparse DIMM
address auto allocation, leaving gaps between
a new DIMM address and preceeding existing DIMM.

Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2015-10-02 17:04:32 +03:00
David Hildenbrand
d9f090ec77 s390x: rename io_subsystem_reset -> subsystem_reset
According to the Pop:
"Subsystem reset operates only on those elements in the configuration
which are not CPUs".

As this is what we actually do, let's simply rename the function.

Acked-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: David Hildenbrand <dahi@linux.vnet.ibm.com>
Signed-off-by: Jens Freimann <jfrei@linux.vnet.ibm.com>
Message-Id: <1443689387-34473-6-git-send-email-jfrei@linux.vnet.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2015-10-02 13:31:52 +02:00
David Hildenbrand
7059384c7e s390x: set missing parent for hotplug and quiesce events
Existing code missed to set a parent for the quiesce and hotplug event.
While this didn't matter in practise, new introspection APIs basically now
do an object_unref(object_new(T)), which loops forever.

When trying to remove the event facility bus, the code tries to
unparent all childs on the bus, so they are properly deleted and therefore removed.
As object_unparent() on these child devices doesn't work, as there is no parent,
we loop forever.

Let's fix this by adding the event facility as a parent. Also switch from
object_initialize to object_new, so the only valid reference is in fact the
parent property. This makes it more obvious when the device (state) is actually
gone (and how the reference counting works).

Signed-off-by: David Hildenbrand <dahi@linux.vnet.ibm.com>
Signed-off-by: Jens Freimann <jfrei@linux.vnet.ibm.com>
Message-Id: <1443689387-34473-4-git-send-email-jfrei@linux.vnet.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2015-10-02 13:31:52 +02:00
Richard Henderson
0d583647a7 virtio: Notice when the system doesn't support MSIx at all
And do not issue an error_report in that case.

Signed-off-by: Richard Henderson <rth@twiddle.net>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2015-10-01 16:16:52 +03:00
Eduardo Habkost
798595075b pc: Add a comment explaining why pc_compat_2_4() doesn't exist
pc_compat_2_4() doesn't exist, and we shouldn't create one. Add a
comment explaining why the function doesn't exist and why pc_compat_*()
functions are deprecated.

Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2015-10-01 16:16:52 +03:00
Jason Wang
0cf33fb6b4 virtio-net: correctly drop truncated packets
When packet is truncated during receiving, we drop the packets but
neither discard the descriptor nor add and signal used
descriptor. This will lead several issues:

- sg mappings are leaked
- rx will be stalled if a lots of packets were truncated

In order to be consistent with vhost, fix by discarding the descriptor
in this case.

Cc: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2015-10-01 16:16:52 +03:00
Jason Wang
29b9f5efd7 virtio: introduce virtqueue_discard()
This patch introduces virtqueue_discard() to discard a descriptor and
unmap the sgs. This will be used by the patch that will discard
descriptor when packet is truncated.

Cc: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2015-10-01 16:16:52 +03:00
Jason Wang
ce31746157 virtio: introduce virtqueue_unmap_sg()
Factor out sg unmapping logic. This will be reused by the patch that
can discard descriptor.

Cc: Michael S. Tsirkin <mst@redhat.com>
Cc: Andrew James <andrew.james@hpe.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2015-10-01 16:16:52 +03:00
Peter Maydell
9e071429e6 * First batch of MAINTAINERS updates
* IOAPIC fixes (to pass kvm-unit-tests with -machine kernel_irqchip=off)
 * NBD API upgrades from Daniel
 * strtosz fixes from Marc-André
 * improved support for readonly=on on scsi-generic devices
 * new "info ioapic" and "info lapic" monitor commands
 * Peter Crosthwaite's ELF_MACHINE cleanups
 * docs patches from Thomas and Daniel
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2
 
 iQEcBAABCAAGBQJWBSAEAAoJEL/70l94x66DeL4H/21YR4GWCqo30f+W5kx24ZNo
 by8H2kdZmWKRr/La1JlAReki9GCP1U8Q0cYC8V885gHLKcahWS/75UKwNbw0OSyg
 2jj4uREc645TTFAvV5kQ+uAw9F/dchvkXylrVgOoUPipfmYibXY8JLu9AcVnZi6H
 X5Rvpqo4Uhp2cbRG7rYWrwgpNL+VZmKc8LDdqdlXrkjjanhuAYO2E9NBKaE+xJQQ
 FHcpkV92iSZFEZ0CB535BTIdNdDM/ae6bw1As27EF10YBTfneCQNazSeh13pLO2n
 lHit2GZr2VeTSBrPkPsItToY/Gw38duVZK4QM5/wSkHBzyeUJY0ltQrf53veYfk=
 =uc+I
 -----END PGP SIGNATURE-----

Merge remote-tracking branch 'remotes/bonzini/tags/for-upstream' into staging

* First batch of MAINTAINERS updates
* IOAPIC fixes (to pass kvm-unit-tests with -machine kernel_irqchip=off)
* NBD API upgrades from Daniel
* strtosz fixes from Marc-André
* improved support for readonly=on on scsi-generic devices
* new "info ioapic" and "info lapic" monitor commands
* Peter Crosthwaite's ELF_MACHINE cleanups
* docs patches from Thomas and Daniel

# gpg: Signature made Fri 25 Sep 2015 11:20:52 BST using RSA key ID 78C7AE83
# gpg: Good signature from "Paolo Bonzini <bonzini@gnu.org>"
# gpg:                 aka "Paolo Bonzini <pbonzini@redhat.com>"

* remotes/bonzini/tags/for-upstream: (52 commits)
  doc: Refresh URLs in the qemu-tech documentation
  docs: describe the QEMU build system structure / design
  typedef: add typedef for QemuOpts
  i386: interrupt poll processing
  i386: partial revert of interrupt poll fix
  ppc: Rename ELF_MACHINE to be PPC specific
  i386: Rename ELF_MACHINE to be x86 specific
  alpha: Remove ELF_MACHINE from cpu.h
  mips: Remove ELF_MACHINE from cpu.h
  sparc: Remove ELF_MACHINE from cpu.h
  s390: Remove ELF_MACHINE from cpu.h
  sh4: Remove ELF_MACHINE from cpu.h
  xtensa: Remove ELF_MACHINE from cpu.h
  tricore: Remove ELF_MACHINE from cpu.h
  or32: Remove ELF_MACHINE from cpu.h
  lm32: Remove ELF_MACHINE from cpu.h
  unicore: Remove ELF_MACHINE from cpu.h
  moxie: Remove ELF_MACHINE from cpu.h
  cris: Remove ELF_MACHINE from cpu.h
  m68k: Remove ELF_MACHINE from cpu.h
  ...

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2015-09-25 21:52:30 +01:00
Peter Maydell
8bfbbb4bcb VFIO updates 2015-09-25
- Remove use of g_malloc0_n for glib2.22 compat
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQIcBAABAgAGBQJWBX0cAAoJECObm247sIsidV0P/1A1hgFW+iFl2sA5kesfV3DG
 FSiPTEI0xF+lDJM498i31QatvVAAMYgk6qK7CWyRCodfWhCntAUxAPoUkeKsEP+k
 cVo7qOeO+PUCFwVdykuxWRTsVVkuhjcOZZ0UylWs9A5G2biIncTzum8mAycTshqz
 AWqk52pQMTAgkV5bD/ZAH1IMp9NX1Ipc2mngtXubhYDZPCVq2ERHbr2TqTj27Qzj
 9f1tdeD0VOy6JvQqu2nFqDMrDaVtjMAUldg473UfNTNNaNUdvbr4+o88+oSrNdke
 U7tdLwV/ani7FLUI1dadezbvGlx64EOO8Pa7FO5ZrjmS+IHVMBTVlPLxNd5ljHdJ
 e9Ei+lsgH6nGnAWuL7b1HI6VUS1dExahqZtA6NGG2DxmYqcKQDH49bDUNmSuFcFj
 ViQAeU11+ZpEm3IgvvQl3ZYuSTuQwzfH7xc73v5nWp9m7YNPVyqPerxqOgHxxzeG
 +v9wSi+9Tt4SoEMzyABr4s8WwRUY7YWXPtF077/E4Jk24xNpJaDoP/A1bQ0JORee
 gVlKOYJA+Smy5sQmLss5h5iQ9zun97Ad+WwTA2QdCLeKA7p6x+NCtUUk+7aH4lVv
 doood1Xx2Seic9LDOy9G8h2TN8gMG870gcbPUfEgxfEnLwSHLOnW7w+X5xLSoFjr
 u/8eIeLuPQLMeNf11ocj
 =BLF9
 -----END PGP SIGNATURE-----

Merge remote-tracking branch 'remotes/awilliam/tags/vfio-update-20150925.0' into staging

VFIO updates 2015-09-25

 - Remove use of g_malloc0_n for glib2.22 compat

# gpg: Signature made Fri 25 Sep 2015 17:58:04 BST using RSA key ID 3BB08B22
# gpg: Good signature from "Alex Williamson <alex.williamson@redhat.com>"
# gpg:                 aka "Alex Williamson <alex@shazbot.org>"
# gpg:                 aka "Alex Williamson <alwillia@redhat.com>"
# gpg:                 aka "Alex Williamson <alex.l.williamson@gmail.com>"

* remotes/awilliam/tags/vfio-update-20150925.0:
  vfio/pci: Remove use of g_malloc0_n() from quirks

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2015-09-25 21:11:12 +01:00
Peter Maydell
690b286fef Remove muldiv64() by using period instead of frequency
-----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQIcBAABAgAGBQJWBVIdAAoJEPMMOL0/L748BXUP/2j+TRnsaU/gMHoL2IGP78BK
 LLLOL7yyV8ZrsrOFvyv8IW0DtpldoYsvObty/bNAr0iq+QwqwGWn9Gw4im5DtIkN
 s7e1WcxLgFHHcT1QLa70MUjjVtRrflTmcW9TVIW79PQ+HsCqnb7EmFZ96HxzH3zN
 YM93eBT6cJV3axsLwJsE82igCXsLo3raKGNb0jt8b6/XwMoR3iUb1Kgs2dJXZUJw
 TYPtHv7sJpQiLQY8Y8o4EjyyjdFuWPVeIfokgPahoOdVA1PSCx6Qh8o+FV1GZ+nF
 vmAr7Jolri6tdbMgRWtIgQQs2YSvPNIUEOYTXVu/4p287JGZPNU5790V2aIczERc
 gEPTqjI6w1AYy8/yMlO3WpfFxXWZH6ZsNBmxCmhH/mczA2dx3DzDlyI7SofQsCHW
 +81U6GSc/Ryy47C+b6m/YZNQDx3yG8rUFtY4PqCcjJwPZdSEhLEM7crC2XWJwy+0
 rg3SnVvXuE2vC/k7UHEYbnFOyVbvezUYJnigbppMilO8nfXIsyuvc7G4AT96jxbt
 4HQJT6ESGEsIToslWObJ53z3jzoAA17xp4gzkZjx7RwSofkFFIaT7jjaA/D2cxFn
 UOXZgAfde6mfg4Ak0czcBYYvm+peEjXBC+DfsBjfAcQ1dz6WSGyd3QZY0J7i9/7y
 iSNiuCE9J6Ha7XVVYzd2
 =krOI
 -----END PGP SIGNATURE-----

Merge remote-tracking branch 'remotes/vivier-misc/tags/pull-muldiv64-20150925' into staging

Remove muldiv64() by using period instead of frequency

# gpg: Signature made Fri 25 Sep 2015 14:54:37 BST using RSA key ID 3F2FBE3C
# gpg: Good signature from "Laurent Vivier <lvivier@redhat.com>"
# gpg:                 aka "Laurent Vivier <laurent@vivier.eu>"
# gpg:                 aka "Laurent Vivier (Red Hat) <lvivier@redhat.com>"
# gpg: WARNING: This key is not certified with a trusted signature!
# gpg:          There is no indication that the signature belongs to the owner.
# Primary key fingerprint: CD2F 75DD C8E3 A4DC 2E4F  5173 F30C 38BD 3F2F BE3C

* remotes/vivier-misc/tags/pull-muldiv64-20150925:
  net: remove muldiv64()
  bt: remove muldiv64()
  hpet: remove muldiv64()
  arm: clarify the use of muldiv64()
  openrisc: remove muldiv64()
  mips: remove muldiv64()
  pcnet: remove muldiv64()
  rtl8139: remove muldiv64()
  i6300esb: remove muldiv64()

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2015-09-25 18:03:19 +01:00
Peter Maydell
cdf9818242 virtio,pc features, fixes
New features:
     vhost-user multiqueue support
     virtio-ccw virtio 1 support
 
 Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQEcBAABAgAGBQJWBOxjAAoJECgfDbjSjVRpao8H/1hV55WvPXyEHB9ian+JPVEb
 pYFUcKGRO/bWMbXkqWnIBzNPrViPNQHot3zrOcoXtgnBGcuniiteGcAtqj4WEkgb
 WSa22AI1QrEPfHIkhR3sYdJAsqte/RppnFKLSDDi9TwKOGUho47OnkzJWfB+vuup
 7YM/r8YDCkckdvsvfsCwW4Fbjxv7oKSokFkkdV/NwNDocNvRSBS9iAXsQYFdS7tm
 8DIkWK63HQDY9in+fYkk8zoaXK7oZMyi3vHd2g4W0t0mGznxj9dxomrJrMo/4GWZ
 ZrnlB9R1QxpOCtoDtozelxkCnLJhEVjd8xYkGPg+xzYjrxl9aHIWjSNGhf5Q9QY=
 =5IBX
 -----END PGP SIGNATURE-----

Merge remote-tracking branch 'remotes/mst/tags/for_upstream' into staging

virtio,pc features, fixes

New features:
    vhost-user multiqueue support
    virtio-ccw virtio 1 support

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>

# gpg: Signature made Fri 25 Sep 2015 07:40:35 BST using RSA key ID D28D5469
# gpg: Good signature from "Michael S. Tsirkin <mst@kernel.org>"
# gpg:                 aka "Michael S. Tsirkin <mst@redhat.com>"

* remotes/mst/tags/for_upstream:
  MAINTAINERS: add more devices to the PCI section
  MAINTAINERS: add more devices to the PC section
  vhost-user: add a new message to disable/enable a specific virt queue.
  vhost-user: add multiple queue support
  vhost: introduce vhost_backend_get_vq_index method
  vhost-user: add VHOST_USER_GET_QUEUE_NUM message
  vhost: rename VHOST_RESET_OWNER to VHOST_RESET_DEVICE
  vhost-user: add protocol feature negotiation
  vhost-user: use VHOST_USER_XXX macro for switch statement
  virtio-ccw: enable virtio-1
  virtio-ccw: feature bits > 31 handling
  virtio-ccw: support ring size changes
  virtio: ring sizes vs. reset
  pc: Introduce pc-*-2.5 machine classes
  q35: Move options common to all classes to pc_i440fx_machine_options()
  q35: Move options common to all classes to pc_q35_machine_options()
  virtio-net: unbreak self announcement and guest offloads after migration
  virtio: right size for virtio_queue_get_avail_size

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2015-09-25 16:40:05 +01:00
Laurent Vivier
fdfea124f9 bt: remove muldiv64()
Originally, timers were ticks based, and it made sense to
add ticks to current time to know when to trigger an alarm.

But since commit:

7447545 change all other clock references to use nanosecond resolution accessors

All timers use nanoseconds and we need to convert ticks to nanoseconds.

As get_ticks_per_sec() is 10^9,

    a = muldiv64(b, get_ticks_per_sec(), 100);
    y = muldiv64(x, get_ticks_per_sec(), 1000000);

can be converted to

    a = b * 10000000;
    y = x * 1000;

Signed-off-by: Laurent Vivier <lvivier@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
2015-09-25 14:56:22 +02:00
Laurent Vivier
0a4f9240f5 hpet: remove muldiv64()
hpet defines a clock period in femtoseconds but
then converts it to nanoseconds to use the internal
timers.

We can define the period in nanoseconds and use it
directly, this allows to remove muldiv64().

We only need to convert the period to femtoseconds
to put it in internal hpet capability register.

Signed-off-by: Laurent Vivier <lvivier@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
2015-09-25 14:56:05 +02:00
Laurent Vivier
ccaf174923 openrisc: remove muldiv64()
Originally, timers were ticks based, and it made sense to
add ticks to current time to know when to trigger an alarm.

But since commit:

7447545 change all other clock references to use nanosecond resolution accessors

All timers use nanoseconds and we need to convert ticks to nanoseconds, by
doing something like:

    y = muldiv64(x, get_ticks_per_sec(), TIMER_FREQ)

where x is the number of device ticks and y the number of system ticks.

y is used as nanoseconds in timer functions,
it works because 1 tick is 1 nanosecond.
(get_ticks_per_sec() is 10^9)

But as openrisc timer frequency is 20 MHz, we can also do:

    y = x * 50; /* 20 MHz period is 50 ns */

Signed-off-by: Laurent Vivier <lvivier@redhat.com>
2015-09-25 14:54:22 +02:00
Laurent Vivier
683dca6bd5 mips: remove muldiv64()
Originally, timers were ticks based, and it made sense to
add ticks to current time to know when to trigger an alarm.

But since commit:

7447545 change all other clock references to use nanosecond resolution accessors

All timers use nanoseconds and we need to convert ticks to nanoseconds, by
doing something like:

    y = muldiv64(x, get_ticks_per_sec(), TIMER_FREQ)

where x is the number of device ticks and y the number of system ticks.

y is used as nanoseconds in timer functions,
it works because 1 tick is 1 nanosecond.
(get_ticks_per_sec() is 10^9)

But as MIPS timer frequency is 100 MHz, we can also do:

    y = x * 10; /* 100 MHz period is 10 ns */

Signed-off-by: Laurent Vivier <lvivier@redhat.com>
Reviewed-by: Leon Alrae <leon.alrae@imgtec.com>
2015-09-25 14:54:04 +02:00
Laurent Vivier
c6acbe861f pcnet: remove muldiv64()
Originally, timers were ticks based, and it made sense to
add ticks to current time to know when to trigger an alarm.

But since commit:

7447545 change all other clock references to use nanosecond resolution accessors

All timers use nanoseconds and we need to convert ticks to nanoseconds, by
doing something like:

    y = muldiv64(x, get_ticks_per_sec(), PCI_FREQUENCY)

where x is the number of device ticks and y the number of system ticks.

y is used as nanoseconds in timer functions,
it works because 1 tick is 1 nanosecond.
(get_ticks_per_sec() is 10^9)

But as PCI frequency is 33 MHz, we can also do:

    y = x * 30; /* 33 MHz PCI period is 30 ns */

Which is much more simple.

This implies a 33.333333 MHz PCI frequency,
but this is correct.

Signed-off-by: Laurent Vivier <lvivier@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
2015-09-25 14:53:50 +02:00
Laurent Vivier
37b9ab92f7 rtl8139: remove muldiv64()
Originally, timers were ticks based, and it made sense to
add ticks to current time to know when to trigger an alarm.

But since commit:

7447545 change all other clock references to use nanosecond resolution accessors

All timers use nanoseconds and we need to convert ticks to nanoseconds, by
doing something like:

    y = muldiv64(x, get_ticks_per_sec(), PCI_FREQUENCY)

where x is the number of device ticks and y the number of system ticks.

y is used as nanoseconds in timer functions,
it works because 1 tick is 1 nanosecond.
(get_ticks_per_sec() is 10^9)

But as PCI frequency is 33 MHz, we can also do:

    y = x * 30; /* 33 MHz PCI period is 30 ns */

Which is much more simple.

This implies a 33.333333 MHz PCI frequency,
but this is correct.

Signed-off-by: Laurent Vivier <lvivier@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
2015-09-25 14:53:29 +02:00
Laurent Vivier
9491e9bc01 i6300esb: remove muldiv64()
Originally, timers were ticks based, and it made sense to
add ticks to current time to know when to trigger an alarm.

But since commit:

7447545 change all other clock references to use nanosecond resolution accessors

All timers use nanoseconds and we need to convert ticks to nanoseconds, by
doing something like:

    y = muldiv64(x, get_ticks_per_sec(), PCI_FREQUENCY)

where x is the number of device ticks and y the number of system ticks.

y is used as nanoseconds in timer functions,
it works because 1 tick is 1 nanosecond.
(get_ticks_per_sec() is 10^9)

But as PCI frequency is 33 MHz, we can also do:

    y = x * 30; /* 33 MHz PCI period is 30 ns */

Which is much more simple.

This implies a 33.333333 MHz PCI frequency,
but this is correct.

Signed-off-by: Laurent Vivier <lvivier@redhat.com>
2015-09-25 14:52:17 +02:00
Peter Crosthwaite
4ecd4d16a0 ppc: Rename ELF_MACHINE to be PPC specific
Rename ELF_MACHINE to be PPC specific. This is used as-is by the
various PPC bootloaders and is locally defined to ELF_MACHINE in linux
user in PPC specific ifdeffery.

This removes another architecture specific definition from the global
namespace (as desired by multi-arch).

Cc: Alexander Graf <agraf@suse.de>
Cc: qemu-ppc@nongnu.org
Reviewed-by: Richard Henderson <rth@twiddle.net>
Acked-By: Riku Voipio <riku.voipio@linaro.org>
Signed-off-by: Peter Crosthwaite <crosthwaite.peter@gmail.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2015-09-25 12:04:44 +02:00
Peter Crosthwaite
a5e8788f89 i386: Rename ELF_MACHINE to be x86 specific
Rename ELF_MACHINE to be I386 specific. This is used as-is by the
multiboot loader.

Linux-user previously used this definition but will not anymore,
falling back to the default bahaviour of using ELF_ARCH as ELF_MACHINE.

This removes another architecture specific definition from the global
namespace.

Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Richard Henderson <rth@twiddle.net>
Cc: Eduardo Habkost <ehabkost@redhat.com>
Reviewed-by: Eduardo Habkost <ehabkost@redhat.com>
Acked-by: Eduardo Habkost <ehabkost@redhat.com>
Reviewed-by: Richard Henderson <rth@twiddle.net>
Acked-By: Riku Voipio <riku.voipio@linaro.org>
Signed-off-by: Peter Crosthwaite <crosthwaite.peter@gmail.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2015-09-25 12:04:44 +02:00
Peter Crosthwaite
04ce380e9e mips: Remove ELF_MACHINE from cpu.h
The only generic code relying on this is linux-user, but linux users'
default behaviour of defaulting ELF_MACHINE to ELF_ARCH will handle
this.

The bootloaders can just pass EM_MIPS directly, as that is
architecture specific code.

This removes another architecture specific definition from the global
namespace.

Cc: Aurelien Jarno <aurelien@aurel32.net>
Cc: Leon Alrae <leon.alrae@imgtec.com>
Reviewed-by: Aurelien Jarno <aurelien@aurel32.net>
Reviewed-by: Richard Henderson <rth@twiddle.net>
Acked-By: Riku Voipio <riku.voipio@linaro.org>
Signed-off-by: Peter Crosthwaite <crosthwaite.peter@gmail.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2015-09-25 12:04:44 +02:00
Peter Crosthwaite
77452383e0 sparc: Remove ELF_MACHINE from cpu.h
The bootloaders can just pass EM_SPARC or EM_SPARCV9 directly, as
they are architecture specific code (to one or the other).

This removes another architecture specific definition from the global
namespace.

Cc: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Reviewed-by: Richard Henderson <rth@twiddle.net>
Acked-By: Riku Voipio <riku.voipio@linaro.org>
Signed-off-by: Peter Crosthwaite <crosthwaite.peter@gmail.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2015-09-25 12:04:44 +02:00
Peter Crosthwaite
99a4434ed7 s390: Remove ELF_MACHINE from cpu.h
The bootloader can just pass EM_S390 directly, as that
is architecture specific code.

This removes another architecture specific definition from the global
namespace.

Cc: Richard Henderson <rth@twiddle.net>
Cc: Alexander Graf <agraf@suse.de>
Reviewed-by: Richard Henderson <rth@twiddle.net>
Acked-By: Riku Voipio <riku.voipio@linaro.org>
Signed-off-by: Peter Crosthwaite <crosthwaite.peter@gmail.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2015-09-25 12:04:44 +02:00
Peter Crosthwaite
943cd38722 xtensa: Remove ELF_MACHINE from cpu.h
The bootloaders can just pass EM_XTENSA directly, as that
is architecture specific code.

This removes another architecture specific definition from the global
namespace.

Cc: Max Filippov <jcmvbkbc@gmail.com>
Reviewed-by: Richard Henderson <rth@twiddle.net>
Acked-By: Riku Voipio <riku.voipio@linaro.org>
Signed-off-by: Peter Crosthwaite <crosthwaite.peter@gmail.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2015-09-25 12:04:44 +02:00
Peter Crosthwaite
7183128bc9 tricore: Remove ELF_MACHINE from cpu.h
The bootloader can just pass EM_TRICORE directly, as that
is architecture specific code.

This removes another architecture specific definition from the global
namespace.

Cc: Bastian Koppelmann <kbastian@mail.uni-paderborn.de>
Acked-By: Bastian Koppelmann <kbastian@mail.uni-paderborn.de>
Reviewed-by: Richard Henderson <rth@twiddle.net>
Acked-By: Riku Voipio <riku.voipio@linaro.org>
Signed-off-by: Peter Crosthwaite <crosthwaite.peter@gmail.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2015-09-25 12:04:44 +02:00
Peter Crosthwaite
ed03ecf8f0 or32: Remove ELF_MACHINE from cpu.h
The only generic code relying on this is linux-user, but linux users'
default behaviour of defaulting ELF_MACHINE to ELF_ARCH will handle
this.

The bootloader can just pass EM_OPENRISC directly, as that is
architecture specific code.

This removes another architecture specific definition from the global
namespace.

Cc: Jia Liu <proljc@gmail.com>
Reviewed-by: Richard Henderson <rth@twiddle.net>
Acked-By: Riku Voipio <riku.voipio@linaro.org>
Signed-off-by: Peter Crosthwaite <crosthwaite.peter@gmail.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2015-09-25 12:04:44 +02:00
Peter Crosthwaite
22d2fb4c59 lm32: Remove ELF_MACHINE from cpu.h
The bootloaders can just pass EM_LATTICEMICO32 directly, as that is
architecture specific code.

This removes another architecture specific definition from the global
namespace.

Cc: Michael Walle <michael@walle.cc>
Acked-By: Michael Walle <michael@walle.cc>
Reviewed-by: Richard Henderson <rth@twiddle.net>
Acked-By: Riku Voipio <riku.voipio@linaro.org>
Signed-off-by: Peter Crosthwaite <crosthwaite.peter@gmail.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2015-09-25 12:04:43 +02:00
Peter Crosthwaite
b744d332f3 moxie: Remove ELF_MACHINE from cpu.h
The bootloader can just pass EM_MOXIE directly, as that is architecture
specific code.

This removes another architecture specific definition from the global
namespace.

Cc: Anthony Green <green@moxielogic.com>
Reviewed-by: Richard Henderson <rth@twiddle.net>
Acked-By: Riku Voipio <riku.voipio@linaro.org>
Signed-off-by: Peter Crosthwaite <crosthwaite.peter@gmail.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2015-09-25 12:04:43 +02:00
Peter Crosthwaite
7233df4949 cris: Remove ELF_MACHINE from cpu.h
The only generic code relying on this is linux-user, but linux users'
default behaviour of defaulting ELF_MACHINE to ELF_ARCH will handle
this.

The bootloader can just pass EM_CRIS directly, as that is architecture
specific code.

This removes another architecture specific definition from the global
namespace.

Cc: Edgar E. Iglesias <edgar.iglesias@gmail.com>
Reviewed-by: Richard Henderson <rth@twiddle.net>
Acked-By: Riku Voipio <riku.voipio@linaro.org>
Signed-off-by: Peter Crosthwaite <crosthwaite.peter@gmail.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2015-09-25 12:04:43 +02:00
Peter Crosthwaite
45e6b8b61a m68k: Remove ELF_MACHINE from cpu.h
The only generic code relying on this is linux-user, but linux users'
default behaviour of defaulting ELF_MACHINE to ELF_ARCH will handle
this.

The machine model bootloaders can just pass EM_68K directly, as that
is architecture specific code.

This removes another architecture specific definition from the global
namespace.

Cc: Laurent Vivier <laurent@vivier.eu>
Cc: Greg Ungerer <gerg@uclinux.org>
Reviewed-by: Richard Henderson <rth@twiddle.net>
Acked-By: Riku Voipio <riku.voipio@linaro.org>
Reviewed-by: Greg Ungerer <gerg@uclinux.org>
Reviewed-by: Laurent Vivier <laurent@vivier.eu>
Signed-off-by: Peter Crosthwaite <crosthwaite.peter@gmail.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2015-09-25 12:04:43 +02:00
Peter Crosthwaite
f4fc2bbfa2 mb: Remove ELF_MACHINE from cpu.h
The only generic code relying on this is linux-user, but linux-users'
default behaviour or setting ELF_MACHINE to ELF_ARCH will handle this.

The microblaze bootloader can just pass EM_MICROBLAZE directly, as that
is architecture specific code.

This removes another architecture specific definition from the global
namespace.

Cc: Edgar E. Iglesias <edgar.iglesias@gmail.com>
Reviewed-by: Richard Henderson <rth@twiddle.net>
Acked-By: Riku Voipio <riku.voipio@linaro.org>
Signed-off-by: Peter Crosthwaite <crosthwaite.peter@gmail.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2015-09-25 12:04:43 +02:00
Peter Crosthwaite
b597c3f7da arm: Remove ELF_MACHINE from cpu.h
The only generic code relying on this is linux-user. Linux user
already has a lot of #ifdef TARGET_ customisation so instead, define
ELF_ARCH as either EM_ARM or EM_AARCH64 appropriately.

The armv7m bootloader can just pass EM_ARM directly, as that
is architecture specific code. Note that arm_boot already has its own
logic selecting an arm specific elf machine so this makes V7M more
consistent with arm_boot.

This removes another architecture specific definition from the global
namespace.

Cc: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <rth@twiddle.net>
Acked-By: Riku Voipio <riku.voipio@linaro.org>
Signed-off-by: Peter Crosthwaite <crosthwaite.peter@gmail.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2015-09-25 12:04:43 +02:00
Peter Crosthwaite
98dbe5aca8 elf: Update EM_MOXIE definition
EM_MOXIE now has a proper assigned elf code. Use it. Register the old
interim value as EM_MOXIE_OLD and accept either in elf loading.

Cc: Anthony Green <green@moxielogic.com>
Reviewed-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Peter Crosthwaite <crosthwaite.peter@gmail.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2015-09-25 12:04:43 +02:00
Pavel Butsykin
6bde8fd69f hmp: implemented io apic dump state for TCG
Added support emulator for the hmp command "info ioapic"

Signed-off-by: Pavel Butsykin <pbutsykin@virtuozzo.com>
Signed-off-by: Denis V. Lunev <den@openvz.org>
CC: Paolo Bonzini <pbonzini@redhat.com>
CC: Andreas Färber <afaerber@suse.de>
Message-Id: <1442927901-1084-10-git-send-email-den@openvz.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2015-09-25 12:04:42 +02:00
Pavel Butsykin
d665d696c5 hmp: added io apic dump state
Added the hmp command to query io apic state, may be usefull after guest
crashes to understand IRQ routing in guest.

Implementation is only for kvm here. The dump will look like
(qemu) info ioapic
ioapic id=0x00 sel=0x26 (redir[11])
pin 0  0x0000000000010000 dest=0 vec=0   active-hi edge  masked fixed  physical
pin 1  0x0000000000000031 dest=0 vec=49  active-hi edge         fixed  physical
...
pin 23 0x0000000000010000 dest=0 vec=0   active-hi edge  masked fixed  physical
IRR        (none)
Remote IRR (none)

Signed-off-by: Pavel Butsykin <pbutsykin@virtuozzo.com>
Signed-off-by: Denis V. Lunev <den@openvz.org>
CC: Paolo Bonzini <pbonzini@redhat.com>
CC: Andreas Färber <afaerber@suse.de>
Message-Id: <1442927901-1084-9-git-send-email-den@openvz.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2015-09-25 12:04:42 +02:00
Pavel Butsykin
a22bf99c58 apic_internal.h: rename ESR_ILLEGAL_ADDRESS to APIC_ESR_ILLEGAL_ADDRESS
Added prefix APIC_ for determining the constant of a particular subsystem,
improve the overall readability and match other constant names.

Signed-off-by: Pavel Butsykin <pbutsykin@virtuozzo.com>
Signed-off-by: Denis V. Lunev <den@openvz.org>
CC: Paolo Bonzini <pbonzini@redhat.com>
CC: Andreas Färber <afaerber@suse.de>
Message-Id: <1442927901-1084-3-git-send-email-den@openvz.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2015-09-25 12:04:42 +02:00
Pavel Butsykin
82a5e042fa apic_internal.h: make some apic_get_* functions externally visible
Move apic_get_bit(), apic_set_bit() to apic_internal.h, make the apic_get_ppr
symbol external. It's necessary to work with isr, tmr, irr and ppr outside
hw/intc/apic.c

Signed-off-by: Pavel Butsykin <pbutsykin@virtuozzo.com>
Signed-off-by: Denis V. Lunev <den@openvz.org>
CC: Paolo Bonzini <pbonzini@redhat.com>
CC: Andreas Färber <afaerber@suse.de>
Message-Id: <1442927901-1084-2-git-send-email-den@openvz.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2015-09-25 12:04:42 +02:00
Paolo Bonzini
2f5a3b1252 ioapic: fix contents of arbitration register
The arbitration register should read to the same value as the
IOAPIC id register.  Fixes kvm-unit-tests ioapic.flat.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2015-09-25 12:04:42 +02:00
Paolo Bonzini
c5955a561c ioapic: coalesce level interrupts
If a level-triggered interrupt goes down and back up before the
corresponding EOI, it should be coalesced.  This fixes one testcase
in kvm-unit-tests' ioapic.flat.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2015-09-25 12:04:42 +02:00
Marc-André Lureau
500887768a vhost-scsi: include linux/vhost.h
Replace ad-hoc declarations with the linux header.

Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-Id: <1442585920-28373-1-git-send-email-marcandre.lureau@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2015-09-25 12:04:41 +02:00
Paolo Bonzini
0eb2baeb44 scsi-generic: let guests recognize readonly=on on passthrough devices
Passed-through SCSI devices can be opened with the readonly=on option.
When this happens, Linux filters away write commands so that the guest
cannot overwrite the contents of the device.

However, the guest does not know that the device is read-only, and
accepts writes.  The writes only fail later when the page cache is
flushed.

This patch modifies scsi-generic to modify the MODE SENSE data and
set the read-only bit in the device-specific parameters, so that
the guest OS treats the disk as write protected.

Reviewed-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2015-09-25 12:04:41 +02:00
Peter Maydell
9438fe9e56 Remove libcacard
-----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQIcBAABAgAGBQJWAxuHAAoJENro4Ql1lpzlXuAQALbdywlUC/C8Qmx12s7H3DtT
 bsasSDq2UkFERcAH0OWXJ/oAgIw95a0yq5ulP9mpT+jiT8rqT2sTuHArHzLICV+T
 37LAcMxZCdOQwRW2q2RnceSWiYOVMIuCXrWw9FjmM44CKZT4AX+5re4EiOcXYotf
 TQaGMhNr9CWOPp/aUUhPQt8ccaapWWyr2j4pNx2Apo6qvLuHMsIczlEAVRCEwL6J
 XdNGKCvhvmtK2e1RUfD7djeGRAYlcnotoq1FvrGnXxrZIV+fTjpL4I3+GNNisn5e
 29e3BrtHcWmksRP9rr12S++4yn9XegqojbTbKrUKrzNx+2rvSBc7lQenYrY36KUa
 426Uy3cQAagkrch7m9bcT5UqH9axKfeP+FgTnP+0xAQHg4tM+dYzlJOEwT8le+p+
 2iPswwTOJBO4R2DR6dt39cKAB7jz9ZpBD++biaNhd91dt4o9MY3ebiBiFTq6dmxi
 xUryff6MM/FnT0b69FgWsOblTdhLsdIEKU+JJFt5WMWl7vfOfRlP3G6rdFZypWPT
 E8U+7vLdAxeFONauYb1BHkwrZbFxv2tXB9VGaaf5FRSh83zCO1K0Jih22udYcLF0
 QYQUFNSjTZteLC4TbAJvKUs52Isc6IMAK7dqIty5yoFmQT7hGWCWqK2p8CjA3JYq
 EwWW0WgKlLTxR0O1DXZC
 =b/4L
 -----END PGP SIGNATURE-----

Merge remote-tracking branch 'remotes/elmarco/tags/rm-libcacard' into staging

Remove libcacard

# gpg: Signature made Wed 23 Sep 2015 22:37:11 BST using RSA key ID 75969CE5
# gpg: Good signature from "Marc-André Lureau <marcandre.lureau@redhat.com>"
# gpg:                 aka "Marc-André Lureau <marcandre.lureau@gmail.com>"
# gpg: WARNING: This key is not certified with sufficiently trusted signatures!
# gpg:          It is not certain that the signature belongs to the owner.
# Primary key fingerprint: 87A9 BD93 3F87 C606 D276  F62D DAE8 E109 7596 9CE5

* remotes/elmarco/tags/rm-libcacard:
  libcacard: use the standalone project

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2015-09-24 17:04:31 +01:00
Changchun Ouyang
7263a0ad78 vhost-user: add a new message to disable/enable a specific virt queue.
Add a new message, VHOST_USER_SET_VRING_ENABLE, to enable or disable
a specific virt queue, which is similar to attach/detach queue for
tap device.

virtio driver on guest doesn't have to use max virt queue pair, it
could enable any number of virt queue ranging from 1 to max virt
queue pair.

Signed-off-by: Changchun Ouyang <changchun.ouyang@intel.com>
Signed-off-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
Reviewed-by: Jason Wang <jasowang@redhat.com>
Tested-by: Marcel Apfelbaum <marcel@redhat.com>
2015-09-24 16:27:53 +03:00
Changchun Ouyang
b931bfbf04 vhost-user: add multiple queue support
This patch is initially based a patch from Nikolay Nikolaev.

This patch adds vhost-user multiple queue support, by creating a nc
and vhost_net pair for each queue.

Qemu exits if find that the backend can't support the number of requested
queues (by providing queues=# option). The max number is queried by a
new message, VHOST_USER_GET_QUEUE_NUM, and is sent only when protocol
feature VHOST_USER_PROTOCOL_F_MQ is present first.

The max queue check is done at vhost-user initiation stage. We initiate
one queue first, which, in the meantime, also gets the max_queues the
backend supports.

In older version, it was reported that some messages are sent more times
than necessary. Here we came an agreement with Michael that we could
categorize vhost user messages to 2 types: non-vring specific messages,
which should be sent only once, and vring specific messages, which should
be sent per queue.

Here I introduced a helper function vhost_user_one_time_request(), which
lists following messages as non-vring specific messages:

        VHOST_USER_SET_OWNER
        VHOST_USER_RESET_DEVICE
        VHOST_USER_SET_MEM_TABLE
        VHOST_USER_GET_QUEUE_NUM

For above messages, we simply ignore them when they are not sent the first
time.

Signed-off-by: Nikolay Nikolaev <n.nikolaev@virtualopensystems.com>
Signed-off-by: Changchun Ouyang <changchun.ouyang@intel.com>
Signed-off-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
Reviewed-by: Jason Wang <jasowang@redhat.com>
Tested-by: Marcel Apfelbaum <marcel@redhat.com>
2015-09-24 16:27:53 +03:00
Yuanhan Liu
fc57fd9900 vhost: introduce vhost_backend_get_vq_index method
Minusing the idx with the base(dev->vq_index) for vhost-kernel, and
then adding it back for vhost-user doesn't seem right. Here introduces
a new method vhost_backend_get_vq_index() for getting the right vq
index for following vhost messages calls.

Suggested-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
Reviewed-by: Jason Wang <jasowang@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
Tested-by: Marcel Apfelbaum <marcel@redhat.com>
2015-09-24 16:27:53 +03:00
Yuanhan Liu
e2051e9e00 vhost-user: add VHOST_USER_GET_QUEUE_NUM message
This is for querying how many queues the backend supports if it has mq
support(when VHOST_USER_PROTOCOL_F_MQ flag is set from the quried
protocol features).

vhost_net_get_max_queues() is the interface to export that value, and
to tell if the backend supports # of queues user requested, which is
done in the following patch.

Signed-off-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
Tested-by: Marcel Apfelbaum <marcel@redhat.com>
2015-09-24 16:27:52 +03:00
Yuanhan Liu
d1f8b30ec8 vhost: rename VHOST_RESET_OWNER to VHOST_RESET_DEVICE
Quote from Michael:

    We really should rename VHOST_RESET_OWNER to VHOST_RESET_DEVICE.

Suggested-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
Reviewed-by: Marcel Apfelbaum <marcel@redhat.com>
Tested-by: Marcel Apfelbaum <marcel@redhat.com>
2015-09-24 16:27:52 +03:00
Michael S. Tsirkin
dcb10c000c vhost-user: add protocol feature negotiation
Support a separate bitmask for vhost-user protocol features,
and messages to get/set protocol features.

Invoke them at init.

No features are defined yet.

[ leverage vhost_user_call for request handling -- Yuanhan Liu ]

Signed-off-by: Michael S. Tsirkin <address@hidden>
Signed-off-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
Reviewed-by: Marcel Apfelbaum <marcel@redhat.com>
Tested-by: Marcel Apfelbaum <marcel@redhat.com>
2015-09-24 16:27:52 +03:00
Yuanhan Liu
7305483a3d vhost-user: use VHOST_USER_XXX macro for switch statement
So that we could let vhost_user_call to handle extented requests,
such as VHOST_USER_GET/SET_PROTOCOL_FEATURES, instead of invoking
vhost_user_read/write and constructing the msg again by ourself.

Signed-off-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
Reviewed-by: Marcel Apfelbaum <marcel@redhat.com>
Tested-by: Marcel Apfelbaum <marcel@redhat.com>
2015-09-24 16:27:52 +03:00
Cornelia Huck
542571d523 virtio-ccw: enable virtio-1
Let's enable revision 1 for virtio-ccw devices. We can always offer
VERSION_1 as drivers in legacy mode won't be able to see it anyway.

We have to introduce a way to set a lower maximum revision for a device
to accommodate the following cases:
- compat machines (to enforce legacy only)
- virtio-blk with scsi support (version 1 + scsi is fenced by common
  code, with a user-configured max revision of 0 we can allow scsi
  via not offering VERSION_1)

Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2015-09-24 13:42:17 +03:00
Cornelia Huck
b4f8f9df15 virtio-ccw: feature bits > 31 handling
We currently switch off the VERSION_1 feature bit if the guest has
not negotiated at least revision 1. As no feature bits beyond 31 are
valid however unless VERSION_1 has been negotiated, make sure that
legacy guests never see a feature bit beyond 31.

Reviewed-by: David Hildenbrand <dahi@linux.vnet.ibm.com>
Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2015-09-24 13:42:17 +03:00
Cornelia Huck
79cd0c80f8 virtio-ccw: support ring size changes
Wire up changing the ring size for virtio-1 devices.

Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2015-09-24 13:42:17 +03:00
Cornelia Huck
46c5d0823d virtio: ring sizes vs. reset
We allow guests to change the size of the virtqueue rings by supplying
a number of buffers that is different from the number of buffers the
device was initialized with. Current code has some problems, however,
since reset does not reset the ringsizes to the default values (as this
is not saved anywhere).

Let's extend the core code to keep track of the default ringsizes and
migrate them once the guest changed them for any of the virtqueues
for a device.

Reviewed-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2015-09-24 13:42:17 +03:00
Eduardo Habkost
87e896abe6 pc: Introduce pc-*-2.5 machine classes
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2015-09-24 13:42:00 +03:00
Eduardo Habkost
254bdb1cbf q35: Move options common to all classes to pc_i440fx_machine_options()
The existing default_machine_opts and default_display settings will
still apply to future machine classes. So it makes sense to move them to
pc_i440fx_machine_options() instead of keeping them in a
version-specific machine_options function.

Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2015-09-24 13:39:47 +03:00
Eduardo Habkost
0b7783a79e q35: Move options common to all classes to pc_q35_machine_options()
The existing default_machine_opts, default_display, no_floppy, and
no_tco settings will still apply to future machine classes. So it makes
sense to move them to pc_q35_machine_options() instead of keeping them
in a version-specific machine_options function.

Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2015-09-24 13:39:47 +03:00
Jason Wang
1f8828ef57 virtio-net: unbreak self announcement and guest offloads after migration
After commit 019a3edbb2 ("virtio: make
features 64bit wide"). Device's guest_features was actually set after
vdc->load(). This breaks the assumption that device specific load()
function can check guest_features. For virtio-net, self announcement
and guest offloads won't work after migration.

Fixing this by defer them to virtio_net_load() where guest_features
were guaranteed to be set. Other virtio devices looks fine.

Fixes: 019a3edbb2
       ("virtio: make features 64bit wide")
Cc: qemu-stable@nongnu.org
Cc: Gerd Hoffmann <kraxel@redhat.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com>
2015-09-24 13:39:46 +03:00
Pierre Morel
50764fc8a3 virtio: right size for virtio_queue_get_avail_size
Being working on dataplane I notice something strange:

virtio_queue_get_avail_size() used a 64bit size index
for the calculation of the available ring size.

It is quite strange but it did work with the old calculation
of the avail ring, at most with performance penalty,
and I wonder where I missed something.

This patch let use a 16bit size as defined in virtio_ring.h

Signed-off-by: Pierre Morel <pmorel@linux.vnet.ibm.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2015-09-24 13:39:46 +03:00
Alex Williamson
9d146b2e2f vfio/pci: Remove use of g_malloc0_n() from quirks
For compatibility with glib 2.22.

Reported-by: Wen Congyang <wency@cn.fujitsu.com>
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2015-09-23 21:27:17 -06:00
Shannon Zhao
cd37aaf876 hw/arm/virt-acpi-build: Fix wrong size of flash in ACPI table
While virt machine creates two flash devices with total size 0x08000000,
the ACPI table generation code was wrongly using this total size as the
size of each flash device, so it would overlap other MMIO spaces.
Make each device entry in the table half the total; this brings the
ACPI table into line with the code which generates the device tree
and which creates the flash devices themselves.

Signed-off-by: Shannon Zhao <shannon.zhao@linaro.org>
Reviewed-by: Andrew Jones <drjones@redhat.com>
Reviewed-by: Wei Huang <wei@redhat.com>
Tested-by: Graeme Gregory <graeme.gregory@linaro.org>
Message-id: 1442455041-6596-1-git-send-email-shannon.zhao@linaro.org
[PMM: edited commit message]
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2015-09-24 01:29:37 +01:00
Pavel Fedin
b92ad3949b hw/arm/virt: Add gic-version option to virt machine
Add gic_version to VirtMachineState, set it to value of the option
and pass it around where necessary. Instantiate devices and fdt
nodes according to the choice.

max_cpus for virt machine increased to 123 (calculated from redistributor
space available in the memory map). GICv2 compatibility check happens
inside arm_gic_common_realize().

ITS region is added to the memory map too, however currently it not used,
just reserved.

Signed-off-by: Pavel Fedin <p.fedin@samsung.com>
Tested-by: Ashok kumar <ashoks@broadcom.com>
[PMM: Added missing cpu_to_le* calls, thanks to Shannon Zhao]
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2015-09-24 01:29:37 +01:00
Pavel Fedin
a7bf30342e hw/intc: Initial implementation of vGICv3
This is the initial version of KVM-accelerated GICv3 support.
State load and save are not yet supported, live migration is
not possible.

In order to get correct class name in a simpler way, gicv3_class_name()
function is implemented, similar to gic_class_name().

Signed-off-by: Pavel Fedin <p.fedin@samsung.com>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Tested-by: Ashok kumar <ashoks@broadcom.com>
Message-id: 69d8f01d14994d7a1a140e96aef59fd332d02293.1441784344.git.p.fedin@samsung.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2015-09-24 01:29:37 +01:00
Pavel Fedin
4b3cfe72d9 intc/gic: Extract some reusable vGIC code
Some functions previously used only by vGICv2 are useful also for vGICv3
implementation. Untie them from GICState and make accessible from within
other modules:
- kvm_arm_gic_set_irq()
- kvm_gic_supports_attr() - moved to common code and renamed to
  kvm_device_check_attr()
- kvm_gic_access() - turned into GIC-independent kvm_device_access().
  Data pointer changed to void * because some GICv3 registers are
  64-bit wide

Some of these changes are not used right now, but they will be helpful for
implementing live migration.

Actually kvm_dist_get() and kvm_dist_put() could also be made reusable, but
they would require two extra parameters (s->dev_fd and s->num_cpu) as well as
lots of typecasts of 's' to DeviceState * and back to GICState *. This makes
the code very ugly so i decided to stop at this point. I tried also an
approach with making a base class for all possible GICs, but it would contain
only three variables (dev_fd, cpu_num and irq_num), and accessing them through
the rest of the code would be again tedious (either ugly casts or qemu-style
separate object pointer). So i disliked it too.

Signed-off-by: Pavel Fedin <p.fedin@samsung.com>
Tested-by: Ashok kumar <ashoks@broadcom.com>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 2ef56d1dd64ffb75ed02a10dcdaf605e5b8ff4f8.1441784344.git.p.fedin@samsung.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2015-09-24 01:29:36 +01:00
Shlomo Pongratz
ff8f06ee76 hw/intc: Implement GIC-500 base class
This class is to be used by both software and KVM implementations of GICv3

Currently it is mostly a placeholder, but in future it is supposed to hold
qemu's representation of GICv3 state, which is necessary for migration.

The interface of this class is fully compatible with GICv2 one. This is
done in order to simplify integration with existing code.

Signed-off-by: Shlomo Pongratz <shlomo.pongratz@huawei.com>
Signed-off-by: Pavel Fedin <p.fedin@samsung.com>
Reviewed-by: Eric Auger <eric.auger@linaro.org>
Tested-by: Ashok kumar <ashoks@broadcom.com>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: aff8baaee493cdcab0694b4a1d4dd5ff27c37ed2.1441784344.git.p.fedin@samsung.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2015-09-24 01:29:36 +01:00
Marc-André Lureau
7b02f5447c libcacard: use the standalone project
libcacard is now a standalone project hosted with the Spice project (see
the 2.5.0 release announcement), remove it from qemu tree.

Use the library if found during configure or if --enable-smartcard.

Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Michael Tokarev <mjt@tls.msk.ru>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Tested-by: Paolo Bonzini <pbonzini@redhat.com>
2015-09-23 23:34:17 +02:00
Alex Williamson
89dcccc593 vfio/pci: Add emulated PCI IDs
Specifying an emulated PCI vendor/device ID can be useful for testing
various quirk paths, even though the behavior and functionality of
the device with bogus IDs is fully unsupportable.  We need to use a
uint32_t for the vendor/device IDs, even though the registers
themselves are only 16-bit in order to be able to determine whether
the value is valid and user set.

The same support is added for subsystem vendor/device ID, though these
have the possibility of being useful and supported for more than a
testing tool.  An emulated platform might want to impose their own
subsystem IDs or at least hide the physical subsystem ID.  Windows
guests will often reinstall drivers due to a change in subsystem IDs,
something that VM users may want to avoid.  Of course careful
attention would be required to ensure that guest drivers do not rely
on the subsystem ID as a basis for device driver quirks.

All of these options are added using the standard experimental option
prefix and should not be considered stable.

Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2015-09-23 13:04:49 -06:00
Alex Williamson
ff635e3775 vfio/pci: Cache vendor and device ID
Simplify access to commonly referenced PCI vendor and device ID by
caching it on the VFIOPCIDevice struct.

Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2015-09-23 13:04:49 -06:00
Alex Williamson
c9c5000991 vfio/pci: Move AMD device specific reset to quirks
This is just another quirk, for reset rather than affecting memory
regions.  Move it to our new quirks file.

Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2015-09-23 13:04:49 -06:00
Alex Williamson
958d553405 vfio/pci: Remove old config window and mirror quirks
These are now unused.

Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2015-09-23 13:04:48 -06:00
Alex Williamson
0d38fb1c5f vfio/pci: Config mirror quirk
Re-implement our mirror quirk using the new infrastructure.

Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2015-09-23 13:04:48 -06:00
Alex Williamson
0e54f24a5b vfio/pci: Config window quirks
Config windows make use of an address register and a data register.
In VGA cards, these are often used to provide real mode code in the
BIOS an easy way to access MMIO registers since the window often
resides in an I/O port register.  When the MMIO register has a mirror
of PCI config space, we need to trap those accesses and redirect them
to emulated config space.

The previous version of this functionality made use of a single
MemoryRegion and single match address.  This version uses separate
MemoryRegions for each of the address and data registers and allows
for multiple match addresses.  This is useful for Nvidia cards which
have two ranges which index into PCI config space.

The previous implementation is left for the follow-on patch for a more
reviewable diff.

Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2015-09-23 13:04:48 -06:00
Alex Williamson
954258a5f1 vfio/pci: Rework RTL8168 quirk
Another rework of this quirk, this time to update to the new quirk
structure.  We can handle the address and data registers with
separate MemoryRegions and a quirk specific data structure, making the
code much more understandable.

Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2015-09-23 13:04:47 -06:00
Alex Williamson
6029a424be vfio/pci: Cleanup Nvidia 0x3d0 quirk
The Nvidia 0x3d0 quirk makes use of a two separate registers and gives
us our first chance to make use of separate memory regions for each to
simplify the code a bit.

Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2015-09-23 13:04:47 -06:00
Alex Williamson
b946d28611 vfio/pci: Cleanup ATI 0x3c3 quirk
This is an easy quirk that really doesn't need a data structure if
its own.  We can pass vdev as the opaque data and access to the
MemoryRegion isn't required.

Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2015-09-23 13:04:47 -06:00
Alex Williamson
8c4f234853 vfio/pci: Foundation for new quirk structure
VFIOQuirk hosts a single memory region and a fixed set of data fields
that try to handle all the quirk cases, but end up making those that
don't exactly match really confusing.  This patch introduces a struct
intended to provide more flexibility and simpler code.  VFIOQuirk is
stripped to its basics, an opaque data pointer for quirk specific
data and a pointer to an array of MemoryRegions with a counter.  This
still allows us to have common teardown routines, but adds much
greater flexibility to support multiple memory regions and quirk
specific data structures that are easier to maintain.  The existing
VFIOQuirk is transformed into VFIOLegacyQuirk, which further patches
will eliminate entirely.

Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2015-09-23 13:04:46 -06:00
Alex Williamson
056dfcb695 vfio/pci: Cleanup ROM blacklist quirk
Create a vendor:device ID helper that we'll also use as we rework the
rest of the quirks.  Re-reading the config entries, even if we get
more blacklist entries, is trivial overhead and only incurred during
device setup.  There's no need to typedef the blacklist structure,
it's a static private data type used once.  The elements get bumped
up to uint32_t to avoid future maintenance issues if PCI_ANY_ID gets
used for a blacklist entry (avoiding an actual hardware match).  Our
test loop is also crying out to be simplified as a for loop.

Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2015-09-23 13:04:45 -06:00
Alex Williamson
c00d61d8fa vfio/pci: Split quirks to a separate file
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2015-09-23 13:04:45 -06:00
Alex Williamson
78f33d2bfd vfio/pci: Extract PCI structures to a separate header
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2015-09-23 13:04:44 -06:00
Alex Williamson
5e15d79b86 vfio: Change polarity of our no-mmap option
The default should be to allow mmap and new drivers shouldn't need to
expose an option or set it to other than the allocation default in
their initfn.  Take advantage of the experimental flag to change this
option to the correct polarity.

Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2015-09-23 13:04:44 -06:00
Alex Williamson
46746dbaa8 vfio/pci: Make interrupt bypass runtime configurable
Tracing is more effective when we can completely disable all KVM
bypass paths.  Make these runtime rather than build-time configurable.

Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2015-09-23 13:04:44 -06:00
Alex Williamson
0de70dc7ba vfio/pci: Rename MSI/X functions for easier tracing
This allows vfio_msi* tracing.  The MSI/X interrupt tracing is also
pulled out of #ifdef DEBUG_VFIO to avoid a recompile for tracing this
path.  A few cycles to read the message is hardly anything if we're
already in QEMU.

Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2015-09-23 13:04:43 -06:00
Alex Williamson
870cb6f104 vfio/pci: Rename INTx functions for easier tracing
Rename functions and tracing callbacks so that we can trace vfio_intx*
to see all the INTx related activities.

Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2015-09-23 13:04:43 -06:00
Alex Williamson
b5bd049fa9 vfio/pci: Cleanup vfio_early_setup_msix() error path
With the addition of the Chelsio quirk we have an error path out of
vfio_early_setup_msix() that doesn't free the allocated VFIOMSIXInfo
struct.  This doesn't introduce a leak as it still gets freed in the
vfio_put_device() path, but it's complicated and sloppy to rely on
that.  Restructure to free the allocated data on error and only link
it into the vdev on success.

Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Reported-by: Laszlo Ersek <lersek@redhat.com>
Reviewed-by: Laszlo Ersek <lersek@redhat.com>
2015-09-23 13:04:43 -06:00
Alex Williamson
d451008e0f vfio/pci: Cleanup RTL8168 quirk and tracing
There's quite a bit of cleanup that can be done to the RTL8168 quirk,
as well as the tracing to prevent a spew of uninteresting accesses
for anything else the driver might choose to use the window registers
for besides the MSI-X table.  There should be no functional change,
but it's now possible to get compact and useful traces by enabling
vfio_rtl8168_quirk*, ex:

vfio_rtl8168_quirk_write 0000:04:00.0 [address]: 0x1f000
vfio_rtl8168_quirk_read 0000:04:00.0 [address]: 0x8001f000
vfio_rtl8168_quirk_read 0000:04:00.0 [data]: 0xfee0100c
vfio_rtl8168_quirk_write 0000:04:00.0 [address]: 0x1f004
vfio_rtl8168_quirk_read 0000:04:00.0 [address]: 0x8001f004
vfio_rtl8168_quirk_read 0000:04:00.0 [data]: 0x0
vfio_rtl8168_quirk_write 0000:04:00.0 [address]: 0x1f008
vfio_rtl8168_quirk_read 0000:04:00.0 [address]: 0x8001f008
vfio_rtl8168_quirk_read 0000:04:00.0 [data]: 0x49b1
vfio_rtl8168_quirk_write 0000:04:00.0 [address]: 0x1f00c
vfio_rtl8168_quirk_read 0000:04:00.0 [address]: 0x8001f00c
vfio_rtl8168_quirk_read 0000:04:00.0 [data]: 0x0

Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2015-09-23 13:04:42 -06:00
Gavin Shan
d76548a98f sPAPR: Enable EEH on VFIO PCI device only
This checks if the PCI device retrieved from the PCI device address
is VFIO PCI device when enabling EEH functionality. If it's not
VFIO PCI device, the EEH functonality isn't enabled.

Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2015-09-23 10:51:11 +10:00
Gavin Shan
47445c80fb sPAPR: Revert don't enable EEH on emulated PCI devices
This reverts commit 7cb18007 ("sPAPR: Don't enable EEH on emulated
PCI devices") as rtas_ibm_set_eeh_option() isn't the right place
to check if there has the corresponding PCI device for the input
address, which can be PE address, not PCI device address.

Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2015-09-23 10:51:11 +10:00
Thomas Huth
4d9392be6c ppc/spapr: Implement H_RANDOM hypercall in QEMU
The PAPR interface defines a hypercall to pass high-quality
hardware generated random numbers to guests. Recent kernels can
already provide this hypercall to the guest if the right hardware
random number generator is available. But in case the user wants
to use another source like EGD, or QEMU is running with an older
kernel, we should also have this call in QEMU, so that guests that
do not support virtio-rng yet can get good random numbers, too.

This patch now adds a new pseudo-device to QEMU that either
directly provides this hypercall to the guest or is able to
enable the in-kernel hypercall if available. The in-kernel
hypercall can be enabled with the use-kvm property, e.g.:

 qemu-system-ppc64 -device spapr-rng,use-kvm=true

For handling the hypercall in QEMU instead, a "RngBackend" is
required since the hypercall should provide "good" random data
instead of pseudo-random (like from a "simple" library function
like rand() or g_random_int()). Since there are multiple RngBackends
available, the user must select an appropriate back-end via the
"rng" property of the device, e.g.:

 qemu-system-ppc64 -object rng-random,filename=/dev/hwrng,id=gid0 \
                   -device spapr-rng,rng=gid0 ...

See http://wiki.qemu-project.org/Features-Done/VirtIORNG for
other example of specifying RngBackends.

Signed-off-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2015-09-23 10:51:11 +10:00
Thomas Huth
ef001f069e ppc/spapr: Fix buffer overflow in spapr_populate_drconf_memory()
The buffer that is allocated in spapr_populate_drconf_memory()
is used for setting both, the "ibm,dynamic-memory" and the
"ibm,associativity-lookup-arrays" property. However, only the
size of the first one is taken into account when allocating the
memory. So if the length of the second property is larger than
the length of the first one, we run into a buffer overflow here!
Fix it by taking the length of the second property into account,
too.

Fixes: "spapr: Support ibm,dynamic-reconfiguration-memory" patch
Signed-off-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2015-09-23 10:51:11 +10:00
David Gibson
20bb648dca spapr: Fix default NUMA node allocation for threads
At present, if guest numa nodes are requested, but the cpus in each node
are not specified, spapr just uses the default behaviour or assigning each
vcpu round-robin to nodes.

If smp_threads != 1, that will assign adjacent threads in a core to
different NUMA nodes.  As well as being just weird, that's a configuration
that can't be represented in the device tree we give to the guest, which
means the guest and qemu end up with different ideas of the NUMA topology.

This patch implements mc->cpu_index_to_socket_id in the spapr code to
make sure vcpus get assigned to nodes only at the socket granularity.

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Alexey Kardashevskiy <aik@ozlabs.ru>
2015-09-23 10:51:11 +10:00
Bharata B Rao
0a4178692c spapr: Move memory hotplug to RTAS_LOG_V6_HP_ID_DRC_COUNT type
Till now memory hotplug used RTAS_LOG_V6_HP_ID_DRC_INDEX hotplug type
which meant that we generated one hotplug type of EPOW event for every
256MB (SPAPR_MEMORY_BLOCK_SIZE). This quickly overruns the kernel
rtas log buffer thus resulting in loss of memory hotplug events. Switch
to RTAS_LOG_V6_HP_ID_DRC_COUNT hotplug type for memory so that we
generate only one event per hotplug request.

Signed-off-by: Bharata B Rao <bharata@linux.vnet.ibm.com>
Reviewed-by: Michael Roth <mdroth@linux.vnet.ibm.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2015-09-23 10:51:11 +10:00
Bharata B Rao
7a36ae7a9f spapr: Support hotplug by specifying DRC count
Support hotplug identifier type RTAS_LOG_V6_HP_ID_DRC_COUNT that allows
hotplugging of DRCs by specifying the DRC count.

While we are here, rename

spapr_hotplug_req_add_event() to spapr_hotplug_req_add_by_index()
spapr_hotplug_req_remove_event() to spapr_hotplug_req_remove_by_index()

so that they match with spapr_hotplug_req_add_by_count().

Signed-off-by: Bharata B Rao <bharata@linux.vnet.ibm.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2015-09-23 10:51:11 +10:00
Bharata B Rao
e8f986fc57 spapr: Revert to memory@XXXX representation for non-hotplugged memory
Don't represent non-hotluggable memory under drconf node. With this
we don't have to create DRC objects for them.

The effect of this patch is that we revert back to memory@XXXX representation
for all the memory specified with -m option and represent the cold
plugged memory and hot-pluggable memory under
ibm,dynamic-reconfiguration-memory.

Signed-off-by: Bharata B Rao <bharata@linux.vnet.ibm.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2015-09-23 10:51:11 +10:00
Bharata B Rao
6663864e95 spapr: Populate ibm,associativity-lookup-arrays correctly for non-NUMA
When NUMA isn't configured explicitly, assume node 0 is present for
the purpose of creating ibm,associativity-lookup-arrays property
under ibm,dynamic-reconfiguration-memory DT node. This ensures that
the associativity index property is correctly updated in ibm,dynamic-memory
for the LMB that is hotplugged.

Signed-off-by: Bharata B Rao <bharata@linux.vnet.ibm.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2015-09-23 10:51:11 +10:00
Bharata B Rao
19a35c9e1b spapr: Provide better error message when slots exceed max allowed
Currently when user specifies more slots than allowed max of
SPAPR_MAX_RAM_SLOTS (32), we error out like this:

qemu-system-ppc64: unsupported amount of memory slots: 64

Let the user know about the max allowed slots like this:

qemu-system-ppc64: Specified number of memory slots 64 exceeds max supported 32

Signed-off-by: Bharata B Rao <bharata@linux.vnet.ibm.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2015-09-23 10:51:11 +10:00
Bharata B Rao
b556854bd8 spapr: Don't allow memory hotplug to memory less nodes
Currently PowerPC kernel doesn't allow hot-adding memory to memory-less
node, but instead will silently add the memory to the first node that has
some memory. This causes two unexpected behaviours for the user.

- Memory gets hotplugged to a different node than what the user specified.
- Since pc-dimm subsystem in QEMU still thinks that memory belongs to
  memory-less node, a reboot will set things accordingly and the previously
  hotplugged memory now ends in the right node. This appears as if some
  memory moved from one node to another.

So until kernel starts supporting memory hotplug to memory-less
nodes, just prevent such attempts upfront in QEMU.

Signed-off-by: Bharata B Rao <bharata@linux.vnet.ibm.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2015-09-23 10:51:11 +10:00
Bharata B Rao
c20d332a85 spapr: Memory hotplug support
Make use of pc-dimm infrastructure to support memory hotplug
for PowerPC.

Signed-off-by: Bharata B Rao <bharata@linux.vnet.ibm.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2015-09-23 10:51:10 +10:00
Bharata B Rao
ce881f774d spapr: Make hash table size a factor of maxram_size
The hash table size is dependent on ram_size, but since with hotplug
the memory can grow till maxram_size. Hence make hash table size dependent
on maxram_size.

This allows to hotplug huge amounts of memory to the guest.

Signed-off-by: Bharata B Rao <bharata@linux.vnet.ibm.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2015-09-23 10:51:10 +10:00
Bharata B Rao
03d196b7c5 spapr: Support ibm,dynamic-reconfiguration-memory
Parse ibm,architecture.vec table obtained from the guest and enable
memory node configuration via ibm,dynamic-reconfiguration-memory if guest
supports it. This is in preparation to support memory hotplug for
sPAPR guests.

This changes the way memory node configuration is done. Currently all
memory nodes are built upfront. But after this patch, only memory@0 node
for RMA is built upfront. Guest kernel boots with just that and rest of
the memory nodes (via memory@XXX or ibm,dynamic-reconfiguration-memory)
are built when guest does ibm,client-architecture-support call.

Note: This patch needs a SLOF enhancement which is already part of
SLOF binary in QEMU.

Signed-off-by: Bharata B Rao <bharata@linux.vnet.ibm.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2015-09-23 10:51:10 +10:00
David Gibson
224245bf52 spapr: Add LMB DR connectors
Enable memory hotplug for pseries 2.4 and add LMB DR connectors.
With memory hotplug, enforce RAM size, NUMA node memory size and maxmem
to be a multiple of SPAPR_MEMORY_BLOCK_SIZE (256M) since that's the
granularity in which LMBs are represented and hot-added.

LMB DR connectors will be used by the memory hotplug code.

Signed-off-by: Bharata B Rao <bharata@linux.vnet.ibm.com>
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
               [spapr_drc_reset implementation]
[since this missed the 2.4 cutoff, changing to only enable for 2.5]
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2015-09-23 10:51:10 +10:00