Commit Graph

4988 Commits

Author SHA1 Message Date
zhanghailiang
35a6ed4f71 migration: Introduce capability 'x-colo' to migration
We add helper function colo_supported() to indicate whether
colo is supported or not, with which we use to control whether or not
showing 'x-colo' string to users, they can use qmp command
'query-migrate-capabilities' or hmp command 'info migrate_capabilities'
to learn if colo is supported.

The default value for COLO (COarse-Grain LOck Stepping) is disabled.

Cc: Juan Quintela <quintela@redhat.com>
Cc: Amit Shah <amit.shah@redhat.com>
Cc: Eric Blake <eblake@redhat.com>
Cc: Markus Armbruster <armbru@redhat.com>
Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com>
Signed-off-by: Li Zhijian <lizhijian@cn.fujitsu.com>
Signed-off-by: Gonglei <arei.gonglei@huawei.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Amit Shah <amit.shah@redhat.com>
Signed-off-by: Amit Shah <amit@amitshah.net>
2016-10-30 15:17:39 +05:30
Peter Maydell
66a77ea676 ppc patch queue 2016-10-28
This pull request supersedes and extends the one from 2016-10-26
 (which had a build bug).
 
 Highlights:
   * SLOF (pseries guest firmware) update
   * Enable a number of extra testcases on ppc / pseries
   * Added the 'powernv' machine type
     - Almost enough to be minimally usable
     - But still missing necessary interrupt controller updates
   * Cleanup and consolidation of NVRAM handling on several platforms
     with related firmware
   * Substantial cleanup to device tree construction
   * Some more POWER9 instruction emulation
   * Cleanup to handling of pseries option vectors and CAS reboot
     handling (host/guest feature negotiation mechanism)
   * Significant cleanups to handling of PCI devices in test cases
   * New hotplug event infrastructure
   * Memory hot unplug support for pseries
   * Several bug fixes
 
 The NVRAM cleanup affects some Sun sparc platforms as well as ppc
 ones, but have been tested by the sparc maintainer (Mark Cave-Ayland).
 
 The test additions also include substantial general changes to the
 test framework that aren't strictly ppc related.  They don't seem to
 break tests on other platforms, they're for the benefit of enabling
 tests on ppc and there isn't a specific maintainer for them, so
 they're included in this tree.
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2
 
 iQIcBAABCAAGBQJYEqvPAAoJEGw4ysog2bOSQiwP/jOr5bxwZmGDrOwAIzqhagib
 lo0+W1E6OCttbBhAp1inWZP6RnexAZ7orb+Q7DHQFbtYukUbPYwB/VzmWhaws6XV
 jxK/lVB3A+XlRIEKUUc8bWWGRN+QBMnQIcUhNlKuC4AVKMC1aZY9ZLT6LvilV6X7
 QtxAlBPmI2od2kyDHt/ibG9FkROFMi9ybbQG+D7Pu32NlTPgF06R6NPKtpkjEpUU
 dRYAUB+VTB4eofjzyVqsL+QB7uX5g0V9aPmYWBaXqjTG61ivHMJJ7zHta+GdckJM
 fk3S68ftPmf4EE+uL1Ff2fy+2Sxjh4QeNwxl1rppzrgvW8VkNOX6969idxSERUb8
 I2/RVM7F1mJ7+4fNIjenAru8qu3O981lU9+t7R5mmTcEsSk28FOkOv6Io+6JnGA2
 32qgOXwihsUDaH2pDagZ+ySaOqjWMD9WGQTfQgFMthGkcs6heG7ByvFrcpcacl5a
 kbMl7cj+zkgusLuQHx0dp669R7Ch7bxSigQC11iMCpAmFhXl8qJ37ACPJn8NlzOq
 NJdZwXOp9plYZ70a2CgKVXB6j+jhxOeJHg2v08jfF0rUILovBwH0WOrl1m0fb2gB
 1u+ua2FED+5rtwpGJ7pL/oE20H11QDfHNDvqEpagvHAHSSu5nqGxd/falYRYE59C
 wdMXPqJYQqkSYuA6XkgO
 =3Z5V
 -----END PGP SIGNATURE-----

Merge remote-tracking branch 'remotes/dgibson/tags/ppc-for-2.8-20161028' into staging

ppc patch queue 2016-10-28

This pull request supersedes and extends the one from 2016-10-26
(which had a build bug).

Highlights:
  * SLOF (pseries guest firmware) update
  * Enable a number of extra testcases on ppc / pseries
  * Added the 'powernv' machine type
    - Almost enough to be minimally usable
    - But still missing necessary interrupt controller updates
  * Cleanup and consolidation of NVRAM handling on several platforms
    with related firmware
  * Substantial cleanup to device tree construction
  * Some more POWER9 instruction emulation
  * Cleanup to handling of pseries option vectors and CAS reboot
    handling (host/guest feature negotiation mechanism)
  * Significant cleanups to handling of PCI devices in test cases
  * New hotplug event infrastructure
  * Memory hot unplug support for pseries
  * Several bug fixes

The NVRAM cleanup affects some Sun sparc platforms as well as ppc
ones, but have been tested by the sparc maintainer (Mark Cave-Ayland).

The test additions also include substantial general changes to the
test framework that aren't strictly ppc related.  They don't seem to
break tests on other platforms, they're for the benefit of enabling
tests on ppc and there isn't a specific maintainer for them, so
they're included in this tree.

# gpg: Signature made Fri 28 Oct 2016 02:37:19 BST
# gpg:                using RSA key 0x6C38CACA20D9B392
# gpg: Good signature from "David Gibson <david@gibson.dropbear.id.au>"
# gpg:                 aka "David Gibson (Red Hat) <dgibson@redhat.com>"
# gpg:                 aka "David Gibson (ozlabs.org) <dgibson@ozlabs.org>"
# gpg:                 aka "David Gibson (kernel.org) <dwg@kernel.org>"
# Primary key fingerprint: 75F4 6586 AE61 A66C C44E  87DC 6C38 CACA 20D9 B392

* remotes/dgibson/tags/ppc-for-2.8-20161028: (73 commits)
  ppc: allow certain HV interrupts to be delivered to guests
  spapr: Memory hot-unplug support
  spapr: use count+index for memory hotplug
  spapr: Add DRC count indexed hotplug identifier type
  spapr: add hotplug interrupt machine options
  spapr_events: add support for dedicated hotplug event source
  spapr: update spapr hotplug documentation
  target-ppc: Add xvcmpnesp, xvcmpnedp instructions
  target-ppc: add xscmp[eq,gt,ge,ne]dp instructions
  tests: Add pseries machine to the prom-env-test, too
  spapr_nvram: Pre-initialize the NVRAM to support the -prom-env parameter
  libqos: Change PCI accessors to take opaque BAR handle
  tests: Don't assume structure of PCI IO base in ahci-test
  tests: Use qpci_mem{read,write} in ivshmem-test
  libqos: Add 64-bit PCI IO accessors
  tests: Clean up IO handling in ide-test
  libqos: Implement mmio accessors in terms of mem{read,write}
  libqos: Add streaming accessors for PCI MMIO
  tests: Adjust tco-test to use qpci_legacy_iomap()
  libqos: Better handling of PCI legacy IO
  ...

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2016-10-28 16:31:59 +01:00
Peter Maydell
01b601f061 Merge qio 2016/10/27 v1
-----BEGIN PGP SIGNATURE-----
 
 iQIcBAABCAAGBQJYEfjrAAoJEL6G67QVEE/fdU4P/i7yBJo436OpkdgeWS8AWuFr
 ptZ+Fj/weGka5GU9E3KQu36kbSgrtfcgwTHphCMXnZ0YCeKQDuM57f7LNiN6qheB
 nqgJvJioLbUvLTQvCHOISM7bWOnYvASBmYtLJFtUcP/jhdOy61KaADnJ+7MbliNv
 yJSW2RN+s/y9nUb+dxEpIXXUVMRa6BX+wHW3O44c1oLn6/Pe20aJeHTyDx3qiBhD
 8RYXUgRZopH2bouBSzXgMQTbn/QMD/dC81WQlHKlt4swffyei2D/1pciOcuc0SXz
 +SZdkTre5JB5Kd6DU8zQ6PrrIt1nPmLSptSyhQvNxm+uWNWHnFcW1s2aYmf/ikjl
 4boW37ayJx09mns8yv7TerzEPbL5qJvVX8Dsnb6telkvrS9hy9S1xuIB5xHbt6/h
 vwFmCdwaZoGpDDaoXRL+9k9TOI9BbEMKX33nAPDqvEXLMIf+og4fmweTKcY4XTRL
 /Fdg1H71v8Ayv+r5TJOKwFg3PNNjnvqkbk1psS+aaW7dup43iaYGIKWy+VFaCufk
 hPXLOtR5lUsYC2qm+nkjPIgoP7D8oZx4AGkCHbYsqzi+l1lynZH3rBIs8ggLr72o
 FFk4g0sNYe1ccAa89jFEgWIQbS0N6ckUXCv12g3eyF/UIC1F35/mGGugSRnTXuc2
 a/WsvgU7pGBrtqXcg7lF
 =gsxL
 -----END PGP SIGNATURE-----

Merge remote-tracking branch 'remotes/berrange/tags/pull-qio-2016-10-27-1' into staging

Merge qio 2016/10/27 v1

# gpg: Signature made Thu 27 Oct 2016 13:54:03 BST
# gpg:                using RSA key 0xBE86EBB415104FDF
# gpg: Good signature from "Daniel P. Berrange <dan@berrange.com>"
# gpg:                 aka "Daniel P. Berrange <berrange@redhat.com>"
# Primary key fingerprint: DAF3 A6FD B26B 6291 2D0E  8E3F BE86 EBB4 1510 4FDF

* remotes/berrange/tags/pull-qio-2016-10-27-1:
  main: set names for main loop sources created
  vnc: set name for all I/O channels created
  migration: set name for all I/O channels created
  char: set name for all I/O channels created
  nbd: set name for all I/O channels created
  io: add ability to set a name for IO channels
  io: Add a QIOChannelSocket cleanup test
  io: set LISTEN flag explicitly for listen sockets
  io: Introduce a qio_channel_set_feature() helper
  io: Use qio_channel_has_feature() where applicable
  io: Fix double shift usages on QIOChannel features

Conflicts:
	qemu-char.c

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2016-10-28 15:30:55 +01:00
Peter Maydell
fd209e4a77 -----BEGIN PGP SIGNATURE-----
iQIcBAABAgAGBQJYEm6NAAoJEH3vgQaq/DkO+30P/0n5EYgIZQugvUXJvpu0xx7V
 EDqtgrirRTtKaoApqiY36U3WbWK+1pPdMcdJ2z4bI9+VyYkKxlEisgEeGy2E7S33
 qD1kV+L1NabBV8Ee577wxAZz3xl4MBh1pzCIXsySA2VRNg/W6L8hj4rTmjap1U9p
 ZtbLkmwMpSwTkJxWPG1W+k0klk1tYxmcwsWcCSCuSOTXm/0gBpWdy5gBRuQXVi0l
 DQFlKS6BDlRiCvR4Qix6n0v8VTQfbRMGS40M6tpr3/QH/HvoKhxfTS/g8P72Bk20
 DPNsKF9DBfTY3KCtjcSrPTREaMqFw8VXn5XSw1uE30ALZNHru9PpVS3hbLfGmltB
 HAVANMbqROFvkQghtGWD7f34Oks/bxzLKxEXPAs9stwvthV46KyJsMHuiSbuzJhv
 tOUq0MadEquuVvgDqoRYKrwyYrjsRRZ4z5kDDnOr2iGZK+Mrhq7jBuYuKcvHyQi0
 apd27X4wwQTx/9tavC+ujeuVxAWBlSSP1EVGSiIenlq21cHLowuZdqrt2swAYkCs
 VlUyOzdCO/62SJGcrnrRCj3sKWbPTySnmDZQKrHve4rBzcL28IHCRxIfzbXRBkQI
 kGigceOwIyNW/bnp6rSYoBFKpz1NF2VScr/t5JzknsC8gT/tA0wPDBoIeL/kPVHm
 T/qOTHLDY/fHUNwXOkTe
 =tz25
 -----END PGP SIGNATURE-----

Merge remote-tracking branch 'remotes/jnsnow/tags/ide-pull-request' into staging

# gpg: Signature made Thu 27 Oct 2016 22:15:57 BST
# gpg:                using RSA key 0x7DEF8106AAFC390E
# gpg: Good signature from "John Snow (John Huston) <jsnow@redhat.com>"
# Primary key fingerprint: FAEB 9711 A12C F475 812F  18F2 88A9 064D 1835 61EB
#      Subkey fingerprint: F9B7 ABDB BCAC DF95 BE76  CBD0 7DEF 8106 AAFC 390E

* remotes/jnsnow/tags/ide-pull-request:
  qemu-iotests: Test creating floppy drives
  fdc: Move qdev properties to FloppyDrive
  fdc: Add a floppy drive qdev
  fdc: Add a floppy qbus
  macio: switch over to new byte-aligned DMA helpers
  dma-helpers: explicitly pass alignment into DMA helpers

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2016-10-28 14:29:50 +01:00
Bharata B Rao
afdbd40356 spapr: Add DRC count indexed hotplug identifier type
Add support for DRC count indexed hotplug ID type which is primarily
needed for memory hot unplug. This type allows for specifying the
number of DRs that should be plugged/unplugged starting from a given
DRC index.

Signed-off-by: Bharata B Rao <bharata@linux.vnet.ibm.com>
* updated rtas_event_log_v6_hp to reflect count/index field ordering
  used in PAPR hotplug ACR
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2016-10-28 11:17:35 +11:00
Michael Roth
ffbb1705a3 spapr_events: add support for dedicated hotplug event source
Hotplug events were previously delivered using an EPOW interrupt
and were queued by linux guests into a circular buffer. For traditional
EPOW events like shutdown/resets, this isn't an issue, but for hotplug
events there are cases where this buffer can be exhausted, resulting
in the loss of hotplug events, resets, etc.

Newer-style hotplug event are delivered using a dedicated event source.
We enable this in supported guests by adding standard an additional
event source in the guest device-tree via /event-sources, and, if
the guest advertises support for the newer-style hotplug events,
using the corresponding interrupt to signal the available of
hotplug/unplug events.

Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2016-10-28 11:17:35 +11:00
Michael Roth
417ece33fc spapr: improve ibm,architecture-vec-5 property handling
ibm,architecture-vec-5 is supposed to encode all option vector 5 bits
negotiated between platform/guest. Currently we hardcode this property
in the boot-time device tree to advertise a single negotiated
capability, "Form 1" NUMA Affinity, regardless of whether or not CAS
has been invoked or that capability has actually been negotiated.

Improve this by generating ibm,architecture-vec-5 based on the full
set of option vector 5 capabilities negotiated via CAS.

Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2016-10-28 09:38:26 +11:00
Michael Roth
6787d27b04 spapr: add option vector handling in CAS-generated resets
In some cases, ibm,client-architecture-support calls can fail. This
could happen in the current code for situations where the modified
device tree segment exceeds the buffer size provided by the guest
via the call parameters. In these cases, QEMU will reset, allowing
an opportunity to regenerate the device tree from scratch via
boot-time handling. There are potentially other scenarios as well,
not currently reachable in the current code, but possible in theory,
such as cases where device-tree properties or nodes need to be removed.

We currently don't handle either of these properly for option vector
capabilities however. Instead of carrying the negotiated capability
beyond the reset and creating the boot-time device tree accordingly,
we start from scratch, generating the same boot-time device tree as we
did prior to the CAS-generated and the same device tree updates as we
did before. This could (in theory) cause us to get stuck in a reset
loop. This hasn't been observed, but depending on the extensiveness
of CAS-induced device tree updates in the future, could eventually
become an issue.

Address this by pulling capability-related device tree
updates resulting from CAS calls into a common routine,
spapr_dt_cas_updates(), and adding an sPAPROptionVector*
parameter that allows us to test for newly-negotiated capabilities.
We invoke it as follows:

1) When ibm,client-architecture-support gets called, we
   call spapr_dt_cas_updates() with the set of capabilities
   added since the previous call to ibm,client-architecture-support.
   For the initial boot, or a system reset generated by something
   other than the CAS call itself, this set will consist of *all*
   options supported both the platform and the guest. For calls
   to ibm,client-architecture-support immediately after a CAS-induced
   reset, we call spapr_dt_cas_updates() with only the set
   of capabilities added since the previous call, since the other
   capabilities will have already been addressed by the boot-time
   device-tree this time around. In the unlikely event that
   capabilities are *removed* since the previous CAS, we will
   generate a CAS-induced reset. In the unlikely event that we
   cannot fit the device-tree updates into the buffer provided
   by the guest, well generate a CAS-induced reset.

2) When a CAS update results in the need to reset the machine and
   include the updates in the boot-time device tree, we call the
   spapr_dt_cas_updates() using the full set of negotiated
   capabilities as part of the reset path. At initial boot, or after
   a reset generated by something other than the CAS call itself,
   this set will be empty, resulting in what should be the same
   boot-time device-tree as we generated prior to this patch. For
   CAS-induced reset, this routine will be called with the full set of
   capabilities negotiated by the platform/guest in the previous
   CAS call, which should result in CAS updates from previous call
   being accounted for in the initial boot-time device tree.

Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
[dwg: Changed an int -> bool conversion to be more explicit]
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2016-10-28 09:38:26 +11:00
Michael Roth
facdb8b63b spapr_hcall: use spapr_ovec_* interfaces for CAS options
Currently we access individual bytes of an option vector via
ldub_phys() to test for the presence of a particular capability
within that byte. Currently this is only done for the "dynamic
reconfiguration memory" capability bit. If that bit is present,
we pass a boolean value to spapr_h_cas_compose_response()
to generate a modified device tree segment with the additional
properties required to enable this functionality.

As more capability bits are added, will would need to modify the
code to add additional option vector accesses and extend the
param list for spapr_h_cas_compose_response() to include similar
boolean values for these parameters.

Avoid this by switching to spapr_ovec_* helpers so we can do all
the parsing in one shot and then test for these additional bits
within spapr_h_cas_compose_response() directly.

Cc: Bharata B Rao <bharata@linux.vnet.ibm.com>
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Bharata B Rao <bharata@linux.vnet.ibm.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2016-10-28 09:38:26 +11:00
Michael Roth
b20b7b7add spapr_ovec: initial implementation of option vector helpers
PAPR guests advertise their capabilities to the platform by passing
an ibm,architecture-vec structure via an
ibm,client-architecture-support hcall as described by LoPAPR v11,
B.6.2.3. during early boot.

Using this information, the platform enables the capabilities it
supports, then encodes a subset of those enabled capabilities (the
5th option vector of the ibm,architecture-vec structure passed to
ibm,client-architecture-support) into the guest device tree via
"/chosen/ibm,architecture-vec-5".

The logical format of these these option vectors is a bit-vector,
where individual bits are addressed/documented based on the byte-wise
offset from the beginning of the bit-vector, followed by the bit-wise
index starting from the byte-wise offset. Thus the bits of each of
these bytes are stored in reverse order. Additionally, the first
byte of each option vector is encodes the length of the option vector,
so byte offsets begin at 1, and bit offset at 0.

This is not very intuitive for the purposes of mapping these bits to
a particular documented capability, so this patch introduces a set
of abstractions that encapsulate the work of parsing/encoding these
options vectors and testing for individual capabilities.

Cc: Bharata B Rao <bharata@linux.vnet.ibm.com>
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
[dwg: Tweaked double-include protection to not trigger a checkpatch
 false positive]
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2016-10-28 09:38:26 +11:00
David Gibson
398a0bd5ae pseries: Remove spapr_create_fdt_skel()
For historical reasons construction of the guest device tree in spapr is
divided between spapr_create_fdt_skel() which is called at init time, and
spapr_build_fdt() which runs at reset time.  Over time, more and more
things have needed to be moved to reset time.

Previous cleanups mean the only things left in spapr_create_fdt_skel() are
the properties of the root node itself.  Finish consolidating these two
parts of device tree construction, by moving this to the start of
spapr_build_fdt(), and removing spapr_create_fdt_skel() entirely.

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2016-10-28 09:38:26 +11:00
David Gibson
bf5a6696ba pseries: Consolidate construction of /vdevice device tree node
Construction of the /vdevice node (and its children) is divided between
spapr_create_fdt_skel() (at init time), which creates the base node, and
spapr_populate_vdevice() (at reset time) which creates the nodes for each
individual virtual device.

This consolidates both into a single function called from
spapr_build_fdt().

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2016-10-28 09:38:26 +11:00
David Gibson
ffb1e275a6 pseries: Move /event-sources construction to spapr_build_fdt()
The /event-sources device tree node is built from spapr_create_fdt_skel().
As part of consolidating device tree construction to reset time, this moves
it to spapr_build_fdt().

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2016-10-28 09:38:26 +11:00
David Gibson
3f5dabceba pseries: Consolidate construction of /rtas device tree node
For historical reasons construction of the /rtas node in the device
tree (amongst others) is split into several places.  In particular
it's split between spapr_create_fdt_skel(), spapr_build_fdt() and
spapr_rtas_device_tree_setup().

In fact, as well as adding the actual RTAS tokens to the device tree,
spapr_rtas_device_tree_setup() just adds the ibm,lrdr-capacity
property, which despite going in the /rtas node, doesn't have a lot to
do with RTAS.

This patch consolidates the code constructing /rtas together into a new
spapr_dt_rtas() function.  spapr_rtas_device_tree_setup() is renamed to
spapr_dt_rtas_tokens() and now only adds the token properties.

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2016-10-28 09:38:26 +11:00
David Gibson
7c866c6a60 pseries: Consolidate construction of /chosen device tree node
For historical reasons, building the /chosen node in the guest device tree
is split across several places and includes both parts which write the DT
sequentially and others which use random access functions.

This patch consolidates construction of the node into one place, using
random access functions throughout.

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2016-10-28 09:38:26 +11:00
David Gibson
9b9a19080a pseries: Move construction of /interrupt-controller fdt node
Currently the device tree node for the XICS interrupt controller is in
spapr_create_fdt_skel().  As part of consolidating device tree construction
to reset time, this moves it to a function called from spapr_build_fdt().

In addition we move the actual code into hw/intc/xics_spapr.c with the
rest of the PAPR specific interrupt controller code.

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2016-10-28 09:38:26 +11:00
David Gibson
2cac78c12a pseries: Consolidate RTAS loading
At each system reset, the pseries machine needs to load RTAS (the runtime
portion of the guest firmware) into the VM.  This means copying
the actual RTAS code into guest memory, and also updating the device
tree so that the guest OS and boot firmware can locate it.

For historical reasons the copy and update to the device tree were in
different parts of the code.  This cleanup brings them both together in
an spapr_load_rtas() function.

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2016-10-28 09:38:26 +11:00
David Gibson
a19f7fb045 pseries: Make spapr_create_fdt_skel() get information from machine state
Currently spapr_create_fdt_skel() takes a bunch of individual parameters
for various things it will put in the device tree.  Some of these can
already be taken directly from sPAPRMachineState.  This patch alters it so
that all of them can be taken from there, which will allow this code to
be moved away from its current caller in future.

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2016-10-28 09:38:25 +11:00
David Gibson
cae172ab6d pseries: Remove rtas_addr and fdt_addr fields from machinestate
These values are used only within ppc_spapr_reset(), so just change them
to local variables.

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Reviewed-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2016-10-28 09:38:25 +11:00
Cédric Le Goater
3495b6b610 ppc/pnv: add a ISA bus
As Qemu only supports a single instance of the ISA bus, we use the LPC
controller of chip 0 to create one and plug in a couple of useful
devices, like an UART and RTC. An IPMI BT device, which is also an ISA
device, can be defined on the command line to connect an external BMC.
That is for later.

The PowerNV machine now has a console. Skiboot should load a kernel
and jump into it but execution will stop quite early because we lack a
model for the native XICS controller for the moment :

    [    0.000000] NR_IRQS:512 nr_irqs:512 16
    [    0.000000] XICS: Cannot find a Presentation Controller !
    [    0.000000] ------------[ cut here ]------------
    [    0.000000] WARNING: at arch/powerpc/platforms/powernv/setup.c:81
    ...
    [    0.000000] NIP [c00000000079d65c] pnv_init_IRQ+0x30/0x44

You can still do a few things under xmon.

Based on previous work from :
      Benjamin Herrenschmidt <benh@kernel.crashing.org>

Signed-off-by: Cédric Le Goater <clg@kaod.org>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
[dwg: Trivial fix for a change in the serial_hds_isa_init() interface]
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2016-10-28 09:38:25 +11:00
Benjamin Herrenschmidt
a3980bf517 ppc/pnv: add a LPC controller
The LPC (Low Pin Count) interface on a POWER8 is made accessible to
the system through the ADU (XSCOM interface). This interface is part
of set of units connected together via a local OPB (On-Chip Peripheral
Bus) which act as a bridge between the ADU and the off chip LPC
endpoints, like external flash modules.

The most important units of this OPB are :
 - OPB Master: contains the ADU slave logic, a set of internal
   registers and the logic to control the OPB.
 - LPCHC (LPC HOST Controller): which implements a OPB Slave, a set of
   internal registers and the LPC HOST Controller to control the LPC
   interface.

Four address spaces are provided to the ADU :
 - LPC Bus Firmware Memory
 - LPC Bus Memory
 - LPC Bus I/O (ISA bus)
 - and the registers for the OPB Master and the LPC Host Controller

On POWER8, an intermediate hop is necessary to reach the OPB, through
a unit called the ECCB. OPB commands are simply mangled in ECCB write
commands.

On POWER9, the OPB master address space can be accessed via MMIO. The
logic is same but the code will be simpler as the XSCOM and ECCB hops
are not necessary anymore.

This version of the LPC controller model doesn't yet implement support
for the SerIRQ deserializer present in the Naples version of the chip
though some preliminary work is there.

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
[clg: - updated for qemu-2.7
      - ported on latest PowerNV patchset
      - changed the XSCOM interface to fit new model
      - QOMified the model
      - moved the ISA hunks in another patch
      - removed printf logging
      - added a couple of UNIMP logging
      - rewrote commit log ]
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2016-10-28 09:38:25 +11:00
Cédric Le Goater
24ece07250 ppc/pnv: add XSCOM handlers to PnvCore
Now that we are using real HW ids for the cores in PowerNV chips, we
can route the XSCOM accesses to them. We just need to attach a
specific XSCOM memory region to each core in the appropriate window
for the core number.

To start with, let's install the DTS (Digital Thermal Sensor) handlers
which should return 38°C for each core.

Signed-off-by: Cédric Le Goater <clg@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2016-10-28 09:38:25 +11:00
Cédric Le Goater
967b75230b ppc/pnv: add XSCOM infrastructure
On a real POWER8 system, the Pervasive Interconnect Bus (PIB) serves
as a backbone to connect different units of the system. The host
firmware connects to the PIB through a bridge unit, the
Alter-Display-Unit (ADU), which gives him access to all the chiplets
on the PCB network (Pervasive Connect Bus), the PIB acting as the root
of this network.

XSCOM (serial communication) is the interface to the sideband bus
provided by the POWER8 pervasive unit to read and write to chiplets
resources. This is needed by the host firmware, OPAL and to a lesser
extent, Linux. This is among others how the PCI Host bridges get
configured at boot or how the LPC bus is accessed.

To represent the ADU of a real system, we introduce a specific
AddressSpace to dispatch XSCOM accesses to the targeted chiplets. The
translation of an XSCOM address into a PCB register address is
slightly different between the P9 and the P8. This is handled before
the dispatch using a 8byte alignment for all.

To customize the device tree, a QOM InterfaceClass, PnvXScomInterface,
is provided with a populate() handler. The chip populates the device
tree by simply looping on its children. Therefore, each model needing
custom nodes should not forget to declare itself as a child at
instantiation time.

Based on previous work done by :
      Benjamin Herrenschmidt <benh@kernel.crashing.org>

Signed-off-by: Cédric Le Goater <clg@kaod.org>
[dwg: Added cpu parameter to xscom_complete()]
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2016-10-28 09:38:25 +11:00
Cédric Le Goater
d2fd9612ee ppc/pnv: add a PnvCore object
This is largy inspired by sPAPRCPUCore with some simplification, no
hotplug for instance. A set of PnvCore objects is added to the PnvChip
and the device tree is populated looping on these cores.

Real HW cpu ids are now generated depending on the chip cpu model, the
chip id and a core mask. The id is propagated to the CPU object, using
properties, to set the SPR_PIR (Processor Identification Register)

Signed-off-by: Cédric Le Goater <clg@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2016-10-28 09:38:25 +11:00
Cédric Le Goater
631adaff31 ppc/pnv: add a PIR handler to PnvChip
The Processor Identification Register (PIR) is a register that holds a
processor identifier which is used for bus transactions (XSCOM) and
for processor differentiation in multiprocessor systems. It also used
in the interrupt vector entries (IVE) to identify the thread serving
the interrupts.

P9 and P8 have some differences in the CPU PIR encoding.

Signed-off-by: Cédric Le Goater <clg@kaod.org>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2016-10-28 09:38:25 +11:00
Cédric Le Goater
397a79e757 ppc/pnv: add a core mask to PnvChip
This will be used to build real HW ids for the cores and enforce some
limits on the available cores per chip.

Signed-off-by: Cédric Le Goater <clg@kaod.org>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2016-10-28 09:38:25 +11:00
Cédric Le Goater
e997040e3f ppc/pnv: add a PnvChip object
This is is an abstraction of a POWER8 chip which is a set of cores
plus other 'units', like the pervasive unit, the interrupt controller,
the memory controller, the on-chip microcontroller, etc. The whole can
be seen as a socket. It depends on a cpu model and its characteristics:
max cores and specific inits are defined in a PnvChipClass.

We start with an near empty PnvChip with only a few cpu constants
which we will grow in the subsequent patches with the controllers
required to run the system.

The Chip CFAM (Common FRU Access Module) ID gives the model of the
chip and its version number. It is generally the first thing firmwares
fetch, available at XSCOM PCB address 0xf000f, to start initialization.

Signed-off-by: Cédric Le Goater <clg@kaod.org>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2016-10-28 09:38:25 +11:00
Benjamin Herrenschmidt
9e933f4a62 ppc/pnv: add skeleton PowerNV platform
The goal is to emulate a PowerNV system at the level of the skiboot
firmware, which loads the OS and provides some runtime services. Power
Systems have a lower firmware (HostBoot) that does low level system
initialization, like DRAM training. This is beyond the scope of what
qemu will address in a PowerNV guest.

No devices yet, not even an interrupt controller. Just to get started,
some RAM to load the skiboot firmware, the kernel and initrd. The
device tree is fully created in the machine reset op.

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
[clg: - updated for qemu-2.7
      - replaced fprintf by error_report
      - used a common definition of _FDT macro
      - removed VMStateDescription as migration is not yet supported
      - added IBM Copyright statements
      - reworked kernel_filename handling
      - merged PnvSystem and sPowerNVMachineState
      - removed PHANDLE_XICP
      - added ppc_create_page_sizes_prop helper
      - removed nmi support
      - removed kvm support
      - updated powernv machine to version 2.8
      - removed chips and cpus, They will be provided in another patches
      - added a machine reset routine to initialize the device tree (also)
      - french has a squelette and english a skeleton.
      - improved commit log.
      - reworked prototypes parameters
      - added a check on the ram size (thanks to Michael Ellerman)
      - fixed chip-id cell
      - changed MAX_CPUS to 2048
      - simplified memory node creation to one node only
      - removed machine version
      - rewrote the device tree creation with the fdt "rw" routines
      - s/sPowerNVMachineState/PnvMachineState/
      - etc.]
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2016-10-28 09:38:24 +11:00
David Gibson
e763da2344 pseries: Remove unused callbacks from sPAPR VIO bus state
The original QOMification of the spapr VIO devices in 3954d33 "spapr:
convert to QEMU Object Model (v2)" moved some callbacks from the
VIOsPAPRBus structure to the VIOsPAPRDeviceClass.  Except, that it
forgot to actually remove them from the VIOsPAPRBus structure (which
still exists, though it doesn't fulfill quite the same function as it
did pre-QOM).

This patch removes those now unused callback fields.

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Reviewed-by: Thomas Huth <thuth@redhat.com>
2016-10-28 09:36:58 +11:00
Cédric Le Goater
e3403258a2 ppc/xics: change the icp_ routines API to use an 'ICPState *' argument
The routines :

	void icp_set_cppr(ICPState *icp, uint8_t cppr);
	void icp_set_mfrr(ICPState *icp, uint8_t mfrr);
	void icp_eoi(ICPState *icp, uint32_t xirr);

now use one 'ICPState *icp' argument instead of a 'XICSState *' and a
server arguments. The backlink on XICSState* is used whenever needed.

Signed-off-by: Cédric Le Goater <clg@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2016-10-28 09:36:58 +11:00
Cédric Le Goater
d49c603b37 ppc/xics: add a XICSState backlink in ICPState
The link will be used to change the API of the icp_* routines which
are still using an XICSState as an argument.

Signed-off-by: Cédric Le Goater <clg@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2016-10-28 09:36:58 +11:00
Cédric Le Goater
2bb0d10aeb ppc/xics: add a xics_set_nr_servers common routine
xics_spapr and xics_kvm nearly define the same 'set_nr_servers'
handler. Only the type of the ICP differs. So let's make a common one
to remove some duplicated code.

Signed-off-by: Cédric Le Goater <clg@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2016-10-28 09:36:58 +11:00
Thomas Huth
c6363bae17 nvram: Rename openbios_firmware_abi.h into sun_nvram.h
The header now only contains inline functions related to the
Sun NVRAM, so the a name like sun_nvram.h seems to be more
appropriate now.

Signed-off-by: Thomas Huth <thuth@redhat.com>
Tested-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2016-10-28 09:36:58 +11:00
Thomas Huth
ad723fe5a0 nvram: Move the remaining CHRP NVRAM related code to chrp_nvram.[ch]
Everything that is related to CHRP NVRAM should rather reside in
chrp_nvram.c / chrp_nvram.h instead of openbios_firmware_abi.h.

Signed-off-by: Thomas Huth <thuth@redhat.com>
Tested-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2016-10-28 09:36:58 +11:00
Thomas Huth
55d9950aaa nvram: Introduce helper functions for CHRP "system" and "free space" partitions
The "system partition" and "free space" partition layouts are
defined by the CHRP and LoPAPR specification, and used by
OpenBIOS and SLOF. We can re-use this code for other machines
that use OpenBIOS and SLOF, too. So let's make this code independent
from the MAC NVRAM environment and put it into two proper helper
functions.

Signed-off-by: Thomas Huth <thuth@redhat.com>
Tested-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2016-10-28 09:36:58 +11:00
Mark Cave-Ayland
99868af3d0 dma-helpers: explicitly pass alignment into DMA helpers
The hard-coded default alignment is BDRV_SECTOR_SIZE, however this is not
necessarily the case for all platforms. Use this as the default alignment for
all current callers.

Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Reviewed-by: Eric Blake <eblake@redhat.com>
Acked-by: John Snow <jsnow@redhat.com>
Message-id: 1476445266-27503-2-git-send-email-mark.cave-ayland@ilande.co.uk
Signed-off-by: John Snow <jsnow@redhat.com>
2016-10-27 16:29:13 -04:00
Kevin Wolf
cbc14ac9c3 block: Remove bdrv_aio_ioctl()
It is unused now.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
2016-10-27 19:05:23 +02:00
Kevin Wolf
16a389dc9e block: Introduce .bdrv_co_ioctl() driver callback
This allows drivers to implement ioctls in a coroutine-based way.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
2016-10-27 19:05:23 +02:00
Kevin Wolf
61b2450414 block: Remove bdrv_ioctl()
It is unused now.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
2016-10-27 19:05:23 +02:00
Kevin Wolf
48af776a5b block: Use blk_co_ioctl() for all BB level ioctls
All read/write functions already have a single coroutine-based function
on the BlockBackend level through which all requests go (no matter what
API style the external caller used) and which passes the requests down
to the block node level.

This patch exports a bdrv_co_ioctl() function and uses it to extend this
mode of operation to ioctls.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
2016-10-27 19:05:22 +02:00
Kevin Wolf
7381e95cc2 block: Remove bdrv_aio_pdiscard()
It is unused now.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
2016-10-27 19:05:22 +02:00
Daniel P. Berrange
e93a68e102 char: set name for all I/O channels created
Ensure that all I/O channels created for character devices
are given names to distinguish their respective roles.

Acked-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
2016-10-27 09:13:10 +02:00
Daniel P. Berrange
20f4aa265e io: add ability to set a name for IO channels
The GSource object has ability to have a name, which is useful
when debugging performance problems with the mainloop event
callbacks that take too long. By associating a name with a
QIOChannel object, we can then set the name on any GSource
associated with the channel.

Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
2016-10-27 09:13:10 +02:00
Felipe Franciosi
d8d3c7cc67 io: Introduce a qio_channel_set_feature() helper
Testing QIOChannel feature support can be done with a helper called
qio_channel_has_feature(). Setting feature support, however, was
done manually with a logical OR. This patch introduces a new helper
called qio_channel_set_feature() and makes use of it where applicable.

Signed-off-by: Felipe Franciosi <felipe@nutanix.com>
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
2016-10-26 18:19:53 +02:00
Felipe Franciosi
8fbf661212 io: Fix double shift usages on QIOChannel features
When QIOChannels were introduced in 666a3af9, the feature bits were
already defined shifted. However, when using them, the code was shifting
them again. The incorrect use was consistent until 74b6ce43, where
QIO_CHANNEL_FEATURE_LISTEN was defined shifted but tested unshifted.

This patch changes the definition to be unshifted and fixes the
incorrect usage introduced on 74b6ce43.

Signed-off-by: Felipe Franciosi <felipe@nutanix.com>
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
2016-10-26 18:19:53 +02:00
Richard Henderson
7ebee43ee3 tcg: Add atomic128 helpers
Force the use of cmpxchg16b on x86_64.

Wikipedia suggests that only very old AMD64 (circa 2004) did not have
this instruction.  Further, it's required by Windows 8 so no new cpus
will ever omit it.

If we truely care about these, then we could check this at startup time
and then avoid executing paths that use it.

Reviewed-by: Emilio G. Cota <cota@braap.org>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <rth@twiddle.net>
2016-10-26 08:29:01 -07:00
Richard Henderson
fdbc2b5722 tcg: Add EXCP_ATOMIC
When we cannot emulate an atomic operation within a parallel
context, this exception allows us to stop the world and try
again in a serial context.

Reviewed-by: Emilio G. Cota <cota@braap.org>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <rth@twiddle.net>
2016-10-26 08:29:00 -07:00
Richard Henderson
1edaeee095 int128: Add int128_make128
Allows Int128 to be used more generally, rather than having to
begin with 64-bit inputs and accumulate.

Reviewed-by: Emilio G. Cota <cota@braap.org>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <rth@twiddle.net>
2016-10-26 08:29:00 -07:00
Richard Henderson
0846beb366 int128: Use __int128 if available
Reviewed-by: Emilio G. Cota <cota@braap.org>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <rth@twiddle.net>
2016-10-26 08:29:00 -07:00
Richard Henderson
258dfaaad0 exec: Avoid direct references to Int128 parts
Reviewed-by: Emilio G. Cota <cota@braap.org>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <rth@twiddle.net>
2016-10-26 08:29:00 -07:00