Commit Graph

67662 Commits

Author SHA1 Message Date
Peter Maydell
86c7e2f4a9 Add a standard authorization framework
The current network services now support encryption via TLS and in some
 cases support authentication via SASL. In cases where SASL is not
 available, x509 client certificates can be used as a crude authorization
 scheme, but using a sub-CA and controlling who you give certs to. In
 general this is not very flexible though, so this series introduces a
 new standard authorization framework.
 
 It comes with four initial authorization mechanisms
 
  - Simple - an exact username match. This is useful when there is
    exactly one user that is known to connect. For example when live
    migrating from one QEMU to another with TLS, libvirt would use
    the simple scheme to whitelist the TLS cert of the source QEMU.
 
  - List - an full access control list, with optional regex matching.
    This is more flexible and is used to provide 100% backcompat with
    the existing HMP ACL commands. The caveat is that we can't create
    these via the CLI -object arg yet.
 
  - ListFile - the same as List, but with the rules stored in JSON
    format in an external file. This avoids the -object limitation
    while also allowing the admin to change list entries on the file.
    QEMU uses inotify to notice these changes and auto-reload the
    file contents. This is likely a good default choice for most
    network services, if the "simple" mechanism isn't sufficient.
 
  - PAM - delegate the username lookup to a PAM module, which opens
    the door to many options including things like SQL/LDAP lookups.
 -----BEGIN PGP SIGNATURE-----
 
 iQIcBAABCAAGBQJcdVxaAAoJEL6G67QVEE/fgK4P/jxF1aTzuc0N0jD7myNi2LE4
 aUqXueawNLrZKk8ZZQtEy/71/fTNJm9fVwxN9zIWITb4jTj4c3hqUnK1iVtEXXOR
 e0qZdKAsVAD2cpj9PBaihn84f+hrSF1ZnuMqtt4/kfOE9QPfLRpbqCNe+7JX5Z7z
 Cw8k+ATvm6Xc5QVEkc+8JTMft9lQaOdRJvldR2CVSqeFgIZWIrQ2NIG44m2GqO6b
 wqqUgmaMHd4cGJ4fgTn8G74GNzp4hpXzgTC9P881zC8qc+dKKJasTq1jSR3eWYKJ
 aE4Y+4EDjH/a0sP/zf97OfDzb/mGEpw8ng00g3o0qphFmSWMHVcoghB6wUZZcZGu
 VcGyoH+ztDdIcoD9HwVbkzMvYt9qrtMNrMUf3p1DRRNrrwcopSYnG6gZ9ArckgD4
 2VzxLrBbfS9tsWGAvOw/+c4SIxEzGQ0mcqHQeZONxLAi+zwkzuKVu973JBkvCBus
 aXpct8B2aMMxZeJ1ipWksj5PvAAEjhOquDqZrUtRiP7vg4156UiAhxW9x6hK//O1
 pof4jVBPZSYgrt3+PjADwK1fmPv01KkNa1dVi/Ew7fBJjGQNfJ12JlAYdSeoGG4i
 ds4OBbQH+Ko40++ZGu+n7xlk/0GQEw771c2GNDoA7ehHPg++S2yvrRoVVneyicur
 RYzwaPQ1nYrGAo3fAJEd
 =67pz
 -----END PGP SIGNATURE-----

Merge remote-tracking branch 'remotes/berrange/tags/authz-core-pull-request' into staging

Add a standard authorization framework

The current network services now support encryption via TLS and in some
cases support authentication via SASL. In cases where SASL is not
available, x509 client certificates can be used as a crude authorization
scheme, but using a sub-CA and controlling who you give certs to. In
general this is not very flexible though, so this series introduces a
new standard authorization framework.

It comes with four initial authorization mechanisms

 - Simple - an exact username match. This is useful when there is
   exactly one user that is known to connect. For example when live
   migrating from one QEMU to another with TLS, libvirt would use
   the simple scheme to whitelist the TLS cert of the source QEMU.

 - List - an full access control list, with optional regex matching.
   This is more flexible and is used to provide 100% backcompat with
   the existing HMP ACL commands. The caveat is that we can't create
   these via the CLI -object arg yet.

 - ListFile - the same as List, but with the rules stored in JSON
   format in an external file. This avoids the -object limitation
   while also allowing the admin to change list entries on the file.
   QEMU uses inotify to notice these changes and auto-reload the
   file contents. This is likely a good default choice for most
   network services, if the "simple" mechanism isn't sufficient.

 - PAM - delegate the username lookup to a PAM module, which opens
   the door to many options including things like SQL/LDAP lookups.

# gpg: Signature made Tue 26 Feb 2019 15:33:46 GMT
# gpg:                using RSA key BE86EBB415104FDF
# gpg: Good signature from "Daniel P. Berrange <dan@berrange.com>" [full]
# gpg:                 aka "Daniel P. Berrange <berrange@redhat.com>" [full]
# Primary key fingerprint: DAF3 A6FD B26B 6291 2D0E  8E3F BE86 EBB4 1510 4FDF

* remotes/berrange/tags/authz-core-pull-request:
  authz: delete existing ACL implementation
  authz: add QAuthZPAM object type for authorizing using PAM
  authz: add QAuthZListFile object type for a file access control list
  authz: add QAuthZList object type for an access control list
  authz: add QAuthZSimple object type for easy whitelist auth checks
  authz: add QAuthZ object as an authorization base class
  hw/usb: switch MTP to use new inotify APIs
  hw/usb: fix const-ness for string params in MTP driver
  hw/usb: don't set IN_ISDIR for inotify watch in MTP driver
  qom: don't require user creatable objects to be registered
  util: add helper APIs for dealing with inotify in portable manner

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2019-02-26 17:59:41 +00:00
Daniel P. Berrangé
3e6f45446b iotests: avoid broken pipe with certtool
When we run "certtool 2>&1 | head -1" the latter command is likely to
complete and exit before certtool has written everything it wants to
stderr. In at least the RHEL-7 gnutls 3.3.29 this causes certtool to
quit with broken pipe before it has finished writing the desired
output file to disk. This causes non-deterministic failures of the
iotest 233 because the certs are sometimes zero length files.
If certtool fails the "head -1" means we also lose any useful error
message it would have printed.

Thus this patch gets rid of the pipe and post-processes the output in a
more flexible & reliable manner.

Reported-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
Message-Id: <20190220145819.30969-3-berrange@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Eric Blake <eblake@redhat.com>
2019-02-26 10:45:37 -06:00
Daniel P. Berrangé
84f8b840a2 iotests: ensure we print nbd server log on error
If we abort the iotest early the server.log file might contain useful
information for diagnosing the problem. Ensure its contents are
displayed in this case.

Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
Message-Id: <20190220145819.30969-2-berrange@redhat.com>
[eblake: fix shell quoting]
Signed-off-by: Eric Blake <eblake@redhat.com>
2019-02-26 10:45:37 -06:00
Andrey Shinkevich
42eb4591fe iotests: handle TypeError for Python 3 in test 242
The data type for bytes in Python 3 differs from the one in Python 2.
The type cast that is compatible with both versions was applied.

Signed-off-by: Nir Soffer <nsoffer@redhat.com>
Signed-off-by: Andrey Shinkevich <andrey.shinkevich@virtuozzo.com>
Reported-by: Kevin Wolf <kwolf@redhat.com>
Message-Id: <1551197495-24425-1-git-send-email-andrey.shinkevich@virtuozzo.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Eric Blake <eblake@redhat.com>
2019-02-26 10:37:06 -06:00
Daniel P. Berrange
b76806d4ec authz: delete existing ACL implementation
The 'qemu_acl' type was a previous non-QOM based attempt to provide an
authorization facility in QEMU. Because it is non-QOM based it cannot be
created via the command line and requires special monitor commands to
manipulate it.

The new QAuthZ subclasses provide a superset of the functionality in
qemu_acl, so the latter can now be deleted. The HMP 'acl_*' monitor
commands are converted to use the new QAuthZSimple data type instead
in order to provide temporary backwards compatibility.

Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Tested-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
2019-02-26 15:32:19 +00:00
Daniel P. Berrange
8953caf3cd authz: add QAuthZPAM object type for authorizing using PAM
Add an authorization backend that talks to PAM to check whether the user
identity is allowed. This only uses the PAM account validation facility,
which is essentially just a check to see if the provided username is permitted
access. It doesn't use the authentication or session parts of PAM, since
that's dealt with by the relevant part of QEMU (eg VNC server).

Consider starting QEMU with a VNC server and telling it to use TLS with
x509 client certificates and configuring it to use an PAM to validate
the x509 distinguished name. In this example we're telling it to use PAM
for the QAuthZ impl with a service name of "qemu-vnc"

 $ qemu-system-x86_64 \
     -object tls-creds-x509,id=tls0,dir=/home/berrange/security/qemutls,\
             endpoint=server,verify-peer=yes \
     -object authz-pam,id=authz0,service=qemu-vnc \
     -vnc :1,tls-creds=tls0,tls-authz=authz0

This requires an /etc/pam/qemu-vnc file to be created with the auth
rules. A very simple file based whitelist can be setup using

  $ cat > /etc/pam/qemu-vnc <<EOF
  account         requisite       pam_listfile.so item=user sense=allow file=/etc/qemu/vnc.allow
  EOF

The /etc/qemu/vnc.allow file simply contains one username per line. Any
username not in the file is denied. The usernames in this example are
the x509 distinguished name from the client's x509 cert.

  $ cat > /etc/qemu/vnc.allow <<EOF
  CN=laptop.berrange.com,O=Berrange Home,L=London,ST=London,C=GB
  EOF

More interesting would be to configure PAM to use an LDAP backend, so
that the QEMU authorization check data can be centralized instead of
requiring each compute host to have file maintained.

The main limitation with this PAM module is that the rules apply to all
QEMU instances on the host. Setting up different rules per VM, would
require creating a separate PAM service name & config file for every
guest. An alternative approach for the future might be to not pass in
the plain username to PAM, but instead combine the VM name or UUID with
the username. This requires further consideration though.

Tested-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
2019-02-26 15:32:19 +00:00
Daniel P. Berrangé
55d869846d authz: add QAuthZListFile object type for a file access control list
Add a QAuthZListFile object type that implements the QAuthZ interface. This
built-in implementation is a proxy around the QAuthZList object type,
initializing it from an external file, and optionally, automatically
reloading it whenever it changes.

To create an instance of this object via the QMP monitor, the syntax
used would be:

      {
        "execute": "object-add",
        "arguments": {
          "qom-type": "authz-list-file",
          "id": "authz0",
          "props": {
            "filename": "/etc/qemu/vnc.acl",
	    "refresh": true
          }
        }
      }

If "refresh" is "yes", inotify is used to monitor the file,
automatically reloading changes. If an error occurs during reloading,
all authorizations will fail until the file is next successfully
loaded.

The /etc/qemu/vnc.acl file would contain a JSON representation of a
QAuthZList object

    {
      "rules": [
         { "match": "fred", "policy": "allow", "format": "exact" },
         { "match": "bob", "policy": "allow", "format": "exact" },
         { "match": "danb", "policy": "deny", "format": "glob" },
         { "match": "dan*", "policy": "allow", "format": "exact" },
      ],
      "policy": "deny"
    }

This sets up an authorization rule that allows 'fred', 'bob' and anyone
whose name starts with 'dan', except for 'danb'. Everyone unmatched is
denied.

The object can be loaded on the comand line using

   -object authz-list-file,id=authz0,filename=/etc/qemu/vnc.acl,refresh=yes

Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Tested-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
2019-02-26 15:32:18 +00:00
Daniel P. Berrange
c8c99887d1 authz: add QAuthZList object type for an access control list
Add a QAuthZList object type that implements the QAuthZ interface. This
built-in implementation maintains a trivial access control list with a
sequence of match rules and a final default policy. This replicates the
functionality currently provided by the qemu_acl module.

To create an instance of this object via the QMP monitor, the syntax
used would be:

  {
    "execute": "object-add",
    "arguments": {
      "qom-type": "authz-list",
      "id": "authz0",
      "props": {
        "rules": [
           { "match": "fred", "policy": "allow", "format": "exact" },
           { "match": "bob", "policy": "allow", "format": "exact" },
           { "match": "danb", "policy": "deny", "format": "glob" },
           { "match": "dan*", "policy": "allow", "format": "exact" },
        ],
        "policy": "deny"
      }
    }
  }

This sets up an authorization rule that allows 'fred', 'bob' and anyone
whose name starts with 'dan', except for 'danb'. Everyone unmatched is
denied.

It is not currently possible to create this via -object, since there is
no syntax supported to specify non-scalar properties for objects. This
is likely to be addressed by later support for using JSON with -object,
or an equivalent approach.

In any case the future "authz-listfile" object can be used from the
CLI and is likely a better choice, as it allows the ACL to be refreshed
automatically on change.

Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Tested-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
2019-02-26 15:32:18 +00:00
Daniel P. Berrangé
fb5c4ebc08 authz: add QAuthZSimple object type for easy whitelist auth checks
In many cases a single VM will just need to whitelist a single identity
as the allowed user of network services. This is especially the case for
TLS live migration (optionally with NBD storage) where we just need to
whitelist the x509 certificate distinguished name of the source QEMU
host.

Via QMP this can be configured with:

  {
    "execute": "object-add",
    "arguments": {
      "qom-type": "authz-simple",
      "id": "authz0",
      "props": {
        "identity": "fred"
      }
    }
  }

Or via the command line

  -object authz-simple,id=authz0,identity=fred

Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Tested-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
2019-02-26 15:25:58 +00:00
Daniel P. Berrange
5b76dd132c authz: add QAuthZ object as an authorization base class
The current qemu_acl module provides a simple access control list
facility inside QEMU, which is used via a set of monitor commands
acl_show, acl_policy, acl_add, acl_remove & acl_reset.

Note there is no ability to create ACLs - the network services (eg VNC
server) were expected to create ACLs that they want to check.

There is also no way to define ACLs on the command line, nor potentially
integrate with external authorization systems like polkit, pam, ldap
lookup, etc.

The QAuthZ object defines a minimal abstract QOM class that can be
subclassed for creating different authorization providers.

Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Tested-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
2019-02-26 15:25:58 +00:00
Daniel P. Berrangé
47287c27d0 hw/usb: switch MTP to use new inotify APIs
The internal inotify APIs allow a lot of conditional statements to be
cleared out, and provide a simpler callback for handling events.

Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
2019-02-26 15:25:58 +00:00
Daniel P. Berrangé
888e0359bf hw/usb: fix const-ness for string params in MTP driver
Various functions accepting 'char *' string parameters were missing
'const' qualifiers.

Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
2019-02-26 15:25:58 +00:00
Daniel P. Berrangé
3c48baf1d4 hw/usb: don't set IN_ISDIR for inotify watch in MTP driver
IN_ISDIR is not a bit that one can request when registering a
watch with inotify_add_watch. Rather it is a bit that is set
automatically when reading events from the kernel.

Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
2019-02-26 15:25:58 +00:00
Daniel P. Berrangé
6134d7522e qom: don't require user creatable objects to be registered
When an object is in turn owned by another user object, it is not
desirable to expose this in the QOM object hierarchy. It is just an
internal implementation detail, we should be free to change without
exposure to apps managing QEMU.

Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Tested-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
2019-02-26 15:25:58 +00:00
Daniel P. Berrangé
90e33dfec6 util: add helper APIs for dealing with inotify in portable manner
The inotify userspace API for reading events is quite horrible, so it is
useful to wrap it in a more friendly API to avoid duplicating code
across many users in QEMU. Wrapping it also allows introduction of a
platform portability layer, so that we can add impls for non-Linux based
equivalents in future.

Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
2019-02-26 15:25:58 +00:00
Alex Bennée
bf30e8662c tests/Makefile.include: test all rounding modes of softfloat
We missed a bug in a recent patch as we were not testing all the
rounding modes for all operations. However enabling all rounding modes
for mulAdd does slow down the already slowest test and doesn't really
buy us much additional coverage so lets allow the default test flags
to be overridden.

Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
2019-02-26 14:08:03 +00:00
Richard Henderson
5d64abb32f softfloat: Support float_round_to_odd more places
Previously this was only supported for roundAndPackFloat64.

New support in round_canonical, round_to_int, float128_round_to_int,
roundAndPackFloat32, roundAndPackInt32, roundAndPackInt64,
roundAndPackUint64.  This does not include any of the floatx80 routines,
as we do not have users for that rounding mode there.

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20190215170225.15537-1-richard.henderson@linaro.org>
Tested-by: David Hildenbrand <david@redhat.com>
[AJB: add missing break]
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
2019-02-26 14:08:03 +00:00
Alex Bennée
dc3f8a9dcf tests/fp: enable f128_to_ui[32/64] tests in float-to-uint
We've just added f128_to_ui32 and we missed out the f128_to_ui64 tests
last time.

Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
2019-02-26 14:07:34 +00:00
Alex Bennée
80d491fea3 tests/fp: add wrapping for f128_to_ui32
Needed to test: softfloat: Implement float128_to_uint32

Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
2019-02-26 14:05:19 +00:00
David Hildenbrand
e45de9922e softfloat: Implement float128_to_uint32
Handling it just like float128_to_uint32_round_to_zero, that hopefully
is free of bugs :)

Documentation basically copied from float128_to_uint64

Signed-off-by: David Hildenbrand <david@redhat.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
2019-02-26 14:05:19 +00:00
David Hildenbrand
4739318160 softfloat: add float128_is_{normal,denormal}
Needed on s390x, to test for the data class of a number. So it will
gain soon a user.

A number is considered normal if the exponent is neither 0 nor all 1's.
That can be checked by adding 1 to the exponent, and comparing against
>= 2 after dropping an eventual overflow into the sign bit.

While at it, convert the other floatXX_is_normal functions to use a
similar, less error prone calculation, as suggested by Richard H.

Signed-off-by: David Hildenbrand <david@redhat.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
2019-02-26 14:05:19 +00:00
Eric Blake
3acf04f899 tests: Ignore fp test outputs
Commit 2cade3d wired up new tests, but did not exclude the
new *.out files produced by running the tests.

Signed-off-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Emilio G. Cota <cota@braap.org>
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
2019-02-26 14:05:19 +00:00
Murilo Opsfelder Araujo
b268a6162d ppc/pnv: use IEC binary prefixes to represent sizes
Using IEC binary prefixes from qemu/units.h provides a more human-friendly value
to size constants.

Suggested-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Murilo Opsfelder Araujo <muriloo@linux.ibm.com>
Message-Id: <20190225170155.1972-4-muriloo@linux.ibm.com>
Reviewed-by: Cédric Le Goater <clg@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2019-02-26 14:20:30 +11:00
Murilo Opsfelder Araujo
584ea7e76f ppc/pnv: add INITRD_MAX_SIZE constant
The current 0x10000000 value is actually 256MiB, not 128MB as the comment
suggests. Move it to a constant and fix the comment (no change in the size
value).

Signed-off-by: Murilo Opsfelder Araujo <muriloo@linux.ibm.com>
Message-Id: <20190225170155.1972-3-muriloo@linux.ibm.com>
Reviewed-by: Cédric Le Goater <clg@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2019-02-26 14:20:30 +11:00
Murilo Opsfelder Araujo
b45b56baee ppc/pnv: increase kernel size limit to 256MiB
Building kernel with CONFIG_DEBUG_INFO_REDUCED can generate a ~90MB image and
building with CONFIG_DEBUG_INFO can generate a ~225M one, both exceeds the
current limit of 32MiB.

Increasing kernel size limit to 256MiB should fit for now.

Signed-off-by: Murilo Opsfelder Araujo <muriloo@linux.ibm.com>
Message-Id: <20190225170155.1972-2-muriloo@linux.ibm.com>
Reviewed-by: Cédric Le Goater <clg@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2019-02-26 14:20:30 +11:00
Thomas Huth
f6d4dca807 hw/ppc: Use object_initialize_child for correct reference counting
Both functions, object_initialize() and object_property_add_child() increase
the reference counter of the new object, so one of the references has to be
dropped afterwards to get the reference counting right. Otherwise the child
object will not be properly cleaned up when the parent gets destroyed.
Thus let's use now object_initialize_child() instead to get the reference
counting here right.

Suggested-by: Eduardo Habkost <ehabkost@redhat.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
Message-Id: <1550748288-30598-1-git-send-email-thuth@redhat.com>
Reviewed-by: Cédric Le Goater <clg@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2019-02-26 09:21:25 +11:00
Cédric Le Goater
3dbe65c178 ppc/xive: xive does not have a POWER7 interrupt model
Patch "target/ppc: Add POWER9 external interrupt model" should have
removed the section covering PPC_FLAGS_INPUT_POWER7.

Signed-off-by: Cédric Le Goater <clg@kaod.org>
Message-Id: <20190219142530.17807-1-clg@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2019-02-26 09:21:25 +11:00
Greg Kurz
9bcb5b2941 tests/device-plug: Add PHB unplug request test for spapr
We can easily test this, just like PCI. PHB unplug is not supported
on s390x and x86 ACPI.

Signed-off-by: Greg Kurz <groug@kaod.org>
Message-Id: <155059673939.1466090.14354001937819612724.stgit@bahia.lab.toulouse-stg.fr.ibm.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2019-02-26 09:21:25 +11:00
Michael Roth
dae5e39ada spapr: enable PHB hotplug for default pseries machine type
The 'dr_phb_enabled' field of that class can be set as part of
machine-specific init code. It will be used to conditionally
enable creation of DRC objects and device-tree description to
facilitate hotplug of PHBs.

Since we can't migrate this state to older machine types,
default the option to true and disable it for older machine
types.

Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
Signed-off-by: Greg Kurz <groug@kaod.org>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Message-Id: <155059673433.1466090.6188091133769611501.stgit@bahia.lab.toulouse-stg.fr.ibm.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2019-02-26 09:21:25 +11:00
Greg Kurz
bb2bdd812e spapr: add hotplug hooks for PHB hotplug
Hotplugging PHBs is a machine-level operation, but PHBs reside on the
main system bus, so we register spapr machine as the handler for the
main system bus.

Provide the usual pre-plug, plug and unplug-request handlers.

Move the checking of the PHB index to the pre-plug handler. It is okay
to do that and assert in the realize function because the pre-plug
handler is always called, even for the oldest machine types we support.

Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
(Fixed interrupt controller phandle in "interrupt-map" and
 TCE table size in "ibm,dma-window" FDT fragment, Greg Kurz)
Signed-off-by: Greg Kurz <groug@kaod.org>
Message-Id: <155059672926.1466090.13612804072190051439.stgit@bahia.lab.toulouse-stg.fr.ibm.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2019-02-26 09:21:25 +11:00
Michael Roth
f130928d2a spapr_pci: add ibm, my-drc-index property for PHB hotplug
This is needed to denote a boot-time PHB as being hot-pluggable.

Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Greg Kurz <groug@kaod.org>
Message-Id: <155059672420.1466090.15147504040270659866.stgit@bahia.lab.toulouse-stg.fr.ibm.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2019-02-26 09:21:25 +11:00
Michael Roth
0a0a66cd1b spapr_pci: provide node start offset via spapr_populate_pci_dt()
PHB hotplug re-uses PHB device tree generation code and passes
it to a guest via RTAS. Doing this requires knowledge of where
exactly in the device tree the node describing the PHB begins.

Provide this via a new optional pointer that can be used to
store the PHB node's start offset.

Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Greg Kurz <groug@kaod.org>
Message-Id: <155059671912.1466090.10891589403973703473.stgit@bahia.lab.toulouse-stg.fr.ibm.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2019-02-26 09:21:25 +11:00
Michael Roth
4b6d336f2c spapr_events: add support for phb hotplug events
Extend the existing EPOW event format we use for PCI
devices to emit PHB plug/unplug events.

Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Greg Kurz <groug@kaod.org>
Message-Id: <155059671405.1466090.535964535260503283.stgit@bahia.lab.toulouse-stg.fr.ibm.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2019-02-26 09:21:25 +11:00
Nathan Fontenot
3998ccd092 spapr: populate PHB DRC entries for root DT node
This add entries to the root OF node to advertise our PHBs as being
DR-capable in accordance with PAPR specification.

Signed-off-by: Nathan Fontenot <nfont@linux.vnet.ibm.com>
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Greg Kurz <groug@kaod.org>
Message-Id: <155059670897.1466090.10843921337591637414.stgit@bahia.lab.toulouse-stg.fr.ibm.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2019-02-26 09:21:25 +11:00
Michael Roth
962b6c3650 spapr: create DR connectors for PHBs
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Greg Kurz <groug@kaod.org>
Message-Id: <155059670389.1466090.10015601248906623076.stgit@bahia.lab.toulouse-stg.fr.ibm.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2019-02-26 09:21:25 +11:00
Greg Kurz
ef28b98d58 spapr_pci: add PHB unrealize
To support PHB hotplug we need to clean up lingering references,
memory, child properties, etc. prior to the PHB object being
finalized. Generally this will be called as a result of calling
object_unparent() on the PHB object, which in turn would normally
be called as the result of an unplug() operation.

When the PHB is finalized, child objects will be unparented in
turn, and finalized if the PHB was the only reference holder. so
we don't bother to explicitly unparent child objects of the PHB,
with the notable exception of DRCs. This is needed to avoid a QEMU
crash when unplugging a PHB and resetting the machine before the
guest could handle the event. The DRCs are removed from the QOM tree
by  pci_unregister_root_bus() and we must make sure we're not leaving
stale aliases under the global /dr-connector path.

The formula that gives the number of DMA windows is moved to an
inline function in the hw/pci-host/spapr.h header because it
will have other users.

The unrealize function is able to cope with partially realized PHBs.
It is hence used to implement proper rollback on the realize error
path.

Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
Signed-off-by: Greg Kurz <groug@kaod.org>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Message-Id: <155059669881.1466090.13515030705986041517.stgit@bahia.lab.toulouse-stg.fr.ibm.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2019-02-26 09:21:25 +11:00
Greg Kurz
ad62bff638 spapr_irq: Expose the phandle of the interrupt controller
This will be used by PHB hotplug in order to create the "interrupt-map"
property of the PHB node.

Signed-off-by: Greg Kurz <groug@kaod.org>
Message-Id: <155059669374.1466090.12943228478046223856.stgit@bahia.lab.toulouse-stg.fr.ibm.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2019-02-26 09:21:25 +11:00
Greg Kurz
743ed566c1 spapr: Expose the name of the interrupt controller node
This will be needed by PHB hotplug in order to access the "phandle"
property of the interrupt controller node.

Reviewed-by: Cédric Le Goater <clg@kaod.org>
Signed-off-by: Greg Kurz <groug@kaod.org>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Message-Id: <155059668867.1466090.6339199751719123386.stgit@bahia.lab.toulouse-stg.fr.ibm.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2019-02-26 09:21:25 +11:00
Greg Kurz
6cead90c5c xics: Write source state to KVM at claim time
The pseries machine only uses LSIs to support legacy PCI devices. Every
PHB claims 4 LSIs at realize time. When using in-kernel XICS (or upcoming
in-kernel XIVE), QEMU synchronizes the state of all irqs, including these
LSIs, later on at machine reset.

In order to support PHB hotplug, we need a way to tell KVM about the LSIs
that doesn't require a machine reset. An easy way to do that is to always
inform KVM when an interrupt is claimed, which really isn't a performance
path.

Signed-off-by: Greg Kurz <groug@kaod.org>
Message-Id: <155059668360.1466090.5969630516627776426.stgit@bahia.lab.toulouse-stg.fr.ibm.com>
Reviewed-by: Cédric Le Goater <clg@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2019-02-26 09:21:25 +11:00
Greg Kurz
09d876ce2c spapr/drc: Drop spapr_drc_attach() fdt argument
All DRC subtypes have been converted to generate the FDT fragment at
configure connector time instead of attach time. The fdt and fdt_offset
arguments of spapr_drc_attach() aren't needed anymore. Drop them and
make the implementation of the dt_populate() method mandatory.

Signed-off-by: Greg Kurz <groug@kaod.org>
Message-Id: <155059667853.1466090.16527852453054217565.stgit@bahia.lab.toulouse-stg.fr.ibm.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2019-02-26 09:21:25 +11:00
Greg Kurz
46fd02990d spapr/pci: Generate FDT fragment at configure connector time
Signed-off-by: Greg Kurz <groug@kaod.org>
Message-Id: <155059667346.1466090.326696113231137772.stgit@bahia.lab.toulouse-stg.fr.ibm.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2019-02-26 09:21:25 +11:00
Greg Kurz
345b12b99e spapr: Generate FDT fragment for CPUs at configure connector time
Signed-off-by: Greg Kurz <groug@kaod.org>
Message-Id: <155059666839.1466090.3833376527523126752.stgit@bahia.lab.toulouse-stg.fr.ibm.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2019-02-26 09:21:25 +11:00
Greg Kurz
62d38c9bd3 spapr: Generate FDT fragment for LMBs at configure connector time
Signed-off-by: Greg Kurz <groug@kaod.org>
Message-Id: <155059666331.1466090.6766540766297333313.stgit@bahia.lab.toulouse-stg.fr.ibm.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2019-02-26 09:21:25 +11:00
Greg Kurz
d9c95c71ac spapr_drc: Allow FDT fragment to be added later
The current logic is to provide the FDT fragment when attaching a device
to a DRC. This works perfectly fine for our current hotplug support, but
soon we will add support for PHB hotplug which has some constraints, that
CPU, PCI and LMB devices don't seem to have.

The first constraint is that the "ibm,dma-window" property of the PHB
node requires the IOMMU to be configured, ie, spapr_tce_table_enable()
has been called, which happens during PHB reset. It is okay in the case
of hotplug since the device is reset before the hotplug handler is
called. On the contrary with coldplug, the hotplug handler is called
first and device is only reset during the initial system reset. Trying
to create the FDT fragment on the hotplug path in this case, would
result in somthing like this:

ibm,dma-window = < 0x80000000 0x00 0x00 0x00 0x00 >;

This will cause linux in the guest to panic, by simply removing and
re-adding the PHB using the drmgr command:

	page = alloc_pages_node(nid, GFP_KERNEL, get_order(sz));
	if (!page)
		panic("iommu_init_table: Can't allocate %ld bytes\n", sz);

The second and maybe more problematic constraint is that the
"interrupt-map" property needs to reference the interrupt controller
node using the very same phandle that SLOF has already exposed to the
guest. QEMU requires SLOF to call the private KVMPPC_H_UPDATE_DT hcall
at some point to know about this phandle. With the latest QEMU and SLOF,
this happens when SLOF gets quiesced. This means that if the PHB gets
hotplugged after CAS but before SLOF quiesce, then we're sure that the
phandle is not known when the hotplug handler is called.

The FDT is only needed when the guest first invokes RTAS to configure
the connector actually, long after SLOF quiesce. Let's postpone the
creation of FDT fragments for PHBs to rtas_ibm_configure_connector().

Since we only need this for PHBs, introduce a new method in the base
DRC class for that. DRC subtypes will be converted to use it in
subsequent patches.

Allow spapr_drc_attach() to be passed a NULL fdt argument if the method
is available. When all DRC subtypes have been converted, the fdt argument
will eventually disappear.

Signed-off-by: Greg Kurz <groug@kaod.org>
Message-Id: <155059665823.1466090.18358845122627355537.stgit@bahia.lab.toulouse-stg.fr.ibm.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2019-02-26 09:21:25 +11:00
Benjamin Herrenschmidt
539c6e7358 target/ppc: Basic POWER9 bare-metal radix MMU support
No guest support yet

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Message-Id: <20190215170029.15641-13-clg@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2019-02-26 09:21:25 +11:00
Benjamin Herrenschmidt
3367c62f52 target/ppc: Support for POWER9 native hash
(Might need more patch splitting)

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Message-Id: <20190215170029.15641-12-clg@kaod.org>
[dwg: Hack to fix compile with some earlier include tweaks of mine]
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2019-02-26 09:21:25 +11:00
Benjamin Herrenschmidt
79825f4d58 target/ppc: Rename PATB/PATBE -> PATE
That "b" means "base address" and thus shouldn't be in the name
of actual entries and related constants.

This patch keeps the synthetic patb_entry field of the spapr
virtual hypervisor unchanged until I figure out if that has
an impact on the migration stream.

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Message-Id: <20190215170029.15641-11-clg@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2019-02-26 09:21:25 +11:00
Benjamin Herrenschmidt
c4dae9cd37 target/ppc: Flush the TLB locally when the LPIDR is written
Our TCG TLB only tags whether it's a HV vs a guest access, so it must
be flushed when the LPIDR is changed.

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Message-Id: <20190215170029.15641-10-clg@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2019-02-26 09:21:25 +11:00
Benjamin Herrenschmidt
74c4912f09 target/ppc: Fix synchronization of mttcg with broadcast TLB flushes
Let's use the generic helper tlb_flush_all_cpus_synced() instead
of iterating the CPUs ourselves.

We do lose the optimization of clearing the "other" CPUs "need flush"
flags but this shouldn't be a problem in practice.

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Message-Id: <20190215170029.15641-9-clg@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2019-02-26 09:21:25 +11:00
Benjamin Herrenschmidt
34525595fb target/ppc: Add basic support for "new format" HPTE as found on POWER9
POWER9 (arch v3) slightly changes the HPTE format. The B bits move
from the first to the second half of the HPTE, and the AVPN/ARPN
are slightly shorter.

However, under SPAPR, the hypercalls still take the old format
(and probably will for the foreseable future).

The simplest way to support this is thus to convert the HPTEs from
new to old format when reading them if the MMU model is v3 and there
is no virtual hypervisor, leaving the rest of the code unchanged.

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Message-Id: <20190215170029.15641-8-clg@kaod.org>
[dwg: Moved function to .c since there was no real need for it in the .h]
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2019-02-26 09:21:25 +11:00