Commit Graph

18132 Commits

Author SHA1 Message Date
Cédric Le Goater
7844e12b28 ppc/xics: use the QOM interface under the sPAPR machine
Add 'ics_get' and 'ics_resend' handlers to the sPAPR machine. These
are relatively simple for a single ICS.

Signed-off-by: Cédric Le Goater <clg@kaod.org>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2017-03-01 11:23:39 +11:00
Cédric Le Goater
51b180051e ppc/xics: introduce a XICSFabric QOM interface to handle ICSs
This interface provides two simple handlers. One is to get an ICS
(Interrupt Source Controller) object from an irq number and a second
to resend the irqs when needed.

Signed-off-by: Cédric Le Goater <clg@kaod.org>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2017-03-01 11:23:39 +11:00
Cédric Le Goater
b9038e7806 ppc/xics: add an InterruptStatsProvider interface to ICS and ICP objects
This is, again, to reduce the use of the list of ICS objects. Let's
make each individual ICS and ICP object an InterruptStatsProvider and
remove this same interface from XICSState.

The InterruptStatsProvider will be moved at the machine level after
the XICS cleanups are completed.

Signed-off-by: Cédric Le Goater <clg@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2017-03-01 11:23:39 +11:00
Cédric Le Goater
681bfaded6 ppc/xics: store the ICS object under the sPAPR machine
A list of ICS objects was introduced under the XICS object for the
PowerNV machine but, for the sPAPR machine, it brings extra complexity
as there is only a single ICS. To simplify the code, let's add the ICS
pointer under the sPAPR machine and try to reduce the use of this list
where possible.

Also, change the xics_spapr_*() routines to use an ICS object instead
of an XICSState and change their name to reflect that these are
specific to the sPAPR ICS object.

Signed-off-by: Cédric Le Goater <clg@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2017-03-01 11:23:39 +11:00
Cédric Le Goater
817bb6a446 ppc/xics: remove set_nr_servers() handler from XICSStateClass
Today, the ICP (Interrupt Controller Presenter) objects are created by
the 'nr_servers' property handler of the XICS object and a class
handler. They are realized in the XICS object realize routine.

Let's simplify the process by creating the ICP objects along with the
XICS object at the machine level.

Signed-off-by: Cédric Le Goater <clg@kaod.org>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2017-03-01 11:23:39 +11:00
Cédric Le Goater
4e4169f7a2 ppc/xics: remove set_nr_irqs() handler from XICSStateClass
Today, the ICS (Interrupt Controller Source) object is created and
realized by the init and realize routines of the XICS object, but some
of the parameters are only known at the machine level.

These parameters are passed from the sPAPR machine to the ICS object
in a rather convoluted way using property handlers and a class handler
of the XICS object. The number of irqs required to allocate the IRQ
state objects in the ICS realize routine is one of them.

Let's simplify the process by creating the ICS object along with the
XICS object at the machine level and link the ICS into the XICS list
of ICSs at this level also. In the sPAPR machine, there is only a
single ICS but that will change with the PowerNV machine.

Also, QOMify the creation of the objects and get rid of the
superfluous code.

Signed-off-by: Cédric Le Goater <clg@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2017-03-01 11:23:39 +11:00
David Gibson
738d5db824 xics: XICS should not be a SysBusDevice
Currently xics - the component of the IBM POWER interrupt controller
representing the overall interrupt fabric / architecture is
represented as a descendent of SysBusDevice.  However, this is not
really correct - the xics presents nothing in MMIO space so it should
be an "unattached" device in the current QOM model.

Since this device will always be created by the machine type, not created
specifically from the command line, and because it has no migrated state
it should be safe to move it around the device composition tree.

Therefore this patch changes it to a descendent of TYPE_DEVICE, and
makes it an unattached device.  So that its reset handler still gets
called correctly, we add a qdev_set_parent_bus() to attach it to
sysbus.  It's not really clear that's correct (instead of using
register_reset()) but it appears to a common technique.

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
[clg corrected problems with reset]
Signed-off-by: Cédric Le Goater <clg@kaod.org>
[dwg folded together and updated commit message]
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2017-03-01 11:23:39 +11:00
Greg Kurz
a8eeafda19 spapr/pci: populate PCI DT in reverse order
Since commit 1d2d974244 "spapr_pci: enumerate and add PCI device tree", QEMU
populates the PCI device tree in the opposite order compared to SLOF.

Before 1d2d974244:

Populating /pci@800000020000000
                     00 0000 (D) : 1af4 1000    virtio [ net ]
                     00 0800 (D) : 1af4 1001    virtio [ block ]
                     00 1000 (D) : 1af4 1009    virtio [ network ]
Populating /pci@800000020000000/unknown-legacy-device@2

7e5294b8 :  /pci@800000020000000
7e52b998 :  |-- ethernet@0
7e52c0c8 :  |-- scsi@1
7e52c7e8 :  +-- unknown-legacy-device@2 ok

Since 1d2d974244:

Populating /pci@800000020000000
                     00 1000 (D) : 1af4 1009    virtio [ network ]
Populating /pci@800000020000000/unknown-legacy-device@2
                     00 0800 (D) : 1af4 1001    virtio [ block ]
                     00 0000 (D) : 1af4 1000    virtio [ net ]

7e5e8118 :  /pci@800000020000000
7e5ea6a0 :  |-- unknown-legacy-device@2
7e5eadb8 :  |-- scsi@1
7e5eb4d8 :  +-- ethernet@0 ok

This behaviour change is not actually a bug since no assumptions should be
made on DT ordering. But it has no real justification either, other than
being the consequence of the way fdt_add_subnode() inserts new elements
to the front of the FDT rather than adding them to the tail.

This patch reverts to the historical SLOF ordering by walking PCI devices
in reverse order. This reconciles pseries with x86 machine types behavior.
It is expected to make things easier when porting existing applications to
power.

Signed-off-by: Greg Kurz <gkurz@linux.vnet.ibm.com>
Tested-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Nikunj A Dadhania <nikunj@linux.vnet.ibm.com>
(slight update to the changelog)
Signed-off-by: Greg Kurz <groug@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2017-03-01 11:23:39 +11:00
David Gibson
e57ca75ce3 target/ppc: Manage external HPT via virtual hypervisor
The pseries machine type implements the behaviour of a PAPR compliant
hypervisor, without actually executing such a hypervisor on the virtual
CPU.  To do this we need some hooks in the CPU code to make hypervisor
facilities get redirected to the machine instead of emulated internally.

For hypercalls this is managed through the cpu->vhyp field, which points
to a QOM interface with a method implementing the hypercall.

For the hashed page table (HPT) - also a hypervisor resource - we use an
older hack.  CPUPPCState has an 'external_htab' field which when non-NULL
indicates that the HPT is stored in qemu memory, rather than within the
guest's address space.

For consistency - and to make some future extensions easier - this merges
the external HPT mechanism into the vhyp mechanism.  Methods are added
to vhyp for the basic operations the core hash MMU code needs: map_hptes()
and unmap_hptes() for reading the HPT, store_hpte() for updating it and
hpt_mask() to retrieve its size.

To match this, the pseries machine now sets these vhyp fields in its
existing vhyp class, rather than reaching into the cpu object to set the
external_htab field.

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Suraj Jitindar Singh <sjitindarsingh@gmail.com>
2017-03-01 11:23:39 +11:00
David Gibson
36778660d7 target/ppc: Eliminate htab_base and htab_mask variables
CPUPPCState includes fields htab_base and htab_mask which store the base
address (GPA) and size (as a mask) of the guest's hashed page table (HPT).
These are set when the SDR1 register is updated.

Keeping these in sync with the SDR1 is actually a little bit fiddly, and
probably not useful for performance, since keeping them expands the size of
CPUPPCState.  It also makes some upcoming changes harder to implement.

This patch removes these fields, in favour of calculating them directly
from the SDR1 contents when necessary.

This does make a change to the behaviour of attempting to write a bad value
(invalid HPT size) to the SDR1 with an mtspr instruction.  Previously, the
bad value would be stored in SDR1 and could be retrieved with a later
mfspr, but the HPT size as used by the softmmu would be, clamped to the
allowed values.  Now, writing a bad value is treated as a no-op.  An error
message is printed in both new and old versions.

I'm not sure which behaviour, if either, matches real hardware.  I don't
think it matters that much, since it's pretty clear that if an OS writes
a bad value to SDR1, it's not going to boot.

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Alexey Kardashevskiy <aik@ozlabs.ru>
2017-03-01 11:23:39 +11:00
David Gibson
7222b94a83 target/ppc: Cleanup HPTE accessors for 64-bit hash MMU
Accesses to the hashed page table (HPT) are complicated by the fact that
the HPT could be in one of three places:
   1) Within guest memory - when we're emulating a full guest CPU at the
      hardware level (e.g. powernv, mac99, g3beige)
   2) Within qemu, but outside guest memory - when we're emulating user and
      supervisor instructions within TCG, but instead of emulating
      the CPU's hypervisor mode, we just emulate a hypervisor's behaviour
      (pseries in TCG or KVM-PR)
   3) Within the host kernel - a pseries machine using KVM-HV
      acceleration.  Mostly accesses to the HPT are handled by KVM,
      but there are a few cases where qemu needs to access it via a
      special fd for the purpose.

In order to batch accesses to the fd in case (3), we use a somewhat awkward
ppc_hash64_start_access() / ppc_hash64_stop_access() pair, which for case
(3) reads / releases several HPTEs from the kernel as a batch (usually a
whole PTEG).  For cases (1) & (2) it just returns an address value.  The
actual HPTE load helpers then need to interpret the returned token
differently in the 3 cases.

This patch keeps the same basic structure, but simplfiies the details.
First start_access() / stop_access() are renamed to map_hptes() and
unmap_hptes() to make their operation more obvious.  Second, map_hptes()
now always returns a qemu pointer, which can always be used in the same way
by the load_hpte() helpers.  In case (1) it comes from address_space_map()
in case (2) directly from qemu's HPT buffer and in case (3) from a
temporary buffer read from the KVM fd.

While we're at it, make things a bit more consistent in terms of types and
variable names: avoid variables named 'index' (it shadows index(3) which
can lead to confusing results), use 'hwaddr ptex' for HPTE indices and
uint64_t for each of the HPTE words, use ptex throughout the call stack
instead of pte_offset in some places (we still need that at the bottom
layer, but nowhere else).

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2017-03-01 11:23:39 +11:00
David Gibson
b7b0b1f13a target/ppc: Merge cpu_ppc_set_vhyp() with cpu_ppc_set_papr()
cpu_ppc_set_papr() sets up various aspects of CPU state for use with PAPR
paravirtualized guests.  However, it doesn't set the virtual hypervisor,
so callers must also call cpu_ppc_set_vhyp() so that PAPR hypercalls are
handled properly.  This is a bit silly, so fold setting the virtual
hypervisor into cpu_ppc_set_papr().

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Suraj Jitindar Singh <sjitindarsingh@gmail.com>
2017-03-01 11:23:39 +11:00
David Gibson
c6404adebf pseries: Minor cleanups to HPT management hypercalls
* Standardize on 'ptex' instead of 'pte_index' for HPTE index variables
   for consistency and brevity
 * Avoid variables named 'index'; shadowing index(3) from libc can lead to
   surprising bugs if the variable is removed, because compiler errors
   might not appear for remaining references
 * Clarify index calculations in h_enter() - we have two cases, H_EXACT
   where the exact HPTE slot is given, and !H_EXACT where we search for
   an empty slot within the hash bucket.  Make the calculation more
   consistent between the cases.

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Suraj Jitindar Singh <sjitindarsingh@gmail.com>
2017-03-01 11:23:39 +11:00
Greg Kurz
6244bb7e58 sysemu: support up to 1024 vCPUs
Some systems can already provide more than 255 hardware threads.

Bumping the QEMU limit to 1024 seems reasonable:
- it has no visible overhead in top;
- the limit itself has no effect on hot paths.

Cc: Greg Kurz <gkurz@linux.vnet.ibm.com>
Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2017-03-01 11:23:39 +11:00
Laurent Vivier
2530a1a5cf spapr: generate DT node names
When DT node names for PCI devices are generated by SLOF,
they are generated according to the type of the device
(for instance, ethernet for virtio-net-pci device).

Node name for hotplugged devices is generated by QEMU.
This patch adds the mechanic to QEMU to create the node
name according to the device type too.

The data structure has been roughly copied from OpenBIOS/OpenHackware,
node names from SLOF.

Example:

Hotplugging some PCI cards with QEMU monitor:

device_add virtio-tablet-pci
device_add virtio-serial-pci
device_add virtio-mouse-pci
device_add virtio-scsi-pci
device_add virtio-gpu-pci
device_add ne2k_pci
device_add nec-usb-xhci
device_add intel-hda

What we can see in linux device tree:

for dir in /proc/device-tree/pci@800000020000000/*@*/; do
    echo $dir
    cat $dir/name
    echo
done

WITHOUT this patch:

/proc/device-tree/pci@800000020000000/pci@0/
pci
/proc/device-tree/pci@800000020000000/pci@1/
pci
/proc/device-tree/pci@800000020000000/pci@2/
pci
/proc/device-tree/pci@800000020000000/pci@3/
pci
/proc/device-tree/pci@800000020000000/pci@4/
pci
/proc/device-tree/pci@800000020000000/pci@5/
pci
/proc/device-tree/pci@800000020000000/pci@6/
pci
/proc/device-tree/pci@800000020000000/pci@7/
pci

WITH this patch:

/proc/device-tree/pci@800000020000000/communication-controller@1/
communication-controller
/proc/device-tree/pci@800000020000000/display@4/
display
/proc/device-tree/pci@800000020000000/ethernet@5/
ethernet
/proc/device-tree/pci@800000020000000/input-controller@0/
input-controller
/proc/device-tree/pci@800000020000000/mouse@2/
mouse
/proc/device-tree/pci@800000020000000/multimedia-device@7/
multimedia-device
/proc/device-tree/pci@800000020000000/scsi@3/
scsi
/proc/device-tree/pci@800000020000000/usb-xhci@6/
usb-xhci

Signed-off-by: Laurent Vivier <lvivier@redhat.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2017-03-01 11:23:39 +11:00
Marc-André Lureau
072bdb07c5 tests: fix ptimer leaks
Spotted by ASAN.

Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
2017-03-01 00:09:28 +04:00
Peter Maydell
758af5e862 Network boot for s390x. More information (and instructions
for building a s390-netboot.img) can be found at
 http://wiki.qemu-project.org/Features/S390xNetworkBoot
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQIcBAABAgAGBQJYtV6WAAoJEN7Pa5PG8C+vip4P/0qdPwFJ7BJlbWH9os58btwH
 3fkIaGvbVbHVPyi5E9XyyiYRHfxUMdDJkQxzf2kD7HR8Sqx5Pyy1p/Fdz/XPCw6R
 vWcLAn8FkcI9pPdjI7dY7MxM6BqeeZAUwL9HmTLU1L80Rf1xxHxjbzE7hy3YeEGg
 bFPx4AEpa/MVyOV4sSpfUjsYVQiB9dPtgjjDnxVSGcU3oE2lyceQXj4UxcagxfbE
 gQis2hPqQC13pSC/jB6KPI8IW1+Y88Azf493fiGEj2MCm+Mge/9ksK+TtYuWrbV8
 e5DY73vNymJd4WXEFp1f2+2T14zpObpGOQv6O0yJMNj/UOJY5KON8qIEIe1XP7lY
 D22MiV1q+CoDnRQeN4BoNRPwGDoRPCh31BdzMOZXc6c/kyTcW5lxOBWl6SKaxlDM
 K3MoboiH9qF23VX3i2C6id9EiebDMubYmO3d8WlIHB+7digEco0VjC+vEZa18AW2
 3/fMke8mWvymeHKIbmZwm2X15QQg0s7VMjoU+iZ4l3vqjBb21BkYv/ZofqfcNe2q
 y9xp84m4YuFu8rRjPoupgexuY7N5JaxX1opExFLNyR6RUWGRWpBkaQzBbvUYmVF0
 9Ml3lXXPpo5AfXBs256KXrAZTG3oZ+eUeIWYTgWPY8ibqUCHHHMa23ZnB8UUe/2a
 rxx9PEZVcG4i73cOTclH
 =nH9Y
 -----END PGP SIGNATURE-----

Merge remote-tracking branch 'remotes/cohuck/tags/s390x-20170228' into staging

Network boot for s390x. More information (and instructions
for building a s390-netboot.img) can be found at
http://wiki.qemu-project.org/Features/S390xNetworkBoot

# gpg: Signature made Tue 28 Feb 2017 11:27:18 GMT
# gpg:                using RSA key 0xDECF6B93C6F02FAF
# gpg: Good signature from "Cornelia Huck <huckc@linux.vnet.ibm.com>"
# gpg:                 aka "Cornelia Huck <cornelia.huck@de.ibm.com>"
# Primary key fingerprint: C3D0 D66D C362 4FF6 A8C0  18CE DECF 6B93 C6F0 2FAF

* remotes/cohuck/tags/s390x-20170228:
  pc-bios/s390-ccw.img: rebuild image
  pc-bios/s390-ccw: Use the ccw bios to start the network boot
  s390x/ipl: Load network boot image
  s390x/ipl: Extend S390IPLState to support network boot
  elf-loader: Allow late loading of elf

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2017-02-28 19:52:26 +00:00
Kevin Wolf
dabd18f64c hw/block: Introduce share-rw qdev property
By default, don't allow another writer for block devices that are
attached to a guest device. For the cases where this setup is intended
(e.g. using a cluster filesystem on the disk), the new option can be
used to allow it.

This change affects only devices using DEFINE_BLOCK_PROPERTIES().
Devices directly using DEFINE_PROP_DRIVE() still accept writers
unconditionally.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Acked-by: Fam Zheng <famz@redhat.com>
2017-02-28 20:40:36 +01:00
Kevin Wolf
a17c17a274 hw/block: Request permissions
This makes all device emulations with a qdev drive property request
permissions on their BlockBackend. The only thing we block at this point
is resizing images for some devices that can't support it.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Acked-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
2017-02-28 20:40:36 +01:00
Kevin Wolf
39829a01ae block: Allow error return in BlockDevOps.change_media_cb()
Some devices allow a media change between read-only and read-write
media. They need to adapt the permissions in their .change_media_cb()
implementation, which can fail. So add an Error parameter to the
function.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Acked-by: Fam Zheng <famz@redhat.com>
2017-02-28 20:40:36 +01:00
Kevin Wolf
d7086422b1 block: Add error parameter to blk_insert_bs()
Now that blk_insert_bs() requests the BlockBackend permissions for the
node it attaches to, it can fail. Instead of aborting, pass the errors
to the callers.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Acked-by: Fam Zheng <famz@redhat.com>
2017-02-28 20:40:36 +01:00
Kevin Wolf
6d0eb64d5c block: Add permissions to blk_new()
We want every user to be specific about the permissions it needs, so
we'll pass the initial permissions as parameters to blk_new(). A user
only needs to call blk_set_perm() if it wants to change the permissions
after the fact.

The permissions are stored in the BlockBackend and applied whenever a
BlockDriverState should be attached in blk_insert_bs().

This does not include actually choosing the right set of permissions
everywhere yet. Instead, the usual FIXME comment is added to each place
and will be addressed in individual patches.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Acked-by: Fam Zheng <famz@redhat.com>
2017-02-28 20:40:36 +01:00
Peter Maydell
9514f2648c This pull request brings:
- a fix to a minor bug reported by Coverity
 - throttling support in the local backend (command line only)
 -----BEGIN PGP SIGNATURE-----
 
 iEYEABECAAYFAli1Q64ACgkQAvw66wEB28I5yQCePbLPSOtHO4LJGc2E973L7vH2
 hQIAnReLFevyNN6BpivucP2/0YmAIKSi
 =uTYd
 -----END PGP SIGNATURE-----

Merge remote-tracking branch 'remotes/gkurz/tags/for-upstream' into staging

This pull request brings:
- a fix to a minor bug reported by Coverity
- throttling support in the local backend (command line only)

# gpg: Signature made Tue 28 Feb 2017 09:32:30 GMT
# gpg:                using DSA key 0x02FC3AEB0101DBC2
# gpg: Good signature from "Greg Kurz <groug@kaod.org>"
# gpg:                 aka "Greg Kurz <groug@free.fr>"
# gpg:                 aka "Greg Kurz <gkurz@linux.vnet.ibm.com>"
# gpg:                 aka "Gregory Kurz (Groug) <groug@free.fr>"
# gpg:                 aka "[jpeg image of size 3330]"
# gpg: WARNING: This key is not certified with a trusted signature!
# gpg:          There is no indication that the signature belongs to the owner.
# Primary key fingerprint: 2BD4 3B44 535E C0A7 9894  DBA2 02FC 3AEB 0101 DBC2

* remotes/gkurz/tags/for-upstream:
  throttle: factor out duplicate code
  fsdev: add IO throttle support to fsdev devices
  9pfs: fix v9fs_lock error case

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2017-02-28 17:39:49 +00:00
Clement Deschamps
1eeb5c7dea bcm2835: add sdhost and gpio controllers
This adds the bcm2835_sdhost and bcm2835_gpio to the BCM2835 platform.

For supporting the SD controller selection (alternate function of GPIOs
48-53), the bcm2835_gpio now exposes an sdbus.
It also has a link to both the sdbus of sdhci and sdhost controllers,
and the card is reparented from one bus to another when the alternate
function of GPIOs 48-53 is modified.

Signed-off-by: Clement Deschamps <clement.deschamps@antfield.fr>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 1488293711-14195-5-git-send-email-peter.maydell@linaro.org
Message-id: 20170224164021.9066-5-clement.deschamps@antfield.fr
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2017-02-28 17:10:00 +00:00
Clement Deschamps
d72fc9dcb1 bcm2835_gpio: add bcm2835 gpio controller
This adds the BCM2835 GPIO controller.

It currently implements:
- The 54 GPIOs as outputs (qemu_irq)
- The SD controller selection via alternate function of GPIOs 48-53

Signed-off-by: Clement Deschamps <clement.deschamps@antfield.fr>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 1488293711-14195-4-git-send-email-peter.maydell@linaro.org
Message-id: 20170224164021.9066-4-clement.deschamps@antfield.fr
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2017-02-28 17:10:00 +00:00
Clement Deschamps
97fb87cc5d hw/sd: add card-reparenting function
Provide a new function sdbus_reparent_card() in sd core for reparenting
a card from a SDBus to another one.

This function is required by the raspi platform, where the two SD
controllers can be dynamically switched.

Signed-off-by: Clement Deschamps <clement.deschamps@antfield.fr>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 1488293711-14195-3-git-send-email-peter.maydell@linaro.org
Message-id: 20170224164021.9066-3-clement.deschamps@antfield.fr
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
[PMM: added a doc comment to the header file; changed to
 use new behaviour of qdev_set_parent_bus()]
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2017-02-28 17:10:00 +00:00
Peter Maydell
91c968ac72 qdev: Have qdev_set_parent_bus() handle devices already on a bus
Instead of qdev_set_parent_bus() silently doing the wrong
thing if it's handed a device that's already on a bus,
have it remove the device from the old bus and add it to
the new one. This is useful for the raspi2 sdcard.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Message-id: 1488293711-14195-2-git-send-email-peter.maydell@linaro.org
2017-02-28 17:10:00 +00:00
Vijaya Kumar K
07a5628cb8 hw/intc/arm_gicv3_kvm: Reset GICv3 cpu interface registers
Reset CPU interface registers of GICv3 when CPU is reset.
For this, ARMCPRegInfo struct is registered with one ICC
register whose resetfn is called when cpu is reset.

All the ICC registers are reset under one single register
reset function instead of calling resetfn for each ICC
register.

Signed-off-by: Vijaya Kumar K <Vijaya.Kumar@cavium.com>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Message-id: 1487850673-26455-6-git-send-email-vijay.kilari@gmail.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2017-02-28 17:10:00 +00:00
Vijaya Kumar K
d3a3e52962 target-arm: Add GICv3CPUState in CPUARMState struct
Add gicv3state void pointer to CPUARMState struct
to store GICv3CPUState.

In case of usecase like CPU reset, we need to reset
GICv3CPUState of the CPU. In such scenario, this pointer
becomes handy.

Signed-off-by: Vijaya Kumar K <Vijaya.Kumar@cavium.com>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Message-id: 1487850673-26455-5-git-send-email-vijay.kilari@gmail.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2017-02-28 17:10:00 +00:00
Vijaya Kumar K
367b9f527b hw/intc/arm_gicv3_kvm: Implement get/put functions
This actually implements pre_save and post_load methods for in-kernel
vGICv3.

Signed-off-by: Pavel Fedin <p.fedin@samsung.com>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Vijaya Kumar K <Vijaya.Kumar@cavium.com>
Message-id: 1487850673-26455-4-git-send-email-vijay.kilari@gmail.com
[PMM:
 * use decimal, not 0bnnn
 * fixed typo in names of ICC_APR0R_EL1 and ICC_AP1R_EL1
 * completely rearranged the get and put functions to read and write
   the state in a natural order, rather than mixing distributor and
   redistributor state together]
Signed-off-by: Vijaya Kumar K <Vijaya.Kumar@cavium.com>
[Vijay:
 * Update macro KVM_VGIC_ATTR
 * Use 32 bit access for gicd and gicr
 * GICD_IROUTER, GICD_TYPER, GICR_PROPBASER and GICR_PENDBASER reg
   access  are changed from 64-bit to 32-bit access
 * Add ICC_SRE_EL1 save and restore
 * Dropped translate_fn mechanism and coded functions to handle
   save and restore of edge_trigger and priority
 * Number of APnR register saved/restored based on number of
   priority bits supported]
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
2017-02-28 17:10:00 +00:00
Vijaya Kumar K
6692aac411 hw/intc/arm_gicv3_kvm: Add ICC_SRE_EL1 register to vmstate
To Save and Restore ICC_SRE_EL1 register introduce vmstate
subsection and load only if non-zero.
Also initialize icc_sre_el1 with to 0x7 in pre_load
function.

Signed-off-by: Vijaya Kumar K <Vijaya.Kumar@cavium.com>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Message-id: 1487850673-26455-3-git-send-email-vijay.kilari@gmail.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2017-02-28 17:10:00 +00:00
Peter Maydell
7d1730b7d9 trivial patches for 2017-02-28
-----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQEcBAABCAAGBQJYtRwrAAoJEHAbT2saaT5ZQSQIAKWIXrxhIGO6hGEDc50YL6x6
 tQMOnPQOulLtS76rGDAZrJwc47wqpXUtBCuevgwwqbxraLHF4LRnMf0I+xSR+lTt
 PF9vmgDgB4BVDpSTqphjaCBccXPYPqXzUtYaDcT6xePy8aB+/40nqsnby5hf+BXT
 zNpZZrn23papmftS3LnZ5j/lKNIsIlS/v5WIy8xNK0pBTKx4W1ZzDWrYq8crqW+v
 NqQSoVbNOEHOt1+C+nEX6gxUnY6rJXAVB0ICT0fSY9NRhFjPeu1Fx6EtCRNaluXm
 zOZ7t4kTjpB7IcHy3lqDTaWV6VVwBFUym5pfwjRLcge4ln+a5O454+/i8mxerfo=
 =jvkw
 -----END PGP SIGNATURE-----

Merge remote-tracking branch 'remotes/mjt/tags/trivial-patches-fetch' into staging

trivial patches for 2017-02-28

# gpg: Signature made Tue 28 Feb 2017 06:43:55 GMT
# gpg:                using RSA key 0x701B4F6B1A693E59
# gpg: Good signature from "Michael Tokarev <mjt@tls.msk.ru>"
# gpg:                 aka "Michael Tokarev <mjt@corpit.ru>"
# gpg:                 aka "Michael Tokarev <mjt@debian.org>"
# Primary key fingerprint: 6EE1 95D1 886E 8FFB 810D  4324 457C E0A0 8044 65C5
#      Subkey fingerprint: 7B73 BAD6 8BE7 A2C2 8931  4B22 701B 4F6B 1A69 3E59

* remotes/mjt/tags/trivial-patches-fetch:
  syscall: fixed mincore(2) not failing with ENOMEM
  hw/acpi/tco.c: fix tco timer stop
  lm32: milkymist-tmu2: fix a third integer overflow
  qemu-options.hx: add missing id=chr0 chardev argument in vhost-user example
  Update copyright year
  tests/prom-env: Enable the test for the sun4u machine, too
  cadence_gem: Remove unused parameter debug message
  register: fix incorrect read mask
  ide: remove undefined behavior in ide-test
  CODING_STYLE: Mention preferred comment form
  hw/core/register: Mark the device with cannot_instantiate_with_device_add_yet
  hw/core/or-irq: Mark the device with cannot_instantiate_with_device_add_yet
  softfloat: Use correct type in float64_to_uint64_round_to_zero()
  target/s390x: Fix typo

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2017-02-28 16:22:41 +00:00
Peter Maydell
8a85e0654e stm32f205: Rename 'nvic' local to 'armv7m'
The local variable 'nvic' in stm32f205_soc_realize() no longer
holds a direct pointer to the NVIC device; it is a pointer to
the ARMv7M container object. Rename it 'armv7m' accordingly.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Alistair Francis <alistair.francis@xilinx.com>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Message-id: 1487604965-23220-12-git-send-email-peter.maydell@linaro.org
2017-02-28 16:18:49 +00:00
Peter Maydell
b72e2f6856 stm32f205: Create armv7m object without using armv7m_init()
Switch the stm32f205 SoC to create the armv7m object directly
rather than via the armv7m_init() wrapper. This fits better
with the SoC model's very QOMified design.

In particular this means we can push loading the guest image
out to the top level board code where it belongs, rather
than the SoC object having a QOM property for the filename
to load.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Alistair Francis <alistair.francis@xilinx.com>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Message-id: 1487604965-23220-11-git-send-email-peter.maydell@linaro.org
2017-02-28 16:18:49 +00:00
Peter Maydell
ff68dacbc7 armv7m: Split systick out from NVIC
The SysTick timer isn't really part of the NVIC proper;
we just modelled it that way back when we couldn't
easily have devices that only occupied a small chunk
of a memory region. Split it out into its own device.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 1487604965-23220-10-git-send-email-peter.maydell@linaro.org
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
2017-02-28 16:18:49 +00:00
Peter Maydell
743eb70560 armv7m: Don't put core v7M devices under CONFIG_STELLARIS
The NVIC is a core v7M device that exists for all v7M CPUs;
put it under a CONFIG_ARM_V7M rather than hiding it under
CONFIG_STELLARIS.

(We'll use CONFIG_ARM_V7M for the SysTick device too
when we split it out of the NVIC.)

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Message-id: 1487604965-23220-9-git-send-email-peter.maydell@linaro.org
2017-02-28 16:18:49 +00:00
Peter Maydell
f68d881c9b armv7m: Make bitband device take the address space to access
Instead of the bitband device doing a cpu_physical_memory_read/write,
make it take a MemoryRegion which specifies where it should be
accessing, and use address_space_read/write to access the
corresponding AddressSpace.

Since this entails pretty much a rewrite, convert away from
old_mmio in the process.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Message-id: 1487604965-23220-8-git-send-email-peter.maydell@linaro.org
2017-02-28 16:18:49 +00:00
Peter Maydell
98957a94ef armv7m: Make NVIC expose a memory region rather than mapping itself
Make the NVIC device expose a memory region for its users
to map, rather than mapping itself into the system memory
space on realize, and get the one user (the ARMv7M object)
to do this.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Message-id: 1487604965-23220-7-git-send-email-peter.maydell@linaro.org
2017-02-28 16:18:49 +00:00
Peter Maydell
618119c2d3 armv7m: Make ARMv7M object take memory region link
Make the ARMv7M object take a memory region link which it uses
to wire up the bitband rather than having them always put
themselves in the system address space.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Message-id: 1487604965-23220-6-git-send-email-peter.maydell@linaro.org
2017-02-28 16:18:49 +00:00
Peter Maydell
21e0c38fe2 armv7m: Use QOMified armv7m object in armv7m_init()
Make the legacy armv7m_init() function use the newly QOMified
armv7m object rather than doing everything by hand.

We can return the armv7m object rather than the NVIC from
armv7m_init() because its interface to the rest of the
board (GPIOs, etc) is identical.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Alistair Francis <alistair.francis@xilinx.com>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Message-id: 1487604965-23220-5-git-send-email-peter.maydell@linaro.org
2017-02-28 16:18:49 +00:00
Peter Maydell
56b7c66f49 armv7m: QOMify the armv7m container
Create a proper QOM object for the armv7m container, which
holds the CPU, the NVIC and the bitband regions.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Message-id: 1487604965-23220-4-git-send-email-peter.maydell@linaro.org
2017-02-28 16:18:49 +00:00
Peter Maydell
6bf436cf9d armv7m: Move NVICState struct definition into header
Move the NVICState struct definition into a header, so we can
embed it into other QOM objects like SoCs.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Message-id: 1487604965-23220-3-git-send-email-peter.maydell@linaro.org
2017-02-28 16:18:49 +00:00
Peter Maydell
3651c28569 armv7m: Abstract out the "load kernel" code
Abstract the "load kernel" code out of armv7m_init() into its own
function.  This includes the registration of the CPU reset function,
to parallel how we handle this for A profile cores.

We make the function public so that boards which choose to
directly instantiate an ARMv7M device object can call it.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Alistair Francis <alistair.francis@xilinx.com>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Message-id: 1487604965-23220-2-git-send-email-peter.maydell@linaro.org
2017-02-28 16:18:49 +00:00
Krzysztof Kozlowski
f3a6339a5b hw/arm/exynos: Fix proper mapping of CPUs by providing real cluster ID
The Exynos4210 has cluster ID 0x9 in its MPIDR register (raw value
0x8000090x).  If this cluster ID is not provided, then Linux kernel
cannot map DeviceTree nodes to MPIDR values resulting in kernel
warning and lack of any secondary CPUs:

    DT missing boot CPU MPIDR[23:0], fall back to default cpu_logical_map
    ...
    smp: Bringing up secondary CPUs ...
    smp: Brought up 1 node, 1 CPU
    SMP: Total of 1 processors activated (24.00 BogoMIPS).

Provide a cluster ID so Linux will see proper MPIDR and will try to
bring the secondary CPU online.

Signed-off-by: Krzysztof Kozlowski <krzk@kernel.org>
Message-id: 20170226200142.31169-2-krzk@kernel.org
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2017-02-28 12:08:20 +00:00
Krzysztof Kozlowski
1e0228fd20 hw/arm/exynos: Fix Linux kernel division by zero for PLLs
Without any clock controller, the Linux kernel was hitting division by
zero during boot or with clk_summary:
[    0.000000] [<c031054c>] (unwind_backtrace) from [<c030ba6c>] (show_stack+0x10/0x14)
[    0.000000] [<c030ba6c>] (show_stack) from [<c05b2660>] (dump_stack+0x88/0x9c)
[    0.000000] [<c05b2660>] (dump_stack) from [<c05b11a4>] (Ldiv0+0x8/0x10)
[    0.000000] [<c05b11a4>] (Ldiv0) from [<c06ad1e0>] (samsung_pll45xx_recalc_rate+0x58/0x74)
[    0.000000] [<c06ad1e0>] (samsung_pll45xx_recalc_rate) from [<c0692ec0>] (clk_register+0x39c/0x63c)
[    0.000000] [<c0692ec0>] (clk_register) from [<c125d360>] (samsung_clk_register_pll+0x2e0/0x3d4)
[    0.000000] [<c125d360>] (samsung_clk_register_pll) from [<c125d7e8>] (exynos4_clk_init+0x1b0/0x5e4)
[    0.000000] [<c125d7e8>] (exynos4_clk_init) from [<c12335f4>] (of_clk_init+0x17c/0x210)
[    0.000000] [<c12335f4>] (of_clk_init) from [<c1204700>] (time_init+0x24/0x2c)
[    0.000000] [<c1204700>] (time_init) from [<c1200b2c>] (start_kernel+0x24c/0x38c)
[    0.000000] [<c1200b2c>] (start_kernel) from [<4020807c>] (0x4020807c)

Provide stub for clock controller returning reset values for PLLs.

Signed-off-by: Krzysztof Kozlowski <krzk@kernel.org>
Message-id: 20170226200142.31169-1-krzk@kernel.org
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2017-02-28 12:08:20 +00:00
Clement Deschamps
43ddc182e2 bcm2835_sdhost: add bcm2835 sdhost controller
This adds the BCM2835 SDHost controller from Arasan.

Signed-off-by: Clement Deschamps <clement.deschamps@antfield.fr>
Message-id: 20170224164021.9066-2-clement.deschamps@antfield.fr
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2017-02-28 12:08:19 +00:00
Peter Maydell
5db53e353d armv7m: Allow SHCSR writes to change pending and active bits
Implement the NVIC SHCSR write behaviour which allows pending and
active status of some exceptions to be changed.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
2017-02-28 12:08:19 +00:00
Peter Maydell
aa488fe3bb armv7m: Check exception return consistency
Implement the exception return consistency checks
described in the v7M pseudocode ExceptionReturn().

Inspired by a patch from Michael Davidsaver's series, but
this is a reimplementation from scratch based on the
ARM ARM pseudocode.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
2017-02-28 12:08:19 +00:00
Michael Davidsaver
14790f730a armv7m: VECTCLRACTIVE and VECTRESET are UNPREDICTABLE
The VECTCLRACTIVE and VECTRESET bits in the AIRCR are both
documented as UNPREDICTABLE if you write a 1 to them when
the processor is not halted in Debug state (ie stopped
and under the control of an external JTAG debugger).
Since we don't implement Debug state or emulated JTAG
these bits are always UNPREDICTABLE for us. Instead of
logging them as unimplemented we can simply log writes
as guest errors and ignore them.

Signed-off-by: Michael Davidsaver <mdavidsaver@gmail.com>
[PMM: change extracted from another patch; commit message
 constructed from scratch]
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
2017-02-28 12:08:18 +00:00
Peter Maydell
a5d8235545 armv7m: Remove unused armv7m_nvic_acknowledge_irq() return value
Having armv7m_nvic_acknowledge_irq() return the new value of
env->v7m.exception and its one caller assign the return value
back to env->v7m.exception is pointless. Just make the return
type void instead.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
2017-02-28 12:08:18 +00:00
Michael Davidsaver
a73c98e159 armv7m: Escalate exceptions to HardFault if necessary
The v7M exception architecture requires that if a synchronous
exception cannot be taken immediately (because it is disabled
or at too low a priority) then it should be escalated to
HardFault (and the HardFault exception is then taken).
Implement this escalation logic.

Signed-off-by: Michael Davidsaver <mdavidsaver@gmail.com>
[PMM: extracted from another patch]
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
2017-02-28 12:08:17 +00:00
Michael Davidsaver
7c14b3ac07 arm: gic: Remove references to NVIC
Now that the NVIC is its own separate implementation, we can
clean up the GIC code by removing REV_NVIC and conditionals
which use it.

Signed-off-by: Michael Davidsaver <mdavidsaver@gmail.com>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
2017-02-28 12:08:17 +00:00
Peter Maydell
7ecdaa4a96 armv7m: Fix condition check for taking exceptions
The M profile condition for when we can take a pending exception or
interrupt is not the same as that for A/R profile.  The code
originally copied from the A/R profile version of the
cpu_exec_interrupt function only worked by chance for the
very simple case of exceptions being masked by PRIMASK.
Replace it with a call to a function in the NVIC code that
correctly compares the priority of the pending exception
against the current execution priority of the CPU.

[Michael Davidsaver's patchset had a patch to do something
similar but the implementation ended up being a rewrite.]

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
2017-02-28 12:08:17 +00:00
Michael Davidsaver
da6d674e50 armv7m: Rewrite NVIC to not use any GIC code
Despite some superficial similarities of register layout, the
M-profile NVIC is really very different from the A-profile GIC.
Our current attempt to reuse the GIC code means that we have
significant bugs in our NVIC.

Implement the NVIC as an entirely separate device, to give
us somewhere we can get the behaviour correct.

This initial commit does not attempt to implement exception
priority escalation, since the GIC-based code didn't either.
It does fix a few bugs in passing:
 * ICSR.RETTOBASE polarity was wrong and didn't account for
   internal exceptions
 * ICSR.VECTPENDING was 16 too high if the pending exception
   was for an external interrupt
 * UsageFault, BusFault and MemFault were not disabled on reset
   as they are supposed to be

Signed-off-by: Michael Davidsaver <mdavidsaver@gmail.com>
[PMM: reworked, various bugs and stylistic cleanups]
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
2017-02-28 12:08:17 +00:00
Peter Maydell
1004102a77 armv7m: Implement reading and writing of PRIGROUP
Add a state field for the v7M PRIGROUP register and implent
reading and writing it. The current NVIC doesn't honour
the values written, but the new version will.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
2017-02-28 12:08:16 +00:00
Peter Maydell
f797c07507 armv7m: Rename nvic_state to NVICState
Rename the nvic_state struct to NVICState, to match
our naming conventions.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
2017-02-28 12:08:16 +00:00
Kurban Mallachiev
c98c9eba88 ARM i.MX timers: fix reset handling
The i.MX timer device can be reset by writing to the SWR bit
of the CR register. This has to behave differently from hard
(power-on) reset because it does not reset all of the bits
in the CR register.

We were incorrectly implementing soft reset and hard reset
the same way, and in addition had a logic error which meant
that we were clearing the bits that soft-reset is supposed
to preserve and not touching the bits that soft-reset clears.
This was not correct behaviour for either kind of reset.

Separate out the soft reset and hard reset code paths, and
correct the handling of reset of the CR register so that it
is correct in both cases.

Signed-off-by: Kurban Mallachiev <mallachiev@ispras.ru>
[PMM: rephrased commit message, spacing on operators;
 use bool rather than int for is_soft_reset]
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2017-02-28 12:08:16 +00:00
Eric Auger
ccc11b0279 hw/arm/virt: Add a user option to disallow ITS instantiation
In 2.9 ITS will block save/restore and migration use cases. As such,
let's introduce a user option that allows to turn its instantiation
off, along with GICv3. With the "its" option turned false, migration
will be possible, obviously at the expense of MSI support (with GICv3).

Signed-off-by: Eric Auger <eric.auger@redhat.com>
Message-id: 1487681108-14452-1-git-send-email-eric.auger@redhat.com
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2017-02-28 12:08:16 +00:00
Igor Mammedov
dbb74759fa hw/arm/virt: fix cpu object reference leak
object_new(FOO) returns an object with ref_cnt == 1
and following
  object_property_set_bool(cpuobj, true, "realized", NULL)
set parent of cpuobj to '/machine/unattached' which makes
ref_cnt == 2.

Since machvirt_init() doesn't take ownership of cpuobj
returned by object_new() it should explicitly drop
reference to cpuobj when dangling pointer is about to
go out of scope like it's done pc_new_cpu() to avoid
object leak.

Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Message-id: 1487253461-269218-1-git-send-email-imammedo@redhat.com
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2017-02-28 12:08:15 +00:00
Prasad J Pandit
241999bf4c sd: sdhci: Remove block count enable check in single block transfers
In SDHCI protocol, the 'Block count enable' bit of the Transfer
Mode register is relevant only in multi block transfers. We need
not check it in single block transfers.

Signed-off-by: Prasad J Pandit <pjp@fedoraproject.org>
Message-id: 20170214185225.7994-5-ppandit@redhat.com
Reviewed-by: Alistair Francis <alistair.francis@xilinx.com>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2017-02-28 12:08:15 +00:00
Prasad J Pandit
45ba9f761b sd: sdhci: conditionally invoke multi block transfer
In sdhci_write invoke multi block transfer if it is enabled
in the transfer mode register 's->trnmod'.

Signed-off-by: Prasad J Pandit <pjp@fedoraproject.org>
Message-id: 20170214185225.7994-4-ppandit@redhat.com
Reviewed-by: Alistair Francis <alistair.francis@xilinx.com>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2017-02-28 12:08:14 +00:00
Prasad J Pandit
6e86d90352 sd: sdhci: check transfer mode register in multi block transfer
In the SDHCI protocol, the transfer mode register value
is used during multi block transfer to check if block count
register is enabled and should be updated. Transfer mode
register could be set such that, block count register would
not be updated, thus leading to an infinite loop. Add check
to avoid it.

Reported-by: Wjjzhang <wjjzhang@tencent.com>
Reported-by: Jiang Xin <jiangxin1@huawei.com>
Signed-off-by: Prasad J Pandit <pjp@fedoraproject.org>
Message-id: 20170214185225.7994-3-ppandit@redhat.com
Reviewed-by: Alistair Francis <alistair.francis@xilinx.com>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2017-02-28 12:08:14 +00:00
Prasad J Pandit
8b20aefac4 sd: sdhci: mask transfer mode register value
In SDHCI protocol, the transfer mode register is defined
to be of 6 bits. Mask its value with '0x0037' so that an
invalid value could not be assigned.

Signed-off-by: Prasad J Pandit <pjp@fedoraproject.org>
Reviewed-by: Alistair Francis <alistair.francis@xilinx.com>
Message-id: 20170214185225.7994-2-ppandit@redhat.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2017-02-28 12:08:14 +00:00
Peter Maydell
373442ea3a bcm2835_rng: Use qcrypto_random_bytes() rather than rand()
Switch to using qcrypto_random_bytes() rather than rand() as
our source of randomness for the BCM2835 RNG.

If qcrypto_random_bytes() fails, we don't want to return the guest a
non-random value in case they're really using it for cryptographic
purposes, so the best we can do is a fatal error.  This shouldn't
happen unless something's broken, though.

In theory we could implement this device's full FIFO and interrupt
semantics and then just stop filling the FIFO.  That's a lot of work,
though, and doesn't really give a very nice diagnostic to the user
since the guest will just seem to hang.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Daniel P. Berrange <berrange@redhat.com>
2017-02-28 12:08:14 +00:00
Marcin Chojnacki
54a5ba13a9 target-arm: Implement BCM2835 hardware RNG
Recent vanilla Raspberry Pi kernels started to make use of
the hardware random number generator in BCM2835 SoC. As a
result, those kernels wouldn't work anymore under QEMU
but rather just freeze during the boot process.

This patch implements a trivial BCM2835 compatible RNG,
and adds it as a peripheral to BCM2835 platform, which
allows to boot a vanilla Raspberry Pi kernel under Qemu.

Changes since v1:
 * Prevented guest from writing [31..20] bits in rng_status
 * Removed redundant minimum_version_id_old
 * Added field entries for the state
 * Changed realize function to reset

Signed-off-by: Marcin Chojnacki <marcinch7@gmail.com>
Message-id: 20170210210857.47893-1-marcinch7@gmail.com
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2017-02-28 12:08:13 +00:00
Ashijeet Acharya
7562f90707 migrate: Introduce a 'dc->vmsd' check to avoid segfault for --only-migratable
Commit a3a3d8c7 introduced a segfault bug while checking for
'dc->vmsd->unmigratable' which caused QEMU to crash when trying to add
devices which do no set their 'dc->vmsd' yet while initialization.
Place a 'dc->vmsd' check prior to it so that we do not segfault for
such devices.

NOTE: This doesn't compromise the functioning of --only-migratable
option as all the unmigratable devices do set their 'dc->vmsd'.

Introduce a new function check_migratable() and move the
only_migratable check inside it, also use stubs to avoid user-mode qemu
build failures.

Signed-off-by: Ashijeet Acharya <ashijeetacharya@gmail.com>
Message-Id: <1487009088-23891-1-git-send-email-ashijeetacharya@gmail.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
2017-02-28 11:30:22 +00:00
Farhan Ali
f38b5b7fc4 s390x/ipl: Load network boot image
Load the network boot image into guest RAM when the boot
device selected is a network device. Use some of the reserved
space in IplBlockCcw to store the start address of the netboot
image.

A user could also use 'chreipl'(diag 308/5) to change the boot device.
So every time we update the IPLB, we need to verify if the selected
boot device is a network device so we can appropriately load the
network boot image.

Signed-off-by: Farhan Ali <alifm@linux.vnet.ibm.com>
Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
2017-02-28 12:04:48 +01:00
Farhan Ali
5f31ade055 s390x/ipl: Extend S390IPLState to support network boot
Add new field to S390IPLState to store the name of the network boot
loader.

Signed-off-by: Farhan Ali <alifm@linux.vnet.ibm.com>
Reviewed-by: Christian Borntraeger <borntraeger@de.ibm.com>
Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
2017-02-28 12:04:48 +01:00
Farhan Ali
34f1b23f8a elf-loader: Allow late loading of elf
The current QEMU ROM infrastructure rejects late loading of ROMs.
And ELFs are currently loaded as ROM, this prevents delayed loading
of ELFs. So when loading ELF, allow the user to specify if ELF should
be loaded as ROM or not.

If an ELF is not loaded as ROM, then they are not restored on a
guest reboot/reset and so its upto the user to handle the reloading.

Signed-off-by: Farhan Ali <alifm@linux.vnet.ibm.com>
Reviewed-by: Christian Borntraeger <borntraeger@de.ibm.com>
Cc: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
2017-02-28 12:04:48 +01:00
Greg Kurz
c23d5f1d5b 9pfs: local: drop unused code
Now that the all callbacks have been converted to use "at" syscalls, we
can drop this code.

Signed-off-by: Greg Kurz <groug@kaod.org>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
2017-02-28 11:21:15 +01:00
Greg Kurz
a565fea565 9pfs: local: open2: don't follow symlinks
The local_open2() callback is vulnerable to symlink attacks because it
calls:

(1) open() which follows symbolic links for all path elements but the
    rightmost one
(2) local_set_xattr()->setxattr() which follows symbolic links for all
    path elements
(3) local_set_mapped_file_attr() which calls in turn local_fopen() and
    mkdir(), both functions following symbolic links for all path
    elements but the rightmost one
(4) local_post_create_passthrough() which calls in turn lchown() and
    chmod(), both functions also following symbolic links

This patch converts local_open2() to rely on opendir_nofollow() and
mkdirat() to fix (1), as well as local_set_xattrat(),
local_set_mapped_file_attrat() and local_set_cred_passthrough() to
fix (2), (3) and (4) respectively. Since local_open2() already opens
a descriptor to the target file, local_set_cred_passthrough() is
modified to reuse it instead of opening a new one.

The mapped and mapped-file security modes are supposed to be identical,
except for the place where credentials and file modes are stored. While
here, we also make that explicit by sharing the call to openat().

This partly fixes CVE-2016-9602.

Signed-off-by: Greg Kurz <groug@kaod.org>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
2017-02-28 11:21:15 +01:00
Greg Kurz
3f3a16990b 9pfs: local: mkdir: don't follow symlinks
The local_mkdir() callback is vulnerable to symlink attacks because it
calls:

(1) mkdir() which follows symbolic links for all path elements but the
    rightmost one
(2) local_set_xattr()->setxattr() which follows symbolic links for all
    path elements
(3) local_set_mapped_file_attr() which calls in turn local_fopen() and
    mkdir(), both functions following symbolic links for all path
    elements but the rightmost one
(4) local_post_create_passthrough() which calls in turn lchown() and
    chmod(), both functions also following symbolic links

This patch converts local_mkdir() to rely on opendir_nofollow() and
mkdirat() to fix (1), as well as local_set_xattrat(),
local_set_mapped_file_attrat() and local_set_cred_passthrough() to
fix (2), (3) and (4) respectively.

The mapped and mapped-file security modes are supposed to be identical,
except for the place where credentials and file modes are stored. While
here, we also make that explicit by sharing the call to mkdirat().

This partly fixes CVE-2016-9602.

Signed-off-by: Greg Kurz <groug@kaod.org>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
2017-02-28 11:21:15 +01:00
Greg Kurz
d815e72190 9pfs: local: mknod: don't follow symlinks
The local_mknod() callback is vulnerable to symlink attacks because it
calls:

(1) mknod() which follows symbolic links for all path elements but the
    rightmost one
(2) local_set_xattr()->setxattr() which follows symbolic links for all
    path elements
(3) local_set_mapped_file_attr() which calls in turn local_fopen() and
    mkdir(), both functions following symbolic links for all path
    elements but the rightmost one
(4) local_post_create_passthrough() which calls in turn lchown() and
    chmod(), both functions also following symbolic links

This patch converts local_mknod() to rely on opendir_nofollow() and
mknodat() to fix (1), as well as local_set_xattrat() and
local_set_mapped_file_attrat() to fix (2) and (3) respectively.

A new local_set_cred_passthrough() helper based on fchownat() and
fchmodat_nofollow() is introduced as a replacement to
local_post_create_passthrough() to fix (4).

The mapped and mapped-file security modes are supposed to be identical,
except for the place where credentials and file modes are stored. While
here, we also make that explicit by sharing the call to mknodat().

This partly fixes CVE-2016-9602.

Signed-off-by: Greg Kurz <groug@kaod.org>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
2017-02-28 11:21:15 +01:00
Greg Kurz
38771613ea 9pfs: local: symlink: don't follow symlinks
The local_symlink() callback is vulnerable to symlink attacks because it
calls:

(1) symlink() which follows symbolic links for all path elements but the
    rightmost one
(2) open(O_NOFOLLOW) which follows symbolic links for all path elements but
    the rightmost one
(3) local_set_xattr()->setxattr() which follows symbolic links for all
    path elements
(4) local_set_mapped_file_attr() which calls in turn local_fopen() and
    mkdir(), both functions following symbolic links for all path
    elements but the rightmost one

This patch converts local_symlink() to rely on opendir_nofollow() and
symlinkat() to fix (1), openat(O_NOFOLLOW) to fix (2), as well as
local_set_xattrat() and local_set_mapped_file_attrat() to fix (3) and
(4) respectively.

This partly fixes CVE-2016-9602.

Signed-off-by: Greg Kurz <groug@kaod.org>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
2017-02-28 11:21:15 +01:00
Greg Kurz
d369f20763 9pfs: local: chown: don't follow symlinks
The local_chown() callback is vulnerable to symlink attacks because it
calls:

(1) lchown() which follows symbolic links for all path elements but the
    rightmost one
(2) local_set_xattr()->setxattr() which follows symbolic links for all
    path elements
(3) local_set_mapped_file_attr() which calls in turn local_fopen() and
    mkdir(), both functions following symbolic links for all path
    elements but the rightmost one

This patch converts local_chown() to rely on open_nofollow() and
fchownat() to fix (1), as well as local_set_xattrat() and
local_set_mapped_file_attrat() to fix (2) and (3) respectively.

This partly fixes CVE-2016-9602.

Signed-off-by: Greg Kurz <groug@kaod.org>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
2017-02-28 11:21:15 +01:00
Greg Kurz
e3187a45dd 9pfs: local: chmod: don't follow symlinks
The local_chmod() callback is vulnerable to symlink attacks because it
calls:

(1) chmod() which follows symbolic links for all path elements
(2) local_set_xattr()->setxattr() which follows symbolic links for all
    path elements
(3) local_set_mapped_file_attr() which calls in turn local_fopen() and
    mkdir(), both functions following symbolic links for all path
    elements but the rightmost one

We would need fchmodat() to implement AT_SYMLINK_NOFOLLOW to fix (1). This
isn't the case on linux unfortunately: the kernel doesn't even have a flags
argument to the syscall :-\ It is impossible to fix it in userspace in
a race-free manner. This patch hence converts local_chmod() to rely on
open_nofollow() and fchmod(). This fixes the vulnerability but introduces
a limitation: the target file must readable and/or writable for the call
to openat() to succeed.

It introduces a local_set_xattrat() replacement to local_set_xattr()
based on fsetxattrat() to fix (2), and a local_set_mapped_file_attrat()
replacement to local_set_mapped_file_attr() based on local_fopenat()
and mkdirat() to fix (3). No effort is made to factor out code because
both local_set_xattr() and local_set_mapped_file_attr() will be dropped
when all users have been converted to use the "at" versions.

This partly fixes CVE-2016-9602.

Signed-off-by: Greg Kurz <groug@kaod.org>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
2017-02-28 11:21:15 +01:00
Greg Kurz
ad0b46e6ac 9pfs: local: link: don't follow symlinks
The local_link() callback is vulnerable to symlink attacks because it calls:

(1) link() which follows symbolic links for all path elements but the
    rightmost one
(2) local_create_mapped_attr_dir()->mkdir() which follows symbolic links
    for all path elements but the rightmost one

This patch converts local_link() to rely on opendir_nofollow() and linkat()
to fix (1), mkdirat() to fix (2).

This partly fixes CVE-2016-9602.

Signed-off-by: Greg Kurz <groug@kaod.org>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
2017-02-28 11:21:15 +01:00
Greg Kurz
6dd4b1f1d0 9pfs: local: improve error handling in link op
When using the mapped-file security model, we also have to create a link
for the metadata file if it exists. In case of failure, we should rollback.

That's what this patch does.

Signed-off-by: Greg Kurz <groug@kaod.org>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
2017-02-28 11:21:15 +01:00
Greg Kurz
d2767edec5 9pfs: local: rename: use renameat
The local_rename() callback is vulnerable to symlink attacks because it
uses rename() which follows symbolic links in all path elements but the
rightmost one.

This patch simply transforms local_rename() into a wrapper around
local_renameat() which is symlink-attack safe.

This partly fixes CVE-2016-9602.

Signed-off-by: Greg Kurz <groug@kaod.org>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
2017-02-28 11:21:15 +01:00
Greg Kurz
99f2cf4b2d 9pfs: local: renameat: don't follow symlinks
The local_renameat() callback is currently a wrapper around local_rename()
which is vulnerable to symlink attacks.

This patch rewrites local_renameat() to have its own implementation, based
on local_opendir_nofollow() and renameat().

This partly fixes CVE-2016-9602.

Signed-off-by: Greg Kurz <groug@kaod.org>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
2017-02-28 11:21:15 +01:00
Greg Kurz
f9aef99b3e 9pfs: local: lstat: don't follow symlinks
The local_lstat() callback is vulnerable to symlink attacks because it
calls:

(1) lstat() which follows symbolic links in all path elements but the
    rightmost one
(2) getxattr() which follows symbolic links in all path elements
(3) local_mapped_file_attr()->local_fopen()->openat(O_NOFOLLOW) which
    follows symbolic links in all path elements but the rightmost
    one

This patch converts local_lstat() to rely on opendir_nofollow() and
fstatat(AT_SYMLINK_NOFOLLOW) to fix (1), fgetxattrat_nofollow() to
fix (2).

A new local_fopenat() helper is introduced as a replacement to
local_fopen() to fix (3). No effort is made to factor out code
because local_fopen() will be dropped when all users have been
converted to call local_fopenat().

This partly fixes CVE-2016-9602.

Signed-off-by: Greg Kurz <groug@kaod.org>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
2017-02-28 11:21:15 +01:00
Greg Kurz
bec1e9546e 9pfs: local: readlink: don't follow symlinks
The local_readlink() callback is vulnerable to symlink attacks because it
calls:

(1) open(O_NOFOLLOW) which follows symbolic links for all path elements but
    the rightmost one
(2) readlink() which follows symbolic links for all path elements but the
    rightmost one

This patch converts local_readlink() to rely on open_nofollow() to fix (1)
and opendir_nofollow(), readlinkat() to fix (2).

This partly fixes CVE-2016-9602.

Signed-off-by: Greg Kurz <groug@kaod.org>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
2017-02-28 11:21:15 +01:00
Greg Kurz
ac125d993b 9pfs: local: truncate: don't follow symlinks
The local_truncate() callback is vulnerable to symlink attacks because
it calls truncate() which follows symbolic links in all path elements.

This patch converts local_truncate() to rely on open_nofollow() and
ftruncate() instead.

This partly fixes CVE-2016-9602.

Signed-off-by: Greg Kurz <groug@kaod.org>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
2017-02-28 11:21:15 +01:00
Greg Kurz
31e51d1c15 9pfs: local: statfs: don't follow symlinks
The local_statfs() callback is vulnerable to symlink attacks because it
calls statfs() which follows symbolic links in all path elements.

This patch converts local_statfs() to rely on open_nofollow() and fstatfs()
instead.

This partly fixes CVE-2016-9602.

Signed-off-by: Greg Kurz <groug@kaod.org>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
2017-02-28 11:21:15 +01:00
Greg Kurz
a33eda0dd9 9pfs: local: utimensat: don't follow symlinks
The local_utimensat() callback is vulnerable to symlink attacks because it
calls qemu_utimens()->utimensat(AT_SYMLINK_NOFOLLOW) which follows symbolic
links in all path elements but the rightmost one or qemu_utimens()->utimes()
which follows symbolic links for all path elements.

This patch converts local_utimensat() to rely on opendir_nofollow() and
utimensat(AT_SYMLINK_NOFOLLOW) directly instead of using qemu_utimens().
It is hence assumed that the OS supports utimensat(), i.e. has glibc 2.6
or higher and linux 2.6.22 or higher, which seems reasonable nowadays.

This partly fixes CVE-2016-9602.

Signed-off-by: Greg Kurz <groug@kaod.org>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
2017-02-28 11:21:15 +01:00
Greg Kurz
a0e640a872 9pfs: local: remove: don't follow symlinks
The local_remove() callback is vulnerable to symlink attacks because it
calls:

(1) lstat() which follows symbolic links in all path elements but the
    rightmost one
(2) remove() which follows symbolic links in all path elements but the
    rightmost one

This patch converts local_remove() to rely on opendir_nofollow(),
fstatat(AT_SYMLINK_NOFOLLOW) to fix (1) and unlinkat() to fix (2).

This partly fixes CVE-2016-9602.

Signed-off-by: Greg Kurz <groug@kaod.org>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
2017-02-28 11:21:15 +01:00
Greg Kurz
df4938a665 9pfs: local: unlinkat: don't follow symlinks
The local_unlinkat() callback is vulnerable to symlink attacks because it
calls remove() which follows symbolic links in all path elements but the
rightmost one.

This patch converts local_unlinkat() to rely on opendir_nofollow() and
unlinkat() instead.

Most of the code is moved to a separate local_unlinkat_common() helper
which will be reused in a subsequent patch to fix the same issue in
local_remove().

This partly fixes CVE-2016-9602.

Signed-off-by: Greg Kurz <groug@kaod.org>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
2017-02-28 11:21:15 +01:00
Greg Kurz
72f0d0bf51 9pfs: local: lremovexattr: don't follow symlinks
The local_lremovexattr() callback is vulnerable to symlink attacks because
it calls lremovexattr() which follows symbolic links in all path elements
but the rightmost one.

This patch introduces a helper to emulate the non-existing fremovexattrat()
function: it is implemented with /proc/self/fd which provides a trusted
path that can be safely passed to lremovexattr().

local_lremovexattr() is converted to use this helper and opendir_nofollow().

This partly fixes CVE-2016-9602.

Signed-off-by: Greg Kurz <groug@kaod.org>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
2017-02-28 11:21:15 +01:00
Greg Kurz
3e36aba757 9pfs: local: lsetxattr: don't follow symlinks
The local_lsetxattr() callback is vulnerable to symlink attacks because
it calls lsetxattr() which follows symbolic links in all path elements but
the rightmost one.

This patch introduces a helper to emulate the non-existing fsetxattrat()
function: it is implemented with /proc/self/fd which provides a trusted
path that can be safely passed to lsetxattr().

local_lsetxattr() is converted to use this helper and opendir_nofollow().

This partly fixes CVE-2016-9602.

Signed-off-by: Greg Kurz <groug@kaod.org>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
2017-02-28 11:21:15 +01:00
Greg Kurz
5507904e36 9pfs: local: llistxattr: don't follow symlinks
The local_llistxattr() callback is vulnerable to symlink attacks because
it calls llistxattr() which follows symbolic links in all path elements but
the rightmost one.

This patch introduces a helper to emulate the non-existing flistxattrat()
function: it is implemented with /proc/self/fd which provides a trusted
path that can be safely passed to llistxattr().

local_llistxattr() is converted to use this helper and opendir_nofollow().

This partly fixes CVE-2016-9602.

Signed-off-by: Greg Kurz <groug@kaod.org>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
2017-02-28 11:21:15 +01:00
Greg Kurz
56ad3e54da 9pfs: local: lgetxattr: don't follow symlinks
The local_lgetxattr() callback is vulnerable to symlink attacks because
it calls lgetxattr() which follows symbolic links in all path elements but
the rightmost one.

This patch introduces a helper to emulate the non-existing fgetxattrat()
function: it is implemented with /proc/self/fd which provides a trusted
path that can be safely passed to lgetxattr().

local_lgetxattr() is converted to use this helper and opendir_nofollow().

This partly fixes CVE-2016-9602.

Signed-off-by: Greg Kurz <groug@kaod.org>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
2017-02-28 11:21:15 +01:00
Greg Kurz
996a0d76d7 9pfs: local: open/opendir: don't follow symlinks
The local_open() and local_opendir() callbacks are vulnerable to symlink
attacks because they call:

(1) open(O_NOFOLLOW) which follows symbolic links in all path elements but
    the rightmost one
(2) opendir() which follows symbolic links in all path elements

This patch converts both callbacks to use new helpers based on
openat_nofollow() to only open files and directories if they are
below the virtfs shared folder

This partly fixes CVE-2016-9602.

Signed-off-by: Greg Kurz <groug@kaod.org>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
2017-02-28 11:21:15 +01:00
Greg Kurz
0e35a37829 9pfs: local: keep a file descriptor on the shared folder
This patch opens the shared folder and caches the file descriptor, so that
it can be used to do symlink-safe path walk.

Signed-off-by: Greg Kurz <groug@kaod.org>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
2017-02-28 11:21:15 +01:00
Greg Kurz
6482a96163 9pfs: introduce relative_openat_nofollow() helper
When using the passthrough security mode, symbolic links created by the
guest are actual symbolic links on the host file system.

Since the resolution of symbolic links during path walk is supposed to
occur on the client side. The server should hence never receive any path
pointing to an actual symbolic link. This isn't guaranteed by the protocol
though, and malicious code in the guest can trick the server to issue
various syscalls on paths whose one or more elements are symbolic links.
In the case of the "local" backend using the "passthrough" or "none"
security modes, the guest can directly create symbolic links to arbitrary
locations on the host (as per spec). The "mapped-xattr" and "mapped-file"
security modes are also affected to a lesser extent as they require some
help from an external entity to create actual symbolic links on the host,
i.e. another guest using "passthrough" mode for example.

The current code hence relies on O_NOFOLLOW and "l*()" variants of system
calls. Unfortunately, this only applies to the rightmost path component.
A guest could maliciously replace any component in a trusted path with a
symbolic link. This could allow any guest to escape a virtfs shared folder.

This patch introduces a variant of the openat() syscall that successively
opens each path element with O_NOFOLLOW. When passing a file descriptor
pointing to a trusted directory, one is guaranteed to be returned a
file descriptor pointing to a path which is beneath the trusted directory.
This will be used by subsequent patches to implement symlink-safe path walk
for any access to the backend.

Symbolic links aren't the only threats actually: a malicious guest could
change a path element to point to other types of file with undesirable
effects:
- a named pipe or any other thing that would cause openat() to block
- a terminal device which would become QEMU's controlling terminal

These issues can be addressed with O_NONBLOCK and O_NOCTTY.

Two helpers are introduced: one to open intermediate path elements and one
to open the rightmost path element.

Suggested-by: Jann Horn <jannh@google.com>
Signed-off-by: Greg Kurz <groug@kaod.org>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
(renamed openat_nofollow() to relative_openat_nofollow(),
 assert path is relative and doesn't contain '//',
 fixed side-effect in assert, Greg Kurz)
Signed-off-by: Greg Kurz <groug@kaod.org>
2017-02-28 11:21:15 +01:00
Greg Kurz
21328e1e57 9pfs: remove side-effects in local_open() and local_opendir()
If these functions fail, they should not change *fs. Let's use local
variables to fix this.

Signed-off-by: Greg Kurz <groug@kaod.org>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
2017-02-28 11:21:14 +01:00
Greg Kurz
00c90bd1c2 9pfs: remove side-effects in local_init()
If this function fails, it should not modify *ctx.

Signed-off-by: Greg Kurz <groug@kaod.org>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
2017-02-28 11:21:14 +01:00
Greg Kurz
56fc494bdc 9pfs: local: move xattr security ops to 9p-xattr.c
These functions are always called indirectly. It really doesn't make sense
for them to sit in a header file.

Signed-off-by: Greg Kurz <groug@kaod.org>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
2017-02-28 11:21:14 +01:00
Pradeep Jagadeesh
b8bbdb886e fsdev: add IO throttle support to fsdev devices
This patchset adds the throttle support for the 9p-local driver.
For now this functionality can be enabled only through qemu cli options.
QMP interface and support to other drivers need further extensions.
To make it simple for other 9p drivers, the throttle code has been put in
separate files.

Signed-off-by: Pradeep Jagadeesh <pradeep.jagadeesh@huawei.com>
Reviewed-by: Alberto Garcia <berto@igalia.com>
(pass extra NULL CoMutex * argument to qemu_co_queue_wait(),
 added options to qemu-options.hx, Greg Kurz)
Signed-off-by: Greg Kurz <groug@kaod.org>
2017-02-28 10:31:46 +01:00
Paolo Bonzini
4bae2b397f 9pfs: fix v9fs_lock error case
In this case, we are marshaling an error status instead of the errno value.
Reorganize the out and out_nofid labels to look like all the other cases.
Coverity reports this because the "err = -ENOENT" and "err = -EINVAL"
assignments above are dead, overwritten by the call to pdu_marshal.

(Coverity issues CID1348512 and CID1348513)

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
(also open-coded the success path since locking is a nop for us, Greg Kurz)
Signed-off-by: Greg Kurz <groug@kaod.org>
2017-02-28 10:31:46 +01:00
Igor Pavlikevich
6c608953a5 hw/acpi/tco.c: fix tco timer stop
TCO timer does not actually stop

Signed-off-by: Igor Pavlikevich <ipavlikevich@gmail.com>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2017-02-28 09:03:39 +03:00