Commit Graph

531 Commits

Author SHA1 Message Date
Alexey Kardashevskiy
9fc34ada7e spapr_pci_vfio: Add spapr-pci-vfio-host-bridge to support vfio
The patch adds a spapr-pci-vfio-host-bridge device type
which is a PCI Host Bridge with VFIO support. The new device
inherits from the spapr-pci-host-bridge device and adds an "iommu"
property which is an IOMMU id. This ID represents a minimal entity
for which IOMMU isolation can be guaranteed. In SPAPR architecture IOMMU
group is called a Partitionable Endpoint (PE).

Current implementation supports one IOMMU id per QEMU VFIO PHB. Since
SPAPR allows multiple PHB for no extra cost, this does not seem to
be a problem. This limitation may change in the future though.

Example of use:
Configure and Add 3 functions of a multifunctional device to QEMU:
(the NEC PCI USB card is used as an example here):
-device spapr-pci-vfio-host-bridge,id=USB,iommu=4,index=7 \
-device vfio-pci,host=4:0:1.0,addr=1.0,bus=USB,multifunction=true
-device vfio-pci,host=4:0:1.1,addr=1.1,bus=USB
-device vfio-pci,host=4:0:1.2,addr=1.2,bus=USB

where:
* index=7 is a QEMU PHB index (used as source for MMIO/MSI/IO windows
offset);
* iommu=4 is an IOMMU id which can be found in sysfs:
[aik@vpl2 ~]$ cd /sys/bus/pci/devices/0004:00:00.0/
[aik@vpl2 0004:00:00.0]$ ls -l iommu_group
lrwxrwxrwx 1 root root 0 Jun  5 12:49 iommu_group -> ../../../kernel/iommu_groups/4

Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: Alexander Graf <agraf@suse.de>
2014-06-27 13:48:23 +02:00
Alexey Kardashevskiy
6d8be4c343 vfio: Add vfio_container_ioctl()
While most operations with VFIO IOMMU driver are generic and used inside
vfio.c, there are still some operations which only specific VFIO IOMMU
drivers implement. The first example of it will be reading a DMA window
start from the host.

This adds a helper which passes an ioctl request to the container's fd.

The helper will check if @req is known. For this, stub is added. This return
-1 on any requests for now.

Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Acked-by: Alex Williamson <alex.williamson@redhat.com>
Acked-by: Alex Williamson <alex.williamson@redhat.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
2014-06-27 13:48:23 +02:00
Alexey Kardashevskiy
9bb62a0702 spapr_iommu: Make in-kernel TCE table optional
POWER KVM supports an KVM_CAP_SPAPR_TCE capability which allows allocating
TCE tables in the host kernel memory and handle H_PUT_TCE requests
targeted to specific LIOBN (logical bus number) right in the host without
switching to QEMU. At the moment this is used for emulated devices only
and the handler only puts TCE to the table. If the in-kernel H_PUT_TCE
handler finds a LIOBN and corresponding table, it will put a TCE to
the table and complete hypercall execution. The user space will not be
notified.

Upcoming VFIO support is going to use the same sPAPRTCETable device class
so KVM_CAP_SPAPR_TCE is going to be used as well. That means that TCE
tables for VFIO are going to be allocated in the host as well.
However VFIO operates with real IOMMU tables and simple copying of
a TCE to the real hardware TCE table will not work as guest physical
to host physical address translation is requited.

So until the host kernel gets VFIO support for H_PUT_TCE, we better not
to register VFIO's TCE in the host.

This adds a place holder for KVM_CAP_SPAPR_TCE_VFIO capability. It is not
in upstream yet and being discussed so now it is always false which means
that in-kernel VFIO acceleration is not supported.

This adds a bool @vfio_accel flag to the sPAPRTCETable device telling
that sPAPRTCETable should not try allocating TCE table in the host kernel
for VFIO. The flag is false now as at the moment there is no VFIO.

This adds an vfio_accel parameter to spapr_tce_new_table(), the semantic
is the same. Since there is only emulated PCI and VIO now, the flag is set
to false. Upcoming VFIO support will set it to true.

This is a preparation patch so no change in behaviour is expected

Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: Alexander Graf <agraf@suse.de>
2014-06-27 13:48:23 +02:00
Alexey Kardashevskiy
3a3b8502e6 spapr: Fix RTAS token numbers
At the moment spapr_rtas_register() allocates a new token number for every
new RTAS callback so numbers are not fixed and depend on the number of
supported RTAS handlers and the exact order of spapr_rtas_register() calls.
These tokens are copied into the device tree and remain the same during
the guest lifetime.

When we start another guest to receive a migration, it calls
spapr_rtas_register() as well. If the number of RTAS handlers or their
order is different in QEMU on source and destination sides, the "/rtas"
node in the device tree will differ. Since migration overwrites the device
tree (as it overwrites the entire RAM), the actual RTAS config on
the destination side gets broken.

This defines global contant values for every RTAS token which QEMU
is using today.

This changes spapr_rtas_register() to accept a token number instead of
allocating one. This changes all users of spapr_rtas_register().

This changes XICS-KVM not to cache tokens registered with KVM as they
constant now.

This makes TOKEN_BASE global as RTAS_XXX use TOKEN_BASE as
a base. TOKEN_MAX is moved and renamed too and its value is changed
to the last token + 1. Boundary checks for token values are adjusted.

This reserves token numbers for "os-term" handlers and PCI hotplug
which we are working on.

Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: Alexander Graf <agraf@suse.de>
2014-06-27 13:48:22 +02:00
Don Slutz
c87b152072 pc & q35: Add new machine opt max-ram-below-4g
This is a pc & q35 only machine opt.

If you add enough PCI devices then all mmio for them will not fit
below 4G which may not be the layout the user wanted. This allows
you to increase the below 4G address space that PCI devices can use
(aka decrease ram below 4G) and therefore in more cases not have any
mmio that is above 4G.

For example using "-machine pc,max-ram-below-4g=2G" on the command
line will limit the amount of ram that is below 4G to 2G.

Note: this machine option cannot be used to increase the amount
of ram below 4G.

Signed-off-by: Don Slutz <dslutz@verizon.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>

MST: fix 32 bit
2014-06-23 18:02:41 +03:00
Don Slutz
3c2a96699e xen-hvm: Fix xen_hvm_init() to adjust pc memory layout
This is just below_4g_mem_size and above_4g_mem_size which is used later in QEMU.

Acked-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
Signed-off-by: Don Slutz <dslutz@verizon.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2014-06-23 17:50:04 +03:00
Marcel Apfelbaum
f23b6bdc3c hw/pcie: implement power controller functionality
It is needed by hot-unplug in order to get an indication
from the OS when the device can be physically detached.

Signed-off-by: Marcel Apfelbaum <marcel.a@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2014-06-23 17:48:42 +03:00
Michael S. Tsirkin
7145872ed3 vhost: block migration if backend does not log memory
vhost user does not support LOG_ALL feature bit.
Generally, we should not try to set this bit without
checking that backend can support it first.

Detect and block migration.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Nikolay Nikolaev <n.nikolaev@virtualopensystems.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2014-06-23 17:37:59 +03:00
Peter Maydell
d70a319b8d Merge remote-tracking branch 'remotes/kvm/uq/master' into staging
* remotes/kvm/uq/master:
  hw/mips: malta: Don't boot from flash with KVM T&E
  MAINTAINERS: Add entry for MIPS KVM
  target-mips: Enable KVM support in build system
  hw/mips: malta: Add KVM support
  hw/mips: In KVM mode, inject IRQ2 (I/O) interrupts via ioctls
  target-mips: Call kvm_mips_reset_vcpu() from mips_cpu_reset()
  target-mips: kvm: Add main KVM support for MIPS
  kvm: Allow arch to set sigmask length
  target-mips: get_physical_address: Add KVM awareness
  target-mips: get_physical_address: Add defines for segment bases
  hw/mips: Add API to convert KVM guest KSEG0 <-> GPA
  hw/mips/cputimer: Don't start periodic timer in KVM mode
  target-mips: Reset CPU timer consistently
  KVM: Fix GSI number space limit

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-06-20 19:25:18 +01:00
Peter Maydell
0a99aae5fa pc,pci,virtio,hotplug fixes, enhancements
numa work by Hu Tao and others
 memory hotplug by Igor
 vhost-user by Nikolay, Antonios and others
 guest virtio announcements by Jason
 qtest fixes by Sergey
 qdev hotplug fixes by Paolo
 misc other fixes mostly by myself
 
 Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQEcBAABAgAGBQJTowW+AAoJECgfDbjSjVRpnMMH/jp3sKGzJumYLbi5ihjmYyND
 jYd6ySXoVAjUTgaCvdje5srisOap8pbc783kQvQS4CeWsjZ5Vvh+PZjkBPIqF1pD
 celxGQ43CY7QSUWq+02Dg9VIUwLwZqdKlxNsV01FligQn+ZBQ6sQ6ksWx7oGzqRt
 5/HMZykbwUvSk/4xGUaMn2+/4uhQ0Wz5EsCkv9L/u8kS72k6ldc/tCGZMzBUNHTM
 rW5FPYwMQP0MXgGTXnlLEQjJ7Lozc66IaMZoHw/a/aGSIxdag9Otj0ADuXq6yZaV
 Xi4O/EOJWd1JpSG7w8LOyIZNakpHkU43fmJCLzBjDAupHeRp57TcW5ox4PJYAtg=
 =Oxdt
 -----END PGP SIGNATURE-----

Merge remote-tracking branch 'remotes/mst/tags/for_upstream' into staging

pc,pci,virtio,hotplug fixes, enhancements

numa work by Hu Tao and others
memory hotplug by Igor
vhost-user by Nikolay, Antonios and others
guest virtio announcements by Jason
qtest fixes by Sergey
qdev hotplug fixes by Paolo
misc other fixes mostly by myself

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>

* remotes/mst/tags/for_upstream: (109 commits)
  numa: use RAM_ADDR_FMT with ram_addr_t
  qapi/string-output-visitor: fix bugs
  tests: simplify code
  qapi: fix input visitor bugs
  acpi: rephrase comment
  qmp: add ACPI_DEVICE_OST event handling
  qmp: add query-acpi-ospm-status command
  acpi: implement ospm_status() method for PIIX4/ICH9_LPC devices
  acpi: introduce TYPE_ACPI_DEVICE_IF interface
  qmp: add query-memory-devices command
  numa: handle mmaped memory allocation failure correctly
  pc: acpi: do not hardcode preprocessor
  qmp: clean out whitespace
  qdev: recursively unrealize devices when unrealizing bus
  qdev: reorganize error reporting in bus_set_realized
  qapi: fix build on glib < 2.28
  qapi: make string output visitor parse int list
  qapi: make string input visitor parse int list
  tests: fix memory leak in test of string input visitor
  hmp: add info memdev
  ...

Conflicts:
	include/hw/i386/pc.h
[PMM: fixed minor conflict in pc.h]
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-06-20 18:01:24 +01:00
Michael S. Tsirkin
b4acfbcd95 acpi: rephrase comment
"only upto" is not proper English.
Say "up to" and drop "only".

Reported-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2014-06-19 18:44:22 +03:00
Igor Mammedov
43f5041008 acpi: implement ospm_status() method for PIIX4/ICH9_LPC devices
... using TYPE_ACPI_DEVICE_IF interface.
Which provides status reporting of ACPI declared memory devices

Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2014-06-19 18:44:22 +03:00
Igor Mammedov
521b3673ac acpi: introduce TYPE_ACPI_DEVICE_IF interface
... it will be used to abstract generic ACPI bits from
device that implements ACPI interface.

ACPIOSTInfo type is used for passing-through raw _OST
event/status codes reported by guest OS to a management
layer. It lets management tools interpret values
as specified by ACPI spec if it is interested in it.

QEMU doesn't encode these values as enum, since it
doesn't need to handle them and it allows interface
to scale well without any changes in QEMU while guest
OS and management evolves in time.

Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2014-06-19 18:44:22 +03:00
Igor Mammedov
6f2e27301d qmp: add query-memory-devices command
... allowing to get state of present memory devices.
Currently implemented only for PCDIMMDevice.

Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2014-06-19 18:44:22 +03:00
Paolo Bonzini
9521d42b54 pc: pass MachineState to pc_memory_init
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Hu Tao <hutao@cn.fujitsu.com>
Reviewed-By: Igor Mammedov <imammedo@redhat.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2014-06-19 18:44:20 +03:00
Paolo Bonzini
dfabb8b916 numa: introduce memory_region_allocate_system_memory
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Hu Tao <hutao@cn.fujitsu.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Eduardo Habkost <ehabkost@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>

MST: resolve conflicts
2014-06-19 18:44:19 +03:00
Nikolay Nikolaev
1a1bfac9ee Add vhost-backend and VhostBackendType
Use vhost_set_backend_type to initialise a proper vhost_ops structure.
In vhost_net_init and vhost_net_start_one call conditionally TAP related
initialisation depending on the vhost backend type.

Signed-off-by: Antonios Motakis <a.motakis@virtualopensystems.com>
Signed-off-by: Nikolay Nikolaev <n.nikolaev@virtualopensystems.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2014-06-19 16:41:56 +03:00
Nikolay Nikolaev
24d1eb33eb Add vhost_ops to vhost_dev struct and replace all relevant ioctls
Decouple vhost from the Linux kernel by introducing vhost_ops. The
intention is to provide different backends - a 'kernel' backend based on
the ioctl interface, and an 'user' backend based on a UNIX domain socket
and shared memory interface.

Signed-off-by: Antonios Motakis <a.motakis@virtualopensystems.com>
Signed-off-by: Nikolay Nikolaev <n.nikolaev@virtualopensystems.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2014-06-19 16:41:56 +03:00
Nikolay Nikolaev
81647a655f vhost_net_init will use VhostNetOptions to get all its arguments
vhost_dev_init will replace devfd and devpath with a single opaque argument.
This is initialised with a file descriptor. When TAP is used (through
vhost_net), open /dev/vhost-net and pass the fd as an opaque parameter in
VhostNetOptions. The same applies to vhost-scsi - open /dev/vhost-scsi and
pass the fd.

Signed-off-by: Antonios Motakis <a.motakis@virtualopensystems.com>
Signed-off-by: Nikolay Nikolaev <n.nikolaev@virtualopensystems.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2014-06-19 16:41:56 +03:00
Nikolay Nikolaev
2e6d46d77e vhost: add vhost_get_features and vhost_ack_features
Generalize the features get/ack to be used for both vhost-net and vhost-scsi.
In vhost-net add vhost_net_get_feature_bits to select the feature bit set
depending on the NetClient kind.

Signed-off-by: Nikolay Nikolaev <n.nikolaev@virtualopensystems.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2014-06-19 16:41:55 +03:00
Jason Wang
f57fcf7063 virtio-net: announce self by guest
It's hard to track all mac addresses and their configurations (e.g
vlan or ipv6) in qemu. Without this information, it's impossible to
build proper garp packet after migration. The only possible solution
to this is let guest (who knows all configurations) to do this.

So, this patch introduces a new readonly config status bit of virtio-net,
VIRTIO_NET_S_ANNOUNCE which is used to notify guest to announce
presence of its link through config update interrupt.When guest has
done the announcement, it should ack the notification through
VIRTIO_NET_CTRL_ANNOUNCE_ACK cmd. This feature is negotiated by a new
feature bit VIRTIO_NET_F_ANNOUNCE (which has already been supported by
Linux guest).

During load, a counter of announcing rounds is set so that after the vm is
running it can trigger rounds of config interrupts to notify the guest to build
and send the correct garps.

Cc: Liuyongan <liuyongan@huawei.com>
Cc: Amos Kong <akong@redhat.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2014-06-19 16:41:54 +03:00
Michael S. Tsirkin
292b1634d0 ich: get rid of spaces in type name
Names with spaces in them are nasty, let's not go there.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2014-06-19 16:41:53 +03:00
Igor Mammedov
bf1e893959 pc: add "hotplug-memory-region-size" property to PC_MACHINE
... it will be used by acpi-build code and by unit tests

Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Acked-by: Peter Crosthwaite <peter.crosthwaite@xilinx.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2014-06-19 16:41:52 +03:00
Igor Mammedov
bef3492d11 pc: ACPI BIOS: implement memory hotplug interface
- provides static SSDT object for memory hotplug that can handle
  upto 256 hotplugable memory slots
- SSDT template for memory devices and runtime generator
  of them in SSDT table.

Signed-off-by: Vasilis Liaskovitis <vasilis.liaskovitis@profitbricks.com>
Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Acked-by: Peter Crosthwaite <peter.crosthwaite@xilinx.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2014-06-19 16:41:51 +03:00
Igor Mammedov
781bbd6bec pc: add acpi-device link to PCMachineState
the link will used later to access device implementing
ACPI functions instead of adhoc lookup in QOM tree.

Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Acked-by: Peter Crosthwaite <peter.crosthwaite@xilinx.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2014-06-19 16:41:51 +03:00
Igor Mammedov
f816a62daa pc: migrate piix4 & ich9 MemHotplugState
Adds an optional subsection that allows to migrate current
state of acpi_memory_hotplug of ACPI PM device.

Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Acked-by: Peter Crosthwaite <peter.crosthwaite@xilinx.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2014-06-19 16:41:51 +03:00
Igor Mammedov
1f8621842e acpi:ich9: add memory hotplug handling
Add memory hotplug initialization/handling to ICH9 LPC device
and enable it by default for post 2.0 machine types

Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2014-06-19 16:41:51 +03:00
Igor Mammedov
34774320c3 acpi:piix4: add memory hotplug handling
Add memory hotplug initialization/handling to PIIX4_PM device
and enable it by default for post 2.0 machine types

Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>

MST: resolve conflict in pc.h
2014-06-19 16:41:50 +03:00
Igor Mammedov
3ef77acab2 acpi: memory hotplug ACPI hardware implementation
- implements QEMU hardware part of memory hotplug protocol
  described at "docs/specs/acpi_mem_hotplug.txt"
- handles only memory add notification event for now

Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2014-06-19 16:41:49 +03:00
Igor Mammedov
7e629d1d8d acpi: rename cpu_hotplug_defs.h to pc-hotplug.h
to make it more generic, so it could be used for memory hotplug
as well.

Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Acked-by: Peter Crosthwaite <peter.crosthwaite@xilinx.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2014-06-19 16:41:49 +03:00
Igor Mammedov
0cd03d89b9 pc-dimm: add busy slot check and slot auto-allocation
- if slot property is not specified on -device/device_add command,
treat default value as request for assigning PCDIMMDevice to
the first free slot.

- if slot is provided with -device/device_add command, attempt to
use it or fail command if it's already occupied.

Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2014-06-19 16:41:49 +03:00
Igor Mammedov
0b31257116 pc-dimm: add busy address check and address auto-allocation
- if 'addr' property is not specified on -device/device_add command,
treat the default value as request for assigning PCDIMMDevice to
the first free memory region.

- if 'addr' is provided with -device/device_add command, attempt to
use it or fail command if it's already occupied or falls inside
of an existing PCDIMMDevice memory region.

Note:
GCompareFunc(a, b) used by g_slist_insert_sorted() returns 'gint',
however it might be too small to fit difference between
2 addresses. So use 128bit to calculate the difference and normalize
result to -1/0/1 return values.

Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Signed-off-by: Tang Chen <tangchen@cn.fujitsu.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Tested-by: Andrey Korolyov <andrey@xdel.ru>

MST: commit log tweaks
2014-06-19 16:41:49 +03:00
Igor Mammedov
95bee274fd pc: add memory hotplug handler to PC_MACHINE
that will perform mapping of PC_DIMM device into guest's RAM address space

Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2014-06-19 16:41:48 +03:00
Igor Mammedov
de268e134c pc: add 'etc/reserved-memory-end' fw_cfg interface for SeaBIOS
'etc/reserved-memory-end' will allow QEMU to tell BIOS where PCI
BARs mapping could safely start in high memory.

Allowing BIOS to start mapping 64-bit PCI BARs at address where it
wouldn't conflict with other mappings QEMU might place before it.

That permits QEMU to reserve extra address space before
64-bit PCI hole for memory hotplug.

Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Acked-by: Peter Crosthwaite <peter.crosthwaite@xilinx.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2014-06-19 16:41:48 +03:00
Igor Mammedov
a0cc8856e8 pc: exit QEMU if number of slots more than supported 256
... which is imposed by current naming scheme of ACPI memory devices.

Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Acked-by: Peter Crosthwaite <peter.crosthwaite@xilinx.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2014-06-19 16:41:48 +03:00
Igor Mammedov
619d11e463 pc: initialize memory hotplug address space
initialize and map hotplug memory address space container
into guest's RAM address space.

Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Acked-by: Peter Crosthwaite <peter.crosthwaite@xilinx.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2014-06-19 16:41:47 +03:00
Vasilis Liaskovitis
10b7e74bf2 pc: implement pc-dimm device abstraction
Each hotplug-able memory slot is a PCDIMMDevice.
A hot-add operation for a memory device:
- creates a new PCDIMMDevice and makes hotplug controller to map it into
  guest address space

Hotplug operations are done through normal device_add commands.
For migration case, all hotplugged memory devices on source should be
specified on target's command line using '-device' option with
properties set to the same values as on source.

To simplify review, patch introduces only PCDIMMDevice QOM skeleton that
will be extended by following patches to implement actual memory hotplug
and related functions.

Signed-off-by: Vasilis Liaskovitis <vasilis.liaskovitis@profitbricks.com>
Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2014-06-19 16:41:47 +03:00
Igor Mammedov
b745454811 qdev: hotplug for bus-less devices
Add get_hotplug_handler() method to machine, and
make bus-less device use it during hotplug
as a means to discover a hotplug handler controller.
The returned controller is used to perform hotplug
actions.

Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Acked-by: Peter Crosthwaite <peter.crosthwaite@xilinx.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2014-06-19 16:41:46 +03:00
Igor Mammedov
c270fb9eff vl.c: extend -m option to support options for memory hotplug
Add following parameters:
  "slots" - total number of hotplug memory slots
  "maxmem" - maximum possible memory

"slots" and "maxmem" should go in pair and "maxmem" should be greater
than "mem" for memory hotplug to be enabled.

Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>

MST: fix build on 32 bit
2014-06-19 16:41:46 +03:00
Ming Lei
91d670fbf9 virtio-scsi: define dummy handle_output for vhost-scsi vqs
vhost userspace needn't to handle vq's notification from guest,
so define dummy handle_output callback for all vqs of vhost-scsi.

In some corner cases(such as when handling vq's reset from VM), virtio-pci
still trys to handle pending virtio-scsi events, then object check failure
inside virtio_scsi_handle_event() for vhost-scsi can be triggered.

The issue can be reproduced by 'rmmod virtio-scsi', 'system sleep' or reboot
inside VM.

Cc: qemu-stable@nongnu.org
Cc: Anthony Liguori <aliguori@amazon.com>
Cc: "Michael S. Tsirkin" <mst@redhat.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Ming Lei <ming.lei@canonical.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-06-19 10:15:48 +02:00
Igor Mammedov
d5747cace7 pc: create custom generic PC machine type
it will be used for PC specific options/variables

Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Acked-by: Peter Crosthwaite <peter.crosthwaite@xilinx.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2014-06-18 21:09:55 +03:00
Sanjay Lal
253fffe725 hw/mips: Add API to convert KVM guest KSEG0 <-> GPA
Add API for converting physical addresses to KVM guest KSEG0 addresses,
and fix the existing API for converting KSEG0 addresses to physical
addresses to work in the KVM case. Both have the same sized KSEG0, so
it's just a case of fixing the mask.

In KVM trap and emulate mode both the guest kernel and guest userspace
execute in useg:
    Guest User address space:   0x00000000..0x3fffffff
    Guest Kernel Unmapped:      0x40000000..0x5fffffff
    Guest Kernel Mapped:        0x60000000..0x7fffffff

Signed-off-by: Sanjay Lal <sanjayl@kymasys.com>
Signed-off-by: James Hogan <james.hogan@imgtec.com>
Cc: Aurelien Jarno <aurelien@aurel32.net>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-06-18 16:58:10 +02:00
Paolo Bonzini
3eff1f46f0 virtio-scsi: add support for the any_layout feature
Store the request and response headers by value, and let
virtio_scsi_parse_req check that there is only one of datain
and dataout.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-06-18 08:47:11 +02:00
Badari Pulavarty
9dbae97723 spapr_pci: Advertise MSI quota
Hotplug of multiple disks fails due to MSI vector quota check.
Number of MSI vectors default to 8 allowing only 4 devices.
This happens on RHEL6.5 guest. RHEL7 and SLES11 guests fallback
to INTX.

One way to workaround the issue is to increase total MSIs,
so that MSI quota check allows us to hotplug multiple disks.

This sets the quota to the maximum number of interupts XICS has
which is 1024 now (XICS_IRQS). This moves XICS_IRQS from spapr.c
to xics.h for wider visibility.

Signed-off-by: Badari Pulavarty <pbadari@us.ibm.com>
[aik: put XICS_IRQS=1024 instead of 64i, fixed endianness and size]
Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: Alexander Graf <agraf@suse.de>
2014-06-16 13:24:46 +02:00
Alexey Kardashevskiy
d5ac4f5433 spapr_hcall: Add address-translation-mode-on-interrupt resource in H_SET_MODE
This adds handling of the RESOURCE_ADDR_TRANS_MODE resource from
the H_SET_MODE, for POWER8 (PowerISA 2.07) only.

This defines AIL flags for LPCR special register.

This changes @excp_prefix according to the mode, takes effect in TCG.

This turns support of a new capability PPC2_ISA207S flag for TCG.

Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Reviewed-by: Tom Musta <tommusta@gmail.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
2014-06-16 13:24:45 +02:00
Alexey Kardashevskiy
1b8eceee28 spapr_iommu: Introduce bus_offset in sPAPRTCETable
This adds @bus_offset into sPAPRTCETable to tell where TCE table starts
from. It is set to 0 for emulated devices. Dynamic DMA windows will use
other offset.

Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: Alexander Graf <agraf@suse.de>
2014-06-16 13:24:39 +02:00
Alexey Kardashevskiy
650f33adbd spapr_iommu: Introduce page_shift in sPAPRTCETable
At the moment only 4K pages are supported by sPAPRTCETable. Since sPAPR
spec allows other page sizes and we are going to implement them, we need
page size to be configrable.

This adds @page_shift into sPAPRTCETable and replaces SPAPR_TCE_PAGE_SHIFT
with it where it is possible.

Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: Alexander Graf <agraf@suse.de>
2014-06-16 13:24:39 +02:00
Alexey Kardashevskiy
523e7b8ab8 spapr_iommu: Get rid of window_size in sPAPRTCETable
This removes window_size as it is basically a copy of nb_table
shifted by SPAPR_TCE_PAGE_SHIFT. As new dynamic DMA windows are
going to support windows as big as the entire RAM and this number
will be bigger that 32 capacity, we will have to do something
about @window_size anyway and removal seems to be the right way to go.

This removes dma_window_start/dma_window_size from sPAPRPHBState as
they are no longer used.

Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: Alexander Graf <agraf@suse.de>
2014-06-16 13:24:39 +02:00
Alexey Kardashevskiy
e28c16f61f spapr_pci: Allow multiple TCE tables per PHB
At the moment sPAPRPHBState contains a @tcet pointer to the only
TCE table. However sPAPR spec allows having more than one DMA window.

Since the TCE object is already a child of SPAPR PHB object, there is
no need to keep an additional pointer to it in sPAPRPHBState so remove it.

This changes the way sPAPRPHBState::reset performs reset of sPAPRTCETable
objects.

This changes the default DMA window properties calculation.

Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: Alexander Graf <agraf@suse.de>
2014-06-16 13:24:39 +02:00
Alexey Kardashevskiy
cca7fad576 spapr_pci: spapr_iommu: Make DMA window a subregion
Currently the default DMA window is represented by a single MemoryRegion.
However there can be more than just one window so we need
a "root" memory region to be separated from the actual DMA window(s).

This introduces a "root" IOMMU memory region and adds a subregion for
the default DMA 32bit window. Following patches will add other
subregion(s).

This initializes a default DMA window subregion size to the guest RAM
size as this window can be switched into "bypass" mode which implements
direct DMA mapping.

Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: Alexander Graf <agraf@suse.de>
2014-06-16 13:24:39 +02:00