mirrors/qemu - qemu - SynapseOS git

Author	SHA1	Message	Date
Anton Blanchard	85423d90c7	hypervisor property clashes with hypervisor node dtc fails on a recent QEMU snapshot: ERROR (name_properties): "name" property in /hypervisor#1 is incorrect ("hypervisor" instead of base node name) Looking at the device tree we have a hypervisor property: # lsprop hypervisor hypervisor "kvm" But we also have a hypervisor node, with a name that doesn't match: # lsprop hypervisor#1/ name "hypervisor" compatible "linux,kvm" linux,phandle 7e5eb5d8 (2120136152) Commit c08ce91d309c (spapr: add uuid/host details to device tree) looks to have collided with an earlier patch. Remove the hypervisor property. Signed-off-by: Anton Blanchard <anton@samba.org> Signed-off-by: Alexander Graf <agraf@suse.de>	2014-09-08 12:50:54 +02:00
Greg Kurz	8c46f7ec85	spapr_pci: map the MSI window in each PHB On sPAPR, virtio devices are connected to the PCI bus and use MSI-X. Commit `cc943c36fa` has modified MSI-X so that writes are made using the bus master address space and follow the IOMMU path. Unfortunately, the IOMMU address space address space does not have an MSI window: the notification is silently dropped in unassigned_mem_write instead of reaching the guest... The most visible effect is that all virtio devices are non-functional on sPAPR since then. :( This patch does the following: 1) map the MSI window into the IOMMU address space for each PHB - since each PHB instantiates its own IOMMU address space, we can safely map the window at a fixed address (SPAPR_PCI_MSI_WINDOW) - no real need to keep the MSI window setup in a separate function, the spapr_pci_msi_init() code moves to spapr_phb_realize(). 2) kill the global MSI window as it is not needed in the end Signed-off-by: Greg Kurz <gkurz@linux.vnet.ibm.com> Signed-off-by: Alexander Graf <agraf@suse.de>	2014-09-08 12:50:53 +02:00
Alexey Kardashevskiy	3242052248	spapr_pci: Fix config space corruption When disabling MSI/MSIX via "ibm,change-msi" RTAS call, no check was made if MSI or MSIX is actually supported and the MSI message was reset unconditionally. If this happened on a device which does not support MSI (but does support MSIX, otherwise "ibm,change-msi" would not be called), this device would have PCIDevice::msi_cap field (MSI capability offset) set to zero and writing a vector would actually clear PCI status. This clears MSI message only if MSI or MSIX is present on a device. Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru> Signed-off-by: Alexander Graf <agraf@suse.de>	2014-09-08 12:50:52 +02:00
Alexander Graf	b981289c49	PPC: Cuda: Use cuda timer to expose tbfreq to guest Mac OS X calibrates a number of frequencies on bootup based on reading tb values on bootup and comparing them to via cuda timer values. The only variable we can really steer well (thanks to KVM) is the cuda frequency. So let's use that one to fake Mac OS X into believing the bus frequency is tbfreq * 4. That way Mac OS X will automatically calculate the correct timebase frequency. With this patch and the patch set I posted earlier I can successfully run Mac OS X 10.2, 10.3 and 10.4 guests with -M mac99 on TCG and KVM. Suggested-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Alexander Graf <agraf@suse.de>	2014-09-08 12:50:52 +02:00
Alexander Graf	caae6c9611	PPC: Mac: Move tbfreq into local variable We already expose the real CPU's tb frequency to the guest via fw_cfg. Soon we will need to also expose it to the MacIO, so let's move it to a variable that we can leverage every time we need the frequency. Signed-off-by: Alexander Graf <agraf@suse.de>	2014-09-08 12:50:52 +02:00
Alexander Graf	a8b0503701	PPC: mac_nvram: Remove unused functions The macio_nvram_read and macio_nvram_write functions are never called, just remove them. Signed-off-by: Alexander Graf <agraf@suse.de>	2014-09-08 12:50:51 +02:00
Nikunj A Dadhania	9674a35626	ppc/spapr: Fix MAX_CPUS to 255 MAX_CPUS 256 is inconsistent with qemu supporting upto 255 cpus. This MAX_CPUS number was percolated back to "virsh capabilities" with wrong max_cpus. Signed-off-by: Nikunj A Dadhania <nikunj@linux.vnet.ibm.com> Signed-off-by: Alexander Graf <agraf@suse.de>	2014-09-08 12:50:49 +02:00
Benjamin Herrenschmidt	b7d1f77ada	spapr: Locate RTAS and device-tree based on real RMA We currently calculate the final RTAS and FDT location based on the early estimate of the RMA size, cropped to 256M on KVM since we only know the real RMA size at reset time which happens much later in the boot process. This means the FDT and RTAS end up right below 256M while they could be much higher, using precious RMA space and limiting what the OS bootloader can put there which has proved to be a problem with some OSes (such as when using very large initrd's) Fortunately, we do the actual copy of the device-tree into guest memory much later, during reset, late enough to be able to do it using the final RMA value, we just need to move the calculation to the right place. However, RTAS is still loaded too early, so we change the code to load the tiny blob into qemu memory early on, and then copy it into guest memory at reset time. It's small enough that the memory usage doesn't matter. Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> [aik: fixed errors from checkpatch.pl, defined RTAS_MAX_ADDR] Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru> [agraf: fix compilation on 32bit hosts] Signed-off-by: Alexander Graf <agraf@suse.de>	2014-09-08 12:50:48 +02:00
Alexey Kardashevskiy	c3b4f589d8	spapr: Fix ibm, associativity for memory nodes We want the associtivity lists of memory and CPU nodes to match but memory nodes have incorrect domain#3 which is zero for CPU so they won't match. This clears domain#3 in the list to match CPUs associtivity lists. Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru> Signed-off-by: Alexander Graf <agraf@suse.de>	2014-09-08 12:50:48 +02:00
Alexey Kardashevskiy	b082d65a30	spapr: Add a helper for node0_size calculation In multiple places there is a node0_size variable calculation which assumes that NUMA node #0 and memory node #0 are the same things which they are not. Since we are going to change it and do not want to change it in multiple places, let's make a helper. This adds a spapr_node0_size() helper and makes use of it. Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru> Signed-off-by: Alexander Graf <agraf@suse.de>	2014-09-08 12:50:48 +02:00
Alexey Kardashevskiy	6010818c30	spapr: Split memory nodes to power-of-two blocks Linux kernel expects nodes to have power-of-two size and does WARN_ON if this is not the case: [ 0.041456] WARNING: at drivers/base/memory.c:115 which is: === /* Validate blk_sz is a power of 2 and not less than section size */ if ((block_sz & (block_sz - 1)) \|\| (block_sz < MIN_MEMORY_BLOCK_SIZE)) { WARN_ON(1); block_sz = MIN_MEMORY_BLOCK_SIZE; } === This splits memory nodes into set of smaller blocks with a size which is a power of two. This makes sure the start address of every node is aligned to the node size. Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru> [agraf: squash windows compile fix in] Signed-off-by: Alexander Graf <agraf@suse.de>	2014-09-08 12:50:48 +02:00
Alexey Kardashevskiy	7db8a127e3	spapr: Refactor spapr_populate_memory() to allow memoryless nodes Current QEMU does not support memoryless NUMA nodes, however actual hardware may have them so it makes sense to have a way to emulate them in QEMU. This prepares SPAPR for that. This moves 2 calls of spapr_populate_memory_node() into the existing loop over numa nodes so first several nodes may have no memory and this still will work. If there is no numa configuration, the code assumes there is just a single node at 0 and it has all the guest memory. Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru> Signed-off-by: Alexander Graf <agraf@suse.de>	2014-09-08 12:50:48 +02:00
Alexey Kardashevskiy	81014ac2b8	spapr: Use DT memory node rendering helper for other nodes This finishes refactoring by using the spapr_populate_memory_node helper for all nodes and removing leftovers from spapr_populate_memory(). This is not a part of the previous patch because the patches look nicer apart. Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru> Signed-off-by: Alexander Graf <agraf@suse.de>	2014-09-08 12:50:47 +02:00
Alexey Kardashevskiy	26a8c353bf	spapr: Move DT memory node rendering to a helper This moves recurring bits of code related to memory@xxx nodes creation to a helper. This makes use of the new helper for node@0. Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru> Signed-off-by: Alexander Graf <agraf@suse.de>	2014-09-08 12:50:47 +02:00
Gonglei	a21a7a7012	spapr: fix possible memory leak get_boot_devices_list() will malloc memory, spapr_finalize_fdt doesn't free it. Signed-off-by: Chenliang <chenliang88@huawei.com> Signed-off-by: Gonglei <arei.gonglei@huawei.com> Signed-off-by: Alexander Graf <agraf@suse.de>	2014-09-08 12:50:47 +02:00
Alexander Graf	261265cc91	PPC: mac99: Move NVRAM to page boundary when necessary When running KVM we have to adhere to host page boundaries for memory slots. Unfortunately the NVRAM on mac99 is a 4k RAM hole inside of an MMIO flash area. So if our host is configured with 64k page size, we can't use the mac99 target with KVM. This is a real shame, as this limitation is not really an issue - we can easily map NVRAM somewhere else and at least Linux and Mac OS X use it at their new location. So in that emergency case when it's about failing to run at all and moving NVRAM to a place it shouldn't be at, choose the latter. This patch enables -M mac99 with KVM on 64k page size hosts. Signed-off-by: Alexander Graf <agraf@suse.de>	2014-09-08 12:50:47 +02:00
Nikunj A Dadhania	ef9514431d	spapr: add uuid/host details to device tree Useful for identifying the guest/host uniquely within the guest. Adding following properties to the guest root node. vm,uuid - uuid of the guest host-model - Host model number host-serial - Host machine serial number hypervisor type - Tells its "kvm" Signed-off-by: Nikunj A Dadhania <nikunj@linux.vnet.ibm.com> Signed-off-by: Alexander Graf <agraf@suse.de>	2014-09-08 12:50:47 +02:00
Peter Maydell	7d0cd464a7	hw/ppc/spapr_hcall.c: Fix typo in function names Fix a typo in the names of a couple of functions (s/resouce/resource/). Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Signed-off-by: Alexander Graf <agraf@suse.de>	2014-09-08 12:50:47 +02:00
Nikunj A Dadhania	2e14072f9e	ppc: spapr-rtas - implement os-term rtas call PAPR compliant guest calls this in absence of kdump. This finally reaches the guest and can be handled according to the policies set by higher level tools(like taking dump) for further analysis by tools like crash. Linux kernel calls ibm,os-term when extended property of os-term is set. This makes sure that a return to the linux kernel is gauranteed. Signed-off-by: Nikunj A Dadhania <nikunj@linux.vnet.ibm.com> [agraf: reduce RTAS_TOKEN_MAX] Signed-off-by: Alexander Graf <agraf@suse.de>	2014-09-08 12:50:45 +02:00
Alexander Graf	277c7a4d71	PPC: KVM: Fix g3beige and mac99 when HV is loaded On PPC we have 2 different styles of KVM: PR and HV. HV can only virtualize sPAPR guests while PR can virtualize everything that's reasonably close to the host hardware platform. As long as only one kernel module (PR or HV) is loaded, the "default" kvm type is the module that's loaded. So if your hardware only supports PR mode you can easily spawn a Mac VM. However, if both HV and PR are loaded we default to HV mode. And in that case the Mac machines have to explicitly ask for PR mode to get a working VM. Fix this up by explicitly having the Mac machines ask for PR style KVM. This fixes bootup of Mac VMs on systems where bot HV and PR kvm modules are loaded for me. Signed-off-by: Alexander Graf <agraf@suse.de>	2014-09-08 12:50:45 +02:00
Peter Maydell	f2426947de	pci, pc fixes, features A bunch of bugfixes - these will make sense for 2.1.1 Initial Intel IOMMU support. Signed-off-by: Michael S. Tsirkin <mst@redhat.com> -----BEGIN PGP SIGNATURE----- Version: GnuPG v1 iQEcBAABAgAGBQJUBdygAAoJECgfDbjSjVRpa9cIAJS06we0CpJaVmPrQS5HvC1w An5Y5bGdfMQtfKjqN1Kehmtu/+wjNKZJw427+6B+KNO7wm9rRUiu927qp9lNGlbH g3ybrknKYeyqVO/43SJt8c1eODSkmNgHPqyCkRVLbriYo850b2HhjJyMvVNZqeHD zuTmU95GTNeiYAV8J1c59OrqUz302kCXI4A47loY7LdoEFMbJat4DbkrkspuTgbQ EVk5sR8p2atKzgaOV6M6yiAtL5uSBNr9KmHvuA7ZBiV21wmOJm5u3y6DpLczUD90 +Ln6BCjmPS5GQ12pzY7U65enr/x/RYo6k01ig9MP3TndNA02XxCaskqfd083jM8= =4drK -----END PGP SIGNATURE----- Merge remote-tracking branch 'remotes/mst/tags/for_upstream' into staging pci, pc fixes, features A bunch of bugfixes - these will make sense for 2.1.1 Initial Intel IOMMU support. Signed-off-by: Michael S. Tsirkin <mst@redhat.com> # gpg: Signature made Tue 02 Sep 2014 16:05:04 BST using RSA key ID D28D5469 # gpg: Good signature from "Michael S. Tsirkin <mst@kernel.org>" # gpg: aka "Michael S. Tsirkin <mst@redhat.com>" * remotes/mst/tags/for_upstream: vhost_net: start/stop guest notifiers properly pci: avoid losing config updates to MSI/MSIX cap regs virtio-net: don't run bh on vm stopped ioh3420: remove unused ioh3420_init() declaration vhost_net: cleanup start/stop condition intel-iommu: add IOTLB using hash table intel-iommu: add context-cache to cache context-entry intel-iommu: add supports for queued invalidation interface intel-iommu: fix coding style issues around in q35.c and machine.c intel-iommu: add Intel IOMMU emulation to q35 and add a machine option "iommu" as a switch intel-iommu: add DMAR table to ACPI tables intel-iommu: introduce Intel IOMMU (VT-d) emulation iommu: add is_write as a parameter to the translate function of MemoryRegionIOMMUOps Signed-off-by: Peter Maydell <peter.maydell@linaro.org>	2014-09-02 16:07:31 +01:00
Le Tan	8d7b8cb9c2	iommu: add is_write as a parameter to the translate function of MemoryRegionIOMMUOps Add a bool variable is_write as a parameter to the translate function of MemoryRegionIOMMUOps to indicate the operation of the access. It can be used for correct fault reporting from within the callback. Change the interface of related functions. Signed-off-by: Le Tan <tamlokveer@gmail.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>	2014-08-28 23:10:22 +02:00
Alexey Kardashevskiy	3431648272	spapr: Add support for new NMI interface This implements an NMI interface POWERPC SPAPR machine. This enables an "nmi" HMP/QMP command supported on SPAPR. This calls POWERPC_EXCP_RESET (vector 0x100) in the guest to deliver NMI to every CPU. The expected result is XMON (in-kernel debugger) invocation. Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru> Reviewed-by: Alexander Graf <agraf@suse.de> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2014-08-25 13:25:16 +02:00
Peter Maydell	0e4a773705	SCSI changes that enable sending vendor-specific commands via virtio-scsi. Memory changes for QOMification and automatic tracking of MR lifetime. -----BEGIN PGP SIGNATURE----- Version: GnuPG v2.0.22 (GNU/Linux) iQIcBAABAgAGBQJT8et9AAoJEBvWZb6bTYbyIJAQAI3AlLSe27xWoUGfQUgWH30z Rt/pShHz3BJMfQpD79JfTH8u6uBpkQmKtflerNT7FhXN9ULDzNq+b/jRtke8nkuy ctCt05FhhK00rfWpUoRue4XiCuvbizBU7MK0DI3yCyNdXQyYnFvgnvsJtlqox8Zh J5HZcBJEmdCiWBxq7UPk0qBitp4PqNoy7jlD/Ex3m7fJN5WK2cyspQIT9zmhehVn B8Nwp+RitDDbXbwm0r18col5rFr/6Nj6+dW1gr+7sVJDLNsmJEqC2l3Kgk0wbPkG Uqwbih29me9PC9/L1VLGHY0ApKDQ8JGE0GrYgEg162hbhoxEHkjjoHMhDUfV6Pj8 NkqcjjWl11UUhgkNqrGafayXbBVnOiEglxy8uXCeq14y9Xd/gjK9Fz6MQvRSOjms PFmaKknhdmpxh0DuZmTix7WBmKim8zOiCE0/vrAPvwx5L+d1bn5xh6yQvtVjBMpU Sru3Mhdm9bL9dUDBgOM/G6WCxSTVLBlExOblcYkQh03MfabD7bfplcrKYPXt5ull Y8YLjqkoIfoy5t0ErvtlpdBJjeEz99JXU+wLQ6NYHnzwzTV+oUtSaEph14mAFOcY XkFKdoPDI9PnyEfvy4193du8z/dSbhu7sWgHWbTCQyrcaNnSaVhlH43NUC+p23YN 8vfEsVLd1X7MFkDBUmWp =M+/m -----END PGP SIGNATURE----- Merge remote-tracking branch 'remotes/bonzini/tags/for-upstream' into staging SCSI changes that enable sending vendor-specific commands via virtio-scsi. Memory changes for QOMification and automatic tracking of MR lifetime. # gpg: Signature made Mon 18 Aug 2014 13:03:09 BST using RSA key ID 9B4D86F2 # gpg: Good signature from "Paolo Bonzini <pbonzini@redhat.com>" # gpg: aka "Paolo Bonzini <bonzini@gnu.org>" * remotes/bonzini/tags/for-upstream: mtree: remove write-only field memory: Use canonical path component as the name memory: Use memory_region_name for name access memory: constify memory_region_name exec: Abstract away ref to memory region names loader: Abstract away ref to memory region names tpm_tis: remove instance_finalize callback memory: remove memory_region_destroy memory: convert memory_region_destroy to object_unparent ioport: split deletion and destruction nic: do not destroy memory regions in cleanup functions vga: do not dynamically allocate chain4_alias sysbus: remove unused function sysbus_del_io qom: object: move unparenting to the child property's release callback qom: object: delete properties before calling instance_finalize virtio-scsi: implement parse_cdb scsi-block, scsi-generic: implement parse_cdb scsi-block: extract scsi_block_is_passthrough scsi-bus: introduce parse_cdb in SCSIDeviceClass and SCSIBusInfo scsi-bus: prepare scsi_req_new for introduction of parse_cdb Signed-off-by: Peter Maydell <peter.maydell@linaro.org>	2014-08-19 13:00:57 +01:00
Paolo Bonzini	d8d9581460	memory: convert memory_region_destroy to object_unparent Explicitly call object_unparent in the few places where we will re-create the memory region. If the memory region is simply being destroyed as part of device teardown, let QOM handle it. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2014-08-18 12:06:20 +02:00
Peter Crosthwaite	aa2ac1dac3	ppc: convert g_new(qemu_irq usages to g_new0 To indicate the IRQs are initially disconnected. Signed-off-by: Peter Crosthwaite <peter.crosthwaite@xilinx.com> Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>	2014-08-15 18:54:50 +04:00
Hu Tao	e206ad4833	ppc: fix -mem-path failure commit `e938ba0c` tried to enable -mem-path for ppc but breaked some ppc boards. The problems are: 1. it fails when allocating memory for rom, sram whose sizes are less than huge page size: ./ppc-softmmu/qemu-system-ppc -m 512 -mem-path /hugepages/ \ -kernel /home/hutao/Downloads/vmlinux-ppc -initrd \ /home/hutao/Downloads/initrd-ppc.gz qemu-system-ppc: /mnt/data/projects/qemu/exec.c:1184: qemu_ram_set_idstr: Assertion `new_block' failed. 2. if there is a numa node backed by memory backend object, qemu fails with message: ./ppc-softmmu/qemu-system-ppc -m 512 \ -object memory-backend-file,size=512M,mem-path=/hugepages,id=f0 \ -numa node,nodeid=0,memdev=f0 \ -kernel /home/hutao/Downloads/vmlinux-ppc \ -initrd /home/hutao/Downloads/initrd-ppc.gz qemu-system-ppc: memory backend f0 is used multiple times. Each -numa option must use a different memdev value. This patch does following: 1. replaces memory_region_allocate_system_memory() with memory_region_init_ram() for rom, sram. Then only system memory is backed by hugepages when specifying mem-path. 2. for memory banks, allocates all ram with one memory_region_allocate_system_memory(), and use memory_region_init_alias() to initialize memory banks. Tested machines: default(g3beige), mac99, taihu, bamboo, ref405ep. Signed-off-by: Hu Tao <hutao@cn.fujitsu.com> Signed-off-by: Alexander Graf <agraf@suse.de>	2014-07-22 17:37:25 +02:00
Gavin Shan	27e27782f7	sPAPR/IOMMU: Fix TCE entry permission The permission of TCE entry should exclude physical base address. Otherwise, unmapping TCE entry can be interpreted to mapping TCE entry wrongly for VFIO devices. Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com> Acked-by: Alex Williamson <alex.williamson@redhat.com> Signed-off-by: Alexander Graf <agraf@suse.de>	2014-07-15 16:11:59 +02:00
Alexey Kardashevskiy	f92f5da108	spapr: Enable use of huge pages `0b183fc87` "memory: move mem_path handling to memory_region_allocate_system_memory" disabled -mempath use for all machines that do not use memory_region_allocate_system_memory() to register RAM. Since SPAPR uses memory_region_init_ram(), the huge pages support was disabled for it. This replaces memory_region_init_ram()+vmstate_register_ram_global() with memory_region_allocate_system_memory() to get huge pages back. This changes RAM size from (ram_limit - rma_alloc_size) to ram_limit as the previous patch moved RMA memory region allocation after RAM allocation and therefore this change does not have immediate effect but simplifies the code. Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru> Signed-off-by: Alexander Graf <agraf@suse.de>	2014-07-15 16:11:59 +02:00
Alexey Kardashevskiy	658fa66b81	spapr: Move RMA memory region registration code PPC970 does not support VRMA (virtual RMA) so real memory required for SLOF to execute must be allocated by the KVM_ALLOCATE_RMA ioctl. Later this memory is used as a part of the guest RAM area. The RMA allocating code also registers a memory region for this piece of RAM. We are going to simplify memory regions layout: RMA memory region will be a subregion in the RAM memory region, both starting from zero. This way we will not have to take care of start address alignment for the piece of RAM next to the RMA. This moves memory region business closer to the RAM memory region creation/allocation code. As this is a mechanical patch, no change in behaviour is expected. Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru> [agraf: fix compilation on non-kvm systems] Signed-off-by: Alexander Graf <agraf@suse.de>	2014-07-15 16:11:59 +02:00
Shreyas B. Prabhu	e938ba0c35	ppc: memory: Replace memory_region_init_ram with memory_region_allocate_system_memory Commit 0b183fc871:"memory: move mem_path handling to memory_region_allocate_system_memory" split memory_region_init_ram and memory_region_init_ram_from_file. Also it moved mem-path handling a step up from memory_region_init_ram to memory_region_allocate_system_memory. Therefore for any board that uses memory_region_init_ram directly, -mem-path is not supported. Fix this by replacing memory_region_init_ram with memory_region_allocate_system_memory. Signed-off-by: Shreyas B. Prabhu <shreyas@linux.vnet.ibm.com> Signed-off-by: Alexander Graf <agraf@suse.de>	2014-07-15 16:11:58 +02:00
Peter Maydell	b653282ecc	hw/ppc/spapr_hcall.c: Add ULL suffix to 64 bit constant Add ULL suffix to 64 bit constant to prevent compiler warnings on some 32 bit platforms. Signed-off-by: Peter Maydell <peter.maydell@linaro.org>	2014-07-08 16:03:19 +01:00
Laurent Dufour	4bce526ec4	target-ppc: KVMPPC_H_CAS fix cpu-version endianess During KVMPPC_H_CAS processing, the cpu-version updated value is stored without taking care of the current endianess. As a consequence, the guest may not switch to the right CPU model, leading to unexpected results. If needed, the value is now converted. Fixes: `6d9412ea81` ("target-ppc: Implement "compat" CPU option") Signed-off-by: Laurent Dufour <ldufour@linux.vnet.ibm.com> Reviewed-by: Greg Kurz <gkurz@linux.vnet.ibm.com> Signed-off-by: Alexander Graf <agraf@suse.de>	2014-07-08 12:10:36 +02:00
Hervé Poussineau	56de2e5269	prep: Remove CPU reset entry point hack related to OpenHack'Ware Signed-off-by: Hervé Poussineau <hpoussin@reactos.org> Signed-off-by: Andreas Färber <andreas.faerber@web.de>	2014-07-07 16:46:35 +02:00
Hervé Poussineau	97db046678	prep: Remove PCI memory hack related to OpenHack'Ware Signed-off-by: Hervé Poussineau <hpoussin@reactos.org> Signed-off-by: Andreas Färber <andreas.faerber@web.de>	2014-07-07 16:46:35 +02:00
Alexander Graf	79c0ff2cae	PPC: e500: Only create dt entries for existing serial ports When the user specifies -nodefaults he can tell us that he doesn't want any serial ports spawned by default. While we do honor that wish, we still create device tree entries for those non-existent devices. Make device tree generation depend on whether the device is actually available. Signed-off-by: Alexander Graf <agraf@suse.de>	2014-06-27 13:48:27 +02:00
Alexey Kardashevskiy	9a321e9234	spapr_pci: Use XICS interrupt allocator and do not cache interrupts in PHB Currently SPAPR PHB keeps track of all allocated MSI (here and below MSI stands for both MSI and MSIX) interrupt because XICS used to be unable to reuse interrupts. This is a problem for dynamic MSI reconfiguration which happens when guest reloads a driver or performs PCI hotplug. Another problem is that the existing implementation can enable MSI on 32 devices maximum (SPAPR_MSIX_MAX_DEVS=32) and there is no good reason for that. This makes use of new XICS ability to reuse interrupts. This reorganizes MSI information storage in sPAPRPHBState. Instead of static array of 32 descriptors (one per a PCI function), this patch adds a GHashTable when @config_addr is a key and (first_irq, num) pair is a value. GHashTable can dynamically grow and shrink so the initial limit of 32 devices is gone. This changes migration stream as @msi_table was a static array while new @msi_devs is a dynamic hash table. This adds temporary array which is used for migration, it is populated in "spapr_pci"::pre_save() callback and expanded into the hash table in post_load() callback. Since the destination side does not know the number of MSI-enabled devices in advance and cannot pre-allocate the temporary array to receive migration state, this makes use of new VMSTATE_STRUCT_VARRAY_ALLOC macro which allocates the array automatically. This resets the MSI configuration space when interrupts are released by the ibm,change-msi RTAS call. This fixed traces to be more informative. This changes vmstate_spapr_pci_msi name from "...lsi" to "...msi" which was incorrect by accident. As the internal representation changed, thus bumps migration version number. Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru> [agraf: drop g_malloc_n usage] Signed-off-by: Alexander Graf <agraf@suse.de>	2014-06-27 13:48:27 +02:00
Alexey Kardashevskiy	ba0e5bf8de	spapr: Remove @next_irq This removes @next_irq from sPAPREnvironment which was used in old IRQ allocator as XICS is now responsible for IRQs and keeps track of allocated IRQs. Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru> Signed-off-by: Alexander Graf <agraf@suse.de>	2014-06-27 13:48:26 +02:00
Alexey Kardashevskiy	bee763dbfb	spapr: Move interrupt allocator to xics The current allocator returns IRQ numbers from a pool and does not support IRQs reuse in any form as it did not keep track of what it previously returned, it only keeps the last returned IRQ. Some use cases such as PCI hot(un)plug may require IRQ release and reallocation. This moves an allocator from SPAPR to XICS. This switches IRQ users to use new API. This uses LSI/MSI flags to know if interrupt is allocated. The interrupt release function will be posted as a separate patch. Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru> Signed-off-by: Alexander Graf <agraf@suse.de>	2014-06-27 13:48:26 +02:00
Sam bobroff	3b50d8974b	spapr: Add RTAS sysparm SPLPAR Characteristics Add support for the SPLPAR Characteristics parameter to the emulated RTAS call ibm,get-system-parameter. The support provides just enough information to allow "cat /proc/powerpc/lparcfg" to succeed without generating a kernel error message. Without this patch the above command will produce the following kernel message: arch/powerpc/platforms/pseries/lparcfg.c \ parse_system_parameter_string Error calling get-system-parameter \ (0xfffffffd) Signed-off-by: Sam Bobroff <sam.bobroff@au1.ibm.com> Signed-off-by: Alexander Graf <agraf@suse.de>	2014-06-27 13:48:26 +02:00
Sam bobroff	b907d7b0fd	spapr: Add RTAS sysparm UUID Add support for the UUID parameter to the emulated RTAS call ibm,get-system-parameter. Return the guest's UUID as the value for the RTAS UUID system parameter, or null (a zero length result) if it is not set. Signed-off-by: Sam Bobroff <sam.bobroff@au1.ibm.com> Signed-off-by: Alexander Graf <agraf@suse.de>	2014-06-27 13:48:26 +02:00
Sam bobroff	3052d95190	spapr: Fix RTAS sysparm DIAGNOSTICS_RUN_MODE This allows the ibm,get-system-parameter RTAS call to succeed for the DIAGNOSTICS_RUN_MODE system parameter. The problem can be seen with "ppc64_cpu --run-mode" from the powerpc-utils package which fails before this patch with "Machine does not support diagnostic run mode". This is corrected by using the rtas_st_buffer() function to write to the buffer. The RTAS constants are also moved out into a header file, some new constants added and the surrounding code slightly simplified. Signed-off-by: Sam Bobroff <sam.bobroff@au1.ibm.com> [agraf: remove some commentary] Signed-off-by: Alexander Graf <agraf@suse.de>	2014-06-27 13:48:25 +02:00
Alexey Kardashevskiy	6026db4501	spapr: Define a 2.1 pseries machine This adds a v2.1 machine to support backward compatibility for newer macines in the case if they ever be implemented. This adds a "pseries-2.1" machine as a child of the "pseries" machine and only changes visible machine name. Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru> Signed-off-by: Alexander Graf <agraf@suse.de>	2014-06-27 13:48:25 +02:00
Alexey Kardashevskiy	6ca1502e36	spapr: Fix code design style (s/SPAPRMachine/sPAPRMachineState) Every single sPAPR QOM object has small first "s". Most (not all yet) QOM objects have "State" suffix. This replaces SPAPRMachine with sPAPRMachineState to conform with QEMU code style and removes redundant empty line. Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru> Signed-off-by: Alexander Graf <agraf@suse.de>	2014-06-27 13:48:25 +02:00
BALATON Zoltan	a0bb2a5fa0	mac99: Add motherboard devices before PCI cards Change the order of creating devices for New World Mac emulation so that devices on the motherboard are added first and PCI cards (VGA and NIC) come later. As a side effect, this also causes OpenBIOS to map the motherboard devices into the MMIO space to the same addresses as on real hardware and allow clients that hardcode these addresses (e.g. MorphOS) to find and use them until OpenBIOS is tought to map devices to specific addresses. (On real hardware the graphics and network cards are really on separate buses but we don't model that yet.) This brings the memory map closer to what is found on PowerMac3,1. Signed-off-by: BALATON Zoltan <balaton@eik.bme.hu> Signed-off-by: Alexander Graf <agraf@suse.de>	2014-06-27 13:48:24 +02:00
Alexey Kardashevskiy	9fc34ada7e	spapr_pci_vfio: Add spapr-pci-vfio-host-bridge to support vfio The patch adds a spapr-pci-vfio-host-bridge device type which is a PCI Host Bridge with VFIO support. The new device inherits from the spapr-pci-host-bridge device and adds an "iommu" property which is an IOMMU id. This ID represents a minimal entity for which IOMMU isolation can be guaranteed. In SPAPR architecture IOMMU group is called a Partitionable Endpoint (PE). Current implementation supports one IOMMU id per QEMU VFIO PHB. Since SPAPR allows multiple PHB for no extra cost, this does not seem to be a problem. This limitation may change in the future though. Example of use: Configure and Add 3 functions of a multifunctional device to QEMU: (the NEC PCI USB card is used as an example here): -device spapr-pci-vfio-host-bridge,id=USB,iommu=4,index=7 \ -device vfio-pci,host=4:0:1.0,addr=1.0,bus=USB,multifunction=true -device vfio-pci,host=4:0:1.1,addr=1.1,bus=USB -device vfio-pci,host=4:0:1.2,addr=1.2,bus=USB where: * index=7 is a QEMU PHB index (used as source for MMIO/MSI/IO windows offset); * iommu=4 is an IOMMU id which can be found in sysfs: [aik@vpl2 ~]$ cd /sys/bus/pci/devices/0004:00:00.0/ [aik@vpl2 0004:00:00.0]$ ls -l iommu_group lrwxrwxrwx 1 root root 0 Jun 5 12:49 iommu_group -> ../../../kernel/iommu_groups/4 Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru> Signed-off-by: Alexander Graf <agraf@suse.de>	2014-06-27 13:48:23 +02:00
Alexey Kardashevskiy	9bb62a0702	spapr_iommu: Make in-kernel TCE table optional POWER KVM supports an KVM_CAP_SPAPR_TCE capability which allows allocating TCE tables in the host kernel memory and handle H_PUT_TCE requests targeted to specific LIOBN (logical bus number) right in the host without switching to QEMU. At the moment this is used for emulated devices only and the handler only puts TCE to the table. If the in-kernel H_PUT_TCE handler finds a LIOBN and corresponding table, it will put a TCE to the table and complete hypercall execution. The user space will not be notified. Upcoming VFIO support is going to use the same sPAPRTCETable device class so KVM_CAP_SPAPR_TCE is going to be used as well. That means that TCE tables for VFIO are going to be allocated in the host as well. However VFIO operates with real IOMMU tables and simple copying of a TCE to the real hardware TCE table will not work as guest physical to host physical address translation is requited. So until the host kernel gets VFIO support for H_PUT_TCE, we better not to register VFIO's TCE in the host. This adds a place holder for KVM_CAP_SPAPR_TCE_VFIO capability. It is not in upstream yet and being discussed so now it is always false which means that in-kernel VFIO acceleration is not supported. This adds a bool @vfio_accel flag to the sPAPRTCETable device telling that sPAPRTCETable should not try allocating TCE table in the host kernel for VFIO. The flag is false now as at the moment there is no VFIO. This adds an vfio_accel parameter to spapr_tce_new_table(), the semantic is the same. Since there is only emulated PCI and VIO now, the flag is set to false. Upcoming VFIO support will set it to true. This is a preparation patch so no change in behaviour is expected Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru> Signed-off-by: Alexander Graf <agraf@suse.de>	2014-06-27 13:48:23 +02:00
Alexey Kardashevskiy	3a3b8502e6	spapr: Fix RTAS token numbers At the moment spapr_rtas_register() allocates a new token number for every new RTAS callback so numbers are not fixed and depend on the number of supported RTAS handlers and the exact order of spapr_rtas_register() calls. These tokens are copied into the device tree and remain the same during the guest lifetime. When we start another guest to receive a migration, it calls spapr_rtas_register() as well. If the number of RTAS handlers or their order is different in QEMU on source and destination sides, the "/rtas" node in the device tree will differ. Since migration overwrites the device tree (as it overwrites the entire RAM), the actual RTAS config on the destination side gets broken. This defines global contant values for every RTAS token which QEMU is using today. This changes spapr_rtas_register() to accept a token number instead of allocating one. This changes all users of spapr_rtas_register(). This changes XICS-KVM not to cache tokens registered with KVM as they constant now. This makes TOKEN_BASE global as RTAS_XXX use TOKEN_BASE as a base. TOKEN_MAX is moved and renamed too and its value is changed to the last token + 1. Boundary checks for token values are adjusted. This reserves token numbers for "os-term" handlers and PCI hotplug which we are working on. Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru> Signed-off-by: Alexander Graf <agraf@suse.de>	2014-06-27 13:48:22 +02:00
Avik Sil	cc84c0f357	spapr: Add "qemu, boot-menu" property to /chosen This is required to enable boot menu display during booting Signed-off-by: Avik Sil <aviksil@linux.vnet.ibm.com> Signed-off-by: Nikunj A Dadhania <nikunj@linux.vnet.ibm.com> Signed-off-by: Alexander Graf <agraf@suse.de>	2014-06-27 13:48:22 +02:00
Wenchao Xia	e010ad8f1e	qapi event: convert RTC_CHANGE This patch also eliminates build time warning caused by no caller of monitor_qapi_event_throttle(). Signed-off-by: Wenchao Xia <wenchaoqemu@gmail.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Signed-off-by: Luiz Capitulino <lcapitulino@redhat.com>	2014-06-23 11:12:27 -04:00
Wanlong Gao	8c85901ed3	NUMA: Add numa_info structure to contain numa nodes info Add the numa_info structure to contain the numa nodes memory, VCPUs information and the future added numa nodes host memory policies. Reviewed-by: Eduardo Habkost <ehabkost@redhat.com> Signed-off-by: Andre Przywara <andre.przywara@amd.com> Signed-off-by: Wanlong Gao <gaowanlong@cn.fujitsu.com> [Fix hw/ppc/spapr.c - Paolo] Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Hu Tao <hutao@cn.fujitsu.com> Acked-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>	2014-06-19 18:44:18 +03:00
Badari Pulavarty	9dbae97723	spapr_pci: Advertise MSI quota Hotplug of multiple disks fails due to MSI vector quota check. Number of MSI vectors default to 8 allowing only 4 devices. This happens on RHEL6.5 guest. RHEL7 and SLES11 guests fallback to INTX. One way to workaround the issue is to increase total MSIs, so that MSI quota check allows us to hotplug multiple disks. This sets the quota to the maximum number of interupts XICS has which is 1024 now (XICS_IRQS). This moves XICS_IRQS from spapr.c to xics.h for wider visibility. Signed-off-by: Badari Pulavarty <pbadari@us.ibm.com> [aik: put XICS_IRQS=1024 instead of 64i, fixed endianness and size] Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru> Signed-off-by: Alexander Graf <agraf@suse.de>	2014-06-16 13:24:46 +02:00
Eduardo Habkost	23825581d7	spapr: Add kvm-type property The kvm-type machine option was left out when MachineState was introduced, preventing the kvm-type option from being used. Add the missing property to the sPAPR machine class, so it can be used. Signed-off-by: Eduardo Habkost <ehabkost@redhat.com> Tested-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: Alexander Graf <agraf@suse.de>	2014-06-16 13:24:46 +02:00
Eduardo Habkost	748abce94f	spapr: Create SPAPRMachine struct Signed-off-by: Eduardo Habkost <ehabkost@redhat.com> Tested-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: Alexander Graf <agraf@suse.de>	2014-06-16 13:24:46 +02:00
Alexey Kardashevskiy	d5ac4f5433	spapr_hcall: Add address-translation-mode-on-interrupt resource in H_SET_MODE This adds handling of the RESOURCE_ADDR_TRANS_MODE resource from the H_SET_MODE, for POWER8 (PowerISA 2.07) only. This defines AIL flags for LPCR special register. This changes @excp_prefix according to the mode, takes effect in TCG. This turns support of a new capability PPC2_ISA207S flag for TCG. Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru> Reviewed-by: Tom Musta <tommusta@gmail.com> Signed-off-by: Alexander Graf <agraf@suse.de>	2014-06-16 13:24:45 +02:00
Alexey Kardashevskiy	c4015bbd50	spapr_hcall: Split h_set_mode() This moves H_SET_MODE_RESOURCE_LE handler to a separate function as there are other "resources" coming and this is going to become ugly. Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru> Reviewed-by: Tom Musta <tommusta@gmail.com> Signed-off-by: Alexander Graf <agraf@suse.de>	2014-06-16 13:24:45 +02:00
Alexander Graf	f7d6914654	PPC: spapr: Expose /hypervisor node in device tree PR KVM supports an ePAPR compliant hypercall interface in parallel to the normal sPAPR one. Expose the ePAPR /hypervisor node and properties to the guest so it can use it. This enables magic page sharing on PR KVM with -M pseries. However we had a few nasty bugs in the magic page implementation on vcpus newer than 970 (p7, p8) that KVM now has workarounds for. It indicates that it does have these workarounds through the PPC_FIXUP_HCALL capability. To not expose broken guest kernels to issues on host kernels that don't have the fixups in place, we don't expose working hypercall instructions when the fixups are not available so that the guest can never active the magic page. Signed-off-by: Alexander Graf <agraf@suse.de>	2014-06-16 13:24:41 +02:00
Alexey Kardashevskiy	1b8eceee28	spapr_iommu: Introduce bus_offset in sPAPRTCETable This adds @bus_offset into sPAPRTCETable to tell where TCE table starts from. It is set to 0 for emulated devices. Dynamic DMA windows will use other offset. Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru> Signed-off-by: Alexander Graf <agraf@suse.de>	2014-06-16 13:24:39 +02:00
Alexey Kardashevskiy	650f33adbd	spapr_iommu: Introduce page_shift in sPAPRTCETable At the moment only 4K pages are supported by sPAPRTCETable. Since sPAPR spec allows other page sizes and we are going to implement them, we need page size to be configrable. This adds @page_shift into sPAPRTCETable and replaces SPAPR_TCE_PAGE_SHIFT with it where it is possible. Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru> Signed-off-by: Alexander Graf <agraf@suse.de>	2014-06-16 13:24:39 +02:00
Alexey Kardashevskiy	523e7b8ab8	spapr_iommu: Get rid of window_size in sPAPRTCETable This removes window_size as it is basically a copy of nb_table shifted by SPAPR_TCE_PAGE_SHIFT. As new dynamic DMA windows are going to support windows as big as the entire RAM and this number will be bigger that 32 capacity, we will have to do something about @window_size anyway and removal seems to be the right way to go. This removes dma_window_start/dma_window_size from sPAPRPHBState as they are no longer used. Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru> Signed-off-by: Alexander Graf <agraf@suse.de>	2014-06-16 13:24:39 +02:00
Alexey Kardashevskiy	e4c35b78bc	spapr_iommu: Convert old qdev_init_nofail() to object_property_set_bool qdev_init_nofail() was replaced by object_property_set_bool("realized") all over the QEMU so do we. Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru> Signed-off-by: Alexander Graf <agraf@suse.de>	2014-06-16 13:24:39 +02:00
Alexey Kardashevskiy	e28c16f61f	spapr_pci: Allow multiple TCE tables per PHB At the moment sPAPRPHBState contains a @tcet pointer to the only TCE table. However sPAPR spec allows having more than one DMA window. Since the TCE object is already a child of SPAPR PHB object, there is no need to keep an additional pointer to it in sPAPRPHBState so remove it. This changes the way sPAPRPHBState::reset performs reset of sPAPRTCETable objects. This changes the default DMA window properties calculation. Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru> Signed-off-by: Alexander Graf <agraf@suse.de>	2014-06-16 13:24:39 +02:00
Alexey Kardashevskiy	cca7fad576	spapr_pci: spapr_iommu: Make DMA window a subregion Currently the default DMA window is represented by a single MemoryRegion. However there can be more than just one window so we need a "root" memory region to be separated from the actual DMA window(s). This introduces a "root" IOMMU memory region and adds a subregion for the default DMA 32bit window. Following patches will add other subregion(s). This initializes a default DMA window subregion size to the guest RAM size as this window can be switched into "bypass" mode which implements direct DMA mapping. Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru> Signed-off-by: Alexander Graf <agraf@suse.de>	2014-06-16 13:24:39 +02:00
Alexey Kardashevskiy	da6ccee418	spapr_pci: Introduce a finish_realize() callback The spapr-pci PHB initializes IOMMU for emulated devices only. The upcoming VFIO support will do it different. However both emulated and VFIO PHB types share most of the initialization code. For the type specific things a new finish_realize() callback is introduced. This introduces sPAPRPHBClass derived from PCIHostBridgeClass and adds the callback pointer. This implements finish_realize() for emulated devices. Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru> [agraf: Fix compilation] Signed-off-by: Alexander Graf <agraf@suse.de>	2014-06-16 13:24:39 +02:00
Alexey Kardashevskiy	da95324ebe	spapr_iommu: Enable multiple TCE requests Currently only single TCE entry per request is supported (H_PUT_TCE). However PAPR+ specification allows multiple entry requests such as H_PUT_TCE_INDIRECT and H_STUFF_TCE. Having less transitions to the host kernel via ioctls, support of these calls can accelerate IOMMU operations. This implements H_STUFF_TCE and H_PUT_TCE_INDIRECT. This advertises "multi-tce" capability to the guest if the host kernel supports it (KVM_CAP_SPAPR_MULTITCE) or guest is running in TCG mode. Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru> Signed-off-by: Alexander Graf <agraf@suse.de>	2014-06-16 13:24:39 +02:00
Alexey Kardashevskiy	a1d59c0ffa	spapr: Enable dynamic change of the supported hypercalls list At the moment the "ibm,hypertas-functions" list is fixed. However some calls should be listed there if they are supported by QEMU or the host kernel. This enables hyperrtas_prop to grow on stack by adding a SPAPR_HYPERRTAS_ADD macro. "qemu,hypertas-functions" is converted as well. The first user of this is going to be a "multi-tce" property. Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru> Signed-off-by: Alexander Graf <agraf@suse.de>	2014-06-16 13:24:38 +02:00
Alexey Kardashevskiy	00d4f525ec	spapr_iommu: Replace @instance_id with LIOBN for migration SPAPR IOMMU is a bus-less device and therefore its only ID in migration stream is an instance id which is not reliable ID as it depends on the command line parameters order. Since libvirt may change the order, we need something better than that. This removes VMSD descriptor from the class definitiion and registers it with @liobn as an intance ID to let the destination side find the right device to receive migration data. Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru> Signed-off-by: Alexander Graf <agraf@suse.de>	2014-06-16 13:24:38 +02:00
Alexey Kardashevskiy	3794d5482d	spapr: Implement processor compatibility in ibm, client-architecture-support Modern Linux kernels support last POWERPC CPUs so when a kernel boots, in most cases it can find a matching cpu_spec in the kernel's cpu_specs list. However if the kernel is quite old, it may be missing a definition of the actual CPU. To provide an ability for old kernels to work on modern hardware, a Processor Compatibility Mode has been introduced by the PowerISA specification. >From the hardware prospective, it is supported by the Processor Compatibility Register (PCR) which is defined in PowerISA. The register enables one of the compatibility modes (2.05/2.06/2.07). Since PCR is a hypervisor privileged register and cannot be directly accessed from the guest, the mode selection is done via ibm,client-architecture-support (CAS) RTAS call using which the guest specifies what "raw" and "architected" CPU versions it supports. QEMU works out the best match, changes a "cpu-version" property of every CPU and notifies the guest about the change by setting these properties in the buffer passed as a response on a custom H_CAS hypercall. This implements ibm,client-architecture-support parameters parsing (now only for PVRs) and cooks the device tree diff with new values for "cpu-version", "ibm,ppc-interrupt-server#s" and "ibm,ppc-interrupt-server#s" properties. Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru> Signed-off-by: Alexander Graf <agraf@suse.de>	2014-06-16 13:24:38 +02:00
Alexey Kardashevskiy	2a48d99335	spapr: Limit threads per core according to current compatibility mode This puts a limit to the number of threads per core based on the current compatibility mode. Although PowerISA specs do not specify the maximum threads per core number, the linux guest still expects that PowerISA2.05-compatible CPU supports only 2 threads per core as this is what POWER6 (2.05 compliant CPU) implements, the same is for POWER7 (2.06, 4 threads) and POWER8 (2.07, 8 threads). This calls spapr_fixup_cpu_smt_dt() with the maximum allowed number of threads which affects ibm,ppc-interrupt-server#s and ibm,ppc-interrupt-gserver#s properties. The number of CPU nodesremains unchanged. Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru> Signed-off-by: Alexander Graf <agraf@suse.de>	2014-06-16 13:24:38 +02:00
Alexey Kardashevskiy	82677ed2f5	spapr: Rework spapr_fixup_cpu_dt() In PPC code we usually use the "cs" name for a CPUState* variables and "cpu" for PowerPCCPU. So let's change spapr_fixup_cpu_dt() to use same rules as spapr_create_fdt_skel() does. This adds missing nodes creation if they do not already exist in the current device tree, this is going to be used from the client-architecture-support handler. Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru> Signed-off-by: Alexander Graf <agraf@suse.de>	2014-06-16 13:24:37 +02:00
Alexey Kardashevskiy	2a6593cb6a	spapr: Add ibm, client-architecture-support call The PAPR+ specification defines a ibm,client-architecture-support (CAS) RTAS call which purpose is to provide a negotiation mechanism for the guest and the hypervisor to work out the best compatibility parameters. During the negotiation process, the guest provides an array of various options and capabilities which it supports, the hypervisor adjusts the device tree and (optionally) reboots the guest. At the moment the Linux guest calls CAS method at early boot so SLOF gets called. SLOF allocates a memory buffer for the device tree changes and calls a custom KVMPPC_H_CAS hypercall. QEMU parses the options, composes a diff for the device tree, copies it to the buffer provided by SLOF and returns to SLOF. SLOF updates the device tree and returns control to the guest kernel. Only then the Linux guest parses the device tree so it is possible to avoid unnecessary reboot in most cases. The device tree diff is a header with an update format version (defined as 1 in this patch) followed by a device tree with the properties which require update. If QEMU detects that it has to reboot the guest, it silently does so as the guest expects reboot to happen because this is usual pHyp firmware behavior. This defines custom KVMPPC_H_CAS hypercall. The current SLOF already has support for it. This implements stub which returns very basic tree (root node, no properties) to the guest. As the return buffer does not contain any change, no change in behavior is expected. Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru> Signed-off-by: Alexander Graf <agraf@suse.de>	2014-06-16 13:24:37 +02:00
Alexey Kardashevskiy	6d9412ea81	target-ppc: Implement "compat" CPU option This adds basic support for the "compat" CPU option. By specifying the compat property, the user can manually switch guest CPU mode from "raw" to "architected". This defines feature disable bits which are not used yet as, for example, PowerISA 2.07 says if 2.06 mode is selected, the TM bit does not matter - transactional memory (TM) will be disabled because 2.06 does not define it at all. The same is true for VSX and 2.05 mode. So just setting a mode must be ok. This does not change the existing behavior as the actual compatibility mode support is coming in next patches. Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru> [agraf: fix compilation on 32bit hosts] Signed-off-by: Alexander Graf <agraf@suse.de>	2014-06-16 13:24:37 +02:00
Alexey Kardashevskiy	833d46685d	spapr: Move SMT-related properties out of skeleton fdt The upcoming support of the "ibm,client-architecture-support" reconfiguration call will be able to change dynamically the number of threads per core (SMT mode). From the device tree prospective this does not change the number of CPU nodes (as it is one node per a CPU core) but affects content and size of the ibm,ppc-interrupt-server#s and ibm,ppc-interrupt-gserver#s properties. This moves ibm,ppc-interrupt-server#s and ibm,ppc-interrupt-gserver#s out of the device tree skeleton. Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru> Signed-off-by: Alexander Graf <agraf@suse.de>	2014-06-16 13:24:37 +02:00
Alexey Kardashevskiy	10582ff832	spapr: Add ibm, chip-id property in device tree This adds a "ibm,chip-id" property for CPU nodes which should be the same for all cores in the same CPU socket. The recent guest kernels use this information to associate threads with sockets. Refer to the kernel commit 256f2d4b463d3030ebc8d2b54f427543814a2bdc for more details. Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru> Signed-off-by: Alexander Graf <agraf@suse.de>	2014-06-16 13:24:35 +02:00
Alexey Kardashevskiy	98a8b52442	spapr: Add support for time base offset migration This allows guests to have a different timebase origin from the host. This is needed for migration, where a guest can migrate from one host to another and the two hosts might have a different timebase origin. However, the timebase seen by the guest must not go backwards, and should go forwards only by a small amount corresponding to the time taken for the migration. This is only supported for recent POWER hardware which has the TBU40 (timebase upper 40 bits) register. That includes POWER6, 7, 8 but not 970. This adds kvm_access_one_reg() to access a special register which is not in env->spr. This requires kvm_set_one_reg/kvm_get_one_reg patch. The feature must be present in the host kernel. This bumps vmstate_spapr::version_id and enables new vmstate_ppc_timebase only for it. Since the vmstate_spapr::minimum_version_id remains unchanged, migration from older QEMU is supported but without vmstate_ppc_timebase. Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru> Signed-off-by: Alexander Graf <agraf@suse.de>	2014-06-16 13:24:35 +02:00
Alexander Graf	3812c71ffa	PPC: e500: Move to u-boot as firmware Almost all platforms QEMU emulates have some sort of firmware they can load to expose a guest environment that closely resembles the way it would look like on real hardware. This patch introduces such a firmware on our e500 platforms. U-boot is the default firmware for most of these systems and as such our preferred choice. For backwards compatibility reasons (and speed and simplicity) we skip u-boot when you use -kernel and don't pass in -bios. For all other combinations like -kernel and -bios or no -kernel you get u-boot as firmware. This allows you to modify the boot environment, execute a networked boot through the e1000 emulation and execute u-boot payloads. Signed-off-by: Alexander Graf <agraf@suse.de>	2014-06-16 13:24:35 +02:00
Alexander Graf	903585dec6	PPC: e500: Expose kernel load address in dt We want to move to a model where firmware loads our kernel. To achieve this we need to be able to tell firmware where the kernel lies. Let's copy the mechanism we already use for -M pseries and expose the kernel load address and size through the device tree. Signed-off-by: Alexander Graf <agraf@suse.de>	2014-06-16 13:24:35 +02:00
Bharat Bhushan	3016dca06c	PPC: e500: implement PCI INTx routing This patch adds pci pin to irq_num routing callback. This callback is called from pci_device_route_intx_to_irq to find which pci device maps to which irq. This fix is required for pci-device passthrough using vfio. Also without this patch we gets below prints " PCI: Bug - unimplemented PCI INTx routing (e500-pcihost) qemu-system-ppc64: PCI: Bug - unimplemented PCI INTx routing (e500-pcihost) " and Legacy interrupt does not work with pci device passthrough. Signed-off-by: Bharat Bhushan <Bharat.Bhushan@freescale.com> Acked-by: Michael S. Tsirkin <mst@redhat.com> [agraf: remove double semicolon] Signed-off-by: Alexander Graf <agraf@suse.de>	2014-06-16 13:24:34 +02:00
Bharat Bhushan	d575a6ce0e	PPC: e500: some pci related cleanup - Use PCI_NUM_PINS rather than hardcoding - use "pin" wherever possible Signed-off-by: Bharat Bhushan <Bharat.Bhushan@freescale.com> Signed-off-by: Alexander Graf <agraf@suse.de>	2014-06-16 13:24:34 +02:00
Alexey Kardashevskiy	28668b5f31	spapr_pci: fix MSI limit At the moment XICS does not support interrupts reuse so sPAPR PHB implements this. sPAPRPHBState holds array of 32 spapr_pci_msi to describe PCI config address, first MSI and number of MSIs. Once allocated for a device, QEMU tries reusing this config until the number of MSIs changes. Existing SPAPR guests call ibm,change-msi in a loop until the handler returns the requested number of vectors. Recently introduced check for the maximum number of MSI/MSIX vectors supported by a device only works for a device which is new for PHB's MSI cache. If it is already there, the check is not performed which leads to new IRQ block allocation. This happens during PCI hotplug even when the user hot plug the same device which he just hot unplugged. This moves the check earlier. Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru> Signed-off-by: Alexander Graf <agraf@suse.de>	2014-06-16 13:24:32 +02:00
BALATON Zoltan	9d1c128341	mac99: Added FW_CFG_PPC_BUSFREQ to match CLOCKFREQ and TBFREQ already there While there, also moved the hard coded value for CLOCKFREQ to a #define. Signed-off-by: BALATON Zoltan <balaton@eik.bme.hu> Signed-off-by: Alexander Graf <agraf@suse.de>	2014-06-16 13:24:28 +02:00
Alexey Kardashevskiy	b26696b519	spapr_pci: Fix number of returned vectors in ibm, change-msi Current guest kernels try allocating as many vectors as the quota is. For example, in the case of virtio-net (which has just 3 vectors) the guest requests 4 vectors (that is the quota in the test) and the existing ibm,change-msi handler returns 4. But before it returns, it calls msix_set_message() in a loop and corrupts memory behind the end of msix_table. This limits the number of vectors returned by ibm,change-msi to the maximum supported by the actual device. Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru> Cc: qemu-stable@nongnu.org [agraf: squash in bugfix from aik] Signed-off-by: Alexander Graf <agraf@suse.de>	2014-06-16 13:24:27 +02:00
Greg Kurz	fabe9ee113	spapr-pci: remove io ports workaround In the past, IO space could not be mapped into the memory address space so we introduced a workaround for that. Nowadays it does not look necessary so we can remove the workaround and make sPAPR PCI configuration simplier. Signed-off-by: Greg Kurz <gkurz@linux.vnet.ibm.com> Acked-by: Alexey Kardashevskiy <aik@ozlabs.ru> Signed-off-by: Alexander Graf <agraf@suse.de>	2014-06-16 13:24:27 +02:00
Juan Quintela	3aff6c2fea	savevm: Remove all the unneeded version_minimum_id_old (ppc) After previous Peter patch, they are redundant. This way we don't assign them except when needed. Once there, there were lots of case where the ".fields" indentation was wrong: .fields = (VMStateField []) { and .fields = (VMStateField []) { Change all the combinations to: .fields = (VMStateField[]){ The biggest problem (appart from aesthetics) was that checkpatch complained when we copy&pasted the code from one place to another. Signed-off-by: Juan Quintela <quintela@redhat.com> Acked-by: Alexey Kardashevskiy <aik@ozlabs.ru>	2014-06-16 04:55:26 +02:00
Marcel Apfelbaum	3ef9622182	machine: Conversion of QEMUMachineInitArgs to MachineState Total removal of QEMUMachineInitArgs struct. QEMUMachineInitArgs's fields are copied into MachineState. Removed duplicated fields from MachineState. All the other changes are only mechanical refactoring, no semantic changes. Signed-off-by: Marcel Apfelbaum <marcel.a@redhat.com> Acked-by: Cornelia Huck <cornelia.huck@de.ibm.com> (s390) Reviewed-by: Michael S. Tsirkin <mst@redhat.com> (PC) [AF: Renamed ms -> machine, use MACHINE_GET_CLASS()] Signed-off-by: Andreas Färber <afaerber@suse.de>	2014-05-28 17:35:01 +02:00
Stefan Weil	6a0a70b0f5	hw: Add missing 'static' attributes This fixes warnings from the static code analysis (smatch). Signed-off-by: Stefan Weil <sw@weilnetz.de> Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>	2014-05-07 21:00:43 +04:00
Kirill Batuzov	848696bf35	PortioList: Store PortioList in device state PortioList is an abstraction used for construction of MemoryRegionPortioList from MemoryRegionPortio. It can be used later to unmap created memory regions. It also requires proper cleanup because some of the memory inside is allocated dynamically. By moving PortioList ot device state we make it possible to cleanup later and avoid leaking memory. This change spans several target platforms. The following testcases cover all changed lines: qemu-system-ppc -M prep qemu-system-i386 -vga qxl qemu-system-i386 -M isapc -soundhw adlib -device ib700,id=watchdog0,bus=isa.0 Signed-off-by: Kirill Batuzov <batuzovk@ispras.ru> Reviewed-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Andreas Färber <afaerber@suse.de>	2014-05-05 20:58:33 +02:00
Marcel Apfelbaum	958db90cd5	machine: Remove QEMUMachine indirection from MachineClass No need to go through qemu_machine field. Use MachineClass fields directly. Signed-off-by: Marcel Apfelbaum <marcel.a@redhat.com> Signed-off-by: Andreas Färber <afaerber@suse.de>	2014-05-05 19:08:49 +02:00
Marcel Apfelbaum	00b4fbe274	machine: Copy QEMUMachine's fields to MachineClass In order to eliminate the QEMUMachine indirection, add its fields directly to MachineClass. Do not yet remove qemu_machine field because it is still in use by sPAPR. Signed-off-by: Marcel Apfelbaum <marcel.a@redhat.com> [AF: Copied fields for sPAPR, too] Signed-off-by: Andreas Färber <afaerber@suse.de>	2014-05-05 19:08:49 +02:00
Alexander Graf	6a2b3d89fa	ppce500_spin: Initialize struct properly The spinning struct is in guest endianness, so we need to initialize its variables in guest endianness too. This fixes booting e500 guests with SMP on x86 for me. Signed-off-by: Alexander Graf <agraf@suse.de>	2014-04-08 11:20:05 +02:00
Alexander Graf	e81a982aa5	PPC: Clean up DECR implementation There are 3 different variants of the decrementor for BookE and BookS. The BookE variant sets TSR[DIS] to 1 when the DEC value becomes 1 or 0. TSR[DIS] is then the indicator whether the decrementor interrupt line is asserted or not. The old BookS variant treats DEC as an edge interrupt that gets triggered when the DEC value's top bit turns 1 from 0. The new BookS variant maintains the assertion bit inside DEC itself. Whenever the DEC value becomes negative (top bit set) the DEC interrupt line is asserted. So far we implemented mostly the old BookS variant. Let's do them all properly. This fixes booting pseries ppc64 guest images in TCG mode for me. Signed-off-by: Alexander Graf <agraf@suse.de>	2014-04-08 11:20:04 +02:00
Alexander Graf	6a450df9b8	PPC: E500: Set PIR default reset value rather than SPR value We now reset SPRs to their reset values on CPU reset. So if we want to have an SPR persistently changed, we need to change its default reset value rather than the value itself manually. Do this for SPR_BOOKE_PIR, fixing e500v2 SMP boot. Reported-by: Frederic Konrad <fred.konrad@greensocs.com> Signed-off-by: Alexander Graf <agraf@suse.de> Tested-by: KONRAD Frederic <fred.konrad@greensocs.com>	2014-04-08 11:19:59 +02:00
Peter Maydell	a1f7f97b95	hw/ppc: Avoid shifting left into sign bit Add U suffix to various places where we were doing "1 << 31", which is undefined behaviour, and also to other constant definitions in the same groups, for consistency. Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>	2014-03-27 19:22:49 +04:00
Peter Maydell	3a87f8b685	PowerPC queue for 2.0 * sPAPR loop fix * SPR reset fix * Reduce allocation size of indirect opcode tables * Restrict number of CPU threads * sPAPR H_SET_MODE fixes * sPAPR firmware path fixes * Static and constness cleanups -----BEGIN PGP SIGNATURE----- Version: GnuPG v2.0.22 (GNU/Linux) iQIcBAABAgAGBQJTKkhmAAoJEPou0S0+fgE/mTYQAJWNNiq6ti1udxVOXJ1WAuh1 7DTdiigXY0P/iwigebAVJ78Qs4bLZEJQUv5Xz/uaCwRIyk6uEnLf3J+ZSqpKxN5P Bxj2S+ZS4eaPS1eKSKdVAadFOTqmexrD1g1LarpuvDReeKJVzpdVRAGgQcDcI2Hk 0YVjh2MY5GLWFJJNh6Ir/RI9HNp/TGjzGvvZlT0KS6GG9qJE/xRAmpZKVD1YDcjI nmAA0HgEOT1bGgtdNbK7yst8+nPr9WIbmkbTpvcbo5s+EjVR/iqqeAr4DsxUTjIz EXRpVgKrl+GZSwCo+APnXc+4qNaiuiR/WVdKYwkJBE+jbxUx9o/fuQ3nOz+eUWGz 4sv/MC52MA4IUHaNn6Cm8ghVZzseXB8RfLHd7MX978HDyqCsgPkKXoMyjIhvK9de /AB8h8vTdqlKFclIZFYH/fZE/TuAoaIHBJeQtpT1KKmC7lKWlo6x6MW/Z2jYnc6C nwPyGqM/wjRvaOx+dXewPlc/iz9bWrVNBDhSagdFT15gcRjIVh6Fta839dKn38iT +dA6UG1wCMPdK8+EAcf0xWnsnn/rfuUhz5UjizP6srnB2+oEJtdbW1aPDh/lFxTZ nVU8j/jgA16wTVP8v9dWHrHZjCxlclfH/s51M0eT/ClIiozPxSP2DZbhyzqtdTNs +1jfNaIZCEQnvvBQPBsz =clOM -----END PGP SIGNATURE----- Merge remote-tracking branch 'remotes/afaerber/tags/ppc-for-2.0' into staging PowerPC queue for 2.0 * sPAPR loop fix * SPR reset fix * Reduce allocation size of indirect opcode tables * Restrict number of CPU threads * sPAPR H_SET_MODE fixes * sPAPR firmware path fixes * Static and constness cleanups # gpg: Signature made Thu 20 Mar 2014 01:46:14 GMT using RSA key ID 3E7E013F # gpg: Good signature from "Andreas Färber <afaerber@suse.de>" # gpg: aka "Andreas Färber <afaerber@suse.com>" * remotes/afaerber/tags/ppc-for-2.0: spapr: Implement interface to fix device pathname spapr: QOM'ify pseries machine spapr_vio: Fix firmware names spapr_llan: Add to boot device list qdev: Introduce FWPathProvider interface vl.c: Extend get_boot_devices_list() to ignore suffixes spapr_hcall: Fix little-endian resource handling in H_SET_MODE target-ppc: Introduce powerisa-207-server flag target-ppc: Force CPU threads count to be a power of 2 target-ppc: Fix overallocation of opcode tables target-ppc: Reset SPRs on CPU reset spapr_hcall: Fix h_enter to loop correctly target-ppc: Add missing 'static' and 'const' attributes Signed-off-by: Peter Maydell <peter.maydell@linaro.org>	2014-03-20 11:45:38 +00:00
Alexey Kardashevskiy	71461b0fef	spapr: Implement interface to fix device pathname This extends the pseries machine type with the interface to fix firmware pathnames for devices which have @bootindex property. This fixes SCSI disks' device node names (which are wildcard nodes in the device-tree), for spapr-vscsi, virtio-scsi and usb-storage. This fixes PHB name from "pci" to "pci@XXXX" where XXXX is a BUID as there is no bus on top of sPAPRPHBState where PHB firmware name could be fixed using the BusClass::get_fw_dev_path() mechanism. This stores the boot list in the /chosen/qemu,boot-list property of the device tree. "\n" are replaced by spaces to support OF1275. Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru> Signed-off-by: Andreas Färber <afaerber@suse.de>	2014-03-20 02:40:26 +01:00
Alexey Kardashevskiy	29ee324740	spapr: QOM'ify pseries machine Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru> Signed-off-by: Andreas Färber <afaerber@suse.de>	2014-03-20 02:40:13 +01:00
Alexey Kardashevskiy	5a06393f1d	spapr_vio: Fix firmware names This changes VIO bridge fw name from spapr-vio-bridge to vdevice and vscsi/veth node names from QEMU object names to VIO specific device tree names. Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru> Signed-off-by: Andreas Färber <afaerber@suse.de>	2014-03-20 02:40:13 +01:00
Alexey Kardashevskiy	a46622fd07	spapr_hcall: Fix little-endian resource handling in H_SET_MODE This changes resource code definitions to ones used in the host kernel. This fixes H_SET_MODE_RESOURCE_LE (switch between big endian and little endian) to sync registers from KVM before changing LPCR value. This adds a set_spr() helper to update an SPR in a CPU's context to avoid possible races and makes use of it to change LPCR. Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru> Reviewed-by: Greg Kurz <gkurz@linux.vnet.ibm.com> Signed-off-by: Andreas Färber <afaerber@suse.de>	2014-03-20 02:39:33 +01:00
Aneesh Kumar K.V	7aaf4957ef	spapr_hcall: Fix h_enter to loop correctly We wanted to loop till index is 8. On 8 we return with H_PTEG_FULL. If we are successful in loading hpte with any other index, we continue with that index value. Reported-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Reviewed-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Andreas Färber <afaerber@suse.de>	2014-03-20 02:39:23 +01:00
Hervé Poussineau	1fe9e2626f	raven: Set a correct PCI memory region PCI memory region is 0x3f000000 bytes starting at 0xc0000000. However, keep compatibility with Open Hack'Ware expectations by adding a hack for Open Hack'Ware display. Signed-off-by: Hervé Poussineau <hpoussin@reactos.org> Signed-off-by: Andreas Färber <andreas.faerber@web.de>	2014-03-20 00:33:17 +01:00

1 2 3 4 5 ...

415 Commits