mirrors/qemu - qemu - SynapseOS git

Author	SHA1	Message	Date
David Gibson	8149e2992f	pseries: Enforce homogeneous threads-per-core For reasons that may be useful in future, CPU core objects, as used on the pseries machine type have their own nr-threads property, potentially allowing cores with different numbers of threads in the same system. If the user/management uses the values specified in query-hotpluggable-cpus as they're expected to do, this will never matter in pratice. But that's not actually enforced - it's possible to manually specify a core with a different number of threads from that in -smp. That will confuse the platform - most immediately, this can be used to create a CPU thread with index above max_cpus which leads to an assertion failure in spapr_cpu_core_realize(). For now, enforce that all cores must have the same, standard, number of threads. Signed-off-by: David Gibson <david@gibson.dropbear.id.au> Reviewed-by: Bharata B Rao <bharata@linux.vnet.ibm.com>	2017-04-03 13:46:18 +10:00
Laurent Vivier	fe6824d126	spapr: fix memory hot-unplugging If, once the kernel has booted, we try to remove a memory hotplugged while the kernel was not started, QEMU crashes on an assert: qemu-system-ppc64: hw/virtio/vhost.c:651: vhost_commit: Assertion `r >= 0' failed. ... #4 in vhost_commit #5 in memory_region_transaction_commit #6 in pc_dimm_memory_unplug #7 in spapr_memory_unplug #8 spapr_machine_device_unplug #9 in hotplug_handler_unplug #10 in spapr_lmb_release #11 in detach #12 in set_allocation_state #13 in rtas_set_indicator ... If we take a closer look to the guest kernel log, we can see when we try to unplug the memory: pseries-hotplug-mem: Attempting to hot-add 4 LMB(s) What happens: 1- The kernel has ignored the memory hotplug event because it was not started when it was generated. 2- When we hot-unplug the memory, QEMU starts to remove the memory, generates an hot-unplug event, and signals the kernel of the incoming new event 3- as the kernel is started, on the QEMU signal, it reads the event list, decodes the hotplug event and tries to finish the hotplugging. 4- QEMU receive the the hotplug notification while it is trying to hot-unplug the memory. This moves the memory DRC to an invalid state This patch prevents this by not allowing to set the allocation state to USABLE while the DRC is awaiting release. RHBZ: https://bugzilla.redhat.com/show_bug.cgi?id=1432382 Signed-off-by: Laurent Vivier <lvivier@redhat.com> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>	2017-03-29 11:35:16 +11:00
Marc-André Lureau	24ec2863b1	spapr: fix buffer-overflow Running postcopy-test with ASAN produces the following error: QTEST_QEMU_BINARY=ppc64-softmmu/qemu-system-ppc64 tests/postcopy-test ... ================================================================= ==23641==ERROR: AddressSanitizer: heap-buffer-overflow on address 0x7f1556600000 at pc 0x55b8e9d28208 bp 0x7f1555f4d3c0 sp 0x7f1555f4d3b0 READ of size 8 at 0x7f1556600000 thread T6 #0 0x55b8e9d28207 in htab_save_first_pass /home/elmarco/src/qq/hw/ppc/spapr.c:1528 #1 0x55b8e9d2939c in htab_save_iterate /home/elmarco/src/qq/hw/ppc/spapr.c:1665 #2 0x55b8e9beae3a in qemu_savevm_state_iterate /home/elmarco/src/qq/migration/savevm.c:1044 #3 0x55b8ea677733 in migration_thread /home/elmarco/src/qq/migration/migration.c:1976 #4 0x7f15845f46c9 in start_thread (/lib64/libpthread.so.0+0x76c9) #5 0x7f157d9d0f7e in clone (/lib64/libc.so.6+0x107f7e) 0x7f1556600000 is located 0 bytes to the right of 2097152-byte region [0x7f1556400000,0x7f1556600000) allocated by thread T0 here: #0 0x7f159bb76980 in posix_memalign (/lib64/libasan.so.3+0xc7980) #1 0x55b8eab185b2 in qemu_try_memalign /home/elmarco/src/qq/util/oslib-posix.c:106 #2 0x55b8eab186c8 in qemu_memalign /home/elmarco/src/qq/util/oslib-posix.c:122 #3 0x55b8e9d268a8 in spapr_reallocate_hpt /home/elmarco/src/qq/hw/ppc/spapr.c:1214 #4 0x55b8e9d26e04 in ppc_spapr_reset /home/elmarco/src/qq/hw/ppc/spapr.c:1261 #5 0x55b8ea12e913 in qemu_system_reset /home/elmarco/src/qq/vl.c:1697 #6 0x55b8ea13fa40 in main /home/elmarco/src/qq/vl.c:4679 #7 0x7f157d8e9400 in __libc_start_main (/lib64/libc.so.6+0x20400) Thread T6 created by T0 here: #0 0x7f159bae0488 in __interceptor_pthread_create (/lib64/libasan.so.3+0x31488) #1 0x55b8eab1d9cb in qemu_thread_create /home/elmarco/src/qq/util/qemu-thread-posix.c:465 #2 0x55b8ea67874c in migrate_fd_connect /home/elmarco/src/qq/migration/migration.c:2096 #3 0x55b8ea66cbb0 in migration_channel_connect /home/elmarco/src/qq/migration/migration.c:500 #4 0x55b8ea678f38 in socket_outgoing_migration /home/elmarco/src/qq/migration/socket.c:87 #5 0x55b8eaa5a03a in qio_task_complete /home/elmarco/src/qq/io/task.c:142 #6 0x55b8eaa599cc in gio_task_thread_result /home/elmarco/src/qq/io/task.c:88 #7 0x7f15823e38e6 (/lib64/libglib-2.0.so.0+0x468e6) SUMMARY: AddressSanitizer: heap-buffer-overflow /home/elmarco/src/qq/hw/ppc/spapr.c:1528 in htab_save_first_pass index seems to be wrongly incremented, unless I miss something that would be worth a comment. Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>	2017-03-29 11:35:02 +11:00
Laurent Vivier	55641213fc	numa,spapr: align default numa node memory size to 256MB Since commit `224245b` ("spapr: Add LMB DR connectors"), NUMA node memory size must be aligned to 256MB (SPAPR_MEMORY_BLOCK_SIZE). But when "-numa" option is provided without "mem" parameter, the memory is equally divided between nodes, but 8MB aligned. This can be not valid for pseries. In that case we can have: $ ./ppc64-softmmu/qemu-system-ppc64 -m 4G -numa node -numa node -numa node qemu-system-ppc64: Node 0 memory size 0x55000000 is not aligned to 256 MiB With this patch, we have: (qemu) info numa 3 nodes node 0 cpus: 0 node 0 size: 1280 MB node 1 cpus: node 1 size: 1280 MB node 2 cpus: node 2 size: 1536 MB Signed-off-by: Laurent Vivier <lvivier@redhat.com> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>	2017-03-22 11:32:42 +11:00
Paolo Bonzini	d2528bdc19	qemu-timer: do not include sysemu/cpus.h from util/qemu-timer.h This dependency is the wrong way, and we will need util/qemu-timer.h from sysemu/cpus.h in the next patch. Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Reviewed-by: Edgar E. Iglesias <edgar.iglesias@xilinx.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2017-03-14 13:28:18 +01:00
David Gibson	82516263ce	pseries: Don't expose PCIe extended config space on older machine types `bb9986452` "spapr_pci: Advertise access to PCIe extended config space" allowed guests to access the extended config space of PCI Express devices via the PAPR interfaces, even though the paravirtualized bus mostly acts like plain PCI. However, that patch enabled access unconditionally, including for existing machine types, which is an unwise change in behaviour. This patch limits the change to pseries-2.9 (and later) machine types. Suggested-by: Andrea Bolognani <abologna@redhat.com> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>	2017-03-14 11:54:17 +11:00
Peter Maydell	56b51708e9	ppc patch queue for 2017-03-06 Looks like my previous batch wasn't quite the last before hard freeze. This has a handful of bugfixes to go in. They're all genuine bugfixes, though not regressions in some cases. -----BEGIN PGP SIGNATURE----- Version: GnuPG v2 iQIcBAABCAAGBQJYvOCUAAoJEGw4ysog2bOSQWgQAKzPeIqz8I/1eXL+zmZCUaiU J2gyjzfaKkQ/AVGPtT45ZjJsihxSFbZT6koxXtEaxwq5DD87yXQOqA/d+BH7jr5d 75FGjVzKOA0IKQymySztwoC2j/ftWmmSx0N6YUmL0QcXCISS1YHRvdQkdXf6j4I/ XtK1FA34wmCsTK1AgZ9WDxjABdkHP+7FDRBpVmr01Nv1TeK2Xms2MqJ5Wku/lOX/ 6bg1KbC8pVHy5YZhIpRFzgGxaMr2UcJ0Q3YR9fD/4UW/k518sJk+i2xlagVsFxyG gqfPolv0wjwuGpYt42UyFG4IouCbKN+MChU5MBIaqU10VouOw+0/W+p+1ZOHgdB8 GoaBGyfuJ6/i4EQL0/+FL4hPOI5vHLliWxPfMJxDL5ujP0cFaPm2XbK5Yqxksu3m uYp3yYIbiSaF8QUxbBjAAoKPdVpP5dsgHjAlxecwCUGlIo0Ur3uphnU5lPoNlvS4 5ZcDDlMGjPb0oIHfdPt2ai8g+32uAsD7X7pi+qI0x+srSnjisRpOT2wKv0otMbGx U4j01/Na2DjFjhGW+vNm9UYsE/QgKr6pU9z3jUXOIplX1HBXirtfv5C/OypCN7Zj LgqsmiMWMJFjSLk8N8cxeM1w839B3wEM+2+46su7/qpW9sd0jKvHk0cJDyZPzn29 zQ52CbQQiewXM8y+mffe =/RZL -----END PGP SIGNATURE----- Merge remote-tracking branch 'remotes/dgibson/tags/ppc-for-2.9-20170306' into staging ppc patch queue for 2017-03-06 Looks like my previous batch wasn't quite the last before hard freeze. This has a handful of bugfixes to go in. They're all genuine bugfixes, though not regressions in some cases. # gpg: Signature made Mon 06 Mar 2017 04:07:48 GMT # gpg: using RSA key 0x6C38CACA20D9B392 # gpg: Good signature from "David Gibson <david@gibson.dropbear.id.au>" # gpg: aka "David Gibson (Red Hat) <dgibson@redhat.com>" # gpg: aka "David Gibson (ozlabs.org) <dgibson@ozlabs.org>" # gpg: aka "David Gibson (kernel.org) <dwg@kernel.org>" # Primary key fingerprint: 75F4 6586 AE61 A66C C44E 87DC 6C38 CACA 20D9 B392 * remotes/dgibson/tags/ppc-for-2.9-20170306: target/ppc: use helper for excp handling target/ppc: fmadd: add macro for updating flags target/ppc: fmadd check for excp independently spapr: ensure that all threads within core are on the same NUMA node ppc/xics: register reset handlers for the ICP and ICS objects Signed-off-by: Peter Maydell <peter.maydell@linaro.org>	2017-03-06 13:06:30 +00:00
Igor Mammedov	17b7c39e27	spapr: ensure that all threads within core are on the same NUMA node Threads within a core shouldn't be on different NUMA nodes, so if user has misconfgured command line, fail QEMU at start up to force user fix it. For now use the first thread on the core as source of core's node-id. Later when cpu-numa refactoring lands it will be switched to core's node-id from possible_cpus[]. This prevents the same problems as commit `20bb648d` "spapr: Fix default NUMA node allocation for threads", but for the case of manually configured NUMA node mappings, instead of just the default case. Signed-off-by: Igor Mammedov <imammedo@redhat.com> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>	2017-03-06 10:32:53 +11:00
Cédric Le Goater	7ea6e06717	ppc/xics: register reset handlers for the ICP and ICS objects The recent changes on the XICS layer removed the XICSState object to let the sPAPR machine handle the ICP and ICS directly. The reset of these objects was previously handled by XICSState, which was a SysBus device, and to keep the same behavior, the ICP and ICS were assigned to SysbBus. But that broke the 'info qtree' command in the monitor. 'qtree' performs a loop on the children of a bus to print their properties and SysBus devices are expected to be found under SysBus, which is not the case anymore. The fix for this problem is to register reset handlers for the ICP and ICS objects and stop using SysBus for such devices. Signed-off-by: Cédric Le Goater <clg@kaod.org> Reviewed-by: Thomas Huth <thuth@redhat.com> Tested-by: Thomas Huth <thuth@redhat.com> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>	2017-03-06 10:07:38 +11:00
Markus Armbruster	a4a1c70dc7	qapi: Make input visitors detect unvisited list tails Fix the design flaw demonstrated in the previous commit: new method check_list() lets input visitors report that unvisited input remains for a list, exactly like check_struct() lets them report that unvisited input remains for a struct or union. Implement the method for the qobject input visitor (straightforward), and the string input visitor (less so, due to the magic list syntax there). The opts visitor's list magic is even more impenetrable, and all I can do there today is a stub with a FIXME comment. No worse than before. Signed-off-by: Markus Armbruster <armbru@redhat.com> Message-Id: <1488544368-30622-26-git-send-email-armbru@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com>	2017-03-05 09:14:20 +01:00
Sam Bobroff	ec975e839c	spapr: Small cleanup of PPC MMU enums The PPC MMU types are sometimes treated as if they were a bit field and sometime as if they were an enum which causes maintenance problems: flipping bits in the MMU type (which is done on both the 1TB segment and 64K segment bits) currently produces new MMU type values that are not handled in every "switch" on it, sometimes causing an abort(). This patch provides some macros that can be used to filter out the "bit field-like" bits so that the remainder of the value can be switched on, like an enum. This allows removal of all of the "degraded" types from the list and should ease maintenance. Signed-off-by: Sam Bobroff <sam.bobroff@au1.ibm.com> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>	2017-03-03 11:30:59 +11:00
David Gibson	bb99864528	spapr_pci: Advertise access to PCIe extended config space The (paravirtual) PCI host bridge on the 'pseries' machine in most regards acts like a regular PCI bus, rather than a PCIe bus. Despite this, though, it does allow access to the PCIe extended config space. We already implemented the RTAS methods to allow this access.. but forgot to put the markers into the device tree so that guest's know it is there. This adds them in. With this, a pseries guest is able to view extended config space on (for example an e1000e device. This should be enough to allow guests to use at least some PCIe devices. Signed-off-by: David Gibson <david@gibson.dropbear.id.au>	2017-03-03 11:30:59 +11:00
Suraj Jitindar Singh	24d8e5655f	hw/ppc/spapr: Add POWER9 to pseries cpu models Add POWER9 cpu to list of spapr core models which allows it to be specified as the cpu model for a pseries guest (e.g. -machine pseries -cpu POWER9). This now allows a POWER9 cpu to boot to userspace in tcg emulation for a pseries machine with a legacy kernel. Signed-off-by: Suraj Jitindar Singh <sjitindarsingh@gmail.com> Reviewed-by: David Gibson <david@gibson.dropbear.id.au> Acked-by: Balbir Singh <bsingharora@gmail.com> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>	2017-03-03 11:30:59 +11:00
Suraj Jitindar Singh	4975c098c9	target/ppc/POWER9: Add POWER9 pa-features definition Add a pa-features definition which includes all of the new fields which have been added, note we don't claim support for any of these new features at this stage. Signed-off-by: Suraj Jitindar Singh <sjitindarsingh@gmail.com> Reviewed-by: David Gibson <david@gibson.dropbear.id.au> Acked-by: Balbir Singh <bsingharora@gmail.com> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>	2017-03-03 11:30:59 +11:00
Suraj Jitindar Singh	9861bb3efd	target/ppc: Add patb_entry to sPAPRMachineState ISA v3.00 adds the idea of a partition table which is used to store the address translation details for all partitions on the system. The partition table consists of double word entries indexed by partition id where the second double word contains the location of the process table in guest memory. The process table is registered by the guest via a h-call. We need somewhere to store the address of the process table so we add an entry to the sPAPRMachineState struct called patb_entry to represent the second doubleword of a single partition table entry corresponding to the current guest. We need to store this value so we know if the guest is using radix or hash translation and the location of the corresponding process table in guest memory. Since we only have a single guest per qemu instance, we only need one entry. Since the partition table is technically a hypervisor resource we require that access to it is abstracted by the virtual hypervisor through the get_patbe() call. Currently the value of the entry is never set (and thus defaults to 0 indicating hash), but it will be required to both implement POWER9 kvm support and tcg radix support. We also add this field to be migrated as part of the sPAPRMachineState as we will need it on the receiving side as the guest will never tell us this information again and we need it to perform translation. Signed-off-by: Suraj Jitindar Singh <sjitindarsingh@gmail.com> Reviewed-by: David Gibson <david@gibson.dropbear.id.au> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>	2017-03-03 11:30:59 +11:00
Cédric Le Goater	6449da4545	ppc/xics: move InterruptStatsProvider to the sPAPR machine It provides a better monitor output of the ICP and ICS objects, else the objects are printed out of order. Signed-off-by: Cédric Le Goater <clg@kaod.org> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>	2017-03-01 11:23:40 +11:00
Cédric Le Goater	a7ff1212e9	ppc/xics: move ics-simple post_load under the machine The ICS object uses a post_load() handler which is implicitly relying on the fact that the internal state of the ICS and ICP objects has been restored but this is not guaranteed. So, let's move the code under the post_load() handler of the machine where we know the objects have been fully restored. The icp_resend() handler of the XICSFabric QOM interface is also removed as it is now obsolete. Signed-off-by: Cédric Le Goater <clg@kaod.org> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>	2017-03-01 11:23:40 +11:00
Cédric Le Goater	e6f7e110ee	ppc/xics: remove the XICSState classes The XICSState classes are not used anymore. They have now been fully deprecated by the XICSFabric QOM interface. Do the cleanups. Signed-off-by: Cédric Le Goater <clg@kaod.org> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>	2017-03-01 11:23:40 +11:00
Cédric Le Goater	2192a9303d	ppc/xics: export the XICS init routines There is nothing left related to the XICS object in the realize functions of the KVMXICSState and XICSState class. So adapt the interfaces to call these routines directly from the sPAPR machine init sequence. Signed-off-by: Cédric Le Goater <clg@kaod.org> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>	2017-03-01 11:23:40 +11:00
Cédric Le Goater	852ad27e14	ppc/xics: move the ICP array under the sPAPR machine This is the last step to remove the XICSState abstraction and have the machine hold all the objects related to interrupts : ICSs and ICPs. Signed-off-by: Cédric Le Goater <clg@kaod.org> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>	2017-03-01 11:23:40 +11:00
Cédric Le Goater	20147f2fce	ppc/xics: register the reset handler of ICP objects The reset of the ICP objects is currently handled by XICS but this can be done for each individual ICP. Signed-off-by: Cédric Le Goater <clg@kaod.org> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>	2017-03-01 11:23:40 +11:00
Cédric Le Goater	b0ec31290c	ppc/xics: simplify spapr_dt_xics() interface spapr_dt_xics() only needs the number of servers to build the device tree nodes. Let's change the routine interface to reflect that. Signed-off-by: Cédric Le Goater <clg@kaod.org> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>	2017-03-01 11:23:39 +11:00
Cédric Le Goater	b4f27d71e3	ppc/xics: use the QOM interface to grab an ICP Also introduce a xics_icp_get() helper to simplify the changes. Signed-off-by: Cédric Le Goater <clg@kaod.org> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>	2017-03-01 11:23:39 +11:00
Cédric Le Goater	b2fc59aaf9	ppc/xics: extend the QOM interface to handle ICPs Let's add two new handlers for ICPs. One is to get an ICP object from a server number and a second is to resend the irqs when needed. The icp_resend() handler is a temporary workaround needed by the ics-simple post_load() handler. It will be removed when the post_load portion can be done at the machine level. Signed-off-by: Cédric Le Goater <clg@kaod.org> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>	2017-03-01 11:23:39 +11:00
Cédric Le Goater	d114a66225	ppc/xics: remove the XICS list of ICS This is not used anymore. Signed-off-by: Cédric Le Goater <clg@kaod.org> Reviewed-by: David Gibson <david@gibson.dropbear.id.au> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>	2017-03-01 11:23:39 +11:00
Cédric Le Goater	c79b2fdd7b	ppc/xics: register the reset handler of ICS objects The reset of the ICS objects is currently handled by XICS but this can be done for each individual ICS. This also reduces the use of the XICS list of ICS. Signed-off-by: Cédric Le Goater <clg@kaod.org> Reviewed-by: David Gibson <david@gibson.dropbear.id.au> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>	2017-03-01 11:23:39 +11:00
Cédric Le Goater	2cd908d0ad	ppc/xics: use the QOM interface to resend irqs Also change the ICPState 'xics' backlink to be a XICSFabric, this removes the need of using qdev_get_machine() to get the QOM interface in some of the routines. Signed-off-by: Cédric Le Goater <clg@kaod.org> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>	2017-03-01 11:23:39 +11:00
Cédric Le Goater	f7759e4331	ppc/xics: use the QOM interface to get irqs Signed-off-by: Cédric Le Goater <clg@kaod.org> Reviewed-by: David Gibson <david@gibson.dropbear.id.au> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>	2017-03-01 11:23:39 +11:00
Cédric Le Goater	7844e12b28	ppc/xics: use the QOM interface under the sPAPR machine Add 'ics_get' and 'ics_resend' handlers to the sPAPR machine. These are relatively simple for a single ICS. Signed-off-by: Cédric Le Goater <clg@kaod.org> Reviewed-by: David Gibson <david@gibson.dropbear.id.au> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>	2017-03-01 11:23:39 +11:00
Cédric Le Goater	681bfaded6	ppc/xics: store the ICS object under the sPAPR machine A list of ICS objects was introduced under the XICS object for the PowerNV machine but, for the sPAPR machine, it brings extra complexity as there is only a single ICS. To simplify the code, let's add the ICS pointer under the sPAPR machine and try to reduce the use of this list where possible. Also, change the xics_spapr_*() routines to use an ICS object instead of an XICSState and change their name to reflect that these are specific to the sPAPR ICS object. Signed-off-by: Cédric Le Goater <clg@kaod.org> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>	2017-03-01 11:23:39 +11:00
Cédric Le Goater	817bb6a446	ppc/xics: remove set_nr_servers() handler from XICSStateClass Today, the ICP (Interrupt Controller Presenter) objects are created by the 'nr_servers' property handler of the XICS object and a class handler. They are realized in the XICS object realize routine. Let's simplify the process by creating the ICP objects along with the XICS object at the machine level. Signed-off-by: Cédric Le Goater <clg@kaod.org> Reviewed-by: David Gibson <david@gibson.dropbear.id.au> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>	2017-03-01 11:23:39 +11:00
Cédric Le Goater	4e4169f7a2	ppc/xics: remove set_nr_irqs() handler from XICSStateClass Today, the ICS (Interrupt Controller Source) object is created and realized by the init and realize routines of the XICS object, but some of the parameters are only known at the machine level. These parameters are passed from the sPAPR machine to the ICS object in a rather convoluted way using property handlers and a class handler of the XICS object. The number of irqs required to allocate the IRQ state objects in the ICS realize routine is one of them. Let's simplify the process by creating the ICS object along with the XICS object at the machine level and link the ICS into the XICS list of ICSs at this level also. In the sPAPR machine, there is only a single ICS but that will change with the PowerNV machine. Also, QOMify the creation of the objects and get rid of the superfluous code. Signed-off-by: Cédric Le Goater <clg@kaod.org> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>	2017-03-01 11:23:39 +11:00
David Gibson	738d5db824	xics: XICS should not be a SysBusDevice Currently xics - the component of the IBM POWER interrupt controller representing the overall interrupt fabric / architecture is represented as a descendent of SysBusDevice. However, this is not really correct - the xics presents nothing in MMIO space so it should be an "unattached" device in the current QOM model. Since this device will always be created by the machine type, not created specifically from the command line, and because it has no migrated state it should be safe to move it around the device composition tree. Therefore this patch changes it to a descendent of TYPE_DEVICE, and makes it an unattached device. So that its reset handler still gets called correctly, we add a qdev_set_parent_bus() to attach it to sysbus. It's not really clear that's correct (instead of using register_reset()) but it appears to a common technique. Signed-off-by: David Gibson <david@gibson.dropbear.id.au> [clg corrected problems with reset] Signed-off-by: Cédric Le Goater <clg@kaod.org> [dwg folded together and updated commit message] Signed-off-by: David Gibson <david@gibson.dropbear.id.au>	2017-03-01 11:23:39 +11:00
Greg Kurz	a8eeafda19	spapr/pci: populate PCI DT in reverse order Since commit `1d2d974244` "spapr_pci: enumerate and add PCI device tree", QEMU populates the PCI device tree in the opposite order compared to SLOF. Before `1d2d974244`: Populating /pci@800000020000000 00 0000 (D) : 1af4 1000 virtio [ net ] 00 0800 (D) : 1af4 1001 virtio [ block ] 00 1000 (D) : 1af4 1009 virtio [ network ] Populating /pci@800000020000000/unknown-legacy-device@2 7e5294b8 : /pci@800000020000000 7e52b998 : \|-- ethernet@0 7e52c0c8 : \|-- scsi@1 7e52c7e8 : +-- unknown-legacy-device@2 ok Since `1d2d974244`: Populating /pci@800000020000000 00 1000 (D) : 1af4 1009 virtio [ network ] Populating /pci@800000020000000/unknown-legacy-device@2 00 0800 (D) : 1af4 1001 virtio [ block ] 00 0000 (D) : 1af4 1000 virtio [ net ] 7e5e8118 : /pci@800000020000000 7e5ea6a0 : \|-- unknown-legacy-device@2 7e5eadb8 : \|-- scsi@1 7e5eb4d8 : +-- ethernet@0 ok This behaviour change is not actually a bug since no assumptions should be made on DT ordering. But it has no real justification either, other than being the consequence of the way fdt_add_subnode() inserts new elements to the front of the FDT rather than adding them to the tail. This patch reverts to the historical SLOF ordering by walking PCI devices in reverse order. This reconciles pseries with x86 machine types behavior. It is expected to make things easier when porting existing applications to power. Signed-off-by: Greg Kurz <gkurz@linux.vnet.ibm.com> Tested-by: Thomas Huth <thuth@redhat.com> Reviewed-by: Nikunj A Dadhania <nikunj@linux.vnet.ibm.com> (slight update to the changelog) Signed-off-by: Greg Kurz <groug@kaod.org> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>	2017-03-01 11:23:39 +11:00
David Gibson	e57ca75ce3	target/ppc: Manage external HPT via virtual hypervisor The pseries machine type implements the behaviour of a PAPR compliant hypervisor, without actually executing such a hypervisor on the virtual CPU. To do this we need some hooks in the CPU code to make hypervisor facilities get redirected to the machine instead of emulated internally. For hypercalls this is managed through the cpu->vhyp field, which points to a QOM interface with a method implementing the hypercall. For the hashed page table (HPT) - also a hypervisor resource - we use an older hack. CPUPPCState has an 'external_htab' field which when non-NULL indicates that the HPT is stored in qemu memory, rather than within the guest's address space. For consistency - and to make some future extensions easier - this merges the external HPT mechanism into the vhyp mechanism. Methods are added to vhyp for the basic operations the core hash MMU code needs: map_hptes() and unmap_hptes() for reading the HPT, store_hpte() for updating it and hpt_mask() to retrieve its size. To match this, the pseries machine now sets these vhyp fields in its existing vhyp class, rather than reaching into the cpu object to set the external_htab field. Signed-off-by: David Gibson <david@gibson.dropbear.id.au> Reviewed-by: Suraj Jitindar Singh <sjitindarsingh@gmail.com>	2017-03-01 11:23:39 +11:00
David Gibson	36778660d7	target/ppc: Eliminate htab_base and htab_mask variables CPUPPCState includes fields htab_base and htab_mask which store the base address (GPA) and size (as a mask) of the guest's hashed page table (HPT). These are set when the SDR1 register is updated. Keeping these in sync with the SDR1 is actually a little bit fiddly, and probably not useful for performance, since keeping them expands the size of CPUPPCState. It also makes some upcoming changes harder to implement. This patch removes these fields, in favour of calculating them directly from the SDR1 contents when necessary. This does make a change to the behaviour of attempting to write a bad value (invalid HPT size) to the SDR1 with an mtspr instruction. Previously, the bad value would be stored in SDR1 and could be retrieved with a later mfspr, but the HPT size as used by the softmmu would be, clamped to the allowed values. Now, writing a bad value is treated as a no-op. An error message is printed in both new and old versions. I'm not sure which behaviour, if either, matches real hardware. I don't think it matters that much, since it's pretty clear that if an OS writes a bad value to SDR1, it's not going to boot. Signed-off-by: David Gibson <david@gibson.dropbear.id.au> Reviewed-by: Alexey Kardashevskiy <aik@ozlabs.ru>	2017-03-01 11:23:39 +11:00
David Gibson	7222b94a83	target/ppc: Cleanup HPTE accessors for 64-bit hash MMU Accesses to the hashed page table (HPT) are complicated by the fact that the HPT could be in one of three places: 1) Within guest memory - when we're emulating a full guest CPU at the hardware level (e.g. powernv, mac99, g3beige) 2) Within qemu, but outside guest memory - when we're emulating user and supervisor instructions within TCG, but instead of emulating the CPU's hypervisor mode, we just emulate a hypervisor's behaviour (pseries in TCG or KVM-PR) 3) Within the host kernel - a pseries machine using KVM-HV acceleration. Mostly accesses to the HPT are handled by KVM, but there are a few cases where qemu needs to access it via a special fd for the purpose. In order to batch accesses to the fd in case (3), we use a somewhat awkward ppc_hash64_start_access() / ppc_hash64_stop_access() pair, which for case (3) reads / releases several HPTEs from the kernel as a batch (usually a whole PTEG). For cases (1) & (2) it just returns an address value. The actual HPTE load helpers then need to interpret the returned token differently in the 3 cases. This patch keeps the same basic structure, but simplfiies the details. First start_access() / stop_access() are renamed to map_hptes() and unmap_hptes() to make their operation more obvious. Second, map_hptes() now always returns a qemu pointer, which can always be used in the same way by the load_hpte() helpers. In case (1) it comes from address_space_map() in case (2) directly from qemu's HPT buffer and in case (3) from a temporary buffer read from the KVM fd. While we're at it, make things a bit more consistent in terms of types and variable names: avoid variables named 'index' (it shadows index(3) which can lead to confusing results), use 'hwaddr ptex' for HPTE indices and uint64_t for each of the HPTE words, use ptex throughout the call stack instead of pte_offset in some places (we still need that at the bottom layer, but nowhere else). Signed-off-by: David Gibson <david@gibson.dropbear.id.au>	2017-03-01 11:23:39 +11:00
David Gibson	b7b0b1f13a	target/ppc: Merge cpu_ppc_set_vhyp() with cpu_ppc_set_papr() cpu_ppc_set_papr() sets up various aspects of CPU state for use with PAPR paravirtualized guests. However, it doesn't set the virtual hypervisor, so callers must also call cpu_ppc_set_vhyp() so that PAPR hypercalls are handled properly. This is a bit silly, so fold setting the virtual hypervisor into cpu_ppc_set_papr(). Signed-off-by: David Gibson <david@gibson.dropbear.id.au> Reviewed-by: Suraj Jitindar Singh <sjitindarsingh@gmail.com>	2017-03-01 11:23:39 +11:00
David Gibson	c6404adebf	pseries: Minor cleanups to HPT management hypercalls * Standardize on 'ptex' instead of 'pte_index' for HPTE index variables for consistency and brevity * Avoid variables named 'index'; shadowing index(3) from libc can lead to surprising bugs if the variable is removed, because compiler errors might not appear for remaining references * Clarify index calculations in h_enter() - we have two cases, H_EXACT where the exact HPTE slot is given, and !H_EXACT where we search for an empty slot within the hash bucket. Make the calculation more consistent between the cases. Signed-off-by: David Gibson <david@gibson.dropbear.id.au> Reviewed-by: Suraj Jitindar Singh <sjitindarsingh@gmail.com>	2017-03-01 11:23:39 +11:00
Greg Kurz	6244bb7e58	sysemu: support up to 1024 vCPUs Some systems can already provide more than 255 hardware threads. Bumping the QEMU limit to 1024 seems reasonable: - it has no visible overhead in top; - the limit itself has no effect on hot paths. Cc: Greg Kurz <gkurz@linux.vnet.ibm.com> Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>	2017-03-01 11:23:39 +11:00
Laurent Vivier	2530a1a5cf	spapr: generate DT node names When DT node names for PCI devices are generated by SLOF, they are generated according to the type of the device (for instance, ethernet for virtio-net-pci device). Node name for hotplugged devices is generated by QEMU. This patch adds the mechanic to QEMU to create the node name according to the device type too. The data structure has been roughly copied from OpenBIOS/OpenHackware, node names from SLOF. Example: Hotplugging some PCI cards with QEMU monitor: device_add virtio-tablet-pci device_add virtio-serial-pci device_add virtio-mouse-pci device_add virtio-scsi-pci device_add virtio-gpu-pci device_add ne2k_pci device_add nec-usb-xhci device_add intel-hda What we can see in linux device tree: for dir in /proc/device-tree/pci@800000020000000/@/; do echo $dir cat $dir/name echo done WITHOUT this patch: /proc/device-tree/pci@800000020000000/pci@0/ pci /proc/device-tree/pci@800000020000000/pci@1/ pci /proc/device-tree/pci@800000020000000/pci@2/ pci /proc/device-tree/pci@800000020000000/pci@3/ pci /proc/device-tree/pci@800000020000000/pci@4/ pci /proc/device-tree/pci@800000020000000/pci@5/ pci /proc/device-tree/pci@800000020000000/pci@6/ pci /proc/device-tree/pci@800000020000000/pci@7/ pci WITH this patch: /proc/device-tree/pci@800000020000000/communication-controller@1/ communication-controller /proc/device-tree/pci@800000020000000/display@4/ display /proc/device-tree/pci@800000020000000/ethernet@5/ ethernet /proc/device-tree/pci@800000020000000/input-controller@0/ input-controller /proc/device-tree/pci@800000020000000/mouse@2/ mouse /proc/device-tree/pci@800000020000000/multimedia-device@7/ multimedia-device /proc/device-tree/pci@800000020000000/scsi@3/ scsi /proc/device-tree/pci@800000020000000/usb-xhci@6/ usb-xhci Signed-off-by: Laurent Vivier <lvivier@redhat.com> Reviewed-by: Thomas Huth <thuth@redhat.com> Reviewed-by: David Gibson <david@gibson.dropbear.id.au> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>	2017-03-01 11:23:39 +11:00
Peter Maydell	28f997a82c	This is the MTTCG pull-request as posted yesterday. -----BEGIN PGP SIGNATURE----- Version: GnuPG v1 iQEcBAABAgAGBQJYsBZfAAoJEPvQ2wlanipElJ0H+QGoStSPeHrvKu7Q07v4F9zM Pvf05gRsaxvXl7UbwmXC4oKhvZf9rVJ6ITk0x/y0WvmK0mHCmNBWkC0nn5UFL5IH cdxetLz21Q+Ghpc36tZvqn2HYwRQFoEznge2LdtBDG0TyVA4jwquHU3HCG2D51zi BaImI6lYW1e4ejjZHw8cEInSxsj/HJZE4pPas2Tkci+uAnrJroErwBVRRcE/y/Tn aupl9TJFs2JdyJFNDibIm0kjB+i+jvCiLgYjbKZ/dR/+GZt73TtiBk/q9ZOFjdmT 7YFPI3F46QbGHoZahtzh0Xt7WMj94SlQgQ9OJ3zmNMfpXrze6Yc78xo/nbQ33U0= =wR0/ -----END PGP SIGNATURE----- Merge remote-tracking branch 'remotes/stsquad/tags/pull-mttcg-240217-1' into staging This is the MTTCG pull-request as posted yesterday. # gpg: Signature made Fri 24 Feb 2017 11:17:51 GMT # gpg: using RSA key 0xFBD0DB095A9E2A44 # gpg: Good signature from "Alex Bennée (Master Work Key) <alex.bennee@linaro.org>" # Primary key fingerprint: 6685 AE99 E751 67BC AFC8 DF35 FBD0 DB09 5A9E 2A44 * remotes/stsquad/tags/pull-mttcg-240217-1: (24 commits) tcg: enable MTTCG by default for ARM on x86 hosts hw/misc/imx6_src: defer clearing of SRC_SCR reset bits target-arm: ensure all cross vCPUs TLB flushes complete target-arm: don't generate WFE/YIELD calls for MTTCG target-arm/powerctl: defer cpu reset work to CPU context cputlb: introduce tlb_flush__all_cpus[_synced] cputlb: atomically update tlb fields used by tlb_reset_dirty cputlb: add tlb_flush_by_mmuidx async routines cputlb and arm/sparc targets: convert mmuidx flushes from varg to bitmap cputlb: introduce tlb_flush_ async work. cputlb: tweak qemu_ram_addr_from_host_nofail reporting cputlb: add assert_cpu_is_self checks tcg: handle EXCP_ATOMIC exception for system emulation tcg: enable thread-per-vCPU tcg: enable tb_lock() for SoftMMU tcg: remove global exit_request tcg: drop global lock during TCG code execution tcg: rename tcg_current_cpu to tcg_current_rr_cpu tcg: add kick timer for single-threaded vCPU emulation tcg: add options for enabling MTTCG ... Signed-off-by: Peter Maydell <peter.maydell@linaro.org>	2017-02-25 18:43:52 +00:00
Jan Kiszka	8d04fb55de	tcg: drop global lock during TCG code execution This finally allows TCG to benefit from the iothread introduction: Drop the global mutex while running pure TCG CPU code. Reacquire the lock when entering MMIO or PIO emulation, or when leaving the TCG loop. We have to revert a few optimization for the current TCG threading model, namely kicking the TCG thread in qemu_mutex_lock_iothread and not kicking it in qemu_cpu_kick. We also need to disable RAM block reordering until we have a more efficient locking mechanism at hand. Still, a Linux x86 UP guest and my Musicpal ARM model boot fine here. These numbers demonstrate where we gain something: 20338 jan 20 0 331m 75m 6904 R 99 0.9 0:50.95 qemu-system-arm 20337 jan 20 0 331m 75m 6904 S 20 0.9 0:26.50 qemu-system-arm The guest CPU was fully loaded, but the iothread could still run mostly independent on a second core. Without the patch we don't get beyond 32206 jan 20 0 330m 73m 7036 R 82 0.9 1:06.00 qemu-system-arm 32204 jan 20 0 330m 73m 7036 S 21 0.9 0:17.03 qemu-system-arm We don't benefit significantly, though, when the guest is not fully loading a host CPU. Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com> Message-Id: <1439220437-23957-10-git-send-email-fred.konrad@greensocs.com> [FK: Rebase, fix qemu_devices_reset deadlock, rm address_space_* mutex] Signed-off-by: KONRAD Frederic <fred.konrad@greensocs.com> [EGC: fixed iothread lock for cpu-exec IRQ handling] Signed-off-by: Emilio G. Cota <cota@braap.org> [AJB: -smp single-threaded fix, clean commit msg, BQL fixes] Signed-off-by: Alex Bennée <alex.bennee@linaro.org> Reviewed-by: Richard Henderson <rth@twiddle.net> Reviewed-by: Pranith Kumar <bobby.prani@gmail.com> [PM: target-arm changes] Acked-by: Peter Maydell <peter.maydell@linaro.org>	2017-02-24 10:32:45 +00:00
Peter Maydell	fb6971c110	hw/ppc/ppc405_uc.c: Avoid integer overflows When performing clock calculations, the ppc405_uc code has several places where it multiplies together two 32-bit variables and assigns the result to a 64-bit variable. This doesn't quite do what is intended because C will compute a 32-bit multiply result. Add casts to ensure we don't truncate the result. (Spotted by Coverity, CID 1005504, 1005505.) Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>	2017-02-22 14:28:53 +11:00
Thomas Huth	df58713396	hw/ppc/spapr: Check for valid page size when hot plugging memory On POWER, the valid page sizes that the guest can use are bound to the CPU and not to the memory region. QEMU already has some fancy logic to find out the right maximum memory size to tell it to the guest during boot (see getrampagesize() in the file target/ppc/kvm.c for more information). However, once we're booted and the guest is using huge pages already, it is currently still possible to hot-plug memory regions that does not support huge pages - which of course does not work on POWER, since the guest thinks that it is possible to use huge pages everywhere. The KVM_RUN ioctl will then abort with -EFAULT, QEMU spills out a not very helpful error message together with a register dump and the user is annoyed that the VM unexpectedly died. To avoid this situation, we should check the page size of hot-plugged DIMMs to see whether it is possible to use it in the current VM. If it does not fit, we can print out a better error message and refuse to add it, so that the VM does not die unexpectely and the user has a second chance to plug a DIMM with a matching memory backend instead. Buglink: https://bugzilla.redhat.com/show_bug.cgi?id=1419466 Signed-off-by: Thomas Huth <thuth@redhat.com> [dwg: Fix a build error on 32-bit builds with KVM] Signed-off-by: David Gibson <david@gibson.dropbear.id.au>	2017-02-22 14:28:53 +11:00
Igor Mammedov	c5514d0e4b	machine: replace query_hotpluggable_cpus() callback with has_hotpluggable_cpus flag Generic helper machine_query_hotpluggable_cpus() replaced target specific query_hotpluggable_cpus() callbacks so there is no need in it anymore. However inon NULL callback value is used to detect/report hotpluggable cpus support, therefore it can be removed completely. Replace it with MachineClass.has_hotpluggable_cpus boolean which is sufficient for the task. Suggested-by: David Gibson <david@gibson.dropbear.id.au> Signed-off-by: Igor Mammedov <imammedo@redhat.com> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>	2017-02-22 11:28:28 +11:00
Igor Mammedov	f2d672c248	machine: unify [pc_\|spapr_]query_hotpluggable_cpus() callbacks All callbacks FOO_query_hotpluggable_cpus() are practically the same except of setting vcpus_count to different values. Convert them to a generic machine_query_hotpluggable_cpus() callback by moving vcpus_count initialization to per machine specific callback possible_cpu_arch_ids(). Signed-off-by: Igor Mammedov <imammedo@redhat.com> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>	2017-02-22 11:28:28 +11:00
Igor Mammedov	535455fdee	spapr: reuse machine->possible_cpus instead of cores[] Replace SPAPR specific cores[] array with generic machine->possible_cpus and store core objects there. It makes cores bookkeeping similar to x86 cpus and will allow to unify similar code. It would allow to replace cpu_index based NUMA node mapping with iproperty based one (for -device created cores) since possible_cpus carries board defined topology/layout. Signed-off-by: Igor Mammedov <imammedo@redhat.com> Acked-by: David Gibson <david@gibson.dropbear.id.au> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>	2017-02-22 11:28:28 +11:00
Laurent Vivier	5b929608b9	spapr: replace debug printf with trace points Signed-off-by: Laurent Vivier <lvivier@redhat.com> Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>	2017-02-22 11:28:28 +11:00
Laurent Vivier	f4af7d4438	ppc4xx: replace debug printf with trace points Signed-off-by: Laurent Vivier <lvivier@redhat.com> Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>	2017-02-22 11:28:28 +11:00

1 2 3 4 5 ...

930 Commits