mirrors/qemu - qemu - SynapseOS git

Author	SHA1	Message	Date
Kevin Wolf	19ebd13ed4	commit: Fix use after free in completion The final bdrv_set_backing_hd() could be working on already freed nodes because the commit job drops its references (through BlockBackends) to both overlay_bs and top already a bit earlier. One way to trigger the bug is hot unplugging a disk for which blockdev_mark_auto_del() cancels the block job. Fix this by taking BDS-level references while we're still using the nodes. Cc: qemu-stable@nongnu.org Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: John Snow <jsnow@redhat.com>	2017-06-09 13:46:13 +02:00
Kevin Wolf	49695eeb74	qemu-iotests: Block migration test Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Jeff Cody <jcody@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com>	2017-06-09 11:45:03 +02:00
Kevin Wolf	362fdf170c	migration/block: Clean up BBs in block_save_complete() We need to release any block migrations BlockBackends on the source before successfully completing the migration because otherwise inactivating the images will fail (inactivation only tolerates device BBs). Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Fam Zheng <famz@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Jeff Cody <jcody@redhat.com>	2017-06-09 11:45:03 +02:00
Kevin Wolf	f07fa4cbf0	migration: Inactivate images after .save_live_complete_precopy() Block migration may still access the image during its .save_live_complete_precopy() implementation, so we should only inactivate the image afterwards. Another reason for the change is that inactivating an image fails when there is still a non-device BlockBackend using it, which includes the BBs used by block migration. We want to give block migration a chance to release the BBs before trying to inactivate the image (this will be done in another patch). Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Fam Zheng <famz@redhat.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Jeff Cody <jcody@redhat.com>	2017-06-09 11:45:03 +02:00
Kevin Wolf	93c26503e0	block: Fix anonymous BBs in blk_root_inactivate() blk->name isn't an array, but a pointer that can be NULL. Checking for an anonymous BB must involve a NULL check first, otherwise we get crashes. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Fam Zheng <famz@redhat.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Jeff Cody <jcody@redhat.com>	2017-06-09 11:45:03 +02:00
Laurent Vivier	593080936a	Revert "spapr: fix memory hot-unplugging" This reverts commit `fe6824d126`. Conflicts hw/ppc/spapr_drc.c, because get_index() has been renamed spapr_get_index(). This didn't fix the problem. Once the hotplug has been started some memory is allocated and some structures are allocated. We don't free it when we ignore the unplug, and we can't because they can be in use by the kernel. Signed-off-by: Laurent Vivier <lvivier@redhat.com> Tested-by: Daniel Barboza <danielhb@linux.vnet.ibm.com> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>	2017-06-09 12:35:46 +10:00
Greg Kurz	b1fd36c363	xics: drop ICPStateClass::cpu_setup() handler The cpu_setup() handler is only implemented by xics_kvm, where it really does a typical "realize" job. Moreover, the realize() handler is called shortly after cpu_setup(), on the same path. This patch converts xics_kvm to implement realize() instead of cpu_setup(). Signed-off-by: Greg Kurz <groug@kaod.org> Reviewed-by: Cédric Le Goater <clg@kaod.org> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>	2017-06-09 12:17:59 +10:00
Greg Kurz	9ed656631d	xics: setup cpu at realize time Until recently, spapr used to allocate ICPState objects for the lifetime of the machine. They would only be associated to vCPUs in xics_cpu_setup() when plugging a CPU core. Now that ICPState objects have the same lifecycle as vCPUs, it is possible to associate them during realization. This patch hence open-codes xics_cpu_setup() in icp_realize(). The vCPU is passed as a property. Note that vCPU now needs to be realized first for the IRQs to be allocated. It also needs to resetted before ICPState realization in order to synchronize with KVM. Since ICPState objects are freed when unrealized, xics_cpu_destroy() isn't needed anymore and can be safely dropped. Signed-off-by: Greg Kurz <groug@kaod.org> Reviewed-by: Cédric Le Goater <clg@kaod.org> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>	2017-06-09 12:15:57 +10:00
Greg Kurz	100f738850	xics: pass appropriate types to realize() handlers. It makes more sense to pass an IPCState * to handlers of ICPStateClass instead of a DeviceState *, if only to benefit from compile time type checking. The same goes with ICSStateClass. While here, we also change the declaration of ICPStateClass in xics.h for consistency. Signed-off-by: Greg Kurz <groug@kaod.org> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>	2017-06-09 12:12:34 +10:00
Greg Kurz	ad265631c0	xics: introduce macros for ICP/ICS link properties These properties are part of the XICS API. They deserve to appear explicitely in the XICS header file. Signed-off-by: Greg Kurz <groug@kaod.org> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>	2017-06-09 12:12:34 +10:00
Thomas Huth	3b95410507	hw/cpu: core.c can be compiled as common object There does not seem to be any target specific code in core.c, so we can put it into "common-obj" instead of "obj" to compile it only once for all targets. Signed-off-by: Thomas Huth <thuth@redhat.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>	2017-06-09 12:02:55 +10:00
Marcel Apfelbaum	bc277a52fb	hw/pcie: fix the generic pcie root port to support migration Add msix state to pcie-root-ports's vmstate in order to support migration. Signed-off-by: Marcel Apfelbaum <marcel@redhat.com> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>	2017-06-08 22:02:37 +03:00
Haozhong Zhang	20fdef58a0	nvdimm acpi: fix region format interface code Per ACPI 6.2, section 5.2.25.6 and JEDEC Annex L Release 3, the current region format interface code 0x201 indicates the block addressed function interface 1, rather than a byte addressable interface. Fix it by using 0x301 which indicates the byte addressable no energy backed function interface 1. Signed-off-by: Haozhong Zhang <haozhong.zhang@intel.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>	2017-06-08 22:02:36 +03:00
Marc-André Lureau	277238f9f4	vhost-user-bridge: fix iov_restore_front() warning CC tests/vhost-user-bridge.o /home/dgilbert/git/qemu-world3/tests/vhost-user-bridge.c:228:23: warning: variables 'front' and 'iov' used in loop condition not modified in loop body [-Wfor-loop-analysis] for (cur = front; front != iov; cur++) { ^~~~~ ~~~ 1 warning generated. Fix the loop, document the function, and fix some related assert(). In practice, the loop bug was harmless because the front sg buffer is enough to discard/restore the header size. Reported-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Tested-by: Jens Freimann <jfreiman@redhat.com>	2017-06-08 22:02:36 +03:00
Marc-André Lureau	27d4c3789d	test-char: start a /char/serial test Quite limited test, to check that the chardev can be created with a path and with the tty alias. Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com>	2017-06-08 17:58:13 +04:00
Marc-André Lureau	73119c2864	chardev: don't use alias names in parse_compat() "parport" is considered "old" since commit `88a946d32d`, when "parallel" was added. Similarly for "tty" in commit `d59044ef74`. Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com>	2017-06-08 17:57:58 +04:00
Marc-André Lureau	d203c64398	char: fix alias devices regression Fix regression from commit `4d43a603c7`, where the serial and parallel headers got removed from char.c, which broke the alias table. Move the HAVE_CHARDEV_SERIAL/HAVE_CHARDEV_PARPORT to osdep.h instead of being in separate headers. Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com>	2017-06-08 17:57:36 +04:00
Thomas Huth	4871dd4c3f	hw/ppc/spapr: Adjust firmware name for PCI bridges SLOF uses "pci" as name for PCI bridges nodes in the device tree instead of "pci-bridges", so booting via bootindex from a device behind a PCI bridge currently does not work since QEMU passes the wrong name in the "qemu,boot-list" property. Fix it by changing the name of the PCI bridge nodes to "pci" instead. Buglink: https://bugzilla.redhat.com/show_bug.cgi?id=1459170 Signed-off-by: Thomas Huth <thuth@redhat.com> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>	2017-06-08 14:38:27 +10:00
Greg Kurz	a4d4edce7a	xics: add reset() handler to ICPStateClass Taking into account that qemu_set_irq() returns immediatly if its first argument is NULL, icp_kvm_reset() largely duplicates icp_reset(). This patch introduces a reset() handler, so that the common logic can be implemented in icp_reset() only. While there we can also drop icp_kvm_realize() and icp_kvm_unrealize(). This causes icp-kvm to be realized in icp_realize(), which sets icp->xics, but it has no impact. Signed-off-by: Greg Kurz <groug@kaod.org> Reviewed-by: Cédric Le Goater <clg@kaod.org> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>	2017-06-08 14:38:27 +10:00
Greg Kurz	67b544d65f	pnv_core: drop reference on ICPState object during CPU realization Similarly to what was done to spapr with commit `249127d0df`, this patch ensures that we don't keep an extra reference on the ICPState object. Also since the object was just created and not reparented yet, the call to object_property_add_child() should never fail: let's pass &error_abort to make this clear. Signed-off-by: Greg Kurz <groug@kaod.org> Reviewed-by: Cédric Le Goater <clg@kaod.org> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>	2017-06-08 14:38:27 +10:00
David Gibson	7980833619	spapr: Rework DRC name handling DRC objects have a get_name method which returns the DRC name generated when the DRC is created. Replace that with a fixed spapr_drc_name() function which generates the name on the fly from other information. This means: * We get rid of a method with only one implementation, and only local callers * We don't have to carry the name string around for the lifetime of the DRC * We use information added to the class structure to generate the name in standard format, so we don't need an explicit switch on drc type any more We also eliminate the 'name' property; it's basically useless since the only information in it can easily be deduced from other things. Signed-off-by: David Gibson <david@gibson.dropbear.id.au> Reviewed-by: Michael Roth <mdroth@linux.vnet.ibm.com> Acked-by: Michael Roth <mdroth@linux.vnet.ibm.com>	2017-06-08 14:38:27 +10:00
David Gibson	6304fd27ef	spapr: Fold spapr_phb_{add,remove}_pci_device() into their only callers Both functions are fairly short, and so are their callers. There's no particular logical distinction between them, so fold them together. Signed-off-by: David Gibson <david@gibson.dropbear.id.au> Reviewed-by: Michael Roth <mdroth@linux.vnet.ibm.com> Acked-by: Michael Roth <mdroth@linux.vnet.ibm.com>	2017-06-08 14:38:27 +10:00
David Gibson	0be4e88621	spapr: Change DRC attach & detach methods to functions DRC objects have attach & detach methods, but there's only one implementation. Although there are some differences in its behaviour for different DRC types, the overall structure is the same, so while we might want different method implementations for some parts, we're unlikely to want them for the top-level functions. So, replace them with direct function calls. Signed-off-by: David Gibson <david@gibson.dropbear.id.au> Reviewed-by: Michael Roth <mdroth@linux.vnet.ibm.com> Acked-by: Michael Roth <mdroth@linux.vnet.ibm.com>	2017-06-08 14:38:26 +10:00
David Gibson	cd74d27e42	spapr: Clean up handling of DR-indicator There are 3 types of "indicator" associated with hotplug in the PAPR spec the "allocation state", "isolation state" and "DR-indicator". The first two are intimately tied to the various state transitions associated with hotplug. The DR-indicator, however, is different and simpler. It's basically just a guest controlled variable which can be used by the guest to flag state or problems associated with a device. The idea is that the hypervisor can use it to present information back on management consoles (on some machines with PowerVM it may even control physical LEDs on the machine case associated with the relevant device). For that reason, there's only ever likely to be a single update implementation so the set_indicator_state method isn't useful. Replace it with a direct function call. While we're there, make some small associated cleanups: * PAPR doesn't use the term "indicator state", just "DR-indicator" and the allocation state and isolation state are also considered "indicators". Rename things to be less confusing * Fold set_indicator_state() and rtas_set_indicator_state() into a single rtas_set_dr_indicator() function. Signed-off-by: David Gibson <david@gibson.dropbear.id.au> Reviewed-by: Michael Roth <mdroth@linux.vnet.ibm.com> Acked-by: Michael Roth <mdroth@linux.vnet.ibm.com>	2017-06-08 14:38:26 +10:00
David Gibson	7b7258f810	spapr: Clean up RTAS set-indicator In theory the RTAS set-indicator call can be used for a number of "indicators" defined by PAPR. In practice the only ones we're ever likely to implement are those used for Dynamic Reconfiguration (i.e. hotplug). Because of this, the current implementation determines the associated DRC object, before dispatching based on the type of indicator. However, this means we also need a check that we're dealing with a DR related indicator at all, which duplicates some of the logic from the switch further down. Even though it means a bit of code duplication, things work out cleaner if we delegate the DRC lookup to the individual indicator type functions - and it also allows some further cleanups. While we're there, remove references to "sensor", a copy/paste artefact from the related, but distinct "get-sensor" call. Signed-off-by: David Gibson <david@gibson.dropbear.id.au> Reviewed-by: Michael Roth <mdroth@linux.vnet.ibm.com> Acked-by: Michael Roth <mdroth@linux.vnet.ibm.com>	2017-06-08 14:38:26 +10:00
David Gibson	454b580ae9	spapr: Don't misuse DR-indicator in spapr_recover_pending_dimm_state() With some combinations of migration and hotplug we can lost temporary state indicating how many DRCs (guest side hotplug handles) are still connected to a DIMM object in the process of removal. When we hit that situation spapr_recover_pending_dimm_state() is used to scan more extensively and work out the right number. It does this using drc->indicator state to determine what state of disconnection the DRC is in. However, this is not safe, because the indicator state is guest settable - in fact it's more-or-less a purely guest->host notification mechanism which should have no bearing on the internals of hotplug state management. So, replace the test for this with a test on drc->dev, which is a purely qemu side managed variable, and updated the same BQL critical section as the indicator state. This does introduce an off-by-one change, because the indicator state was updated before the call to spapr_lmb_release() on the current DRC, whereas drc->dev is updated afterwards. That's corrected by always decrementing the nr_lmbs value instead of only doing so in the case where we didn't have to recover information. Signed-off-by: David Gibson <david@gibson.dropbear.id.au> Reviewed-by: Michael Roth <mdroth@linux.vnet.ibm.com> Acked-by: Michael Roth <mdroth@linux.vnet.ibm.com>	2017-06-08 14:38:26 +10:00
David Gibson	f224d35be9	spapr: Clean up DR entity sense handling DRC classes have an entity_sense method to determine (in a specific PAPR sense) the presence or absence of a device plugged into a DRC. However, we only have one implementation of the method, which explicitly tests for different DRC types. This changes it to instead have different method implementations for the two cases: "logical" and "physical" DRCs. While we're at it, the entity sense method always returns RTAS_OUT_SUCCESS, and the interesting value is returned via pass-by-reference. Simplify this to directly return the value we care about Signed-off-by: David Gibson <david@gibson.dropbear.id.au> Reviewed-by: Michael Roth <mdroth@linux.vnet.ibm.com> Acked-by: Michael Roth <mdroth@linux.vnet.ibm.com>	2017-06-08 14:38:26 +10:00
David Gibson	2c5534776b	pseries: Correct panic behaviour for pseries machine type The pseries machine type doesn't usually use the 'pvpanic' device as such, because it has a firmware/hypervisor facility with roughly the same purpose. The 'ibm,os-term' RTAS call notifies the hypervisor that the guest has crashed. Our implementation of this call was sending a GUEST_PANICKED qmp event; however, it was not doing the other usual panic actions, making its behaviour different from pvpanic for no good reason. To correct this, we should call qemu_system_guest_panicked() rather than directly sending the panic event. Signed-off-by: David Gibson <david@gibson.dropbear.id.au> Reviewed-by: Thomas Huth <thuth@redhat.com>	2017-06-08 14:38:18 +10:00
Greg Kurz	8a9e0e7b89	spapr: fix memory leak in spapr_memory_pre_plug() The string returned by object_property_get_str() is dynamically allocated. (Spotted by Coverity, CID 1375942) Signed-off-by: Greg Kurz <groug@kaod.org> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>	2017-06-08 11:05:31 +10:00
Greg Kurz	2d3e302ec2	target/ppc: fix memory leak in kvmppc_is_mem_backend_page_size_ok() The string returned by object_property_get_str() is dynamically allocated. Signed-off-by: Greg Kurz <groug@kaod.org> Reviewed-by: Thomas Huth <thuth@redhat.com> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>	2017-06-08 11:05:31 +10:00
Greg Kurz	ec69355bef	target/ppc: pass const string to kvmppc_is_mem_backend_page_size_ok() This function has three implementations. Two are stubs that do nothing and the third one only passes the obj_path argument to: Object object_resolve_path(const char path, bool *ambiguous); Signed-off-by: Greg Kurz <groug@kaod.org> Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org> Reviewed-by: Thomas Huth <thuth@redhat.com> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>	2017-06-08 11:05:31 +10:00
Peter Maydell	bbfa326fc8	* virtio-scsi use-after-free fix (Fam) * SMM fixes and improvements for TCG (myself, Mihail) * irqchip and AddressSpaceDispatch cleanups and fixes (Peter) * Coverity fix (Stefano) * NBD cleanups and fixes (Vladimir, Eric, myself) * RTC accuracy improvements and code cleanups (Guangrong+Yunfang) * socket error reporting improvement (Daniel) * GDB XML description for SSE registers (Abdallah) * kvmclock update fix (Denis) * SMM memory savings (Gonglei) * -cpu 486 fix (myself) * various bugfixes (Roman, Peter, myself, Thomas) * rtc-test improvement (Guangrong) * migration throttling fix (Felipe) * create docs/ subdirectories (myself) -----BEGIN PGP SIGNATURE----- iQFIBAABCAAyFiEE8TM4V0tmI4mGbHaCv/vSX3jHroMFAlk4KC8UHHBib256aW5p QHJlZGhhdC5jb20ACgkQv/vSX3jHroMCcgf8Dv25jkHHS1lhbq8bR/naatJuS73s sVveHeNp/7iZe5t4S0iLp5XRv3eDQD9AawMmQpJrKBqp21Q4e+iosnkgw3gIdUJX poD7DgAgoiOZgoPgYgNFRnmGscDX+2fBHvKWQJ0y1+lsbtpXxblGba6cBlMbDc5O dJrM7DGciNb4gFRtB7U2k9HMZx0rKdsdyLdpnKoiG4HdkzAL8SSrI10Kn9bw15tf aJqe3lm9i/gCM6zUA1ZOst4Nz5srPJkQUcuZ3HaYzB/nNNUbZF/01giWiUKCFVbt VRPmfA+zwcAWEiueFFjqEZ2ksO4RLUv4+yYSxN0JVibTFkOHncEgY6w9bw== =Z0/h -----END PGP SIGNATURE----- Merge remote-tracking branch 'remotes/bonzini/tags/for-upstream' into staging * virtio-scsi use-after-free fix (Fam) * SMM fixes and improvements for TCG (myself, Mihail) * irqchip and AddressSpaceDispatch cleanups and fixes (Peter) * Coverity fix (Stefano) * NBD cleanups and fixes (Vladimir, Eric, myself) * RTC accuracy improvements and code cleanups (Guangrong+Yunfang) * socket error reporting improvement (Daniel) * GDB XML description for SSE registers (Abdallah) * kvmclock update fix (Denis) * SMM memory savings (Gonglei) * -cpu 486 fix (myself) * various bugfixes (Roman, Peter, myself, Thomas) * rtc-test improvement (Guangrong) * migration throttling fix (Felipe) * create docs/ subdirectories (myself) # gpg: Signature made Wed 07 Jun 2017 17:22:07 BST # gpg: using RSA key 0xBFFBD25F78C7AE83 # gpg: Good signature from "Paolo Bonzini <bonzini@gnu.org>" # gpg: aka "Paolo Bonzini <pbonzini@redhat.com>" # Primary key fingerprint: 46F5 9FBD 57D6 12E7 BFD4 E2F7 7E15 100C CD36 69B1 # Subkey fingerprint: F133 3857 4B66 2389 866C 7682 BFFB D25F 78C7 AE83 * remotes/bonzini/tags/for-upstream: (31 commits) docs: create config/, devel/ and spin/ subdirectories cpus: reset throttle_thread_scheduled after sleep kvm: don't register smram_listener when smm is off nbd: make it thread-safe, fix qcow2 over nbd target/i386: Add GDB XML description for SSE registers i386/kvm: do not zero out segment flags if segment is unusable or not present edu: fix memory leak on msi_broken platforms linuxboot_dma: compile for i486 kvmclock: update system_time_msr address forcibly nbd: Fully initialize client in case of failed negotiation sockets: improve error reporting if UNIX socket path is too long i386: fix read/write cr with icount option target/i386: use multiple CPU AddressSpaces target/i386: enable A20 automatically in system management mode virtio-scsi: Unset hotplug handler when unrealize exec: simplify phys_page_find() params nbd/client.c: use errp instead of LOG nbd: add errp to read_sync, write_sync and drop_sync nbd: add errp parameter to nbd_wr_syncv() nbd: read_sync and friends: return 0 on success ... Signed-off-by: Peter Maydell <peter.maydell@linaro.org>	2017-06-07 18:24:08 +01:00
Paolo Bonzini	ac06724a71	docs: create config/, devel/ and spin/ subdirectories Developer documentation should be its own manual. As a start, move all developer-oriented files to a separate directory. Also move non-text files to their own directories: docs/config/ for QEMU -readconfig input, and docs/spin/ for formal models to be used with the SPIN model checker. Reviewed-by: Daniel P. Berrange <berrange@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2017-06-07 18:22:03 +02:00
Felipe Franciosi	90bb0c0421	cpus: reset throttle_thread_scheduled after sleep Currently, the throttle_thread_scheduled flag is reset back to 0 before sleeping (as part of the throttling logic). Given that throttle_timer (well, any timer) may tick with a slight delay, it so happens that under heavy throttling (ie. close or on CPU_THROTTLE_PCT_MAX) the tick may schedule a further cpu_throttle_thread() work item after the flag reset, but before the previous sleep completed. This results on the vCPU thread sleeping continuously for potentially several seconds in a row. The chances of that happening can be drastically minimised by resetting the flag after the sleep. Signed-off-by: Felipe Franciosi <felipe@nutanix.com> Signed-off-by: Malcolm Crossley <malcolm@nutanix.com> Message-Id: <1495229390-18909-1-git-send-email-felipe@nutanix.com> Acked-by: Jason J. Herne <jjherne@linux.vnet.ibm.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2017-06-07 18:22:03 +02:00
Gonglei	d870cfdea5	kvm: don't register smram_listener when smm is off If the user set disable smm by '-machine smm=off', we should not register smram_listener so that we can avoid waster memory in kvm since the added sencond address space. Meanwhile we should assign value of the global kvm_state before invoking the kvm_arch_init(), because pc_machine_is_smm_enabled() may use it by kvm_has_mm(). Signed-off-by: Gonglei <arei.gonglei@huawei.com> Message-Id: <1496316915-121196-1-git-send-email-arei.gonglei@huawei.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2017-06-07 18:22:02 +02:00
Paolo Bonzini	6bdcc018a6	nbd: make it thread-safe, fix qcow2 over nbd NBD is not thread safe, because it accesses s->in_flight without a CoMutex. Fixing this will be required for multiqueue. CoQueue doesn't have spurious wakeups but, when another coroutine can run between qemu_co_queue_next's wakeup and qemu_co_queue_wait's re-locking of the mutex, the wait condition can become false and a loop is necessary. In fact, it turns out that the loop is necessary even without this multi-threaded scenario. A particular sequence of coroutine wakeups is happening ~80% of the time when starting a guest with qcow2 image served over NBD (i.e. qemu-nbd --format=raw, and QEMU's -drive option has -format=qcow2). This patch fixes that issue too. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2017-06-07 18:22:02 +02:00
Abdallah Bouassida	b8158192fa	target/i386: Add GDB XML description for SSE registers Add an XML description for SSE registers (XMM+MXCSR) for both X86 and X86-64 architectures in the GDB stub: - configure: Define gdb_xml_files for the X86 targets (32 and 64bit). - gdb-xml/i386-32bit-sse.xml & gdb-xml/i386-64bit-sse.xml: The XML files that contain a description of the XMM + MXCSR registers. - gdb-xml/i386-32bit.xml & gdb-xml/i386-64bit.xml: wrappers that include the XML file of the core registers and the other XML file of the SSE registers. - target/i386/cpu.c: Modify the gdb_core_xml_file to the new XML wrapper, modify the gdb_num_core_regs to fit the registers number defined in each XML file. Signed-off-by: Abdallah Bouassida <abdallah.bouassida@lauterbach.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2017-06-07 18:22:02 +02:00
Roman Pen	d45fc087c2	i386/kvm: do not zero out segment flags if segment is unusable or not present This is a fix for the problem [1], where VMCB.CPL was set to 0 and interrupt was taken on userspace stack. The root cause lies in the specific AMD CPU behaviour which manifests itself as unusable segment attributes on SYSRET[2]. Here in this patch flags are not touched even segment is unusable or is not present, therefore CPL (which is stored in DPL field) should not be lost and will be successfully restored on kvm/svm kernel side. Also current patch should not break desired behavior described in this commit: `4cae9c9796` ("target-i386: kvm: clear unusable segments' flags in migration") since present bit will be dropped if segment is unusable or is not present. This is the second part of the whole fix of the corresponding problem [1], first part is related to kvm/svm kernel side and does exactly the same: segment attributes are not zeroed out. [1] Message id: CAJrWOzD6Xq==b-zYCDdFLgSRMPM-NkNuTSDFEtX=7MreT45i7Q@mail.gmail.com [2] Message id: 5d120f358612d73fc909f5bfa47e7bd082db0af0.1429841474.git.luto@kernel.org Signed-off-by: Roman Pen <roman.penyaev@profitbricks.com> Signed-off-by: Mikhail Sennikovskii <mikhail.sennikovskii@profitbricks.com> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Radim Krčmář <rkrcmar@redhat.com> Cc: Michael Chapman <mike@very.puzzling.org> Cc: qemu-devel@nongnu.org Message-Id: <20170601085604.12980-1-roman.penyaev@profitbricks.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2017-06-07 18:22:02 +02:00
Paolo Bonzini	c25a67f0c3	edu: fix memory leak on msi_broken platforms If msi_init fails, the thread has already been created and the mutex/condvar are not destroyed. Initialize everything only after the point where pci_edu_realize cannot fail. Reported-by: Markus Armbruster <armbru@redhat.com> Cc: Peter Xu <peterx@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2017-06-07 18:22:02 +02:00
Paolo Bonzini	7e01838510	linuxboot_dma: compile for i486 The ROM uses the cmovne instruction, which is new in Pentium Pro and does not work when running QEMU with "-cpu 486". Avoid producing that instruction. Suggested-by: Richard W.M. Jones <rjones@redhat.com> Suggested-by: Thomas Huth <thuth@redhat.com> Reported-by: Rob Landley <rob@landley.net> Cc: qemu-stable@nongnu.org Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2017-06-07 18:22:02 +02:00
Denis Plotnikov	e2b6c1712e	kvmclock: update system_time_msr address forcibly Do an update of system_time_msr address every time before reading the value of tsc_timestamp from guest's kvmclock page. There is no other code paths which ensure that qemu has an up-to-date value of system_time_msr. So, force this update on guest's tsc_timestamp reading. This bug causes effect on those nested setups which turn off TPR access interception for L2 guests and that access being intercepted by L0 doesn't show up in L1. Linux bootstrap initiate kvmclock before APIC initializing causing TPR access. That's why on L1 guests, having TPR interception turned on for L2, the effect of the bug is not revealed. This patch fixes this problem by making sure it knows the correct system_time_msr address every time it is needed. Signed-off-by: Denis Plotnikov <dplotnikov@virtuozzo.com> Message-Id: <1496054944-25623-1-git-send-email-dplotnikov@virtuozzo.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2017-06-07 18:22:02 +02:00
Eric Blake	df8ad9f128	nbd: Fully initialize client in case of failed negotiation If a non-NBD client connects to qemu-nbd, we would end up with a SIGSEGV in nbd_client_put() because we were trying to unregister the client's association to the export, even though we skipped inserting the client into that list. Easy trigger in two terminals: $ qemu-nbd -p 30001 --format=raw file $ nmap 127.0.0.1 -p 30001 nmap claims that it thinks it connected to a pago-services1 server (which probably means nmap could be updated to learn the NBD protocol and give a more accurate diagnosis of the open port - but that's not our problem), then terminates immediately, so our call to nbd_negotiate() fails. The fix is to reorder nbd_co_client_start() to ensure that all initialization occurs before we ever try talking to a client in nbd_negotiate(), so that the teardown sequence on negotiation failure doesn't fault while dereferencing a half-initialized object. While debugging this, I also noticed that nbd_update_server_watch() called by nbd_client_closed() was still adding a channel to accept the next client, even when the state was no longer RUNNING. That is fixed by making nbd_can_accept() pay attention to the current state. Fixes: https://bugzilla.redhat.com/show_bug.cgi?id=1451614 Signed-off-by: Eric Blake <eblake@redhat.com> Message-Id: <20170527030421.28366-1-eblake@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2017-06-07 18:22:02 +02:00
Daniel P. Berrange	ad9579aaa1	sockets: improve error reporting if UNIX socket path is too long The 'struct sockaddr_un' only allows 108 bytes for the socket path. If the user supplies a path, QEMU uses snprintf() to silently truncate it when too long. This is undesirable because the user will then be unable to connect to the path they asked for. If the user doesn't supply a path, QEMU builds one based on TMPDIR, but if that leads to an overlong path, it mistakenly uses error_setg_errno() with a stale errno value, because snprintf() does not set errno on truncation. In solving this the code needed some refactoring to ensure we don't pass 'un.sun_path' directly to any APIs which expect NUL-terminated strings, because the path is not required to be terminated. Signed-off-by: Daniel P. Berrange <berrange@redhat.com> Message-Id: <20170525155300.22743-1-berrange@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2017-06-07 18:22:02 +02:00
Mihail Abakumov	5b003a40bb	i386: fix read/write cr with icount option Running Windows with icount causes a crash in instruction of write cr. This patch fixes it. Reading and writing cr cause an icount read because there are called cpu_get_apic_tpr and cpu_set_apic_tpr functions. So, there is need gen_io_start()/gen_io_end() calls. Signed-off-by: Mihail Abakumov <mikhail.abakumov@ispras.ru> Message-Id: <ffb376034ff184f2fcbe93d5317d9e76@ispras.ru> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2017-06-07 18:22:02 +02:00
Paolo Bonzini	f8c45c6550	target/i386: use multiple CPU AddressSpaces This speeds up SMM switches. Later on it may remove the need to take the BQL, and it may also allow to reuse code between TCG and KVM. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2017-06-07 18:22:02 +02:00
Paolo Bonzini	c8bc83a4dd	target/i386: enable A20 automatically in system management mode Ignore env->a20_mask when running in system management mode. Reported-by: Anthony Xu <anthony.xu@intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Message-Id: <1494502528-12670-1-git-send-email-pbonzini@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2017-06-07 18:22:02 +02:00
Peter Maydell	64175afc69	arm_gicv3: Fix ICC_BPR1 reset value when EL3 not implemented If EL3 is not implemented (ie only one security state) then the one and only ICC_BPR1 register behaves like the Non-secure ICC_BPR1 in an EL3-present configuration. In particular, its reset value is GIC_MIN_BPR_NS, not GIC_MIN_BPR. Correct the erroneous reset value; this fixes a problem where we might hit the assert added in commit `a89ff39ee9`. Reported-by: Alex Bennée <alex.bennee@linaro.org> Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Tested-by: Alex Bennée <alex.bennee@linaro.org> Message-id: 1496849369-30282-1-git-send-email-peter.maydell@linaro.org	2017-06-07 17:21:44 +01:00
Bruno Dominguez	11cde1c810	configure: split c and cxx extra flags There was no possibility to add specific cxx flags using the configure file. So A new entrance has been created to support it. Duplication of information in configure and rules.mak. Taking QEMU_CFLAGS and add them to QEMU_CXXFLAGS, now the value of QEMU_CXXFLAGS is stored in config-host.mak, so there is no need for it. The makefile for libvixl was adding flags for QEMU_CXXFLAGS in QEMU_CFLAGS because of the addition in rules.mak. That was removed, so adding them where it should be. Signed-off-by: Bruno Dominguez <bru.dominguez@gmail.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Message-id: 1496754467-20893-1-git-send-email-bru.dominguez@gmail.com Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>	2017-06-07 15:29:46 +01:00
Peter Maydell	b55a69fe5f	migration/next for 20170607 -----BEGIN PGP SIGNATURE----- iQIcBAABCAAGBQJZN8EJAAoJEPSH7xhYctcjZZ4P/js/OAwnjO7TF63XsZ2ORzBp 8/HCQAL2/0QojiwEeS/xFb2bNnTlISsybqFnESfYG0dGE+JQKpT1+kD9DR0qqAK2 //T7SkhBNdF9tEeAF5l7CSlc8UKO8nIb7kOtoRFbIbepFzypPpgqR74ORP4kOdmf ShEEFuQmgBj8fhg3QzX9gcpqH/e+DUPPMlxb0CPBGkqHNFpwMYA4dARVkbrmMLRH //lKfzBXCYoJ9/jAqTC1rHJaZBnxfU+Wm9+rMRleDSTSLUKFWlcoeGkPCAkEvy2a ABHxnd82gzFJzKT2D/ERirR8/Z35XNAlrfyqcOoxWtmbV0Y/Ee3IBm3jyqqWkLj3 A/kMIiBDRWk2fjCgHzt6NWQBbzL6iHpsX7Qr7yAckC5+a4P/434g7mYpjZfl4IU1 CfIO7DdSvya5lZ4a5Q14r2dwiFO5KRcyBFwsW2qC+MIJz5UNrSQDkQ5vu2mWz/xq a1FC3eC5GjUKHmc5eEMY1sDl4LemGoMtI0yAFdE9StQnXK96tWNK9MQ17S/Ti4Qs PY0mdco4dtqlUlT/miwmJtdS3q1zQgt8vyYtbOvbnMt70IOirFLsKH8fIfdBXbmN Xq+iQfH/UysB9JpnlYDE2ZjeHqHjA5k1rnDlFQDa7RG6vNmKwG2gJf1vWBuyKp17 cEqi0AhuXe4U0A2HKkFM =zm3b -----END PGP SIGNATURE----- Merge remote-tracking branch 'remotes/juanquintela/tags/migration/20170607' into staging migration/next for 20170607 # gpg: Signature made Wed 07 Jun 2017 10:02:01 BST # gpg: using RSA key 0xF487EF185872D723 # gpg: Good signature from "Juan Quintela <quintela@redhat.com>" # gpg: aka "Juan Quintela <quintela@trasno.org>" # Primary key fingerprint: 1899 FF8E DEBF 58CC EE03 4B82 F487 EF18 5872 D723 * remotes/juanquintela/tags/migration/20170607: qemu/migration: fix the double free problem on from_src_file ram: Make RAMState dynamic ram: Use MigrationStats for statistics ram: Move ZERO_TARGET_PAGE inside XBZRLE ram: Call migration_page_queue_free() at ram_migration_cleanup() ram: We only print throttling information sometimes ram: Unfold get_xbzrle_cache_stats() into populate_ram_info() Signed-off-by: Peter Maydell <peter.maydell@linaro.org>	2017-06-07 15:06:42 +01:00
Roman Pen	528f449f59	coroutine-lock: do not touch coroutine after another one has been entered Submission of requests on linux aio is a bit tricky and can lead to requests completions on submission path: `44713c9e85` ("linux-aio: Handle io_submit() failure gracefully") `0ed93d84ed` ("linux-aio: process completions from ioq_submit()") That means that any coroutine which has been yielded in order to wait for completion can be resumed from submission path and be eventually terminated (freed). The following use-after-free crash was observed when IO throttling was enabled: Program received signal SIGSEGV, Segmentation fault. [Switching to Thread 0x7f5813dff700 (LWP 56417)] virtqueue_unmap_sg (elem=0x7f5804009a30, len=1, vq=<optimized out>) at virtio.c:252 (gdb) bt #0 virtqueue_unmap_sg (elem=0x7f5804009a30, len=1, vq=<optimized out>) at virtio.c:252 ^^^^^^^^^^^^^^ remember the address #1 virtqueue_fill (vq=0x5598b20d21b0, elem=0x7f5804009a30, len=1, idx=0) at virtio.c:282 #2 virtqueue_push (vq=0x5598b20d21b0, elem=elem@entry=0x7f5804009a30, len=<optimized out>) at virtio.c:308 #3 virtio_blk_req_complete (req=req@entry=0x7f5804009a30, status=status@entry=0 '\000') at virtio-blk.c:61 #4 virtio_blk_rw_complete (opaque=<optimized out>, ret=0) at virtio-blk.c:126 #5 blk_aio_complete (acb=0x7f58040068d0) at block-backend.c:923 #6 coroutine_trampoline (i0=<optimized out>, i1=<optimized out>) at coroutine-ucontext.c:78 (gdb) p * elem $8 = {index = 77, out_num = 2, in_num = 1, in_addr = 0x7f5804009ad8, out_addr = 0x7f5804009ae0, in_sg = 0x0, out_sg = 0x7f5804009a50} ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ 'in_sg' and 'out_sg' are invalid. e.g. it is impossible that 'in_sg' is zero, instead its value must be equal to: (gdb) p/x 0x7f5804009ad8 + sizeof(elem->in_addr[0]) + 2 * sizeof(elem->out_addr[0]) $26 = 0x7f5804009af0 Seems 'elem' was corrupted. Meanwhile another thread raised an abort: Thread 12 (Thread 0x7f57f2ffd700 (LWP 56426)): #0 raise () from /lib/x86_64-linux-gnu/libc.so.6 #1 abort () from /lib/x86_64-linux-gnu/libc.so.6 #2 qemu_coroutine_enter (co=0x7f5804009af0) at qemu-coroutine.c:113 #3 qemu_co_queue_run_restart (co=0x7f5804009a30) at qemu-coroutine-lock.c:60 #4 qemu_coroutine_enter (co=0x7f5804009a30) at qemu-coroutine.c:119 ^^^^^^^^^^^^^^^^^^ WTF?? this is equal to elem from crashed thread #5 qemu_co_queue_run_restart (co=0x7f57e7f16ae0) at qemu-coroutine-lock.c:60 #6 qemu_coroutine_enter (co=0x7f57e7f16ae0) at qemu-coroutine.c:119 #7 qemu_co_queue_run_restart (co=0x7f5807e112a0) at qemu-coroutine-lock.c:60 #8 qemu_coroutine_enter (co=0x7f5807e112a0) at qemu-coroutine.c:119 #9 qemu_co_queue_run_restart (co=0x7f5807f17820) at qemu-coroutine-lock.c:60 #10 qemu_coroutine_enter (co=0x7f5807f17820) at qemu-coroutine.c:119 #11 qemu_co_queue_run_restart (co=0x7f57e7f18e10) at qemu-coroutine-lock.c:60 #12 qemu_coroutine_enter (co=0x7f57e7f18e10) at qemu-coroutine.c:119 #13 qemu_co_enter_next (queue=queue@entry=0x5598b1e742d0) at qemu-coroutine-lock.c:106 #14 timer_cb (blk=0x5598b1e74280, is_write=<optimized out>) at throttle-groups.c:419 Crash can be explained by access of 'co' object from the loop inside qemu_co_queue_run_restart(): while ((next = QSIMPLEQ_FIRST(&co->co_queue_wakeup))) { QSIMPLEQ_REMOVE_HEAD(&co->co_queue_wakeup, co_queue_next); ^^^^^^^^^^^^^^^^^^^^ on each iteration 'co' is accessed, but 'co' can be already freed qemu_coroutine_enter(next); } When 'next' coroutine is resumed (entered) it can in its turn resume 'co', and eventually free it. That's why we see 'co' (which was freed) has the same address as 'elem' from the first backtrace. The fix is obvious: use temporary queue and do not touch coroutine after first qemu_coroutine_enter() is invoked. The issue is quite rare and happens every ~12 hours on very high IO and CPU load (building linux kernel with -j512 inside guest) when IO throttling is enabled. With the fix applied guest is running ~35 hours and is still alive so far. Signed-off-by: Roman Pen <roman.penyaev@profitbricks.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Message-id: 20170601160847.23720-1-roman.penyaev@profitbricks.com Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Fam Zheng <famz@redhat.com> Cc: Stefan Hajnoczi <stefanha@redhat.com> Cc: Kevin Wolf <kwolf@redhat.com> Cc: qemu-devel@nongnu.org Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>	2017-06-07 14:39:00 +01:00

1 2 3 4 5 ...

53921 Commits