mirrors/qemu - qemu - SynapseOS git

Author	SHA1	Message	Date
Alexey G	7fb394ad8a	xen-mapcache: Fix the bug when overlapping emulated DMA operations may cause inconsistency in guest memory mappings Under certain circumstances normal xen-mapcache functioning may be broken by guest's actions. This may lead to either QEMU performing exit() due to a caught bad pointer (and with QEMU process gone the guest domain simply appears hung afterwards) or actual use of the incorrect pointer inside QEMU address space -- a write to unmapped memory is possible. The bug is hard to reproduce on a i440 machine as multiple DMA sources are required (though it's possible in theory, using multiple emulated devices), but can be reproduced somewhat easily on a Q35 machine using an emulated AHCI controller -- each NCQ queue command slot may be used as an independent DMA source ex. using READ FPDMA QUEUED command, so a single storage device on the AHCI controller port will be enough to produce multiple DMAs (up to 32). The detailed description of the issue follows. Xen-mapcache provides an ability to map parts of a guest memory into QEMU's own address space to work with. There are two types of cache lookups: - translating a guest physical address into a pointer in QEMU's address space, mapping a part of guest domain memory if necessary (while trying to reduce a number of such (re)mappings to a minimum) - translating a QEMU's pointer back to its physical address in guest RAM These lookups are managed via two linked-lists of structures. MapCacheEntry is used for forward cache lookups, while MapCacheRev -- for reverse lookups. Every guest physical address is broken down into 2 parts: address_index = phys_addr >> MCACHE_BUCKET_SHIFT; address_offset = phys_addr & (MCACHE_BUCKET_SIZE - 1); MCACHE_BUCKET_SHIFT depends on a system (32/64) and is equal to 20 for a 64-bit system (which assumed for the further description). Basically, this means that we deal with 1 MB chunks and offsets within those 1 MB chunks. All mappings are created with 1MB-granularity, i.e. 1MB/2MB/3MB etc. Most DMA transfers typically are less than 1MB, however, if the transfer crosses any 1MB border(s) - than a nearest larger mapping size will be used, so ex. a 512-byte DMA transfer with the start address 700FFF80h will actually require a 2MB range. Current implementation assumes that MapCacheEntries are unique for a given address_index and size pair and that a single MapCacheEntry may be reused by multiple requests -- in this case the 'lock' field will be larger than 1. On other hand, each requested guest physical address (with 'lock' flag) is described by each own MapCacheRev. So there may be multiple MapCacheRev entries corresponding to a single MapCacheEntry. The xen-mapcache code uses MapCacheRev entries to retrieve the address_index & size pair which in turn used to find a related MapCacheEntry. The 'lock' field within a MapCacheEntry structure is actually a reference counter which shows a number of corresponding MapCacheRev entries. The bug lies in ability for the guest to indirectly manipulate with the xen-mapcache MapCacheEntries list via a special sequence of DMA operations, typically for storage devices. In order to trigger the bug, guest needs to issue DMA operations in specific order and timing. Although xen-mapcache is protected by the mutex lock -- this doesn't help in this case, as the bug is not due to a race condition. Suppose we have 3 DMA transfers, namely A, B and C, where - transfer A crosses 1MB border and thus uses a 2MB mapping - transfers B and C are normal transfers within 1MB range - and all 3 transfers belong to the same address_index In this case, if all these transfers are to be executed one-by-one (without overlaps), no special treatment necessary -- each transfer's mapping lock will be set and then cleared on unmap before starting the next transfer. The situation changes when DMA transfers overlap in time, ex. like this: \|===== transfer A (2MB) =====\| \|===== transfer B (1MB) =====\| \|===== transfer C (1MB) =====\| time ---> In this situation the following sequence of actions happens: 1. transfer A creates a mapping to 2MB area (lock=1) 2. transfer B (1MB) tries to find available mapping but cannot find one because transfer A is still in progress, and it has 2MB size + non-zero lock. So transfer B creates another mapping -- same address_index, but 1MB size. 3. transfer A completes, making 1st mapping entry available by setting its lock to 0 4. transfer C starts and tries to find available mapping entry and sees that 1st entry has lock=0, so it uses this entry but remaps the mapping to a 1MB size 5. transfer B completes and by this time - there are two locked entries in the MapCacheEntry list with the SAME values for both address_index and size - the entry for transfer B actually resides farther in list while transfer C's entry is first 6. xen_ram_addr_from_mapcache() for transfer B gets correct address_index and size pair from corresponding MapCacheRev entry, but then it starts looking for MapCacheEntry with these values and finds the first entry -- which belongs to transfer C. At this point there may be following possible (bad) consequences: 1. xen_ram_addr_from_mapcache() will use a wrong entry->vaddr_base value in this statement: raddr = (reventry->paddr_index << MCACHE_BUCKET_SHIFT) + ((unsigned long) ptr - (unsigned long) entry->vaddr_base); resulting in an incorrent raddr value returned from the function. The (ptr - entry->vaddr_base) expression may produce both positive and negative numbers and its actual value may differ greatly as there are many map/unmap operations take place. If the value will be beyond guest RAM limits then a "Bad RAM offset" error will be triggered and logged, followed by exit() in QEMU. 2. If raddr value won't exceed guest RAM boundaries, the same sequence of actions will be performed for xen_invalidate_map_cache_entry() on DMA unmap, resulting in a wrong MapCacheEntry being unmapped while DMA operation which uses it is still active. The above example must be extended by one more DMA transfer in order to allow unmapping as the first mapping in the list is sort of resident. The patch modifies the behavior in which MapCacheEntry's are added to the list, avoiding duplicates. Signed-off-by: Alexey Gerasimenko <x1917x@gmail.com> Signed-off-by: Stefano Stabellini <sstabellini@kernel.org>	2017-07-21 17:37:06 -07:00
Igor Druzhinin	9e6bdb92c8	xen: fix compilation on 32-bit hosts Signed-off-by: Igor Druzhinin <igor.druzhinin@citrix.com> Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>	2017-07-21 17:32:56 -07:00
Peter Maydell	b3e46a8914	Xen 2017/07/18 -----BEGIN PGP SIGNATURE----- Version: GnuPG v1 iQIcBAABAgAGBQJZbokoAAoJEIlPj0hw4a6QgloP/jc9tVFrPjTDezDyPmXR4ls8 U/kvz5RCn2bu4y8h6U+FHK4BZ7DR1Ccd3Uq1qqDbnlyfcvJeISqqkN2RrnwUQEgV XMCEr+okQyiQV4H/MLvmUWtPHpHt3gSEBdoRdGHnkzA2dC/YsJ1F/khKgCh8wqWS GTeACabyDTb9L/QFdh//o7GtcI6qv/APGJ/rVVFrrVktp+lZuIZCGZ3hbJ8lopoI FSXuM7caVgIlNzP/6RmCoP91ibREPfbfL/yqgv0cW7kiOWVXWwriz6Mi/J2AzmCo jqgDqRzkLZPAl1WdZM7MosQIiY7ZlAGhpS9ArK5P4Kv7H6TYV7mkbiSap8SmjnZH NvSRLxgT3JjTE5evSodfaaQpjiX0KGaZX0JmpqXYPqOBSYal2lDUNFSokbeucp7w y3dBZGY0/9om+G34QzZNvPisYJ2F4Yr5DvCtue8hmkvLSw+z3251555wKQvc6TNx wob2h8b8h+YsfhvnSrN1R8w3OL69kGFlMz9PWEgB4opVacZqph/XsMKLARXCg+FD 83kCuJnV/WAannHpvQA8k4HO6GiKGtrulh6vv1QOlCJQokcK1mZt7Atot+cPcGU2 UTyhSaOv4sy07lPYzvv0B4MUHNObN/v/OoDygrf7WjCDKigH+RpsVruwAqFakoCB 09+PtQ26X2Vup+YtW5bG =qPq6 -----END PGP SIGNATURE----- Merge remote-tracking branch 'remotes/sstabellini/tags/xen-20170718-tag' into staging Xen 2017/07/18 # gpg: Signature made Tue 18 Jul 2017 23:18:16 BST # gpg: using RSA key 0x894F8F4870E1AE90 # gpg: Good signature from "Stefano Stabellini <stefano.stabellini@eu.citrix.com>" # gpg: aka "Stefano Stabellini <sstabellini@kernel.org>" # Primary key fingerprint: D04E 33AB A51F 67BA 07D3 0AEA 894F 8F48 70E1 AE90 * remotes/sstabellini/tags/xen-20170718-tag: xen: don't use xenstore to save/restore physmap anymore xen/mapcache: introduce xen_replace_cache_entry() xen/mapcache: add an ability to create dummy mappings xen: move physmap saving into a separate function xen-platform: separate unplugging of NVMe disks xen_pt_msi.c: Check for xen_host_pci_get_* failures in xen_pt_msix_init() hw/xen: Set emu_mask for igd_opregion register Signed-off-by: Peter Maydell <peter.maydell@linaro.org>	2017-07-19 16:31:08 +01:00
Igor Druzhinin	331b5189d7	xen: don't use xenstore to save/restore physmap anymore If we have a system with xenforeignmemory_map2() implemented we don't need to save/restore physmap on suspend/restore anymore. In case we resume a VM without physmap - try to recreate the physmap during memory region restore phase and remap map cache entries accordingly. The old code is left for compatibility reasons. Signed-off-by: Igor Druzhinin <igor.druzhinin@citrix.com> Reviewed-by: Paul Durrant <paul.durrant@citrix.com> Reviewed-by: Stefano Stabellini <sstabellini@kernel.org> Signed-off-by: Stefano Stabellini <sstabellini@kernel.org>	2017-07-18 14:16:52 -07:00
Igor Druzhinin	5ba3d75645	xen/mapcache: introduce xen_replace_cache_entry() This new call is trying to update a requested map cache entry according to the changes in the physmap. The call is searching for the entry, unmaps it and maps again at the same place using a new guest address. If the mapping is dummy this call will make it real. This function makes use of a new xenforeignmemory_map2() call with an extended interface that was recently introduced in libxenforeignmemory [1]. [1] https://www.mail-archive.com/xen-devel@lists.xen.org/msg113007.html Signed-off-by: Igor Druzhinin <igor.druzhinin@citrix.com> Reviewed-by: Paul Durrant <paul.durrant@citrix.com> Reviewed-by: Stefano Stabellini <sstabellini@kernel.org> Signed-off-by: Stefano Stabellini <sstabellini@kernel.org>	2017-07-18 14:16:09 -07:00
Igor Druzhinin	759235653d	xen/mapcache: add an ability to create dummy mappings Dummys are simple anonymous mappings that are placed instead of regular foreign mappings in certain situations when we need to postpone the actual mapping but still have to give a memory region to QEMU to play with. This is planned to be used for restore on Xen. Signed-off-by: Igor Druzhinin <igor.druzhinin@citrix.com> Reviewed-by: Paul Durrant <paul.durrant@citrix.com> Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>	2017-07-18 14:12:20 -07:00
Igor Druzhinin	697b66d006	xen: move physmap saving into a separate function Non-functional change. Signed-off-by: Igor Druzhinin <igor.druzhinin@citrix.com> Reviewed-by: Stefano Stabellini <sstabellini@kernel.org> Reviewed-by: Paul Durrant <paul.durrant@citrix.com>	2017-07-18 14:12:19 -07:00
Stefano Stabellini	04d6da4ff6	xen-platform: separate unplugging of NVMe disks Commit `090fa1c8` "add support for unplugging NVMe disks..." extended the existing disk unplug flag to cover NVMe disks as well as IDE and SCSI. The recent thread on the xen-devel mailing list [1] has highlighted that this is not desirable behaviour: PV frontends should be able to distinguish NVMe disks from other types of disk and should have separate control over whether they are unplugged. This patch defines a new bit in the unplug mask for this purpose (see Xen commit [2]) and also tidies up the definitions of, and improves the comments regarding, the previously exiting bits in the protocol. [1] https://lists.xen.org/archives/html/xen-devel/2017-03/msg02924.html [2] http://xenbits.xen.org/gitweb/?p=xen.git;a=commit;h=1096aa02 Signed-off-by: Paul Durrant <paul.durrant@citrix.com> Reviewed-by: Stefano Stabellini <sstabellini@kernel.org> Signed-off-by: Stefano Stabellini <sstabellini@kernel.org>	2017-07-18 14:12:06 -07:00
John Snow	bbe3179a13	ahci: add ahci_get_num_ports Instead of reaching into the PCI state, allow the AHCIDevice to respond with how many ports it has. Signed-off-by: John Snow <jsnow@redhat.com> Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org> Tested-by: Philippe Mathieu-Daudé <f4bug@amsat.org> Message-id: 20170623220926.11479-2-jsnow@redhat.com Signed-off-by: John Snow <jsnow@redhat.com>	2017-07-18 11:47:56 -04:00
Peter Maydell	98a99ce084	hw: Use new memory_region_init_{ram, rom, rom_device}() functions Use the new functions memory_region_init_{ram,rom,rom_device}() instead of manually calling the _nomigrate() version and then vmstate_register_ram_global(). Patch automatically created using coccinelle script: spatch --in-place -sp_file scripts/coccinelle/memory-region-init-ram.cocci -dir hw (As it turns out, there are no instances of the rom and rom_device functions that are caught by this script.) Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Paolo Bonzini <pbonzini@redhat.com> Message-id: 1499438577-7674-8-git-send-email-peter.maydell@linaro.org	2017-07-14 17:59:42 +01:00
Peter Maydell	1cfe48c1ce	memory: Rename memory_region_init_ram() to memory_region_init_ram_nomigrate() Rename memory_region_init_ram() to memory_region_init_ram_nomigrate(). This leaves the way clear for us to provide a memory_region_init_ram() which does handle migration. Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Paolo Bonzini <pbonzini@redhat.com> Message-id: 1499438577-7674-4-git-send-email-peter.maydell@linaro.org	2017-07-14 17:59:42 +01:00
Peter Maydell	6c6076662d	* gdbstub fixes (Alex) * IOMMU MemoryRegion subclass (Alexey) * Chardev hotswap (Anton) * NBD_OPT_GO support (Eric) * Misc bugfixes * DEFINE_PROP_LINK (minus the ARM patches - Fam) * MAINTAINERS updates (Philippe) -----BEGIN PGP SIGNATURE----- Version: GnuPG v2.0.22 (GNU/Linux) iQEcBAABAgAGBQJZaJejAAoJEL/70l94x66DwQ4H/0NUvh/Zfs64wE1iuZJACc24 1za02fFaB50vFDwQKWbM0GkHzDxoXBHk4Rvn92p+VSxpKtaAX4GRwCvxRA5GeUtm GAYbdIJUe0UELepKExrlUVzQcK9VfljoJpK3dZkP5Zzx83L2PAI/SexrZRibN2Uf yRI60uvlsMWU12nenzdVnYORd+TWDNKele7BhMrX/FX9wxaS1PlnsnKZggy6CU7G 8dwZJAZJ/s5tRGXyXyAQzLm5JZQCLnA6jxya540TbPeciFgbvvS2ydIitZ54vSPO VtmZ1rSWfTEbNF5xGD1Ztu8aAENr5/I05l6IjxZd45BdUCW3HxeJkc+7lE0K4uk= =wnVs -----END PGP SIGNATURE----- Merge remote-tracking branch 'remotes/bonzini/tags/for-upstream' into staging * gdbstub fixes (Alex) * IOMMU MemoryRegion subclass (Alexey) * Chardev hotswap (Anton) * NBD_OPT_GO support (Eric) * Misc bugfixes * DEFINE_PROP_LINK (minus the ARM patches - Fam) * MAINTAINERS updates (Philippe) # gpg: Signature made Fri 14 Jul 2017 11:06:27 BST # gpg: using RSA key 0xBFFBD25F78C7AE83 # gpg: Good signature from "Paolo Bonzini <bonzini@gnu.org>" # gpg: aka "Paolo Bonzini <pbonzini@redhat.com>" # Primary key fingerprint: 46F5 9FBD 57D6 12E7 BFD4 E2F7 7E15 100C CD36 69B1 # Subkey fingerprint: F133 3857 4B66 2389 866C 7682 BFFB D25F 78C7 AE83 * remotes/bonzini/tags/for-upstream: (55 commits) spapr_rng: Convert to DEFINE_PROP_LINK cpu: Convert to DEFINE_PROP_LINK mips_cmgcr: Convert to DEFINE_PROP_LINK ivshmem: Convert to DEFINE_PROP_LINK dimm: Convert to DEFINE_PROP_LINK virtio-crypto: Convert to DEFINE_PROP_LINK virtio-rng: Convert to DEFINE_PROP_LINK virtio-scsi: Convert to DEFINE_PROP_LINK virtio-blk: Convert to DEFINE_PROP_LINK qdev: Add const qualifier to PropertyInfo definitions qmp: Use ObjectProperty.type if present qdev: Introduce DEFINE_PROP_LINK qdev: Introduce PropertyInfo.create qom: enforce readonly nature of link's check callback translate-all: remove redundant !tcg_enabled check in dump_exec_info vl: fix breakage of -tb-size nbd: Implement NBD_INFO_BLOCK_SIZE on client nbd: Implement NBD_INFO_BLOCK_SIZE on server nbd: Implement NBD_OPT_GO on client nbd: Implement NBD_OPT_GO on server ... Signed-off-by: Peter Maydell <peter.maydell@linaro.org>	2017-07-14 12:16:09 +01:00
Alexey Kardashevskiy	1221a47467	memory/iommu: introduce IOMMUMemoryRegionClass This finishes QOM'fication of IOMMUMemoryRegion by introducing a IOMMUMemoryRegionClass. This also provides a fastpath analog for IOMMU_MEMORY_REGION_GET_CLASS(). This makes IOMMUMemoryRegion an abstract class. Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru> Message-Id: <20170711035620.4232-3-aik@ozlabs.ru> Acked-by: Cornelia Huck <cohuck@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2017-07-14 12:04:41 +02:00
Alexey Kardashevskiy	3df9d74806	memory/iommu: QOM'fy IOMMU MemoryRegion This defines new QOM object - IOMMUMemoryRegion - with MemoryRegion as a parent. This moves IOMMU-related fields from MR to IOMMU MR. However to avoid dymanic QOM casting in fast path (address_space_translate, etc), this adds an @is_iommu boolean flag to MR and provides new helper to do simple cast to IOMMU MR - memory_region_get_iommu. The flag is set in the instance init callback. This defines memory_region_is_iommu as memory_region_get_iommu()!=NULL. This switches MemoryRegion to IOMMUMemoryRegion in most places except the ones where MemoryRegion may be an alias. Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru> Reviewed-by: David Gibson <david@gibson.dropbear.id.au> Message-Id: <20170711035620.4232-2-aik@ozlabs.ru> Acked-by: Cornelia Huck <cohuck@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2017-07-14 12:04:41 +02:00
Pranith Kumar	cb58a6d361	mttcg/i386: Patch instruction using async_safe_* framework In mttcg, calling pause_all_vcpus() during execution from the generated TBs causes a deadlock if some vCPU is waiting for exclusive execution in start_exclusive(). Fix this by using the aync_safe_* framework instead of pausing vcpus for patching instructions. CC: Paolo Bonzini <pbonzini@redhat.com> CC: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Richard Henderson <rth@twiddle.net> Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Signed-off-by: Pranith Kumar <bobby.prani@gmail.com> Message-Id: <20170712215143.19594-2-bobby.prani@gmail.com> [Get rid completely of the TCG-specific code. - Paolo] Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2017-07-14 12:04:35 +02:00
Alistair Francis	88f83f3539	Convert error_report_err() to warn_report_err() Convert all uses of error_report_err("Warning:"... to use warn_report_err() instead. This helps standardise on a single method of printing warnings to the user. Signed-off-by: Alistair Francis <alistair.francis@xilinx.com> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Richard Henderson <rth@twiddle.net> Cc: Eduardo Habkost <ehabkost@redhat.com> Cc: "Michael S. Tsirkin" <mst@redhat.com> Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org> Reviewed-by: Eduardo Habkost <ehabkost@redhat.com> Reviewed-by: Markus Armbruster <armbru@redhat.com> Message-Id: <d8e088757186955f40f04ec4f4be7f640d3c8660.1499866456.git.alistair.francis@xilinx.com> Signed-off-by: Markus Armbruster <armbru@redhat.com>	2017-07-13 13:50:24 +02:00
Alistair Francis	3dc6f86936	Convert error_report() to warn_report() Convert all uses of error_report("warning:"... to use warn_report() instead. This helps standardise on a single method of printing warnings to the user. All of the warnings were changed using these two commands: find ./* -type f -exec sed -i \ 's\|error_report(".*warning[,:] \|warn_report("\|Ig' {} + Indentation fixed up manually afterwards. The test-qdev-global-props test case was manually updated to ensure that this patch passes make check (as the test cases are case sensitive). Signed-off-by: Alistair Francis <alistair.francis@xilinx.com> Suggested-by: Thomas Huth <thuth@redhat.com> Cc: Jeff Cody <jcody@redhat.com> Cc: Kevin Wolf <kwolf@redhat.com> Cc: Max Reitz <mreitz@redhat.com> Cc: Ronnie Sahlberg <ronniesahlberg@gmail.com> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Peter Lieven <pl@kamp.de> Cc: Josh Durgin <jdurgin@redhat.com> Cc: "Richard W.M. Jones" <rjones@redhat.com> Cc: Markus Armbruster <armbru@redhat.com> Cc: Peter Crosthwaite <crosthwaite.peter@gmail.com> Cc: Richard Henderson <rth@twiddle.net> Cc: "Aneesh Kumar K.V" <aneesh.kumar@linux.vnet.ibm.com> Cc: Greg Kurz <groug@kaod.org> Cc: Rob Herring <robh@kernel.org> Cc: Peter Maydell <peter.maydell@linaro.org> Cc: Peter Chubb <peter.chubb@nicta.com.au> Cc: Eduardo Habkost <ehabkost@redhat.com> Cc: Marcel Apfelbaum <marcel@redhat.com> Cc: "Michael S. Tsirkin" <mst@redhat.com> Cc: Igor Mammedov <imammedo@redhat.com> Cc: David Gibson <david@gibson.dropbear.id.au> Cc: Alexander Graf <agraf@suse.de> Cc: Gerd Hoffmann <kraxel@redhat.com> Cc: Jason Wang <jasowang@redhat.com> Cc: Marcelo Tosatti <mtosatti@redhat.com> Cc: Christian Borntraeger <borntraeger@de.ibm.com> Cc: Cornelia Huck <cohuck@redhat.com> Cc: Stefan Hajnoczi <stefanha@redhat.com> Acked-by: David Gibson <david@gibson.dropbear.id.au> Acked-by: Greg Kurz <groug@kaod.org> Acked-by: Cornelia Huck <cohuck@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Reviewed by: Peter Chubb <peter.chubb@data61.csiro.au> Acked-by: Max Reitz <mreitz@redhat.com> Acked-by: Marcel Apfelbaum <marcel@redhat.com> Message-Id: <e1cfa2cd47087c248dd24caca9c33d9af0c499b0.1499866456.git.alistair.francis@xilinx.com> Reviewed-by: Markus Armbruster <armbru@redhat.com> Signed-off-by: Markus Armbruster <armbru@redhat.com>	2017-07-13 13:49:58 +02:00
Ross Lagerwall	6c808651e3	xen-platform: Cleanup network infrastructure when emulated NICs are unplugged When the guest unplugs the emulated NICs, cleanup the peer for each NIC as it is not needed anymore. Most importantly, this allows the tap interfaces which QEMU holds open to be closed and removed. Signed-off-by: Ross Lagerwall <ross.lagerwall@citrix.com> Acked-by: Anthony PERARD <anthony.perard@citrix.com> Signed-off-by: Stefano Stabellini <sstabellini@kernel.org>	2017-07-07 11:11:12 -07:00
Paolo Bonzini	24d90a3cfd	vapic: use tcg_enabled Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2017-07-04 16:01:16 +02:00
Thomas Huth	2099935dbf	Move CONFIG_KVM related definitions to kvm_i386.h pc.h and sysemu/kvm.h are also included from common code (where CONFIG_KVM is not available), so the #defines that depend on CONFIG_KVM should not be declared here to avoid that anybody is using them in a wrong way. Since we're also going to poison CONFIG_KVM for common code, let's move them to kvm_i386.h instead. Most of the dummy definitions from sysemu/kvm.h are also unused since the code that uses them is only compiled for CONFIG_KVM (e.g. target/i386/kvm.c), so the unused defines are also simply dropped here instead of being moved. Signed-off-by: Thomas Huth <thuth@redhat.com> Message-Id: <1498454578-18709-3-git-send-email-thuth@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2017-07-04 14:30:03 +02:00
Peter Xu	552a1e01a4	intel_iommu: fix migration breakage on mr switch Migration is broken after the vfio integration work: qemu-kvm: AHCI: Failed to start FIS receive engine: bad FIS receive buffer address qemu-kvm: Failed to load ich9_ahci:ahci qemu-kvm: error while loading state for instance 0x0 of device '0000:00:1f.2/ich9_ahci' qemu-kvm: load of migration failed: Operation not permitted The problem is that vfio work introduced dynamic memory region switching (actually it is also used for future PT mode), and this memory region layout is not properly delivered to destination when migration happens. Solution is to rebuild the layout in post_load. Bug: https://bugzilla.redhat.com/show_bug.cgi?id=1459906 Fixes: `558e0024` ("intel_iommu: allow dynamic switch of IOMMU region") Reviewed-by: Jason Wang <jasowang@redhat.com> Signed-off-by: Peter Xu <peterx@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>	2017-07-03 22:29:49 +03:00
Aleksandr Bezzubikov	4d7e7f2702	hw/acpi: remove dead acpi code Signed-off-by: Aleksandr Bezzubikov <zuban32s@gmail.com> Reviewed-by: Igor Mammedov <imammedo@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>	2017-07-03 22:29:49 +03:00
Mao Zhongyi	c0e9067902	i386/kvm/pci-assign: Use errp directly rather than local_err In assigned_device_pci_cap_init(), first, error messages are filled to a local_err variable, then through error_propagate() pass to the parameter of errp. It leads to cumbersome code. In order to avoid the extra local_err and error_propagate(), drop it and use errp instead. Cc: pbonzini@redhat.com Cc: rth@twiddle.net Cc: ehabkost@redhat.com Cc: mst@redhat.com Cc: armbru@redhat.com Cc: marcel@redhat.com Signed-off-by: Mao Zhongyi <maozy.fnst@cn.fujitsu.com> Reviewed-by: Marcel Apfelbaum <marcel@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>	2017-07-03 22:29:49 +03:00
Mao Zhongyi	6b728b3116	i386/kvm/pci-assign: Fix return type of verify_irqchip_kernel() When the function no success value to transmit, it usually make the function return void. It has turned out not to be a success, because it means that the extra local_err variable and error_propagate() will be needed. It leads to cumbersome code, therefore, transmit success/ failure in the return value is worth. So fix the return type to avoid it. Cc: pbonzini@redhat.com Cc: rth@twiddle.net Cc: ehabkost@redhat.com Cc: mst@redhat.com Cc: armbru@redhat.com Cc: marcel@redhat.com Signed-off-by: Mao Zhongyi <maozy.fnst@cn.fujitsu.com> Reviewed-by: Marcel Apfelbaum <marcel@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>	2017-07-03 22:29:49 +03:00
Mao Zhongyi	2784127857	pci: Replace pci_add_capability2() with pci_add_capability() After the patch 'Make errp the last parameter of pci_add_capability()', pci_add_capability() and pci_add_capability2() now do exactly the same. So drop the wrapper pci_add_capability() of pci_add_capability2(), then replace the pci_add_capability2() with pci_add_capability() everywhere. Cc: pbonzini@redhat.com Cc: rth@twiddle.net Cc: ehabkost@redhat.com Cc: mst@redhat.com Cc: dmitry@daynix.com Cc: jasowang@redhat.com Cc: marcel@redhat.com Cc: alex.williamson@redhat.com Cc: armbru@redhat.com Suggested-by: Eduardo Habkost <ehabkost@redhat.com> Signed-off-by: Mao Zhongyi <maozy.fnst@cn.fujitsu.com> Reviewed-by: Marcel Apfelbaum <marcel@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>	2017-07-03 22:29:49 +03:00
Mao Zhongyi	9a7c2a5970	pci: Make errp the last parameter of pci_add_capability() Add Error argument for pci_add_capability() to leverage the errp to pass info on errors. This way is helpful for its callers to make a better error handling when moving to 'realize'. Cc: pbonzini@redhat.com Cc: rth@twiddle.net Cc: ehabkost@redhat.com Cc: mst@redhat.com Cc: jasowang@redhat.com Cc: marcel@redhat.com Cc: alex.williamson@redhat.com Cc: armbru@redhat.com Signed-off-by: Mao Zhongyi <maozy.fnst@cn.fujitsu.com> Reviewed-by: Marcel Apfelbaum <marcel@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>	2017-07-03 22:29:49 +03:00
Ladi Prosek	8991c460be	intel_iommu: relax iq tail check on VTD_GCMD_QIE enable The VT-d spec (section 6.5.2) prescribes software to zero the Invalidation Queue Tail Register before enabling the VTD_GCMD_QIE Global Command Register bit. Windows Server 2012 R2 and possibly other older Windows versions violate the protocol and set a non-zero queue tail first, which in effect makes them crash early on boot with -device intel-iommu,intremap=on. This commit relaxes the check and instead of failing to enable VTD_GCMD_QIE with vtd_err_qi_enable, it behaves as if the tail register was set just after enabling VTD_GCMD_QIE (see vtd_handle_iqt_write). Signed-off-by: Ladi Prosek <lprosek@redhat.com> Reviewed-by: Peter Xu <peterx@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>	2017-07-03 22:29:48 +03:00
Peter Xu	15c3850325	migration: move skip_section_footers Move it into MigrationState, revert its meaning and renaming it to send_section_footer, with a property bound to it. Same trick is played like previous patches. Removing savevm_skip_section_footers(). Reviewed-by: Juan Quintela <quintela@redhat.com> Signed-off-by: Peter Xu <peterx@redhat.com> Message-Id: <1498536619-14548-9-git-send-email-peterx@redhat.com> Signed-off-by: Juan Quintela <quintela@redhat.com>	2017-06-28 11:18:39 +02:00
Peter Xu	71dd4c1a56	migration: move skip_configuration out It was in SaveState but now moved to MigrationState altogether, reverted its meaning, then renamed to "send_configuration". Again, using HW_COMPAT_2_3 for old PC/SPAPR machines, and accel_register_prop() for xen_init(). Removing savevm_skip_configuration(). Reviewed-by: Juan Quintela <quintela@redhat.com> Signed-off-by: Peter Xu <peterx@redhat.com> Message-Id: <1498536619-14548-8-git-send-email-peterx@redhat.com> Signed-off-by: Juan Quintela <quintela@redhat.com>	2017-06-28 11:18:38 +02:00
Peter Xu	5272298c48	migration: move global_state.optional out Put it into MigrationState then we can use the properties to specify whether to enable storing global state. Removing global_state_set_optional() since now we can use HW_COMPAT_2_3 for x86/power, and AccelClass.global_props for Xen. Reviewed-by: Juan Quintela <quintela@redhat.com> Signed-off-by: Peter Xu <peterx@redhat.com> Message-Id: <1498536619-14548-6-git-send-email-peterx@redhat.com> Signed-off-by: Juan Quintela <quintela@redhat.com>	2017-06-28 11:18:38 +02:00
Peter Maydell	84e3d0725b	QAPI patches for 2017-06-09 -----BEGIN PGP SIGNATURE----- iQIcBAABAgAGBQJZSRWrAAoJEDhwtADrkYZTVOQP/RK8br2A1Cn7LeVG6jnKz5hJ OqyII77x8I2RachWvnwxQeHPMEVDuz3y2WjL+80s6peXlpR6y/13w7oX0f6aiJuo +T9khTqMv2I7HsM5UCsXAJpFPHT7r90b4x8nstY80YLGe7lA7L6yk6PGyCxHThwA mOiTKDw6/Xb/yZGrS2Favrun7juNpAs0Ec1IAkaA8xsEgVkd6tDv281rmHqvibl/ //90VfJp3nHFZ12FCQ1HzA42Eigtmo/fIk9LnAzBoYG0zw0cnzjuv0BNzs/JwuUZ /VskeD1cViQ4yzFnPpjOavjYjTN854/JTJzm7gZ7dTQ6/l3ykoY6NDE8p1BLuHlC p2RKkg20EeZlpOEtMQ4g6iyG6EUxaKcEiXmQ31LqN/LJwxTYbo5B5nCHMjrt4gxe MqFBJQSNsJ7QjZ7Qa7pADMCi/G0m7/0dN8vBqSr4vcbLVvdbw/yb/9s33wXGrUj1 PyXM2ymi+vvSqcXtNXKshsJLxJSJxO1tm2tRIANDTabQ00yxs8dOYnQnbQFR94fp 6nrE2PnjZqgqk69aNDJEbngj6Tgx44nyTr1+Q17juZf9nTCE5QmBE1J0IRoykCJn E8+T63ZxtIxVV2yLi5xBjmZaZtPyJRGGeUXunA10SuWrHzupEcBuhFhFYd2MFM5L fsojALN2K3Gdx2+CmAo2 =O9Vv -----END PGP SIGNATURE----- Merge remote-tracking branch 'remotes/armbru/tags/pull-qapi-2017-06-09-v2' into staging QAPI patches for 2017-06-09 # gpg: Signature made Tue 20 Jun 2017 13:31:39 BST # gpg: using RSA key 0x3870B400EB918653 # gpg: Good signature from "Markus Armbruster <armbru@redhat.com>" # gpg: aka "Markus Armbruster <armbru@pond.sub.org>" # Primary key fingerprint: 354B C8B3 D7EB 2A6B 6867 4E5F 3870 B400 EB91 8653 * remotes/armbru/tags/pull-qapi-2017-06-09-v2: (41 commits) tests/qdict: check more get_try_int() cases console: use get_uint() for "head" property i386/cpu: use get_uint() for "min-level"/"min-xlevel" properties numa: use get_uint() for "size" property pnv-core: use get_uint() for "core-pir" property pvpanic: use get_uint() for "ioport" property auxbus: use get_uint() for "addr" property arm: use get_uint() for "mp-affinity" property xen: use get_uint() for "max-ram-below-4g" property pc: use get_uint() for "hpet-intcap" property pc: use get_uint() for "apic-id" property pc: use get_uint() for "iobase" property acpi: use get_uint() for "pci-hole" properties acpi: use get_uint() for various acpi properties acpi: use get_uint() for "acpi-pcihp-io" properties platform-bus: use get_uint() for "addr" property bcm2835_fb: use {get, set}_uint() for "vcram-size" and "vcram-base" aspeed: use {set, get}_uint() for "ram-size" property pcihp: use get_uint() for "bsel" property pc-dimm: make "size" property uint64 ... Signed-off-by: Peter Maydell <peter.maydell@linaro.org>	2017-06-22 11:34:39 +01:00
Marc-André Lureau	4ccd89d294	xen: use get_uint() for "max-ram-below-4g" property TYPE_PC_MACHINE's property PC_MACHINE_MAX_RAM_BELOW_4G's getter and setter pc_machine_get_max_ram_below_4g() and pc_machine_set_max_ram_below_4g() use visit_type_size() Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com> Message-Id: <20170607163635.17635-34-marcandre.lureau@redhat.com> Signed-off-by: Markus Armbruster <armbru@redhat.com>	2017-06-20 14:31:33 +02:00
Marc-André Lureau	5d7fb0f254	pc: use get_uint() for "hpet-intcap" property TYPE_HPET's property HPET_INTCAP is defined with DEFINE_PROP_UINT32(). Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com> Message-Id: <20170607163635.17635-33-marcandre.lureau@redhat.com> Signed-off-by: Markus Armbruster <armbru@redhat.com>	2017-06-20 14:31:33 +02:00
Marc-André Lureau	c7b4efb4a0	pc: use get_uint() for "apic-id" property TYPE_X86_CPU's property "apic-id" is defined with DEFINE_PROP_UINT32(). Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com> Message-Id: <20170607163635.17635-32-marcandre.lureau@redhat.com> Signed-off-by: Markus Armbruster <armbru@redhat.com>	2017-06-20 14:31:33 +02:00
Marc-André Lureau	1ea1572adf	pc: use get_uint() for "iobase" property TYPE_ISA_FDC's property "iobase" is defined with DEFINE_PROP_UINT32(). Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com> Message-Id: <20170607163635.17635-31-marcandre.lureau@redhat.com> Signed-off-by: Markus Armbruster <armbru@redhat.com>	2017-06-20 14:31:33 +02:00
Marc-André Lureau	605553654f	acpi: use get_uint() for "pci-hole" properties Those properties use visit_type_uint() Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com> Message-Id: <20170607163635.17635-30-marcandre.lureau@redhat.com> Signed-off-by: Markus Armbruster <armbru@redhat.com>	2017-06-20 14:31:33 +02:00
Marc-André Lureau	b81bdbf3c7	acpi: use get_uint() for various acpi properties PIIX4: piix4_pm_add_propeties() defines these with object_property_add_uint*_ptr(). Q35: ich9_lpc_add_properties() and ich9_pm_add_properties() define them similarly, except for ACPI_PM_PROP_GPE0_BLK(). That one's getter ich9_pm_get_gpe0_blk() uses visit_type_uint32(). Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com> Message-Id: <20170607163635.17635-29-marcandre.lureau@redhat.com> Signed-off-by: Markus Armbruster <armbru@redhat.com>	2017-06-20 14:31:33 +02:00
Marc-André Lureau	35f91e5069	acpi: use get_uint() for "acpi-pcihp-io*" properties Those are defined with object_property_add_uint16_ptr() Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com> Message-Id: <20170607163635.17635-28-marcandre.lureau@redhat.com> Signed-off-by: Markus Armbruster <armbru@redhat.com>	2017-06-20 14:31:32 +02:00
Marc-André Lureau	446de8b68a	qdev: Use appropriate getter/setters type Based on the underlying type of the data accessed, use the appropriate getters/setters: * AcpiPmInfo members s3_disabled, s4_disabled are bool, member s4_val is an uint8_t * Property ACPI_PCIHP_IO_PROP is defined with object_property_add_uint32_ptr() * Property PCIE_HOST_MCFG_SIZE is implemented with visit_type_uint64() * PCIDevice property "addr" is backed by PCIDevice member devfn, which is an int32_t Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com> Message-Id: <20170607163635.17635-20-marcandre.lureau@redhat.com> Reviewed-by: Markus Armbruster <armbru@redhat.com> [More verbose commit message] Signed-off-by: Markus Armbruster <armbru@redhat.com>	2017-06-20 14:31:32 +02:00
Marc-André Lureau	5923f85fb8	qapi: update the qobject visitor to use QNUM_U64 Switch to use QNum/uint where appropriate to remove i64 limitation. The input visitor will cast i64 input to u64 for compatibility reasons (existing json QMP client already use negative i64 for large u64, and expect an implicit cast in qemu). Note: before the patch, uint64_t values above INT64_MAX are sent over json QMP as negative values, e.g. UINT64_MAX is sent as -1. After the patch, they are sent unmodified. Clearly a bug fix, but we have to consider compatibility issues anyway. libvirt should cope fine, because its parsing of unsigned integers accepts negative values modulo 2^64. There's hope that other clients will, too. Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com> Reviewed-by: Markus Armbruster <armbru@redhat.com> Message-Id: <20170607163635.17635-12-marcandre.lureau@redhat.com> [check_native_list() tweaked for consistency with signed case] Signed-off-by: Markus Armbruster <armbru@redhat.com>	2017-06-20 14:31:31 +02:00
Marc-André Lureau	01b2ffcedd	qapi: merge QInt and QFloat in QNum We would like to use a same QObject type to represent numbers, whether they are int, uint, or floats. Getters will allow some compatibility between the various types if the number fits other representations. Add a few more tests while at it. Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com> Message-Id: <20170607163635.17635-7-marcandre.lureau@redhat.com> Reviewed-by: Markus Armbruster <armbru@redhat.com> [parse_stats_intervals() simplified a bit, comment in test_visitor_in_int_overflow() tidied up, suppress bogus warnings] Signed-off-by: Markus Armbruster <armbru@redhat.com>	2017-06-20 14:31:31 +02:00
Stefan Hajnoczi	7f3cf2d6e7	hw/i386: fix nvdimm check error path Commit `e987c37aee` ("hw/i386: check if nvdimm is enabled before plugging") introduced a check to reject nvdimm hotplug if -machine pc,nvdimm=on was not given. This check executes after pc_dimm_memory_plug() has already completed and does not reverse the effect of this function in the case of failure. Perform the check before calling pc_dimm_memory_plug(). This fixes the following abort: $ qemu -M accel=kvm -m 1G,slots=4,maxmem=8G \ -object memory-backend-file,id=mem1,share=on,mem-path=nvdimm.dat,size=1G (qemu) device_add nvdimm,memdev=mem1 nvdimm is not enabled: missing 'nvdimm' in '-M' (qemu) device_add nvdimm,memdev=mem1 Core dumped The backtrace is: #0 0x00007fffdb5b191f in raise () at /lib64/libc.so.6 #1 0x00007fffdb5b351a in abort () at /lib64/libc.so.6 #2 0x00007fffdb5a9da7 in __assert_fail_base () at /lib64/libc.so.6 #3 0x00007fffdb5a9e52 in () at /lib64/libc.so.6 #4 0x000055555577a5fa in qemu_ram_set_idstr (new_block=0x555556747a00, name=<optimized out>, dev=dev@entry=0x555556705590) at qemu/exec.c:1709 #5 0x0000555555a0fe86 in vmstate_register_ram (mr=mr@entry=0x55555673a0e0, dev=dev@entry=0x555556705590) at migration/savevm.c:2293 #6 0x0000555555965088 in pc_dimm_memory_plug (dev=dev@entry=0x555556705590, hpms=hpms@entry=0x5555566bb0e0, mr=mr@entry=0x555556705630, align=<optimized out>, errp=errp@entry=0x7fffffffc660) at hw/mem/pc-dimm.c:110 #7 0x000055555581d89b in pc_dimm_plug (errp=0x7fffffffc6c0, dev=0x555556705590, hotplug_dev=<optimized out>) at qemu/hw/i386/pc.c:1713 #8 0x000055555581d89b in pc_machine_device_plug_cb (hotplug_dev=<optimized out>, dev=0x555556705590, errp=0x7fffffffc6c0) at qemu/hw/i386/pc.c:2004 #9 0x0000555555914da6 in device_set_realized (obj=<optimized out>, value=<optimized out>, errp=0x7fffffffc7e8) at hw/core/qdev.c:926 Cc: Haozhong Zhang <haozhong.zhang@intel.com> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Reviewed-by: Eduardo Habkost <ehabkost@redhat.com> Reviewed-by: Haozhong Zhang <haozhong.zhang@intel.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>	2017-06-16 18:44:56 +03:00
Peter Xu	e7a3b91fdf	intel_iommu: cleanup vtd_interrupt_remap_msi() Move the memcpy upper into where needed, then share the trace so that we trace every correct remapping. Signed-off-by: Peter Xu <peterx@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>	2017-06-16 18:44:56 +03:00
Peter Xu	b9313021f3	intel_iommu: cleanup vtd_{do_}iommu_translate() First, let vtd_do_iommu_translate() return a status, so that we explicitly knows whether error occured. Meanwhile, we make sure that IOMMUTLBEntry is filled in in that. Then, cleanup vtd_iommu_translate a bit. So even with PT we'll get a log now. Also, remove useless assignments. Signed-off-by: Peter Xu <peterx@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>	2017-06-16 18:44:55 +03:00
Peter Xu	7feb51b709	intel_iommu: switching the rest DPRINTF to trace We have converted many of the DPRINTF() into traces. This patch does the last 100+ ones. To debug VT-d when error happens, let's try enable: -trace enable="vtd_err*" This should works just like the old GENERAL but of course better, since we don't need to recompile. Similar rules apply to the other modules. I was trying to make the prefix good enough for sub-module debugging. Signed-off-by: Peter Xu <peterx@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>	2017-06-16 18:44:55 +03:00
Juan Quintela	c4b63b7cc5	migration: Move remaining exported functions to migration/misc.h Signed-off-by: Juan Quintela <quintela@redhat.com> Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org> Reviewed-by: Peter Xu <peterx@redhat.com>	2017-06-13 11:00:45 +02:00
Juan Quintela	84a899de8c	migration: create global_state.c It don't belong anywhere else, just the global state where everybody can stick other things. Signed-off-by: Juan Quintela <quintela@redhat.com> Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org> Reviewed-by: Laurent Vivier <lvivier@redhat.com>	2017-06-13 11:00:45 +02:00
Denis Plotnikov	e2b6c1712e	kvmclock: update system_time_msr address forcibly Do an update of system_time_msr address every time before reading the value of tsc_timestamp from guest's kvmclock page. There is no other code paths which ensure that qemu has an up-to-date value of system_time_msr. So, force this update on guest's tsc_timestamp reading. This bug causes effect on those nested setups which turn off TPR access interception for L2 guests and that access being intercepted by L0 doesn't show up in L1. Linux bootstrap initiate kvmclock before APIC initializing causing TPR access. That's why on L1 guests, having TPR interception turned on for L2, the effect of the bug is not revealed. This patch fixes this problem by making sure it knows the correct system_time_msr address every time it is needed. Signed-off-by: Denis Plotnikov <dplotnikov@virtuozzo.com> Message-Id: <1496054944-25623-1-git-send-email-dplotnikov@virtuozzo.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2017-06-07 18:22:02 +02:00
Igor Mammedov	d41f3e750d	numa: make sure that all cpus have has_node_id set if numa is enabled It fixes/add missing _PXM object for non mapped CPU (x86) and missing fdt node (virt-arm). It ensures that possible_cpus contains complete mapping if numa is enabled by the time machine_init() is executed. As result non completely mapped CPUs: 1) appear in ACPI/fdt blobs 2) QMP query-hotpluggable-cpus command shows bound nodes for such CPUs 3) allows to drop checks for has_node_id in numa only code, reducing number of invariants incomplete mapping could produce 4) moves fixup/implicit node init from runtime numa_cpu_pre_plug() (when CPU object is created) to machine_numa_finish_init() which helps to fix [1, 2] and make possible_cpus complete source of numa mapping available even before CPUs are created. Signed-off-by: Igor Mammedov <imammedo@redhat.com> Message-Id: <1496161442-96665-4-git-send-email-imammedo@redhat.com> Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>	2017-06-05 14:59:08 -03:00
Igor Mammedov	a0ceb640d0	numa: consolidate cpu_preplug fixups/checks for pc/arm/spapr Signed-off-by: Igor Mammedov <imammedo@redhat.com> Reviewed-by: David Gibson <david@gibson.dropbear.id.au> Message-Id: <1496161442-96665-2-git-send-email-imammedo@redhat.com> [ehabkost: Fix indentation] Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>	2017-06-05 14:59:08 -03:00
Marc-André Lureau	f664b88247	Remove/replace sysemu/char.h inclusion Those are apparently unnecessary includes. Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com> Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>	2017-06-02 11:33:52 +04:00
Stefan Hajnoczi	a3203e7dd3	pci, virtio, vhost: fixes A bunch of fixes all over the place. Most notably this fixes the new MTU feature when using vhost. Signed-off-by: Michael S. Tsirkin <mst@redhat.com> -----BEGIN PGP SIGNATURE----- iQEcBAABAgAGBQJZK2bwAAoJECgfDbjSjVRpNBgIALmNG7VaixhNUlnfX1n1JBnh +HBP2zNfvi0q5roBuPFmlziKa3IBHb2Fcte4nb6QxmPg+uoaj39AOzfrrvz210kR h2j5Qk2bCdMeWBpxI+xDDScwi/Im23Y6KN1eZyMekFr2CaSGiqOHZPPdbsyEcHPB VylM0uHqSTZL5JAAzEuYlH+LLfPu91HoxMsIAdNuQX+qKyM2DZ4eICBQ0zA73USt OduZltcRMk7UpvQMqY+2iaEXapXQQEUGrP2Mo8ZyqeIl2ItC33GspqBQIKjuZdrr tpr/T1VWsLdZnURZXyELrFqrErDXvKaP9HROwvyLyYPXZF+pJ3LA7TopS5UmfNQ= =Z4xG -----END PGP SIGNATURE----- Merge remote-tracking branch 'mst/tags/for_upstream' into staging pci, virtio, vhost: fixes A bunch of fixes all over the place. Most notably this fixes the new MTU feature when using vhost. Signed-off-by: Michael S. Tsirkin <mst@redhat.com> # gpg: Signature made Mon 29 May 2017 01:10:24 AM BST # gpg: using RSA key 0x281F0DB8D28D5469 # gpg: Good signature from "Michael S. Tsirkin <mst@kernel.org>" # gpg: aka "Michael S. Tsirkin <mst@redhat.com>" # Primary key fingerprint: 0270 606B 6F3C DF3D 0B17 0970 C350 3912 AFBE 8E67 # Subkey fingerprint: 5D09 FD08 71C8 F85B 94CA 8A0D 281F 0DB8 D28D 5469 * mst/tags/for_upstream: acpi-test: update expected files pc: ACPI BIOS: use highest NUMA node for hotplug mem hole SRAT entry vhost-user: pass message as a pointer to process_message_reply() virtio_net: Bypass backends for MTU feature negotiation intel_iommu: turn off pt before 2.9 intel_iommu: support passthrough (PT) intel_iommu: allow dev-iotlb context entry conditionally intel_iommu: use IOMMU_ACCESS_FLAG() intel_iommu: provide vtd_ce_get_type() intel_iommu: renaming context entry helpers x86-iommu: use DeviceClass properties memory: remove the last param in memory_region_iommu_replay() memory: tune last param of iommu_ops.translate() Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>	2017-05-30 14:15:04 +01:00
Stefan Hajnoczi	d0eda02938	QAPI patches for 2017-05-23 -----BEGIN PGP SIGNATURE----- iQIcBAABAgAGBQJZJB4MAAoJEDhwtADrkYZTjhkP+wRaiZj9h4IJvcWoNEzfyuA1 kd7+Kx6QgfCmZE9vL2/mlOFddWL0fPtPffL/ZRu5UNgIILaCSPFsGkOGvXLZhaUW he5sqLCqMc2mxgB98HpbT0dzt0cOSCjdM5BxkFXeq/yPoDa0IiZiD8cpvj+FVwKi D0qGdrKKGCR3RteL4gr/kaXY/LXAZfuEjbAtylQx1aMHJ6CKmdSIVVVU2JJVIYhQ +dT/Xst0PSkJYk90wgmwpzPCqKR/N5zHFe8CyUoE67FxBhegdw19O3wlzU9DJ3N5 8Az+fbEjifWoMytTZR4H3snPJGwl6wxsh2UVj9SMCvebc0y278UPlGqiszvWBepa 1iZHHULH+yygHyUmX6CxjHOUW498ES2KGHx7qJJe8ebeJ4XuU7JcE+Sf4GQEAm8Q p6P5s3qXpuVjekCjmerUAtybr+hxEQC9fbAGqPq+r489jwjvUiETrMLbmEHyy/Xa fSUaW+f5kGI0GJS9FYcbcMy9w2130lTK2k4bZM0mSVlSsHA7W0GBDnzxUDtxo6uH oqMQgKIFWOBU5GkRUiL43vpiTIpiLCuG6PbQlgefQRPWdoODVxykuu2bq5hVaax8 8XMkkq7isG/J5esFc55L1qEUyrUDtVYx/LiHj0XXJikkGirXtp7b7l/TmFLZGsex UWWzFRbZnCVf2CKwdV6h =DNqn -----END PGP SIGNATURE----- Merge remote-tracking branch 'armbru/tags/pull-qapi-2017-05-23' into staging QAPI patches for 2017-05-23 # gpg: Signature made Tue 23 May 2017 12:33:32 PM BST # gpg: using RSA key 0x3870B400EB918653 # gpg: Good signature from "Markus Armbruster <armbru@redhat.com>" # gpg: aka "Markus Armbruster <armbru@pond.sub.org>" # Primary key fingerprint: 354B C8B3 D7EB 2A6B 6867 4E5F 3870 B400 EB91 8653 * armbru/tags/pull-qapi-2017-05-23: qapi-schema: Remove obsolete note from ObjectTypeInfo block: Use QDict helpers for --force-share shutdown: Expose bool cause in SHUTDOWN and RESET events shutdown: Add source information to SHUTDOWN and RESET shutdown: Preserve shutdown cause through replay shutdown: Prepare for use of an enum in reset/shutdown_request shutdown: Simplify shutdown_signal sockets: Plug memory leak in socket_address_flatten() scripts/qmp/qom-set: fix the value argument passed to srv.command() Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>	2017-05-30 09:33:40 +01:00
Ladi Prosek	ede24a0264	pc: ACPI BIOS: use highest NUMA node for hotplug mem hole SRAT entry For reasons unknown, Windows won't online all memory, both at command line and hot-plugged later, unless the hotplug mem hole SRAT entry specifies a node greater than or equal to the ones where memory is added. Using the highest node on the machine makes recent versions of Windows happy. With this example command line: ... \ -m 1024,slots=4,maxmem=32G \ -numa node,nodeid=0 \ -numa node,nodeid=1 \ -numa node,nodeid=2 \ -numa node,nodeid=3 \ -object memory-backend-ram,size=1G,id=mem-mem1 \ -device pc-dimm,id=dimm-mem1,memdev=mem-mem1,node=1 Windows reports a total of 1G of RAM without this commit and the expected 2G with this commit. Signed-off-by: Ladi Prosek <lprosek@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Reviewed-by: Igor Mammedov <imammedo@redhat.com> Acked-by: Laszlo Ersek <lersek@redhat.com>	2017-05-29 03:07:57 +03:00
Peter Xu	dbaabb25f4	intel_iommu: support passthrough (PT) Hardware support for VT-d device passthrough. Although current Linux can live with iommu=pt even without this, but this is faster than when using software passthrough. Signed-off-by: Peter Xu <peterx@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Reviewed-by: Liu, Yi L <yi.l.liu@linux.intel.com> Reviewed-by: Jason Wang <jasowang@redhat.com>	2017-05-25 21:25:27 +03:00
Peter Xu	f80c98740e	intel_iommu: allow dev-iotlb context entry conditionally When device-iotlb is not specified, we should fail this check. A new function vtd_ce_type_check() is introduced. While I'm at it, clean up the vtd_dev_to_context_entry() a bit - replace many "else if" usage into direct if check. That'll make the logic more clear. Signed-off-by: Peter Xu <peterx@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Reviewed-by: Jason Wang <jasowang@redhat.com>	2017-05-25 21:25:27 +03:00
Peter Xu	5a38cb5940	intel_iommu: use IOMMU_ACCESS_FLAG() We have that now, so why not use it. Signed-off-by: Peter Xu <peterx@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Reviewed-by: Jason Wang <jasowang@redhat.com>	2017-05-25 21:25:27 +03:00
Peter Xu	127ff5c356	intel_iommu: provide vtd_ce_get_type() Helper to fetch VT-d context entry type. Signed-off-by: Peter Xu <peterx@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Reviewed-by: Jason Wang <jasowang@redhat.com>	2017-05-25 21:25:27 +03:00
Peter Xu	8f7d7161dd	intel_iommu: renaming context entry helpers The old names are too long and less ordered. Let's start to use vtd_ce_*() as a pattern. Signed-off-by: Peter Xu <peterx@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Reviewed-by: Jason Wang <jasowang@redhat.com>	2017-05-25 21:25:27 +03:00
Peter Xu	0b77d30a43	x86-iommu: use DeviceClass properties No reason to keep tens of lines if we can do it actually far shorter. Signed-off-by: Peter Xu <peterx@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Reviewed-by: Jason Wang <jasowang@redhat.com>	2017-05-25 21:25:27 +03:00
Peter Xu	bf55b7afce	memory: tune last param of iommu_ops.translate() This patch converts the old "is_write" bool into IOMMUAccessFlags. The difference is that "is_write" can only express either read/write, but sometimes what we really want is "none" here (neither read nor write). Replay is an good example - during replay, we should not check any RW permission bits since thats not an actual IO at all. CC: Paolo Bonzini <pbonzini@redhat.com> CC: David Gibson <david@gibson.dropbear.id.au> Reviewed-by: David Gibson <david@gibson.dropbear.id.au> Acked-by: David Gibson <david@gibson.dropbear.id.au> Acked-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Peter Xu <peterx@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Reviewed-by: Jason Wang <jasowang@redhat.com>	2017-05-25 21:25:27 +03:00
Eric Blake	cf83f14005	shutdown: Add source information to SHUTDOWN and RESET Time to wire up all the call sites that request a shutdown or reset to use the enum added in the previous patch. It would have been less churn to keep the common case with no arguments as meaning guest-triggered, and only modified the host-triggered code paths, via a wrapper function, but then we'd still have to audit that I didn't miss any host-triggered spots; changing the signature forces us to double-check that I correctly categorized all callers. Since command line options can change whether a guest reset request causes an actual reset vs. a shutdown, it's easy to also add the information to reset requests. Signed-off-by: Eric Blake <eblake@redhat.com> Acked-by: David Gibson <david@gibson.dropbear.id.au> [ppc parts] Reviewed-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk> [SPARC part] Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com> [s390x parts] Message-Id: <20170515214114.15442-5-eblake@redhat.com> Reviewed-by: Markus Armbruster <armbru@redhat.com> Signed-off-by: Markus Armbruster <armbru@redhat.com>	2017-05-23 13:28:17 +02:00
Eric Blake	aedbe19297	shutdown: Prepare for use of an enum in reset/shutdown_request We want to track why a guest was shutdown; in particular, being able to tell the difference between a guest request (such as ACPI request) and host request (such as SIGINT) will prove useful to libvirt. Since all requests eventually end up changing shutdown_requested in vl.c, the logical change is to make that value track the reason, rather than its current 0/1 contents. Since command-line options control whether a reset request is turned into a shutdown request instead, the same treatment is given to reset_requested. This patch adds an internal enum ShutdownCause that describes reasons that a shutdown can be requested, and changes qemu_system_reset() to pass the reason through, although for now nothing is actually changed with regards to what gets reported. The enum could be exported via QAPI at a later date, if deemed necessary, but for now, there has not been a request to expose that much detail to end clients. For the most part, we turn 0 into SHUTDOWN_CAUSE_NONE, and 1 into SHUTDOWN_CAUSE_HOST_ERROR; the only specific case where we have enough information right now to use a different value is when we are reacting to a host signal. It will take a further patch to edit all call-sites that can trigger a reset or shutdown request to properly pass in any other reasons; this patch includes TODOs to point such places out. qemu_system_reset() trades its 'bool report' parameter for a 'ShutdownCause reason', with all non-zero values having the same effect; this lets us get rid of the weird #defines for VMRESET_* as synonyms for bools. Signed-off-by: Eric Blake <eblake@redhat.com> Message-Id: <20170515214114.15442-3-eblake@redhat.com> Reviewed-by: Markus Armbruster <armbru@redhat.com> Signed-off-by: Markus Armbruster <armbru@redhat.com>	2017-05-23 13:28:17 +02:00
Juan Quintela	68ba3b0743	migration: migration.h was not needed This files don't use any function from migration.h, so drop it. Signed-off-by: Juan Quintela <quintela@redhat.com> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Reviewed-by: Peter Xu <peterx@redhat.com>	2017-05-18 19:20:59 +02:00
Stefan Hajnoczi	adb354dd1e	pci, virtio, vhost: fixes A bunch of fixes that missed the release. Most notably we are reverting shpc back to enabled by default state as guests uses that as an indicator that hotplug is supported (even though it's unused). Unfortunately we can't fix this on the stable branch since that would break migration. Signed-off-by: Michael S. Tsirkin <mst@redhat.com> -----BEGIN PGP SIGNATURE----- iQEcBAABAgAGBQJZHMOuAAoJECgfDbjSjVRp/5IH/3kOa7yV3KUi4QVbQV7WwBH3 LK+/jwIz4UhOZn+bS4qi+gjN6aFhNoBNDFmYsRTWKKdLMvZvkRBMDcv8DMIKeAyl kG/ispv8VI+GY/CRKnqzPm0FSulv8WPRryxkdGzK4oHiMv+4FpFR0v/n9NRHjwTA XNJ4k33IqBldXyZwwAzP5dT019EMvbn4bNrkLzlcF2w8mTWPf43eX/kIkRX0cAys 5IVTQVGEOwpnyV0jxJDP+aoVMrqv8xl88LLuRpTgWUo0UnxXL5/GZQOCCUN6DQ7M BOLmyyP9mT9k8iUI+fQsDxAtY7cL9torq+p985nQdH0nxmI3GCoufn9aJG0J9yc= =d34x -----END PGP SIGNATURE----- Merge remote-tracking branch 'mst/tags/for_upstream' into staging pci, virtio, vhost: fixes A bunch of fixes that missed the release. Most notably we are reverting shpc back to enabled by default state as guests uses that as an indicator that hotplug is supported (even though it's unused). Unfortunately we can't fix this on the stable branch since that would break migration. Signed-off-by: Michael S. Tsirkin <mst@redhat.com> # gpg: Signature made Wed 17 May 2017 10:42:06 PM BST # gpg: using RSA key 0x281F0DB8D28D5469 # gpg: Good signature from "Michael S. Tsirkin <mst@kernel.org>" # gpg: aka "Michael S. Tsirkin <mst@redhat.com>" # Primary key fingerprint: 0270 606B 6F3C DF3D 0B17 0970 C350 3912 AFBE 8E67 # Subkey fingerprint: 5D09 FD08 71C8 F85B 94CA 8A0D 281F 0DB8 D28D 5469 * mst/tags/for_upstream: exec: abstract address_space_do_translate() pci: deassert intx when pci device unrealize virtio: allow broken device to notify guest Revert "hw/pci: disable pci-bridge's shpc by default" acpi-defs: clean up open brace usage ACPI: don't call acpi_pcihp_device_plug_cb on xen iommu: Don't crash if machine is not PC_MACHINE pc: add 2.10 machine type pc/fwcfg: unbreak migration from qemu-2.5 and qemu-2.6 during firmware boot libvhost-user: fix crash when rings aren't ready hw/virtio: fix vhost user fails to startup when MQ hw/arm/virt: generate 64-bit addressable ACPI objects hw/acpi-defs: replace leading X with x_ in FADT field names Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>	2017-05-18 10:01:08 +01:00
Eduardo Habkost	040e99686a	kvmvapic: Remove user_creatable flag The kvmvapic device is only usable when created by apic_common_realize(), not using -device. Remove the user_creatable flag from the device class. Cc: Igor Mammedov <imammedo@redhat.com> Cc: Marcel Apfelbaum <marcel@redhat.com> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Richard Henderson <rth@twiddle.net> Cc: Eduardo Habkost <ehabkost@redhat.com> Cc: "Michael S. Tsirkin" <mst@redhat.com> Acked-by: Marcel Apfelbaum <marcel@redhat.com> Signed-off-by: Eduardo Habkost <ehabkost@redhat.com> Message-Id: <20170503203604.31462-10-ehabkost@redhat.com> Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>	2017-05-17 10:37:01 -03:00
Eduardo Habkost	6c4672cae7	ioapic: Remove user_creatable flag An ioapic device is already created by the q35 initialization code, and using "-device ioapic" or "-device kvm-ioapic" will always fail with "Only 1 ioapics allowed". Remove the user_creatable flag from the ioapic device classes. Cc: Igor Mammedov <imammedo@redhat.com> Cc: Marcel Apfelbaum <marcel@redhat.com> Cc: "Michael S. Tsirkin" <mst@redhat.com> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Richard Henderson <rth@twiddle.net> Cc: Eduardo Habkost <ehabkost@redhat.com> Acked-by: Marcel Apfelbaum <marcel@redhat.com> Signed-off-by: Eduardo Habkost <ehabkost@redhat.com> Message-Id: <20170503203604.31462-9-ehabkost@redhat.com> Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>	2017-05-17 10:37:01 -03:00
Eduardo Habkost	642c1e0546	kvmclock: Remove user_creatable flag kvmclock should be used by guests only when the appropriate CPUID feature flags are set on the VCPU, and it is automatically created by kvmclock_create() when those feature flags are set. This means creating a kvmclock device using -device is useless. Remove user_creatable from its device class. Cc: Marcel Apfelbaum <marcel@redhat.com> Cc: "Michael S. Tsirkin" <mst@redhat.com> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Richard Henderson <rth@twiddle.net> Cc: Thomas Huth <thuth@redhat.com> Acked-by: Marcel Apfelbaum <marcel@redhat.com> Reviewed-by: Thomas Huth <thuth@redhat.com> Signed-off-by: Eduardo Habkost <ehabkost@redhat.com> Message-Id: <20170503203604.31462-8-ehabkost@redhat.com> Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>	2017-05-17 10:37:01 -03:00
Eduardo Habkost	8ab5700ca0	iommu: Remove FIXME comment about user_creatable=true amd-iommu and intel-iommu are really meant to be used with -device, so they need user_creatable=true. Remove the FIXME comment. Cc: Marcel Apfelbaum <marcel@redhat.com> Cc: "Michael S. Tsirkin" <mst@redhat.com> Reviewed-by: Marcel Apfelbaum <marcel@redhat.com> Acked-by: Marcel Apfelbaum <marcel@redhat.com> Signed-off-by: Eduardo Habkost <ehabkost@redhat.com> Message-Id: <20170503203604.31462-5-ehabkost@redhat.com> Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>	2017-05-17 10:37:01 -03:00
Eduardo Habkost	e4f4fb1eca	sysbus: Set user_creatable=false by default on TYPE_SYS_BUS_DEVICE commit `33cd52b5d7` unset cannot_instantiate_with_device_add_yet in TYPE_SYSBUS, making all sysbus devices appear on "-device help" and lack the "no-user" flag in "info qdm". To fix this, we can set user_creatable=false by default on TYPE_SYS_BUS_DEVICE, but this requires setting user_creatable=true explicitly on the sysbus devices that actually work with -device. Fortunately today we have just a few has_dynamic_sysbus=1 machines: virt, pc-q35-, ppce500, and spapr. virt, ppce500, and spapr have extra checks to ensure just a few device types can be instantiated: virt supports only TYPE_VFIO_CALXEDA_XGMAC, TYPE_VFIO_AMD_XGBE. * ppce500 supports only TYPE_ETSEC_COMMON. * spapr supports only TYPE_SPAPR_PCI_HOST_BRIDGE. This patch sets user_creatable=true explicitly on those 4 device classes. Now, the more complex cases: pc-q35-: q35 has no sysbus device whitelist yet (which is a separate bug). We are in the process of fixing it and building a sysbus whitelist on q35, but in the meantime we can fix the "-device help" and "info qdm" bugs mentioned above. Also, despite not being strictly necessary for fixing the q35 bug, reducing the list of user_creatable=true devices will help us be more confident when building the q35 whitelist. xen: We also have a hack at xen_set_dynamic_sysbus(), that sets has_dynamic_sysbus=true at runtime when using the Xen accelerator. This hack is only used to allow xen-backend devices to be dynamically plugged/unplugged. This means today we can use -device with the following 22 device types, that are the ones compiled into the qemu-system-x86_64 and qemu-system-i386 binaries: allwinner-ahci * amd-iommu * cfi.pflash01 * esp * fw_cfg_io * fw_cfg_mem * generic-sdhci * hpet * intel-iommu * ioapic * isabus-bridge * kvmclock * kvm-ioapic * kvmvapic * SUNW,fdtwo * sysbus-ahci * sysbus-fdc * sysbus-ohci * unimplemented-device * virtio-mmio * xen-backend * xen-sysdev This patch adds user_creatable=true explicitly to those devices, temporarily, just to keep 100% compatibility with existing behavior of q35. Subsequent patches will remove user_creatable=true from the devices that are really not meant to user-creatable on any machine, and remove the FIXME comment from the ones that are really supposed to be user-creatable. This is being done in separate patches because we still don't have an obvious list of devices that will be whitelisted by q35, and I would like to get each device reviewed individually. Cc: Alexander Graf <agraf@suse.de> Cc: Alex Williamson <alex.williamson@redhat.com> Cc: Alistair Francis <alistair.francis@xilinx.com> Cc: Beniamino Galvani <b.galvani@gmail.com> Cc: Christian Borntraeger <borntraeger@de.ibm.com> Cc: Cornelia Huck <cornelia.huck@de.ibm.com> Cc: David Gibson <david@gibson.dropbear.id.au> Cc: "Edgar E. Iglesias" <edgar.iglesias@gmail.com> Cc: Eduardo Habkost <ehabkost@redhat.com> Cc: Frank Blaschka <frank.blaschka@de.ibm.com> Cc: Gabriel L. Somlo <somlo@cmu.edu> Cc: Gerd Hoffmann <kraxel@redhat.com> Cc: Igor Mammedov <imammedo@redhat.com> Cc: Jason Wang <jasowang@redhat.com> Cc: John Snow <jsnow@redhat.com> Cc: Juergen Gross <jgross@suse.com> Cc: Kevin Wolf <kwolf@redhat.com> Cc: Laszlo Ersek <lersek@redhat.com> Cc: Marcel Apfelbaum <marcel@redhat.com> Cc: Markus Armbruster <armbru@redhat.com> Cc: Max Reitz <mreitz@redhat.com> Cc: "Michael S. Tsirkin" <mst@redhat.com> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Peter Maydell <peter.maydell@linaro.org> Cc: Pierre Morel <pmorel@linux.vnet.ibm.com> Cc: Prasad J Pandit <pjp@fedoraproject.org> Cc: qemu-arm@nongnu.org Cc: qemu-block@nongnu.org Cc: qemu-ppc@nongnu.org Cc: Richard Henderson <rth@twiddle.net> Cc: Rob Herring <robh@kernel.org> Cc: Shannon Zhao <zhaoshenglong@huawei.com> Cc: sstabellini@kernel.org Cc: Thomas Huth <thuth@redhat.com> Cc: Yi Min Zhao <zyimin@linux.vnet.ibm.com> Acked-by: John Snow <jsnow@redhat.com> Acked-by: Juergen Gross <jgross@suse.com> Acked-by: Marcel Apfelbaum <marcel@redhat.com> Signed-off-by: Eduardo Habkost <ehabkost@redhat.com> Message-Id: <20170503203604.31462-3-ehabkost@redhat.com> Reviewed-by: Markus Armbruster <armbru@redhat.com> [ehabkost: Small changes at sysbus_device_class_init() comments] Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>	2017-05-17 10:37:01 -03:00
Eduardo Habkost	e90f2a8c3e	qdev: Replace cannot_instantiate_with_device_add_yet with !user_creatable cannot_instantiate_with_device_add_yet was introduced by commit `efec3dd631` to replace no_user. It was supposed to be a temporary measure. When it was introduced, we had 54 cannot_instantiate_with_device_add_yet=true lines in the code. Today (3 years later) this number has not shrunk: we now have 57 cannot_instantiate_with_device_add_yet=true lines. I think it is safe to say it is not a temporary measure, and we won't see the flag go away soon. Instead of a long field name that misleads people to believe it is temporary, replace it a shorter and less misleading field: user_creatable. Except for code comments, changes were generated using the following Coccinelle patch: @@ expression DC; @@ ( -DC->cannot_instantiate_with_device_add_yet = false; +DC->user_creatable = true; \| -DC->cannot_instantiate_with_device_add_yet = true; +DC->user_creatable = false; ) @@ typedef ObjectClass; expression dc; identifier class, data; @@ static void device_class_init(ObjectClass class, void data) { ... dc->hotpluggable = true; +dc->user_creatable = true; ... } @@ @@ struct DeviceClass { ... -bool cannot_instantiate_with_device_add_yet; +bool user_creatable; ... } @@ expression DC; @@ ( -!DC->cannot_instantiate_with_device_add_yet +DC->user_creatable \| -DC->cannot_instantiate_with_device_add_yet +!DC->user_creatable ) Cc: Alistair Francis <alistair.francis@xilinx.com> Cc: Laszlo Ersek <lersek@redhat.com> Cc: Marcel Apfelbaum <marcel@redhat.com> Cc: Markus Armbruster <armbru@redhat.com> Cc: Peter Maydell <peter.maydell@linaro.org> Cc: Thomas Huth <thuth@redhat.com> Acked-by: Alistair Francis <alistair.francis@xilinx.com> Reviewed-by: Thomas Huth <thuth@redhat.com> Reviewed-by: Marcel Apfelbaum <marcel@redhat.com> Acked-by: Marcel Apfelbaum <marcel@redhat.com> Signed-off-by: Eduardo Habkost <ehabkost@redhat.com> Message-Id: <20170503203604.31462-2-ehabkost@redhat.com> [ehabkost: kept "TODO remove once we're there" comment] Reviewed-by: Markus Armbruster <armbru@redhat.com> Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>	2017-05-17 10:37:00 -03:00
Stefano Stabellini	1ff7c5986a	xen/mapcache: store dma information in revmapcache entries for debugging The Xen mapcache is able to create long term mappings, they are called "locked" mappings. The third parameter of the xen_map_cache call specifies if a mapping is a "locked" mapping. >From the QEMU point of view there are two kinds of long term mappings: [a] device memory mappings, such as option roms and video memory [b] dma mappings, created by dma_memory_map & friends After certain operations, ballooning a VM in particular, Xen asks QEMU kindly to destroy all mappings. However, certainly [a] mappings are present and cannot be removed. That's not a problem as they are not affected by balloonning. The real problem is that if there are any mappings of type [b], any outstanding dma operations could fail. This is a known shortcoming. In other words, when Xen asks QEMU to destroy all mappings, it is an error if any [b] mappings exist. However today we have no way of distinguishing [a] from [b]. Because of that, we cannot even print a decent warning. This patch introduces a new "dma" bool field to MapCacheRev entires, to remember if a given mapping is for dma or is a long term device memory mapping. When xen_invalidate_map_cache is called, we print a warning if any [b] mappings exist. We ignore [a] mappings. Mappings created by qemu_map_ram_ptr are assumed to be [a], while mappings created by address_space_map->qemu_ram_ptr_length are assumed to be [b]. The goal of the patch is to make debugging and system understanding easier. Signed-off-by: Stefano Stabellini <sstabellini@kernel.org> Acked-by: Paolo Bonzini <pbonzini@redhat.com> Acked-by: Anthony PERARD <anthony.perard@citrix.com>	2017-05-16 11:49:09 -07:00
Igor Mammedov	ea2650724c	pc: get numa node mapping from possible_cpus instead of numa_get_node_for_cpu() Signed-off-by: Igor Mammedov <imammedo@redhat.com> Reviewed-by: Andrew Jones <drjones@redhat.com> Message-Id: <1494415802-227633-9-git-send-email-imammedo@redhat.com> Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>	2017-05-11 16:08:49 -03:00
Igor Mammedov	93b2a8cb0b	pc: add node-id property to CPU it will allow switching from cpu_index to property based numa mapping in follow up patches. PS: patch changes default value of CPUState::numa_node from 0 to CPU_UNSET_NUMA_NODE_ID. The only place for x86 that would affected is monitor's 'infor numa' command which uses that field. However legacy 0 value is still preserved by pc_cpu_pre_plug() in this patch if user/numa.c hasn't set it explicitly, so there is no change in behavior. Signed-off-by: Igor Mammedov <imammedo@redhat.com> Message-Id: <1494415802-227633-4-git-send-email-imammedo@redhat.com> Reviewed-by: Eduardo Habkost <ehabkost@redhat.com> Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>	2017-05-11 16:08:48 -03:00
Igor Mammedov	ea089eebbd	numa: move source of default CPUs to NUMA node mapping into boards Originally CPU threads were by default assigned in round-robin fashion. However it was causing issues in guest since CPU threads from the same socket/core could be placed on different NUMA nodes. Commit `fb43b73b` (pc: fix default VCPU to NUMA node mapping) fixed it by grouping threads within a socket on the same node introducing cpu_index_to_socket_id() callback and commit `20bb648d` (spapr: Fix default NUMA node allocation for threads) reused callback to fix similar issues for SPAPR machine even though socket doesn't make much sense there. As result QEMU ended up having 3 default distribution rules used by 3 targets /virt-arm, spapr, pc/. In effort of moving NUMA mapping for CPUs into possible_cpus, generalize default mapping in numa.c by making boards decide on default mapping and let them explicitly tell generic numa code to which node a CPU thread belongs to by replacing cpu_index_to_socket_id() with @cpu_index_to_instance_props() which provides default node_id assigned by board to specified cpu_index. Signed-off-by: Igor Mammedov <imammedo@redhat.com> Reviewed-by: Eduardo Habkost <ehabkost@redhat.com> Message-Id: <1494415802-227633-2-git-send-email-imammedo@redhat.com> Reviewed-by: David Gibson <david@gibson.dropbear.id.au> Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>	2017-05-11 16:08:48 -03:00
Laurent Vivier	3bfe57165b	numa: equally distribute memory on nodes When there are more nodes than available memory to put the minimum allowed memory by node, all the memory is put on the last node. This is because we put (ram_size / nb_numa_nodes) & ~((1 << mc->numa_mem_align_shift) - 1); on each node, and in this case the value is 0. This is particularly true with pseries, as the memory must be aligned to 256MB. To avoid this problem, this patch uses an error diffusion algorithm [1] to distribute equally the memory on nodes. We introduce numa_auto_assign_ram() function in MachineClass to keep compatibility between machine type versions. The legacy function is used with pseries-2.9, pc-q35-2.9 and pc-i440fx-2.9 (and previous), the new one with all others. Example: qemu-system-ppc64 -S -nographic -nodefaults -monitor stdio -m 1G -smp 8 \ -numa node -numa node -numa node \ -numa node -numa node -numa node Before: (qemu) info numa 6 nodes node 0 cpus: 0 6 node 0 size: 0 MB node 1 cpus: 1 7 node 1 size: 0 MB node 2 cpus: 2 node 2 size: 0 MB node 3 cpus: 3 node 3 size: 0 MB node 4 cpus: 4 node 4 size: 0 MB node 5 cpus: 5 node 5 size: 1024 MB After: (qemu) info numa 6 nodes node 0 cpus: 0 6 node 0 size: 0 MB node 1 cpus: 1 7 node 1 size: 256 MB node 2 cpus: 2 node 2 size: 0 MB node 3 cpus: 3 node 3 size: 256 MB node 4 cpus: 4 node 4 size: 256 MB node 5 cpus: 5 node 5 size: 256 MB [1] https://en.wikipedia.org/wiki/Error_diffusion Signed-off-by: Laurent Vivier <lvivier@redhat.com> Message-Id: <20170502162955.1610-2-lvivier@redhat.com> Reviewed-by: Eduardo Habkost <ehabkost@redhat.com> [ehabkost: s/ram_size/size/ at numa_default_auto_assign_ram()] Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>	2017-05-11 16:08:47 -03:00
He Chen	0f203430dd	numa: Allow setting NUMA distance for different NUMA nodes This patch is going to add SLIT table support in QEMU, and provides additional option `dist` for command `-numa` to allow user set vNUMA distance by QEMU command. With this patch, when a user wants to create a guest that contains several vNUMA nodes and also wants to set distance among those nodes, the QEMU command would like: ``` -numa node,nodeid=0,cpus=0 \ -numa node,nodeid=1,cpus=1 \ -numa node,nodeid=2,cpus=2 \ -numa node,nodeid=3,cpus=3 \ -numa dist,src=0,dst=1,val=21 \ -numa dist,src=0,dst=2,val=31 \ -numa dist,src=0,dst=3,val=41 \ -numa dist,src=1,dst=2,val=21 \ -numa dist,src=1,dst=3,val=31 \ -numa dist,src=2,dst=3,val=21 \ ``` Signed-off-by: He Chen <he.chen@linux.intel.com> Message-Id: <1493260558-20728-1-git-send-email-he.chen@linux.intel.com> Reviewed-by: Igor Mammedov <imammedo@redhat.com> Reviewed-by: Andrew Jones <drjones@redhat.com> Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>	2017-05-11 16:08:37 -03:00
Eduardo Habkost	ef0e8fc768	iommu: Don't crash if machine is not PC_MACHINE Currently it's possible to crash QEMU using "-device *-iommu" and "-machine none": $ qemu-system-x86_64 -machine none -device amd-iommu qemu/hw/i386/amd_iommu.c:1140:amdvi_realize: Object 0x55627dafbc90 is not an instance of type generic-pc-machine Aborted (core dumped) $ qemu-system-x86_64 -machine none -device intel-iommu qemu/hw/i386/intel_iommu.c:2972:vtd_realize: Object 0x56292ec0bc90 is not an instance of type generic-pc-machine Aborted (core dumped) Fix amd-iommu and intel-iommu to ensure the current machine is really a TYPE_PC_MACHINE instance at their realize methods. Resulting error messages: $ qemu-system-x86_64 -machine none -device amd-iommu qemu-system-x86_64: -device amd-iommu: Machine-type 'none' not supported by amd-iommu $ qemu-system-x86_64 -machine none -device intel-iommu qemu-system-x86_64: -device intel-iommu: Machine-type 'none' not supported by intel-iommu Signed-off-by: Eduardo Habkost <ehabkost@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>	2017-05-10 22:04:23 +03:00
Peter Xu	465238d9f8	pc: add 2.10 machine type CC: "Michael S. Tsirkin" <mst@redhat.com> CC: Paolo Bonzini <pbonzini@redhat.com> CC: Richard Henderson <rth@twiddle.net> CC: Eduardo Habkost <ehabkost@redhat.com> Signed-off-by: Peter Xu <peterx@redhat.com> Reviewed-by: Eduardo Habkost <ehabkost@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>	2017-05-10 22:04:23 +03:00
Igor Mammedov	98e753a6e5	pc/fwcfg: unbreak migration from qemu-2.5 and qemu-2.6 during firmware boot Since 2.7 commit (`b2a575a` Add optionrom compatible with fw_cfg DMA version) regressed migration during firmware exection time by abusing fwcfg.dma_enabled property to decide loading dma version of option rom AND by mistake disabling DMA for 2.6 and earlier globally instead of only for option rom. so 2.6 machine type guest is broken when it already runs firmware in DMA mode but migrated to qemu-2.7(pc-2.6) at that time; a) qemu-2.6:pc2.6 (fwcfg.dma=on,firmware=dma,oprom=ioport) b) qemu-2.7:pc2.6 (fwcfg.dma=off,firmware=ioport,oprom=ioport) to: a b from a OK FAIL b OK OK So we currently have broken forward migration from qemu-2.6 to qemu-2.[789] that however could be fixed for 2.10 by re-enabling DMA for 2.[56] machine types and allowing dma capable option rom only since 2.7. As result qemu should end up with: c) qemu-2.10:pc2.6 (fwcfg.dma=on,firmware=dma,oprom=ioport) to: a b c from a OK FAIL OK b OK OK OK c OK FAIL OK where forward migration from qemu-2.6 to qemu-2.10 should work again leaving only qemu-2.[789]:pc-2.6 broken. Reported-by: Eduardo Habkost <ehabkost@redhat.com> Analyzed-by: Laszlo Ersek <lersek@redhat.com> Signed-off-by: Igor Mammedov <imammedo@redhat.com> Reviewed-by: Laszlo Ersek <lersek@redhat.com> Reviewed-by: Eduardo Habkost <ehabkost@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>	2017-05-10 22:04:23 +03:00
Ard Biesheuvel	5ee8534731	hw/acpi-defs: replace leading X with x_ in FADT field names At the request of Michael, replace the leading capital X in the FADT field name Xfacs and Xdsdt with lower case x + underscore. Cc: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Reviewed-by: Laszlo Ersek <lersek@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>	2017-05-10 22:04:23 +03:00
Phil Dennis-Jordan	6103451aeb	hw/i386: Build-time assertion on pc/q35 reset register being identical. This adds a clarifying comment and build time assert to the FADT reset register field initialisation: the reset register is the same on both machine types. Signed-off-by: Phil Dennis-Jordan <phil@philjordan.eu> Message-Id: <1489558827-28971-3-git-send-email-phil@philjordan.eu> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2017-05-03 12:29:40 +02:00
Phil Dennis-Jordan	77af8a2b95	hw/i386: Use Rev3 FADT (ACPI 2.0) instead of Rev1 to improve guest OS support. This updates the FADT generated for x86/64 machine types from Revision 1 to 3. (Based on ACPI standard 2.0 instead of 1.0) The intention is to expose the reset register information to guest operating systems which require it, specifically OS X/macOS. Revision 1 FADTs do not contain the fields relating to the reset register. The new layout and contents remains backwards-compatible with operating systems which only support ACPI 1.0, as the existing fields are not modified by this change, as the 64-bit and 32-bit variants are allowed to co-exist according to the ACPI 2.0 standard. No regressions became apparent in tests with a range of Windows (XP-10) and Linux versions. The BIOS tables test suite's FADT checksum test has also been updated to reflect the new FADT layout and content. Signed-off-by: Phil Dennis-Jordan <phil@philjordan.eu> Message-Id: <1489558827-28971-2-git-send-email-phil@philjordan.eu> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2017-05-03 12:29:40 +02:00
Peter Maydell	52e94ea5de	Xen 2017/04/21 + fix -----BEGIN PGP SIGNATURE----- Version: GnuPG v1 iQIcBAABAgAGBQJY/5EdAAoJEIlPj0hw4a6QHj8P/1D3iZq8vKyLvnkioYC00ao8 dhuCFV1Dk0dl/QXvDXxNAHFqmHp2PzHeHF2V19bTbUh9QnY3WH9aSsMQ7YVPDzLb 6ZezbxXd1l8+IOrWh+ch+x6jiUpRAeHGjhSpFxQEmH5THBmPtLHBFhAeeGpVq6CT MPFlN0mr1snyMMbK2aKCVTHgh5Wip83/t/kijIq4RmPwXdJbwxx55PL8jxC/NZHq /NbAZzAT149fmItfBNnsRTfNoDs7fSmgM9VelYsnJNHs6qkYYfewxys4vudePG8n ZcjCXMv9vdCSTfXfYY9nxhfGCgkkRSvBWrzARRIUPxTXGAmEU52LJrccpG9AEFRZ Kgz1JYr0y+u/g+RsCJvAglvmciawXgZH8GIR4sFl2iv4u6cx8PxNRgjTdt7YX4Je V7+scmOLB0U5wkI7PUcPV1v2fwKJHFfMdyik257otfMCJiTCnWGJfCZGR4JAdy3C NHuG4eIj2XLjZ7IQIo3mg+EfdCWdfVMKtXj0MAFsP6Kcr6sY/ef5sx/wB61k3NU1 Y4ctI/LAOONsjlC2yzJ4aZR4LgyFcVcwx8FKOK7niCcCgVLYGWpyiHU3kFKYlmKM FKIlGPPOK8WuRV9NUGDI9A8XENyFXGyiR2xGCotfgHj+FSSqRzhKH4ecfgNimJVY wjTzZmBa68pLHdXaJ+SV =OfeB -----END PGP SIGNATURE----- Merge remote-tracking branch 'remotes/sstabellini/tags/xen-20170421-v2-tag' into staging Xen 2017/04/21 + fix # gpg: Signature made Tue 25 Apr 2017 19:10:37 BST # gpg: using RSA key 0x894F8F4870E1AE90 # gpg: Good signature from "Stefano Stabellini <stefano.stabellini@eu.citrix.com>" # gpg: aka "Stefano Stabellini <sstabellini@kernel.org>" # Primary key fingerprint: D04E 33AB A51F 67BA 07D3 0AEA 894F 8F48 70E1 AE90 * remotes/sstabellini/tags/xen-20170421-v2-tag: (21 commits) move xen-mapcache.c to hw/i386/xen/ move xen-hvm.c to hw/i386/xen/ move xen-common.c to hw/xen/ add xen-9p-backend to MAINTAINERS under Xen xen/9pfs: build and register Xen 9pfs backend xen/9pfs: send responses back to the frontend xen/9pfs: implement in/out_iov_from_pdu and vmarshal/vunmarshal xen/9pfs: receive requests from the frontend xen/9pfs: connect to the frontend xen/9pfs: introduce Xen 9pfs backend 9p: introduce a type for the 9p header xen: import ring.h from xen configure: use pkg-config for obtaining xen version xen: additionally restrict xenforeignmemory operations xen: use libxendevice model to restrict operations xen: use 5 digit xen versions xen: use libxendevicemodel when available configure: detect presence of libxendevicemodel xen: create wrappers for all other uses of xc_hvm_XXX() functions xen: rename xen_modified_memory() to xen_hvm_modified_memory() ... Signed-off-by: Peter Maydell <peter.maydell@linaro.org>	2017-04-26 10:22:31 +01:00
Anthony Xu	28b99f473b	move xen-mapcache.c to hw/i386/xen/ move xen-mapcache.c to hw/i386/xen/ Signed-off -by: Anthony Xu <anthony.xu@intel.com> Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>	2017-04-25 11:04:34 -07:00
Anthony Xu	93d43e7e11	move xen-hvm.c to hw/i386/xen/ move xen-hvm.c to hw/i386/xen/ Signed-off -by: Anthony Xu <anthony.xu@intel.com> Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>	2017-04-25 11:04:34 -07:00
Fam Zheng	021c9d25b3	error: Apply error_propagate_null.cocci again Signed-off-by: Fam Zheng <famz@redhat.com> Message-Id: <20170421122710.15373-15-famz@redhat.com> Reviewed-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Signed-off-by: Markus Armbruster <armbru@redhat.com>	2017-04-24 09:13:45 +02:00
Peter Xu	dd4d607e40	intel_iommu: enable remote IOTLB This patch is based on Aviv Ben-David (<bd.aviv@gmail.com>)'s patch upstream: "IOMMU: enable intel_iommu map and unmap notifiers" https://lists.gnu.org/archive/html/qemu-devel/2016-11/msg01453.html However I removed/fixed some content, and added my own codes. Instead of translate() every page for iotlb invalidations (which is slower), we walk the pages when needed and notify in a hook function. This patch enables vfio devices for VT-d emulation. And, since we already have vhost DMAR support via device-iotlb, a natural benefit that this patch brings is that vt-d enabled vhost can live even without ATS capability now. Though more tests are needed. Signed-off-by: Aviv Ben-David <bdaviv@cs.technion.ac.il> Reviewed-by: Jason Wang <jasowang@redhat.com> Reviewed-by: David Gibson <david@gibson.dropbear.id.au> Reviewed-by: \"Michael S. Tsirkin\" <mst@redhat.com> Signed-off-by: Peter Xu <peterx@redhat.com> Message-Id: <1491562755-23867-10-git-send-email-peterx@redhat.com> Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>	2017-04-20 15:22:41 -03:00
Peter Xu	558e0024a4	intel_iommu: allow dynamic switch of IOMMU region This is preparation work to finally enabled dynamic switching ON/OFF for VT-d protection. The old VT-d codes is using static IOMMU address space, and that won't satisfy vfio-pci device listeners. Let me explain. vfio-pci devices depend on the memory region listener and IOMMU replay mechanism to make sure the device mapping is coherent with the guest even if there are domain switches. And there are two kinds of domain switches: (1) switch from domain A -> B (2) switch from domain A -> no domain (e.g., turn DMAR off) Case (1) is handled by the context entry invalidation handling by the VT-d replay logic. What the replay function should do here is to replay the existing page mappings in domain B. However for case (2), we don't want to replay any domain mappings - we just need the default GPA->HPA mappings (the address_space_memory mapping). And this patch helps on case (2) to build up the mapping automatically by leveraging the vfio-pci memory listeners. Another important thing that this patch does is to seperate IR (Interrupt Remapping) from DMAR (DMA Remapping). IR region should not depend on the DMAR region (like before this patch). It should be a standalone region, and it should be able to be activated without DMAR (which is a common behavior of Linux kernel - by default it enables IR while disabled DMAR). Reviewed-by: Jason Wang <jasowang@redhat.com> Reviewed-by: David Gibson <david@gibson.dropbear.id.au> Reviewed-by: \"Michael S. Tsirkin\" <mst@redhat.com> Signed-off-by: Peter Xu <peterx@redhat.com> Message-Id: <1491562755-23867-9-git-send-email-peterx@redhat.com> Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>	2017-04-20 15:22:41 -03:00
Peter Xu	f06a696dc9	intel_iommu: provide its own replay() callback The default replay() don't work for VT-d since vt-d will have a huge default memory region which covers address range 0-(2^64-1). This will normally consumes a lot of time (which looks like a dead loop). The solution is simple - we don't walk over all the regions. Instead, we jump over the regions when we found that the page directories are empty. It'll greatly reduce the time to walk the whole region. To achieve this, we provided a page walk helper to do that, invoking corresponding hook function when we found an page we are interested in. vtd_page_walk_level() is the core logic for the page walking. It's interface is designed to suite further use case, e.g., to invalidate a range of addresses. Reviewed-by: Jason Wang <jasowang@redhat.com> Reviewed-by: David Gibson <david@gibson.dropbear.id.au> Reviewed-by: \"Michael S. Tsirkin\" <mst@redhat.com> Signed-off-by: Peter Xu <peterx@redhat.com> Message-Id: <1491562755-23867-8-git-send-email-peterx@redhat.com> Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>	2017-04-20 15:22:41 -03:00
Jason Wang	10315b9b28	intel_iommu: use the correct memory region for device IOTLB notification We have a specific memory region for DMAR now, so it's wrong to trigger the notifier with the root region. Cc: Michael S. Tsirkin <mst@redhat.com> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Richard Henderson <rth@twiddle.net> Cc: Eduardo Habkost <ehabkost@redhat.com> Signed-off-by: Jason Wang <jasowang@redhat.com> Reviewed-by: Peter Xu <peterx@redhat.com> Reviewed-by: \"Michael S. Tsirkin\" <mst@redhat.com> Signed-off-by: Peter Xu <peterx@redhat.com> Message-Id: <1491562755-23867-7-git-send-email-peterx@redhat.com> Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>	2017-04-20 15:22:41 -03:00
Eric Blake	0d3ef78829	trace: Avoid abuse of amdvi_mmio_read hw/i386/trace-events has an amdvi_mmio_read trace that is used for both normal reads (listing the register name, address, size, and offset) and for an error case (abusing the register name to show an error message, the address to show the maximum value supported, then shoehorning address and size into the size and offset parameters). The change from a wide address to a narrower size parameter could truncate a (rather-large) bogus read attempt, so it's better to create a separate dedicated trace with correct types, rather than abusing the trace mechanism. Broken since its introduction in commit `d29a09c`. [Change trace event argument type from hwaddr to uint64_t since user-defined types should not be used for trace events. This fixes a build failure with LTTng UST. --Stefan] Signed-off-by: Eric Blake <eblake@redhat.com> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>	2017-03-24 09:21:42 +00:00
Paul Durrant	8f25e75441	xen: create wrappers for all other uses of xc_hvm_XXX() functions This patch creates inline wrapper functions in xen_common.h for all open coded calls to xc_hvm_XXX() functions outside of xen_common.h so that use of xen_xc can be made implicit. This again is in preparation for the move to using libxendevicemodel. Signed-off-by: Paul Durrant <paul.durrant@citrix.com> Reviewed-by: Anthony Perard <anthony.perard@citrix.com> Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>	2017-03-22 11:47:39 -07:00
Eduardo Habkost	ca2edcd35c	kvmclock: Don't crash QEMU if KVM is disabled Most machines don't allow sysbus devices like "kvmclock" to be created from the command-line, but some of them do (the ones with has_dynamic_sysbus=true). In those cases, it's possible to manually create a kvmclock device without KVM being enabled, making QEMU crash: $ qemu-system-x86_64 -machine q35,accel=tcg -device kvmclock Segmentation fault (core dumped) This changes kvmclock's realize method to return an error if KVM is disabled, to ensure it won't crash QEMU. Signed-off-by: Eduardo Habkost <ehabkost@redhat.com> Message-Id: <20170309185046.17555-1-ehabkost@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2017-03-14 13:26:36 +01:00
Peter Maydell	9a81b792cc	virtio, pc: fixes, features virtio support for region caches broke a bunch of stuff - fixing most of it though it's not ideal. Still pondering the right way to fix it. New: VM gen ID and hotplug for PXB. Signed-off-by: Michael S. Tsirkin <mst@redhat.com> -----BEGIN PGP SIGNATURE----- iQEcBAABAgAGBQJYt7llAAoJECgfDbjSjVRp+r4H/1cmQ4F67H8oSOAT8xuAQFku OdHoVRJMWf7CRvZ7JqVke/a877d+h6ZpfW5dZQ7hp7O7rkPiuPHa5PVb0WGwDqrD scSOIvDPxJm19pnfZoF4zx+Ov45W5ahF+gwwm/sJU232ApLqOmAjs0FUxidkadQE f5Jrjs20WO2Vkkcd3U7Zl31myre0V7AbwIm7dB/8B+dpL6bJcxSvlM4krwLdBY6S lLs9V6ypRzjUxS3MDANL75KNrO/zys55J+Pa4sEh4+H0OX71v9Icl3s1zaM8J/EN VPjdqhDvJuEahc50FbJyRZQGIzOZ6PcGMsKUHKlxoVmDYZ6Pv5lOnpaLZRT6HMk= =ITdO -----END PGP SIGNATURE----- Merge remote-tracking branch 'remotes/mst/tags/for_upstream' into staging virtio, pc: fixes, features virtio support for region caches broke a bunch of stuff - fixing most of it though it's not ideal. Still pondering the right way to fix it. New: VM gen ID and hotplug for PXB. Signed-off-by: Michael S. Tsirkin <mst@redhat.com> # gpg: Signature made Thu 02 Mar 2017 06:19:17 GMT # gpg: using RSA key 0x281F0DB8D28D5469 # gpg: Good signature from "Michael S. Tsirkin <mst@kernel.org>" # gpg: aka "Michael S. Tsirkin <mst@redhat.com>" # Primary key fingerprint: 0270 606B 6F3C DF3D 0B17 0970 C350 3912 AFBE 8E67 # Subkey fingerprint: 5D09 FD08 71C8 F85B 94CA 8A0D 281F 0DB8 D28D 5469 * remotes/mst/tags/for_upstream: hw/pxb-pcie: fix PCI Express hotplug support tests/acpi: update DSDT after last patch acpi: simplify _OSC virtio: unbreak virtio-pci with IOMMU after caching ring translations virtio: add missing region cache init in virtio_load() virtio: invalidate memory in vring_set_avail_event() virtio: guard vring access when setting notification virtio: check for vring setup in virtio_queue_empty MAINTAINERS: Add VM Generation ID entries tests: Move reusable ACPI code into a utility file qmp/hmp: add query-vm-generation-id and 'info vm-generation-id' commands ACPI: Add Virtual Machine Generation ID support ACPI: Add vmgenid blob storage to the build tables docs: VM Generation ID device description linker-loader: Add new 'write pointer' command Signed-off-by: Peter Maydell <peter.maydell@linaro.org>	2017-03-03 10:09:03 +00:00
Marcel Apfelbaum	077dd74239	hw/pxb-pcie: fix PCI Express hotplug support Add the missing osc method for pxb-pcie devices as APCI spec recommends, see 6.2.9.1 OSC Implementation Example for PCI Host Bridge Devices, ACPI 3.0a: It is recommended that a machine with multiple host bridge devices should report the same capabilities for all host bridges, and also negotiate control of the features described in the Control Field in the same way for all host bridges. Reviewed-by: Igor Mammedov <imammedo@redhat.com> Signed-off-by: Marcel Apfelbaum <marcel@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>	2017-03-02 07:31:26 +02:00
Michael S. Tsirkin	b3c782db20	acpi: simplify _OSC Our _OSC method has a bunch of unused code loading data into external CTRL and SUPP fields which are then never used. Drop this in favor of a single local variable. Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Reviewed-by: Igor Mammedov <imammedo@redhat.com>	2017-03-02 07:29:10 +02:00
Ben Warren	d03637bcfb	ACPI: Add Virtual Machine Generation ID support This implements the VM Generation ID feature by passing a 128-bit GUID to the guest via a fw_cfg blob. Any time the GUID changes, an ACPI notify event is sent to the guest The user interface is a simple device with one parameter: - guid (string, must be "auto" or in UUID format xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx) Signed-off-by: Ben Warren <ben@skyportsystems.com> Reviewed-by: Igor Mammedov <imammedo@redhat.com> Reviewed-by: Laszlo Ersek <lersek@redhat.com> Tested-by: Laszlo Ersek <lersek@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>	2017-03-02 07:14:27 +02:00
Igor Mammedov	f0c9d64a68	pc: pcihp: avoid adding ACPI_PCIHP_PROP_BSEL twice PCI hotplug for bridges was introduced only since 2.0 however acpi_set_bsel()->object_property_add_uint32_ptr(bus, ACPI_PCIHP_PROP_BSEL) didn't take in account that for legacy mode (1.7) when PCI hotplug for bridges is unavailable and ACPI_PCIHP_PROP_BSEL property the only bus "PCI.0' has been created earlier at acpi_pcihp_init() time. We managed to live with it only because of error rised by adding a duplicate property in acpi_set_bsel() has been ignored which resulted in useless leaking of just allocated (int)bus_bsel. Issue affects only 1.7 machine type as ACPI tables supported by QEMU were introduced at that time, but there wasn't PCI hotplug for bridges till the next release (2.0). Fix it by removing duplicate ACPI_PCIHP_PROP_BSEL intialization in acpi_pcihp_init() and doing it only in one place acpi_set_pci_info(). PS: do not ignore error returned by object_property_add_uint32_ptr() and abort QEMU since it's programming error which should be fixed instead of being ignored. Signed-off-by: Igor Mammedov <imammedo@redhat.com> Reported-by: Marc-André Lureau <marcandre.lureau@redhat.com> Message-Id: <1470211497-116801-1-git-send-email-imammedo@redhat.com> Reviewed-by: Marcel Apfelbaum <marcel@redhat.com> Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com> [ Marc-André - Remove now unused ACPI_PCIHP_LEGACY_SIZE ] Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>	2017-03-01 11:51:05 +04:00
Peter Maydell	28f997a82c	This is the MTTCG pull-request as posted yesterday. -----BEGIN PGP SIGNATURE----- Version: GnuPG v1 iQEcBAABAgAGBQJYsBZfAAoJEPvQ2wlanipElJ0H+QGoStSPeHrvKu7Q07v4F9zM Pvf05gRsaxvXl7UbwmXC4oKhvZf9rVJ6ITk0x/y0WvmK0mHCmNBWkC0nn5UFL5IH cdxetLz21Q+Ghpc36tZvqn2HYwRQFoEznge2LdtBDG0TyVA4jwquHU3HCG2D51zi BaImI6lYW1e4ejjZHw8cEInSxsj/HJZE4pPas2Tkci+uAnrJroErwBVRRcE/y/Tn aupl9TJFs2JdyJFNDibIm0kjB+i+jvCiLgYjbKZ/dR/+GZt73TtiBk/q9ZOFjdmT 7YFPI3F46QbGHoZahtzh0Xt7WMj94SlQgQ9OJ3zmNMfpXrze6Yc78xo/nbQ33U0= =wR0/ -----END PGP SIGNATURE----- Merge remote-tracking branch 'remotes/stsquad/tags/pull-mttcg-240217-1' into staging This is the MTTCG pull-request as posted yesterday. # gpg: Signature made Fri 24 Feb 2017 11:17:51 GMT # gpg: using RSA key 0xFBD0DB095A9E2A44 # gpg: Good signature from "Alex Bennée (Master Work Key) <alex.bennee@linaro.org>" # Primary key fingerprint: 6685 AE99 E751 67BC AFC8 DF35 FBD0 DB09 5A9E 2A44 * remotes/stsquad/tags/pull-mttcg-240217-1: (24 commits) tcg: enable MTTCG by default for ARM on x86 hosts hw/misc/imx6_src: defer clearing of SRC_SCR reset bits target-arm: ensure all cross vCPUs TLB flushes complete target-arm: don't generate WFE/YIELD calls for MTTCG target-arm/powerctl: defer cpu reset work to CPU context cputlb: introduce tlb_flush__all_cpus[_synced] cputlb: atomically update tlb fields used by tlb_reset_dirty cputlb: add tlb_flush_by_mmuidx async routines cputlb and arm/sparc targets: convert mmuidx flushes from varg to bitmap cputlb: introduce tlb_flush_ async work. cputlb: tweak qemu_ram_addr_from_host_nofail reporting cputlb: add assert_cpu_is_self checks tcg: handle EXCP_ATOMIC exception for system emulation tcg: enable thread-per-vCPU tcg: enable tb_lock() for SoftMMU tcg: remove global exit_request tcg: drop global lock during TCG code execution tcg: rename tcg_current_cpu to tcg_current_rr_cpu tcg: add kick timer for single-threaded vCPU emulation tcg: add options for enabling MTTCG ... Signed-off-by: Peter Maydell <peter.maydell@linaro.org>	2017-02-25 18:43:52 +00:00

1 2 3 4 5 ...

1219 Commits