mirrors/qemu - qemu - SynapseOS git

Author	SHA1	Message	Date
Paolo Bonzini	b855f8d175	scsi: build qemu-pr-helper Introduce a privileged helper to run persistent reservation commands. This lets virtual machines send persistent reservations without using CAP_SYS_RAWIO or out-of-tree patches. The helper uses Unix permissions and SCM_RIGHTS to restrict access to processes that can access its socket and prove that they have an open file descriptor for a raw SCSI device. The next patch will also correct the usage of persistent reservations with multipath devices. It would also be possible to support for Linux's IOC_PR_* ioctls in the future, to support NVMe devices. For now, however, only SCSI is supported. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2017-09-22 21:07:24 +02:00
Paolo Bonzini	7c9e527659	scsi, file-posix: add support for persistent reservation management It is a common requirement for virtual machine to send persistent reservations, but this currently requires either running QEMU with CAP_SYS_RAWIO, or using out-of-tree patches that let an unprivileged QEMU bypass Linux's filter on SG_IO commands. As an alternative mechanism, the next patches will introduce a privileged helper to run persistent reservation commands without expanding QEMU's attack surface unnecessarily. The helper is invoked through a "pr-manager" QOM object, to which file-posix.c passes SG_IO requests for PERSISTENT RESERVE OUT and PERSISTENT RESERVE IN commands. For example: $ qemu-system-x86_64 -device virtio-scsi \ -object pr-manager-helper,id=helper0,path=/var/run/qemu-pr-helper.sock -drive if=none,id=hd,driver=raw,file.filename=/dev/sdb,file.pr-manager=helper0 -device scsi-block,drive=hd or: $ qemu-system-x86_64 -device virtio-scsi \ -object pr-manager-helper,id=helper0,path=/var/run/qemu-pr-helper.sock -blockdev node-name=hd,driver=raw,file.driver=host_device,file.filename=/dev/sdb,file.pr-manager=helper0 -device scsi-block,drive=hd Multiple pr-manager implementations are conceivable and possible, though only one is implemented right now. For example, a pr-manager could: - talk directly to the multipath daemon from a privileged QEMU (i.e. QEMU links to libmpathpersist); this makes reservation work properly with multipath, but still requires CAP_SYS_RAWIO - use the Linux IOC_PR_* ioctls (they require CAP_SYS_ADMIN though) - more interestingly, implement reservations directly in QEMU through file system locks or a shared database (e.g. sqlite) Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2017-09-22 01:06:51 +02:00
Alexey Kardashevskiy	092aa2fc65	memory: Share special empty FlatView This shares an cached empty FlatView among address spaces. The empty FV is used every time when a root MR renders into a FV without memory sections which happens when MR or its children are not enabled or zero-sized. The empty_view is not NULL to keep the rest of memory API intact; it also has a dispatch tree for the same reason. On POWER8 with 255 CPUs, 255 virtio-net, 40 PCI bridges guest this halves the amount of FlatView's in use (557 -> 260) and dispatch tables (~800000 -> ~370000). In an unrelated experiment with 112 non-virtio devices on x86 ("-M pc"), only 4 FlatViews are alive, and about ~2000 are created at startup. Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru> Message-Id: <20170921085110.25598-16-aik@ozlabs.ru> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2017-09-22 01:06:51 +02:00
Paolo Bonzini	e673ba9af9	memory: seek FlatView sharing candidates among children subregions A container can be used instead of an alias to allow switching between multiple subregions. In this case we cannot directly share the subregions (since they only belong to a single parent), but if the subregions are aliases we can in turn walk those. This is not enough to remove all source of quadratic FlatView creation, but it enables sharing of the PCI bus master FlatViews (and their AddressSpaceDispatch structures) across all PCI devices. For 112 virtio-net-pci devices, boot time is reduced from 25 to 10 seconds and memory consumption from 1.4 to 1 G. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2017-09-22 01:06:51 +02:00
Paolo Bonzini	02d9651d6a	memory: trace FlatView creation and destruction Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2017-09-22 01:06:51 +02:00
Alexey Kardashevskiy	202fc01b05	memory: Create FlatView directly This avoids usual memory_region_transaction_commit() which rebuilds all FVs. On POWER8 with 255 CPUs, 255 virtio-net, 40 PCI bridges guest this brings down the boot time from 25s to 20s and reduces the amount of temporary FVs allocated during machine constructon (~800000 -> ~640000) and amount of temporary dispatch trees (~370000 -> ~300000), the total memory footprint goes down (18G -> 17G). Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru> Message-Id: <20170921085110.25598-18-aik@ozlabs.ru> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2017-09-22 01:06:51 +02:00
Alexey Kardashevskiy	b516572f31	memory: Get rid of address_space_init_shareable Since FlatViews are shared now and ASes not, this gets rid of address_space_init_shareable(). This should cause no behavioural change. Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru> Message-Id: <20170921085110.25598-17-aik@ozlabs.ru> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2017-09-22 01:06:51 +02:00
Alexey Kardashevskiy	5e8fd947e2	memory: Rework "info mtree" to print flat views and dispatch trees This adds a new "-d" switch to "info mtree" to print dispatch tree internals. This changes the way "-f" is handled - it prints now flat views and associated address spaces. Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru> Message-Id: <20170921085110.25598-15-aik@ozlabs.ru> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2017-09-21 23:19:38 +02:00
Alexey Kardashevskiy	67ace39b25	memory: Do not allocate FlatView in address_space_init This creates a new AS object without any FlatView as memory_region_transaction_commit() may want to reuse the empty FV. Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru> Message-Id: <20170921085110.25598-14-aik@ozlabs.ru> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2017-09-21 23:19:38 +02:00
Alexey Kardashevskiy	967dc9b119	memory: Share FlatView's and dispatch trees between address spaces This allows sharing flat views between address spaces (AS) when the same root memory region is used when creating a new address space. This is done by walking through all ASes and caching one FlatView per a physical root MR (i.e. not aliased). This removes search for duplicates from address_space_init_shareable() as FlatViews are shared elsewhere and keeping as::ref_count correct seems an unnecessary and useless complication. This should cause no change and memory use or boot time yet. Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru> Message-Id: <20170921085110.25598-13-aik@ozlabs.ru> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2017-09-21 23:19:38 +02:00
Alexey Kardashevskiy	0221848764	memory: Move address_space_update_ioeventfds So it is called (twice) from the same function. This is to make the next patches a bit simpler. Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru> Message-Id: <20170921085110.25598-12-aik@ozlabs.ru> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2017-09-21 23:19:38 +02:00
Alexey Kardashevskiy	9bf561e36c	memory: Alloc dispatch tree where topology is generared This is to make next patches simpler. Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru> Message-Id: <20170921085110.25598-11-aik@ozlabs.ru> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2017-09-21 23:19:38 +02:00
Alexey Kardashevskiy	89c177bbdd	memory: Store physical root MR in FlatView Address spaces get to keep a root MR (alias or not) but FlatView stores the actual MR as this is going to be used later on to decide whether to share a particular FlatView or not. Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru> Message-Id: <20170921085110.25598-10-aik@ozlabs.ru> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2017-09-21 23:19:37 +02:00
Alexey Kardashevskiy	8629d3fcb7	memory: Rename mem_begin/mem_commit/mem_add helpers This renames some helpers to reflect better what they do. This should cause no behavioural change. Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru> Message-Id: <20170921085110.25598-9-aik@ozlabs.ru> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2017-09-21 23:19:37 +02:00
Alexey Kardashevskiy	9950322a59	memory: Cleanup after switching to FlatView We store AddressSpaceDispatch* in FlatView anyway so there is no need to carry it from mem_add() to register_subpage/register_multipage. This should cause no behavioural change. Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru> Message-Id: <20170921085110.25598-8-aik@ozlabs.ru> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2017-09-21 23:19:37 +02:00
Alexey Kardashevskiy	166206845f	memory: Switch memory from using AddressSpace to FlatView FlatView's will be shared between AddressSpace's and subpage_t and MemoryRegionSection cannot store AS anymore, hence this change. In particular, for: typedef struct subpage_t { MemoryRegion iomem; - AddressSpace as; + FlatView fv; hwaddr base; uint16_t sub_section[]; } subpage_t; struct MemoryRegionSection { MemoryRegion mr; - AddressSpace address_space; + FlatView *fv; hwaddr offset_within_region; Int128 size; hwaddr offset_within_address_space; bool readonly; }; This should cause no behavioural change. Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru> Message-Id: <20170921085110.25598-7-aik@ozlabs.ru> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2017-09-21 23:19:37 +02:00
Alexey Kardashevskiy	c775252378	memory: Remove AddressSpace pointer from AddressSpaceDispatch AS in ASD is only used to pass AS from mem_begin() to register_subpage() to store it in MemoryRegionSection, we can do this directly now. This should cause no behavioural change. Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru> Message-Id: <20170921085110.25598-6-aik@ozlabs.ru> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2017-09-21 23:19:37 +02:00
Alexey Kardashevskiy	66a6df1dc6	memory: Move AddressSpaceDispatch from AddressSpace to FlatView As we are going to share FlatView's between AddressSpace's, and AddressSpaceDispatch is a structure to perform quick lookup in FlatView, this moves ASD to FlatView. After previosly open coded ASD rendering, we can also remove as->next_dispatch as the new FlatView pointer is stored on a stack and set to an AS atomically. flatview_destroy() is executed under RCU instead of address_space_dispatch_free() now. This makes mem_begin/mem_commit to work with ASD and mem_add with FV as later on mem_add will be taking FV as an argument anyway. This should cause no behavioural change. Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru> Message-Id: <20170921085110.25598-5-aik@ozlabs.ru> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2017-09-21 23:19:37 +02:00
Alexey Kardashevskiy	cc94cd6d36	memory: Move FlatView allocation to a helper This moves a FlatView allocation and initialization to a helper. While we are nere, replace g_new with g_new0 to not to bother if we add new fields in the future. This should cause no behavioural change. Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru> Message-Id: <20170921085110.25598-4-aik@ozlabs.ru> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2017-09-21 23:19:37 +02:00
Alexey Kardashevskiy	9a62e24f45	memory: Open code FlatView rendering We are going to share FlatView's between AddressSpace's and per-AS memory listeners won't suit the purpose anymore so open code the dispatch tree rendering. Since there is a good chance that dispatch_listener was the only listener, this avoids address_space_update_topology_pass() if there is no registered listeners; this should improve starting time. This should cause no behavioural change. Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru> Message-Id: <20170921085110.25598-3-aik@ozlabs.ru> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2017-09-21 23:19:37 +02:00
Alexey Kardashevskiy	e76bb18f7e	exec: Explicitly export target AS from address_space_translate_internal This adds an AS** parameter to address_space_do_translate() to make it easier for the next patch to share FlatViews. This should cause no behavioural change. Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru> Message-Id: <20170921085110.25598-2-aik@ozlabs.ru> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2017-09-21 23:19:37 +02:00
Paolo Bonzini	447b0d0b9e	memory: avoid "resurrection" of dead FlatViews It's possible for address_space_get_flatview() as it currently stands to cause a use-after-free for the returned FlatView, if the reference count is incremented after the FlatView has been replaced by a writer: thread 1 thread 2 RCU thread ------------------------------------------------------------- rcu_read_lock read as->current_map set as->current_map flatview_unref '--> call_rcu flatview_ref [ref=1] rcu_read_unlock flatview_destroy <badness> Since FlatViews are not updated very often, we can just detect the situation using a new atomic op atomic_fetch_inc_nonzero, similar to Linux's atomic_inc_not_zero, which performs the refcount increment only if it hasn't already hit zero. This is similar to Linux commit de09a9771a53 ("CRED: Fix get_task_cred() and task_state() to not resurrect dead credentials", 2010-07-29). Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2017-09-21 23:19:37 +02:00
Paolo Bonzini	db81b99537	atomic: update documentation Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2017-09-21 14:47:42 +02:00
KONRAD Frederic	05e015f73c	memory: avoid a name clash with access macro This avoids a name clash with the access macro on windows 64: make CHK version_gen.h CC aarch64-softmmu/memory.o /home/konrad/qemu/memory.c: In function 'access_with_adjusted_size': /home/konrad/qemu/memory.c:591:73: error: macro "access" passed 7 arguments, \ but takes just 2 (size - access_size - i) * 8, access_mask, attrs); ^ Signed-off-by: KONRAD Frederic <frederic.konrad@adacore.com> Message-Id: <1505988260-8483-1-git-send-email-frederic.konrad@adacore.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2017-09-21 14:08:17 +02:00
David Hildenbrand	3110cdbd8a	kvm: drop wrong assertion creating problems with pflash pflash toggles mr->romd_mode. So this assert does not always hold. 1) a device was added with !mr->romd_mode, therefore effectively not creating a kvm slot as we want to trap every access (add = false). 2) mr->romd_mode was toggled on before remove it. There is now actually no slot to remove and the assert is wrong. So let's just drop the assert. Reported-by: Gerd Hoffmann <kraxel@redhat.com> Signed-off-by: David Hildenbrand <david@redhat.com> Message-Id: <20170920145025.19403-1-david@redhat.com> Tested-by: Gerd Hoffmann <kraxel@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2017-09-21 12:40:08 +02:00
Pavel Butsykin	55289fb036	virtio-serial: add enable_backend callback We should guarantee that RAM will not be modified while VM has a stopped state, otherwise it can lead to negative consequences during post-copy migration. In RUN_STATE_FINISH_MIGRATE step, it's expected that RAM on source side will not be modified as this could lead to non-consistent vm state on the destination side. Also RAM access during postcopy-ram migration with enabled release-ram capability can lead to sad consequences. Let's add enable_backend() callback to avoid undesirable virtioqueue changes in the guest memory. Signed-off-by: Pavel Butsykin <pbutsykin@virtuozzo.com> Message-Id: <20170919120733.22020-1-pbutsykin@virtuozzo.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2017-09-21 11:51:49 +02:00
Peter Maydell	b62b7ed0fc	These patches fix regressions in 2.10 -----BEGIN PGP SIGNATURE----- iEYEABECAAYFAlnCD9sACgkQAvw66wEB28KlTQCeNUO473jgHQvT2x4igufeBSN3 3cIAn16yvNgIxMJRjIQMeMjkY77ZNMJo =LovD -----END PGP SIGNATURE----- Merge remote-tracking branch 'remotes/gkurz/tags/for-upstream' into staging These patches fix regressions in 2.10 # gpg: Signature made Wed 20 Sep 2017 07:51:07 BST # gpg: using DSA key 0x02FC3AEB0101DBC2 # gpg: Good signature from "Greg Kurz <groug@kaod.org>" # gpg: aka "Greg Kurz <groug@free.fr>" # gpg: aka "Greg Kurz <gkurz@linux.vnet.ibm.com>" # gpg: aka "Gregory Kurz (Groug) <groug@free.fr>" # gpg: aka "[jpeg image of size 3330]" # gpg: WARNING: This key is not certified with a trusted signature! # gpg: There is no indication that the signature belongs to the owner. # Primary key fingerprint: 2BD4 3B44 535E C0A7 9894 DBA2 02FC 3AEB 0101 DBC2 * remotes/gkurz/tags/for-upstream: 9pfs: check the size of transport buffer before marshaling 9pfs: fix name_to_path assertion in v9fs_complete_rename() 9pfs: fix readdir() for 9p2000.u Signed-off-by: Peter Maydell <peter.maydell@linaro.org>	2017-09-20 20:33:48 +01:00
Peter Maydell	d3f5433c7b	Machine/CPU/NUMA queue, 2017-09-19 -----BEGIN PGP SIGNATURE----- iQIcBAABCAAGBQJZwXs9AAoJECgHk2+YTcWmS18P/1OceEmetatwBQ6YWURA0VfU CB+nCHxijlWXU8UiwoTVDOXc11P00V5VO0mMFouUbv+o+/80qyAfNl2DhDVq7ZXo nhIFHKXUR1BX3YbcqfIonH8xCMGtrAexghhhaGPQbx450PqmGyyjBsZsXttNge1S Yt+jzgD9drsZPYw4ZLJjT/tnWI/+8kDPn37jVujzRzApTD9/fq+77ZZq7q25RzQH ISa5OXOQk6pq+tPHvaIFXfOqfhILcpM7u/X7MYVwiA1oBqcLXTDCjDX3cKavKade +i3yrH7ahNgUO1DyT2bMZX6NAFoZDBrwlYgpkw8n+Yf+EUcdPDOHHEcEeaMpdTGx wgWbQrs+xzIg/ulRb8Qqe9FwdXGQbehfFGofd+gnGQ0XuxekT82in+ucMOivQO3x W/azGnzoz6D83stJFIZ93S69SRswqBuj2R8mu821yzqx1EUSNXgfKXz9OPwwFed5 El9YN127F/VAyp0av1CsOg1XgqIujMUGRxGf7eQBfkh1R3C/g2XNPTvR3yaY9L5B zuMJfWLF6r6zL53ymt7/9UVEim295Lia3mNGS5/Min5QGms7edphsDsrubzXsZGq 2owWfAU/KeDH9gNVNNkZdLcEcS5TEz+2oGPR5oeDeB/QlzVdNQ3FeTVlzFavxNQa 8nrzeFcw7VNrIx2gdvsY =LQ8U -----END PGP SIGNATURE----- Merge remote-tracking branch 'remotes/ehabkost/tags/machine-next-pull-request' into staging Machine/CPU/NUMA queue, 2017-09-19 # gpg: Signature made Tue 19 Sep 2017 21:17:01 BST # gpg: using RSA key 0x2807936F984DC5A6 # gpg: Good signature from "Eduardo Habkost <ehabkost@redhat.com>" # Primary key fingerprint: 5A32 2FD5 ABC4 D3DB ACCF D1AA 2807 936F 984D C5A6 * remotes/ehabkost/tags/machine-next-pull-request: MAINTAINERS: Update git URLs for my trees hw/acpi-build: Fix SRAT memory building in case of node 0 without RAM NUMA: Replace MAX_NODES with nb_numa_nodes in for loop numa: cpu: calculate/set default node-ids after all -numa CLI options are parsed arm: drop intermediate cpu_model -> cpu type parsing and use cpu type directly pc: use generic cpu_model parsing vl.c: convert cpu_model to cpu type and set of global properties before machine_init() cpu: make cpu_generic_init() abort QEMU on error qom: cpus: split cpu_generic_init() on feature parsing and cpu creation parts hostmem-file: Add "discard-data" option osdep: Define QEMU_MADV_REMOVE vl: Clean up user-creatable objects when exiting Signed-off-by: Peter Maydell <peter.maydell@linaro.org>	2017-09-20 17:35:36 +01:00
Jan Dakinevich	772a73692e	9pfs: check the size of transport buffer before marshaling v9fs_do_readdir_with_stat() should check for a maximum buffer size before an attempt to marshal gathered data. Otherwise, buffers assumed as misconfigured and the transport would be broken. The patch brings v9fs_do_readdir_with_stat() in conformity with v9fs_do_readdir() behavior. Signed-off-by: Jan Dakinevich <jan.dakinevich@gmail.com> [groug, regression caused my commit `8d37de41ca` # 2.10] Signed-off-by: Greg Kurz <groug@kaod.org>	2017-09-20 08:48:52 +02:00
Jan Dakinevich	4d8bc7334b	9pfs: fix name_to_path assertion in v9fs_complete_rename() The third parameter of v9fs_co_name_to_path() must not contain `/' character. The issue is most likely related to 9p2000.u protocol only. Signed-off-by: Jan Dakinevich <jan.dakinevich@gmail.com> [groug, regression caused by commit `f57f587857` # 2.10] Signed-off-by: Greg Kurz <groug@kaod.org>	2017-09-20 08:48:52 +02:00
Jan Dakinevich	6069537f43	9pfs: fix readdir() for 9p2000.u If the client is using 9p2000.u, the following occurs: $ cd ${virtfs_shared_dir} $ mkdir -p a/b/c $ ls a/b ls: cannot access 'a/b/a': No such file or directory ls: cannot access 'a/b/b': No such file or directory a b c instead of the expected: $ ls a/b c This is a regression introduced by commit f57f5878578a; local_name_to_path() now resolves ".." and "." in paths, and v9fs_do_readdir_with_stat()->stat_to_v9stat() then copies the basename of the resulting path to the response. With the example above, this means that "." and ".." are turned into "b" and "a" respectively... stat_to_v9stat() currently assumes it is passed a full canonicalized path and uses it to do two different things: 1) to pass it to v9fs_co_readlink() in case the file is a symbolic link 2) to set the name field of the V9fsStat structure to the basename part of the given path It only has two users: v9fs_stat() and v9fs_do_readdir_with_stat(). v9fs_stat() really needs 1) and 2) to be performed since it starts with the full canonicalized path stored in the fid. It is different for v9fs_do_readdir_with_stat() though because the name we want to put into the V9fsStat structure is the d_name field of the dirent actually (ie, we want to keep the "." and ".." special names). So, we only need 1) in this case. This patch hence adds a basename argument to stat_to_v9stat(), to be used to set the name field of the V9fsStat structure, and moves the basename logic to v9fs_stat(). Signed-off-by: Jan Dakinevich <jan.dakinevich@gmail.com> (groug, renamed old name argument to path and updated changelog) Signed-off-by: Greg Kurz <groug@kaod.org>	2017-09-20 08:48:51 +02:00
Eduardo Habkost	e3d038b89f	MAINTAINERS: Update git URLs for my trees List the branches where I queue patches for Machine Core, NUMA, Memory Backends, and X86. Update the NUMA section to list the "machine-next" branch instead of "numa". Signed-off-by: Eduardo Habkost <ehabkost@redhat.com> Message-Id: <20170901153928.17058-1-ehabkost@redhat.com> Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>	2017-09-19 16:53:13 -03:00
Eduardo Habkost	4926403c25	hw/acpi-build: Fix SRAT memory building in case of node 0 without RAM Currently, Using the fisrt node without memory on the machine makes QEMU unhappy. With this example command line: ... \ -m 1024M,slots=4,maxmem=32G \ -numa node,nodeid=0 \ -numa node,mem=1024M,nodeid=1 \ -numa node,nodeid=2 \ -numa node,nodeid=3 \ Guest reports "No NUMA configuration found" and the NUMA topology is wrong. This is because when QEMU builds ACPI SRAT, it regards node 0 as the default node to deal with the memory hole(640K-1M). this means the node0 must have some memory(>1M), but, actually it can have no memory. Fix this problem by cut out the 640K hole in the same way the PCI 4G hole does. Signed-off-by: Eduardo Habkost <ehabkost@redhat.com> Signed-off-by: Dou Liyang <douly.fnst@cn.fujitsu.com> Message-Id: <1504231805-30957-2-git-send-email-douly.fnst@cn.fujitsu.com> Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>	2017-09-19 16:51:33 -03:00
Dou Liyang	f51878ba86	NUMA: Replace MAX_NODES with nb_numa_nodes in for loop In QEMU, the number of the NUMA nodes is determined by parse_numa_opts(). Then, QEMU uses it for iteration, for example: for (i = 0; i < nb_numa_nodes; i++) However, in memory_region_allocate_system_memory(), it uses MAX_NODES not nb_numa_nodes. So, replace MAX_NODES with nb_numa_nodes to keep code consistency and reduce the loop times. Signed-off-by: Dou Liyang <douly.fnst@cn.fujitsu.com> Message-Id: <1503387936-3483-1-git-send-email-douly.fnst@cn.fujitsu.com> Reviewed-by: Igor Mammedov <imammedo@redhat.com> Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org> Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>	2017-09-19 16:51:33 -03:00
Igor Mammedov	79e0793614	numa: cpu: calculate/set default node-ids after all -numa CLI options are parsed Calculating default node-ids for CPUs in possible_cpu_arch_ids() is rather fragile since defaults calculation uses nb_numa_nodes but callback might be potentially called early before all -numa CLI options are parsed, which would lead to cpus assigned only upto nb_numa_nodes at the time possible_cpu_arch_ids() is called. Issue was introduced by (`7c88e65` numa: mirror cpu to node mapping in MachineState::possible_cpus) and for example CLI: -smp 4 -numa node,cpus=0 -numa node would set props.node-id in possible_cpus array for every non explicitly mapped CPU to the first node. Issue is not visible to guest nor to mgmt interface due to 1) implictly mapped cpus are forced to the first node in case of partial mapping 2) in case of default mapping possible_cpu_arch_ids() is called after all -numa options are parsed (resulting in correct mapping). However it's fragile to rely on late execution of possible_cpu_arch_ids(), therefore add machine specific callback that returns node-id for CPU and use it to calculate/ set defaults at machine_numa_finish_init() time when all -numa options are parsed. Reported-by: Eduardo Habkost <ehabkost@redhat.com> Signed-off-by: Igor Mammedov <imammedo@redhat.com> Message-Id: <1496314408-163972-1-git-send-email-imammedo@redhat.com> Reviewed-by: Eduardo Habkost <ehabkost@redhat.com> Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>	2017-09-19 16:51:33 -03:00
Peter Maydell	c51700273a	Assorted s390x patches: - introduce virtio-gpu-ccw, with virtio-gpu endian fixes - lots of cleanup in the s390x code - make device_add work for s390x cpus - enable seccomp on s390x - an ivshmem endian fix - set the reserved DHCP client architecture id for netboot - fixes in the css and pci support -----BEGIN PGP SIGNATURE----- iQIcBAABAgAGBQJZwUhRAAoJEN7Pa5PG8C+vX78QAJCD4fhVgMb9cjsJTyYPn68e b5+SXRbHqb2eqd/QLQOPdWLpbJCsgu04HWa1oa+L3vM/HkMUQoDv7EzUBWLI3gxe rHHvRxSajRPTCCpDkr6YvKi1IZP0ZqgqOmBwHgDfPI3AzcVELUWlyubdLuwWdwZw /1F0VvMpYmWPnFAcR/el3kaSSqXRxR7dgsCnfAgLajNk0BrF3n2ZES88mbZ86glD gFuaFyiCAwpCoLaMjNkDBbwePqG4LnCbsqYcyet7MV0uryEj2FmmZXjPRxP5cN8i 1idIim7q+cbWRzJNCC/SQuJ6tl/ZrJ8cH3N8ok5s/K00E4++3CvcKaXj2ab83fDa oHZa+HvH1hpdKOjBN2lWcZccV0f4DPLMKXqvh4i/l58h7HvJtA86bXNqi1UD8alv 3HPhjrGyNZJ1kPOIQvuUbFHcxb0ILwObmSwD6VrrYFp6EsoIEGLwhnPlL9GG0sT+ 8e9DmT+AkXxM+3GtTSg1qL9QhF7ntJLRkpsKHc0WH8ESJoqzlHZV3BC5KB8s9OKE 4+7jC6+hx97ivwXbzPegK61l3quhTbldtDKmoJ57fSn5eDhasO3TrCCNNen80YyP L3iJuEEO0hRkcQLWBem1fpWpDgQzV1Li4DVuIk8ZldRDiUas4qtGdQ0WCSIvR1f/ TLPomp0p0rP2eoiazGCs =vUS6 -----END PGP SIGNATURE----- Merge remote-tracking branch 'remotes/cohuck/tags/s390x-20170919-v2' into staging Assorted s390x patches: - introduce virtio-gpu-ccw, with virtio-gpu endian fixes - lots of cleanup in the s390x code - make device_add work for s390x cpus - enable seccomp on s390x - an ivshmem endian fix - set the reserved DHCP client architecture id for netboot - fixes in the css and pci support # gpg: Signature made Tue 19 Sep 2017 17:39:45 BST # gpg: using RSA key 0xDECF6B93C6F02FAF # gpg: Good signature from "Cornelia Huck <conny@cornelia-huck.de>" # gpg: aka "Cornelia Huck <huckc@linux.vnet.ibm.com>" # gpg: aka "Cornelia Huck <cornelia.huck@de.ibm.com>" # gpg: aka "Cornelia Huck <cohuck@kernel.org>" # gpg: aka "Cornelia Huck <cohuck@redhat.com>" # Primary key fingerprint: C3D0 D66D C362 4FF6 A8C0 18CE DECF 6B93 C6F0 2FAF * remotes/cohuck/tags/s390x-20170919-v2: (38 commits) MAINTAINERS/s390x: add terminal3270.c virtio-ccw: Create a virtio gpu device for the ccw bus virtio-gpu: Handle endian conversion s390x/ccw: create s390 phb for compat reasons as well configure: Allow --enable-seccomp on s390x, too virtio-ccw: remove stale comments on endianness s390x: allow CPU hotplug in random core-id order s390x: generate sclp cpu information from possible_cpus s390x: get rid of cpu_s390x_create() s390x: get rid of cpu_states and use possible_cpus instead s390x: implement query-hotpluggable-cpus s390x: CPU hot unplug via device_del cannot work for now s390x: allow cpu hotplug via device_add s390x: print CPU definitions in sorted order target/s390x: rename next_cpu_id to next_core_id target/s390x: use "core-id" for cpu number/address/id handling target/s390x: set cpu->id for linux user when realizing s390x: allow only 1 CPU with TCG target/s390x: use program_interrupt() in per_check_exception() target/s390x: use trigger_pgm_exception() in s390_cpu_handle_mmu_fault() ... Signed-off-by: Peter Maydell <peter.maydell@linaro.org>	2017-09-19 18:08:48 +01:00
Christian Borntraeger	9d1c444921	MAINTAINERS/s390x: add terminal3270.c Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com> Acked-by: Halil Pasic <pasic@linux.vnet.ibm.com> Message-Id: <20170918130455.144262-1-borntraeger@de.ibm.com> Signed-off-by: Cornelia Huck <cohuck@redhat.com>	2017-09-19 18:31:32 +02:00
Farhan Ali	1f8ad88935	virtio-ccw: Create a virtio gpu device for the ccw bus Wire up the virtio-gpu device for the CCW bus. The virtio-gpu is a virtio-1 device, so disable revision 0. Signed-off-by: Farhan Ali <alifm@linux.vnet.ibm.com> Acked-by: Christian Borntraeger <borntraeger@de.ibm.com> Reviewed-by: Halil Pasic <pasic@linux.vnet.ibm.com> Reviewed-by: Thomas Huth <thuth@redhat.com> Message-Id: <6c53f939cf2d64b66d2a6878b29c9bf3820f3d5b.1505485574.git.alifm@linux.vnet.ibm.com> Signed-off-by: Cornelia Huck <cohuck@redhat.com>	2017-09-19 18:31:32 +02:00
Farhan Ali	1715d6b59c	virtio-gpu: Handle endian conversion Virtio GPU code currently only supports litte endian format, and so using the Virtio GPU device on a big endian machine does not work. Let's fix it by supporting the correct host cpu byte order. Signed-off-by: Farhan Ali <alifm@linux.vnet.ibm.com> Message-Id: <dc748e15f36db808f90b4f2393bc29ba7556a9f6.1505485574.git.alifm@linux.vnet.ibm.com> Reviewed-by: Gerd Hoffmann <kraxel@redhat.com> Signed-off-by: Cornelia Huck <cohuck@redhat.com>	2017-09-19 18:31:32 +02:00
Cornelia Huck	8ad9087c4a	s390x/ccw: create s390 phb for compat reasons as well `d32bd032d8` ("s390x/ccw: create s390 phb conditionally") made registering the s390 pci host bridge conditional on presense of the zpci facility bit. Sadly, that breaks migration from machines that did not use the cpu model (2.7 and previous). Create the s390 phb for pre-cpu model machines as well: We can tweak s390_has_feat() to always indicate the zpci facility bit when no cpu model is available (on 2.7 and previous compat machines). Fixes: `d32bd032d8` ("s390x/ccw: create s390 phb conditionally") Acked-by: Christian Borntraeger <borntraeger@de.ibm.com> Signed-off-by: Cornelia Huck <cohuck@redhat.com>	2017-09-19 18:31:32 +02:00
Thomas Huth	3aa35fcffc	configure: Allow --enable-seccomp on s390x, too libseccomp supports s390x since version 2.3.0, and I was able to start a VM with "-sandbox on" without any obvious problems by using this patch, so it should be safe to allow --enable-seccomp on s390x nowadays, too. Signed-off-by: Thomas Huth <thuth@redhat.com> Message-Id: <1505385363-27717-1-git-send-email-thuth@redhat.com> Acked-by: Christian Borntraeger <borntraeger@de.ibm.com> Acked-by: Eduardo Otubo <otubo@redhat.com> Acked-by: Halil Pasic <pasic@linux.vnet.ibm.com> Signed-off-by: Cornelia Huck <cohuck@redhat.com>	2017-09-19 18:31:32 +02:00
Halil Pasic	5ef5475868	virtio-ccw: remove stale comments on endianness We have two stale comments suggesting one should think about virtio config space endianness a bit longer. We have just done that, and came to the conclusion we are fine as is: it's the responsibility of the virtio device and not of the transport (and that is how it works now). Putting the responsibility into the transport isn't even possible, because the transport would have to know about the config space layout of each device. Let us remove the stale comments. Signed-off-by: Halil Pasic <pasic@linux.vnet.ibm.com> Suggested-by: Cornelia Huck <cohuck@redhat.com> Message-Id: <20170914105535.47941-1-pasic@linux.vnet.ibm.com> Signed-off-by: Cornelia Huck <cohuck@redhat.com>	2017-09-19 18:31:32 +02:00
David Hildenbrand	a1422723f7	s390x: allow CPU hotplug in random core-id order SCLP correctly indicates the core-id aka. CPU address for each available CPU. As the core-id corresponds to cpu_index, also a newly created kvm vcpu gets assigned this core-id as vcpu id. So SIGP in the kernel works correctly (it uses the vcpu id to lookup the correct CPU). So there should be nothing hindering us from hotplugging CPUs in random core-id order. This now makes sure that the output from "query-hotpluggable-cpus" is completely true. Until now, a specific order is implicit. Performance vice, hotplugging CPUs in non-sequential order might not be the best thing to do, as VCPU lookup inside KVM might be a little slower. But that doesn't hinder us from supporting it. next_core_id is now used by linux user only. Signed-off-by: David Hildenbrand <david@redhat.com> Message-Id: <20170913132417.24384-23-david@redhat.com> Reviewed-by: Igor Mammedov <imammedo@redhat.com> Signed-off-by: Cornelia Huck <cohuck@redhat.com>	2017-09-19 18:31:32 +02:00
David Hildenbrand	bb535bb67e	s390x: generate sclp cpu information from possible_cpus This is the first step to allow hot plugging of CPUs in a non-sequential order. If a cpu is available ("plugged") can directly be decided by looking at the cpu state pointer. This makes sure, that really only cpus attached to the machine are reported. Signed-off-by: David Hildenbrand <david@redhat.com> Message-Id: <20170913132417.24384-22-david@redhat.com> Signed-off-by: Cornelia Huck <cohuck@redhat.com>	2017-09-19 18:31:32 +02:00
David Hildenbrand	524d18d8bd	s390x: get rid of cpu_s390x_create() Now that there is only one user of cpu_s390x_create() left, make cpu creation look like on x86. - Perform the model/properties split and checks in s390_init_cpus() - Parse features only once without having to remember if already parsed - Pass only the typename to s390x_new_cpu() - Use the typename of an existing CPU for hotplug via cpu-add Acked-by: Igor Mammedov <imammedo@redhat.com> Signed-off-by: David Hildenbrand <david@redhat.com> Message-Id: <20170913132417.24384-21-david@redhat.com> Signed-off-by: Cornelia Huck <cohuck@redhat.com>	2017-09-19 18:31:32 +02:00
David Hildenbrand	2b44178d87	s390x: get rid of cpu_states and use possible_cpus instead Now that we have possible_cpus, we can get rid of the global variable and rewrite s390_cpu_addr2state() to use it. Signed-off-by: David Hildenbrand <david@redhat.com> Message-Id: <20170913132417.24384-20-david@redhat.com> Reviewed-by: Igor Mammedov <imammedo@redhat.com> Signed-off-by: Cornelia Huck <cohuck@redhat.com>	2017-09-19 18:31:32 +02:00
David Hildenbrand	4dc3b15188	s390x: implement query-hotpluggable-cpus CPU hotplug is only possible on a per core basis on s390x. So let's add possible_cpus and wire everything up properly. Signed-off-by: David Hildenbrand <david@redhat.com> Message-Id: <20170913132417.24384-19-david@redhat.com> Acked-by: Igor Mammedov <imammedo@redhat.com> Signed-off-by: Cornelia Huck <cohuck@redhat.com>	2017-09-19 18:31:32 +02:00
David Hildenbrand	f2f3beb004	s390x: CPU hot unplug via device_del cannot work for now device_del on a CPU will currently do nothing. Let's emit an error telling that this is will currently not work (there is no architecture support on s390x). Error message copied from ppc. (qemu) device_del cpu1 device_del cpu1 CPU hot unplug not supported on this machine Reviewed-by: Matthew Rosato <mjrosato@linux.vnet.ibm.com> Signed-off-by: David Hildenbrand <david@redhat.com> Message-Id: <20170913132417.24384-18-david@redhat.com> Signed-off-by: Cornelia Huck <cohuck@redhat.com>	2017-09-19 18:31:32 +02:00
David Hildenbrand	0347ab8469	s390x: allow cpu hotplug via device_add E.g. the following now works: device_add host-s390-cpu,id=cpu1,core-id=1 The system will perform the same checks as when using cpu_add: - If the core_id is already in use - If the next sequential core_id isn't used - If core-id >= max_cpu is specified In addition, mixed CPU models are checked. E.g. if starting with -cpu host and trying to hotplug "qemu-s390-cpu": "Mixed CPU models are not supported on s390x." Reviewed-by: Matthew Rosato <mjrosato@linux.vnet.ibm.com> Signed-off-by: David Hildenbrand <david@redhat.com> Message-Id: <20170913132417.24384-17-david@redhat.com> Signed-off-by: Cornelia Huck <cohuck@redhat.com>	2017-09-19 18:31:32 +02:00
David Hildenbrand	99aa6bf29b	s390x: print CPU definitions in sorted order Other architectures provide nicely sorted lists, let's do it similarly on s390x. While at it, clean up the code we have to touch either way. Acked-by: Christian Borntraeger <borntraeger@de.ibm.com> Signed-off-by: David Hildenbrand <david@redhat.com> Message-Id: <20170913132417.24384-16-david@redhat.com> Signed-off-by: Cornelia Huck <cohuck@redhat.com>	2017-09-19 18:31:32 +02:00

1 2 3 4 5 ...

56142 Commits