mirrors/qemu - qemu - SynapseOS git

Author	SHA1	Message	Date
Emilio G. Cota	0ac20318ce	tcg: remove tb_lock Use mmap_lock in user-mode to protect TCG state and the page descriptors. In !user-mode, each vCPU has its own TCG state, so no locks needed. Per-page locks are used to protect the page descriptors. Per-TB locks are used in both modes to protect TB jumps. Some notes: - tb_lock is removed from notdirty_mem_write by passing a locked page_collection to tb_invalidate_phys_page_fast. - tcg_tb_lookup/remove/insert/etc have their own internal lock(s), so there is no need to further serialize access to them. - do_tb_flush is run in a safe async context, meaning no other vCPU threads are running. Therefore acquiring mmap_lock there is just to please tools such as thread sanitizer. - Not visible in the diff, but tb_invalidate_phys_page already has an assert_memory_lock. - cpu_io_recompile is !user-only, so no mmap_lock there. - Added mmap_unlock()'s before all siglongjmp's that could be called in user-mode while mmap_lock is held. + Added an assert for !have_mmap_lock() after returning from the longjmp in cpu_exec, just like we do in cpu_exec_step_atomic. Performance numbers before/after: Host: AMD Opteron(tm) Processor 6376 ubuntu 17.04 ppc64 bootup+shutdown time 700 +-+--+----+------+------------+-----------+--------------+-+ \| + + + + + B \| \| before *B* ** * \| \|tb lock removal ###D### * \| 600 +-+ * +-+ \| ** # \| \| B #D \| \| *** * ## \| 500 +-+ *** ### +-+ \| * *** ### \| \| B # ## \| \| ** * #D# \| 400 +-+ ## +-+ \| ### \| \| ## \| \| # ## \| 300 +-+ * B* #D# +-+ \| B *** ### \| \| * ** #### \| \| * *** ### \| 200 +-+ B B #D# +-+ \| #B * ## # \| \| #* ## \| \| + D##D# + + + + \| 100 +-+--+----+------+------------+-----------+------------+--+-+ 1 8 16 Guest CPUs 48 64 png: https://imgur.com/HwmBHXe debian jessie aarch64 bootup+shutdown time 90 +-+--+-----+-----+------------+------------+------------+--+-+ \| + + + + + + \| \| before *B* B \| 80 +tb lock removal ###D### D +-+ \| ### \| \| ## \| 70 +-+ # +-+ \| ## \| \| # \| 60 +-+ B ## +-+ \| * ## \| \| * #D \| 50 +-+ * ## +-+ \| * ### \| \| B* ### \| 40 +-+ ** # ## +-+ \| #D# \| \| B* ### \| 30 +-+ B*B #### +-+ \| B * * # ### \| \| B ###D# \| 20 +-+ D ##D## +-+ \| D# \| \| + + + + + + \| 10 +-+--+-----+-----+------------+------------+------------+--+-+ 1 8 16 Guest CPUs 48 64 png: https://imgur.com/iGpGFtv The gains are high for 4-8 CPUs. Beyond that point, however, unrelated lock contention significantly hurts scalability. Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Signed-off-by: Emilio G. Cota <cota@braap.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2018-06-15 08:18:48 -10:00
Peter Maydell	6d7b9a6c3b	Make memory_region_access_valid() take a MemTxAttrs argument As part of plumbing MemTxAttrs down to the IOMMU translate method, add MemTxAttrs as an argument to memory_region_access_valid(). Its callers either have an attrs value to hand, or don't care and can use MEMTXATTRS_UNSPECIFIED. The callsite in flatview_access_valid() is part of a recursive loop flatview_access_valid() -> memory_region_access_valid() -> subpage_accepts() -> flatview_access_valid(); we make it pass MEMTXATTRS_UNSPECIFIED for now, until the next several commits have plumbed an attrs parameter through the rest of the loop and we can add an attrs parameter to flatview_access_valid(). Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Message-id: 20180521140402.23318-8-peter.maydell@linaro.org	2018-05-31 16:32:35 +01:00
Paolo Bonzini	48564041a7	exec: reintroduce MemoryRegion caching MemoryRegionCache was reverted to "normal" address_space_* operations for 2.9, due to lack of support for IOMMUs. Reinstate the optimizations, caching only the IOMMU translation at address_cache_init but not the IOMMU lookup and target AddressSpace translation are not cached; now that MemoryRegionCache supports IOMMUs, it becomes more widely applicable too. The inlined fast path is defined in memory_ldst_cached.inc.h, while the slow path uses memory_ldst.inc.c as before. The smaller fast path causes a little code size reduction in MemoryRegionCache users: hw/virtio/virtio.o text size before: 32373 hw/virtio/virtio.o text size after: 31941 Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2018-05-09 00:13:38 +02:00
Paolo Bonzini	785a507ec7	memory: inline some performance-sensitive accessors These accessors are called from inlined functions, and the call sequence is much more expensive than just inlining the access. Move the struct declaration to memory-internal.h so that exec.c and memory.c can both use an inline function. Reviewed-by: Alexey Kardashevskiy <aik@ozlabs.ru> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2018-03-06 14:01:27 +01:00
Peter Maydell	9d70618c68	memory-internal.h: Remove obsolete claim that header is obsolete The memory-internal.h header claims that it is for "obsolete exec.c functions" which "will be removed soon". This statement was added in 2011, six years ago, but the header is still here. (Admittedly none of the prototypes added in commit `67d95c153b` are still in the header.) It's convenient to have a place to put prototypes for functions which are used internally to the various .c files of the memory system or by the accel/tcg code, which is inevitably fairly closely coupled. So keep the header but update the comments to reflect what we're actually using it for. Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Message-Id: <1511276888-17834-1-git-send-email-peter.maydell@linaro.org> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2018-02-05 18:09:45 +01:00
Peter Maydell	2726627197	exec.c: Factor out before/after actions for notdirty memory writes The function notdirty_mem_write() has a sequence of actions it has to do before and after the actual business of writing data to host RAM to ensure that dirty flags are correctly updated and we flush any TCG translations for the region. We need to do this also in other places that write directly to host RAM, most notably the TCG atomic helper functions. Pull out the before and after pieces into their own functions. We use an API where the prepare function stashes the various bits of information about the write into a struct for the complete function to use, because in the calls for the atomic helpers the place where the complete function will be called doesn't have the information to hand. Cc: qemu-stable@nongnu.org Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Reviewed-by: Paolo Bonzini <pbonzini@redhat.com> Message-id: 1511201308-23580-2-git-send-email-peter.maydell@linaro.org	2017-11-21 12:09:25 +00:00
Alexey Kardashevskiy	5e8fd947e2	memory: Rework "info mtree" to print flat views and dispatch trees This adds a new "-d" switch to "info mtree" to print dispatch tree internals. This changes the way "-f" is handled - it prints now flat views and associated address spaces. Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru> Message-Id: <20170921085110.25598-15-aik@ozlabs.ru> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2017-09-21 23:19:38 +02:00
Alexey Kardashevskiy	8629d3fcb7	memory: Rename mem_begin/mem_commit/mem_add helpers This renames some helpers to reflect better what they do. This should cause no behavioural change. Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru> Message-Id: <20170921085110.25598-9-aik@ozlabs.ru> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2017-09-21 23:19:37 +02:00
Alexey Kardashevskiy	166206845f	memory: Switch memory from using AddressSpace to FlatView FlatView's will be shared between AddressSpace's and subpage_t and MemoryRegionSection cannot store AS anymore, hence this change. In particular, for: typedef struct subpage_t { MemoryRegion iomem; - AddressSpace as; + FlatView fv; hwaddr base; uint16_t sub_section[]; } subpage_t; struct MemoryRegionSection { MemoryRegion mr; - AddressSpace address_space; + FlatView *fv; hwaddr offset_within_region; Int128 size; hwaddr offset_within_address_space; bool readonly; }; This should cause no behavioural change. Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru> Message-Id: <20170921085110.25598-7-aik@ozlabs.ru> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2017-09-21 23:19:37 +02:00
Alexey Kardashevskiy	66a6df1dc6	memory: Move AddressSpaceDispatch from AddressSpace to FlatView As we are going to share FlatView's between AddressSpace's, and AddressSpaceDispatch is a structure to perform quick lookup in FlatView, this moves ASD to FlatView. After previosly open coded ASD rendering, we can also remove as->next_dispatch as the new FlatView pointer is stored on a stack and set to an AS atomically. flatview_destroy() is executed under RCU instead of address_space_dispatch_free() now. This makes mem_begin/mem_commit to work with ASD and mem_add with FV as later on mem_add will be taking FV as an argument anyway. This should cause no behavioural change. Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru> Message-Id: <20170921085110.25598-5-aik@ozlabs.ru> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2017-09-21 23:19:37 +02:00
Alexey Kardashevskiy	9a62e24f45	memory: Open code FlatView rendering We are going to share FlatView's between AddressSpace's and per-AS memory listeners won't suit the purpose anymore so open code the dispatch tree rendering. Since there is a good chance that dispatch_listener was the only listener, this avoids address_space_update_topology_pass() if there is no registered listeners; this should improve starting time. This should cause no behavioural change. Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru> Message-Id: <20170921085110.25598-3-aik@ozlabs.ru> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2017-09-21 23:19:37 +02:00
Paolo Bonzini	6e48e8f9e0	memory: unregister AddressSpace MemoryListener within BQL address_space_destroy_dispatch is called from an RCU callback and hence outside the iothread mutex (BQL). However, after address_space_destroy no new accesses can hit the destroyed AddressSpace so it is not necessary to observe changes to the memory map. Move the memory_listener_unregister call earlier, to make it thread-safe again. Reported-by: Alex Williamson <alex.williamson@redhat.com> Fixes: `374f2981d1` Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Alex Williamson <alex.williamson@redhat.com>	2015-02-10 10:25:44 -07:00
Juan Quintela	220c3ebddb	memory: split cpu_physical_memory_* functions to its own include All the functions that use ram_addr_t should be here. Signed-off-by: Juan Quintela <quintela@redhat.com> Reviewed-by: Orit Wasserman <owasserm@redhat.com>	2014-01-13 14:04:54 +01:00
Juan Quintela	a2f4d5bef2	memory: make cpu_physical_memory_reset_dirty() take a length parameter We have an end parameter in all the callers, and this make it coherent with the rest of cpu_physical_memory_* functions, that also take a length parameter. Once here, move the start/end calculation to tlb_reset_dirty_range_all() as we don't need it here anymore. Signed-off-by: Juan Quintela <quintela@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Orit Wasserman <owasserm@redhat.com>	2014-01-13 14:04:54 +01:00
Juan Quintela	a2cd8c852d	memory: s/dirty/clean/ in cpu_physical_memory_is_dirty() All uses except one really want the other meaning. Signed-off-by: Juan Quintela <quintela@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Orit Wasserman <owasserm@redhat.com>	2014-01-13 14:04:54 +01:00
Juan Quintela	a461e389f4	memory: cpu_physical_memory_clear_dirty_range() now uses bitmap operations We were clearing a range of bits, so use bitmap_clear(). Signed-off-by: Juan Quintela <quintela@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Orit Wasserman <owasserm@redhat.com>	2014-01-13 14:04:54 +01:00
Juan Quintela	5b9a3a5f77	memory: cpu_physical_memory_set_dirty_range() now uses bitmap operations We were setting a range of bits, so use bitmap_set(). Note: xen has always been wrong, and should have used start instead of addr from the beginning. Signed-off-by: Juan Quintela <quintela@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Orit Wasserman <owasserm@redhat.com>	2014-01-13 14:04:54 +01:00
Juan Quintela	1bafff0c7c	memory: use find_next_bit() to find dirty bits This operation is way faster than doing it bit by bit. Signed-off-by: Juan Quintela <quintela@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Orit Wasserman <owasserm@redhat.com>	2014-01-13 14:04:54 +01:00
Juan Quintela	ace694cccc	memory: s/mask/clear/ cpu_physical_memory_mask_dirty_range Now all functions use the same wording that bitops/bitmap operations Signed-off-by: Juan Quintela <quintela@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Orit Wasserman <owasserm@redhat.com>	2014-01-13 14:04:54 +01:00
Juan Quintela	94833c896d	memory: cpu_physical_memory_get_dirty() is used as returning a bool Signed-off-by: Juan Quintela <quintela@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Orit Wasserman <owasserm@redhat.com>	2014-01-13 14:04:54 +01:00
Juan Quintela	9f2c43e41a	memory: make cpu_physical_memory_get_dirty() the main function And make cpu_physical_memory_get_dirty_flag() to use it. It used to be the other way around. Signed-off-by: Juan Quintela <quintela@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Orit Wasserman <owasserm@redhat.com>	2014-01-13 14:04:54 +01:00
Juan Quintela	c1427a3f84	memory: unfold cpu_physical_memory_set_dirty_flag() Signed-off-by: Juan Quintela <quintela@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Orit Wasserman <owasserm@redhat.com>	2014-01-13 14:04:54 +01:00
Juan Quintela	4f13bb80a2	memory: unfold cpu_physical_memory_set_dirty() in its only user Signed-off-by: Juan Quintela <quintela@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Orit Wasserman <owasserm@redhat.com>	2014-01-13 14:04:54 +01:00
Juan Quintela	86a49582db	memory: unfold cpu_physical_memory_clear_dirty_flag() in its only user Signed-off-by: Juan Quintela <quintela@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Orit Wasserman <owasserm@redhat.com>	2014-01-13 14:04:54 +01:00
Juan Quintela	1ab4c8ceaa	memory: split dirty bitmap into three After all the previous patches, spliting the bitmap gets direct. Note: For some reason, I have to move DIRTY_MEMORY_* definitions to the beginning of memory.h to make compilation work. Signed-off-by: Juan Quintela <quintela@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Orit Wasserman <owasserm@redhat.com>	2014-01-13 14:04:54 +01:00
Juan Quintela	e8a97cafc4	memory: cpu_physical_memory_clear_dirty_flag() result is never used Signed-off-by: Juan Quintela <quintela@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Orit Wasserman <owasserm@redhat.com>	2014-01-13 14:04:54 +01:00
Juan Quintela	7a5b558c9d	memory: make sure that client is always inside range Signed-off-by: Juan Quintela <quintela@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Orit Wasserman <owasserm@redhat.com>	2014-01-13 14:04:54 +01:00
Juan Quintela	5215919291	memory: cpu_physical_memory_mask_dirty_range() always clears a single flag Document it Signed-off-by: Juan Quintela <quintela@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Orit Wasserman <owasserm@redhat.com>	2014-01-13 14:04:54 +01:00
Juan Quintela	75218e7f2b	memory: cpu_physical_memory_set_dirty_range() always dirty all flags So remove the flag argument and do it directly. After this change, there is nothing else using cpu_physical_memory_set_dirty_flags() so remove it. Signed-off-by: Juan Quintela <quintela@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Orit Wasserman <owasserm@redhat.com>	2014-01-13 14:04:53 +01:00
Juan Quintela	63995cebfa	memory: set single dirty flags when possible Signed-off-by: Juan Quintela <quintela@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Orit Wasserman <owasserm@redhat.com>	2014-01-13 14:04:53 +01:00
Juan Quintela	36187e2ca0	memory: all users of cpu_physical_memory_get_dirty used only one flag So cpu_physical_memory_get_dirty_flags is not needed anymore Signed-off-by: Juan Quintela <quintela@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Orit Wasserman <owasserm@redhat.com>	2014-01-13 14:04:53 +01:00
Juan Quintela	4f08cabe9e	memory: make cpu_physical_memory_is_dirty return bool Signed-off-by: Juan Quintela <quintela@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Orit Wasserman <owasserm@redhat.com>	2014-01-13 14:04:53 +01:00
Juan Quintela	7e5609a85e	exec: create function to get a single dirty bit Signed-off-by: Juan Quintela <quintela@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Orit Wasserman <owasserm@redhat.com>	2014-01-13 14:04:53 +01:00
Juan Quintela	a1390db4df	memory: create function to set a single dirty bit Signed-off-by: Juan Quintela <quintela@redhat.com> Reviewed-by: Orit Wasserman <owasserm@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com>	2014-01-13 14:04:53 +01:00
Juan Quintela	e2da99d582	memory: cpu_physical_memory_set_dirty_flags() result is never used So return void. Signed-off-by: Juan Quintela <quintela@redhat.com> Reviewed-by: Orit Wasserman <owasserm@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com>	2014-01-13 14:04:53 +01:00
Jan Kiszka	b40acf99be	ioport: Switch dispatching to memory core layer The current ioport dispatcher is a complex beast, mostly due to the need to deal with old portio interface users. But we can overcome it without converting all portio users by embedding the required base address of a MemoryRegionPortio access into that data structure. That removes the need to have the additional MemoryRegionIORange structure in the loop on every access. To handle old portio memory ops, we simply install dispatching handlers for portio memory regions when registering them with the memory core. This removes the need for the old_portio field. We can drop the additional aliasing of ioport regions and also the special address space listener. cpu_in and cpu_out now simply call address_space_read/write. And we can concentrate portio handling in a single source file. Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2013-07-04 17:42:44 +02:00
Paolo Bonzini	1db8abb102	memory: move private types to exec.c Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2013-06-20 16:32:46 +02:00
Paolo Bonzini	d2702032b4	memory: export memory_region_access_valid to exec.c We'll use it to implement address_space_access_valid. Reviewed-by: Richard Henderson <rth@twiddle.net> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2013-05-29 16:27:11 +02:00
Paolo Bonzini	d197063fcf	memory: move unassigned_mem_ops to memory.c reservation_ops is already doing the same thing. Reviewed-by: Richard Henderson <rth@twiddle.net> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2013-05-29 16:26:56 +02:00
Paolo Bonzini	ee983cb3cc	exec: make qemu_get_ram_ptr private It is a private interface between exec.c and memory.c. Reviewed-by: Peter Maydell <peter.maydell@linaro.org> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2013-05-24 18:42:21 +02:00
Paolo Bonzini	c72dd2d04b	exec: remove useless declarations from memory-internal.h Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2013-04-15 18:19:26 +02:00
Paolo Bonzini	0d09e41a51	hw: move headers to include/ Many of these should be cleaned up with proper qdev-/QOM-ification. Right now there are many catch-all headers in include/hw/ARCH depending on cpu.h, and this makes it necessary to compile these files per-target. However, fixing this does not belong in these patches. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2013-04-08 18:13:10 +02:00
Paolo Bonzini	022c62cbbc	exec: move include files to include/exec/ Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2012-12-19 08:31:31 +01:00

43 Commits