cur_map is not used anymore; instead, each AddressSpaceDispatch
has its own nodes/sections pair. The priorities of the
MemoryListeners, and in the future RCU, guarantee that the
nodes/sections are not freed while they are still in use.
(In fact, next_map itself is not needed except to free the data on the
next update).
To avoid incorrect use, replace cur_map with a temporary copy that
is only valid while the topology is being updated. If you use it,
the name prev_map makes it clear that you're doing something weird.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
After this patch, AddressSpaceDispatch holds a constistent tuple of
(phys_map, nodes, sections). This will be important when updates
of the topology will run concurrently with reads.
cur_map is not used anymore except for freeing it at the end of the
topology update.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
This same treatment previously done to phys_node_map and phys_sections
is now applied to the dispatch field of AddressSpace. Topology updates
use as->next_dispatch while accesses use as->dispatch.
Reviewed-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
This will help having two copies of AddressSpaceDispatch during the
recreation of the radix tree (one being built, and one that is complete
and will be protected by RCU). We do not want to have to unregister and
re-register the listener.
Reviewed-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Currently, phys_node_map and phys_sections are shared by all
of the AddressSpaceDispatch. When updating mem topology, all
AddressSpaceDispatch will rebuild dispatch tables sequentially
on them. In order to prepare for RCU access, leave the old
memory map alive while the next one is being accessed.
When rebuilding, the new dispatch tables will build and lookup
next_map; after all dispatch tables are rebuilt, we can switch
to next_* and free the previous table.
Based on a patch from Liu Ping Fan.
Signed-off-by: Liu Ping Fan <qemulist@gmail.com>
Reviewed-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Sections like phys_section_unassigned always have fixed address
in phys_sections. Declared as macro, so we can use them
when having more than one phys_sections array.
Signed-off-by: Liu Ping Fan <pingfank@linux.vnet.ibm.com>
Signed-off-by: Liu Ping Fan <qemulist@gmail.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Object reference counts will soon be changed outside the BQL. So we need
to use atomics in object_ref/unref.
Based on a patch by Liu Ping Fan.
Signed-off-by: Liu Ping Fan <qemulist@gmail.com>
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
With this change, a FlatView can be used even after a concurrent
update has replaced it. Because we do not yet have RCU, we use a
mutex to protect the small critical sections that read/write the
as->current_map pointer. Accesses to the FlatView can be done
outside the mutex.
If a MemoryRegion will be used after the FlatView is unref-ed (or after
a MemoryListener callback is returned), a reference has to be added to
that MemoryRegion. memory_region_find already does it for the region
that it returns. The same will be done for address_space_translate
as soon as the dispatch tree is also converted to RCU-style.
Reviewed-by: Anthony Liguori <aliguori@us.ibm.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
This is the first step towards converting as->current_map to
RCU-style updates, where the FlatView updates run concurrently
with uses of an old FlatView.
Reviewed-by: Anthony Liguori <aliguori@us.ibm.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
We will soon require accesses to as->current_map to be placed under
a lock (with reference counting so as to keep the critical section
small). To simplify this change, always fetch as->current_map into
a local variable and access it through that variable.
Reviewed-by: Anthony Liguori <aliguori@us.ibm.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
We're already using them in several places, but __sync builtins are just
too ugly to type, and do not provide seqcst load/store operations.
Reviewed-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
We are using the same struct name for two devices. 8250 is widespread
enough that this causes some confusion, rename the other instance.
Reviewed-by: Andreas Faerber <afaerber@suse.de>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
The iothread mutex might be released between map and unmap, so the
mapped region might disappear.
Reviewed-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
First of all, rename "todo" to "done".
Second, clearly separate the case of done == 0 with the case of done != 0.
This will help handling reference counting in the next patch.
Third, this test:
if (memory_region_get_ram_addr(mr) + xlat != raddr + todo) {
does not guarantee that the memory region is the same across two iterations
of the while loop. For example, you could have two blocks:
A) size 640 K, mapped at physical address 0, ram_addr_t 0
B) size 64 K, mapped at physical address 0xa0000, ram_addr_t 0xa0000
then mapping 1 M starting at physical address zero will erroneously treat
B as the continuation of block A. qemu_ram_ptr_length ensures that no
invalid memory is accessed, but it is still a pointless complication of
the algorithm. The patch makes the logic clearer with an explicit test
that the memory region is the same.
Reviewed-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
After the next patch it would not be used elsewhere anyway. Also,
the _nofail and the standard versions of this function return different
things, which is confusing. Removing the function from the public headers
limits the confusion.
Reviewed-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Add ref/unref calls at the following places:
- places where memory regions are stashed by a listener and
used outside the BQL (including in Xen or KVM).
- memory_region_find callsites
- creation of aliases and containers (only the aliased/contained
region gets a reference to avoid loops)
- around calls to del_subregion/add_subregion, where the region
could disappear after the first call
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
This new API will avoid having too many memory_region_ref/unref
in paths that currently use memory_region_find.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Whenever memory regions are accessed outside the BQL, they need to be
preserved against hot-unplug. MemoryRegions actually do not have their
own reference count; they piggyback on a QOM object, their "owner".
The owner is set at creation time, and there is a function to retrieve
the owner.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Do not bother visiting the radix tree when an address space is destroyed.
After the previous patch, this has become a pointless exercise. When
called from address_space_destroy_dispatch, all you're doing is zeroing
out a structure that will be freed as soon as you come back. When called
from mem_begin, when phys_page_set_level will call phys_map_node_alloc the
radix tree's array will be zeroed too.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
phys_sections_clear is invoked after the dispatch tree has been
destroyed. This leaves a window where phys_sections_nb > 0 but the
subpages are not valid anymore, which is a recipe for use-after-free
bugs.
Move the destruction of subpages in phys_sections_clear. We will
still destroy the subpages when an address space is cleaned up,
because address_space_destroy will clear as->root and commit the
change before it calls address_space_destroy_dispatch.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
This decouples memory.h from ioport.h, concentrating all portio related
types in a single header.
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
In case the latter may vanish one day, make sure the vmport read handler
type will remain unaffected. This is also conceptually cleaner.
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>