Commit Graph

1124 Commits

Author SHA1 Message Date
Zhang Yi
2ac0f1621c util/mmap-alloc: Add a 'is_pmem' parameter to qemu_ram_mmap
besides the existing 'shared' flags, we are going to add
'is_pmem' to qemu_ram_mmap(), which indicated the memory backend
file is a persist memory.

Signed-off-by: Haozhong Zhang <haozhong.zhang@intel.com>
Signed-off-by: Zhang Yi <yi.z.zhang@linux.intel.com>
Reviewed-by: Pankaj Gupta <pagupta@redhat.com>
Message-Id: <786c46862cfeb253ee0ea2f44d62ffe76edb7fa4.1549555521.git.yi.z.zhang@linux.intel.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Pankaj Gupta <pagupta@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
2019-04-25 14:17:36 -03:00
Eduardo Habkost
5b863f3e2f cpu: Fix crash with empty -cpu option
Fix the following crash:

  $ qemu-system-x86_64 -cpu ''
  qemu-system-x86_64: qom/cpu.c:291: cpu_class_by_name: \
      Assertion `cpu_model && cc->class_by_name' failed.

Regression test script included.

Fixes: 99193d8f2e ("cpu: drop unnecessary NULL check and cpu_common_class_by_name()")
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Message-Id: <20190418034501.5038-1-ehabkost@redhat.com>
Reviewed-by: Stefano Garzarella <sgarzare@redhat.com>
Tested-by: Stefano Garzarella <sgarzare@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
2019-04-25 14:17:35 -03:00
Eduardo Habkost
c1c8cfe5f9 cpu: Rename parse_cpu_model() to parse_cpu_option()
The "model[,option...]" string parsed by the function is not just
a CPU model.  Rename the function and its argument to indicate it
expects the full "-cpu" option to be provided.

Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Message-Id: <20190417025944.16154-2-ehabkost@redhat.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
2019-04-25 14:17:35 -03:00
David Hildenbrand
905b7ee4d6 exec: Introduce qemu_maxrampagesize() and rename qemu_getrampagesize()
Rename qemu_getrampagesize() to qemu_minrampagesize(). While at it,
properly rename find_max_supported_pagesize() to
find_min_backend_pagesize().

s390x is actually interested into the maximum ram pagesize, so
introduce and use qemu_maxrampagesize().

Add a TODO, indicating that looking at any mapped memory backends is not
100% correct in some cases.

Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20190417113143.5551-3-david@redhat.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
2019-04-25 13:47:27 +02:00
Markus Armbruster
90c84c5600 qom/cpu: Simplify how CPUClass:cpu_dump_state() prints
CPUClass method dump_statistics() takes an fprintf()-like callback and
a FILE * to pass to it.  Most callers pass fprintf() and stderr.
log_cpu_state() passes fprintf() and qemu_log_file.
hmp_info_registers() passes monitor_fprintf() and the current monitor
cast to FILE *.  monitor_fprintf() casts it right back, and is
otherwise identical to monitor_printf().

The callback gets passed around a lot, which is tiresome.  The
type-punning around monitor_fprintf() is ugly.

Drop the callback, and call qemu_fprintf() instead.  Also gets rid of
the type-punning, since qemu_fprintf() takes NULL instead of the
current monitor cast to FILE *.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Message-Id: <20190417191805.28198-15-armbru@redhat.com>
2019-04-18 22:18:59 +02:00
Markus Armbruster
b6b71cb5c6 memory: Clean up how mtree_info() prints
mtree_info() takes an fprintf()-like callback and a FILE * to pass to
it, and so do its helper functions.  Passing around callback and
argument is rather tiresome.

Its only caller hmp_info_mtree() passes monitor_printf() cast to
fprintf_function and the current monitor cast to FILE *.

The type-punning is technically undefined behaviour, but works in
practice.  Clean up: drop the callback, and call qemu_printf()
instead.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Message-Id: <20190417191805.28198-9-armbru@redhat.com>
2019-04-18 22:18:59 +02:00
David Gibson
7d5489e6d1 exec: Only count mapped memory backends for qemu_getrampagesize()
qemu_getrampagesize() works out the minimum host page size backing any of
guest RAM.  This is required in a few places, such as for POWER8 PAPR KVM
guests, because limitations of the hardware virtualization mean the guest
can't use pagesizes larger than the host pages backing its memory.

However, it currently checks against *every* memory backend, whether or not
it is actually mapped into guest memory at the moment.  This is incorrect.

This can cause a problem attempting to add memory to a POWER8 pseries KVM
guest which is configured to allow hugepages in the guest (e.g.
-machine cap-hpt-max-page-size=16m).  If you attempt to add non-hugepage,
you can (correctly) create a memory backend, however it (correctly) will
throw an error when you attempt to map that memory into the guest by
'device_add'ing a pc-dimm.

What's not correct is that if you then reset the guest a startup check
against qemu_getrampagesize() will cause a fatal error because of the new
memory object, even though it's not mapped into the guest.

This patch corrects the problem by adjusting find_max_supported_pagesize()
(called from qemu_getrampagesize() via object_child_foreach) to exclude
non-mapped memory backends.

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Acked-by: David Hildenbrand <david@redhat.com>
2019-03-29 14:24:08 +11:00
Wei Yang
494d199727 exec.c: refactor function flatview_add_to_dispatch()
flatview_add_to_dispatch() registers page based on the condition of
*section*, which may looks like this:

    |s|PPPPPPP|s|

where s stands for subpage and P for page.

The procedure of this function could be described as:

    - register first subpage
    - register page
    - register last subpage

This means the procedure could be simplified into these three steps
instead of a loop iteration.

This patch refactors the function into three corresponding steps and
adds some comment to clarify it.

Signed-off-by: Wei Yang <richardw.yang@linux.intel.com>
Message-Id: <20190311054252.6094-1-richardw.yang@linux.intel.com>
[Paolo: move exit before adjustment of remain.offset_within_*,
 otherwise int128_get64 fails when a region is 2^64 bytes long]
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2019-03-11 16:51:42 +01:00
Yury Kotov
fbd162e629 migration: Add an ability to ignore shared RAM blocks
If ignore-shared capability is set then skip shared RAMBlocks during the
RAM migration.
Also, move qemu_ram_foreach_migratable_block (and rename) to the
migration code, because it requires access to the migration capabilities.

Signed-off-by: Yury Kotov <yury-kotov@yandex-team.ru>
Message-Id: <20190215174548.2630-4-yury-kotov@yandex-team.ru>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
2019-03-06 10:49:17 +00:00
Yury Kotov
754cb9c0eb exec: Change RAMBlockIterFunc definition
Currently, qemu_ram_foreach_* calls RAMBlockIterFunc with many
block-specific arguments. But often iter func needs RAMBlock*.
This refactoring is needed for fast access to RAMBlock flags from
qemu_ram_foreach_block's callback. The only way to achieve this now
is to call qemu_ram_block_from_host (which also enumerates blocks).

So, this patch reduces complexity of
qemu_ram_foreach_block() -> cb() -> qemu_ram_block_from_host()
from O(n^2) to O(n).

Fix RAMBlockIterFunc definition and add some functions to read
RAMBlock* fields witch were passed.

Signed-off-by: Yury Kotov <yury-kotov@yandex-team.ru>
Message-Id: <20190215174548.2630-2-yury-kotov@yandex-team.ru>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
2019-03-06 10:49:17 +00:00
Li Zhijian
0c249ff71c unify len and addr type for memory/address APIs
Some address/memory APIs have different type between
'hwaddr/target_ulong addr' and 'int len'. It is very unsafe, especially
some APIs will be passed a non-int len by caller which might cause
overflow quietly.
Below is an potential overflow case:
    dma_memory_read(uint32_t len)
      -> dma_memory_rw(uint32_t len)
        -> dma_memory_rw_relaxed(uint32_t len)
          -> address_space_rw(int len) # len overflow

CC: Paolo Bonzini <pbonzini@redhat.com>
CC: Peter Crosthwaite <crosthwaite.peter@gmail.com>
CC: Richard Henderson <rth@twiddle.net>
CC: Peter Maydell <peter.maydell@linaro.org>
CC: Stefano Garzarella <sgarzare@redhat.com>
Signed-off-by: Li Zhijian <lizhijian@cn.fujitsu.com>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Stefano Garzarella <sgarzare@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2019-02-05 16:50:18 +01:00
Murilo Opsfelder Araujo
53adb9d43e mmap-alloc: fix hugetlbfs misaligned length in ppc64
The commit 7197fb4058 ("util/mmap-alloc:
fix hugetlb support on ppc64") fixed Huge TLB mappings on ppc64.

However, we still need to consider the underlying huge page size
during munmap() because it requires that both address and length be a
multiple of the underlying huge page size for Huge TLB mappings.
Quote from "Huge page (Huge TLB) mappings" paragraph under NOTES
section of the munmap(2) manual:

  "For munmap(), addr and length must both be a multiple of the
  underlying huge page size."

On ppc64, the munmap() in qemu_ram_munmap() does not work for Huge TLB
mappings because the mapped segment can be aligned with the underlying
huge page size, not aligned with the native system page size, as
returned by getpagesize().

This has the side effect of not releasing huge pages back to the pool
after a hugetlbfs file-backed memory device is hot-unplugged.

This patch fixes the situation in qemu_ram_mmap() and
qemu_ram_munmap() by considering the underlying page size on ppc64.

After this patch, memory hot-unplug releases huge pages back to the
pool.

Fixes: 7197fb4058
Signed-off-by: Murilo Opsfelder Araujo <muriloo@linux.ibm.com>
Reviewed-by: Greg Kurz <groug@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2019-02-04 18:44:20 +11:00
Peter Maydell
5601be3b01 exec.c: Don't reallocate IOMMUNotifiers that are in use
The tcg_register_iommu_notifier() code has a GArray of
TCGIOMMUNotifier structs which it has registered by passing
memory_region_register_iommu_notifier() a pointer to the embedded
IOMMUNotifier field. Unfortunately, if we need to enlarge the
array via g_array_set_size() this can cause a realloc(), which
invalidates the pointer that memory_region_register_iommu_notifier()
put into the MemoryRegion's iommu_notify list. This can result
in segfaults.

Switch the GArray to holding pointers to the TCGIOMMUNotifier
structs, so that we can individually allocate and free them.

Cc: qemu-stable@nongnu.org
Fixes: 1f871c5e6b ("exec.c: Handle IOMMUs in address_space_translate_for_iotlb()")
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20190128174241.5860-1-peter.maydell@linaro.org
2019-02-01 14:55:45 +00:00
Stefan Hajnoczi
047be4ed24 memory: add memory_region_flush_rom_device()
ROM devices go via MemoryRegionOps->write() callbacks for write
operations and do not dirty/invalidate that memory.  Device emulation
must be able to mark memory ranges that have been modified internally
(e.g. using memory_region_get_ram_ptr()).

Introduce the memory_region_flush_rom_device() API for this purpose.

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-id: 20190123212234.32068-2-stefanha@redhat.com
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
[PMM: fix block comment style]
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2019-01-29 11:46:04 +00:00
Peter Maydell
ea7a5330b7 exec.c: Use correct attrs in cpu_memory_rw_debug()
In the softmmu version of cpu_memory_rw_debug(), we ask the
CPU for the attributes to use for the virtual memory access,
and we correctly use those to identify the address space
index. However, we were not passing them in to the
address_space_write_rom() and address_space_rw() functions.

The effect of this was that a memory access from the gdbstub
to a device which had behaviour that was sensitive to the
memory attributes (such as some ARMv8M NVIC registers) was
incorrectly always performed as if non-secure, rather than
using the right security state for the CPU's current state.

Fixes: https://bugs.launchpad.net/qemu/+bug/1812091

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Stefano Garzarella <sgarzare@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Message-id: 20190117133834.7480-1-peter.maydell@linaro.org
2019-01-29 11:46:04 +00:00
Paolo Bonzini
f481ee2d5e qemu/queue.h: typedef QTAILQ heads
This will be needed when we change the QTAILQ head and elem structs
to unions.  However, it is also consistent with the usage elsewhere
in QEMU for other list head structs (see for example FsMountList).

Note that most QTAILQs only need their name in order to do backwards
walks.  Those do not break with the struct->union change, and anyway
the change will also remove the need to name heads when doing backwards
walks, so those are not touched here.

Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2019-01-11 15:46:55 +01:00
Paolo Bonzini
b58deb344d qemu/queue.h: leave head structs anonymous unless necessary
Most list head structs need not be given a name.  In most cases the
name is given just in case one is going to use QTAILQ_LAST, QTAILQ_PREV
or reverse iteration, but this does not apply to lists of other kinds,
and even for QTAILQ in practice this is only rarely needed.  In addition,
we will soon reimplement those macros completely so that they do not
need a name for the head struct.  So clean up everything, not giving a
name except in the rare case where it is necessary.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2019-01-11 15:46:55 +01:00
Peter Maydell
3c8133f973 Rename cpu_physical_memory_write_rom() to address_space_write_rom()
The API of cpu_physical_memory_write_rom() is odd, because it
takes an AddressSpace, unlike all the other cpu_physical_memory_*
access functions. Rename it to address_space_write_rom(), and
bring its API into line with address_space_write().

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Message-id: 20181122133507.30950-3-peter.maydell@linaro.org
2018-12-14 13:30:48 +00:00
Peter Maydell
75693e1411 exec.c: Rename cpu_physical_memory_write_rom_internal()
Rename cpu_physical_memory_write_rom_internal() to
address_space_write_rom_internal(), and make it take
MemTxAttrs and return a MemTxResult. This brings its
API into line with address_space_write().

This is an internal function to exec.c; fixing its API
will allow us to change the global function
cpu_physical_memory_write_rom().

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Message-id: 20181122133507.30950-2-peter.maydell@linaro.org
2018-12-14 13:30:48 +00:00
Emilio G. Cota
5005e2537d exec: introduce tlb_init
Paves the way for the addition of a per-TLB lock.

Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Emilio G. Cota <cota@braap.org>
Message-Id: <20181009174557.16125-4-cota@braap.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2018-10-18 18:58:10 -07:00
Thomas Huth
c95ac10340 cpu: Provide a proper prototype for target_words_bigendian() in a header
We've got three places already that provide a prototype for this
function in a .c file - that's ugly. Let's provide a proper prototype
in a header instead, with a proper description why this function should
not be used in most cases.

Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Reviewed-by: Laszlo Ersek <lersek@redhat.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
2018-10-17 08:41:43 +02:00
Hikaru Nishida
d5dbde4645 hostmem-file: make available memory-backend-file on POSIX-based hosts
Before this change, memory-backend-file object is valid for Linux hosts
only because hostmem-file.c is compiled only on Linux hosts.
However, other POSIX-based hosts (such as macOS) can support
memory-backend-file object in the same way as on Linux hosts.
This patch makes hostmem-file.c and related functions to be compiled on
all POSIX-based hosts to make available memory-backend-file on them.

Signed-off-by: Hikaru Nishida <hikarupsp@gmail.com>
Message-Id: <20180924123205.29651-1-hikarupsp@gmail.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2018-10-02 19:09:13 +02:00
Peter Maydell
55f4e79d79 pc: fixes
This includes nvdimm persistence fixes queued before the release.
 
 Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
 -----BEGIN PGP SIGNATURE-----
 
 iQEcBAABAgAGBQJbepoTAAoJECgfDbjSjVRpLioH/3BPps8FLh4x2gZSq3B+u72O
 RYUA3I3TilEGyc9yf8o7e1Hf+pQAJBEmulcnKxXFVWZIJ1GVLPt4NZCMQGiPDnJL
 +RCT/Q64PUy09hRjddAasikrvXa4YOsRgBgJJToO7v9PSQSaU3fC7O3hNea7KcF/
 C4SSqkUgxyDhCCYHHblpKxFz/wtwy4ZaCGSdozIdmKNPJ6/ye8wOQ1Mq9e1Mwp18
 S6ilJub5IwB6aM2KVMmX4AFomF4u2cn153ts8fI+Dyo4/NE6P4+viDlz3BOBKdzm
 kmd49h6/n4Lenoo4oI1yNHSuIJJTVfvnoLu6rG7mPbQKgxNd1uN4KuUIygU5PCY=
 =Xcaj
 -----END PGP SIGNATURE-----

Merge remote-tracking branch 'remotes/mst/tags/for_upstream' into staging

pc: fixes

This includes nvdimm persistence fixes queued before the release.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>

# gpg: Signature made Mon 20 Aug 2018 11:38:11 BST
# gpg:                using RSA key 281F0DB8D28D5469
# gpg: Good signature from "Michael S. Tsirkin <mst@kernel.org>"
# gpg:                 aka "Michael S. Tsirkin <mst@redhat.com>"
# Primary key fingerprint: 0270 606B 6F3C DF3D 0B17  0970 C350 3912 AFBE 8E67
#      Subkey fingerprint: 5D09 FD08 71C8 F85B 94CA  8A0D 281F 0DB8 D28D 5469

* remotes/mst/tags/for_upstream:
  migration/ram: ensure write persistence on loading all data to PMEM.
  migration/ram: Add check and info message to nvdimm post copy.
  mem/nvdimm: ensure write persistence to PMEM in label emulation
  hostmem-file: add the 'pmem' option
  configure: add libpmem support
  memory, exec: switch file ram allocation functions to 'flags' parameters
  memory, exec: Expose all memory block related flags.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2018-08-21 10:23:53 +01:00
Peter Maydell
55a7cb144d accel/tcg: Check whether TLB entry is RAM consistently with how we set it up
We set up TLB entries in tlb_set_page_with_attrs(), where we have
some logic for determining whether the TLB entry is considered
to be RAM-backed, and thus has a valid addend field. When we
look at the TLB entry in get_page_addr_code(), we use different
logic for determining whether to treat the page as RAM-backed
and use the addend field. This is confusing, and in fact buggy,
because the code in tlb_set_page_with_attrs() correctly decides
that rom_device memory regions not in romd mode are not RAM-backed,
but the code in get_page_addr_code() thinks they are RAM-backed.
This typically results in "Bad ram pointer" assertion if the
guest tries to execute from such a memory region.

Fix this by making get_page_addr_code() just look at the
TLB_MMIO bit in the code_address field of the TLB, which
tlb_set_page_with_attrs() sets if and only if the addend
field is not valid for code execution.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Tested-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Message-id: 20180713150945.12348-1-peter.maydell@linaro.org
2018-08-14 17:17:19 +01:00
Junyan He
a4de8552b2 hostmem-file: add the 'pmem' option
When QEMU emulates vNVDIMM labels and migrates vNVDIMM devices, it
needs to know whether the backend storage is a real persistent memory,
in order to decide whether special operations should be performed to
ensure the data persistence.

This boolean option 'pmem' allows users to specify whether the backend
storage of memory-backend-file is a real persistent memory. If
'pmem=on', QEMU will set the flag RAM_PMEM in the RAM block of the
corresponding memory region. If 'pmem' is set while lack of libpmem
support, a error is generated.

Signed-off-by: Junyan He <junyan.he@intel.com>
Signed-off-by: Haozhong Zhang <haozhong.zhang@intel.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2018-08-10 13:29:39 +03:00
Junyan He
cbfc017103 memory, exec: switch file ram allocation functions to 'flags' parameters
As more flag parameters besides the existing 'share' are going to be
added to following functions
memory_region_init_ram_from_file
qemu_ram_alloc_from_fd
qemu_ram_alloc_from_file
let's switch them to use the 'flags' parameters so as to ease future
flag additions.

The existing 'share' flag is converted to the RAM_SHARED bit in ram_flags,
and other flag bits are ignored by above functions right now.

Signed-off-by: Junyan He <junyan.he@intel.com>
Signed-off-by: Haozhong Zhang <haozhong.zhang@intel.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
2018-08-10 13:29:39 +03:00
Junyan He
b0e5de9381 memory, exec: Expose all memory block related flags.
We need to use these flags in other files rather than just in exec.c,
For example, RAM_SHARED should be used when create a ram block from file.
We expose them the exec/memory.h

Signed-off-by: Junyan He <junyan.he@intel.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2018-08-10 13:29:39 +03:00
Paolo Bonzini
c40d479207 tcg: simplify !CONFIG_TCG handling of tb_invalidate_*
There is no need for a stub, since tb_invalidate_phys_addr can be excised
altogether when TCG is disabled.  This is a bit cleaner since it avoids
using code that is clearly specific to user-mode emulation (it calls
mmap_lock/unlock) for the !CONFIG_TCG case.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2018-07-02 15:41:18 +02:00
Philippe Mathieu-Daudé
646f34fa54 tcg: Fix --disable-tcg build breakage
Fix the --disable-tcg breakage introduced by 8bca9a03ec:

    $ configure --disable-tcg
    [...]
    $ make -C i386-softmmu exec.o
    make: Entering directory 'i386-softmmu'
      CC      exec.o
    In file included from source/qemu/exec.c:62:0:
    source/qemu/include/exec/ram_addr.h:96:6: error: conflicting types for ‘tb_invalidate_phys_range’
     void tb_invalidate_phys_range(ram_addr_t start, ram_addr_t end);
          ^~~~~~~~~~~~~~~~~~~~~~~~
    In file included from source/qemu/exec.c:24:0:
    source/qemu/include/exec/exec-all.h:309:6: note: previous declaration of ‘tb_invalidate_phys_range’ was here
     void tb_invalidate_phys_range(target_ulong start, target_ulong end);
          ^~~~~~~~~~~~~~~~~~~~~~~~
    source/qemu/exec.c:1043:6: error: conflicting types for ‘tb_invalidate_phys_addr’
     void tb_invalidate_phys_addr(AddressSpace *as, hwaddr addr, MemTxAttrs attrs)
          ^~~~~~~~~~~~~~~~~~~~~~~
    In file included from source/qemu/exec.c:24:0:
    source/qemu/include/exec/exec-all.h:308:6: note: previous declaration of ‘tb_invalidate_phys_addr’ was here
     void tb_invalidate_phys_addr(target_ulong addr);
          ^~~~~~~~~~~~~~~~~~~~~~~
    make: *** [source/qemu/rules.mak:69: exec.o] Error 1
    make: Leaving directory 'i386-softmmu'

Tested to build x86_64-softmmu and i386-softmmu targets.

Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Message-id: 20180629200710.27626-1-f4bug@amsat.org
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2018-07-02 13:42:05 +01:00
David Hildenbrand
61362b71c1 exec: check that alignment is a power of two
Right now we can crash QEMU using e.g.

qemu-system-x86_64 -m 256M,maxmem=20G,slots=2 \
 -object memory-backend-file,id=mem0,size=12288,mem-path=/dev/zero,align=12288 \
 -device pc-dimm,id=dimm1,memdev=mem0

qemu-system-x86_64: util/mmap-alloc.c:115:
 qemu_ram_mmap: Assertion `is_power_of_2(align)' failed

Fix this by adding a proper check.

Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20180607154705.6316-3-david@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2018-06-28 19:05:31 +02:00
Paolo Bonzini
8bca9a03ec move public invalidate APIs out of translate-all.{c,h}, clean up
Place them in exec.c, exec-all.h and ram_addr.h.  This removes
knowledge of translate-all.h (which is an internal header) from
several files outside accel/tcg and removes knowledge of
AddressSpace from translate-all.c (as it only operates on ram_addr_t).

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2018-06-28 19:05:30 +02:00
Eric Auger
a99761d3c8 exec: Fix MAP_RAM for cached access
When an IOMMUMemoryRegion is in front of a virtio device,
address_space_cache_init does not set cache->ptr as the memory
region is not RAM. However when the device performs an access,
we end up in glue() which performs the translation and then uses
MAP_RAM. This latter uses the unset ptr and returns a wrong value
which leads to a SIGSEV in address_space_lduw_internal_cached_slow,
for instance.

In slow path cache->ptr is NULL and MAP_RAM must redirect to
qemu_map_ram_ptr((mr)->ram_block, ofs).

As MAP_RAM, IS_DIRECT and INVALIDATE are the same in _cached_slow
and non cached mode, let's remove those macros.

This fixes the use cases featuring vIOMMU (Intel and ARM SMMU)
which lead to a SIGSEV.

Fixes: 48564041a7 (exec: reintroduce MemoryRegion caching)
Signed-off-by: Eric Auger <eric.auger@redhat.com>

Message-Id: <1528895946-28677-1-git-send-email-eric.auger@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2018-06-28 19:05:30 +02:00
David Hildenbrand
c136180c90 postcopy: drop ram_pages parameter from postcopy_ram_incoming_init()
Not needed. Don't expose last_ram_page().

Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20180620202736.21399-1-david@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
2018-06-27 13:28:31 +02:00
Emilio G. Cota
f28d0dfdce tcg: fix --disable-tcg build breakage
Fix the --disable-tcg breakage introduced by tb_lock's removal by
relying on the fact that tcg_enabled() is set to 0 at
compile-time under --disable-tcg.

While at it, add further asserts to fix builds that enable both
--disable-tcg and --enable-debug, which were broken even before
tb_lock's removal.

Tested to build x86_64-softmmu and i386-softmmu targets.

Reported-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Emilio G. Cota <cota@braap.org>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2018-06-22 18:55:24 +01:00
Emilio G. Cota
0ac20318ce tcg: remove tb_lock
Use mmap_lock in user-mode to protect TCG state and the page descriptors.
In !user-mode, each vCPU has its own TCG state, so no locks needed.
Per-page locks are used to protect the page descriptors.

Per-TB locks are used in both modes to protect TB jumps.

Some notes:

- tb_lock is removed from notdirty_mem_write by passing a
  locked page_collection to tb_invalidate_phys_page_fast.

- tcg_tb_lookup/remove/insert/etc have their own internal lock(s),
  so there is no need to further serialize access to them.

- do_tb_flush is run in a safe async context, meaning no other
  vCPU threads are running. Therefore acquiring mmap_lock there
  is just to please tools such as thread sanitizer.

- Not visible in the diff, but tb_invalidate_phys_page already
  has an assert_memory_lock.

- cpu_io_recompile is !user-only, so no mmap_lock there.

- Added mmap_unlock()'s before all siglongjmp's that could
  be called in user-mode while mmap_lock is held.
  + Added an assert for !have_mmap_lock() after returning from
    the longjmp in cpu_exec, just like we do in cpu_exec_step_atomic.

Performance numbers before/after:

Host: AMD Opteron(tm) Processor 6376

                 ubuntu 17.04 ppc64 bootup+shutdown time

  700 +-+--+----+------+------------+-----------+------------*--+-+
      |    +    +      +            +           +           *B    |
      |         before ***B***                            ** *    |
      |tb lock removal ###D###                         ***        |
  600 +-+                                           ***         +-+
      |                                           **         #    |
      |                                        *B*          #D    |
      |                                     *** *         ##      |
  500 +-+                                ***           ###      +-+
      |                             * ***           ###           |
      |                            *B*          # ##              |
      |                          ** *          #D#                |
  400 +-+                      **            ##                 +-+
      |                      **           ###                     |
      |                    **           ##                        |
      |                  **         # ##                          |
  300 +-+  *           B*          #D#                          +-+
      |    B         ***        ###                               |
      |    *       **       ####                                  |
      |     *   ***      ###                                      |
  200 +-+   B  *B     #D#                                       +-+
      |     #B* *   ## #                                          |
      |     #*    ##                                              |
      |    + D##D#     +            +           +            +    |
  100 +-+--+----+------+------------+-----------+------------+--+-+
           1    8      16      Guest CPUs       48           64
  png: https://imgur.com/HwmBHXe

              debian jessie aarch64 bootup+shutdown time

  90 +-+--+-----+-----+------------+------------+------------+--+-+
     |    +     +     +            +            +            +    |
     |         before ***B***                                B    |
  80 +tb lock removal ###D###                              **D  +-+
     |                                                   **###    |
     |                                                 **##       |
  70 +-+                                             ** #       +-+
     |                                             ** ##          |
     |                                           **  #            |
  60 +-+                                       *B  ##           +-+
     |                                       **  ##               |
     |                                    ***  #D                 |
  50 +-+                               ***   ##                 +-+
     |                             * **   ###                     |
     |                           **B*  ###                        |
  40 +-+                     ****  # ##                         +-+
     |                   ****     #D#                             |
     |             ***B**      ###                                |
  30 +-+    B***B**        ####                                 +-+
     |    B *   *     # ###                                       |
     |     B       ###D#                                          |
  20 +-+   D  ##D##                                             +-+
     |      D#                                                    |
     |    +     +     +            +            +            +    |
  10 +-+--+-----+-----+------------+------------+------------+--+-+
          1     8     16      Guest CPUs        48           64
  png: https://imgur.com/iGpGFtv

The gains are high for 4-8 CPUs. Beyond that point, however, unrelated
lock contention significantly hurts scalability.

Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Emilio G. Cota <cota@braap.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2018-06-15 08:18:48 -10:00
Peter Maydell
1f871c5e6b exec.c: Handle IOMMUs in address_space_translate_for_iotlb()
Currently we don't support board configurations that put an IOMMU
in the path of the CPU's memory transactions, and instead just
assert() if the memory region fonud in address_space_translate_for_iotlb()
is an IOMMUMemoryRegion.

Remove this limitation by having the function handle IOMMUs.
This is mostly straightforward, but we must make sure we have
a notifier registered for every IOMMU that a transaction has
passed through, so that we can flush the TLB appropriately
when any of the IOMMUs change their mappings.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Message-id: 20180604152941.20374-5-peter.maydell@linaro.org
2018-06-15 15:23:34 +01:00
Peter Maydell
2c91bcf273 iommu: Add IOMMU index argument to translate method
Add an IOMMU index argument to the translate method of
IOMMUs. Since all of our current IOMMU implementations
support only a single IOMMU index, this has no effect
on the behaviour.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Message-id: 20180604152941.20374-4-peter.maydell@linaro.org
2018-06-15 15:23:34 +01:00
Peter Maydell
6d3ede5410 exec.c: Use stn_p() and ldn_p() instead of explicit switches
Now we have stn_p() and ldn_p() we can use them in various
functions in exec.c that used to have their own switch-on-size code.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20180611171007.4165-4-peter.maydell@linaro.org
2018-06-15 15:23:34 +01:00
Peter Maydell
22672c6075 exec.c: Don't accidentally sign-extend 4-byte loads in subpage_read()
In subpage_read() we perform a load of the data into a local buffer
which we then access using ldub_p(), lduw_p(), ldl_p() or ldq_p()
depending on its size, storing the result into the uint64_t *data.
Since ldl_p() returns an 'int', this means that for the 4-byte
case we will sign-extend the data, whereas for 1 and 2 byte
reads we zero-extend it.

This ought not to matter since the caller will likely ignore values in
the high bytes of the data, but add a cast so that we're consistent.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20180611171007.4165-3-peter.maydell@linaro.org
2018-06-15 15:23:34 +01:00
Peter Maydell
2d54f19401 cputlb: Pass cpu_transaction_failed() the correct physaddr
The API for cpu_transaction_failed() says that it takes the physical
address for the failed transaction. However we were actually passing
it the offset within the target MemoryRegion. We don't currently
have any target CPU implementations of this hook that require the
physical address; fix this bug so we don't get confused if we ever
do add one.

Suggested-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20180611125633.32755-3-peter.maydell@linaro.org
2018-06-15 15:23:34 +01:00
Peter Maydell
b74588a493 migration/next for 20180604
-----BEGIN PGP SIGNATURE-----
 
 iQIcBAABCAAGBQJbFLygAAoJEPSH7xhYctcjszUP/j7/21sHXwPyB5y4CQHsivLL
 18MnXTkjgSlue3fuPKI9ZzD2VZq19AtwarLuW3IFf15adKfeV0hhZsIMUmu0XTAz
 jgol8YYgoYlk3/on8Ux8PWV9RWKHh7vWAF7hjwJOnI5CFF7gXz6kVjUx03A7RSnB
 1a92lvgl5P8AbK5HZ/i08nOaAxCUWIVIoQeZf3BlZ5WdJE0j+d0Mvp/S8NEq5VPD
 P6oYJZy64QFGotmV1vtYD4G/Hqe8XBw2LoSVgjPSmQpd1m4CXiq0iyCxtC8HT+iq
 0P0eHATRglc5oGhRV6Mi5ixH2K2MvpshdyMKVAbkJchKBHn4k3m6jefhnUXuucwC
 fq9x7VIYzQ80S+44wJri683yUWP3Qmbb6mWYBb1L0jgbTkY1wx431IYMtDVv5q4R
 oXtbe77G6lWKHH0SFrU8TY/fqJvBwpat+QwOoNsluHVjzAzpOpcVokB8pLj592/1
 bHkDwvv7i+B37muOF3PFy++BoliwpRiNbVC5Btw4mnbOoY+AeJHMRBuhJHU6EyJv
 GOFiQ5w5XEXEu98giNqHcc0AC8w7iTRGI472UzpS0AG20wpc1/2jF0yM99stauwD
 oIu3MU4dQlz6QRy8KFnJShqjYNYuOs59USLiYM79ABa9J+JxJxxTAaqRbUFojQR6
 KN2i0QQJ+6MO4Skjbu/3
 =K0eJ
 -----END PGP SIGNATURE-----

Merge remote-tracking branch 'remotes/juanquintela/tags/migration/20180604' into staging

migration/next for 20180604

# gpg: Signature made Mon 04 Jun 2018 05:14:24 BST
# gpg:                using RSA key F487EF185872D723
# gpg: Good signature from "Juan Quintela <quintela@redhat.com>"
# gpg:                 aka "Juan Quintela <quintela@trasno.org>"
# Primary key fingerprint: 1899 FF8E DEBF 58CC EE03  4B82 F487 EF18 5872 D723

* remotes/juanquintela/tags/migration/20180604:
  migration: not wait RDMA_CM_EVENT_DISCONNECTED event after rdma_disconnect
  migration: remove unnecessary variables len in QIOChannelRDMA
  migration: Don't activate block devices if using -S
  migration: discard non-migratable RAMBlocks
  migration: introduce decompress-error-check

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2018-06-04 12:54:00 +01:00
Cédric Le Goater
b895de5027 migration: discard non-migratable RAMBlocks
On the POWER9 processor, the XIVE interrupt controller can control
interrupt sources using MMIO to trigger events, to EOI or to turn off
the sources. Priority management and interrupt acknowledgment is also
controlled by MMIO in the presenter sub-engine.

These MMIO regions are exposed to guests in QEMU with a set of 'ram
device' memory mappings, similarly to VFIO, and the VMAs are populated
dynamically with the appropriate pages using a fault handler.

But, these regions are an issue for migration. We need to discard the
associated RAMBlocks from the RAM state on the source VM and let the
destination VM rebuild the memory mappings on the new host in the
post_load() operation just before resuming the system.

To achieve this goal, the following introduces a new RAMBlock flag
RAM_MIGRATABLE which is updated in the vmstate_register_ram() and
vmstate_unregister_ram() routines. This flag is then used by the
migration to identify RAMBlocks to discard on the source. Some checks
are also performed on the destination to make sure nothing invalid was
sent.

This change impacts the boston, malta and jazz mips boards for which
migration compatibility is broken.

Signed-off-by: Cédric Le Goater <clg@kaod.org>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
2018-06-04 05:46:15 +02:00
Peter Maydell
afd76ffba9 * Linux header upgrade (Peter)
* firmware.json definition (Laszlo)
 * IPMI migration fix (Corey)
 * QOM improvements (Alexey, Philippe, me)
 * Memory API cleanups (Jay, me, Tristan, Peter)
 * WHPX fixes and improvements (Lucian)
 * Chardev fixes (Marc-André)
 * IOMMU documentation improvements (Peter)
 * Coverity fixes (Peter, Philippe)
 * Include cleanup (Philippe)
 * -clock deprecation (Thomas)
 * Disable -sandbox unless CONFIG_SECCOMP (Yi Min Zhao)
 * Configurability improvements (me)
 -----BEGIN PGP SIGNATURE-----
 
 iQFIBAABCAAyFiEE8TM4V0tmI4mGbHaCv/vSX3jHroMFAlsRd2UUHHBib256aW5p
 QHJlZGhhdC5jb20ACgkQv/vSX3jHroPG8Qf+M85E8xAQ/bhs90tAymuXkUUsTIFF
 uI76K8eM0K3b2B+vGckxh1gyN5O3GQaMEDL7vITfqbX+EOH5U2lv8V9JRzf2YvbG
 Zahjd4pOCYzR0b9JENA1r5U/J8RntNrBNXlKmGTaXOaw9VCXlZyvgVd9CE3z/e2M
 0jSXMBdF4LB3UzECI24Va8ejJxdSiJcqXA2j3J+pJFxI698i+Z5eBBKnRdo5TVe5
 jl0TYEsbS6CLwhmbLXmt3Qhq+ocZn7YH9X3HjkHEdqDUeYWyT9jwUpa7OHFrIEKC
 ikWm9er4YDzG/vOC0dqwKbShFzuTpTJuMz5Mj4v8JjM/iQQFrp4afjcW2g==
 =RS/B
 -----END PGP SIGNATURE-----

Merge remote-tracking branch 'remotes/bonzini/tags/for-upstream' into staging

* Linux header upgrade (Peter)
* firmware.json definition (Laszlo)
* IPMI migration fix (Corey)
* QOM improvements (Alexey, Philippe, me)
* Memory API cleanups (Jay, me, Tristan, Peter)
* WHPX fixes and improvements (Lucian)
* Chardev fixes (Marc-André)
* IOMMU documentation improvements (Peter)
* Coverity fixes (Peter, Philippe)
* Include cleanup (Philippe)
* -clock deprecation (Thomas)
* Disable -sandbox unless CONFIG_SECCOMP (Yi Min Zhao)
* Configurability improvements (me)

# gpg: Signature made Fri 01 Jun 2018 17:42:13 BST
# gpg:                using RSA key BFFBD25F78C7AE83
# gpg: Good signature from "Paolo Bonzini <bonzini@gnu.org>"
# gpg:                 aka "Paolo Bonzini <pbonzini@redhat.com>"
# Primary key fingerprint: 46F5 9FBD 57D6 12E7 BFD4  E2F7 7E15 100C CD36 69B1
#      Subkey fingerprint: F133 3857 4B66 2389 866C  7682 BFFB D25F 78C7 AE83

* remotes/bonzini/tags/for-upstream: (56 commits)
  hw: make virtio devices configurable via default-configs/
  hw: allow compiling out SCSI
  memory: Make operations using MemoryRegionIoeventfd struct pass by pointer.
  char: Remove unwanted crlf conversion
  qdev: Remove DeviceClass::init() and ::exit()
  qdev: Simplify the SysBusDeviceClass::init path
  hw/i2c: Use DeviceClass::realize instead of I2CSlaveClass::init
  hw/i2c/smbus: Use DeviceClass::realize instead of SMBusDeviceClass::init
  target/i386/kvm.c: Remove compatibility shim for KVM_HINTS_REALTIME
  Update Linux headers to 4.17-rc6
  target/i386/kvm.c: Handle renaming of KVM_HINTS_DEDICATED
  scripts/update-linux-headers: Handle kernel license no longer being one file
  scripts/update-linux-headers: Handle __aligned_u64
  virtio-gpu-3d: Define VIRTIO_GPU_CAPSET_VIRGL2 elsewhere
  gdbstub: Prevent fd leakage
  docs/interop: add "firmware.json"
  ipmi: Use proper struct reference for KCS vmstate
  vmstate: Add a VSTRUCT type
  tcg: remove softfloat from --disable-tcg builds
  qemu-options: Mark the non-functional -clock option as deprecated
  ...

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2018-06-01 18:24:16 +01:00
Peter Maydell
8347c18506 exec.c: Initialize sa_flags passed to sigaction()
Coverity points out that in the user-only version of cpu_abort() we
call sigaction() with a partially initialized struct sigaction
(CID 1005351). Correct the omission.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Message-Id: <20180515182700.31736-1-peter.maydell@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2018-06-01 15:13:46 +02:00
Peter Maydell
2f7b009c2e Make address_space_translate_iommu take a MemTxAttrs argument
As part of plumbing MemTxAttrs down to the IOMMU translate method,
add MemTxAttrs as an argument to address_space_translate_iommu().

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20180521140402.23318-14-peter.maydell@linaro.org
2018-05-31 16:32:35 +01:00
Peter Maydell
49e14aa827 Make flatview_do_translate() take a MemTxAttrs argument
As part of plumbing MemTxAttrs down to the IOMMU translate method,
add MemTxAttrs as an argument to flatview_do_translate().

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20180521140402.23318-13-peter.maydell@linaro.org
2018-05-31 16:32:35 +01:00
Peter Maydell
7446eb07c1 Make address_space_get_iotlb_entry() take a MemTxAttrs argument
As part of plumbing MemTxAttrs down to the IOMMU translate method,
add MemTxAttrs as an argument to address_space_get_iotlb_entry().

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20180521140402.23318-12-peter.maydell@linaro.org
2018-05-31 16:32:35 +01:00
Peter Maydell
efa99a2ff8 Make flatview_translate() take a MemTxAttrs argument
As part of plumbing MemTxAttrs down to the IOMMU translate method,
add MemTxAttrs as an argument to flatview_translate(); all its
callers now have attrs available.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20180521140402.23318-11-peter.maydell@linaro.org
2018-05-31 16:32:35 +01:00
Peter Maydell
eace72b7a6 Make flatview_access_valid() take a MemTxAttrs argument
As part of plumbing MemTxAttrs down to the IOMMU translate method,
add MemTxAttrs as an argument to flatview_access_valid().
Its callers now all have an attrs value to hand, so we can
correct our earlier temporary use of MEMTXATTRS_UNSPECIFIED.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20180521140402.23318-10-peter.maydell@linaro.org
2018-05-31 16:32:35 +01:00
Peter Maydell
8372d38327 Make MemoryRegion valid.accepts callback take a MemTxAttrs argument
As part of plumbing MemTxAttrs down to the IOMMU translate method,
add MemTxAttrs as an argument to the MemoryRegion valid.accepts
callback. We'll need this for subpage_accepts().

We could take the approach we used with the read and write
callbacks and add new a new _with_attrs version, but since there
are so few implementations of the accepts hook we just change
them all.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20180521140402.23318-9-peter.maydell@linaro.org
2018-05-31 16:32:35 +01:00
Peter Maydell
6d7b9a6c3b Make memory_region_access_valid() take a MemTxAttrs argument
As part of plumbing MemTxAttrs down to the IOMMU translate method,
add MemTxAttrs as an argument to memory_region_access_valid().
Its callers either have an attrs value to hand, or don't care
and can use MEMTXATTRS_UNSPECIFIED.

The callsite in flatview_access_valid() is part of a recursive
loop flatview_access_valid() -> memory_region_access_valid() ->
 subpage_accepts() -> flatview_access_valid(); we make it pass
MEMTXATTRS_UNSPECIFIED for now, until the next several commits
have plumbed an attrs parameter through the rest of the loop
and we can add an attrs parameter to flatview_access_valid().

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20180521140402.23318-8-peter.maydell@linaro.org
2018-05-31 16:32:35 +01:00
Peter Maydell
53d0790dfe Make flatview_extend_translation() take a MemTxAttrs argument
As part of plumbing MemTxAttrs down to the IOMMU translate method,
add MemTxAttrs as an argument to flatview_extend_translation().
Its callers either have an attrs value to hand, or don't care
and can use MEMTXATTRS_UNSPECIFIED.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20180521140402.23318-7-peter.maydell@linaro.org
2018-05-31 16:32:35 +01:00
Peter Maydell
fddffa4268 Make address_space_access_valid() take a MemTxAttrs argument
As part of plumbing MemTxAttrs down to the IOMMU translate method,
add MemTxAttrs as an argument to address_space_access_valid().
Its callers either have an attrs value to hand, or don't care
and can use MEMTXATTRS_UNSPECIFIED.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20180521140402.23318-6-peter.maydell@linaro.org
2018-05-31 16:32:35 +01:00
Peter Maydell
f26404fbee Make address_space_map() take a MemTxAttrs argument
As part of plumbing MemTxAttrs down to the IOMMU translate method,
add MemTxAttrs as an argument to address_space_map().
Its callers either have an attrs value to hand, or don't care
and can use MEMTXATTRS_UNSPECIFIED.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20180521140402.23318-5-peter.maydell@linaro.org
2018-05-31 16:32:35 +01:00
Peter Maydell
bc6b1cec84 Make address_space_translate{, _cached}() take a MemTxAttrs argument
As part of plumbing MemTxAttrs down to the IOMMU translate method,
add MemTxAttrs as an argument to address_space_translate()
and address_space_translate_cached(). Callers either have an
attrs value to hand, or don't care and can use MEMTXATTRS_UNSPECIFIED.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20180521140402.23318-4-peter.maydell@linaro.org
2018-05-31 14:50:52 +01:00
Peter Maydell
c874dc4f5e Make tb_invalidate_phys_addr() take a MemTxAttrs argument
As part of plumbing MemTxAttrs down to the IOMMU translate method,
add MemTxAttrs as an argument to tb_invalidate_phys_addr().
Its callers either have an attrs value to hand, or don't care
and can use MEMTXATTRS_UNSPECIFIED.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Message-id: 20180521140402.23318-3-peter.maydell@linaro.org
2018-05-31 14:50:52 +01:00
Paolo Bonzini
48564041a7 exec: reintroduce MemoryRegion caching
MemoryRegionCache was reverted to "normal" address_space_* operations
for 2.9, due to lack of support for IOMMUs.  Reinstate the
optimizations, caching only the IOMMU translation at address_cache_init
but not the IOMMU lookup and target AddressSpace translation are not
cached; now that MemoryRegionCache supports IOMMUs, it becomes more widely
applicable too.

The inlined fast path is defined in memory_ldst_cached.inc.h, while the
slow path uses memory_ldst.inc.c as before.  The smaller fast path causes
a little code size reduction in MemoryRegionCache users:

    hw/virtio/virtio.o text size before: 32373
    hw/virtio/virtio.o text size after: 31941

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2018-05-09 00:13:38 +02:00
Paolo Bonzini
a411c84b56 exec: extract address_space_translate_iommu, fix page_mask corner case
This will be used to process IOMMUs in a MemoryRegionCache.  This
includes a small bugfix, in that the returned page_mask is now
correctly -1 if the IOMMU memory region maps the entire address
space directly.  Previously, address_space_get_iotlb_entry would
return ~TARGET_PAGE_MASK.

Reviewed-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2018-05-09 00:13:38 +02:00
Paolo Bonzini
ad2804d9e4 exec: small changes to flatview_do_translate
Prepare for extracting the IOMMU part to a separate function.  Mostly
cosmetic; the only semantic change is that, if there is more than one
cascaded IOMMU and the second one fails to translate, *plen_out is now
adjusted according to the page mask of the first IOMMU.

Reviewed-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2018-05-09 00:13:38 +02:00
David Gibson
2b10808539 Add host_memory_backend_pagesize() helper
There are a couple places (one generic, one target specific) where we need
to get the host page size associated with a particular memory backend.  I
have some upcoming code which will add another place which wants this.  So,
for convenience, add a helper function to calculate this.

host_memory_backend_pagesize() returns the host pagesize for a given
HostMemoryBackend object.

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Greg Kurz <groug@kaod.org>
Acked-by: Paolo Bonzini <pbonzini@redhat.com>
2018-04-27 18:05:22 +10:00
David Gibson
0de6e2a3ca Make qemu_mempath_getpagesize() accept NULL
qemu_mempath_getpagesize() gets the effective (host side) page size for
a block of memory backed by an mmap()ed file on the host.  It requires
the mem_path parameter to be non-NULL.

This ends up meaning all the callers need a different case for handling
anonymous memory (for memory-backend-ram or default memory with -mem-path
is not specified).

We can make all those callers a little simpler by having
qemu_mempath_getpagesize() accept NULL, and treat that as the anonymous
memory case.

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Greg Kurz <groug@kaod.org>
Acked-by: Paolo Bonzini <pbonzini@redhat.com>
2018-04-27 18:05:22 +10:00
Greg Kurz
72a841d2a4 exec: fix memory leak in find_max_supported_pagesize()
The string returned by object_property_get_str() is dynamically allocated.

Signed-off-by: Greg Kurz <groug@kaod.org>
Message-Id: <152231458624.69730.1752893648612848392.stgit@bahia.lan>
Reviewed-by: Cornelia Huck <cohuck@redhat.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
2018-04-02 23:05:15 -03:00
Peter Maydell
ed627b2ad3 virtio,vhost,pci,pc: features, cleanups
SRAT tables for DIMM devices
 new virtio net flags for speed/duplex
 post-copy migration support in vhost
 cleanups in pci
 
 Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
 -----BEGIN PGP SIGNATURE-----
 
 iQEcBAABAgAGBQJasR1rAAoJECgfDbjSjVRpOocH/R9A3g/TkpGjmLzJBrrX1NGO
 I/iq0ttHjqg4OBIChA4BHHjXwYUMs7XQn26B3efrk1otLAJhuqntZIIo3uU0WraA
 5J+4DT46ogs5rZWNzDCZ0zAkSaATDA6h9Nfh7TvPc9Q2WpcIT0cTa/jOtrxRc9Vq
 32hbUKtJSpNxRjwbZvk6YV21HtWo3Tktdaj9IeTQTN0/gfMyOMdgxta3+bymicbJ
 FuF9ybHcpXvrEctHhXHIL4/YVGEH/4shagZ4JVzv1dVdLeHLZtPomdf7+oc0+07m
 Qs+yV0HeRS5Zxt7w5blGLC4zDXczT/bUx8oln0Tz5MV7RR/+C2HwMOHC69gfpSc=
 =vomK
 -----END PGP SIGNATURE-----

Merge remote-tracking branch 'remotes/mst/tags/for_upstream' into staging

virtio,vhost,pci,pc: features, cleanups

SRAT tables for DIMM devices
new virtio net flags for speed/duplex
post-copy migration support in vhost
cleanups in pci

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>

# gpg: Signature made Tue 20 Mar 2018 14:40:43 GMT
# gpg:                using RSA key 281F0DB8D28D5469
# gpg: Good signature from "Michael S. Tsirkin <mst@kernel.org>"
# gpg:                 aka "Michael S. Tsirkin <mst@redhat.com>"
# Primary key fingerprint: 0270 606B 6F3C DF3D 0B17  0970 C350 3912 AFBE 8E67
#      Subkey fingerprint: 5D09 FD08 71C8 F85B 94CA  8A0D 281F 0DB8 D28D 5469

* remotes/mst/tags/for_upstream: (51 commits)
  postcopy shared docs
  libvhost-user: Claim support for postcopy
  postcopy: Allow shared memory
  vhost: Huge page align and merge
  vhost+postcopy: Wire up POSTCOPY_END notify
  vhost-user: Add VHOST_USER_POSTCOPY_END message
  libvhost-user: mprotect & madvises for postcopy
  vhost+postcopy: Call wakeups
  vhost+postcopy: Add vhost waker
  postcopy: postcopy_notify_shared_wake
  postcopy: helper for waking shared
  vhost+postcopy: Resolve client address
  postcopy-ram: add a stub for postcopy_request_shared_page
  vhost+postcopy: Helper to send requests to source for shared pages
  vhost+postcopy: Stash RAMBlock and offset
  vhost+postcopy: Send address back to qemu
  libvhost-user+postcopy: Register new regions with the ufd
  migration/ram: ramblock_recv_bitmap_test_byte_offset
  postcopy+vhost-user: Split set_mem_table for postcopy
  vhost+postcopy: Transmit 'listen' to slave
  ...

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>

# Conflicts:
#	scripts/update-linux-headers.sh
2018-03-20 15:48:34 +00:00
Dr. David Alan Gilbert
2ce16640b4 postcopy: use UFFDIO_ZEROPAGE only when available
Use a flag on the RAMBlock to state whether it has the
UFFDIO_ZEROPAGE capability, use it when it's available.

This allows the use of postcopy on tmpfs as well as hugepage
backed files.

Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2018-03-20 05:03:27 +02:00
Dr. David Alan Gilbert
f90bb71bfd qemu_ram_block_host_offset
Utility to give the offset of a host pointer within a RAMBlock
(assuming we already know it's in that RAMBlock)

Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2018-03-20 05:03:26 +02:00
Dr. David Alan Gilbert
db144f7000 migrate: Update ram_block_discard_range for shared
The choice of call to discard a block is getting more complicated
for other cases.   We use fallocate PUNCH_HOLE in any file cases;
it works for both hugepage and for tmpfs.
We use the DONTNEED for non-hugepage cases either where they're
anonymous or where they're private.

Care should be taken when trying other backing files.

Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2018-03-20 05:03:26 +02:00
Igor Mammedov
2278b93941 Use cpu_create(type) instead of cpu_init(cpu_model)
With all targets defining CPU_RESOLVING_TYPE, refactor
cpu_parse_cpu_model(type, cpu_model) to parse_cpu_model(cpu_model)
so that callers won't have to know internal resolving cpu
type. Place it in exec.c so it could be called from both
target independed vl.c and *-user/main.c.

That allows us to stop abusing cpu type from
  MachineClass::default_cpu_type
as resolver class in vl.c which were confusing part of
cpu_parse_cpu_model().

Also with new parse_cpu_model(), the last users of cpu_init()
in null-machine.c and bsd/linux-user targets could be switched
to cpu_create() API and cpu_init() API will be removed by
follow up patch.

With no longer users left remove MachineState::cpu_model field,
new code should use MachineState::cpu_type instead and
leave cpu_model parsing to generic code in vl.c.

Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Eduardo Habkost <ehabkost@redhat.com>
Message-Id: <1518000027-274608-5-git-send-email-imammedo@redhat.com>
[ehabkost: Fix bsd-user build error]
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
2018-03-19 14:10:36 -03:00
Paolo Bonzini
b39b61e410 memory: fix flatview_access_valid RCU read lock/unlock imbalance
Fixes: 11e732a5ed
Reported-by: Cornelia Huck <cohuck@redhat.com>
Reported-by: luigi burdo <intermediadc@hotmail.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Tested-by: Cornelia Huck <cohuck@redhat.com>
Tested-by: Thomas Huth <thuth@redhat.com>
Message-id: 20180307130238.19358-1-pbonzini@redhat.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2018-03-09 15:55:20 +00:00
Paolo Bonzini
db84fd973e address_space_rw: address_space_to_flatview needs RCU lock
address_space_rw is calling address_space_to_flatview but it can
be called outside the RCU lock.  To fix it, transform flatview_rw
into address_space_rw, since flatview_rw is otherwise unused.

Reviewed-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2018-03-06 14:01:28 +01:00
Paolo Bonzini
ad0c60fa57 address_space_map: address_space_to_flatview needs RCU lock
address_space_map is calling address_space_to_flatview but it can
be called outside the RCU lock.  The function itself is calling
rcu_read_lock/rcu_read_unlock, just in the wrong place, so the
fix is easy.

Reviewed-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2018-03-06 14:01:28 +01:00
Paolo Bonzini
11e732a5ed address_space_access_valid: address_space_to_flatview needs RCU lock
address_space_access_valid is calling address_space_to_flatview but it can
be called outside the RCU lock.  To fix it, push the rcu_read_lock/unlock
pair up from flatview_access_valid to address_space_access_valid.

Reviewed-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2018-03-06 14:01:28 +01:00
Paolo Bonzini
b2a44fcad7 address_space_read: address_space_to_flatview needs RCU lock
address_space_read is calling address_space_to_flatview but it can
be called outside the RCU lock.  To fix it, push the rcu_read_lock/unlock
pair up from flatview_read_full to address_space_read's constant size
fast path and address_space_read_full.

Reviewed-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2018-03-06 14:01:28 +01:00
Paolo Bonzini
4c6ebbb364 address_space_write: address_space_to_flatview needs RCU lock
address_space_write is calling address_space_to_flatview but it can
be called outside the RCU lock.  To fix it, push the rcu_read_lock/unlock
pair up from flatview_write to address_space_write.

Reviewed-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2018-03-06 14:01:27 +01:00
Marcel Apfelbaum
06329ccecf mem: add share parameter to memory-backend-ram
Currently only file backed memory backend can
be created with a "share" flag in order to allow
sharing guest RAM with other processes in the host.

Add the "share" flag also to RAM Memory Backend
in order to allow remapping parts of the guest RAM
to different host virtual addresses. This is needed
by the RDMA devices in order to remap non-contiguous
QEMU virtual addresses to a contiguous virtual address range.

Moved the "share" flag to the Host Memory base class,
modified phys_mem_alloc to include the new parameter
and a new interface memory_region_init_ram_shared_nomigrate.

There are no functional changes if the new flag is not used.

Reviewed-by: Eduardo Habkost <ehabkost@redhat.com>
Signed-off-by: Marcel Apfelbaum <marcel@redhat.com>
2018-02-19 13:03:24 +02:00
Alistair Francis
493d89bf74 tcg: Replace fprintf(stderr, "*\n" with error_report()
Replace a large number of the fprintf(stderr, "*\n" calls with
error_report(). The functions were renamed with these commands and then
compiler issues where manually fixed.

find ./* -type f -exec sed -i \
    'N;N;N;N;N;N;N;N;N;N;N;N; {s|fprintf(stderr, "\(.*\)\\n"\(.*\));|error_report("\1"\2);|Ig}' \
    {} +
find ./* -type f -exec sed -i \
    'N;N;N;N;N;N;N;N;N;N;N; {s|fprintf(stderr, "\(.*\)\\n"\(.*\));|error_report("\1"\2);|Ig}' \
    {} +
find ./* -type f -exec sed -i \
    'N;N;N;N;N;N;N;N;N; {s|fprintf(stderr, "\(.*\)\\n"\(.*\));|error_report("\1"\2);|Ig}' \
    {} +
find ./* -type f -exec sed -i \
    'N;N;N;N;N;N;N;N; {s|fprintf(stderr, "\(.*\)\\n"\(.*\));|error_report("\1"\2);|Ig}' \
    {} +
find ./* -type f -exec sed -i \
    'N;N;N;N;N;N;N; {s|fprintf(stderr, "\(.*\)\\n"\(.*\));|error_report("\1"\2);|Ig}' \
    {} +
find ./* -type f -exec sed -i \
    'N;N;N;N;N;N; {s|fprintf(stderr, "\(.*\)\\n"\(.*\));|error_report("\1"\2);|Ig}' \
    {} +
find ./* -type f -exec sed -i \
    'N;N;N;N;N; {s|fprintf(stderr, "\(.*\)\\n"\(.*\));|error_report("\1"\2);|Ig}' \
    {} +
find ./* -type f -exec sed -i \
    'N;N;N;N; {s|fprintf(stderr, "\(.*\)\\n"\(.*\));|error_report("\1"\2);|Ig}' \
    {} +
find ./* -type f -exec sed -i \
    'N;N;N; {s|fprintf(stderr, "\(.*\)\\n"\(.*\));|error_report("\1"\2);|Ig}' \
    {} +
find ./* -type f -exec sed -i \
    'N;N; {s|fprintf(stderr, "\(.*\)\\n"\(.*\));|error_report("\1"\2);|Ig}' \
    {} +
find ./* -type f -exec sed -i \
    'N; {s|fprintf(stderr, "\(.*\)\\n"\(.*\));|error_report("\1"\2);|Ig}' \
    {} +

Signed-off-by: Alistair Francis <alistair.francis@xilinx.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Peter Crosthwaite <crosthwaite.peter@gmail.com>
Cc: Richard Henderson <rth@twiddle.net>
Cc: Stefan Weil <sw@weilnetz.de>

Conversions that aren't followed by exit() dropped, because they might
be inappropriate.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Message-Id: <20180203084315.20497-14-armbru@redhat.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
2018-02-06 18:29:46 +01:00
Haozhong Zhang
9837684316 hostmem-file: add "align" option
When mmap(2) the backend files, QEMU uses the host page size
(getpagesize(2)) by default as the alignment of mapping address.
However, some backends may require alignments different than the page
size. For example, mmap a device DAX (e.g., /dev/dax0.0) on Linux
kernel 4.13 to an address, which is 4K-aligned but not 2M-aligned,
fails with a kernel message like

[617494.969768] dax dax0.0: qemu-system-x86: dax_mmap: fail, unaligned vma (0x7fa37c579000 - 0x7fa43c579000, 0x1fffff)

Because there is no common approach to get such alignment requirement,
we add the 'align' option to 'memory-backend-file', so that users or
management utils, which have enough knowledge about the backend, can
specify a proper alignment via this option.

Signed-off-by: Haozhong Zhang <haozhong.zhang@intel.com>
Message-Id: <20171211072806.2812-2-haozhong.zhang@intel.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
[ehabkost: fixed typo, fixed error_setg() format string]
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
2018-01-19 11:18:51 -02:00
Pavel Dovgalyuk
15a356c49a cpu: flush TB cache when loading VMState
Flushing TB cache is required because TBs key in the cache may match
different code which existed in the previous state.

Signed-off-by: Pavel Dovgalyuk <pavel.dovgaluk@ispras.ru>
Signed-off-by: Maria Klimushenkova <maria.klimushenkova@ispras.ru>
Message-Id: <20180110134846.12940.99993.stgit@pasha-VirtualBox>
[Add comment suggested by Peter Maydell. - Paolo]
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Pavel Dovgalyuk <Pavel.Dovgaluk@ispras.ru>
2018-01-16 14:54:52 +01:00
Dr. David Alan Gilbert
801110ab22 find_ram_offset: Align ram_addr_t allocation on long boundaries
The dirty bitmaps are built from 'long's and there is fast-path code
for synchronising the case where the RAMBlock is aligned to the start
of a long boundary.  Align the allocation to this boundary
to cause the fast path to be used.

Offsets before change:
11398@1515169675.018566:find_ram_offset size: 0x1e0000 @ 0x8000000
11398@1515169675.020064:find_ram_offset size: 0x20000 @ 0x81e0000
11398@1515169675.020244:find_ram_offset size: 0x20000 @ 0x8200000
11398@1515169675.024343:find_ram_offset size: 0x1000000 @ 0x8220000
11398@1515169675.025154:find_ram_offset size: 0x10000 @ 0x9220000
11398@1515169675.027682:find_ram_offset size: 0x40000 @ 0x9230000
11398@1515169675.032921:find_ram_offset size: 0x200000 @ 0x9270000
11398@1515169675.033307:find_ram_offset size: 0x1000 @ 0x9470000
11398@1515169675.033601:find_ram_offset size: 0x1000 @ 0x9471000

after change:
10923@1515169108.818245:find_ram_offset size: 0x1e0000 @ 0x8000000
10923@1515169108.819410:find_ram_offset size: 0x20000 @ 0x8200000
10923@1515169108.819587:find_ram_offset size: 0x20000 @ 0x8240000
10923@1515169108.823708:find_ram_offset size: 0x1000000 @ 0x8280000
10923@1515169108.824503:find_ram_offset size: 0x10000 @ 0x9280000
10923@1515169108.827093:find_ram_offset size: 0x40000 @ 0x92c0000
10923@1515169108.833045:find_ram_offset size: 0x200000 @ 0x9300000
10923@1515169108.833504:find_ram_offset size: 0x1000 @ 0x9500000
10923@1515169108.833787:find_ram_offset size: 0x1000 @ 0x9540000

Suggested-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Message-Id: <20180105170138.23357-3-dgilbert@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2018-01-16 14:54:52 +01:00
Dr. David Alan Gilbert
154cc9ea3b find_ram_offset: Add comments and tracing
Add some comments so I can understand the various nested loops.
Add some tracing so I can see what they're doing.

Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Message-Id: <20180105170138.23357-2-dgilbert@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2018-01-16 14:54:52 +01:00
Peter Maydell
8af36743c2 exec: Don't reuse unassigned_mem_ops for io_mem_rom
We set up the io_mem_rom special memory region using the
unassigned_mem_ops structure; this is then used when a guest tries to
write to ROM.  This is incorrect, because the behaviour of unassigned
memory may be different from that of ROM for writes.  In particular,
on some architectures writing to unassigned memory generates a guest
exception, whereas writing to ROM is generally ignored.  Use a
special readonly_mem_ops for this purpose instead, so writes to
ROM are ignored for all guest CPUs.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Message-Id: <1513187549-2435-2-git-send-email-peter.maydell@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2017-12-21 09:30:32 +01:00
Peter Xu
87a621d857 cpu: suffix cpu address spaces with cpu index
Renaming cpu address space names so that they won't be the same when
there are more than one.

Signed-off-by: Peter Xu <peterx@redhat.com>
Message-Id: <20171123092333.16085-4-peterx@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2017-12-21 09:30:31 +01:00
Peter Xu
80ceb07a83 cpu: refactor cpu_address_space_init()
Normally we create an address space for that CPU and pass that address
space into the function.  Let's just do it inside to unify address space
creations.  It'll simplify my next patch to rename those address spaces.

Signed-off-by: Peter Xu <peterx@redhat.com>
Message-Id: <20171123092333.16085-3-peterx@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2017-12-21 09:30:31 +01:00
Philippe Mathieu-Daudé
ff676046fb misc: remove duplicated includes
exec: housekeeping (funny since 02d0e09503)

applied using ./scripts/clean-includes

Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Acked-by: Cornelia Huck <cohuck@redhat.com>
Reviewed-by: Anthony PERARD <anthony.perard@citrix.com>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2017-12-18 17:07:02 +03:00
Peter Maydell
2726627197 exec.c: Factor out before/after actions for notdirty memory writes
The function notdirty_mem_write() has a sequence of actions
it has to do before and after the actual business of writing
data to host RAM to ensure that dirty flags are correctly
updated and we flush any TCG translations for the region.
We need to do this also in other places that write directly
to host RAM, most notably the TCG atomic helper functions.
Pull out the before and after pieces into their own functions.

We use an API where the prepare function stashes the various
bits of information about the write into a struct for the
complete function to use, because in the calls for the atomic
helpers the place where the complete function will be called
doesn't have the information to hand.

Cc: qemu-stable@nongnu.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Message-id: 1511201308-23580-2-git-send-email-peter.maydell@linaro.org
2017-11-21 12:09:25 +00:00
Peter Maydell
62955e101e Miscellaneous bugfixes
-----BEGIN PGP SIGNATURE-----
 
 iQFIBAABCAAyFiEE8TM4V0tmI4mGbHaCv/vSX3jHroMFAloMXN0UHHBib256aW5p
 QHJlZGhhdC5jb20ACgkQv/vSX3jHroNNAQf/e7/uT2tW7WNfamSOMYXswf0R6ak+
 KjVSG+qiNsKaZzXmMFkhm4n0u1vCW0VGEQGRHr0MoSCyyfhupzLRHxfHi8SytqTf
 S6wqNtIbOK0L8bW+U5vzADks33UCuuUNlVZeOAkEPaXiLlgxmBoHfyoXkIGemJc2
 epx5x22rloNQLaBoL7FGmAkQhQCSJg19hAtRLo0tkryCwBZ9P6a1K0aNAHU2RFaB
 LgRFcxwduwTydsHRYeQ8J7YR0fERle01QUB8y9tlOc8/d2x9yRPBWhPHwscKMv6I
 JwM0c2Mnw6Yqbwyj7snWty7epgUcHWrOVnZnaIpNW9Z8m/wgz28oZ3a09w==
 =6wL6
 -----END PGP SIGNATURE-----

Merge remote-tracking branch 'remotes/bonzini/tags/for-upstream' into staging

Miscellaneous bugfixes

# gpg: Signature made Wed 15 Nov 2017 15:27:25 GMT
# gpg:                using RSA key 0xBFFBD25F78C7AE83
# gpg: Good signature from "Paolo Bonzini <bonzini@gnu.org>"
# gpg:                 aka "Paolo Bonzini <pbonzini@redhat.com>"
# Primary key fingerprint: 46F5 9FBD 57D6 12E7 BFD4  E2F7 7E15 100C CD36 69B1
#      Subkey fingerprint: F133 3857 4B66 2389 866C  7682 BFFB D25F 78C7 AE83

* remotes/bonzini/tags/for-upstream:
  fix scripts/update-linux-headers.sh here document
  exec: Do not resolve subpage in mru_section
  util/stats64: Fix min/max comparisons
  cpu-exec: avoid cpu_exec_nocache infinite loop with record/replay
  cpu-exec: don't overwrite exception_index
  vhost-user-scsi: add missing virtqueue_size param
  target-i386: adds PV_TLB_FLUSH CPUID feature bit
  thread-posix: fix qemu_rec_mutex_trylock macro
  Makefile: simpler/faster "make help"
  ioapic/tracing: Remove last DPRINTFs
  Enable 8-byte wide MMIO for 16550 serial devices

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2017-11-16 14:42:54 +00:00
Paolo Bonzini
07c114bbf3 exec: Do not resolve subpage in mru_section
This fixes a crash caused by picking the wrong memory region in
address_space_lookup_region seen with client code accessing a device
model that uses alias memory regions.  The expensive part of
address_space_lookup_region anyway is phys_page_find; performance-wise
it is okay to repeat the subsequent subpage lookup.

Signed-off-by: BALATON Zoltan <balaton@eik.bme.hu>
Message-Id: <20171114225941.072707456B5@zero.eik.bme.hu>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2017-11-15 15:11:16 +01:00
Emilio G. Cota
2dda635410 qom: move CPUClass.tcg_initialize to a global
55c3cee ("qom: Introduce CPUClass.tcg_initialize", 2017-10-24)
introduces a per-CPUClass bool that we check so that the target CPU
is initialized for TCG only once. This works well except when
we end up creating more than one CPUClass, in which case we end
up incorrectly initializing TCG more than once, i.e. once for
each CPUClass.

This can be replicated with:
  $ aarch64-softmmu/qemu-system-aarch64 -machine xlnx-zcu102 -smp 6 \
      -global driver=xlnx,,zynqmp,property=has_rpu,value=on
In this case the class name of the "RPUs" is prefixed by "cortex-r5-",
whereas the "regular" CPUs are prefixed by "cortex-a53-". This
results in two CPUClass instances being created.

Fix it by introducing a static variable, so that only the first
target CPU being initialized will initialize the target-dependent
part of TCG, regardless of CPUClass instances.

Fixes: 55c3ceef61
Signed-off-by: Emilio G. Cota <cota@braap.org>
Reviewed-by: Eduardo Habkost <ehabkost@redhat.com>
Reviewed-by: Alistair Francis <alistair.francis@xilinx.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Tested-by: Alistair Francis <alistair.francis@xilinx.com>
Message-id: 1510343626-25861-2-git-send-email-cota@braap.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2017-11-13 13:55:25 +00:00
Richard Henderson
9b990ee5a3 tcg: Add CPUState cflags_next_tb
We were generating code during tb_invalidate_phys_page_range,
check_watchpoint, cpu_io_recompile, and (seemingly) discarding
the TB, assuming that it would magically be picked up during
the next iteration through the cpu_exec loop.

Instead, record the desired cflags in CPUState so that we request
the proper TB so that there is no more magic.

Reviewed-by: Emilio G. Cota <cota@braap.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2017-10-24 13:53:41 -07:00
Emilio G. Cota
4e2ca83e71 tcg: define CF_PARALLEL and use it for TB hashing along with CF_COUNT_MASK
This will enable us to decouple code translation from the value
of parallel_cpus at any given time. It will also help us minimize
TB flushes when generating code via EXCP_ATOMIC.

Note that the declaration of parallel_cpus is brought to exec-all.h
to be able to define there the "curr_cflags" inline.

Signed-off-by: Emilio G. Cota <cota@braap.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2017-10-24 13:53:41 -07:00
Richard Henderson
55c3ceef61 qom: Introduce CPUClass.tcg_initialize
Move target cpu tcg initialization to common code,
called from cpu_exec_realizefn.

Acked-by: Andreas Färber <afaerber@suse.de>
Reviewed-by: Emilio G. Cota <cota@braap.org>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2017-10-24 22:00:13 +02:00
Paolo Bonzini
306526b5de watch_mem_write: implement 8-byte accesses
Aligned 8-byte memory writes by a 64-bit target on a 64-bit host should
always turn into atomic 8-byte writes on the host, however a write
write watchpoint would end up tearing the 8-byte write into two 4-byte
writes in access_with_adjusted_size().

Reported-by: Andrew Baumann <Andrew.Baumann@microsoft.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2017-10-18 10:15:00 +02:00
Andrew Baumann
ad52878f97 notdirty_mem_write: implement 8-byte accesses
Aligned 8-byte memory writes by a 64-bit target on a 64-bit host should
always turn into atomic 8-byte writes on the host, however if we missed
in the softmmu, and the TLB line was marked as not dirty, then we
would end up tearing the 8-byte write into two 4-byte writes in
access_with_adjusted_size().

Signed-off-by: Andrew Baumann <Andrew.Baumann@microsoft.com>
Message-Id: <20171013181913.7556-1-Andrew.Baumann@microsoft.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2017-10-18 10:15:00 +02:00
Peter Xu
076a93d797 exec: simplify address_space_get_iotlb_entry
This patch let address_space_get_iotlb_entry() to use the newly
introduced page_mask parameter in flatview_do_translate(). Then we
will be sure the IOTLB can be aligned to page mask, also we should
nicely support huge pages now when introducing a764040.

Fixes: a764040 ("exec: abstract address_space_do_translate()")
Signed-off-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Message-Id: <20171010094247.10173-3-maxime.coquelin@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2017-10-12 12:10:38 +02:00
Peter Xu
d5e5fafd11 exec: add page_mask for flatview_do_translate
The function is originally used for flatview_space_translate() and what
we care about most is (xlat, plen) range. However for iotlb requests, we
don't really care about "plen", but the size of the page that "xlat" is
located on. While, plen cannot really contain this information.

A simple example to show why "plen" is not good for IOTLB translations:

E.g., for huge pages, it is possible that guest mapped 1G huge page on
device side that used this GPA range:

  0x100000000 - 0x13fffffff

Then let's say we want to translate one IOVA that finally mapped to GPA
0x13ffffe00 (which is located on this 1G huge page). Then here we'll
get:

  (xlat, plen) = (0x13fffe00, 0x200)

So the IOTLB would be only covering a very small range since from
"plen" (which is 0x200 bytes) we cannot tell the size of the page.

Actually we can really know that this is a huge page - we just throw the
information away in flatview_do_translate().

This patch introduced "page_mask" optional parameter to capture that
page mask info. Also, I made "plen" an optional parameter as well, with
some comments for the whole function.

No functional change yet.

Signed-off-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Message-Id: <20171010094247.10173-2-maxime.coquelin@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2017-10-12 12:10:38 +02:00
Emilio G. Cota
3637cf58f9 util: move qemu_real_host_page_size/mask to osdep.h
These only depend on the host and therefore belong in the common
osdep, not in a target-dependent object.

While at it, query the host during an init constructor, which guarantees
the page size will be well-defined throughout the execution of the program.

Suggested-by: Richard Henderson <rth@twiddle.net>
Reviewed-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Emilio G. Cota <cota@braap.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2017-10-10 09:45:00 -07:00
Alexey Kardashevskiy
5e8fd947e2 memory: Rework "info mtree" to print flat views and dispatch trees
This adds a new "-d" switch to "info mtree" to print dispatch tree
internals.

This changes the way "-f" is handled - it prints now flat views and
associated address spaces.

Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Message-Id: <20170921085110.25598-15-aik@ozlabs.ru>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2017-09-21 23:19:38 +02:00
Alexey Kardashevskiy
8629d3fcb7 memory: Rename mem_begin/mem_commit/mem_add helpers
This renames some helpers to reflect better what they do.

This should cause no behavioural change.

Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Message-Id: <20170921085110.25598-9-aik@ozlabs.ru>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2017-09-21 23:19:37 +02:00
Alexey Kardashevskiy
9950322a59 memory: Cleanup after switching to FlatView
We store AddressSpaceDispatch* in FlatView anyway so there is no need
to carry it from mem_add() to register_subpage/register_multipage.

This should cause no behavioural change.

Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Message-Id: <20170921085110.25598-8-aik@ozlabs.ru>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2017-09-21 23:19:37 +02:00
Alexey Kardashevskiy
166206845f memory: Switch memory from using AddressSpace to FlatView
FlatView's will be shared between AddressSpace's and subpage_t
and MemoryRegionSection cannot store AS anymore, hence this change.

In particular, for:

 typedef struct subpage_t {
     MemoryRegion iomem;
-    AddressSpace *as;
+    FlatView *fv;
     hwaddr base;
     uint16_t sub_section[];
 } subpage_t;

  struct MemoryRegionSection {
     MemoryRegion *mr;
-    AddressSpace *address_space;
+    FlatView *fv;
     hwaddr offset_within_region;
     Int128 size;
     hwaddr offset_within_address_space;
     bool readonly;
 };

This should cause no behavioural change.

Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Message-Id: <20170921085110.25598-7-aik@ozlabs.ru>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2017-09-21 23:19:37 +02:00
Alexey Kardashevskiy
c775252378 memory: Remove AddressSpace pointer from AddressSpaceDispatch
AS in ASD is only used to pass AS from mem_begin() to register_subpage()
to store it in MemoryRegionSection, we can do this directly now.

This should cause no behavioural change.

Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Message-Id: <20170921085110.25598-6-aik@ozlabs.ru>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2017-09-21 23:19:37 +02:00