Commit Graph

258 Commits

Author SHA1 Message Date
Peter Maydell
efa99a2ff8 Make flatview_translate() take a MemTxAttrs argument
As part of plumbing MemTxAttrs down to the IOMMU translate method,
add MemTxAttrs as an argument to flatview_translate(); all its
callers now have attrs available.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20180521140402.23318-11-peter.maydell@linaro.org
2018-05-31 16:32:35 +01:00
Peter Maydell
8372d38327 Make MemoryRegion valid.accepts callback take a MemTxAttrs argument
As part of plumbing MemTxAttrs down to the IOMMU translate method,
add MemTxAttrs as an argument to the MemoryRegion valid.accepts
callback. We'll need this for subpage_accepts().

We could take the approach we used with the read and write
callbacks and add new a new _with_attrs version, but since there
are so few implementations of the accepts hook we just change
them all.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20180521140402.23318-9-peter.maydell@linaro.org
2018-05-31 16:32:35 +01:00
Peter Maydell
fddffa4268 Make address_space_access_valid() take a MemTxAttrs argument
As part of plumbing MemTxAttrs down to the IOMMU translate method,
add MemTxAttrs as an argument to address_space_access_valid().
Its callers either have an attrs value to hand, or don't care
and can use MEMTXATTRS_UNSPECIFIED.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20180521140402.23318-6-peter.maydell@linaro.org
2018-05-31 16:32:35 +01:00
Peter Maydell
f26404fbee Make address_space_map() take a MemTxAttrs argument
As part of plumbing MemTxAttrs down to the IOMMU translate method,
add MemTxAttrs as an argument to address_space_map().
Its callers either have an attrs value to hand, or don't care
and can use MEMTXATTRS_UNSPECIFIED.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20180521140402.23318-5-peter.maydell@linaro.org
2018-05-31 16:32:35 +01:00
Peter Maydell
bc6b1cec84 Make address_space_translate{, _cached}() take a MemTxAttrs argument
As part of plumbing MemTxAttrs down to the IOMMU translate method,
add MemTxAttrs as an argument to address_space_translate()
and address_space_translate_cached(). Callers either have an
attrs value to hand, or don't care and can use MEMTXATTRS_UNSPECIFIED.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20180521140402.23318-4-peter.maydell@linaro.org
2018-05-31 14:50:52 +01:00
Peter Maydell
2ce931d012 memory.h: Improve IOMMU related documentation
Add more detail to the documentation for memory_region_init_iommu()
and other IOMMU-related functions and data structures.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Message-id: 20180521140402.23318-2-peter.maydell@linaro.org
2018-05-31 14:50:52 +01:00
Paolo Bonzini
48564041a7 exec: reintroduce MemoryRegion caching
MemoryRegionCache was reverted to "normal" address_space_* operations
for 2.9, due to lack of support for IOMMUs.  Reinstate the
optimizations, caching only the IOMMU translation at address_cache_init
but not the IOMMU lookup and target AddressSpace translation are not
cached; now that MemoryRegionCache supports IOMMUs, it becomes more widely
applicable too.

The inlined fast path is defined in memory_ldst_cached.inc.h, while the
slow path uses memory_ldst.inc.c as before.  The smaller fast path causes
a little code size reduction in MemoryRegionCache users:

    hw/virtio/virtio.o text size before: 32373
    hw/virtio/virtio.o text size after: 31941

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2018-05-09 00:13:38 +02:00
Paolo Bonzini
4269c82bf7 exec: move memory access declarations to a common header, inline *_phys functions
For now, this reduces the text size very slightly due to the newly-added
inlining:

   text size before: 9301965
   text size after: 9300645

Later, however, the declarations in include/exec/memory_ldst.inc.h will be
reused for the MemoryRegionCache slow path functions.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2018-05-09 00:13:38 +02:00
Paolo Bonzini
b2a44fcad7 address_space_read: address_space_to_flatview needs RCU lock
address_space_read is calling address_space_to_flatview but it can
be called outside the RCU lock.  To fix it, push the rcu_read_lock/unlock
pair up from flatview_read_full to address_space_read's constant size
fast path and address_space_read_full.

Reviewed-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2018-03-06 14:01:28 +01:00
Paolo Bonzini
785a507ec7 memory: inline some performance-sensitive accessors
These accessors are called from inlined functions, and the call sequence
is much more expensive than just inlining the access.  Move the
struct declaration to memory-internal.h so that exec.c and memory.c
can both use an inline function.

Reviewed-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2018-03-06 14:01:27 +01:00
Marcel Apfelbaum
06329ccecf mem: add share parameter to memory-backend-ram
Currently only file backed memory backend can
be created with a "share" flag in order to allow
sharing guest RAM with other processes in the host.

Add the "share" flag also to RAM Memory Backend
in order to allow remapping parts of the guest RAM
to different host virtual addresses. This is needed
by the RDMA devices in order to remap non-contiguous
QEMU virtual addresses to a contiguous virtual address range.

Moved the "share" flag to the Host Memory base class,
modified phys_mem_alloc to include the new parameter
and a new interface memory_region_init_ram_shared_nomigrate.

There are no functional changes if the new flag is not used.

Reviewed-by: Eduardo Habkost <ehabkost@redhat.com>
Signed-off-by: Marcel Apfelbaum <marcel@redhat.com>
2018-02-19 13:03:24 +02:00
Paolo Bonzini
0fe1eca7dc memory: hide memory_region_sync_dirty_bitmap behind DirtyBitmapSnapshot
Simplify the users of memory_region_snapshot_and_clear_dirty, so
that they do not have to call memory_region_sync_dirty_bitmap
explicitly.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2018-02-13 16:15:09 +01:00
Paolo Bonzini
77302fb5df memory: remove memory_region_test_and_clear_dirty
It is unused after g364fb has been converted to use DirtyBitmapSnapshot.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2018-02-13 16:15:09 +01:00
Peter Maydell
7b213bb475 * socket option parsing fix (Daniel)
* SCSI fixes (Fam)
 * Readline double-free fix (Greg)
 * More HVF attribution fixes (Izik)
 * WHPX (Windows Hypervisor Platform Extensions) support (Justin)
 * POLLHUP handler (Klim)
 * ivshmem fixes (Ladi)
 * memfd memory backend (Marc-André)
 * improved error message (Marcelo)
 * Memory fixes (Peter Xu, Zhecheng)
 * Remove obsolete code and comments (Peter M.)
 * qdev API improvements (Philippe)
 * Add CONFIG_I2C switch (Thomas)
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2.0.22 (GNU/Linux)
 
 iQEcBAABAgAGBQJaexoYAAoJEL/70l94x66DVL0IAJC//aZCwwgyN9CRNDcOo10/
 UPtzprfezERkur77r1KvEYVNIfslRF6iTBou2+suOWkzoNL2LJ0XZ+wi+2u2sFIF
 ikvbQVk4dOWqJJQj7e1cmv5A2EZy2dcxjAoD1IG6CRy76+HzYqwjHVw+HkYY5CUS
 qwnUWjQddP6WtH9MsUHpX7p7atWo7T1tzkx4v8H+CIHBO3uUJQSZLkGYflvcstpj
 Fo04bZzSkDj2rnlqqBo/6UgJQXD8++Rs64vmiX2xwcK47TWO31Vbuwu+r8V9osWm
 LHFmRpL8ZkZfL0yqf0bpjmd688dirjVpHIJ5KE043Lo6AdI+K5xBfoBjXxtPiKE=
 =o90D
 -----END PGP SIGNATURE-----

Merge remote-tracking branch 'remotes/bonzini/tags/for-upstream' into staging

* socket option parsing fix (Daniel)
* SCSI fixes (Fam)
* Readline double-free fix (Greg)
* More HVF attribution fixes (Izik)
* WHPX (Windows Hypervisor Platform Extensions) support (Justin)
* POLLHUP handler (Klim)
* ivshmem fixes (Ladi)
* memfd memory backend (Marc-André)
* improved error message (Marcelo)
* Memory fixes (Peter Xu, Zhecheng)
* Remove obsolete code and comments (Peter M.)
* qdev API improvements (Philippe)
* Add CONFIG_I2C switch (Thomas)

# gpg: Signature made Wed 07 Feb 2018 15:24:08 GMT
# gpg:                using RSA key BFFBD25F78C7AE83
# gpg: Good signature from "Paolo Bonzini <bonzini@gnu.org>"
# gpg:                 aka "Paolo Bonzini <pbonzini@redhat.com>"
# Primary key fingerprint: 46F5 9FBD 57D6 12E7 BFD4  E2F7 7E15 100C CD36 69B1
#      Subkey fingerprint: F133 3857 4B66 2389 866C  7682 BFFB D25F 78C7 AE83

* remotes/bonzini/tags/for-upstream: (47 commits)
  Add the WHPX acceleration enlightenments
  Introduce the WHPX impl
  Add the WHPX vcpu API
  Add the Windows Hypervisor Platform accelerator.
  tests/test-filter-redirector: move close()
  tests: use memfd in vhost-user-test
  vhost-user-test: make read-guest-mem setup its own qemu
  tests: keep compiling failing vhost-user tests
  Add memfd based hostmem
  memfd: add hugetlbsize argument
  memfd: add hugetlb support
  memfd: add error argument, instead of perror()
  cpus: join thread when removing a vCPU
  cpus: hvf: unregister thread with RCU
  cpus: tcg: unregister thread with RCU, fix exiting of loop on unplug
  cpus: dummy: unregister thread with RCU, exit loop on unplug
  cpus: kvm: unregister thread with RCU
  cpus: hax: register/unregister thread with RCU, exit loop on unplug
  ivshmem: Disable irqfd on device reset
  ivshmem: Improve MSI irqfd error handling
  ...

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>

# Conflicts:
#	cpus.c
2018-02-07 20:40:36 +00:00
Alexey Kardashevskiy
f1334de60b memory/iommu: Add get_attr()
This adds get_attr() to IOMMUMemoryRegionClass, like
iommu_ops::domain_get_attr in the Linux kernel.

This defines the first attribute - IOMMU_ATTR_SPAPR_TCE_FD - which
will be used between the pSeries machine and VFIO-PCI.

Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Acked-by: Paolo Bonzini <pbonzini@redhat.com>
Acked-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2018-02-06 11:08:24 -07:00
Jay Zhou
57914ecb06 memory: update comments and fix some typos
Signed-off-by: Jay Zhou <jianjay.zhou@huawei.com>
Message-Id: <1515043788-38300-1-git-send-email-jianjay.zhou@huawei.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2018-02-05 13:54:38 +01:00
Haozhong Zhang
9837684316 hostmem-file: add "align" option
When mmap(2) the backend files, QEMU uses the host page size
(getpagesize(2)) by default as the alignment of mapping address.
However, some backends may require alignments different than the page
size. For example, mmap a device DAX (e.g., /dev/dax0.0) on Linux
kernel 4.13 to an address, which is 4K-aligned but not 2M-aligned,
fails with a kernel message like

[617494.969768] dax dax0.0: qemu-system-x86: dax_mmap: fail, unaligned vma (0x7fa37c579000 - 0x7fa43c579000, 0x1fffff)

Because there is no common approach to get such alignment requirement,
we add the 'align' option to 'memory-backend-file', so that users or
management utils, which have enough knowledge about the backend, can
specify a proper alignment via this option.

Signed-off-by: Haozhong Zhang <haozhong.zhang@intel.com>
Message-Id: <20171211072806.2812-2-haozhong.zhang@intel.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
[ehabkost: fixed typo, fixed error_setg() format string]
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
2018-01-19 11:18:51 -02:00
Marc-André Lureau
e2fbe20851 memory: remove unused memory_region_set_global_locking()
This was never used since its introduction in commit
196ea13104 ("memory: Add global-locking property to memory
regions").

Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2017-12-18 17:07:02 +03:00
Paolo Bonzini
02d9651d6a memory: trace FlatView creation and destruction
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2017-09-22 01:06:51 +02:00
Alexey Kardashevskiy
b516572f31 memory: Get rid of address_space_init_shareable
Since FlatViews are shared now and ASes not, this gets rid of
address_space_init_shareable().

This should cause no behavioural change.

Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Message-Id: <20170921085110.25598-17-aik@ozlabs.ru>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2017-09-22 01:06:51 +02:00
Alexey Kardashevskiy
5e8fd947e2 memory: Rework "info mtree" to print flat views and dispatch trees
This adds a new "-d" switch to "info mtree" to print dispatch tree
internals.

This changes the way "-f" is handled - it prints now flat views and
associated address spaces.

Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Message-Id: <20170921085110.25598-15-aik@ozlabs.ru>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2017-09-21 23:19:38 +02:00
Alexey Kardashevskiy
166206845f memory: Switch memory from using AddressSpace to FlatView
FlatView's will be shared between AddressSpace's and subpage_t
and MemoryRegionSection cannot store AS anymore, hence this change.

In particular, for:

 typedef struct subpage_t {
     MemoryRegion iomem;
-    AddressSpace *as;
+    FlatView *fv;
     hwaddr base;
     uint16_t sub_section[];
 } subpage_t;

  struct MemoryRegionSection {
     MemoryRegion *mr;
-    AddressSpace *address_space;
+    FlatView *fv;
     hwaddr offset_within_region;
     Int128 size;
     hwaddr offset_within_address_space;
     bool readonly;
 };

This should cause no behavioural change.

Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Message-Id: <20170921085110.25598-7-aik@ozlabs.ru>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2017-09-21 23:19:37 +02:00
Alexey Kardashevskiy
66a6df1dc6 memory: Move AddressSpaceDispatch from AddressSpace to FlatView
As we are going to share FlatView's between AddressSpace's,
and AddressSpaceDispatch is a structure to perform quick lookup
in FlatView, this moves ASD to FlatView.

After previosly open coded ASD rendering, we can also remove
as->next_dispatch as the new FlatView pointer is stored
on a stack and set to an AS atomically.

flatview_destroy() is executed under RCU instead of
address_space_dispatch_free() now.

This makes mem_begin/mem_commit to work with ASD and mem_add with FV
as later on mem_add will be taking FV as an argument anyway.

This should cause no behavioural change.

Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Message-Id: <20170921085110.25598-5-aik@ozlabs.ru>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2017-09-21 23:19:37 +02:00
Alexey Kardashevskiy
9a62e24f45 memory: Open code FlatView rendering
We are going to share FlatView's between AddressSpace's and per-AS
memory listeners won't suit the purpose anymore so open code
the dispatch tree rendering.

Since there is a good chance that dispatch_listener was the only
listener, this avoids address_space_update_topology_pass() if there is
no registered listeners; this should improve starting time.

This should cause no behavioural change.

Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Message-Id: <20170921085110.25598-3-aik@ozlabs.ru>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2017-09-21 23:19:37 +02:00
Peter Maydell
3114d092b1 memory.h: Move MemTxResult type to memattrs.h
Move the MemTxResult type to memattrs.h. We're going to want to
use it in cpu/qom.h, which doesn't want to include all of
memory.h. In practice MemTxResult and MemTxAttrs are pretty
closely linked since both are used for the new-style
read_with_attrs and write_with_attrs callbacks, so memattrs.h
is a reasonable home for this rather than creating a whole
new header file for it.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Edgar E. Iglesias <edgar.iglesias@xilinx.com>
Reviewed-by: Alistair Francis <alistair.francis@xilinx.com>
2017-09-04 15:21:54 +01:00
Peter Maydell
b08199c6fb memory.h: Add memory_region_init_{ram, rom, rom_device}() handling migration
Add new utility functions which both initialize a RAM
MemoryRegion and arrange for its contents to be migrated;
we give thes the memory_region_init_ram(), memory_region_init_rom()
and memory_region_init_rom_device() names that we just freed up
by renaming the old implementations to _nomigrate().

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Message-id: 1499438577-7674-6-git-send-email-peter.maydell@linaro.org
2017-07-14 17:59:42 +01:00
Peter Maydell
b59821a95b memory: Rename memory_region_init_rom() and _rom_device() to _nomigrate()
Rename memory_region_init_rom() to memory_region_init_rom_nomigrate()
and memory_region_init_rom_device() to
memory_region_init_rom_device_nomigrate().

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Message-id: 1499438577-7674-5-git-send-email-peter.maydell@linaro.org
2017-07-14 17:59:42 +01:00
Peter Maydell
1cfe48c1ce memory: Rename memory_region_init_ram() to memory_region_init_ram_nomigrate()
Rename memory_region_init_ram() to memory_region_init_ram_nomigrate().
This leaves the way clear for us to provide a memory_region_init_ram()
which does handle migration.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Message-id: 1499438577-7674-4-git-send-email-peter.maydell@linaro.org
2017-07-14 17:59:42 +01:00
Peter Maydell
a5c0234bb2 memory: Document that the RAM MR initializers do not handle migration
The various functions for initializing RAM MemoryRegions do not do
anything to cause the data in the MemoryRegion to be migrated.
Note in their documentation comments that this is the responsibility
of the caller.

(We will shortly add a new function that *does* do this for you.)

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Message-id: 1499438577-7674-3-git-send-email-peter.maydell@linaro.org
2017-07-14 17:59:42 +01:00
Alexey Kardashevskiy
1221a47467 memory/iommu: introduce IOMMUMemoryRegionClass
This finishes QOM'fication of IOMMUMemoryRegion by introducing
a IOMMUMemoryRegionClass. This also provides a fastpath analog for
IOMMU_MEMORY_REGION_GET_CLASS().

This makes IOMMUMemoryRegion an abstract class.

Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Message-Id: <20170711035620.4232-3-aik@ozlabs.ru>
Acked-by: Cornelia Huck <cohuck@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2017-07-14 12:04:41 +02:00
Alexey Kardashevskiy
3df9d74806 memory/iommu: QOM'fy IOMMU MemoryRegion
This defines new QOM object - IOMMUMemoryRegion - with MemoryRegion
as a parent.

This moves IOMMU-related fields from MR to IOMMU MR. However to avoid
dymanic QOM casting in fast path (address_space_translate, etc),
this adds an @is_iommu boolean flag to MR and provides new helper to
do simple cast to IOMMU MR - memory_region_get_iommu. The flag
is set in the instance init callback. This defines
memory_region_is_iommu as memory_region_get_iommu()!=NULL.

This switches MemoryRegion to IOMMUMemoryRegion in most places except
the ones where MemoryRegion may be an alias.

Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Message-Id: <20170711035620.4232-2-aik@ozlabs.ru>
Acked-by: Cornelia Huck <cohuck@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2017-07-14 12:04:41 +02:00
KONRAD Frederic
c935674635 exec: allow to get a pointer for some mmio memory region
This introduces a special callback which allows to run code from some MMIO
devices.

SysBusDevice with a MemoryRegion which implements the request_ptr callback will
be notified when the guest try to execute code from their offset. Then it will
be able to eg: pre-load some code from an SPI device or ask a pointer from an
external simulator, etc..

When the pointer or the data in it are no longer valid the device has to
invalidate it.

Reviewed-by: Edgar E. Iglesias <edgar.iglesias@xilinx.com>
Signed-off-by: KONRAD Frederic <fred.konrad@greensocs.com>
Signed-off-by: Edgar E. Iglesias <edgar.iglesias@xilinx.com>
2017-06-27 15:09:15 +02:00
Marc-André Lureau
6b9911d0b6 memory: remove memory_region_set_fd
Now unnecessary since ivshmem uses memory_region_init_ram_from_fd.

Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-Id: <20170602141229.15326-7-marcandre.lureau@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2017-06-15 11:04:05 +02:00
Marc-André Lureau
fea617c58b Add memory_region_init_ram_from_fd()
Add a new function to initialize a RAM memory region with a file
descriptor to be mmap-ed.

Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-Id: <20170602141229.15326-5-marcandre.lureau@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2017-06-15 11:04:05 +02:00
Juan Quintela
e8758b6229 trivial: Remove unneeded ifndef in memory.h
All the file is surounded already by #ifndef CONFIG_USER_ONLY.

Signed-off-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Laurent Vivier <lvivier@redhat.com>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2017-06-04 18:42:55 +03:00
Peter Xu
ad523590f6 memory: remove the last param in memory_region_iommu_replay()
We were always passing in that one as "false" to assume that's an read
operation, and we also assume that IOMMU translation would always have
that read permission. A better permission would be IOMMU_NONE since the
replay is after all not a real read operation, but just a page table
rebuilding process.

CC: David Gibson <david@gibson.dropbear.id.au>
CC: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Acked-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Jason Wang <jasowang@redhat.com>
2017-05-25 21:25:27 +03:00
Peter Xu
bf55b7afce memory: tune last param of iommu_ops.translate()
This patch converts the old "is_write" bool into IOMMUAccessFlags. The
difference is that "is_write" can only express either read/write, but
sometimes what we really want is "none" here (neither read nor write).
Replay is an good example - during replay, we should not check any RW
permission bits since thats not an actual IO at all.

CC: Paolo Bonzini <pbonzini@redhat.com>
CC: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Acked-by: David Gibson <david@gibson.dropbear.id.au>
Acked-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Jason Wang <jasowang@redhat.com>
2017-05-25 21:25:27 +03:00
Gerd Hoffmann
8deaf12ca1 memory: add support getting and using a dirty bitmap copy.
This patch adds support for getting and using a local copy of the dirty
bitmap.

memory_region_snapshot_and_clear_dirty() will create a snapshot of the
dirty bitmap for the specified range, clear the dirty bitmap and return
the copy.  The returned bitmap can be a bit larger than requested, the
range is expanded so the code can copy unsigned longs from the bitmap
and avoid atomic bit update operations.

memory_region_snapshot_get_dirty() will return the dirty status of
pages, pretty much like memory_region_get_dirty(), but using the copy
returned by memory_region_copy_and_clear_dirty().

Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Message-id: 20170421091632.30900-3-kraxel@redhat.com
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
2017-04-24 10:12:28 +02:00
Peter Xu
f06a696dc9 intel_iommu: provide its own replay() callback
The default replay() don't work for VT-d since vt-d will have a huge
default memory region which covers address range 0-(2^64-1). This will
normally consumes a lot of time (which looks like a dead loop).

The solution is simple - we don't walk over all the regions. Instead, we
jump over the regions when we found that the page directories are empty.
It'll greatly reduce the time to walk the whole region.

To achieve this, we provided a page walk helper to do that, invoking
corresponding hook function when we found an page we are interested in.
vtd_page_walk_level() is the core logic for the page walking. It's
interface is designed to suite further use case, e.g., to invalidate a
range of addresses.

Reviewed-by: Jason Wang <jasowang@redhat.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: \"Michael S. Tsirkin\" <mst@redhat.com>
Signed-off-by: Peter Xu <peterx@redhat.com>
Message-Id: <1491562755-23867-8-git-send-email-peterx@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
2017-04-20 15:22:41 -03:00
Peter Xu
faa362e3cc memory: add MemoryRegionIOMMUOps.replay() callback
Originally we have one memory_region_iommu_replay() function, which is
the default behavior to replay the translations of the whole IOMMU
region. However, on some platform like x86, we may want our own replay
logic for IOMMU regions. This patch adds one more hook for IOMMUOps for
the callback, and it'll override the default if set.

Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Reviewed-by: \"Michael S. Tsirkin\" <mst@redhat.com>
Signed-off-by: Peter Xu <peterx@redhat.com>
Message-Id: <1491562755-23867-6-git-send-email-peterx@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
2017-04-20 15:22:41 -03:00
Peter Xu
bd2bfa4c52 memory: introduce memory_region_notify_one()
Generalizing the notify logic in memory_region_notify_iommu() into a
single function. This can be further used in customized replay()
functions for IOMMUs.

Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Reviewed-by: \"Michael S. Tsirkin\" <mst@redhat.com>
Signed-off-by: Peter Xu <peterx@redhat.com>
Message-Id: <1491562755-23867-5-git-send-email-peterx@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
2017-04-20 15:22:41 -03:00
Peter Xu
de472e4a92 memory: provide iommu_replay_all()
This is an "global" version of existing memory_region_iommu_replay() -
we announce the translations to all the registered notifiers, instead of
a specific one.

Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: \"Michael S. Tsirkin\" <mst@redhat.com>
Signed-off-by: Peter Xu <peterx@redhat.com>
Message-Id: <1491562755-23867-4-git-send-email-peterx@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
2017-04-20 15:22:41 -03:00
Peter Xu
512fa40867 memory: provide IOMMU_NOTIFIER_FOREACH macro
A new macro is provided to iterate all the IOMMU notifiers hooked
under specific IOMMU memory region.

Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Reviewed-by: \"Michael S. Tsirkin\" <mst@redhat.com>
Signed-off-by: Peter Xu <peterx@redhat.com>
Message-Id: <1491562755-23867-3-git-send-email-peterx@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
2017-04-20 15:22:41 -03:00
Peter Xu
698feb5e13 memory: add section range info for IOMMU notifier
In this patch, IOMMUNotifier.{start|end} are introduced to store section
information for a specific notifier. When notification occurs, we not
only check the notification type (MAP|UNMAP), but also check whether the
notified iova range overlaps with the range of specific IOMMU notifier,
and skip those notifiers if not in the listened range.

When removing an region, we need to make sure we removed the correct
VFIOGuestIOMMU by checking the IOMMUNotifier.start address as well.

This patch is solving the problem that vfio-pci devices receive
duplicated UNMAP notification on x86 platform when vIOMMU is there. The
issue is that x86 IOMMU has a (0, 2^64-1) IOMMU region, which is
splitted by the (0xfee00000, 0xfeefffff) IRQ region. AFAIK
this (splitted IOMMU region) is only happening on x86.

This patch also helps vhost to leverage the new interface as well, so
that vhost won't get duplicated cache flushes. In that sense, it's an
slight performance improvement.

Suggested-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Acked-by: Alex Williamson <alex.williamson@redhat.com>
Signed-off-by: Peter Xu <peterx@redhat.com>
Message-Id: <1491562755-23867-2-git-send-email-peterx@redhat.com>
[ehabkost: included extra vhost_iommu_region_del() change from Peter Xu]
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
2017-04-20 15:22:41 -03:00
Paolo Bonzini
90c4fe5fc5 exec: revert MemoryRegionCache
MemoryRegionCache did not know about virtio support for IOMMUs (because the
two features were developed at the same time).  Revert MemoryRegionCache
to "normal" address_space_* operations for 2.9, as it is simpler than
undoing the virtio patches.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2017-04-03 13:41:53 +02:00
Dr. David Alan Gilbert
e8f5fe2de1 memory_region: Fix name comments
The 'name' parameter to memory_region_init_* had been marked as debug
only, however vmstate_region_ram uses it as a parameter to
qemu_ram_set_idstr to set RAMBlock names and these form part of the
migration stream.

Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Message-Id: <20170309152708.30635-1-dgilbert@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2017-03-14 13:26:36 +01:00
Paolo Bonzini
5eba0404b9 virtio: use MemoryRegionCache to access descriptors
For now, the cache is created on every virtqueue_pop.  Later on,
direct descriptors will be able to reuse it.

Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2017-02-17 21:52:30 +02:00
Peter Xu
57bb40c9db memory: hmp: add "-f" for "info mtree"
Adding one more option "-f" for "info mtree" to dump the flat views of
all the address spaces.

This will be useful to debug the memory rendering logic, also it'll be
much easier with it to know what memory region is handling what address
range.

Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Signed-off-by: Peter Xu <peterx@redhat.com>
Message-Id: <1484556005-29701-3-git-send-email-peterx@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2017-01-27 18:07:31 +01:00
Paolo Bonzini
0987d735a3 ramblock-notifier: new
This adds a notify interface of ram block additions and removals.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2017-01-16 17:52:35 +01:00
Jason Wang
12d37882f0 memory: handle alias in memory_region_is_iommu()
Cc: Paolo Bonzini <pbonzini@redhat.com>
Acked-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
2017-01-10 05:56:59 +02:00
Jason Wang
052c8fa998 exec: introduce address_space_get_iotlb_entry()
This patch introduces a helper to query the iotlb entry for a
possible iova. This will be used by later device IOTLB API to enable
the capability for a dataplane (e.g vhost) to query the IOTLB.

Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Peter Crosthwaite <crosthwaite.peter@gmail.com>
Cc: Richard Henderson <rth@twiddle.net>
Acked-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2017-01-10 05:56:58 +02:00
Paolo Bonzini
1f4e496e1f exec: introduce MemoryRegionCache
Device models often have to perform multiple access to a single
memory region that is known in advance, but would to use "DMA-style"
functions instead of address_space_map/unmap.  This can happen
for example when the data has to undergo endianness conversion.
Introduce a new data structure to cache the result of
address_space_translate without forcing usage of a host address
like address_space_map does.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-12-22 16:00:23 +01:00
Paolo Bonzini
0ce265ffef exec: introduce memory_ldst.inc.c
Templatize the address_space_* and *_phys functions, so that we can add
similar functions in the next patch that work with a lightweight,
cache-like version of address_space_map/unmap.

Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-12-22 16:00:23 +01:00
Alex Williamson
4a2e242bbb memory: Don't use memcpy for ram_device regions
With a vfio assigned device we lay down a base MemoryRegion registered
as an IO region, giving us read & write accessors.  If the region
supports mmap, we lay down a higher priority sub-region MemoryRegion
on top of the base layer initialized as a RAM device pointer to the
mmap.  Finally, if we have any quirks for the device (ie. address
ranges that need additional virtualization support), we put another IO
sub-region on top of the mmap MemoryRegion.  When this is flattened,
we now potentially have sub-page mmap MemoryRegions exposed which
cannot be directly mapped through KVM.

This is as expected, but a subtle detail of this is that we end up
with two different access mechanisms through QEMU.  If we disable the
mmap MemoryRegion, we make use of the IO MemoryRegion and service
accesses using pread and pwrite to the vfio device file descriptor.
If the mmap MemoryRegion is enabled and results in one of these
sub-page gaps, QEMU handles the access as RAM, using memcpy to the
mmap.  Using either pread/pwrite or the mmap directly should be
correct, but using memcpy causes us problems.  I expect that not only
does memcpy not necessarily honor the original width and alignment in
performing a copy, but it potentially also uses processor instructions
not intended for MMIO spaces.  It turns out that this has been a
problem for Realtek NIC assignment, which has such a quirk that
creates a sub-page mmap MemoryRegion access.

To resolve this, we disable memory_access_is_direct() for ram_device
regions since QEMU assumes that it can use memcpy for those regions.
Instead we access through MemoryRegionOps, which replaces the memcpy
with simple de-references of standard sizes to the host memory.

With this patch we attempt to provide unrestricted access to the RAM
device, allowing byte through qword access as well as unaligned
access.  The assumption here is that accesses initiated by the VM are
driven by a device specific driver, which knows the device
capabilities.  If unaligned accesses are not supported by the device,
we don't want them to work in a VM by performing multiple aligned
accesses to compose the unaligned access.  A down-side of this
philosophy is that the xp command from the monitor attempts to use
the largest available access weidth, unaware of the underlying
device.  Using memcpy had this same restriction, but at least now an
operator can dump individual registers, even if blocks of device
memory may result in access widths beyond the capabilities of a
given device (RTL NICs only support up to dword).

Reported-by: Thorsten Kohfeldt <thorsten.kohfeldt@gmx.de>
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Acked-by: Paolo Bonzini <pbonzini@redhat.com>
2016-10-31 09:53:03 -06:00
Alex Williamson
21e00fa55f memory: Replace skip_dump flag with "ram_device"
Setting skip_dump on a MemoryRegion allows us to modify one specific
code path, but the restriction we're trying to address encompasses
more than that.  If we have a RAM MemoryRegion backed by a physical
device, it not only restricts our ability to dump that region, but
also affects how we should manipulate it.  Here we recognize that
MemoryRegions do not change to sometimes allow dumps and other times
not, so we replace setting the skip_dump flag with a new initializer
so that we know exactly the type of region to which we're applying
this behavior.

Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Acked-by: Paolo Bonzini <pbonzini@redhat.com>
2016-10-31 09:53:03 -06:00
Paolo Bonzini
9a54635dcb memory: add a per-AddressSpace list of listeners
This speeds up MEMORY_LISTENER_CALL noticeably.  Right now,
with many PCI devices you have N regions added to M AddressSpaces
(M = # PCI devices with bus-master enabled) and each call looks
up the whole listener list, with at least M listeners in it.
Because most of the regions in N are BARs, which are also roughly
proportional to M, the whole thing is O(M^3).  This changes it
to O(M^2), which is the best we can do without rewriting the
whole thing.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-10-24 15:27:19 +02:00
Paolo Bonzini
d45fa784cd memory: eliminate global MemoryListeners
There is none, so just drop the code.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-10-24 15:27:19 +02:00
Paolo Bonzini
9c1f8f4493 migration: sync all address spaces
Migrating a VM during reboot sometimes results in differences
between the source and destination in the SMRAM area.

This is because migration_bitmap_sync() only fetches from KVM
the dirty log of address_space_memory.  SMRAM memory slots
are ignored and the modifications to SMRAM are not sent to the
destination.

Reported-by: He Rongguang <herongguang.he@huawei.com>
Reviewed-by: He Rongguang <herongguang.he@huawei.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-09-27 11:57:29 +02:00
Peter Xu
5bf3d31903 memory: introduce IOMMUOps.notify_flag_changed
The new interface can be used to replace the old notify_started() and
notify_stopped(). Meanwhile it provides explicit flags so that IOMMUs
can know what kind of notifications it is requested for.

Acked-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Peter Xu <peterx@redhat.com>
Message-Id: <1474606948-14391-3-git-send-email-peterx@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-09-27 09:00:04 +02:00
Peter Xu
cdb3081269 memory: introduce IOMMUNotifier and its caps
IOMMU Notifier list is used for notifying IO address mapping changes.
Currently VFIO is the only user.

However it is possible that future consumer like vhost would like to
only listen to part of its notifications (e.g., cache invalidations).

This patch introduced IOMMUNotifier and IOMMUNotfierFlag bits for a
finer grained control of it.

IOMMUNotifier contains a bitfield for the notify consumer describing
what kind of notification it is interested in. Currently two kinds of
notifications are defined:

- IOMMU_NOTIFIER_MAP:    for newly mapped entries (additions)
- IOMMU_NOTIFIER_UNMAP:  for entries to be removed (cache invalidates)

When registering the IOMMU notifier, we need to specify one or multiple
types of messages to listen to.

When notifications are triggered, its type will be checked against the
notifier's type bits, and only notifiers with registered bits will be
notified.

(For any IOMMU implementation, an in-place mapping change should be
 notified with an UNMAP followed by a MAP.)

Signed-off-by: Peter Xu <peterx@redhat.com>
Message-Id: <1474606948-14391-2-git-send-email-peterx@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-09-27 08:59:16 +02:00
Peter Maydell
39e0b03dec memory: Assert that memory_region_init_rom_device() ops aren't NULL
It doesn't make sense to pass a NULL ops argument to
memory_region_init_rom_device(), because the effect will
be that if the guest tries to write to the memory region
then QEMU will segfault. Catch the bug earlier by sanity
checking the arguments to this function, and remove the
misleading documentation that suggests that passing NULL
might be sensible.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 1467122287-24974-4-git-send-email-peter.maydell@linaro.org
2016-07-04 13:06:35 +01:00
Peter Maydell
a1777f7f64 memory: Provide memory_region_init_rom()
Provide a new helper function memory_region_init_rom() for memory
regions which are read-only (and unlike those created by
memory_region_init_rom_device() don't have special behaviour
for writes). This has the same behaviour as calling
memory_region_init_ram() and then memory_region_set_readonly()
(which is what we do today in boards with pure ROMs) but is a
more easily discoverable API for the purpose.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 1467122287-24974-2-git-send-email-peter.maydell@linaro.org
2016-07-04 13:06:35 +01:00
Alexey Kardashevskiy
d22d8956b1 memory: Add MemoryRegionIOMMUOps.notify_started/stopped callbacks
The IOMMU driver may change behavior depending on whether a notifier
client is present.  In the case of POWER, this represents a change in
the visibility of the IOTLB, for other drivers such as intel-iommu and
future AMD-Vi emulation, notifier support is not yet enabled and this
provides the opportunity to flag that incompatibility.

Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Peter Xu <peterx@redhat.com>
Tested-by: Peter Xu <peterx@redhat.com>
Acked-by: Paolo Bonzini <pbonzini@redhat.com>
[new log & extracted from [PATCH qemu v17 12/12] spapr_iommu, vfio, memory: Notify IOMMU about starting/stopping listening]
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2016-06-30 13:00:23 -06:00
Alexey Kardashevskiy
f682e9c244 memory: Add reporting of supported page sizes
Every IOMMU has some granularity which MemoryRegionIOMMUOps::translate
uses when translating, however this information is not available outside
the translate context for various checks.

This adds a get_min_page_size callback to MemoryRegionIOMMUOps and
a wrapper for it so IOMMU users (such as VFIO) can know the minimum
actual page size supported by an IOMMU.

As IOMMU MR represents a guest IOMMU, this uses TARGET_PAGE_SIZE
as fallback.

This removes vfio_container_granularity() and uses new helper in
memory_region_iommu_replay() when replaying IOMMU mappings on added
IOMMU memory region.

Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Acked-by: Alex Williamson <alex.williamson@redhat.com>
[dwg: Removed an unnecessary calculation]
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2016-06-22 11:13:09 +10:00
Paolo Bonzini
0878d0e11b exec: hide mr->ram_addr from qemu_get_ram_ptr users
Let users of qemu_get_ram_ptr and qemu_ram_ptr_length pass in an
address that is relative to the MemoryRegion.  This basically means
what address_space_translate returns.

Because the semantics of the second parameter change, rename the
function to qemu_map_ram_ptr.

Reviewed-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-05-29 09:11:12 +02:00
Paolo Bonzini
07bdaa4196 memory: split memory_region_from_host from qemu_ram_addr_from_host
Move the old qemu_ram_addr_from_host to memory_region_from_host and
make it return an offset within the region.  For qemu_ram_addr_from_host
return the ram_addr_t directly, similar to what it was before
commit 1b5ec23 ("memory: return MemoryRegion from qemu_ram_addr_from_host",
2013-07-04).

Reviewed-by: Marc-André Lureau <marcandre.lureau@gmail.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-05-29 09:11:12 +02:00
Paolo Bonzini
4ff87573df memory: remove qemu_get_ram_fd, qemu_set_ram_fd, qemu_ram_block_host_ptr
Remove direct uses of ram_addr_t and optimize memory_region_{get,set}_fd
now that a MemoryRegion knows its RAMBlock directly.

Reviewed-by: Marc-André Lureau <marcandre.lureau@gmail.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-05-29 09:11:12 +02:00
Fam Zheng
b613597819 memory: Remove code for mr->may_overlap
The collision check does nothing and hasn't been used. Remove the
variable together with related code.

Signed-off-by: Fam Zheng <famz@redhat.com>
Message-Id: <1458900629-2334-2-git-send-email-famz@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-05-23 16:53:44 +02:00
Paolo Bonzini
a7d6039cb3 cpu: move endian-dependent load/store functions to cpu-all.h
Disentangle cpu-common.h and memory.h from NEED_CPU_H.  Prototypes are
not defined for !NEED_CPU_H, so remove them from poison.h too.  Only
macros need poisoning.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-05-19 16:42:28 +02:00
Markus Armbruster
14b6d44d47 Use scripts/clean-includes to drop redundant qemu/typedefs.h
Re-run scripts/clean-includes to apply the previous commit's
corrections and updates.  Besides redundant qemu/typedefs.h, this only
finds a redundant config-host.h include in ui/egl-helpers.c.  No idea
how that escaped the previous runs.

Some manual whitespace trimming around dropped includes squashed in.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-03-22 22:20:16 +01:00
Fam Zheng
8e41fb63c5 memory: Drop MemoryRegion.ram_addr
All references to mr->ram_addr are replaced by
memory_region_get_ram_addr(mr) (except for a few assertions that are
replaced with mr->ram_block).

Reviewed-by: Gonglei <arei.gonglei@huawei.com>
Signed-off-by: Fam Zheng <famz@redhat.com>
Message-Id: <1456813104-25902-5-git-send-email-famz@redhat.com>
Acked-by: Laszlo Ersek <lersek@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-03-07 13:26:29 +01:00
Fam Zheng
7ebb2745ac memory: Implement memory_region_get_ram_addr with mr->ram_block
Signed-off-by: Fam Zheng <famz@redhat.com>
Message-Id: <1456813104-25902-4-git-send-email-famz@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-03-07 13:18:28 +01:00
Peter Maydell
586fc27e6a * Asynchronous dump-guest-memory from Peter
* improved logging with -D -daemonize from Dimitris
 * more address_space_* optimization from Gonglei
 * TCG xsave/xrstor thinko fix
 * chardev bugfix and documentation patch
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2
 
 iQEcBAABCAAGBQJWzxnbAAoJEL/70l94x66DlfkIAJyo9kOareVLOnAE8ccayghk
 0SbU1ZR9etgdeH4vZIUKzFzSg86pnqqub/w9yxgNG35PsiXVOnzgYSv6W1qsXdVE
 v32u6c+vWHvHc3cCOFI5+6nURftf2+p/vB2VFXiI23VUbhs22UAjXsUdfbp321X5
 Krme2fxk0kmwPHoKiyek0qiXa8nt0fiuFzU7DN7gTQMoDFaDEqvcULlNJHUFnep+
 M1yQfhSzrD97bafPwmIDU0LejJxrzR6phuzWedugU1tay6y3pD85wQqHdFI7fn/o
 dC4F+vg41Lc/5jKUgRMpe5FmX4VM+DvRdYgZZsp9/SkBM7DuL7crhVWVwvQj82Y=
 =8BvG
 -----END PGP SIGNATURE-----

Merge remote-tracking branch 'remotes/bonzini/tags/for-upstream' into staging

* Asynchronous dump-guest-memory from Peter
* improved logging with -D -daemonize from Dimitris
* more address_space_* optimization from Gonglei
* TCG xsave/xrstor thinko fix
* chardev bugfix and documentation patch

# gpg: Signature made Thu 25 Feb 2016 15:12:27 GMT using RSA key ID 78C7AE83
# gpg: Good signature from "Paolo Bonzini <bonzini@gnu.org>"
# gpg:                 aka "Paolo Bonzini <pbonzini@redhat.com>"

* remotes/bonzini/tags/for-upstream:
  target-i386: fix confusion in xcr0 bit position vs. mask
  chardev: Properly initialize ChardevCommon components
  memory: Remove unreachable return statement
  memory: optimize qemu_get_ram_ptr and qemu_ram_ptr_length
  exec: store RAMBlock pointer into memory region
  log: Redirect stderr to logfile if deamonized
  dump-guest-memory: add qmp event DUMP_COMPLETED
  Dump: add hmp command "info dump"
  Dump: add qmp command "query-dump"
  DumpState: adding total_size and written_size fields
  dump-guest-memory: add "detach" support
  dump-guest-memory: disable dump when in INMIGRATE state
  dump-guest-memory: introduce dump_process() helper function.
  dump-guest-memory: add dump_in_progress() helper function
  dump-guest-memory: using static DumpState, add DumpStatus
  dump-guest-memory: add "detach" flag for QMP/HMP interfaces.
  dump-guest-memory: cleanup: removing dump_{error|cleanup}().
  scripts/kvm/kvm_stat: Fix missing right parantheses and ".format(...)"
  qemu-options.hx: Improve documentation of chardev multiplexing mode

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2016-02-25 15:30:57 +00:00
Gonglei
d61524486c memory: Remove unreachable return statement
Signed-off-by: Gonglei <arei.gonglei@huawei.com>
Message-Id: <1455935721-8804-4-git-send-email-arei.gonglei@huawei.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-02-25 16:11:29 +01:00
Gonglei
3655cb9c73 memory: optimize qemu_get_ram_ptr and qemu_ram_ptr_length
these two functions consume too much cpu overhead to
find the RAMBlock by ram address.

After this patch, we can pass the RAMBlock pointer
to them so that they don't need to find the RAMBlock
anymore most of the time. We can get better performance
in address translation processing.

Signed-off-by: Gonglei <arei.gonglei@huawei.com>
Message-Id: <1455935721-8804-3-git-send-email-arei.gonglei@huawei.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-02-25 16:11:29 +01:00
Gonglei
58eaa2174e exec: store RAMBlock pointer into memory region
Each RAM memory region has a unique corresponding RAMBlock.
In the current realization, the memory region only stored
the ram_addr which means the offset of RAM address space,
We need to qurey the global ram.list to find the ram block
by ram_addr if we want to get the ram block, which is very
expensive.

Now, we store the RAMBlock pointer into memory region
structure. So, if we know the mr, we can easily get the
RAMBlock.

Signed-off-by: Gonglei <arei.gonglei@huawei.com>
Message-Id: <1456130097-4208-2-git-send-email-arei.gonglei@huawei.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-02-25 16:11:26 +01:00
Peter Maydell
90ce6e2644 include: Clean up includes
Clean up includes so that osdep.h is included first and headers
which it implies are not included manually.

This commit was created with scripts/clean-includes.

NB: If this commit breaks compilation for your out-of-tree
patchseries or fork, then you need to make sure you add
#include "qemu/osdep.h" to any new .c files that you have.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Eric Blake <eblake@redhat.com>
2016-02-23 12:43:05 +00:00
Peter Crosthwaite
f0c02d15b5 memory: Add address_space_init_shareable()
This will either create a new AS or return a pointer to an
already existing equivalent one, if we have already created
an AS for the specified root memory region.

The motivation is to reuse address spaces as much as possible.
It's going to be quite common that bus masters out in device land
have pointers to the same memory region for their mastering yet
each will need to create its own address space. Let the memory
API implement sharing for them.

Aside from the perf optimisations, this should reduce the amount
of redundant output on info mtree as well.

Thee returned value will be malloced, but the malloc will be
automatically freed when the AS runs out of refs.

Signed-off-by: Peter Crosthwaite <peter.crosthwaite@xilinx.com>
[PMM: dropped check for NULL root as unused; added doc-comment;
 squashed Peter C's reference-counting patch into this one;
 don't compare name string when deciding if we can share ASes;
 read as->malloced before the unref of as->root to avoid possible
 read-after-free if as->root was the owner of as]
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Acked-by: Edgar E. Iglesias <edgar.iglesias@xilinx.com>
2016-01-21 14:15:06 +00:00
Paolo Bonzini
3cc8f88499 memory: try to inline constant-length reads
memcpy can take a large amount of time for small reads and writes.
Handle the common case of reading s/g descriptors from memory (there
is no corresponding "write" case that is as common, because writes
often use address_space_st* functions) by inlining the relevant
parts of address_space_read into the caller.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2015-12-17 17:33:49 +01:00
Paolo Bonzini
1619d1fe73 memory: inline a few small accessors
These are used in the address_space_* fast paths.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2015-12-17 17:33:49 +01:00
Paolo Bonzini
a203ac702e memory: extract first iteration of address_space_read and address_space_write
We want to inline the case where there is only one iteration, because
then the compiler can also inline the memcpy.  As a start, extract
everything after the first address_space_translate call.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2015-12-17 17:33:49 +01:00
Paolo Bonzini
612263cf33 memory: avoid unnecessary object_ref/unref
For the common case of DMA into non-hotplugged RAM, it is unnecessary
but expensive to do object_ref/unref.  Add back an owner field to
MemoryRegion, so that these memory regions can skip the reference
counting.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2015-12-17 17:33:48 +01:00
Paolo Bonzini
a676854f34 memory: reorder MemoryRegion fields
Order fields so that all fields accessed during a RAM read/write fit in
the same cache line.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2015-12-17 17:33:48 +01:00
Paolo Bonzini
49b24afcb1 exec: always call qemu_get_ram_ptr within rcu_read_lock
Simplify the code and document the assumption.  The only caller
that is not within rcu_read_lock is memory_region_get_ram_ptr.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2015-12-17 17:33:48 +01:00
David Gibson
a788f227ef memory: Allow replay of IOMMU mapping notifications
When we have guest visible IOMMUs, we allow notifiers to be registered
which will be informed of all changes to IOMMU mappings.  This is used by
vfio to keep the host IOMMU mappings in sync with guest IOMMU mappings.

However, unlike with a memory region listener, an iommu notifier won't be
told about any mappings which already exist in the (guest) IOMMU at the
time it is registered.  This can cause problems if hotplugging a VFIO
device onto a guest bus which had existing guest IOMMU mappings, but didn't
previously have an VFIO devices (and hence no host IOMMU mappings).

This adds a memory_region_iommu_replay() function to handle this case.  It
replays any existing mappings in an IOMMU memory region to a specified
notifier.  Because the IOMMU memory region doesn't internally remember the
granularity of the guest IOMMU it has a small hack where the caller must
specify a granularity at which to replay mappings.

If there are finer mappings in the guest IOMMU these will be reported in
the iotlb structures passed to the notifier which it must handle (probably
causing it to flag an error).  This isn't new - the VFIO iommu notifier
must already handle notifications about guest IOMMU mappings too short
for it to represent in the host IOMMU.

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Laurent Vivier <lvivier@redhat.com>
Acked-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2015-10-05 12:39:03 -06:00
Veres Lajos
67cc32ebfd typofixes - v4
Signed-off-by: Veres Lajos <vlajos@gmail.com>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2015-09-11 10:45:43 +03:00
Daniel P. Berrange
b6af097528 maint: remove / fix many doubled words
Many source files have doubled words (eg "the the", "to to",
and so on). Most of these can simply be removed, but a couple
were actual mis-spellings (eg "to to" instead of "to do").
There was even one triple word score "to to to" :-)

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2015-09-11 10:21:38 +03:00
Pavel Fedin
6d6d2abf2c Merge memory_region_init_reservation() into memory_region_init_io()
Just specifying ops = NULL in some cases can be more convenient than having
two functions.

Signed-off-by: Pavel Fedin <p.fedin@samsung.com>
Acked-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 78a379ab1b6b30ab497db7971ad336dad1dbee76.1438758065.git.p.fedin@samsung.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2015-08-13 11:26:21 +01:00
Paolo Bonzini
deb809edb8 memory: count number of active VGA logging clients
For a board that has multiple framebuffer devices, both of them
might want to use DIRTY_MEMORY_VGA on the same memory region.
The lack of reference counting in memory_region_set_log makes
this very awkward to implement.

Suggested-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2015-07-24 13:57:45 +02:00
Peter Maydell
fba0a593b2 Stop including qemu-common.h in memory.h
Including qemu-common.h from other header files is generally a bad
idea, because it means it's very easy to end up with a circular
dependency. For instance, if we wanted to include memory.h from
qom/cpu.h we'd end up with this loop:
 memory.h -> qemu-common.h -> cpu.h -> cpu-qom.h -> qom/cpu.h -> memory.h

Remove the include from memory.h. This requires us to fix up a few
other files which were inadvertently getting declarations indirectly
through memory.h.

The biggest change is splitting the fprintf_function typedef out
into its own header so other headers can get at it without having
to include qemu-common.h.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Message-Id: <1435933104-15216-1-git-send-email-peter.maydell@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2015-07-06 14:59:09 +02:00
Jan Kiszka
196ea13104 memory: Add global-locking property to memory regions
This introduces the memory region property "global_locking". It is true
by default. By setting it to false, a device model can request BQL-free
dispatching of region accesses to its r/w handlers. The actual BQL
break-up will be provided in a separate patch.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Cc: Frederic Konrad <fred.konrad@greensocs.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Message-Id: <1434646046-27150-4-git-send-email-pbonzini@redhat.com>
2015-07-01 15:45:50 +02:00
Paolo Bonzini
677e7805cf memory: track DIRTY_MEMORY_CODE in mr->dirty_log_mask
DIRTY_MEMORY_CODE is only needed for TCG.  By adding it directly to
mr->dirty_log_mask, we avoid testing for TCG everywhere a region is
checked for the enabled/disabled state of dirty logging.

Reviewed-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2015-06-05 17:09:59 +02:00
Paolo Bonzini
b2dfd71c48 memory: prepare for multiple bits in the dirty log mask
When the dirty log mask will also cover other bits than DIRTY_MEMORY_VGA,
some listeners may be interested in the overall zero/non-zero value of
the dirty log mask; others may be interested in the value of single bits.

For this reason, always call log_start/log_stop if bits have respectively
appeared or disappeared, and pass the old and new values of the dirty log
mask so that listeners can distinguish the kinds of change.

For example, KVM checks if dirty logging used to be completely disabled
(in log_start) or is now completely disabled (in log_stop).  On the
other hand, Xen has to check manually if DIRTY_MEMORY_VGA changed,
since that is the only bit it cares about.

Reviewed-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2015-06-05 17:09:59 +02:00
Paolo Bonzini
2d1a35bef0 memory: differentiate memory_region_is_logging and memory_region_get_dirty_log_mask
For now memory regions only track DIRTY_MEMORY_VGA individually, but
this will change soon.  To support this, split memory_region_is_logging
in two functions: one that returns a given bit from dirty_log_mask,
and one that returns the entire mask.  memory_region_is_logging gets an
extra parameter so that the compiler flags misuse.

While VGA-specific users (including the Xen listener!) will want to keep
checking that bit, KVM and vhost check for "any bit except migration"
(because migration is handled via the global start/stop listener
callbacks).

Reviewed-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2015-06-05 17:09:58 +02:00
Paolo Bonzini
dbddac6da0 memory: the only dirty memory flag for users is DIRTY_MEMORY_VGA
DIRTY_MEMORY_MIGRATION is triggered by memory_global_dirty_log_start
and memory_global_dirty_log_stop, so it cannot be used with
memory_region_set_log.

Specify this in the documentation and assert it.

Reviewed-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2015-06-05 17:09:58 +02:00
Paolo Bonzini
41063e1e7a exec: move rcu_read_lock/unlock to address_space_translate callers
Once address_space_translate will be called outside the BQL, the returned
MemoryRegion might disappear as soon as the RCU read-side critical section
ends.  Avoid this by moving the critical section to the callers.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Message-Id: <1426684909-95030-3-git-send-email-pbonzini@redhat.com>
2015-04-30 16:55:32 +02:00
Peter Maydell
06feaacfb4 - miscellaneous cleanups for TCG (Emilio) and NBD (Bogdan)
- next part in the thread-safe address_space_* saga: atomic access
   to the bounce buffer and the map_clients list, from Fam
 - optional support for linking with tcmalloc, also from Fam
 - reapplying Peter Crosthwaite's "Respect as_translate_internal
   length clamp" after fixing the SPARC fallout.
 - build system fix from Wei Liu
 - small acpi-build and ioport cleanup by myself
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2
 
 iQEcBAABCAAGBQJVQJd4AAoJEL/70l94x66DYFYH/3ifhqWZsd4dfJri0CGAHI4i
 SpPmNeouc8W+F/3lwf6Inrh5NnTgd5QzoUBMQaWVkQKwUiWls8g2mXkT3jo0iDqT
 /B40YXnZjNm20MixNaZmk9AsOF6OqPM8EMufau874k5zTlx3tCGAW1QD+I1N7WK7
 DfsFsIUD1svo2prn55fSoitMG1TIVPnpcklb4YGJRbAacQYUDhr5KAIhT1quDR2R
 93BvToyQmPqRQ4YKqnJLp8HAkL4FaJumfFZVvyh2cZvyaYGN/RVdi2Dw985dJDPX
 /z4enE4GCAs4RDw3lZ1RDbiZDqpT2ibFgASg/arX3SxzqHirOGvMdkOjO99r9j4=
 =aLjh
 -----END PGP SIGNATURE-----

Merge remote-tracking branch 'remotes/bonzini/tags/for-upstream' into staging

- miscellaneous cleanups for TCG (Emilio) and NBD (Bogdan)
- next part in the thread-safe address_space_* saga: atomic access
  to the bounce buffer and the map_clients list, from Fam
- optional support for linking with tcmalloc, also from Fam
- reapplying Peter Crosthwaite's "Respect as_translate_internal
  length clamp" after fixing the SPARC fallout.
- build system fix from Wei Liu
- small acpi-build and ioport cleanup by myself

# gpg: Signature made Wed Apr 29 09:34:00 2015 BST using RSA key ID 78C7AE83
# gpg: Good signature from "Paolo Bonzini <bonzini@gnu.org>"
# gpg:                 aka "Paolo Bonzini <pbonzini@redhat.com>"
# gpg: WARNING: This key is not certified with sufficiently trusted signatures!
# gpg:          It is not certain that the signature belongs to the owner.
# Primary key fingerprint: 46F5 9FBD 57D6 12E7 BFD4  E2F7 7E15 100C CD36 69B1
#      Subkey fingerprint: F133 3857 4B66 2389 866C  7682 BFFB D25F 78C7 AE83

* remotes/bonzini/tags/for-upstream: (22 commits)
  nbd/trivial: fix type cast for ioctl
  translate-all: use bitmap helpers for PageDesc's bitmap
  target-i386: disable LINT0 after reset
  Makefile.target: prepend $libs_softmmu to $LIBS
  milkymist: do not modify libs-softmmu
  configure: Add support for tcmalloc
  exec: Respect as_translate_internal length clamp
  ioport: reserve the whole range of an I/O port in the AddressSpace
  ioport: loosen assertions on emulation of 16-bit ports
  ioport: remove wrong comment
  ide: there is only one data port
  gus: clean up MemoryRegionPortio
  sb16: remove useless mixer_write_indexw
  sun4m: fix slavio sysctrl and led register sizes
  acpi-build: remove dependency from ram_addr.h
  memory: add memory_region_ram_resize
  dma-helpers: Fix race condition of continue_after_map_failure and dma_aio_cancel
  exec: Notify cpu_register_map_client caller if the bounce buffer is available
  exec: Protect map_client_list with mutex
  linux-user, bsd-user: Remove two calls to cpu_exec_init_all
  ...

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2015-04-30 12:04:11 +01:00
Paolo Bonzini
37d7c08413 memory: add memory_region_ram_resize
This is a simple MemoryRegion wrapper for qemu_ram_resize.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2015-04-27 18:24:18 +02:00
Peter Maydell
500131154d exec.c: Add new address_space_ld*/st* functions
Add new address_space_ld*/st* functions which allow transaction
attributes and error reporting for basic load and stores. These
are named to be in line with the address_space_read/write/rw
buffer operations.

The existing ld/st*_phys functions are now wrappers around
the new functions.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Edgar E. Iglesias <edgar.iglesias@xilinx.com>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
2015-04-26 16:49:24 +01:00
Peter Maydell
5c9eb0286c exec.c: Make address_space_rw take transaction attributes
Make address_space_rw take transaction attributes, rather
than always using the 'unspecified' attributes.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Edgar E. Iglesias <edgar.iglesias@xilinx.com>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
2015-04-26 16:49:24 +01:00
Peter Maydell
3b64349539 memory: Replace io_mem_read/write with memory_region_dispatch_read/write
Rather than retaining io_mem_read/write as simple wrappers around
the memory_region_dispatch_read/write functions, make the latter
public and change all the callers to use them, since we need to
touch all the callsites anyway to add MemTxAttrs and MemTxResult
support. Delete io_mem_read and io_mem_write entirely.

(All the callers currently pass MEMTXATTRS_UNSPECIFIED
and convert the return value back to bool or ignore it.)

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
2015-04-26 16:49:23 +01:00
Peter Maydell
cc05c43ad9 memory: Define API for MemoryRegionOps to take attrs and return status
Define an API so that devices can register MemoryRegionOps whose read
and write callback functions are passed an arbitrary pointer to some
transaction attributes and can return a success-or-failure status code.
This will allow us to model devices which:
 * behave differently for ARM Secure/NonSecure memory accesses
 * behave differently for privileged/unprivileged accesses
 * may return a transaction failure (causing a guest exception)
   for erroneous accesses

This patch defines the new API and plumbs the attributes parameter through
to the memory.c public level functions io_mem_read() and io_mem_write(),
where it is currently dummied out.

The success/failure response indication is also propagated out to
io_mem_read() and io_mem_write(), which retain the old-style
boolean true-for-error return.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Acked-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
2015-04-26 16:49:23 +01:00
Paolo Bonzini
374f2981d1 memory: protect current_map by RCU
Replace the flat_view_mutex with RCU, avoiding futex contention for
dataplane on large systems and many iothreads.

Reviewed-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2015-02-02 16:55:10 +01:00
Michael S. Tsirkin
60786ef339 memory: API to allocate resizeable RAM MR
Add API to allocate resizeable RAM MR.

This looks just like regular RAM generally, but
has a special property that only a portion of it
(used_length) is actually used, and migrated.

This used_length size can change across reboots.

Follow up patches will change used_length for such blocks at migration,
making it easier to extend devices using such RAM (notably ACPI,
but in the future thinkably other ROMs) without breaking migration
compatibility or wasting ROM (guest) memory.

Device is notified on resize, so it can adjust if necessary.

Note: nothing prevents making all RAM resizeable in this way.
However, reviewers felt that only enabling this selectively will
make some class of errors easier to detect.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
2015-01-08 13:17:55 +02:00
Michael S. Tsirkin
e7af4c6730 memory: add memory_region_set_size
Add API to change MR size.
Will be used internally for RAM resize.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
2015-01-08 13:17:54 +02:00
Igor Mammedov
a2b257d621 memory: expose alignment used for allocating RAM as MemoryRegion API
introduce memory_region_get_alignment() that returns
underlying memory block alignment or 0 if it's not
relevant/implemented for backend.

Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2014-11-23 12:11:30 +02:00
Nikunj A Dadhania
e4dc3f5909 Add skip_dump flag to ignore memory region during dump
The PCI MMIO might be disabled or the device in the reset state.
Make sure we do not dump these memory regions.

Signed-off-by: Nikunj A Dadhania <nikunj@linux.vnet.ibm.com>
Acked-by: Alex Williamson <alex.williamson@redhat.com>
CC: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-10-31 11:29:01 +01:00
Hu Tao
33e0eb5297 memory: add parameter errp to memory_region_init_rom_device
Add parameter errp to memory_region_init_rom_device and update all call
sites to propagate the error.

Reviewed-by: Peter Crosthwaite <peter.crosthwaite@xilinx.com>
Signed-off-by: Hu Tao <hutao@cn.fujitsu.com>
[Propagate the error out of realize. - Paolo]
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-09-09 13:41:44 +02:00
Hu Tao
49946538d2 memory: add parameter errp to memory_region_init_ram
Add parameter errp to memory_region_init_ram and update all call sites
to pass in &error_abort.

Signed-off-by: Hu Tao <hutao@cn.fujitsu.com>
Reviewed-by: Peter Crosthwaite <peter.crosthwaite@xilinx.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-09-09 13:41:43 +02:00
Le Tan
8d7b8cb9c2 iommu: add is_write as a parameter to the translate function of MemoryRegionIOMMUOps
Add a bool variable is_write as a parameter to the translate function of
MemoryRegionIOMMUOps to indicate the operation of the access. It can be
used for correct fault reporting from within the callback.
Change the interface of related functions.

Signed-off-by: Le Tan <tamlokveer@gmail.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2014-08-28 23:10:22 +02:00
Peter Maydell
302fa28378 Revert "memory: Use canonical path component as the name"
This reverts commit b0225c2c0d
(which breaks building with Xen enabled and also leaks memory).

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-08-19 20:05:46 +01:00
Peter Crosthwaite
b0225c2c0d memory: Use canonical path component as the name
Rather than having the name as separate state. This prepares support
for creating a MemoryRegion dynamically (i.e. without
memory_region_init() and friends) and the MemoryRegion still getting
a usable name.

Signed-off-by: Peter Crosthwaite <peter.crosthwaite@xilinx.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-08-18 12:06:21 +02:00
Peter Crosthwaite
5d546d4b65 memory: constify memory_region_name
It doesn't change the MR and some prospective call sites will have
const MRs at hand.

Signed-off-by: Peter Crosthwaite <peter.crosthwaite@xilinx.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-08-18 12:06:21 +02:00
Paolo Bonzini
469b046ead memory: remove memory_region_destroy
The function is empty after the previous patch, so remove it.

Reviewed-by: Peter Crosthwaite <peter.crosthwaite@xilinx.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-08-18 12:06:21 +02:00
Peter Crosthwaite
d33382da9a memory: MemoryRegion: Add may-overlap and priority props
QOM propertyify the .may-overlap and .priority fields. The setters
will re-add the memory as a subregion if needed (i.e. the values change
when the memory region is already contained).

Signed-off-by: Peter Crosthwaite <peter.crosthwaite@xilinx.com>
[Remove setters. - Paolo]
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-07-01 10:20:41 +02:00
Paolo Bonzini
22a893e4f5 memory: MemoryRegion: replace owner field with QOM parent
The two are now the same.

Reviewed-by: Peter Crosthwaite <peter.crosthwaite@xilinx.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-07-01 10:20:41 +02:00
Peter Crosthwaite
b4fefef9d5 memory: MemoryRegion: QOMify
QOMify memory regions as an Object. The former init() and destroy()
routines become instance_init() and instance_finalize() resp.

memory_region_init() is re-implemented to be:
object_initialize() + set fields

memory_region_destroy() is re-implemented to call unparent().

Signed-off-by: Peter Crosthwaite <peter.crosthwaite@xilinx.com>
[Add newly-created MR as child, unparent on destruction. - Paolo]
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-07-01 10:20:41 +02:00
Paolo Bonzini
dbcb898118 hostmem: add property to map memory with MAP_SHARED
A new "share" property can be used with the "memory-file" backend to
map memory with MAP_SHARED instead of MAP_PRIVATE.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Hu Tao <hutao@cn.fujitsu.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2014-06-19 18:44:20 +03:00
Paolo Bonzini
a35ba7be4b hostmem: allow preallocation of any memory region
And allow preallocation of file-based memory even without -mem-prealloc.
Some care is necessary because -mem-prealloc does not allow disabling
preallocation for hostmem-file.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Hu Tao <hutao@cn.fujitsu.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2014-06-19 18:44:20 +03:00
Paolo Bonzini
7f56e740a6 memory: add error propagation to file-based RAM allocation
Right now, -mem-path will fall back to RAM-based allocation in some
cases.  This should never happen with "-object memory-file", prepare
the code by adding correct error propagation.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Hu Tao <hutao@cn.fujitsu.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>

MST: drop \n at end of error messages
2014-06-19 18:44:20 +03:00
Paolo Bonzini
0b183fc871 memory: move mem_path handling to memory_region_allocate_system_memory
Like the previous patch did in exec.c, split memory_region_init_ram and
memory_region_init_ram_from_file, and push mem_path one step further up.
Other RAM regions than system memory will now be backed by regular RAM.

Also, boards that do not use memory_region_allocate_system_memory will
not support -mem-path anymore.  This can be changed before the patches
are merged by migrating boards to use the function.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Hu Tao <hutao@cn.fujitsu.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2014-06-19 18:44:19 +03:00
Igor Mammedov
eed2bacfd2 memory: add memory_region_is_mapped() API
which allows to check if MemoryRegion is already mapped.

Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2014-06-19 16:41:47 +03:00
Paolo Bonzini
feca4ac18b memory: MemoryRegion: rename parent to container
Avoid confusion with the QOM parent.

Reviewed-by: Peter Crosthwaite <peter.crosthwaite@xilinx.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-06-18 15:32:50 +02:00
Fam Zheng
edc1ba7a7a docs/memory.txt: Fix document on MMIO operations
.impl.valid should be .impl.unaligned and the description needs some
fixes.

.old_portio is removed since commit b40acf99b (ioport: Switch
dispatching to memory core layer).

Signed-off-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2014-05-07 21:00:44 +04:00
Igor Mammedov
8e46bbf362 memory_region_present: return false if address is not found in child MemoryRegion
Windows XP shows COM2 port as non functional in
"Device Manager" although no COM2 port backing device
is present in QEMU.

This regression is really due to
3bb28b7208b349e7a1b326e3c6ef9efac1d462bf?
    memory: Provide separate handling of unassigned io ports accesses

That is caused by the fact that QEMU reports to
OSPM that device is present by setting 5th bit in
PII4XPM.pci_conf[0x67] register when COM2 doesn't
exist.

It happens due to memory_region_present(io_as, 0x2f8)
returning false positive since 0x2f8 address eventually
translates into catchall io_as address space.

Fix memory_region_present(parent, addr) by returning
true only if addr maps into a MemoryRegion within
parent (excluding parent itself), to match its
doc comment.

While at it fix copy/paste error in
memory_region_present() doc comment.

Note: this is a temporary hack: we really need better handling for
unassigned regions, we should avoid fallback regions since they are bad
for performance (breaking radix tree assumption that the data structure
is sparsely populated); for memory we need to fix this to implement PCI
master abort properly, anyway.

Cc: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2014-03-09 21:09:37 +02:00
Edgar E. Iglesias
c6c6958c98 memory: Add MemoryListener to typedefs.h
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Edgar E. Iglesias <edgar.iglesias@xilinx.com>
2014-02-11 22:56:31 +10:00
Juan Quintela
1ab4c8ceaa memory: split dirty bitmap into three
After all the previous patches, spliting the bitmap gets direct.

Note: For some reason, I have to move DIRTY_MEMORY_* definitions to
the beginning of memory.h to make compilation work.

Signed-off-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Orit Wasserman <owasserm@redhat.com>
2014-01-13 14:04:54 +01:00
Juan Quintela
5adca7ace9 memory: use bit 2 for migration
For historical reasons it was bit 3.  Once there, create a constant to
know the number of clients.

Signed-off-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Orit Wasserman <owasserm@redhat.com>
2014-01-13 14:04:54 +01:00
Juan Quintela
5215919291 memory: cpu_physical_memory_mask_dirty_range() always clears a single flag
Document it

Signed-off-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Orit Wasserman <owasserm@redhat.com>
2014-01-13 14:04:54 +01:00
Marcel Apfelbaum
a1ff8ae066 memory: Change MemoryRegion priorities from unsigned to signed
When memory regions overlap, priority can be used to specify
which of them takes priority. By making the priority values signed
rather than unsigned, we make it more convenient to implement
a situation where one "background" region should appear only
where no other region exists: rather than having to explicitly
specify a high priority for all the other regions, we can let them take
the default (zero) priority and specify a negative priority for the
background region.

Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Marcel Apfelbaum <marcel.a@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2013-10-14 17:11:44 +03:00
Paolo Bonzini
0075270317 exec: separate current radix tree from the one being built
This same treatment previously done to phys_node_map and phys_sections
is now applied to the dispatch field of AddressSpace.  Topology updates
use as->next_dispatch while accesses use as->dispatch.

Reviewed-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2013-07-04 17:42:49 +02:00
Paolo Bonzini
89ae337acb exec: move listener from AddressSpaceDispatch to AddressSpace
This will help having two copies of AddressSpaceDispatch during the
recreation of the radix tree (one being built, and one that is complete
and will be protected by RCU).  We do not want to have to unregister and
re-register the listener.

Reviewed-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2013-07-04 17:42:49 +02:00
Paolo Bonzini
c2fc83e83d memory: move MemoryListener declaration earlier
Reviewed-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2013-07-04 17:42:49 +02:00
Paolo Bonzini
3ce10901ca memory: introduce memory_region_present
This new API will avoid having too many memory_region_ref/unref
in paths that currently use memory_region_find.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2013-07-04 17:42:45 +02:00
Paolo Bonzini
46637be269 memory: add ref/unref
Reviewed-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2013-07-04 17:42:45 +02:00
Paolo Bonzini
803c0816a7 memory: add getter for owner
Whenever memory regions are accessed outside the BQL, they need to be
preserved against hot-unplug.  MemoryRegions actually do not have their
own reference count; they piggyback on a QOM object, their "owner".
The owner is set at creation time, and there is a function to retrieve
the owner.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2013-07-04 17:42:45 +02:00
Paolo Bonzini
2c9b15cab1 memory: add owner argument to initialization functions
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2013-07-04 17:42:44 +02:00
Jan Kiszka
5767e4e198 ioport: Move portio types to ioport.h
This decouples memory.h from ioport.h, concentrating all portio related
types in a single header.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2013-07-04 17:42:44 +02:00
Jan Kiszka
0659097de2 ioport: Remove unused old dispatching services
Remove unused ioport_register and isa_unassign_ioport along with
everything that only those services used.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2013-07-04 17:42:44 +02:00
Jan Kiszka
b40acf99be ioport: Switch dispatching to memory core layer
The current ioport dispatcher is a complex beast, mostly due to the
need to deal with old portio interface users. But we can overcome it
without converting all portio users by embedding the required base
address of a MemoryRegionPortio access into that data structure. That
removes the need to have the additional MemoryRegionIORange structure
in the loop on every access.

To handle old portio memory ops, we simply install dispatching handlers
for portio memory regions when registering them with the memory core.
This removes the need for the old_portio field.

We can drop the additional aliasing of ioport regions and also the
special address space listener. cpu_in and cpu_out now simply call
address_space_read/write. And we can concentrate portio handling in a
single source file.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2013-07-04 17:42:44 +02:00
Andreas Färber
ce927ed9e4 hwaddr: Make hwaddr type usable beyond softmmu
While not normally needed for *-user, it can safely be used there since
always based on uint64_t, to avoid ifdeffery.

To avoid accidental uses, move the guards from exec/hwaddr.h to its
inclusion sites.  No need for them in include/hw/.

Prepares for hwaddr use in qom/cpu.h.

Signed-off-by: Andreas Färber <afaerber@suse.de>
2013-06-28 13:25:13 +02:00
Alexey Kardashevskiy
7dca8043f3 memory: give name to every AddressSpace
The "info mtree" command in QEMU console prints only "memory" and "I/O"
address spaces while there are actually a lot more other AddressSpace
structs created by PCI and VIO devices. Those devices do not normally
have names and therefore not present in "info mtree" output.

The patch fixes this.

Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2013-06-20 16:39:52 +02:00
David Gibson
068665757d memory: Add iommu map/unmap notifiers
This patch adds a NotifierList to MemoryRegions which represent IOMMUs
allowing other parts of the code to register interest in mappings or
unmappings from the IOMMU.  All IOMMU implementations will need to call
memory_region_notify_iommu() to inform those waiting on the notifier list,
whenever an IOMMU mapping is made or removed.

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2013-06-20 16:32:47 +02:00
Avi Kivity
3095115744 memory: iommu support
Add a new memory region type that translates addresses it is given,
then forwards them to a target address space.  This is similar to
an alias, except that the mapping is more flexible than a linear
translation and trucation, and also less efficient since the
translation happens at runtime.

The implementation uses an AddressSpace mapping the target region to
avoid hierarchical dispatch all the way to the resolved region; only
iommu regions are looked up dynamically.

Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Avi Kivity <avi.kivity@gmail.com>
[Modified to put translation in address_space_translate; assume
 IOMMUs are not reachable from TCG. - Paolo]
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2013-06-20 16:32:47 +02:00
Paolo Bonzini
052e87b073 memory: make section size a 128-bit integer
So far, the size of all regions passed to listeners could fit in 64 bits,
because artificial regions (containers and aliases) are eliminated by
the memory core, leaving only device regions which have reasonable sizes

An IOMMU however cannot be eliminated by the memory core, and may have
an artificial size, hence we may need 65 bits to represent its size.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2013-06-20 16:32:47 +02:00
Paolo Bonzini
99b9cc0679 Revert "memory: limit sections in the radix tree to the actual address space size"
This reverts commit 86a8623692.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2013-06-20 16:32:46 +02:00
Paolo Bonzini
5c8a00ce18 exec: return MemoryRegion from address_space_translate
Only address_space_translate_for_iotlb needs to return the section.
Every caller of address_space_translate now uses only section->mr,
return it directly.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2013-06-20 16:32:46 +02:00
Paolo Bonzini
fd8aaa767a memory: add return value to address_space_rw/read/write
Reviewed-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2013-05-29 16:27:34 +02:00
Paolo Bonzini
51644ab70b memory: add address_space_access_valid
The old-style IOMMU lets you check whether an access is valid in a
given DMAContext.  There is no equivalent for AddressSpace in the
memory API, implement it with a lookup of the dispatch tree.

Reviewed-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2013-05-29 16:27:16 +02:00
Paolo Bonzini
149f54b53b memory: add address_space_translate
Using phys_page_find to translate an AddressSpace to a MemoryRegionSection
is unwieldy.  It requires to pass the page index rather than the address,
and later memory_region_section_addr has to be called.  Replace
memory_region_section_addr with a function that does all of it: call
phys_page_find, compute the offset within the region, and check how
big the current mapping is.  This way, a large flat region can be written
with a single lookup rather than a page at a time.

address_space_translate will also provide a single point where IOMMU
forwarding is implemented.

Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2013-05-29 16:26:50 +02:00