Commit Graph

754 Commits

Author SHA1 Message Date
Peter Xu
be9b23c4a5 ramblock: add new hmp command "info ramblock"
To dump information about ramblocks. It looks like:

(qemu) info ramblock
              Block Name    PSize              Offset               Used              Total
            /objects/mem    2 MiB  0x0000000000000000 0x0000000080000000 0x0000000080000000
                vga.vram    4 KiB  0x0000000080060000 0x0000000001000000 0x0000000001000000
    /rom@etc/acpi/tables    4 KiB  0x00000000810b0000 0x0000000000020000 0x0000000000200000
                 pc.bios    4 KiB  0x0000000080000000 0x0000000000040000 0x0000000000040000
  0000:00:03.0/e1000.rom    4 KiB  0x0000000081070000 0x0000000000040000 0x0000000000040000
                  pc.rom    4 KiB  0x0000000080040000 0x0000000000020000 0x0000000000020000
    0000:00:02.0/vga.rom    4 KiB  0x0000000081060000 0x0000000000010000 0x0000000000010000
   /rom@etc/table-loader    4 KiB  0x00000000812b0000 0x0000000000001000 0x0000000000001000
      /rom@etc/acpi/rsdp    4 KiB  0x00000000812b1000 0x0000000000001000 0x0000000000001000

Ramblock is something hidden internally in QEMU implementation, and this
command should only be used by mostly QEMU developers on RAM stuff. It
is not a command suitable for QMP interface. So only HMP interface is
provided for it.

Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Signed-off-by: Peter Xu <peterx@redhat.com>
Message-Id: <1494562661-9063-4-git-send-email-peterx@redhat.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
2017-05-17 17:31:16 +01:00
Peter Xu
99e15582de ramblock: add RAMBLOCK_FOREACH()
So that it can simplifies the iterators.

Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Signed-off-by: Peter Xu <peterx@redhat.com>
Message-Id: <1494562661-9063-2-git-send-email-peterx@redhat.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
2017-05-17 17:30:37 +01:00
Juan Quintela
6b6712efcc ram: Split dirty bitmap by RAMBlock
Both the ram bitmap and the unsent bitmap are split by RAMBlock.

Signed-off-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: zhanghailiang <zhang.zhanghailiang@huawei.com>
Reviewed-by: Peter Xu <peterx@redhat.com>

--

Fix compilation when DEBUG_POSTCOPY is enabled (thanks Hailiang)
2017-05-04 10:00:38 +02:00
Peter Maydell
52e94ea5de Xen 2017/04/21 + fix
-----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQIcBAABAgAGBQJY/5EdAAoJEIlPj0hw4a6QHj8P/1D3iZq8vKyLvnkioYC00ao8
 dhuCFV1Dk0dl/QXvDXxNAHFqmHp2PzHeHF2V19bTbUh9QnY3WH9aSsMQ7YVPDzLb
 6ZezbxXd1l8+IOrWh+ch+x6jiUpRAeHGjhSpFxQEmH5THBmPtLHBFhAeeGpVq6CT
 MPFlN0mr1snyMMbK2aKCVTHgh5Wip83/t/kijIq4RmPwXdJbwxx55PL8jxC/NZHq
 /NbAZzAT149fmItfBNnsRTfNoDs7fSmgM9VelYsnJNHs6qkYYfewxys4vudePG8n
 ZcjCXMv9vdCSTfXfYY9nxhfGCgkkRSvBWrzARRIUPxTXGAmEU52LJrccpG9AEFRZ
 Kgz1JYr0y+u/g+RsCJvAglvmciawXgZH8GIR4sFl2iv4u6cx8PxNRgjTdt7YX4Je
 V7+scmOLB0U5wkI7PUcPV1v2fwKJHFfMdyik257otfMCJiTCnWGJfCZGR4JAdy3C
 NHuG4eIj2XLjZ7IQIo3mg+EfdCWdfVMKtXj0MAFsP6Kcr6sY/ef5sx/wB61k3NU1
 Y4ctI/LAOONsjlC2yzJ4aZR4LgyFcVcwx8FKOK7niCcCgVLYGWpyiHU3kFKYlmKM
 FKIlGPPOK8WuRV9NUGDI9A8XENyFXGyiR2xGCotfgHj+FSSqRzhKH4ecfgNimJVY
 wjTzZmBa68pLHdXaJ+SV
 =OfeB
 -----END PGP SIGNATURE-----

Merge remote-tracking branch 'remotes/sstabellini/tags/xen-20170421-v2-tag' into staging

Xen 2017/04/21 + fix

# gpg: Signature made Tue 25 Apr 2017 19:10:37 BST
# gpg:                using RSA key 0x894F8F4870E1AE90
# gpg: Good signature from "Stefano Stabellini <stefano.stabellini@eu.citrix.com>"
# gpg:                 aka "Stefano Stabellini <sstabellini@kernel.org>"
# Primary key fingerprint: D04E 33AB A51F 67BA 07D3  0AEA 894F 8F48 70E1 AE90

* remotes/sstabellini/tags/xen-20170421-v2-tag: (21 commits)
  move xen-mapcache.c to hw/i386/xen/
  move xen-hvm.c to hw/i386/xen/
  move xen-common.c to hw/xen/
  add xen-9p-backend to MAINTAINERS under Xen
  xen/9pfs: build and register Xen 9pfs backend
  xen/9pfs: send responses back to the frontend
  xen/9pfs: implement in/out_iov_from_pdu and vmarshal/vunmarshal
  xen/9pfs: receive requests from the frontend
  xen/9pfs: connect to the frontend
  xen/9pfs: introduce Xen 9pfs backend
  9p: introduce a type for the 9p header
  xen: import ring.h from xen
  configure: use pkg-config for obtaining xen version
  xen: additionally restrict xenforeignmemory operations
  xen: use libxendevice model to restrict operations
  xen: use 5 digit xen versions
  xen: use libxendevicemodel when available
  configure: detect presence of libxendevicemodel
  xen: create wrappers for all other uses of xc_hvm_XXX() functions
  xen: rename xen_modified_memory() to xen_hvm_modified_memory()
  ...

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2017-04-26 10:22:31 +01:00
Gerd Hoffmann
8deaf12ca1 memory: add support getting and using a dirty bitmap copy.
This patch adds support for getting and using a local copy of the dirty
bitmap.

memory_region_snapshot_and_clear_dirty() will create a snapshot of the
dirty bitmap for the specified range, clear the dirty bitmap and return
the copy.  The returned bitmap can be a bit larger than requested, the
range is expanded so the code can copy unsigned longs from the bitmap
and avoid atomic bit update operations.

memory_region_snapshot_get_dirty() will return the dirty status of
pages, pretty much like memory_region_get_dirty(), but using the copy
returned by memory_region_copy_and_clear_dirty().

Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Message-id: 20170421091632.30900-3-kraxel@redhat.com
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
2017-04-24 10:12:28 +02:00
Peter Maydell
32c7e0ab75 migration/next for 20170421
-----BEGIN PGP SIGNATURE-----
 
 iQIcBAABCAAGBQJY+d69AAoJEPSH7xhYctcj/4oQAIFFEyWaqrL9ve5ySiJgdtcY
 zYtiIhZQ+nPuy2i1oDSX+vbMcmkJDDyfO5qLovxyHGkZHniR8HtxNHP+MkZQa07p
 DiSIvd51HvcixIouhbGcoUCU63AYxqNL3o5/TyNpUI72nvsgwl3yfOot7PtutE/F
 r384j8DrOJ9VwC5GGPg27mJvRPvyfDQWfxDCyMYVw153HTuwVYtgiu/layWojJDV
 D2L1KV45ezBuGckZTHt9y6K4J5qz8qHb/dJc+whBBjj4j9T9XOILU9NPDAEuvjFZ
 gHbrUyxj7EiApjHcDZoQm9Raez422ALU30yc9Kn7ik7vSqTxk2Ejq6Gz7y9MJrDn
 KdMj75OETJNjBL+0T9MmbtWts28+aalpTUXtBpmi3eWQV5Hcox2NF1RP42jtD9Pa
 lkrM6jv0nsdNfBPlQ+ZmBTJxysWECcMqy487nrzmPNC8vZfokjXL5be12puho9fh
 ziU4gx9C6/k82S+/H6WD/AUtRiXJM7j4oTU2mnjrsSXQC1JNWqODBOFUo9zsDufl
 vtcrxfPhSD1DwOInFSIBHf/RylcgTkPCL0rPoJ8npNDly6rHFYJ+oIbsn84Z4uYY
 RWvH8xB9wgRlK9L1WdRgOd2q7PaeHQoSSdPOiS9YVEVMVvSW8Es5CRlhcAsw/M/T
 1Tl65cNrjETAuZKL3dLH
 =EsZ5
 -----END PGP SIGNATURE-----

Merge remote-tracking branch 'remotes/juanquintela/tags/migration/20170421' into staging

migration/next for 20170421

# gpg: Signature made Fri 21 Apr 2017 11:28:13 BST
# gpg:                using RSA key 0xF487EF185872D723
# gpg: Good signature from "Juan Quintela <quintela@redhat.com>"
# gpg:                 aka "Juan Quintela <quintela@trasno.org>"
# Primary key fingerprint: 1899 FF8E DEBF 58CC EE03  4B82 F487 EF18 5872 D723

* remotes/juanquintela/tags/migration/20170421: (65 commits)
  hmp: info migrate_parameters format tunes
  hmp: info migrate_capability format tunes
  migration: rename max_size to threshold_size
  migration: set current_active_state once
  virtio-rng: stop virtqueue while the CPU is stopped
  migration: don't close a file descriptor while it can be in use
  ram: Remove migration_bitmap_extend()
  migration: Disable hotplug/unplug during migration
  qdev: Move qdev_unplug() to qdev-monitor.c
  qdev: Export qdev_hot_removed
  qdev: qdev_hotplug is really a bool
  migration: Remove MigrationState parameter from migration_is_idle()
  ram: Use RAMBitmap type for coherence
  ram: rename last_ram_offset() last_ram_pages()
  ram: Use ramblock and page offset instead of absolute offset
  ram: Change offset field in PageSearchStatus to page
  ram: Remember last_page instead of last_offset
  ram: Use page number instead of an address for the bitmap operations
  ram: reorganize last_sent_block
  ram: ram_discard_range() don't use the mis parameter
  ...

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2017-04-21 15:59:27 +01:00
Juan Quintela
66103a5796 ram: Remove migration_bitmap_extend()
We have disabled memory hotplug, so we don't need to handle
migration_bitamp there.

Signed-off-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: zhanghailiang <zhang.zhanghailiang@huawei.com>
2017-04-21 12:25:40 +02:00
Juan Quintela
b8c4899398 ram: rename last_ram_offset() last_ram_pages()
We always use it as pages anyways.

Signed-off-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
2017-04-21 12:25:40 +02:00
Juan Quintela
15440dd5a0 ram: Pass RAMBlock to bitmap_sync
We change the meaning of start to be the offset from the beggining of
the block.

Signed-off-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
2017-04-21 12:25:39 +02:00
Juan Quintela
68908ed665 ram: Change num_dirty_pages_period type to uint64_t
Signed-off-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
2017-04-21 12:25:36 +02:00
Peter Xu
f06a696dc9 intel_iommu: provide its own replay() callback
The default replay() don't work for VT-d since vt-d will have a huge
default memory region which covers address range 0-(2^64-1). This will
normally consumes a lot of time (which looks like a dead loop).

The solution is simple - we don't walk over all the regions. Instead, we
jump over the regions when we found that the page directories are empty.
It'll greatly reduce the time to walk the whole region.

To achieve this, we provided a page walk helper to do that, invoking
corresponding hook function when we found an page we are interested in.
vtd_page_walk_level() is the core logic for the page walking. It's
interface is designed to suite further use case, e.g., to invalidate a
range of addresses.

Reviewed-by: Jason Wang <jasowang@redhat.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: \"Michael S. Tsirkin\" <mst@redhat.com>
Signed-off-by: Peter Xu <peterx@redhat.com>
Message-Id: <1491562755-23867-8-git-send-email-peterx@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
2017-04-20 15:22:41 -03:00
Peter Xu
faa362e3cc memory: add MemoryRegionIOMMUOps.replay() callback
Originally we have one memory_region_iommu_replay() function, which is
the default behavior to replay the translations of the whole IOMMU
region. However, on some platform like x86, we may want our own replay
logic for IOMMU regions. This patch adds one more hook for IOMMUOps for
the callback, and it'll override the default if set.

Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Reviewed-by: \"Michael S. Tsirkin\" <mst@redhat.com>
Signed-off-by: Peter Xu <peterx@redhat.com>
Message-Id: <1491562755-23867-6-git-send-email-peterx@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
2017-04-20 15:22:41 -03:00
Peter Xu
bd2bfa4c52 memory: introduce memory_region_notify_one()
Generalizing the notify logic in memory_region_notify_iommu() into a
single function. This can be further used in customized replay()
functions for IOMMUs.

Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Reviewed-by: \"Michael S. Tsirkin\" <mst@redhat.com>
Signed-off-by: Peter Xu <peterx@redhat.com>
Message-Id: <1491562755-23867-5-git-send-email-peterx@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
2017-04-20 15:22:41 -03:00
Peter Xu
de472e4a92 memory: provide iommu_replay_all()
This is an "global" version of existing memory_region_iommu_replay() -
we announce the translations to all the registered notifiers, instead of
a specific one.

Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: \"Michael S. Tsirkin\" <mst@redhat.com>
Signed-off-by: Peter Xu <peterx@redhat.com>
Message-Id: <1491562755-23867-4-git-send-email-peterx@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
2017-04-20 15:22:41 -03:00
Peter Xu
512fa40867 memory: provide IOMMU_NOTIFIER_FOREACH macro
A new macro is provided to iterate all the IOMMU notifiers hooked
under specific IOMMU memory region.

Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Reviewed-by: \"Michael S. Tsirkin\" <mst@redhat.com>
Signed-off-by: Peter Xu <peterx@redhat.com>
Message-Id: <1491562755-23867-3-git-send-email-peterx@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
2017-04-20 15:22:41 -03:00
Peter Xu
698feb5e13 memory: add section range info for IOMMU notifier
In this patch, IOMMUNotifier.{start|end} are introduced to store section
information for a specific notifier. When notification occurs, we not
only check the notification type (MAP|UNMAP), but also check whether the
notified iova range overlaps with the range of specific IOMMU notifier,
and skip those notifiers if not in the listened range.

When removing an region, we need to make sure we removed the correct
VFIOGuestIOMMU by checking the IOMMUNotifier.start address as well.

This patch is solving the problem that vfio-pci devices receive
duplicated UNMAP notification on x86 platform when vIOMMU is there. The
issue is that x86 IOMMU has a (0, 2^64-1) IOMMU region, which is
splitted by the (0xfee00000, 0xfeefffff) IRQ region. AFAIK
this (splitted IOMMU region) is only happening on x86.

This patch also helps vhost to leverage the new interface as well, so
that vhost won't get duplicated cache flushes. In that sense, it's an
slight performance improvement.

Suggested-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Acked-by: Alex Williamson <alex.williamson@redhat.com>
Signed-off-by: Peter Xu <peterx@redhat.com>
Message-Id: <1491562755-23867-2-git-send-email-peterx@redhat.com>
[ehabkost: included extra vhost_iommu_region_del() change from Peter Xu]
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
2017-04-20 15:22:41 -03:00
Paolo Bonzini
90c4fe5fc5 exec: revert MemoryRegionCache
MemoryRegionCache did not know about virtio support for IOMMUs (because the
two features were developed at the same time).  Revert MemoryRegionCache
to "normal" address_space_* operations for 2.9, as it is simpler than
undoing the virtio patches.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2017-04-03 13:41:53 +02:00
Paul Durrant
5100afb5f5 xen: rename xen_modified_memory() to xen_hvm_modified_memory()
This patch is a purely cosmetic change that avoids a name collision in
a subsequent patch.

Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
Reviewed-by: Anthony Perard <anthony.perard@citrix.com>
Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
2017-03-22 11:47:39 -07:00
Dr. David Alan Gilbert
463a4ac23b RAMBlocks: qemu_ram_is_shared
Provide a helper to say whether a RAMBlock was created as a
shared mapping.

Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
2017-03-16 09:00:58 +01:00
Chao Fan
1ffb5dfd35 Change the method to calculate dirty-pages-rate
In function cpu_physical_memory_sync_dirty_bitmap, file
include/exec/ram_addr.h:

if (src[idx][offset]) {
    unsigned long bits = atomic_xchg(&src[idx][offset], 0);
    unsigned long new_dirty;
    new_dirty = ~dest[k];
    dest[k] |= bits;
    new_dirty &= bits;
    num_dirty += ctpopl(new_dirty);
}

After these codes executed, only the pages not dirtied in bitmap(dest),
but dirtied in dirty_memory[DIRTY_MEMORY_MIGRATION] will be calculated.
For example:
When ram_list.dirty_memory[DIRTY_MEMORY_MIGRATION] = 0b00001111,
and atomic_rcu_read(&migration_bitmap_rcu)->bmap = 0b00000011,
the new_dirty will be 0b00001100, and this function will return 2 but not
4 which is expected.
the dirty pages in dirty_memory[DIRTY_MEMORY_MIGRATION] are all new,
so these should be calculated also.

Signed-off-by: Chao Fan <fanc.fnst@cn.fujitsu.com>
Signed-off-by: Li Zhijian <lizhijian@cn.fujitsu.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
2017-03-16 08:55:56 +01:00
Dr. David Alan Gilbert
e8f5fe2de1 memory_region: Fix name comments
The 'name' parameter to memory_region_init_* had been marked as debug
only, however vmstate_region_ram uses it as a parameter to
qemu_ram_set_idstr to set RAMBlock names and these form part of the
migration stream.

Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Message-Id: <20170309152708.30635-1-dgilbert@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2017-03-14 13:26:36 +01:00
Peter Maydell
17783ac828 ppc patch queuye for 2017-03-03
This will probably be my last pull request before the hard freeze.  It
 has some new work, but that has all been posted in draft before the
 soft freeze, so I think it's reasonable to include in qemu-2.9.
 
 This batch has:
     * A substantial amount of POWER9 work
         * Implements the legacy (hash) MMU for POWER9
 	* Some more preliminaries for implementing the POWER9 radix
           MMU
 	* POWER9 has_work
 	* Basic POWER9 compatibility mode handling
 	* Removal of some premature tests
     * Some cleanups and fixes to the existing MMU code to make the
       POWER9 work simpler
     * A bugfix for TCG multiply adds on power
     * Allow pseries guests to access PCIe extended config space
 
 This also includes a code-motion not strictly in ppc code - moving
 getrampagesize() from ppc code to exec.c.  This will make some future
 VFIO improvements easier, Paolo said it was ok to merge via my tree.
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2
 
 iQIcBAABCAAGBQJYuOEEAAoJEGw4ysog2bOSHD0P/jBg/qr/4KnsB1KhnlVrB2sP
 vy2d3bGGlUWr9Z+CK/PMCRB8ekFgQLjidLIXji6mviUocv6m3WsVrnbLF/oOL/IT
 NPMVAffw7q804YVu1Ns9R82d6CIqHTy//bpg69tFMcJmhL9fqPan3wTZZ9JeiyAm
 SikqkAHBSW4SxKqg8ApaSqx5L2QTqyfkClR0sLmgM0JtmfJrbobpQ6bMtdPjUZ9L
 n2gnpO2vaWCa1SEQrRrdELqvcD8PHkSJapWOBXOkpGWxoeov/PYxOgkpdDUW4qYY
 lVLtp1Vd3OB/h3Unqfw32DNiHA5p89hWPX5UybKMgRVL9Cv2/lyY47pcY8XTeNzn
 bv84YRbFJeI+GgoEnghmtq+IM8XiW/cr9rWm9wATKfKGcmmFauumALrsffUpHVCM
 4hSNgBv5t2V9ptZ+MDlM/Ku+zk9GoqwQ+hemdpVtiyhOtGUPGFBn5YLE4c2DHFxV
 +L9JtBnFn8obnssNoz0wL+QvZchT1qUHMhH5CWAanjw9CTDp/YwQ2P01zK+00s9d
 4cB7fUG3WNto5eXXEGMaXeDsUEz8z//hTe3j5sVbnHsXi0R3dhv7iryifmx4bUKU
 H9EwAc+uNUHbvBy7u6IWg0I8P2n00CCO6JqXijQ92zELJ5j0XhzHUI2dOXn+zyEo
 3FZu56LFnSSUBEXuTjq4
 =PcNw
 -----END PGP SIGNATURE-----

Merge remote-tracking branch 'remotes/dgibson/tags/ppc-for-2.9-20170303' into staging

ppc patch queuye for 2017-03-03

This will probably be my last pull request before the hard freeze.  It
has some new work, but that has all been posted in draft before the
soft freeze, so I think it's reasonable to include in qemu-2.9.

This batch has:
    * A substantial amount of POWER9 work
        * Implements the legacy (hash) MMU for POWER9
	* Some more preliminaries for implementing the POWER9 radix
          MMU
	* POWER9 has_work
	* Basic POWER9 compatibility mode handling
	* Removal of some premature tests
    * Some cleanups and fixes to the existing MMU code to make the
      POWER9 work simpler
    * A bugfix for TCG multiply adds on power
    * Allow pseries guests to access PCIe extended config space

This also includes a code-motion not strictly in ppc code - moving
getrampagesize() from ppc code to exec.c.  This will make some future
VFIO improvements easier, Paolo said it was ok to merge via my tree.

# gpg: Signature made Fri 03 Mar 2017 03:20:36 GMT
# gpg:                using RSA key 0x6C38CACA20D9B392
# gpg: Good signature from "David Gibson <david@gibson.dropbear.id.au>"
# gpg:                 aka "David Gibson (Red Hat) <dgibson@redhat.com>"
# gpg:                 aka "David Gibson (ozlabs.org) <dgibson@ozlabs.org>"
# gpg:                 aka "David Gibson (kernel.org) <dwg@kernel.org>"
# Primary key fingerprint: 75F4 6586 AE61 A66C C44E  87DC 6C38 CACA 20D9 B392

* remotes/dgibson/tags/ppc-for-2.9-20170303:
  target/ppc: rewrite f[n]m[add,sub] using float64_muladd
  spapr: Small cleanup of PPC MMU enums
  spapr_pci: Advertise access to PCIe extended config space
  target/ppc: Rework hash mmu page fault code and add defines for clarity
  target/ppc: Move no-execute and guarded page checking into new function
  target/ppc: Add execute permission checking to access authority check
  target/ppc: Add Instruction Authority Mask Register Check
  hw/ppc/spapr: Add POWER9 to pseries cpu models
  target/ppc/POWER9: Add cpu_has_work function for POWER9
  target/ppc/POWER9: Add POWER9 pa-features definition
  target/ppc/POWER9: Add POWER9 mmu fault handler
  target/ppc: Don't gen an SDR1 on POWER9 and rework register creation
  target/ppc: Add patb_entry to sPAPRMachineState
  target/ppc/POWER9: Add POWERPC_MMU_V3 bit
  powernv: Don't test POWER9 CPU yet
  exec, kvm, target-ppc: Move getrampagesize() to common code
  target/ppc: Add POWER9/ISAv3.00 to compat_table

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2017-03-04 16:31:14 +00:00
Yongji Xie
c99a29e702 memory: Introduce DEVICE_HOST_ENDIAN for ram device
At the moment ram device's memory regions are DEVICE_NATIVE_ENDIAN. It's
incorrect. This memory region is backed by a MMIO area in host, so the
uint64_t data that MemoryRegionOps read from/write to this area should be
host-endian rather than target-endian. Hence, current code does not work
when target and host endianness are different which is the most common case
on PPC64. To fix it, this introduces DEVICE_HOST_ENDIAN for the ram device.

This has been tested on PPC64 BE/LE host/guest in all possible combinations
including TCG.

Suggested-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Yongji Xie <xyjxie@linux.vnet.ibm.com>
Message-Id: <1488171164-28319-1-git-send-email-xyjxie@linux.vnet.ibm.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2017-03-03 16:40:03 +01:00
Paolo Bonzini
30f3dda24b Merge branch 'icount-update' into HEAD
Merge the original development branch due to breakage caused by the
MTTCG merge.

Conflicts:
	cpu-exec.c
	translate-common.c

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2017-03-03 16:39:18 +01:00
Alexey Kardashevskiy
9c60766887 exec, kvm, target-ppc: Move getrampagesize() to common code
getrampagesize() returns the largest supported page size and mainly
used to know if huge pages are enabled.

However is implemented in target-ppc/kvm.c and not available
in TCG or other architectures.

This renames and moves gethugepagesize() to mmap-alloc.c where
fd-based analog of it is already implemented. This renames and moves
getrampagesize() to exec.c as it seems to be the common place for
helpers like this.

Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2017-03-03 11:30:59 +11:00
Dr. David Alan Gilbert
67f11b5c23 postcopy: Record largest page size
Record the largest page size in use; we'll need it soon for allocating
temporary buffers.

Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Laurent Vivier <lvivier@redhat.com>
Message-Id: <20170224182844.32452-7-dgilbert@redhat.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
2017-02-28 11:30:23 +00:00
Dr. David Alan Gilbert
d3a5038c46 exec: ram_block_discard_range
Create ram_block_discard_range in exec.c to replace
postcopy_ram_discard_range and most of ram_discard_range.

Those two routines are a bit of a weird combination, and
ram_discard_range is about to get more complex for hugepages.
It's OS dependent code (so shouldn't be in migration/ram.c) but
it needs quite a bit of the innards of RAMBlock so doesn't belong in
the os*.c.

Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Laurent Vivier <lvivier@redhat.com>
Message-Id: <20170224182844.32452-5-dgilbert@redhat.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
2017-02-28 11:30:23 +00:00
Alex Bennée
c3b9a07a33 cputlb: introduce tlb_flush_*_all_cpus[_synced]
This introduces support to the cputlb API for flushing all CPUs TLBs
with one call. This avoids the need for target helpers to iterate
through the vCPUs themselves.

An additional variant of the API (_synced) will cause the source vCPUs
work to be scheduled as "safe work". The result will be all the flush
operations will be complete by the time the originating vCPU executes
its safe work. The calling implementation can either end the TB
straight away (which will then pick up the cpu->exit_request on
entering the next block) or defer the exit until the architectural
sync point (usually a barrier instruction).

Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Richard Henderson <rth@twiddle.net>
2017-02-24 10:32:46 +00:00
Alex Bennée
b0706b7167 cputlb: atomically update tlb fields used by tlb_reset_dirty
The main use case for tlb_reset_dirty is to set the TLB_NOTDIRTY flags
in TLB entries to force the slow-path on writes. This is used to mark
page ranges containing code which has been translated so it can be
invalidated if written to. To do this safely we need to ensure the TLB
entries in question for all vCPUs are updated before we attempt to run
the code otherwise a race could be introduced.

To achieve this we atomically set the flag in tlb_reset_dirty_range and
take care when setting it when the TLB entry is filled.

On 32 bit systems attempting to emulate 64 bit guests we don't even
bother as we might not have the atomic primitives available. MTTCG is
disabled in this case and can't be forced on. The copy_tlb_helper
function helps keep the atomic semantics in one place to avoid
confusion.

The dirty helper function is made static as it isn't used outside of
cputlb.

Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Richard Henderson <rth@twiddle.net>
2017-02-24 10:32:46 +00:00
Alex Bennée
0336cbf853 cputlb and arm/sparc targets: convert mmuidx flushes from varg to bitmap
While the vargs approach was flexible the original MTTCG ended up
having munge the bits to a bitmap so the data could be used in
deferred work helpers. Instead of hiding that in cputlb we push the
change to the API to make it take a bitmap of MMU indexes instead.

For ARM some the resulting flushes end up being quite long so to aid
readability I've tended to move the index shifting to a new line so
all the bits being or-ed together line up nicely, for example:

    tlb_flush_page_by_mmuidx(other_cs, pageaddr,
                             (1 << ARMMMUIdx_S1SE1) |
                             (1 << ARMMMUIdx_S1SE0));

Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
[AT: SPARC parts only]
Reviewed-by: Artyom Tarasenko <atar4qemu@gmail.com>
Reviewed-by: Richard Henderson <rth@twiddle.net>
[PM: ARM parts only]
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
2017-02-24 10:32:46 +00:00
KONRAD Frederic
e3b9ca8109 cputlb: introduce tlb_flush_* async work.
Some architectures allow to flush the tlb of other VCPUs. This is not a problem
when we have only one thread for all VCPUs but it definitely needs to be an
asynchronous work when we are in true multithreaded work.

We take the tb_lock() when doing this to avoid racing with other threads
which may be invalidating TB's at the same time. The alternative would
be to use proper atomic primitives to clear the tlb entries en-mass.

This patch doesn't do anything to protect other cputlb function being
called in MTTCG mode making cross vCPU changes.

Signed-off-by: KONRAD Frederic <fred.konrad@greensocs.com>
[AJB: remove need for g_malloc on defer, make check fixes, tb_lock]
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Richard Henderson <rth@twiddle.net>
2017-02-24 10:32:46 +00:00
Alex Bennée
e5143e30fb tcg: remove global exit_request
There are now only two uses of the global exit_request left.

The first ensures we exit the run_loop when we first start to process
pending work and in the kick handler. This is just as easily done by
setting the first_cpu->exit_request flag.

The second use is in the round robin kick routine. The global
exit_request ensured every vCPU would set its local exit_request and
cause a full exit of the loop. Now the iothread isn't being held while
running we can just rely on the kick handler to push us out as intended.

We lightly re-factor the main vCPU thread to ensure cpu->exit_requests
cause us to exit the main loop and process any IO requests that might
come along. As an cpu->exit_request may legitimately get squashed
while processing the EXCP_INTERRUPT exception we also check
cpu->queued_work_first to ensure queued work is expedited as soon as
possible.

Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Richard Henderson <rth@twiddle.net>
2017-02-24 10:32:45 +00:00
Alex Bennée
791158d93b tcg: rename tcg_current_cpu to tcg_current_rr_cpu
..and make the definition local to cpus. In preparation for MTTCG the
concept of a global tcg_current_cpu will no longer make sense. However
we still need to keep track of it in the single-threaded case to be able
to exit quickly when required.

qemu_cpu_kick_no_halt() moves and becomes qemu_cpu_kick_rr_cpu() to
emphasise its use-case. qemu_cpu_kick now kicks the relevant cpu as
well as qemu_kick_rr_cpu() which will become a no-op in MTTCG.

For the time being the setting of the global exit_request remains.

Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Richard Henderson <rth@twiddle.net>
Reviewed-by: Pranith Kumar <bobby.prani@gmail.com>
2017-02-24 10:32:45 +00:00
Paolo Bonzini
1aab16c28a cpu-exec: unify icount_decr and tcg_exit_req
The icount interrupt flag and tcg_exit_req serve almost the same
purpose, let's make them completely the same.

The former TB_EXIT_REQUESTED and TB_EXIT_ICOUNT_EXPIRED cases are
unified, since we can distinguish them from the value of the
interrupt flag.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2017-02-22 14:56:34 +01:00
Paolo Bonzini
5eba0404b9 virtio: use MemoryRegionCache to access descriptors
For now, the cache is created on every virtqueue_pop.  Later on,
direct descriptors will be able to reuse it.

Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2017-02-17 21:52:30 +02:00
Paolo Bonzini
43d70ddf9f cpu-exec: fix icount out-of-bounds access
When icount is active, tb_add_jump is surprisingly called with an
out of bounds basic block index.  I have no idea how that can work,
but it does not seem like a good idea.  Clear *last_tb for all
TB_EXIT_ICOUNT_EXPIRED cases, even when all you have to do is
refill icount_extra.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2017-02-16 14:06:56 +01:00
Daniel P. Berrange
0ab8ed18a6 trace: switch to modular code generation for sub-directories
Introduce rules in the top level Makefile that are able to generate
trace.[ch] files in every subdirectory which has a trace-events file.

The top level directory is handled specially, so instead of creating
trace.h, it creates trace-root.h. This allows sub-directories to
include the top level trace-root.h file, without ambiguity wrt to
the trace.g file in the current sub-dir.

Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Message-id: 20170125161417.31949-7-berrange@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2017-01-31 17:11:18 +00:00
Peter Xu
57bb40c9db memory: hmp: add "-f" for "info mtree"
Adding one more option "-f" for "info mtree" to dump the flat views of
all the address spaces.

This will be useful to debug the memory rendering logic, also it'll be
much easier with it to know what memory region is handling what address
range.

Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Signed-off-by: Peter Xu <peterx@redhat.com>
Message-Id: <1484556005-29701-3-git-send-email-peterx@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2017-01-27 18:07:31 +01:00
Peter Maydell
598cf1c805 * QOM interface fix (Eduardo)
* RTC fixes (Gaohuai, Igor)
 * Memory leak fixes (Li Qiang, me)
 * Ctrl-a b regression (Marc-André)
 * Stubs cleanups and fixes (Leif, me)
 * hxtool tweak (me)
 * HAX support (Vincent)
 * QemuThread, exec.c and SCSI fixes (Roman, Xinhua, me)
 * PC_COMPAT_2_8 fix (Marcelo)
 * stronger bitmap assertions (Peter)
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2
 
 iQExBAABCAAbBQJYggc9FBxwYm9uemluaUByZWRoYXQuY29tAAoJEL/70l94x66D
 5pMH/092iVHw1la8VmphQd8W7hkCHckvVbwaEJ+n4BP8MjeUNmYFJX+op9Qlpqfe
 ekYqQgK69v2UwuofVK2gqS+Y2EyFHivTESk5pS3SM3lTewV1fzCM/HVG3pTxV/ol
 V+eBnp+shrfNG3Eg7YThTqx4LkDUp24Pd3HJVblQZMVpqGzL2xUuUQzSf8F/eeQJ
 xO61pm0ovpCY5MCg3kPLx8GIkPAmcXo5jhMCTz5aLnQW6TO/mwx271a4UE2RTLZ7
 cFjNhxdGSzlnn2RwId4HVYWGU42taW6mpa8NX1hVVUXa1A2qlAfi5N/WLaH0aGYR
 J5ZTIaXdPUBx2SrUmd8udj4a818=
 =H5BQ
 -----END PGP SIGNATURE-----

Merge remote-tracking branch 'remotes/bonzini/tags/for-upstream' into staging

* QOM interface fix (Eduardo)
* RTC fixes (Gaohuai, Igor)
* Memory leak fixes (Li Qiang, me)
* Ctrl-a b regression (Marc-André)
* Stubs cleanups and fixes (Leif, me)
* hxtool tweak (me)
* HAX support (Vincent)
* QemuThread, exec.c and SCSI fixes (Roman, Xinhua, me)
* PC_COMPAT_2_8 fix (Marcelo)
* stronger bitmap assertions (Peter)

# gpg: Signature made Fri 20 Jan 2017 12:49:01 GMT
# gpg:                using RSA key 0xBFFBD25F78C7AE83
# gpg: Good signature from "Paolo Bonzini <bonzini@gnu.org>"
# gpg:                 aka "Paolo Bonzini <pbonzini@redhat.com>"
# Primary key fingerprint: 46F5 9FBD 57D6 12E7 BFD4  E2F7 7E15 100C CD36 69B1
#      Subkey fingerprint: F133 3857 4B66 2389 866C  7682 BFFB D25F 78C7 AE83

* remotes/bonzini/tags/for-upstream: (35 commits)
  pc.h: move x-mach-use-reliable-get-clock compat entry to PC_COMPAT_2_8
  bitmap: assert that start and nr are non negative
  Revert "win32: don't run subprocess tests on Mingw32 platform"
  hax: add Darwin support
  Plumb the HAXM-based hardware acceleration support
  target/i386: Add Intel HAX files
  kvm: move cpu synchronization code
  KVM: PPC: eliminate unnecessary duplicate constants
  ramblock-notifier: new
  char: fix ctrl-a b not working
  exec: Add missing rcu_read_unlock
  x86: ioapic: fix fail migration when irqchip=split
  x86: ioapic: dump version for "info ioapic"
  x86: ioapic: add traces for ioapic
  hxtool: emit Texinfo headings as @subsection
  qemu-thread: fix qemu_thread_set_name() race in qemu_thread_create()
  serial: fix memory leak in serial exit
  scsi-block: fix direction of BYTCHK test for VERIFY commands
  pc: fix crash in rtc_set_memory() if initial cpu is marked as hotplugged
  acpi: filter based on CONFIG_ACPI_X86 rather than TARGET
  ...

# Conflicts:
#	include/hw/i386/pc.h
2017-01-20 16:42:07 +00:00
Paolo Bonzini
0987d735a3 ramblock-notifier: new
This adds a notify interface of ram block additions and removals.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2017-01-16 17:52:35 +01:00
Alex Bennée
d10eb08f5d cputlb: drop flush_global flag from tlb_flush
We have never has the concept of global TLB entries which would avoid
the flush so we never actually use this flag. Drop it and make clear
that tlb_flush is the sledge-hammer it has always been.

Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Richard Henderson <rth@twiddle.net>
[DG: ppc portions]
Acked-by: David Gibson <david@gibson.dropbear.id.au>
2017-01-13 14:24:37 +00:00
Jason Wang
12d37882f0 memory: handle alias in memory_region_is_iommu()
Cc: Paolo Bonzini <pbonzini@redhat.com>
Acked-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
2017-01-10 05:56:59 +02:00
Jason Wang
052c8fa998 exec: introduce address_space_get_iotlb_entry()
This patch introduces a helper to query the iotlb entry for a
possible iova. This will be used by later device IOTLB API to enable
the capability for a dataplane (e.g vhost) to query the IOTLB.

Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Peter Crosthwaite <crosthwaite.peter@gmail.com>
Cc: Richard Henderson <rth@twiddle.net>
Acked-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2017-01-10 05:56:58 +02:00
Paolo Bonzini
1f4e496e1f exec: introduce MemoryRegionCache
Device models often have to perform multiple access to a single
memory region that is known in advance, but would to use "DMA-style"
functions instead of address_space_map/unmap.  This can happen
for example when the data has to undergo endianness conversion.
Introduce a new data structure to cache the result of
address_space_translate without forcing usage of a host address
like address_space_map does.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-12-22 16:00:23 +01:00
Paolo Bonzini
0ce265ffef exec: introduce memory_ldst.inc.c
Templatize the address_space_* and *_phys functions, so that we can add
similar functions in the next patch that work with a lightweight,
cache-like version of address_space_map/unmap.

Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-12-22 16:00:23 +01:00
Bobby Bingham
c2a8531690 cpu_ldst.h: use correct guest address parameter
In the user emulation code path, tlb_vaddr_to_host erronesously passed
vaddr as the guest address to be translated, instead of addr, the parameter
which actually contained the guest address.

This resulted in incorrect addresses being used when emulating block copy
(mvc/mvpg) and block clear (xc) instructions for the s390x target.

Signed-off-by: Bobby Bingham <koorogi@koorogi.info>
Message-Id: <20161113050523.23909-1-koorogi@koorogi.info>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-11-22 23:26:51 +01:00
Peter Maydell
e80b4b8fb6 VFIO updates 2016-10-31
- Replace skip_dump with ram_device to denote device memory and mark
    as non-direct to avoid memcpy to MMIO - fixes RTL (Alex Williamson)
  - Skip zero-length sparse mmaps - avoids unnecessary warning
    (Alex Williamson)
  - Clear BARs on reset so guest doesn't assume programming on return
    from S3 (Ido Yariv)
  - Enable sub-page MMIO mmaps - performance improvement for devices
    with smaller BARs, iff both host and guest map them to full,
    aligned pages (Yongji Xie)
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2.0.14 (GNU/Linux)
 
 iQIcBAABAgAGBQJYF37XAAoJECObm247sIsi9okP/jT/UBqR1G7RVuxQ8AZPPAsU
 mBClGw5lC2lQ70M/t9HNxMMpceHSmAIC4doauOhVNGn7yl3MgHywhEmuxvdQQBAV
 WQYkrZsAIyNhg4I0/92PybsppccEgXgGjz7tW+56udgPhU4ChSsbUwrt8uxZ6/M5
 R/rIGBe/46QVKCAPes3PvOLq19LErUnN0uSasP0QxacD0aFnO9vRSlT3Ake6mnqv
 u+Z1p8d9DM5LYkZPV0wcDWBlosda+cWFH+RhEp1UH4d+2hpW4+WB6bMG6SneguAV
 9P6Dl7z8dJUZauFXw+/ctYDHLOKmul6wb7fLR8n09kqLsgxveH3xEw3tILEDBMvn
 W9xBc1Rp5luH7vZio8ZUYvRO0+/MGEyzQwUPcOiw/VOWl0w8IYyA2UVpHQZk5Esi
 r+DsrkxdonrhqXuB4vrJg7TdlbBEh2cAciy2zrSsYADB2ine/op7O+68+kqwsrlP
 tQOz+wIEi+72G7S6jdnVUQAYu+01Fae55K8gR2OPwGQO5SWgliYY7AZbE3l6eMZ7
 UtgG8YfJpJbZ5wQnshkF5NlNO9HwUS3bp+YgaSdF+NiZC+lz1nKpsqEx/JXRST7V
 A9hvK5so5mZ69EmEz7ruijBIblF3nte+Pfrm+FTjwqMUklvbwsElJGKf/fI6f+kl
 xYyUWkiYOoZXmSkjCanm
 =ZMwj
 -----END PGP SIGNATURE-----

Merge remote-tracking branch 'remotes/awilliam/tags/vfio-updates-20161031.0' into staging

VFIO updates 2016-10-31

 - Replace skip_dump with ram_device to denote device memory and mark
   as non-direct to avoid memcpy to MMIO - fixes RTL (Alex Williamson)
 - Skip zero-length sparse mmaps - avoids unnecessary warning
   (Alex Williamson)
 - Clear BARs on reset so guest doesn't assume programming on return
   from S3 (Ido Yariv)
 - Enable sub-page MMIO mmaps - performance improvement for devices
   with smaller BARs, iff both host and guest map them to full,
   aligned pages (Yongji Xie)

# gpg: Signature made Mon 31 Oct 2016 17:26:47 GMT
# gpg:                using RSA key 0x239B9B6E3BB08B22
# gpg: Good signature from "Alex Williamson <alex.williamson@redhat.com>"
# gpg:                 aka "Alex Williamson <alex@shazbot.org>"
# gpg:                 aka "Alex Williamson <alwillia@redhat.com>"
# gpg:                 aka "Alex Williamson <alex.l.williamson@gmail.com>"
# Primary key fingerprint: 42F6 C04E 540B D1A9 9E7B  8A90 239B 9B6E 3BB0 8B22

* remotes/awilliam/tags/vfio-updates-20161031.0:
  vfio: Add support for mmapping sub-page MMIO BARs
  vfio/pci: fix out-of-sync BAR information on reset
  vfio: Handle zero-length sparse mmap ranges
  memory: Don't use memcpy for ram_device regions
  memory: Replace skip_dump flag with "ram_device"

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2016-10-31 18:19:06 +00:00
Alex Williamson
4a2e242bbb memory: Don't use memcpy for ram_device regions
With a vfio assigned device we lay down a base MemoryRegion registered
as an IO region, giving us read & write accessors.  If the region
supports mmap, we lay down a higher priority sub-region MemoryRegion
on top of the base layer initialized as a RAM device pointer to the
mmap.  Finally, if we have any quirks for the device (ie. address
ranges that need additional virtualization support), we put another IO
sub-region on top of the mmap MemoryRegion.  When this is flattened,
we now potentially have sub-page mmap MemoryRegions exposed which
cannot be directly mapped through KVM.

This is as expected, but a subtle detail of this is that we end up
with two different access mechanisms through QEMU.  If we disable the
mmap MemoryRegion, we make use of the IO MemoryRegion and service
accesses using pread and pwrite to the vfio device file descriptor.
If the mmap MemoryRegion is enabled and results in one of these
sub-page gaps, QEMU handles the access as RAM, using memcpy to the
mmap.  Using either pread/pwrite or the mmap directly should be
correct, but using memcpy causes us problems.  I expect that not only
does memcpy not necessarily honor the original width and alignment in
performing a copy, but it potentially also uses processor instructions
not intended for MMIO spaces.  It turns out that this has been a
problem for Realtek NIC assignment, which has such a quirk that
creates a sub-page mmap MemoryRegion access.

To resolve this, we disable memory_access_is_direct() for ram_device
regions since QEMU assumes that it can use memcpy for those regions.
Instead we access through MemoryRegionOps, which replaces the memcpy
with simple de-references of standard sizes to the host memory.

With this patch we attempt to provide unrestricted access to the RAM
device, allowing byte through qword access as well as unaligned
access.  The assumption here is that accesses initiated by the VM are
driven by a device specific driver, which knows the device
capabilities.  If unaligned accesses are not supported by the device,
we don't want them to work in a VM by performing multiple aligned
accesses to compose the unaligned access.  A down-side of this
philosophy is that the xp command from the monitor attempts to use
the largest available access weidth, unaware of the underlying
device.  Using memcpy had this same restriction, but at least now an
operator can dump individual registers, even if blocks of device
memory may result in access widths beyond the capabilities of a
given device (RTL NICs only support up to dword).

Reported-by: Thorsten Kohfeldt <thorsten.kohfeldt@gmx.de>
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Acked-by: Paolo Bonzini <pbonzini@redhat.com>
2016-10-31 09:53:03 -06:00
Alex Williamson
21e00fa55f memory: Replace skip_dump flag with "ram_device"
Setting skip_dump on a MemoryRegion allows us to modify one specific
code path, but the restriction we're trying to address encompasses
more than that.  If we have a RAM MemoryRegion backed by a physical
device, it not only restricts our ability to dump that region, but
also affects how we should manipulate it.  Here we recognize that
MemoryRegions do not change to sometimes allow dumps and other times
not, so we replace setting the skip_dump flag with a new initializer
so that we know exactly the type of region to which we're applying
this behavior.

Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Acked-by: Paolo Bonzini <pbonzini@redhat.com>
2016-10-31 09:53:03 -06:00
Paolo Bonzini
7d7500d998 tcg: comment on which functions have to be called with tb_lock held
softmmu requires more functions to be thread-safe, because translation
blocks can be invalidated from e.g. notdirty callbacks.  Probably the
same holds for user-mode emulation, it's just that no one has ever
tried to produce a coherent locking there.

This patch will guide the introduction of more tb_lock and tb_unlock
calls for system emulation.

Note that after this patch some (most) of the mentioned functions are
still called outside tb_lock/tb_unlock.  The next one will rectify this.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Richard Henderson <rth@twiddle.net>
Message-Id: <20161027151030.20863-7-alex.bennee@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-10-31 10:51:16 +01:00
Alex Bennée
301e40ed80 translate-all: add DEBUG_LOCKING asserts
This adds asserts to check the locking on the various translation
engines structures. There are two sets of structures that are protected
by locks.

The first the l1map and PageDesc structures used to track which
translation blocks are associated with which physical addresses. In
user-mode this is covered by the mmap_lock.

The second case are TB context related structures which are protected by
tb_lock which is also user-mode only.

Currently the asserts do nothing in SoftMMU mode but this will change
for MTTCG.

Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Richard Henderson <rth@twiddle.net>
Message-Id: <20161027151030.20863-4-alex.bennee@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-10-31 10:24:45 +01:00
Richard Henderson
fdbc2b5722 tcg: Add EXCP_ATOMIC
When we cannot emulate an atomic operation within a parallel
context, this exception allows us to stop the world and try
again in a serial context.

Reviewed-by: Emilio G. Cota <cota@braap.org>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <rth@twiddle.net>
2016-10-26 08:29:00 -07:00
Peter Maydell
c43e853afe x86 and CPU queue, 2016-10-24
x2APIC support to APIC code, cpu_exec_init() refactor on all
 architectures, and other x86 changes.
 -----BEGIN PGP SIGNATURE-----
 
 iQIcBAABCAAGBQJYDmYyAAoJECgHk2+YTcWmoSUP/2ga+b9YmPuyL7XC+12pff0I
 Z8gdjUzbMUNcCI0JMZCTGUJbs3BapLcnsA7ypmt88s9kG02WeDMhNx1BfYiAFgLU
 kPLQlXAM7awEdGagd3sTCiFojSUZ7GxYHjd5fuhPoOAXvXM8im6zJl18ZcsnStjO
 /J8JGoGDHq1XJlz+RIjnGamojJWCiO/+iiD+rFmVSic8zjHPDYq14sIk/QJX+DaF
 azLiOI6DAlX3kyrN5ZshhIRQ3COzzUMUSDF/ZaYHjudUco5MBnwj/oLQniTq+ZUd
 hCu7dr5TpLxI7q1yltyd0UIl/+aZGbE8tEvoXAtc735iK4m2CTckT7ql6x3xI+Ir
 PmpPgIswHqfCiCXm8imLj6ZI47kRA1x4x4AudLaNVKP7jO82485sS9HWpOadYsaU
 jvek2SqfqvH+vce4FzwlLEcXGDb73MT/XkIUvd7SfPIbs9umgdZc03U4SHfAWr0i
 lAIRs4Ym0AAS2WSE4E09wvdUUr9oxaQBMhw3JAiNmg7hLfyINTP+D/IhtlAVXXEA
 F9D7fky5lDwfKvIwPxPJbDD5bCBV9AmxhiahIhv3epu4Kg4orf1inkrx0IZWSbB0
 7+JZ7j8asuizfibkeZAN9rxVwmz32makJNsnjzZHlnaPxTvIDzvRkNceBnhC5vKq
 3yfxgl4agXmMjveraAtt
 =T2kg
 -----END PGP SIGNATURE-----

Merge remote-tracking branch 'remotes/ehabkost/tags/x86-pull-request' into staging

x86 and CPU queue, 2016-10-24

x2APIC support to APIC code, cpu_exec_init() refactor on all
architectures, and other x86 changes.

# gpg: Signature made Mon 24 Oct 2016 20:51:14 BST
# gpg:                using RSA key 0x2807936F984DC5A6
# gpg: Good signature from "Eduardo Habkost <ehabkost@redhat.com>"
# Primary key fingerprint: 5A32 2FD5 ABC4 D3DB ACCF  D1AA 2807 936F 984D C5A6

* remotes/ehabkost/tags/x86-pull-request:
  exec: call cpu_exec_exit() from a CPU unrealize common function
  exec: move cpu_exec_init() calls to realize functions
  exec: split cpu_exec_init()
  pc: q35: Bump max_cpus to 288
  pc: Require IRQ remapping and EIM if there could be x2APIC CPUs
  pc: Add 'etc/boot-cpus' fw_cfg file for machine with more than 255 CPUs
  Increase MAX_CPUMASK_BITS from 255 to 288
  pc: Clarify FW_CFG_MAX_CPUS usage comment
  pc: kvm_apic: Pass APIC ID depending on xAPIC/x2APIC mode
  pc: apic_common: Reset APIC ID to initial ID when switching into x2APIC mode
  pc: apic_common: Restore APIC ID to initial ID on reset
  pc: apic_common: Extend APIC ID property to 32bit
  pc: Leave max apic_id_limit only in legacy cpu hotplug code
  acpi: cphp: Force switch to modern cpu hotplug if APIC ID > 254
  pc: acpi: x2APIC support for SRAT table
  pc: acpi: x2APIC support for MADT table and _MAT method

Conflicts:
	target-arm/cpu.c

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2016-10-25 10:25:27 +01:00
Laurent Vivier
ce5b1bbf62 exec: move cpu_exec_init() calls to realize functions
Modify all CPUs to call it from XXX_cpu_realizefn() function.

Remove all the cannot_destroy_with_object_finalize_yet as
unsafe references have been moved to cpu_exec_realizefn().
(tested with QOM command provided by commit 4c315c27)

for arm:

Setting of cpu->mp_affinity is moved from arm_cpu_initfn()
to arm_cpu_realizefn() as setting of cpu_index is now done
in cpu_exec_realizefn(). To avoid to overwrite an user defined
value, we set it to an invalid value by default, and update
it in realize function only if the value is still invalid.

Signed-off-by: Laurent Vivier <lvivier@redhat.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Eduardo Habkost <ehabkost@redhat.com>
Reviewed-by: Andrew Jones <drjones@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
2016-10-24 17:29:16 -02:00
Peter Maydell
20bccb82ff cpu: Support a target CPU having a variable page size
Support target CPUs having a page size which isn't knownn
at compile time. To use this, the CPU implementation should:
 * define TARGET_PAGE_BITS_VARY
 * not define TARGET_PAGE_BITS
 * define TARGET_PAGE_BITS_MIN to the smallest value it
   might possibly want for TARGET_PAGE_BITS
 * call set_preferred_target_page_bits() in its realize
   function to indicate the actual preferred target page
   size for the CPU (and report any error from it)

In CONFIG_USER_ONLY, the CPU implementation should continue
to define TARGET_PAGE_BITS appropriately for the guest
OS page size.

Machines which want to take advantage of having the page
size something larger than TARGET_PAGE_BITS_MIN must
set the MachineClass minimum_page_bits field to a value
which they guarantee will be no greater than the preferred
page size for any CPU they create.

Note that changing the target page size by setting
minimum_page_bits is a migration compatibility break
for that machine.

For debugging purposes, attempts to use TARGET_PAGE_SIZE
before it has been finally confirmed will assert.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <rth@twiddle.net>
2016-10-24 16:26:49 +01:00
Paolo Bonzini
9a54635dcb memory: add a per-AddressSpace list of listeners
This speeds up MEMORY_LISTENER_CALL noticeably.  Right now,
with many PCI devices you have N regions added to M AddressSpaces
(M = # PCI devices with bus-master enabled) and each call looks
up the whole listener list, with at least M listeners in it.
Because most of the regions in N are BARs, which are also roughly
proportional to M, the whole thing is O(M^3).  This changes it
to O(M^2), which is the best we can do without rewriting the
whole thing.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-10-24 15:27:19 +02:00
Paolo Bonzini
d45fa784cd memory: eliminate global MemoryListeners
There is none, so just drop the code.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-10-24 15:27:19 +02:00
Dr. David Alan Gilbert
863e9621c5 RAMBlocks: Store page size
Store the page size in each RAMBlock, we need it later.

Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
2016-10-13 17:23:53 +02:00
Sergey Fedorov
3359baad36 tcg: Make tb_flush() thread safe
Use async_safe_run_on_cpu() to make tb_flush() thread safe.  This is
possible now that code generation does not happen in the middle of
execution.

It can happen that multiple threads schedule a safe work to flush the
translation buffer. To keep statistics and debugging output sane, always
check if the translation buffer has already been flushed.

Signed-off-by: Sergey Fedorov <serge.fdrv@gmail.com>
Signed-off-by: Sergey Fedorov <sergey.fedorov@linaro.org>
[AJB: minor re-base fixes]
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Message-Id: <1470158864-17651-13-git-send-email-alex.bennee@linaro.org>
Reviewed-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-09-27 11:57:30 +02:00
Paolo Bonzini
267f685b8b cpus-common: move CPU list management to common code
Add a mutex for the CPU list to system emulation, as it will be used to
manage safe work.  Abstract manipulation of the CPU list in new functions
cpu_list_add and cpu_list_remove.

Reviewed-by: Richard Henderson <rth@twiddle.net>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-09-27 11:57:29 +02:00
Paolo Bonzini
9c1f8f4493 migration: sync all address spaces
Migrating a VM during reboot sometimes results in differences
between the source and destination in the SMRAM area.

This is because migration_bitmap_sync() only fetches from KVM
the dirty log of address_space_memory.  SMRAM memory slots
are ignored and the modifications to SMRAM are not sent to the
destination.

Reported-by: He Rongguang <herongguang.he@huawei.com>
Reviewed-by: He Rongguang <herongguang.he@huawei.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-09-27 11:57:29 +02:00
Peter Xu
5bf3d31903 memory: introduce IOMMUOps.notify_flag_changed
The new interface can be used to replace the old notify_started() and
notify_stopped(). Meanwhile it provides explicit flags so that IOMMUs
can know what kind of notifications it is requested for.

Acked-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Peter Xu <peterx@redhat.com>
Message-Id: <1474606948-14391-3-git-send-email-peterx@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-09-27 09:00:04 +02:00
Peter Xu
cdb3081269 memory: introduce IOMMUNotifier and its caps
IOMMU Notifier list is used for notifying IO address mapping changes.
Currently VFIO is the only user.

However it is possible that future consumer like vhost would like to
only listen to part of its notifications (e.g., cache invalidations).

This patch introduced IOMMUNotifier and IOMMUNotfierFlag bits for a
finer grained control of it.

IOMMUNotifier contains a bitfield for the notify consumer describing
what kind of notification it is interested in. Currently two kinds of
notifications are defined:

- IOMMU_NOTIFIER_MAP:    for newly mapped entries (additions)
- IOMMU_NOTIFIER_UNMAP:  for entries to be removed (cache invalidates)

When registering the IOMMU notifier, we need to specify one or multiple
types of messages to listen to.

When notifications are triggered, its type will be checked against the
notifier's type bits, and only notifiers with registered bits will be
notified.

(For any IOMMU implementation, an in-place mapping change should be
 notified with an UNMAP followed by a MAP.)

Signed-off-by: Peter Xu <peterx@redhat.com>
Message-Id: <1474606948-14391-2-git-send-email-peterx@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-09-27 08:59:16 +02:00
Richard Henderson
01ecaf438b tcg: Merge GETPC and GETRA
The return address argument to the softmmu template helpers was
confused.  In the legacy case, we wanted to indicate that there
is no return address, and so passed in NULL.  However, we then
immediately subtracted GETPC_ADJ from NULL, resulting in a non-zero
value, indicating the presence of an (invalid) return address.

Push the GETPC_ADJ subtraction down to the only point it's required:
immediately before use within cpu_restore_state_from_tb, after all
NULL pointer checks have been completed.

This makes GETPC and GETRA identical.  Remove GETRA as the lesser
used macro, replacing all uses with GETPC.

Signed-off-by: Richard Henderson <rth@twiddle.net>
2016-09-16 08:12:11 -07:00
Paolo Bonzini
6d21e4208f tcg: Prepare TB invalidation for lockless TB lookup
When invalidating a translation block, set an invalid flag into the
TranslationBlock structure first.  It is also necessary to check whether
the target TB is still valid after acquiring 'tb_lock' but before calling
tb_add_jump() since TB lookup is to be performed out of 'tb_lock' in
future. Note that we don't have to check 'last_tb'; an already invalidated
TB will not be executed anyway and it is thus safe to patch it.

Suggested-by: Sergey Fedorov <serge.fdrv@gmail.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-09-13 19:08:43 +02:00
Richard Henderson
dcb8e75870 tcg: Reorg TCGOp chaining
Instead of using -1 as end of chain, use 0, and link through the 0
entry as a fully circular double-linked list.

Reviewed-by: Aurelien Jarno <aurelien@aurel32.net>
Signed-off-by: Richard Henderson <rth@twiddle.net>
2016-08-05 21:44:18 +05:30
Peter Maydell
d9fe91d868 linux-user: Use correct alignment for long long on i386 guests
For i386, the ABI specifies that 'long long' (8 byte values)
need only be 4 aligned, but we were requiring them to be
8-aligned. This meant we were laying out the target_epoll_event
structure wrongly. Add a suitable ifdef to abitypes.h to
specify the i386-specific alignment requirement.

Reported-by: Icenowy Zheng <icenowy@aosc.xyz>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Laurent Vivier <laurent@vivier.eu>
Signed-off-by: Riku Voipio <riku.voipio@linaro.org>
2016-08-04 16:34:59 +03:00
Igor Mammedov
1bc7e522d9 exec: Reduce CONFIG_USER_ONLY ifdeffenery
Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
2016-07-26 15:31:58 -03:00
Markus Armbruster
175de52487 Clean up decorations and whitespace around header guards
Cleaned up with scripts/clean-header-guards.pl.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Richard Henderson <rth@twiddle.net>
2016-07-12 16:20:46 +02:00
Markus Armbruster
2a6a4076e1 Clean up ill-advised or unusual header guards
Cleaned up with scripts/clean-header-guards.pl.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Richard Henderson <rth@twiddle.net>
2016-07-12 16:20:46 +02:00
Markus Armbruster
121d07125b Clean up header guards that don't match their file name
Header guard symbols should match their file name to make guard
collisions less likely.  Offenders found with
scripts/clean-header-guards.pl -vn.

Cleaned up with scripts/clean-header-guards.pl, followed by some
renaming of new guard symbols picked by the script to better ones.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Richard Henderson <rth@twiddle.net>
2016-07-12 16:19:16 +02:00
Markus Armbruster
a9c94277f0 Use #include "..." for our own headers, <...> for others
Tracked down with an ugly, brittle and probably buggy Perl script.

Also move includes converted to <...> up so they get included before
ours where that's obviously okay.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Tested-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Richard Henderson <rth@twiddle.net>
2016-07-12 16:19:16 +02:00
Sergey Sorokin
b35399bb4e Fix confusing argument names in some common functions
There are functions tlb_fill(), cpu_unaligned_access() and
do_unaligned_access() that are called with access type and mmu index
arguments. But these arguments are named 'is_write' and 'is_user' in their
declarations. The patches fix the arguments to avoid a confusion.

Signed-off-by: Sergey Sorokin <afarallax@yandex.ru>
Reviewed-by: Eduardo Habkost <ehabkost@redhat.com>
Acked-by: David Gibson <david@gibson.dropbear.id.au>
Message-id: 1465907177-1399402-1-git-send-email-afarallax@yandex.ru
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2016-07-12 13:06:08 +01:00
Sergey Sorokin
1f00b27f17 tcg: Improve the alignment check infrastructure
Some architectures (e.g. ARMv8) need the address which is aligned
to a size more than the size of the memory access.
To support such check it's enough the current costless alignment
check implementation in QEMU, but we need to support
an alignment size specifying.

Signed-off-by: Sergey Sorokin <afarallax@yandex.ru>
Message-Id: <1466705806-679898-1-git-send-email-afarallax@yandex.ru>
Signed-off-by: Richard Henderson <rth@twiddle.net>
[rth: Assert in tcg_canonicalize_memop.  Leave get_alignment_bits
available for, though unused by, user-mode.  Retain logging difference
based on ALIGNED_ONLY.]
2016-07-05 20:50:13 -07:00
Peter Maydell
39e0b03dec memory: Assert that memory_region_init_rom_device() ops aren't NULL
It doesn't make sense to pass a NULL ops argument to
memory_region_init_rom_device(), because the effect will
be that if the guest tries to write to the memory region
then QEMU will segfault. Catch the bug earlier by sanity
checking the arguments to this function, and remove the
misleading documentation that suggests that passing NULL
might be sensible.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 1467122287-24974-4-git-send-email-peter.maydell@linaro.org
2016-07-04 13:06:35 +01:00
Peter Maydell
a1777f7f64 memory: Provide memory_region_init_rom()
Provide a new helper function memory_region_init_rom() for memory
regions which are read-only (and unlike those created by
memory_region_init_rom_device() don't have special behaviour
for writes). This has the same behaviour as calling
memory_region_init_ram() and then memory_region_set_readonly()
(which is what we do today in boards with pure ROMs) but is a
more easily discoverable API for the purpose.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 1467122287-24974-2-git-send-email-peter.maydell@linaro.org
2016-07-04 13:06:35 +01:00
Alexey Kardashevskiy
d22d8956b1 memory: Add MemoryRegionIOMMUOps.notify_started/stopped callbacks
The IOMMU driver may change behavior depending on whether a notifier
client is present.  In the case of POWER, this represents a change in
the visibility of the IOTLB, for other drivers such as intel-iommu and
future AMD-Vi emulation, notifier support is not yet enabled and this
provides the opportunity to flag that incompatibility.

Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Peter Xu <peterx@redhat.com>
Tested-by: Peter Xu <peterx@redhat.com>
Acked-by: Paolo Bonzini <pbonzini@redhat.com>
[new log & extracted from [PATCH qemu v17 12/12] spapr_iommu, vfio, memory: Notify IOMMU about starting/stopping listening]
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2016-06-30 13:00:23 -06:00
Peter Crosthwaite
8642c1b81e target-*: Don't redefine cpu_exec()
This function needs to be converted to QOM hook and virtualised for
multi-arch. This rename interferes, as cpu-qom will not have access
to the renaming causing name divergence. This rename doesn't really do
anything anyway so just delete it.

Signed-off-by: Peter Crosthwaite <crosthwaite.peter@gmail.com>
Message-Id: <69bd25a8678b8b31b91cd9760c777bed1aafb44e.1437212383.git.crosthwaite.peter@gmail.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Peter Crosthwaite <crosthwaitepeter@gmail.com>
2016-06-29 14:03:47 +02:00
Alexey Kardashevskiy
f682e9c244 memory: Add reporting of supported page sizes
Every IOMMU has some granularity which MemoryRegionIOMMUOps::translate
uses when translating, however this information is not available outside
the translate context for various checks.

This adds a get_min_page_size callback to MemoryRegionIOMMUOps and
a wrapper for it so IOMMU users (such as VFIO) can know the minimum
actual page size supported by an IOMMU.

As IOMMU MR represents a guest IOMMU, this uses TARGET_PAGE_SIZE
as fallback.

This removes vfio_container_granularity() and uses new helper in
memory_region_iommu_replay() when replaying IOMMU mappings on added
IOMMU memory region.

Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Acked-by: Alex Williamson <alex.williamson@redhat.com>
[dwg: Removed an unnecessary calculation]
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2016-06-22 11:13:09 +10:00
Lluís Vilanova
dcdaadb6ea trace: [all] Add "guest_mem_before" event
The event is described in "trace-events". Note that the "MO_AMASK" flag
is not traced, since it does not seem to affect the visible semantics of
instructions.

[s/inline inline/inline/ to fix clang build.
--Stefan]

Signed-off-by: Lluís Vilanova <vilanova@ac.upc.edu>
Reviewed-by: Richard Henderson <rth@twiddle.net>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 146549350711.18437.726780393247474362.stgit@fimbulvetr.bsc.es
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2016-06-20 17:21:56 +01:00
Emilio G. Cota
909eaac9bb tb hash: track translated blocks with qht
Having a fixed-size hash table for keeping track of all translation blocks
is suboptimal: some workloads are just too big or too small to get maximum
performance from the hash table. The MRU promotion policy helps improve
performance when the hash table is a little undersized, but it cannot
make up for severely undersized hash tables.

Furthermore, frequent MRU promotions result in writes that are a scalability
bottleneck. For scalability, lookups should only perform reads, not writes.
This is not a big deal for now, but it will become one once MTTCG matures.

The appended fixes these issues by using qht as the implementation of
the TB hash table. This solution is superior to other alternatives considered,
namely:

- master: implementation in QEMU before this patchset
- xxhash: before this patch, i.e. fixed buckets + xxhash hashing + MRU.
- xxhash-rcu: fixed buckets + xxhash + RCU list + MRU.
              MRU is implemented here by adding an intermediate struct
              that contains the u32 hash and a pointer to the TB; this
              allows us, on an MRU promotion, to copy said struct (that is not
              at the head), and put this new copy at the head. After a grace
              period, the original non-head struct can be eliminated, and
              after another grace period, freed.
- qht-fixed-nomru: fixed buckets + xxhash + qht without auto-resize +
                   no MRU for lookups; MRU for inserts.
The appended solution is the following:
- qht-dyn-nomru: dynamic number of buckets + xxhash + qht w/ auto-resize +
                 no MRU for lookups; MRU for inserts.

The plots below compare the considered solutions. The Y axis shows the
boot time (in seconds) of a debian jessie image with arm-softmmu; the X axis
sweeps the number of buckets (or initial number of buckets for qht-autoresize).
The plots in PNG format (and with errorbars) can be seen here:
  http://imgur.com/a/Awgnq

Each test runs 5 times, and the entire QEMU process is pinned to a
single core for repeatability of results.

                            Host: Intel Xeon E5-2690

  28 ++------------+-------------+-------------+-------------+------------++
     A*****        +             +             +             master **A*** +
  27 ++    *                                                 xxhash ##B###++
     |      A******A******                               xxhash-rcu $$C$$$ |
  26 C$$                  A******A******            qht-fixed-nomru*%%D%%%++
     D%%$$                              A******A******A*qht-dyn-mru A*E****A
  25 ++ %%$$                                          qht-dyn-nomru &&F&&&++
     B#####%                                                               |
  24 ++    #C$$$$$                                                        ++
     |      B###  $                                                        |
     |          ## C$$$$$$                                                 |
  23 ++           #       C$$$$$$                                         ++
     |             B######       C$$$$$$                                %%%D
  22 ++                  %B######       C$$$$$$C$$$$$$C$$$$$$C$$$$$$C$$$$$$C
     |                    D%%%%%%B######      @E@@@@@@    %%%D%%%@@@E@@@@@@E
  21 E@@@@@@E@@@@@@F&&&@@@E@@@&&&D%%%%%%B######B######B######B######B######B
     +             E@@@   F&&&   +      E@     +      F&&&   +             +
  20 ++------------+-------------+-------------+-------------+------------++
     14            16            18            20            22            24
                             log2 number of buckets

                                 Host: Intel i7-4790K

  14.5 ++------------+------------+-------------+------------+------------++
       A**           +            +             +            master **A*** +
    14 ++ **                                                 xxhash ##B###++
  13.5 ++   **                                           xxhash-rcu $$C$$$++
       |                                            qht-fixed-nomru %%D%%% |
    13 ++     A******                                   qht-dyn-mru @@E@@@++
       |             A*****A******A******             qht-dyn-nomru &&F&&& |
  12.5 C$$                               A******A******A*****A******    ***A
    12 ++ $$                                                        A***  ++
       D%%% $$                                                             |
  11.5 ++  %%                                                             ++
       B###  %C$$$$$$                                                      |
    11 ++  ## D%%%%% C$$$$$                                               ++
       |     #      %      C$$$$$$                                         |
  10.5 F&&&&&&B######D%%%%%       C$$$$$$C$$$$$$C$$$$$$C$$$$$C$$$$$$    $$$C
    10 E@@@@@@E@@@@@@B#####B######B######E@@@@@@E@@@%%%D%%%%%D%%%###B######B
       +             F&&          D%%%%%%B######B######B#####B###@@@D%%%   +
   9.5 ++------------+------------+-------------+------------+------------++
       14            16           18            20           22            24
                              log2 number of buckets

Note that the original point before this patch series is X=15 for "master";
the little sensitivity to the increased number of buckets is due to the
poor hashing function in master.

xxhash-rcu has significant overhead due to the constant churn of allocating
and deallocating intermediate structs for implementing MRU. An alternative
would be do consider failed lookups as "maybe not there", and then
acquire the external lock (tb_lock in this case) to really confirm that
there was indeed a failed lookup. This, however, would not be enough
to implement dynamic resizing--this is more complex: see
"Resizable, Scalable, Concurrent Hash Tables via Relativistic
Programming" by Triplett, McKenney and Walpole. This solution was
discarded due to the very coarse RCU read critical sections that we have
in MTTCG; resizing requires waiting for readers after every pointer update,
and resizes require many pointer updates, so this would quickly become
prohibitive.

qht-fixed-nomru shows that MRU promotion is advisable for undersized
hash tables.

However, qht-dyn-mru shows that MRU promotion is not important if the
hash table is properly sized: there is virtually no difference in
performance between qht-dyn-nomru and qht-dyn-mru.

Before this patch, we're at X=15 on "xxhash"; after this patch, we're at
X=15 @ qht-dyn-nomru. This patch thus matches the best performance that we
can achieve with optimum sizing of the hash table, while keeping the hash
table scalable for readers.

The improvement we get before and after this patch for booting debian jessie
with arm-softmmu is:

- Intel Xeon E5-2690: 10.5% less time
- Intel i7-4790K: 5.2% less time

We could get this same improvement _for this particular workload_ by
statically increasing the size of the hash table. But this would hurt
workloads that do not need a large hash table. The dynamic (upward)
resizing allows us to start small and enlarge the hash table as needed.

A quick note on downsizing: the table is resized back to 2**15 buckets
on every tb_flush; this makes sense because it is not guaranteed that the
table will reach the same number of TBs later on (e.g. most bootup code is
thrown away after boot); it makes sense to grow the hash table as
more code blocks are translated. This also avoids the complication of
having to build downsizing hysteresis logic into qht.

Reviewed-by: Sergey Fedorov <serge.fedorov@linaro.org>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Emilio G. Cota <cota@braap.org>
Message-Id: <1465412133-3029-15-git-send-email-cota@braap.org>
Signed-off-by: Richard Henderson <rth@twiddle.net>
2016-06-11 17:11:16 -07:00
Emilio G. Cota
42bd32287f tb hash: hash phys_pc, pc, and flags with xxhash
For some workloads such as arm bootup, tb_phys_hash is performance-critical.
The is due to the high frequency of accesses to the hash table, originated
by (frequent) TLB flushes that wipe out the cpu-private tb_jmp_cache's.
More info:
  https://lists.nongnu.org/archive/html/qemu-devel/2016-03/msg05098.html

To dig further into this I modified an arm image booting debian jessie to
immediately shut down after boot. Analysis revealed that quite a bit of time
is unnecessarily spent in tb_phys_hash: the cause is poor hashing that
results in very uneven loading of chains in the hash table's buckets;
the longest observed chain had ~550 elements.

The appended addresses this with two changes:

1) Use xxhash as the hash table's hash function. xxhash is a fast,
   high-quality hashing function.

2) Feed the hashing function with not just tb_phys, but also pc and flags.

This improves performance over using just tb_phys for hashing, since that
resulted in some hash buckets having many TB's, while others getting very few;
with these changes, the longest observed chain on a single hash bucket is
brought down from ~550 to ~40.

Tests show that the other element checked for in tb_find_physical,
cs_base, is always a match when tb_phys+pc+flags are a match,
so hashing cs_base is wasteful. It could be that this is an ARM-only
thing, though. UPDATE:
On Tue, Apr 05, 2016 at 08:41:43 -0700, Richard Henderson wrote:
> The cs_base field is only used by i386 (in 16-bit modes), and sparc (for a TB
> consisting of only a delay slot).
> It may well still turn out to be reasonable to ignore cs_base for hashing.

BTW, after this change the hash table should not be called "tb_hash_phys"
anymore; this is addressed later in this series.

This change gives consistent bootup time improvements. I tested two
host machines:
- Intel Xeon E5-2690: 11.6% less time
- Intel i7-4790K: 19.2% less time

Increasing the number of hash buckets yields further improvements. However,
using a larger, fixed number of buckets can degrade performance for other
workloads that do not translate as many blocks (600K+ for debian-jessie arm
bootup). This is dealt with later in this series.

Reviewed-by: Sergey Fedorov <sergey.fedorov@linaro.org>
Reviewed-by: Richard Henderson <rth@twiddle.net>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Emilio G. Cota <cota@braap.org>
Message-Id: <1465412133-3029-8-git-send-email-cota@braap.org>
Signed-off-by: Richard Henderson <rth@twiddle.net>
2016-06-11 23:10:19 +00:00
Emilio G. Cota
dc8b295d05 exec: add tb_hash_func5, derived from xxhash
This will be used by upcoming changes for hashing the tb hash.

Add this into a separate file to include the copyright notice from
xxhash.

Reviewed-by: Sergey Fedorov <sergey.fedorov@linaro.org>
Reviewed-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Emilio G. Cota <cota@braap.org>
Message-Id: <1465412133-3029-7-git-send-email-cota@braap.org>
Signed-off-by: Richard Henderson <rth@twiddle.net>
2016-06-11 23:10:18 +00:00
Peter Maydell
6886b98036 cpu-exec: Rename cpu_resume_from_signal() to cpu_loop_exit_noexc()
The function cpu_resume_from_signal() is now always called with a
NULL puc argument, and is rather misnamed since it is never called
from a signal handler. It is essentially forcing an exit to the
top level cpu loop but without raising any exception, so rename
it to cpu_loop_exit_noexc() and drop the useless unused argument.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Sergey Fedorov <sergey.fedorov@linaro.org>
Acked-by: Eduardo Habkost <ehabkost@redhat.com>
Acked-by: Riku Voipio <riku.voipio@linaro.org>
Message-id: 1463494687-25947-4-git-send-email-peter.maydell@linaro.org
2016-06-09 15:55:02 +01:00
Peter Maydell
b66e10e4c9 linux-user pull request for June 2016
-----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQIVAwUAV1gdMrRIkN7ePJvAAQhLcg/+Kby99taEuewItrA1yDs75jxOlLqaJopd
 cVzo4LFRFPhIn4UEKqRQS0CGoIeU/DYOmObvuUzJxs2LyUoHoqmQOwEm5obC2a85
 JrHo/NOppYBbyvvIEAAXzZDCZo0KZKVclrlT+AX5obpOSNSvAnKvEuLWq1aQ9WGN
 n4AzHuFEl885cd4nFd8VK/xth89bqz6U/z8CjgIuw3mczp1XNrK5IJJwAy5epHay
 GCBr9XHooW3SU971WS20RTRS0D33tKPHgCU3ZeZ3rKh4g3JNj6/ixdVgzi9NqFsQ
 5DzAj/iBGhN1LtCOednRS6tUt32Bhy8G/g4O3GiXdejagAmNe2wz31cveNJ8S3W5
 DK8SZAnJlz06zN5uIpOVQgDOqfXZkCp7ndq779QJoHOAnuOjJBcUbhw1myz2R3eR
 6208tStWl3R0+ATEK8CZ7ejg1cUHvdzyqGJA+1nC2HaFUrBWipxN8jf2fz9vO/wG
 G7zNbahvVgyJWO7bPNK4mxkb6qkWCETnCnLJsq2ZbmtPEMcINjD8vNWLNvFGVG8b
 2HbinDrzh0Z9Zik5gLZfiVyP5HFaWSrJn9QRVIgaUjuIH9n3/25sl9OvW/JLjxJ+
 h2P17CLnAK6dhUYc4R3wQTx2X/N2FvO4DD8iMYOcgDY6fhZ2b6EEyE9yBgQrIDbF
 gU1AlC/CX+M=
 =AXqa
 -----END PGP SIGNATURE-----

Merge remote-tracking branch 'remotes/riku/tags/pull-linux-user-20160608' into staging

linux-user pull request for June 2016

# gpg: Signature made Wed 08 Jun 2016 14:27:14 BST
# gpg:                using RSA key 0xB44890DEDE3C9BC0
# gpg: Good signature from "Riku Voipio <riku.voipio@iki.fi>"
# gpg:                 aka "Riku Voipio <riku.voipio@linaro.org>"

* remotes/riku/tags/pull-linux-user-20160608: (44 commits)
  linux-user: In fork_end(), remove correct CPUs from CPU list
  linux-user: Special-case ERESTARTSYS in target_strerror()
  linux-user: Make target_strerror() return 'const char *'
  linux-user: Correct signedness of target_flock l_start and l_len fields
  linux-user: Use safe_syscall wrapper for ioctl
  linux-user: Use safe_syscall wrapper for accept and accept4 syscalls
  linux-user: Use safe_syscall wrapper for semop
  linux-user: Use safe_syscall wrapper for epoll_wait syscalls
  linux-user: Use safe_syscall wrapper for poll and ppoll syscalls
  linux-user: Use safe_syscall wrapper for sleep syscalls
  linux-user: Use safe_syscall wrapper for rt_sigtimedwait syscall
  linux-user: Use safe_syscall wrapper for flock
  linux-user: Use safe_syscall wrapper for mq_timedsend and mq_timedreceive
  linux-user: Use safe_syscall wrapper for msgsnd and msgrcv
  linux-user: Use safe_syscall wrapper for send* and recv* syscalls
  linux-user: Use safe_syscall wrapper for connect syscall
  linux-user: Use safe_syscall wrapper for readv and writev syscalls
  linux-user: Fix error conversion in 64-bit fadvise syscall
  linux-user: Fix NR_fadvise64 and NR_fadvise64_64 for 32-bit guests
  linux-user: Fix handling of arm_fadvise64_64 syscall
  ...

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>

Conflicts:
	configure
	scripts/qemu-binfmt-conf.sh
2016-06-08 18:34:32 +01:00
Peter Maydell
e0ca2ed562 thunk: Rename args and fields in host-target bitmask conversion code
The target_to_host_bitmask() and host_to_target_bitmask() functions
and the associated struct bitmask_transtbl are completely generic,
but for historical reasons the target related fields and parameters
are named 'x86' and the host related fields are named 'alpha'.
Rename them to 'target' and 'host'.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2016-06-07 18:19:24 +03:00
Peter Maydell
7a00217d1a thunk: Drop unused NO_THUNK_TYPE_SIZE guards
The thunk_type_size_array() and thunk_type_align_array() functions
are only provided if NO_THUNK_TYPE_SIZE is not defined. However
nothing in the codebase defines that, and so in fact these functions
are always present. Drop the unnecessary #ifdefs.

(Over a decade ago thunk.h used to be included by some softmmu
files, which defined NO_THUNK_TYPE_SIZE, but these includes are
long gone; see for instance commit f193c7979c2f7.)

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2016-06-07 18:19:24 +03:00
Peter Maydell
0d5c21f2b3 qemu-common.h: Drop WORDS_ALIGNED define
The WORDS_ALIGNED #define is not used anywhere, and hasn't been since
2013 when commit 612d590ebc rewrote the various ld<type>_<endian>_p
functions to not use it. Remove the #define and the comment describing it.
Also remove the line in the comment about TARGET_WORDS_ALIGNED, since
it has never actually existed.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2016-06-07 18:19:24 +03:00
Peter Maydell
24a6e0633a hw: Clean up includes
Clean up includes so that osdep.h is included first and headers
which it implies are not included manually.

This commit was created with scripts/clean-includes.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2016-06-07 18:19:23 +03:00
Timothy E Baldwin
8fdb9fef3d linux-user: Remove redundant gdb_queuesig()
Signed-off-by: Timothy Edward Baldwin <T.E.Baldwin99@members.leeds.ac.uk>
Message-id: 1441497448-32489-22-git-send-email-T.E.Baldwin99@members.leeds.ac.uk
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Riku Voipio <riku.voipio@linaro.org>
2016-06-07 16:39:07 +03:00
Paolo Bonzini
0878d0e11b exec: hide mr->ram_addr from qemu_get_ram_ptr users
Let users of qemu_get_ram_ptr and qemu_ram_ptr_length pass in an
address that is relative to the MemoryRegion.  This basically means
what address_space_translate returns.

Because the semantics of the second parameter change, rename the
function to qemu_map_ram_ptr.

Reviewed-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-05-29 09:11:12 +02:00
Paolo Bonzini
07bdaa4196 memory: split memory_region_from_host from qemu_ram_addr_from_host
Move the old qemu_ram_addr_from_host to memory_region_from_host and
make it return an offset within the region.  For qemu_ram_addr_from_host
return the ram_addr_t directly, similar to what it was before
commit 1b5ec23 ("memory: return MemoryRegion from qemu_ram_addr_from_host",
2013-07-04).

Reviewed-by: Marc-André Lureau <marcandre.lureau@gmail.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-05-29 09:11:12 +02:00
Paolo Bonzini
f615f39616 exec: remove ram_addr argument from qemu_ram_block_from_host
Of the two callers, one does not use it, and the other can compute
it itself based on the other output argument (offset) and the RAMBlock.

Reviewed-by: Marc-André Lureau <marcandre.lureau@gmail.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-05-29 09:11:12 +02:00
Paolo Bonzini
4ff87573df memory: remove qemu_get_ram_fd, qemu_set_ram_fd, qemu_ram_block_host_ptr
Remove direct uses of ram_addr_t and optimize memory_region_{get,set}_fd
now that a MemoryRegion knows its RAMBlock directly.

Reviewed-by: Marc-André Lureau <marcandre.lureau@gmail.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-05-29 09:11:12 +02:00
Fam Zheng
b613597819 memory: Remove code for mr->may_overlap
The collision check does nothing and hasn't been used. Remove the
variable together with related code.

Signed-off-by: Fam Zheng <famz@redhat.com>
Message-Id: <1458900629-2334-2-git-send-email-famz@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-05-23 16:53:44 +02:00
Gonglei
fa53a0e53e memory: drop find_ram_block()
On the one hand, we have already qemu_get_ram_block() whose function
is similar. On the other hand, we can directly use mr->ram_block but
searching RAMblock by ram_addr which is a kind of waste.

Signed-off-by: Gonglei <arei.gonglei@huawei.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Message-Id: <1462845901-89716-2-git-send-email-arei.gonglei@huawei.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-05-23 16:53:44 +02:00
Paolo Bonzini
df43d49cb8 hw: clean up hw/hw.h includes
Include qom/object.h and exec/memory.h instead of exec/ioport.h;
exec/ioport.h was almost everywhere required only for those two
includes, not for the content of the header itself.

Remove block/aio.h, everybody is already including it through
another path.

With this change, include/hw/hw.h is freed from qemu-common.h.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-05-19 16:42:30 +02:00
Paolo Bonzini
89a80e7400 hw: remove pio_addr_t
pio_addr_t is almost unused, because these days I/O ports are simply
accessed through the address space.  cpu_{in,out}[bwl] themselves are
almost unused; monitor.c and xen-hvm.c could use address_space_read/write
directly, since they have an integer size at hand.  This leaves qtest as
the only user of those functions.

On the other hand even portio_* functions use this type; the only
interesting use of pio_addr_t thus is include/hw/sysbus.h.  I guess I
could move it there, but I don't see much benefit in that either.  Using
uint32_t is enough and avoids the need to include ioport.h everywhere.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-05-19 16:42:30 +02:00
Paolo Bonzini
63c915526d cpu: move exec-all.h inclusion out of cpu.h
exec-all.h contains TCG-specific definitions.  It is not needed outside
TCG-specific files such as translate.c, exec.c or *helper.c.

One generic function had snuck into include/exec/exec-all.h; move it to
include/qom/cpu.h.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-05-19 16:42:29 +02:00
Paolo Bonzini
00f6da6a1a exec: extract exec/tb-context.h
TCG backends do not need most of exec-all.h; extract what they actually
need to a separate file or move it directly to tcg.h.  The next patch
will stop including exec-all.h from everywhere.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-05-19 16:42:29 +02:00
Paolo Bonzini
33c11879fd qemu-common: push cpu.h inclusion out of qemu-common.h
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-05-19 16:42:29 +02:00
Paolo Bonzini
87776ab72b qemu-common: stop including qemu/host-utils.h from qemu-common.h
Move it to the actual users.  There are some inclusions of
qemu/host-utils.h in headers, but they are all necessary.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-05-19 16:42:28 +02:00
Paolo Bonzini
a7d6039cb3 cpu: move endian-dependent load/store functions to cpu-all.h
Disentangle cpu-common.h and memory.h from NEED_CPU_H.  Prototypes are
not defined for !NEED_CPU_H, so remove them from poison.h too.  Only
macros need poisoning.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-05-19 16:42:28 +02:00
Paolo Bonzini
bdd902277c include: poison symbols in osdep.h
Ensure that all target-independent files ignore poisoned symbols,
and fix the fallout.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-05-19 16:42:28 +02:00
Sergey Fedorov
6f789be56d tcg: Rework tb_invalidated_flag
'tb_invalidated_flag' was meant to catch two events:
 * some TB has been invalidated by tb_phys_invalidate();
 * the whole translation buffer has been flushed by tb_flush().

Then it was checked:
 * in cpu_exec() to ensure that the last executed TB can be safely
   linked to directly call the next one;
 * in cpu_exec_nocache() to decide if the original TB should be provided
   for further possible invalidation along with the temporarily
   generated TB.

It is always safe to patch an invalidated TB since it is not going to be
used anyway. It is also safe to call tb_phys_invalidate() for an already
invalidated TB. Thus, setting this flag in tb_phys_invalidate() is
simply unnecessary. Moreover, it can prevent from pretty proper linking
of TBs, if any arbitrary TB has been invalidated. So just don't touch it
in tb_phys_invalidate().

If this flag is only used to catch whether tb_flush() has been called
then rename it to 'tb_flushed'. Declare it as 'bool' and stick to using
only 'true' and 'false' to set its value. Also, instead of setting it in
tb_gen_code(), just after tb_flush() has been called, do it right inside
of tb_flush().

In cpu_exec(), this flag is used to track if tb_flush() has been called
and have made 'next_tb' (a reference to the last executed TB) invalid
for linking it to directly call the next TB. tb_flush() can be called
during the CPU execution loop from tb_gen_code(), during TB execution or
by another thread while 'tb_lock' is released. Catch for translation
buffer flush reliably by resetting this flag once before first TB lookup
and each time we find it set before trying to add a direct jump. Don't
touch in in tb_find_physical().

Each vCPU has its own execution loop in multithreaded mode and thus
should have its own copy of the flag to be able to reset it with its own
'next_tb' and don't affect any other vCPU execution thread. So make this
flag per-vCPU and move it to CPUState.

In cpu_exec_nocache(), we only need to check if tb_flush() has been
called from tb_gen_code() called by cpu_exec_nocache() itself. To do
this reliably, preserve the old value of the flag, reset it before
calling tb_gen_code(), check afterwards, and combine the saved value
back to the flag.

This patch is based on the patch "tcg: move tb_invalidated_flag to
CPUState" from Paolo Bonzini <pbonzini@redhat.com>.

Signed-off-by: Sergey Fedorov <serge.fdrv@gmail.com>
Signed-off-by: Sergey Fedorov <sergey.fedorov@linaro.org>
Signed-off-by: Richard Henderson <rth@twiddle.net>
2016-05-12 14:06:42 -10:00
Sergey Fedorov
9962c478b1 tcg: Clarify thread safety check in tb_add_jump()
The check is to make sure that another thread hasn't already done the
same while we were outside of tb_lock. Mention this in a comment.

Signed-off-by: Sergey Fedorov <serge.fdrv@gmail.com>
Signed-off-by: Sergey Fedorov <sergey.fedorov@linaro.org>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <rth@twiddle.net>
2016-05-12 14:06:41 -10:00
Sergey Fedorov
c37e6d7e35 tcg: Use uintptr_t type for jmp_list_{next|first} fields of TB
These fields do not contain pure pointers to a TranslationBlock
structure. So uintptr_t is the most appropriate type for them.
Also put some asserts to assure that the two least significant bits of
the pointer are always zero before assigning it to jmp_list_first.

Signed-off-by: Sergey Fedorov <serge.fdrv@gmail.com>
Signed-off-by: Sergey Fedorov <sergey.fedorov@linaro.org>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <rth@twiddle.net>
2016-05-12 14:06:41 -10:00
Sergey Fedorov
f309101c26 tcg: Clean up direct block chaining data fields
Briefly describe in a comment how direct block chaining is done. It
should help in understanding of the following data fields.

Rename some fields in TranslationBlock and TCGContext structures to
better reflect their purpose (dropping excessive 'tb_' prefix in
TranslationBlock but keeping it in TCGContext):
   tb_next_offset  =>  jmp_reset_offset
   tb_jmp_offset   =>  jmp_insn_offset
   tb_next         =>  jmp_target_addr
   jmp_next        =>  jmp_list_next
   jmp_first       =>  jmp_list_first

Avoid using a magic constant as an invalid offset which is used to
indicate that there's no n-th jump generated.

Signed-off-by: Sergey Fedorov <serge.fdrv@gmail.com>
Signed-off-by: Sergey Fedorov <sergey.fedorov@linaro.org>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <rth@twiddle.net>
2016-05-12 14:06:41 -10:00
Sergey Fedorov
10b4f48555 tcg: Note requirement on atomic direct jump patching
Signed-off-by: Sergey Fedorov <serge.fdrv@gmail.com>
Signed-off-by: Sergey Fedorov <sergey.fedorov@linaro.org>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Message-Id: <1461341333-19646-12-git-send-email-sergey.fedorov@linaro.org>
Signed-off-by: Richard Henderson <rth@twiddle.net>
2016-05-12 14:06:41 -10:00
Sergey Fedorov
7d14e0e2d6 tcg/arm: Make direct jump patching thread-safe
Ensure direct jump patching in ARM is atomic by using
atomic_read()/atomic_set() for code patching.

Signed-off-by: Sergey Fedorov <serge.fdrv@gmail.com>
Signed-off-by: Sergey Fedorov <sergey.fedorov@linaro.org>
Message-Id: <1461341333-19646-8-git-send-email-sergey.fedorov@linaro.org>
Signed-off-by: Richard Henderson <rth@twiddle.net>
2016-05-12 14:06:41 -10:00
Sergey Fedorov
ed3d51ecd7 tcg/s390: Make direct jump patching thread-safe
Ensure direct jump patching in s390 is atomic by:
 * naturally aligning a location of direct jump address;
 * using atomic_read()/atomic_set() for code patching.

Signed-off-by: Sergey Fedorov <serge.fdrv@gmail.com>
Signed-off-by: Sergey Fedorov <sergey.fedorov@linaro.org>
Message-Id: <1461341333-19646-7-git-send-email-sergey.fedorov@linaro.org>
Signed-off-by: Richard Henderson <rth@twiddle.net>
2016-05-12 14:06:41 -10:00
Sergey Fedorov
0d07abf05e tcg/i386: Make direct jump patching thread-safe
Ensure direct jump patching in i386 is atomic by:
 * naturally aligning a location of direct jump address;
 * using atomic_read()/atomic_set() for code patching.

tcg_out_nopn() implementation:
Suggested-by: Richard Henderson <rth@twiddle.net>.

Signed-off-by: Sergey Fedorov <serge.fdrv@gmail.com>
Signed-off-by: Sergey Fedorov <sergey.fedorov@linaro.org>
Message-Id: <1461341333-19646-6-git-send-email-sergey.fedorov@linaro.org>
Signed-off-by: Richard Henderson <rth@twiddle.net>
2016-05-12 14:06:41 -10:00
Sergey Fedorov
76442a939e tci: Make direct jump patching thread-safe
Ensure direct jump patching in TCI is atomic by:
 * naturally aligning a location of direct jump address;
 * using atomic_read()/atomic_set() to load/store the address.

Signed-off-by: Sergey Fedorov <serge.fdrv@gmail.com>
Signed-off-by: Sergey Fedorov <sergey.fedorov@linaro.org>
Message-Id: <1461341333-19646-4-git-send-email-sergey.fedorov@linaro.org>
Signed-off-by: Richard Henderson <rth@twiddle.net>
2016-05-12 14:06:40 -10:00
Emilio G. Cota
89fee74a0f tb: consistently use uint32_t for tb->flags
We are inconsistent with the type of tb->flags: usage varies loosely
between int and uint64_t. Settle to uint32_t everywhere, which is
superior to both: at least one target (aarch64) uses the most significant
bit in the u32, and uint64_t is wasteful.

Compile-tested for all targets.

Suggested-by: Laurent Desnogues <laurent.desnogues@gmail.com>
Suggested-by: Richard Henderson <rth@twiddle.net>
Tested-by: Edgar E. Iglesias <edgar.iglesias@xilinx.com>
Reviewed-by: Edgar E. Iglesias <edgar.iglesias@xilinx.com>
Reviewed-by: Laurent Desnogues <laurent.desnogues@gmail.com>
Signed-off-by: Emilio G. Cota <cota@braap.org>
Signed-off-by: Richard Henderson <rth@twiddle.net>
Message-Id: <1460049562-23517-1-git-send-email-cota@braap.org>
2016-05-12 14:06:40 -10:00
Edgar E. Iglesias
25caa94c4a gen-icount: Use tcg_set_insn_param
Use tcg_set_insn_param() instead of directly accessing internal
tcg data structures to update an insn param.

Reviewed-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Edgar E. Iglesias <edgar.iglesias@xilinx.com>
Message-id: 1461931684-1867-3-git-send-email-edgar.iglesias@gmail.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2016-05-12 13:22:26 +01:00
Alex Bennée
d977e1c2db qemu-log: dfilter-ise exec, out_asm, op and opt_op
This ensures the code generation debug code will honour -dfilter if set.
For the "exec" tracing I've added a new inline macro for efficiency's
sake.

Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Aurelien Jarno <aurelien@aureL32.net>
Reviewed-by: Richard Henderson <rth@twiddle.net>
Message-Id: <1458052224-9316-8-git-send-email-alex.bennee@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-03-22 22:20:18 +01:00
Peter Maydell
1a83063522 qemu-log: Improve the "exec" TB execution logging
Improve the TB execution logging so that it is easier to identify
what is happening from trace logs:
 * move the "Trace" logging of executed TBs into cpu_tb_exec()
   so that it is emitted if and only if we actually execute a TB,
   and for consistency for the CPU state logging
 * log when we link two TBs together via tb_add_jump()
 * log when cpu_tb_exec() returns early from a chain of TBs

The new style logging looks like this:

Trace 0x7fb7cc822ca0 [ffffffc0000dce00]
Linking TBs 0x7fb7cc822ca0 [ffffffc0000dce00] index 0 -> 0x7fb7cc823110 [ffffffc0000dce10]
Trace 0x7fb7cc823110 [ffffffc0000dce10]
Trace 0x7fb7cc823420 [ffffffc000302688]
Trace 0x7fb7cc8234a0 [ffffffc000302698]
Trace 0x7fb7cc823520 [ffffffc0003026a4]
Trace 0x7fb7cc823560 [ffffffc0000dce44]
Linking TBs 0x7fb7cc823560 [ffffffc0000dce44] index 1 -> 0x7fb7cc8235d0 [ffffffc0000dce70]
Trace 0x7fb7cc8235d0 [ffffffc0000dce70]
Stopped execution of TB chain before 0x7fb7cc8235d0 [ffffffc0000dce70]
Trace 0x7fb7cc8235d0 [ffffffc0000dce70]
Trace 0x7fb7cc822fd0 [ffffffc0000dd52c]

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
[AJB: reword patch title, Abandoned->Stopped]
Reviewed-by: Aurelien Jarno <aurelien@aurel32.net>
Reviewed-by: Richard Henderson <rth@twiddle.net>
Message-Id: <1458052224-9316-6-git-send-email-alex.bennee@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-03-22 22:20:18 +01:00
Markus Armbruster
14b6d44d47 Use scripts/clean-includes to drop redundant qemu/typedefs.h
Re-run scripts/clean-includes to apply the previous commit's
corrections and updates.  Besides redundant qemu/typedefs.h, this only
finds a redundant config-host.h include in ui/egl-helpers.c.  No idea
how that escaped the previous runs.

Some manual whitespace trimming around dropped includes squashed in.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-03-22 22:20:16 +01:00
Fam Zheng
f1060c55bf exec: Pass RAMBlock pointer to qemu_ram_free
The only caller now knows exactly which RAMBlock to free, so it's not
necessary to do the lookup.

Reviewed-by: Gonglei <arei.gonglei@huawei.com>
Signed-off-by: Fam Zheng <famz@redhat.com>
Message-Id: <1456813104-25902-6-git-send-email-famz@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-03-07 13:26:37 +01:00
Fam Zheng
8e41fb63c5 memory: Drop MemoryRegion.ram_addr
All references to mr->ram_addr are replaced by
memory_region_get_ram_addr(mr) (except for a few assertions that are
replaced with mr->ram_block).

Reviewed-by: Gonglei <arei.gonglei@huawei.com>
Signed-off-by: Fam Zheng <famz@redhat.com>
Message-Id: <1456813104-25902-5-git-send-email-famz@redhat.com>
Acked-by: Laszlo Ersek <lersek@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-03-07 13:26:29 +01:00
Fam Zheng
7ebb2745ac memory: Implement memory_region_get_ram_addr with mr->ram_block
Signed-off-by: Fam Zheng <famz@redhat.com>
Message-Id: <1456813104-25902-4-git-send-email-famz@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-03-07 13:18:28 +01:00
Fam Zheng
528f46af6e exec: Return RAMBlock pointer from allocating functions
Previously we return RAMBlock.offset; now return the pointer to the
whole structure.

ram_block_add returns void now, error is completely passed with errp.

Reviewed-by: Gonglei <arei.gonglei@huawei.com>
Signed-off-by: Fam Zheng <famz@redhat.com>
Message-Id: <1456813104-25902-2-git-send-email-famz@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-03-07 13:18:28 +01:00
Peter Maydell
586fc27e6a * Asynchronous dump-guest-memory from Peter
* improved logging with -D -daemonize from Dimitris
 * more address_space_* optimization from Gonglei
 * TCG xsave/xrstor thinko fix
 * chardev bugfix and documentation patch
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2
 
 iQEcBAABCAAGBQJWzxnbAAoJEL/70l94x66DlfkIAJyo9kOareVLOnAE8ccayghk
 0SbU1ZR9etgdeH4vZIUKzFzSg86pnqqub/w9yxgNG35PsiXVOnzgYSv6W1qsXdVE
 v32u6c+vWHvHc3cCOFI5+6nURftf2+p/vB2VFXiI23VUbhs22UAjXsUdfbp321X5
 Krme2fxk0kmwPHoKiyek0qiXa8nt0fiuFzU7DN7gTQMoDFaDEqvcULlNJHUFnep+
 M1yQfhSzrD97bafPwmIDU0LejJxrzR6phuzWedugU1tay6y3pD85wQqHdFI7fn/o
 dC4F+vg41Lc/5jKUgRMpe5FmX4VM+DvRdYgZZsp9/SkBM7DuL7crhVWVwvQj82Y=
 =8BvG
 -----END PGP SIGNATURE-----

Merge remote-tracking branch 'remotes/bonzini/tags/for-upstream' into staging

* Asynchronous dump-guest-memory from Peter
* improved logging with -D -daemonize from Dimitris
* more address_space_* optimization from Gonglei
* TCG xsave/xrstor thinko fix
* chardev bugfix and documentation patch

# gpg: Signature made Thu 25 Feb 2016 15:12:27 GMT using RSA key ID 78C7AE83
# gpg: Good signature from "Paolo Bonzini <bonzini@gnu.org>"
# gpg:                 aka "Paolo Bonzini <pbonzini@redhat.com>"

* remotes/bonzini/tags/for-upstream:
  target-i386: fix confusion in xcr0 bit position vs. mask
  chardev: Properly initialize ChardevCommon components
  memory: Remove unreachable return statement
  memory: optimize qemu_get_ram_ptr and qemu_ram_ptr_length
  exec: store RAMBlock pointer into memory region
  log: Redirect stderr to logfile if deamonized
  dump-guest-memory: add qmp event DUMP_COMPLETED
  Dump: add hmp command "info dump"
  Dump: add qmp command "query-dump"
  DumpState: adding total_size and written_size fields
  dump-guest-memory: add "detach" support
  dump-guest-memory: disable dump when in INMIGRATE state
  dump-guest-memory: introduce dump_process() helper function.
  dump-guest-memory: add dump_in_progress() helper function
  dump-guest-memory: using static DumpState, add DumpStatus
  dump-guest-memory: add "detach" flag for QMP/HMP interfaces.
  dump-guest-memory: cleanup: removing dump_{error|cleanup}().
  scripts/kvm/kvm_stat: Fix missing right parantheses and ".format(...)"
  qemu-options.hx: Improve documentation of chardev multiplexing mode

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2016-02-25 15:30:57 +00:00
Gonglei
d61524486c memory: Remove unreachable return statement
Signed-off-by: Gonglei <arei.gonglei@huawei.com>
Message-Id: <1455935721-8804-4-git-send-email-arei.gonglei@huawei.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-02-25 16:11:29 +01:00
Gonglei
3655cb9c73 memory: optimize qemu_get_ram_ptr and qemu_ram_ptr_length
these two functions consume too much cpu overhead to
find the RAMBlock by ram address.

After this patch, we can pass the RAMBlock pointer
to them so that they don't need to find the RAMBlock
anymore most of the time. We can get better performance
in address translation processing.

Signed-off-by: Gonglei <arei.gonglei@huawei.com>
Message-Id: <1455935721-8804-3-git-send-email-arei.gonglei@huawei.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-02-25 16:11:29 +01:00
Gonglei
58eaa2174e exec: store RAMBlock pointer into memory region
Each RAM memory region has a unique corresponding RAMBlock.
In the current realization, the memory region only stored
the ram_addr which means the offset of RAM address space,
We need to qurey the global ram.list to find the ram block
by ram_addr if we want to get the ram block, which is very
expensive.

Now, we store the RAMBlock pointer into memory region
structure. So, if we know the mr, we can easily get the
RAMBlock.

Signed-off-by: Gonglei <arei.gonglei@huawei.com>
Message-Id: <1456130097-4208-2-git-send-email-arei.gonglei@huawei.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-02-25 16:11:26 +01:00
Peter Maydell
df215b59d9 vhost, virtio, pci, pc
Fixes all over the place.
 virtio dataplane migration support.
 Old q35 machine types removed.
 
 Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQEcBAABAgAGBQJWzuKeAAoJECgfDbjSjVRpGzIH/1Tz6CoEq1rowiyVJ9B80oQU
 gDI2YWnJDSwJllmAF0rmoPRBQR8op3ZETZiCAcADHoZ7kdBNWGbyQeaDrrEPH7Q/
 rCDVt8Q3g80vs89aWKG0nQ16J2MW5TbkuiQw7pjQSdc9AbUdWpUqSiWnpZ+sPAql
 6DuVpjQ4/rN2alucXoa1Sir8KDDV7kBuY8U6/KoY890qzh842dv2523qvuCza9yR
 KX8Imj3oQAFjFSv5t1aOD3yYvWFd73EsReHPLGb1JtsVr/6wjs0sFUyA3JicBgnT
 +kWoSObWikfDY69HnqTkJpkun6woMM3zW5h2SkUBf9QP3yqLfGIp9uSriNN84Ak=
 =KXyh
 -----END PGP SIGNATURE-----

Merge remote-tracking branch 'remotes/mst/tags/for_upstream' into staging

vhost, virtio, pci, pc

Fixes all over the place.
virtio dataplane migration support.
Old q35 machine types removed.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>

# gpg: Signature made Thu 25 Feb 2016 11:16:46 GMT using RSA key ID D28D5469
# gpg: Good signature from "Michael S. Tsirkin <mst@kernel.org>"
# gpg:                 aka "Michael S. Tsirkin <mst@redhat.com>"

* remotes/mst/tags/for_upstream: (21 commits)
  q35: No need to check gigabyte_align
  q35: Remove unused q35-acpi-dsdt.aml file
  ich9: Remove enable_tco arguments from init functions
  machine: Remove no_tco field
  q35: Remove old machine versions
  tests/vhost-user-bridge: fix build on 32 bit systems
  vring: remove
  virtio-scsi: do not use vring in dataplane
  virtio-blk: do not use vring in dataplane
  virtio-blk: fix "disabled data plane" mode
  virtio: export vring_notify as virtio_should_notify
  virtio: add AioContext-specific function for host notifiers
  vring: make vring_enable_notification return void
  block-migration: acquire AioContext as necessary
  pci core: function pci_bus_init() cleanup
  pci core: function pci_host_bus_register() cleanup
  balloon: Use only 'pc-dimm' type dimm for ballooning
  virtio-balloon: rewrite get_current_ram_size()
  move get_current_ram_size to virtio-balloon.c
  vhost-user: don't merge regions with different fds
  ...

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2016-02-25 12:13:49 +00:00
Peter Maydell
90ce6e2644 include: Clean up includes
Clean up includes so that osdep.h is included first and headers
which it implies are not included manually.

This commit was created with scripts/clean-includes.

NB: If this commit breaks compilation for your out-of-tree
patchseries or fork, then you need to make sure you add
#include "qemu/osdep.h" to any new .c files that you have.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Eric Blake <eblake@redhat.com>
2016-02-23 12:43:05 +00:00
Vladimir Sementsov-Ogievskiy
39de99843e move get_current_ram_size to virtio-balloon.c
get_current_ram_size() is used only in virtio-balloon.c
This patch moves it into virtio-balloon and make it static, to allow
some balloon-specific tuning.

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Signed-off-by: Denis V. Lunev <den@openvz.org>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2016-02-23 12:55:16 +02:00
Paolo Bonzini
88c73d16ad memory: fix usage of find_next_bit and find_next_zero_bit
The last two arguments to these functions are the last and first bit to
check relative to the base.  The code was using incorrectly the first
bit and the number of bits.  Fix this in cpu_physical_memory_get_dirty
and cpu_physical_memory_all_dirty.  This requires a few changes in the
iteration; change the code in cpu_physical_memory_set_dirty_range to
match.

Fixes: 5b82b70
Cc: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Tested-by: Leon Alrae <leon.alrae@imgtec.com>
Tested-by: Thomas Huth <thuth@redhat.com>
Message-id: 1455113505-11237-1-git-send-email-pbonzini@redhat.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2016-02-10 22:38:24 +00:00
Stefan Hajnoczi
5b82b703b6 memory: RCU ram_list.dirty_memory[] for safe RAM hotplug
Although accesses to ram_list.dirty_memory[] use atomics so multiple
threads can safely dirty the bitmap, the data structure is not fully
thread-safe yet.

This patch handles the RAM hotplug case where ram_list.dirty_memory[] is
grown.  ram_list.dirty_memory[] is change from a regular bitmap to an
RCU array of pointers to fixed-size bitmap blocks.  Threads can continue
accessing bitmap blocks while the array is being extended.  See the
comments in the code for an in-depth explanation of struct
DirtyMemoryBlocks.

I have tested that live migration with virtio-blk dataplane works.

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-Id: <1453728801-5398-2-git-send-email-stefanha@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-02-09 15:45:26 +01:00
Paolo Bonzini
8bafcb2164 memory: add early bail out from cpu_physical_memory_set_dirty_range
This condition is true in the common case, so we can cut out the body of
the function.  In addition, this makes it easier for the compiler to do
at least partial inlining, even if it decides that fully inlining the
function is unreasonable.

Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-02-09 15:45:26 +01:00
zhanghailiang
4c4bad4861 ram: Split host_from_stream_offset() into two helper functions
Split host_from_stream_offset() into two parts:
One is to get ram block, which the block idstr may be get from migration
stream, the other is to get hva (host) address from block and the offset.
Besides, we will do the check working in a new helper offset_in_ramblock().

Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Amit Shah <amit.shah@redhat.com>
Message-Id: <1452829066-9764-2-git-send-email-zhang.zhanghailiang@huawei.com>
Signed-off-by: Amit Shah <amit.shah@redhat.com>
2016-02-05 19:09:50 +05:30
Paolo Bonzini
508127e243 log: do not unnecessarily include qom/cpu.h
Split the bits that require it to exec/log.h.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Denis V. Lunev <den@openvz.org>
Acked-by: Christian Borntraeger <borntraeger@de.ibm.com>
Message-id: 1452174932-28657-8-git-send-email-den@openvz.org
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2016-02-03 09:19:10 +00:00
Peter Crosthwaite
f0c02d15b5 memory: Add address_space_init_shareable()
This will either create a new AS or return a pointer to an
already existing equivalent one, if we have already created
an AS for the specified root memory region.

The motivation is to reuse address spaces as much as possible.
It's going to be quite common that bus masters out in device land
have pointers to the same memory region for their mastering yet
each will need to create its own address space. Let the memory
API implement sharing for them.

Aside from the perf optimisations, this should reduce the amount
of redundant output on info mtree as well.

Thee returned value will be malloced, but the malloc will be
automatically freed when the AS runs out of refs.

Signed-off-by: Peter Crosthwaite <peter.crosthwaite@xilinx.com>
[PMM: dropped check for NULL root as unused; added doc-comment;
 squashed Peter C's reference-counting patch into this one;
 don't compare name string when deciding if we can share ASes;
 read as->malloced before the unref of as->root to avoid possible
 read-after-free if as->root was the owner of as]
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Acked-by: Edgar E. Iglesias <edgar.iglesias@xilinx.com>
2016-01-21 14:15:06 +00:00
Peter Maydell
651a5bc037 exec.c: Add cpu_get_address_space()
Add a function to return the AddressSpace for a CPU based on
its numerical index. (Callers outside exec.c don't have access
to the CPUAddressSpace struct so can't just fish it out of the
CPUState struct directly.)

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Acked-by: Edgar E. Iglesias <edgar.iglesias@xilinx.com>
2016-01-21 14:15:05 +00:00
Peter Maydell
a54c87b68a exec.c: Pass MemTxAttrs to iotlb_to_region so it uses the right AS
Pass the MemTxAttrs for the memory access to iotlb_to_region(); this
allows it to determine the correct AddressSpace to use for the lookup.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Acked-by: Edgar E. Iglesias <edgar.iglesias@xilinx.com>
2016-01-21 14:15:05 +00:00
Peter Maydell
d7898cda81 cputlb.c: Use correct address space when looking up MemoryRegionSection
When looking up the MemoryRegionSection for the new TLB entry in
tlb_set_page_with_attrs(), use cpu_asidx_from_attrs() to determine
the correct address space index for the lookup, and pass it into
address_space_translate_for_iotlb().

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Acked-by: Edgar E. Iglesias <edgar.iglesias@xilinx.com>
2016-01-21 14:15:05 +00:00
Peter Maydell
1787cc8ee5 exec-all.h: Document tlb_set_page_with_attrs, tlb_set_page
Add documentation comments for tlb_set_page_with_attrs()
and tlb_set_page().

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Acked-by: Edgar E. Iglesias <edgar.iglesias@xilinx.com>
2016-01-21 14:15:04 +00:00
Peter Maydell
12ebc9a76d exec.c: Allow target CPUs to define multiple AddressSpaces
Allow multiple calls to cpu_address_space_init(); each
call adds an entry to the cpu->ases array at the specified
index. It is up to the target-specific CPU code to actually use
these extra address spaces.

Since this multiple AddressSpace support won't work with
KVM, add an assertion to avoid confusing failures.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Acked-by: Edgar E. Iglesias <edgar.iglesias@xilinx.com>
2016-01-21 14:15:04 +00:00
Peter Maydell
56943e8cc1 exec.c: Don't set cpu->as until cpu_address_space_init
Rather than setting cpu->as unconditionally in cpu_exec_init
(and then having target-i386 override this later), don't set
it until the first call to cpu_address_space_init.

This requires us to initialise the address space for
both TCG and KVM (KVM doesn't need the AS listener but
it does require cpu->as to be set).

For target CPUs which don't set up any address spaces (currently
everything except i386), add the default address_space_memory
in qemu_init_vcpu().

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Edgar E. Iglesias <edgar.iglesias@xilinx.com>
Acked-by: Edgar E. Iglesias <edgar.iglesias@xilinx.com>
2016-01-21 14:15:04 +00:00
Tetsuya Mukawa
56a571d9c8 ivshmem: Store file descriptor for vhost-user negotiation
If virtio-net driver allocates memory in ivshmem shared memory,
vhost-net will work correctly, but vhost-user will not work because
a fd of shared memory will not be sent to vhost-user backend.
This patch fixes ivshmem to store file descriptor of shared memory.
It will be used when vhost-user negotiates vhost-user backend.

Signed-off-by: Tetsuya Mukawa <mukawa@igel.co.jp>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2016-01-09 23:20:20 +02:00
Paolo Bonzini
3cc8f88499 memory: try to inline constant-length reads
memcpy can take a large amount of time for small reads and writes.
Handle the common case of reading s/g descriptors from memory (there
is no corresponding "write" case that is as common, because writes
often use address_space_st* functions) by inlining the relevant
parts of address_space_read into the caller.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2015-12-17 17:33:49 +01:00
Paolo Bonzini
1619d1fe73 memory: inline a few small accessors
These are used in the address_space_* fast paths.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2015-12-17 17:33:49 +01:00
Paolo Bonzini
a203ac702e memory: extract first iteration of address_space_read and address_space_write
We want to inline the case where there is only one iteration, because
then the compiler can also inline the memcpy.  As a start, extract
everything after the first address_space_translate call.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2015-12-17 17:33:49 +01:00
Paolo Bonzini
612263cf33 memory: avoid unnecessary object_ref/unref
For the common case of DMA into non-hotplugged RAM, it is unnecessary
but expensive to do object_ref/unref.  Add back an owner field to
MemoryRegion, so that these memory regions can skip the reference
counting.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2015-12-17 17:33:48 +01:00
Paolo Bonzini
a676854f34 memory: reorder MemoryRegion fields
Order fields so that all fields accessed during a RAM read/write fit in
the same cache line.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2015-12-17 17:33:48 +01:00
Paolo Bonzini
49b24afcb1 exec: always call qemu_get_ram_ptr within rcu_read_lock
Simplify the code and document the assumption.  The only caller
that is not within rcu_read_lock is memory_region_get_ram_ptr.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2015-12-17 17:33:48 +01:00
Eduardo Habkost
a29ac16632 exec: Eliminate qemu_ram_free_from_ptr()
Replace qemu_ram_free_from_ptr() with qemu_ram_free().

The only difference between qemu_ram_free_from_ptr() and
qemu_ram_free() is that g_free_rcu() is used instead of
call_rcu(reclaim_ramblock). We can safely replace it because:

* RAM blocks allocated by qemu_ram_alloc_from_ptr() always have
  RAM_PREALLOC set;
* reclaim_ramblock(block) will do nothing except g_free(block)
  if RAM_PREALLOC is set at block->flags.

Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Message-Id: <1446844805-14492-2-git-send-email-ehabkost@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2015-12-17 15:24:33 +01:00
Paolo Bonzini
0c2d70c448 translate-all: ensure host page mask is always extended with 1's
Anthony reported that >4GB guests on Xen with 32bit QEMU broke after
commit 4ed023c ("Round up RAMBlock sizes to host page sizes", 2015-11-05).

In that patch sizes are masked against qemu_host_page_size/mask which
are uintptr_t, and thus 32bit on a 32bit QEMU, even though the ram space
might be bigger than 4GB on Xen.

Since ram_addr_t is not available on user-mode emulation targets, ensure
that we get a sign extension when masking away the low bits of the address.
Remove the ~10 year old scary comment that the type of these variables
is probably wrong, with another equally scary comment.  The new comment
however does not have "???" in it, which is arguably an improvement.

For completeness use the alignment macros in linux-user and bsd-user
instead of manually doing an &.  linux-user and bsd-user are not affected
by the Xen issue, however.

Reviewed-by: Juan Quintela <quintela@redhat.com>
Reported-by: Anthony PERARD <anthony.perard@citrix.com>
Fixes: 4ed023ce2a
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2015-12-02 13:12:30 +01:00
Dr. David Alan Gilbert
e3dd74934f qemu_ram_block_by_name
Add a function to find a RAMBlock by name; use it in two
of the places that already open code that loop; we've
got another use later in postcopy.

Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
2015-11-10 14:51:48 +01:00
Dr. David Alan Gilbert
422148d3e5 qemu_ram_block_from_host
Postcopy sends RAMBlock names and offsets over the wire (since it can't
rely on the order of ramaddr being the same), and it starts out with
HVA fault addresses from the kernel.

qemu_ram_block_from_host translates a HVA into a RAMBlock, an offset
in the RAMBlock and the global ram_addr_t value.

Rewrite qemu_ram_addr_from_host to use qemu_ram_block_from_host.

Provide qemu_ram_get_idstr since its the actual name text sent on the
wire.

Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Amit Shah <amit.shah@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
2015-11-10 14:51:48 +01:00
Dr. David Alan Gilbert
87f50caa30 Move page_size_init earlier
The HOST_PAGE_ALIGN macros don't work until the page size variables
have been set up; later in postcopy I use those macros in the RAM
code, and it can be triggered using -object.

Fix this by initialising page_size_init() earlier - it's currently
initialised inside the accelerators, move it up into vl.c.

Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
2015-11-10 14:51:48 +01:00
Pavel Dovgalyuk
56c0269a9e cpu-exec: allow temporary disabling icount
This patch is required for deterministic replay to generate an exception
by trying executing an instruction without changing icount.
It adds new flag to TB for disabling icount while translating it.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

Signed-off-by: Pavel Dovgalyuk <pavel.dovgaluk@ispras.ru>
Message-Id: <20150917162359.8676.77011.stgit@PASHA-ISP.def.inno>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2015-11-05 12:19:09 +01:00
Pavel Fedin
a05f686ff3 hw/pci: Introduce pci_requester_id()
For GICv3 ITS implementation we are going to use requester IDs in KVM IRQ
routing code. This patch introduces reusable convenient way to obtain this
ID from the device pointer. The new function is now used in some places,
where the same calculation was used.

MemTxAttrs.stream_id also renamed to requester_id in order to better
reflect semantics of the field.

Signed-off-by: Pavel Fedin <p.fedin@samsung.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Message-Id: <5814bcb03a297f198e796b13ed9c35059c52f89b.1444916432.git.p.fedin@samsung.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2015-10-19 10:13:07 +02:00
Paolo Bonzini
88401cbc5b exec: remove non-TCG stuff from exec-all.h header.
The header is included from basically everywhere, thanks to cpu.h.
It should be moved to the (TCG only) files that actually need it.
As a start, remove non-TCG stuff.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2015-10-12 18:29:26 +02:00
Peter Maydell
32857f4d5e exec.c: Collect AddressSpace related fields into a CPUAddressSpace struct
Gather up all the fields currently in CPUState which deal with the CPU's
AddressSpace into a separate CPUAddressSpace struct. This paves the way
for allowing the CPU to know about more than one AddressSpace.

The rearrangement also allows us to make the MemoryListener a directly
embedded object in the CPUAddressSpace (it could not be embedded in
CPUState because 'struct MemoryListener' isn't defined for the user-only
builds). This allows us to resolve the FIXME in tcg_commit() by going
directly from the MemoryListener to the CPUAddressSpace.

This patch extracts the actual update of the cached dispatch pointer
from cpu_reload_memory_map() (which is renamed accordingly to
cpu_reloading_memory_map() as it is only responsible for breaking
cpu-exec.c's RCU critical section now). This lets us keep the definition
of the CPUAddressSpace struct private to exec.c.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Message-Id: <1443709790-25180-4-git-send-email-peter.maydell@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2015-10-12 18:29:26 +02:00
Peter Maydell
1d27b91723 VFIO updates 2015-10-07
- Change platform device IRQ setup sequence for compatibility
    with upcoming IRQ forwarding (Eric Auger)
  - Extensions to support vfio-pci devices on spapr-pci-host-bridge
    (David Gibson) [clang problem patch dropped]
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQIcBAABAgAGBQJWFTqsAAoJECObm247sIsiXo8P/1hLkZGQ7pqHXj6M+UmkM9ur
 Px6t+ZnFbhyf3tiU8Z0KoB7u+du73Z8E7swKqqcaal68j/zFhNtYC6ACSSGhOsDo
 ROR+/fg6HIJUeKkwVNKTBN5l8s6W6QLxPc/JLWYPI4YwIJj0GEGJNjoebUrcsjtU
 pCkezdMo0Wy2rDJzg5KWeSoZqoXIiWHo5MccgRsWQLf2dVAc6P8T5iNQFqSqy2N/
 1lVLNSoneCWcD+Erw7HjgwP83jwnZWKjPScJvckzXznuHa02k1wSN/ipNf2ENcrz
 C/jXcPczmEsUDpKu6ujtPj2/+X2F+Pz+C+rJsWfgUKo+iiwNqfziuZX0GEd+BqWD
 g8VxvS6+eZ6V6NN2Mhyofdp3hlWI4bcee5ORxAFv4CQjKV3etVSlkFhMARDwmw5V
 h38vvrEDNRxd6DyMR29mgUZ4wIf8u9wicpuQc4CevebPGUzXmMk3KH2hfvD1BJlt
 /SmmZMEkQTBbYQaEChX/op0H0ype+RkoVEs2TYxlGBL0LPkY2FOCCvEbPYCweuVf
 UNFjx4kj1NK4/CvwsXrFfzORp5T21XFWOakbWL+vGM06fBMo6oRmKoMRmZJxmCvT
 k5dBFazeSV5m9t2XS6GQeJoenMzVo9o3s2hS+WhjQqjVgLcC7HbPF+gjcQekRlB1
 wsc5badWI35H+Uio6kqF
 =MzWe
 -----END PGP SIGNATURE-----

Merge remote-tracking branch 'remotes/awilliam/tags/vfio-update-20151007.0' into staging

VFIO updates 2015-10-07

 - Change platform device IRQ setup sequence for compatibility
   with upcoming IRQ forwarding (Eric Auger)
 - Extensions to support vfio-pci devices on spapr-pci-host-bridge
   (David Gibson) [clang problem patch dropped]

# gpg: Signature made Wed 07 Oct 2015 16:30:52 BST using RSA key ID 3BB08B22
# gpg: Good signature from "Alex Williamson <alex.williamson@redhat.com>"
# gpg:                 aka "Alex Williamson <alex@shazbot.org>"
# gpg:                 aka "Alex Williamson <alwillia@redhat.com>"
# gpg:                 aka "Alex Williamson <alex.l.williamson@gmail.com>"

* remotes/awilliam/tags/vfio-update-20151007.0:
  vfio: Allow hotplug of containers onto existing guest IOMMU mappings
  memory: Allow replay of IOMMU mapping notifications
  vfio: Record host IOMMU's available IO page sizes
  vfio: Check guest IOVA ranges against host IOMMU capabilities
  vfio: Generalize vfio_listener_region_add failure path
  vfio: Remove unneeded union from VFIOContainer
  hw/vfio/platform: do not set resamplefd for edge-sensitive IRQS
  hw/vfio/platform: change interrupt/unmask fields into pointer
  hw/vfio/platform: irqfd setup sequence update

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2015-10-08 16:50:34 +01:00
Richard Henderson
126d89e8cd tcg: Adjust CODE_GEN_AVG_BLOCK_SIZE
At present, the "average" guestimate of TB size is way too small, leading
to many unused entries in the pre-allocated TB array.  For a guest with 1GB
ram, we're currently allocating 256MB for the array.

Survey arm, alpha, aarch64, ppc, sparc, i686, x86_64 guests running on
x86_64 and ppc64 hosts and select a new average.  The size of the array
drops to 81MB with no more flushing than before.

Reviewed-by: Aurelien Jarno <aurelien@aurel32.net>
Signed-off-by: Richard Henderson <rth@twiddle.net>
2015-10-07 20:40:00 +11:00
Richard Henderson
b125f9dc7b tcg: Check for overflow via highwater mark
We currently pre-compute an worst case code size for any TB, which
works out to be 122kB.  Since the average TB size is near 1kB, this
wastes quite a lot of storage.

Instead, check for overflow in between generating code for each opcode.
The overhead of the check isn't measurable and wastage is minimized.

Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <rth@twiddle.net>
2015-10-07 20:36:53 +11:00
Richard Henderson
4e5e121515 tcg: Remove gen_intermediate_code_pc
It is no longer used, so tidy up everything reached by it.
This includes the gen_opc_* arrays, the search_pc parameter
and the inline gen_intermediate_code_internal functions.

Reviewed-by: Aurelien Jarno <aurelien@aurel32.net>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <rth@twiddle.net>
2015-10-07 20:36:52 +11:00
Richard Henderson
fca8a500d5 tcg: Save insn data and use it in cpu_restore_state_from_tb
We can now restore state without retranslation.

Reviewed-by: Aurelien Jarno <aurelien@aurel32.net>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <rth@twiddle.net>
2015-10-07 20:36:51 +11:00
Richard Henderson
bad729e272 tcg: Pass data argument to restore_state_to_opc
The gen_opc_* arrays are already redundant with the data stored in
the insn_start arguments.  Transition restore_state_to_opc to use
data from the latter.

Reviewed-by: Aurelien Jarno <aurelien@aurel32.net>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <rth@twiddle.net>
2015-10-07 20:36:51 +11:00
Richard Henderson
fec88f64bd tcg: Merge cpu_gen_code into tb_gen_code
As it's only caller, this tidies things a bit.

Reviewed-by: Aurelien Jarno <aurelien@aurel32.net>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <rth@twiddle.net>
2015-10-07 20:36:49 +11:00
David Gibson
a788f227ef memory: Allow replay of IOMMU mapping notifications
When we have guest visible IOMMUs, we allow notifiers to be registered
which will be informed of all changes to IOMMU mappings.  This is used by
vfio to keep the host IOMMU mappings in sync with guest IOMMU mappings.

However, unlike with a memory region listener, an iommu notifier won't be
told about any mappings which already exist in the (guest) IOMMU at the
time it is registered.  This can cause problems if hotplugging a VFIO
device onto a guest bus which had existing guest IOMMU mappings, but didn't
previously have an VFIO devices (and hence no host IOMMU mappings).

This adds a memory_region_iommu_replay() function to handle this case.  It
replays any existing mappings in an IOMMU memory region to a specified
notifier.  Because the IOMMU memory region doesn't internally remember the
granularity of the guest IOMMU it has a small hack where the caller must
specify a granularity at which to replay mappings.

If there are finer mappings in the guest IOMMU these will be reported in
the iotlb structures passed to the notifier which it must handle (probably
causing it to flag an error).  This isn't new - the VFIO iommu notifier
must already handle notifications about guest IOMMU mappings too short
for it to represent in the host IOMMU.

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Laurent Vivier <lvivier@redhat.com>
Acked-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2015-10-05 12:39:03 -06:00
Peter Crosthwaite
dfccc76023 include/exec: Move cputlb exec.c defs out
Move the architecture agnostic function prototypes for exec.c out of
cputlb.h to exec-all.h. This allows hiding of the arch specific
cputlb.h from exec.c which should be getting close to having no
architecture specifics. Prepares support for multi-arch, which will have
a minimal cpu.h that services exec.c but not cputlb.h.

Reviewed-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Peter Crosthwaite <crosthwaite.peter@gmail.com>
Message-Id: <b4fe754c58c860315e35d44430c26b1c967ce2c9.1441614289.git.crosthwaite.peter@gmail.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2015-09-16 17:33:33 +02:00
Peter Crosthwaite
bcae01e468 cputlb: Change tlb_set_dirty() arg to cpu
Change tlb_set_dirty() to accept a CPU instead of an env pointer. This
allows for removal of another CPUArchState usage from prototypes that
need to be QOMified.

Signed-off-by: Peter Crosthwaite <crosthwaite.peter@gmail.com>
Message-Id: <d2b1dcbe7945112989861d8ba7369449c11cc273.1441614289.git.crosthwaite.peter@gmail.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2015-09-16 17:33:33 +02:00
Peter Crosthwaite
9a13565d52 cputlb: move CPU_LOOP() for tlb_reset() to exec.c
To prepare for multi-arch, cputlb.c should only have awareness of one
single architecture. This means it should not have access to the full
CPU lists which may be heterogeneous. Instead, push the CPU_LOOP() up
to the one and only caller in exec.c.

Signed-off-by: Peter Crosthwaite <crosthwaite.peter@gmail.com>
Message-Id: <db06dc6c49f8970caaf116d0385f00ee10a56f2f.1441614289.git.crosthwaite.peter@gmail.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2015-09-16 17:33:33 +02:00
Peter Maydell
a2aa09e181 * Support for jemalloc
* qemu_mutex_lock_iothread "No such process" fix
 * cutils: qemu_strto* wrappers
 * iohandler.c simplification
 * Many other fixes and misc patches.
 
 And some MTTCG work (with Emilio's fixes squashed):
 * Signal-free TCG kick
 * Removing spinlock in favor of QemuMutex
 * User-mode emulation multi-threading fixes/docs
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2
 
 iQEcBAABCAAGBQJV8Tk7AAoJEL/70l94x66Ds3QH/3bi0RRR2NtKIXAQrGo5tfuD
 NPMu1K5Hy+/26AC6mEVNRh4kh7dPH5E4NnDGbxet1+osvmpjxAjc2JrxEybhHD0j
 fkpzqynuBN6cA2Gu5GUNoKzxxTmi2RrEYigWDZqCftRXBeO2Hsr1etxJh9UoZw5H
 dgpU3j/n0Q8s08jUJ1o789knZI/ckwL4oXK4u2KhSC7ZTCWhJT7Qr7c0JmiKReaF
 JEYAsKkQhICVKRVmC8NxML8U58O8maBjQ62UN6nQpVaQd0Yo/6cstFTZsRrHMHL3
 7A2Tyg862cMvp+1DOX3Bk02yXA+nxnzLF8kUe0rYo6llqDBDStzqyn1j9R0qeqA=
 =nB06
 -----END PGP SIGNATURE-----

Merge remote-tracking branch 'remotes/bonzini/tags/for-upstream' into staging

* Support for jemalloc
* qemu_mutex_lock_iothread "No such process" fix
* cutils: qemu_strto* wrappers
* iohandler.c simplification
* Many other fixes and misc patches.

And some MTTCG work (with Emilio's fixes squashed):
* Signal-free TCG kick
* Removing spinlock in favor of QemuMutex
* User-mode emulation multi-threading fixes/docs

# gpg: Signature made Thu 10 Sep 2015 09:03:07 BST using RSA key ID 78C7AE83
# gpg: Good signature from "Paolo Bonzini <bonzini@gnu.org>"
# gpg:                 aka "Paolo Bonzini <pbonzini@redhat.com>"

* remotes/bonzini/tags/for-upstream: (44 commits)
  cutils: work around platform differences in strto{l,ul,ll,ull}
  cpu-exec: fix lock hierarchy for user-mode emulation
  exec: make mmap_lock/mmap_unlock globally available
  tcg: comment on which functions have to be called with mmap_lock held
  tcg: add memory barriers in page_find_alloc accesses
  remove unused spinlock.
  replace spinlock by QemuMutex.
  cpus: remove tcg_halt_cond and tcg_cpu_thread globals
  cpus: protect work list with work_mutex
  scripts/dump-guest-memory.py: fix after RAMBlock change
  configure: Add support for jemalloc
  add macro file for coccinelle
  configure: factor out adding disas configure
  vhost-scsi: fix wrong vhost-scsi firmware path
  checkpatch: remove tests that are not relevant outside the kernel
  checkpatch: adapt some tests to QEMU
  CODING_STYLE: update mixed declaration rules
  qmp: Add example usage of strto*l() qemu wrapper
  cutils: Add qemu_strtoull() wrapper
  cutils: Add qemu_strtoll() wrapper
  ...

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2015-09-14 16:13:16 +01:00
Pavel Dovgalyuk
1c3c8af1fb cpu-exec: introduce loop exit with restore function
This patch introduces loop exit function, which also
restores guest CPU state according to the value of host
program counter.

Reviewed-by: Aurelien Jarno <aurelien@aurel32.net>
Signed-off-by: Pavel Dovgalyuk <pavel.dovgaluk@ispras.ru>
Message-Id: <20150710095702.13280.97477.stgit@PASHA-ISP>
Signed-off-by: Richard Henderson <rth@twiddle.net>
2015-09-11 08:16:16 -07:00
Pavel Dovgalyuk
b8611499b9 softmmu: remove now unused functions
Now that the cpu_ld/st_* function directly call helper_ret_ld/st, we can
drop the old helper_ld/st functions.

Reviewed-by: Aurelien Jarno <aurelien@aurel32.net>
Signed-off-by: Pavel Dovgalyuk <pavel.dovgaluk@ispras.ru>
Message-Id: <20150710095656.13280.7085.stgit@PASHA-ISP>
Signed-off-by: Richard Henderson <rth@twiddle.net>
2015-09-11 08:16:05 -07:00
Pavel Dovgalyuk
282dffc8a4 softmmu: add helper function to pass through retaddr
This patch introduces several helpers to pass return address
which points to the TB. Correct return address allows correct
restoring of the guest PC and icount. These functions should be used when
helpers embedded into TB invoke memory operations.

Reviewed-by: Aurelien Jarno <aurelien@aurel32.net>
Signed-off-by: Pavel Dovgalyuk <pavel.dovgaluk@ispras.ru>
Message-Id: <20150710095650.13280.32255.stgit@PASHA-ISP>
Signed-off-by: Richard Henderson <rth@twiddle.net>
2015-09-11 08:15:32 -07:00
Benjamin Herrenschmidt
97ed5ccdee tlb: Add "ifetch" argument to cpu_mmu_index()
This is set to true when the index is for an instruction fetch
translation.

The core get_page_addr_code() sets it, as do the SOFTMMU_CODE_ACCESS
acessors.

All targets ignore it for now, and all other callers pass "false".

This will allow targets who wish to split the mmu index between
instruction and data accesses to do so. A subsequent patch will
do just that for PowerPC.

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Message-Id: <1439796853-4410-2-git-send-email-benh@kernel.crashing.org>
Signed-off-by: Richard Henderson <rth@twiddle.net>
2015-09-11 08:15:28 -07:00
Veres Lajos
67cc32ebfd typofixes - v4
Signed-off-by: Veres Lajos <vlajos@gmail.com>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2015-09-11 10:45:43 +03:00
Daniel P. Berrange
b6af097528 maint: remove / fix many doubled words
Many source files have doubled words (eg "the the", "to to",
and so on). Most of these can simply be removed, but a couple
were actual mis-spellings (eg "to to" instead of "to do").
There was even one triple word score "to to to" :-)

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2015-09-11 10:21:38 +03:00
Paolo Bonzini
8fd19e6cfd exec: make mmap_lock/mmap_unlock globally available
There is some iffy lock hierarchy going on in translate-all.c.  To
fix it, we need to take the mmap_lock in cpu-exec.c.  Make the
functions globally available.

Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2015-09-09 15:34:56 +02:00
KONRAD Frederic
2496ff1311 remove unused spinlock.
This just removes spinlock as it is not used anymore.

Signed-off-by: KONRAD Frederic <fred.konrad@greensocs.com>
Message-Id: <1439220437-23957-6-git-send-email-fred.konrad@greensocs.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2015-09-09 15:34:55 +02:00
KONRAD Frederic
677ef6230b replace spinlock by QemuMutex.
spinlock is only used in two cases:
  * cpu-exec.c: to protect TranslationBlock
  * mem_helper.c: for lock helper in target-i386 (which seems broken).

It's a pthread_mutex_t in user-mode, so we can use QemuMutex directly,
with an #ifdef.  The #ifdef will be removed when multithreaded TCG
will need the mutex as well.

Signed-off-by: KONRAD Frederic <fred.konrad@greensocs.com>
Message-Id: <1439220437-23957-5-git-send-email-fred.konrad@greensocs.com>
Signed-off-by: Emilio G. Cota <cota@braap.org>
[Merge Emilio G. Cota's patch to remove volatile. - Paolo]
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2015-09-09 15:34:55 +02:00
Dr. David Alan Gilbert
3c9589e180 Move RAMBlock and ram_list to ram_addr.h
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Message-Id: <1439547914-18249-1-git-send-email-dgilbert@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2015-09-09 15:34:54 +02:00
Paolo Bonzini
e0c382113f tcg: signal-free qemu_cpu_kick
Signals are slow and do not exist on Win32.  The previous patches
have done most of the legwork to introduce memory barriers (some
of them were even there already for the sake of Windows!) and
we can now set the flags directly in the iothread.

qemu_cpu_kick_thread is not used anymore on TCG, since the TCG thread is
never outside usermode while the CPU is running (not halted).  Instead run
the content of the signal handler (now in qemu_cpu_kick_no_halt) directly.
qemu_cpu_kick_no_halt is also used in qemu_mutex_lock_iothread to avoid
the overhead of qemu_cond_broadcast.

Reviewed-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2015-09-09 15:34:54 +02:00
Paolo Bonzini
9373e63297 tcg: introduce tcg_current_cpu
This is already useful on Windows in order to remove tls.h, because
accesses to current_cpu are done from a different thread on that
platform.  It will be used on POSIX platforms as soon TCG stops using
signals to interrupt the execution of translated code.

Reviewed-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2015-09-09 15:34:53 +02:00
Peter Maydell
44d4a499b7 include/exec/softmmu-semi.h: Add support for 64-bit values
Add support for getting and setting 64-bit values in the
softmmu semihosting support functions. This will be needed
for 64-bit ARM semihosting.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Tested-by: Christopher Covington <cov@codeaurora.org>
Message-id: 1439483745-28752-6-git-send-email-peter.maydell@linaro.org
2015-09-07 10:39:27 +01:00
Peter Maydell
19239b39e7 gdbstub: Implement gdb_do_syscallv()
Implement a variant of the existing gdb_do_syscall() which
takes a va_list.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Tested-by: Christopher Covington <cov@codeaurora.org>
Message-id: 1439483745-28752-4-git-send-email-peter.maydell@linaro.org
2015-09-07 10:39:27 +01:00
Peter Crosthwaite
a17d448274 exec-all: Translate TCI return addresses backwards too
This subtraction of return addresses applies directly to TCI as well as
host-TCG. This fixes Linux boots for at least Microblaze, CRIS, ARM and
SH4 when using TCI.

[sw: Removed indentation for preprocessor statement]
[sw: The patch also fixes Linux boot for x86_64]

Reviewed-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Stefan Weil <sw@weilnetz.de>
Signed-off-by: Peter Crosthwaite <crosthwaite.peter@gmail.com>
2015-08-26 20:50:46 +02:00
Peter Maydell
d7a74a9d4a cputlb: Add functions for flushing TLB for a single MMU index
Guest CPU TLB maintenance operations may be sufficiently
specialized to only need to flush TLB entries corresponding
to a particular MMU index. Implement cputlb functions for
this, to avoid the inefficiency of flushing TLB entries
which we don't need to.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Edgar E. Iglesias <edgar.iglesias@xilinx.com>
Message-id: 1439548879-1972-2-git-send-email-peter.maydell@linaro.org
2015-08-25 16:18:33 +01:00
Laurent Vivier
b76f21a707 linux-user: remove useless macros GUEST_BASE and RESERVED_VA
As we have removed CONFIG_USE_GUEST_BASE, we always use a guest base
and the macros GUEST_BASE and RESERVED_VA become useless: replace
them by their values.

Reviewed-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Laurent Vivier <laurent@vivier.eu>
Message-Id: <1440420834-8388-1-git-send-email-laurent@vivier.eu>
Signed-off-by: Richard Henderson <rth@twiddle.net>
2015-08-24 11:14:30 -07:00
Laurent Vivier
4cbea59869 linux-user: remove --enable-guest-base/--disable-guest-base
All tcg host architectures now support the guest base and as
there is no real performance lost, it can be always enabled.

Anyway, guest base use can be disabled lively by setting guest
base to 0.

CONFIG_USE_GUEST_BASE is defined as (USE_GUEST_BASE && USER_ONLY),
it should have to be replaced by CONFIG_USER_ONLY in non CONFIG_USER_ONLY
parts, but as some other parts are using !CONFIG_SOFTMMU I have chosen to
use !CONFIG_SOFTMMU instead.

Reviewed-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Laurent Vivier <laurent@vivier.eu>
Message-Id: <1440373328-9788-2-git-send-email-laurent@vivier.eu>
Signed-off-by: Richard Henderson <rth@twiddle.net>
2015-08-24 11:14:17 -07:00
Peter Maydell
5452b6f61a * SCSI fixes from Stefan and Fam
* vhost-scsi fix from Igor and Lu Lina
 * a build system fix from Daniel
 * two more multi-arch-related patches from Peter C.
 * TCG patches from myself and Sergey Fedorov
 * RCU improvement from Wen Congyang
 * a few more simple cleanups
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2
 
 iQEcBAABCAAGBQJVzmCgAAoJEL/70l94x66DhFgH/1m3iGac2Ks3vAUAdS2HBcxC
 EeziMwWFmkrfbtzUkz/jE0NG5uA2Bs8OFHsC8vmQFwkpDbGUlJ1zd5/N5UOHMG3d
 zF0vd+nKNw9C1Fo0/LPyQSeP64/xXEMTmFLqmYf4ZOowz8lr/m6WYrMIzKUoXSEn
 FeRtq78moDT8qwF372j8aoQUUpsctXDHBQHORZdcERvlc4mxojeJ3+mNViR2bv3r
 92PwGvrJ26mQXEKmGo5O1VM4k7QVg7xJQfgE11x7ShE2E9fJDMgts0Q/xCjWCLwS
 BXtEtbd9QeFEfG/mlRFevGtuvksq98m0hN7lAWb13zWmlJFuLyyMmlGfGAlU55Q=
 =Y2DB
 -----END PGP SIGNATURE-----

Merge remote-tracking branch 'remotes/bonzini/tags/for-upstream' into staging

* SCSI fixes from Stefan and Fam
* vhost-scsi fix from Igor and Lu Lina
* a build system fix from Daniel
* two more multi-arch-related patches from Peter C.
* TCG patches from myself and Sergey Fedorov
* RCU improvement from Wen Congyang
* a few more simple cleanups

# gpg: Signature made Fri 14 Aug 2015 22:41:52 BST using RSA key ID 78C7AE83
# gpg: Good signature from "Paolo Bonzini <bonzini@gnu.org>"
# gpg:                 aka "Paolo Bonzini <pbonzini@redhat.com>"
# gpg: WARNING: This key is not certified with sufficiently trusted signatures!
# gpg:          It is not certain that the signature belongs to the owner.
# Primary key fingerprint: 46F5 9FBD 57D6 12E7 BFD4  E2F7 7E15 100C CD36 69B1
#      Subkey fingerprint: F133 3857 4B66 2389 866C  7682 BFFB D25F 78C7 AE83

* remotes/bonzini/tags/for-upstream:
  disas: Defeature print_target_address
  hw: fix mask for ColdFire UART command register
  scsi-generic: identify AIO callbacks more clearly
  scsi-disk: identify AIO callbacks more clearly
  scsi: create restart bottom half in the right AioContext
  configure: only add CONFIG_RDMA to config-host.h once
  qemu-nbd: remove unnecessary qemu_notify_event()
  vhost-scsi: Clarify vhost_virtqueue_mask argument
  exec: use macro ROUND_UP for alignment
  rcu: Allow calling rcu_(un)register_thread() during synchronize_rcu()
  exec: drop cpu_can_do_io, just read cpu->can_do_io
  cpu_defs: Simplify CPUTLB padding logic
  cpu-exec: Do not invalidate original TB in cpu_exec_nocache()
  vhost/scsi: call vhost_dev_cleanup() at unrealize() time
  virtio-scsi-test: Add test case for tail unaligned WRITE SAME
  scsi-disk: Fix assertion failure on WRITE SAME
  tests: virtio-scsi: clear unit attention after reset
  scsi-disk: fix cmd.mode field typo
  virtio-scsi: use virtqueue_map_sg() when loading requests

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2015-08-18 17:06:41 +01:00
Paolo Bonzini
414b15c909 exec: drop cpu_can_do_io, just read cpu->can_do_io
After commit 626cf8f (icount: set can_do_io outside TB execution,
2014-12-08), can_do_io is set to 1 if not executing code.  It is
no longer necessary to make this assumption in cpu_can_do_io.

It is also possible to remove the use_icount test, simply by
never setting cpu->can_do_io to 0 unless use_icount is true.

With these changes cpu_can_do_io boils down to a read of
cpu->can_do_io.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2015-08-14 23:40:32 +02:00
Pavel Fedin
6d6d2abf2c Merge memory_region_init_reservation() into memory_region_init_io()
Just specifying ops = NULL in some cases can be more convenient than having
two functions.

Signed-off-by: Pavel Fedin <p.fedin@samsung.com>
Acked-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 78a379ab1b6b30ab497db7971ad336dad1dbee76.1438758065.git.p.fedin@samsung.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2015-08-13 11:26:21 +01:00
Peter Crosthwaite
b4a4b8d0e0 cpu_defs: Simplify CPUTLB padding logic
There was a complicated subtractive arithmetic for determining the
padding on the CPUTLBEntry structure. Simplify this with a union.

Signed-off-by: Peter Crosthwaite <crosthwaite.peter@gmail.com>
Message-Id: <1436130533-18565-1-git-send-email-crosthwaite.peter@gmail.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2015-08-06 12:04:08 +02:00
Sergey Fedorov
02d57ea115 cpu-exec: Do not invalidate original TB in cpu_exec_nocache()
Instead of invalidating an original TB in cpu_exec_nocache()
prematurely, just save a link to it in the temporary generated TB. If
cpu_io_recompile() is raised subsequently from the temporary TB,
invalidate the original one as well. That allows reusing the original TB
each time cpu_exec_nocache() is called to handle expired instruction
counter in icount mode.

Signed-off-by: Sergey Fedorov <serge.fdrv@gmail.com>
Message-Id: <1435656909-29116-1-git-send-email-serge.fdrv@gmail.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2015-08-06 12:04:08 +02:00
Paolo Bonzini
deb809edb8 memory: count number of active VGA logging clients
For a board that has multiple framebuffer devices, both of them
might want to use DIRTY_MEMORY_VGA on the same memory region.
The lack of reference counting in memory_region_set_log makes
this very awkward to implement.

Suggested-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2015-07-24 13:57:45 +02:00
Peter Crosthwaite
4bad9e392e cpu: Change cpu_exec_init() arg to cpu, not env
The callers (most of them in target-foo/cpu.c) to this function all
have the cpu pointer handy. Just pass it to avoid an ENV_GET_CPU() from
core code (in exec.c).

Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Richard Henderson <rth@twiddle.net>
Cc: Peter Maydell <peter.maydell@linaro.org>
Cc: "Edgar E. Iglesias" <edgar.iglesias@gmail.com>
Cc: Eduardo Habkost <ehabkost@redhat.com>
Cc: Michael Walle <michael@walle.cc>
Cc: Leon Alrae <leon.alrae@imgtec.com>
Cc: Anthony Green <green@moxielogic.com>
Cc: Jia Liu <proljc@gmail.com>
Cc: Alexander Graf <agraf@suse.de>
Cc: Blue Swirl <blauwirbel@gmail.com>
Cc: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Cc: Bastian Koppelmann <kbastian@mail.uni-paderborn.de>
Cc: Guan Xuetao <gxt@mprc.pku.edu.cn>
Cc: Max Filippov <jcmvbkbc@gmail.com>
Reviewed-by: Andreas Färber <afaerber@suse.de>
Reviewed-by: Aurelien Jarno <aurelien@aurel32.net>
Signed-off-by: Peter Crosthwaite <crosthwaite.peter@gmail.com>
Signed-off-by: Andreas Färber <afaerber@suse.de>
2015-07-09 15:20:40 +02:00
Peter Crosthwaite
f7ec7f7b26 gdbstub: Change gdbserver_fork() to accept cpu instead of env
All callsites to this function navigate the cpu->env_ptr only for the
function to take the env ptr back to the original cpu ptr. Change the
function to just pass in the CPU pointer instead. Removes a core code
usage of ENV_GET_CPU() (in gdbstub.c).

Cc: Riku Voipio <riku.voipio@iki.fi>
Reviewed-by: Andreas Färber <afaerber@suse.de>
Reviewed-by: Eduardo Habkost <ehabkost@redhat.com>
Signed-off-by: Peter Crosthwaite <crosthwaite.peter@gmail.com>
Signed-off-by: Andreas Färber <afaerber@suse.de>
2015-07-09 15:20:40 +02:00
Peter Crosthwaite
bbd77c180d translate-all: Change tb_flush() env argument to cpu
All of the core-code usages of this API have the cpu pointer handy so
pass it in. There are only 3 architecture specific usages (2 of which
are commented out) which can just use ENV_GET_CPU() locally to get the
cpu pointer. The reduces core code usage of the CPU env, which brings
us closer to common-obj'ing these core files.

Cc: Riku Voipio <riku.voipio@iki.fi>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Eduardo Habkost <ehabkost@redhat.com>
Acked-by: Eduardo Habkost <ehabkost@redhat.com>
Signed-off-by: Peter Crosthwaite <crosthwaite.peter@gmail.com>
Signed-off-by: Andreas Färber <afaerber@suse.de>
2015-07-09 15:20:40 +02:00
Bharata B Rao
5a790cc4b9 cpu: Add Error argument to cpu_exec_init()
Add an Error argument to cpu_exec_init() to let users collect the
error. This is in preparation to change the CPU enumeration logic
in cpu_exec_init(). With the new enumeration logic, cpu_exec_init()
can fail if cpu_index values corresponding to max_cpus have already
been handed out.

Since all current callers of cpu_exec_init() are from instance_init,
use error_abort Error argument to abort in case of an error.

Signed-off-by: Bharata B Rao <bharata@linux.vnet.ibm.com>
Reviewed-by: Eduardo Habkost <ehabkost@redhat.com>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Peter Crosthwaite <peter.crosthwaite@xilinx.com>
Acked-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Peter Crosthwaite <crosthwaite.peter@gmail.com>
Signed-off-by: Andreas Färber <afaerber@suse.de>
2015-07-09 15:20:40 +02:00
Li Zhijian
dd63169766 migration: extend migration_bitmap
Prevously, if we hotplug a device(e.g. device_add e1000) during
migration is processing in source side, qemu will add a new ram
block but migration_bitmap is not extended.
In this case, migration_bitmap will overflow and lead qemu abort
unexpectedly.

Signed-off-by: Li Zhijian <lizhijian@cn.fujitsu.com>
Signed-off-by: Wen Congyang <wency@cn.fujitsu.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
2015-07-07 14:54:56 +02:00
Peter Crosthwaite
4e51361d79 cpu-all: complete "real" host page size API
Currently the "host" page size alignment API is really aligning to both
host and target page sizes. There is the qemu_real_page_size which can
be used for the actual host page size but it's missing a mask and ALIGN
macro as provided for qemu_page_size. Complete the API. This allows
system level code that cares about the host page size to use a
consistent alignment interface without having to un-needingly align to
the target page size. This also reduces system level code dependency
on the cpu specific TARGET_PAGE_SIZE.

Signed-off-by: Peter Crosthwaite <crosthwaite.peter@gmail.com>
Tested-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Acked-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2015-07-06 12:15:12 -06:00
Peter Maydell
7edd8e4660 * more of Peter Crosthwaite's multiarch preparation patches
* unlocked MMIO support in KVM
 * support for compilation with ICC
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2
 
 iQEcBAABCAAGBQJVmnuoAAoJEL/70l94x66DKTUH/RFrc20KXRkn/Pb/8qHY/wFz
 Wt3YaS5VYPHElHbxHSdpwlV3K50FAX4QaC25Dnw4dsTelyxe5k7od+I7x8PQxD9v
 3N+mFFF1BV6PqXTPVnUCnb14EXprJX524E97O6Z3lDGcwSLHDxeveSCk3IvMFErz
 JzP3vtigSvtdPPQXlGcndP/r1EXeVjgNIsZ+NKaI/kmoSz1fHFrCN8hTnrxA9RSI
 ZPhfmgHI5EMFtAf/HiZID6GSHOHajgeRT2bIiiy1okS++no0uRZlVMvcnFNPZHoG
 e9XCGBXJSdmCoi7sIgShXirKszxYkRTbCyxxjz6aYfhrQzo0h+Yn9OPuvQrgynE=
 =+YEv
 -----END PGP SIGNATURE-----

Merge remote-tracking branch 'remotes/bonzini/tags/for-upstream' into staging

* more of Peter Crosthwaite's multiarch preparation patches
* unlocked MMIO support in KVM
* support for compilation with ICC

# gpg: Signature made Mon Jul  6 13:59:20 2015 BST using RSA key ID 78C7AE83
# gpg: Good signature from "Paolo Bonzini <bonzini@gnu.org>"
# gpg:                 aka "Paolo Bonzini <pbonzini@redhat.com>"
# gpg: WARNING: This key is not certified with sufficiently trusted signatures!
# gpg:          It is not certain that the signature belongs to the owner.
# Primary key fingerprint: 46F5 9FBD 57D6 12E7 BFD4  E2F7 7E15 100C CD36 69B1
#      Subkey fingerprint: F133 3857 4B66 2389 866C  7682 BFFB D25F 78C7 AE83

* remotes/bonzini/tags/for-upstream:
  exec: skip MMIO regions correctly in cpu_physical_memory_write_rom_internal
  Stop including qemu-common.h in memory.h
  kvm: Switch to unlocked MMIO
  acpi: mark PMTIMER as unlocked
  kvm: Switch to unlocked PIO
  kvm: First step to push iothread lock out of inner run loop
  memory: let address_space_rw/ld*/st* run outside the BQL
  exec: pull qemu_flush_coalesced_mmio_buffer() into address_space_rw/ld*/st*
  memory: Add global-locking property to memory regions
  main-loop: introduce qemu_mutex_iothread_locked
  main-loop: use qemu_mutex_lock_iothread consistently
  Fix irq route entries exceeding KVM_MAX_IRQ_ROUTES
  cpu-defs: Move out TB_JMP defines
  include/exec: Move tb hash functions out
  include/exec: Move standard exceptions to cpu-all.h
  cpu-defs: Move CPU_TEMP_BUF_NLONGS to tcg
  memory_mapping: Rework cpu related includes
  cutils: allow compilation with icc
  qemu-common: add VEC_OR macro

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2015-07-06 14:03:44 +01:00
Peter Maydell
fba0a593b2 Stop including qemu-common.h in memory.h
Including qemu-common.h from other header files is generally a bad
idea, because it means it's very easy to end up with a circular
dependency. For instance, if we wanted to include memory.h from
qom/cpu.h we'd end up with this loop:
 memory.h -> qemu-common.h -> cpu.h -> cpu-qom.h -> qom/cpu.h -> memory.h

Remove the include from memory.h. This requires us to fix up a few
other files which were inadvertently getting declarations indirectly
through memory.h.

The biggest change is splitting the fprintf_function typedef out
into its own header so other headers can get at it without having
to include qemu-common.h.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Message-Id: <1435933104-15216-1-git-send-email-peter.maydell@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2015-07-06 14:59:09 +02:00
Jan Kiszka
196ea13104 memory: Add global-locking property to memory regions
This introduces the memory region property "global_locking". It is true
by default. By setting it to false, a device model can request BQL-free
dispatching of region accesses to its r/w handlers. The actual BQL
break-up will be provided in a separate patch.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Cc: Frederic Konrad <fred.konrad@greensocs.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Message-Id: <1434646046-27150-4-git-send-email-pbonzini@redhat.com>
2015-07-01 15:45:50 +02:00
Peter Crosthwaite
41da4bd642 cpu-defs: Move out TB_JMP defines
These are not Architecture specific in any way so move them out of
cpu-defs.h. tb-hash.h is an appropriate place as a leading user and
their strong relationship to TB hashing and caching.

Reviewed-by: Richard Henderson <rth@redhat.com>
Signed-off-by: Peter Crosthwaite <crosthwaite.peter@gmail.com>
Message-Id: <43ceca65a3fa240efac49aa0bf604ad0442e1710.1433052532.git.crosthwaite.peter@gmail.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2015-06-26 16:00:51 +02:00
Peter Crosthwaite
e1b89321ba include/exec: Move tb hash functions out
This is one of very few things in exec-all with a genuine CPU
architecture dependency. Move these hashing helpers to a new
header to trim exec-all.h down to a near architecture-agnostic
header.

The defs are only used by cpu-exec and translate-all which are both
arch-obj's so the new tb-hash.h has no core code usage.

Reviewed-by: Richard Henderson <rth@redhat.com>
Signed-off-by: Peter Crosthwaite <crosthwaite.peter@gmail.com>
Message-Id: <9d048b96f7cfa64a4d9c0b88e0dd2877fac51d41.1433052532.git.crosthwaite.peter@gmail.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2015-06-26 16:00:51 +02:00
Peter Crosthwaite
9e0dc48c9f include/exec: Move standard exceptions to cpu-all.h
These exception indicies are generic and don't have any reliance on the
per-arch cpu.h defs. Move them to cpu-all.h so they can be used by core
code that does not have access to cpu-defs.h.

Reviewed-by: Richard Henderson <rth@redhat.com>
Signed-off-by: Peter Crosthwaite <crosthwaite.peter@gmail.com>
Message-Id: <dbebd3062c7cd4332240891a3564e73f374ddfcd.1433052532.git.crosthwaite.peter@gmail.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2015-06-26 16:00:51 +02:00
Peter Crosthwaite
6e0b07306d cpu-defs: Move CPU_TEMP_BUF_NLONGS to tcg
The usages of this define are pure TCG and there is no architecture
specific variation of the value. Localise it to the TCG engine to
remove another architecture agnostic piece from cpu-defs.h.

This follows on from a28177820a where
temp_buf was moved out of the CPU_COMMON obsoleting the need for
the super early definition.

Reviewed-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Peter Crosthwaite <crosthwaite.peter@gmail.com>
Message-Id: <498e8e5325c1a1aff79e5bcfc28cb760ef6b214e.1433052532.git.crosthwaite.peter@gmail.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2015-06-26 16:00:50 +02:00
Maciej W. Rozycki
9f6f7ca149 include/softmmu-semi.h: Make semihosting support 64-bit clean
Correct addresses passed around in semihosting to use a data type suitable
for both 32-bit and 64-bit targets.

Signed-off-by: Maciej W. Rozycki <macro@codesourcery.com>
Signed-off-by: Leon Alrae <leon.alrae@imgtec.com>
Reviewed-by: Aurelien Jarno <aurelien@aurel32.net>
2015-06-26 09:08:50 +01:00
Leon Alrae
a59d31a1eb semihosting: add --semihosting-config arg sub-argument
Add new "arg" sub-argument to the --semihosting-config allowing the user
to pass multiple input arguments separately. It is required for example
by UHI semihosting to construct argc and argv.

Also, update ARM semihosting to support new option (at the moment it is
the only target which cares about arguments).

If the semihosting is enabled and no semihosting args have been specified,
then fall back to -kernel/-append. The -append string is split on whitespace
before initializing semihosting.argv[1..n]; this is different from what
QEMU MIPS machines' pseudo-bootloaders do (i.e. argv[1] contains the whole
-append), but is more intuitive from UHI user's point of view and Linux
kernel just does not care as it concatenates argv[1..n] into single cmdline
string anyway.

Signed-off-by: Leon Alrae <leon.alrae@imgtec.com>
Message-id: 1434643256-16858-3-git-send-email-leon.alrae@imgtec.com
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2015-06-19 14:17:45 +01:00
Leon Alrae
cfe67cef48 semihosting: create SemihostingConfig structure and semihost.h
Remove semihosting_enabled and semihosting_target and replace them with
SemihostingConfig structure containing equivalent fields. The structure
is defined in vl.c where it is actually set.

Also introduce separate header file include/exec/semihost.h allowing to
access semihosting config related stuff from target specific semihosting
code.

Signed-off-by: Leon Alrae <leon.alrae@imgtec.com>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 1434643256-16858-2-git-send-email-leon.alrae@imgtec.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2015-06-19 14:17:45 +01:00
Aurelien Jarno
2e83c49626 softmmu: provide tlb_vaddr_to_host function for user mode
To avoid to many #ifdef in target code, provide a tlb_vaddr_to_host for
both user and softmmu modes. In the first case the function always
succeed and just call the g2h function.

Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
Signed-off-by: Alexander Graf <agraf@suse.de>
2015-06-17 12:40:51 +02:00
Alexander Graf
8be656b87c linux-user: Allocate thunk size dynamically
We store all struct types in an array of static size without ever
checking whether we overrun it. Of course some day someone (like me
in another, ancient ALSA enabling patch set) will run into the limit
without realizing it.

So let's make the allocation dynamic. We already know the number of
structs that we want to allocate, so we only need to pass the variable
into the respective piece of code.

Also, to ensure we don't accidently overwrite random memory, add some
asserts to sanity check whether a thunk is actually part of our array.

Signed-off-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Riku Voipio <riku.voipio@linaro.org>
2015-06-15 11:36:58 +03:00
Peter Maydell
4cb618abc1 MIPS patches 2015-06-12
Changes:
 * improve dp8393x network card and rc4030 chipset emulation
 * support misaligned R6 and MSA memory accesses
 * support MIPS eXtended and Large Physical Addressing
 * add Config5.FRE bit and ERETNC instruction (Config5.LLB)
 * support ememsize on MALTA
 -----BEGIN PGP SIGNATURE-----
 
 iQEcBAABAgAGBQJVeppzAAoJEFIRjjwLKdprnk8H/1owSOreh0sMFbosvqlEhjXl
 lvjjuprWMdX+8M1JlaDvTbw6+LDB3Rihp3A6/I9A0GFiZaORmPzg7efULAI1H6ST
 0HfxMAO17eW+PJ3lvk0HidNDr01+RzTvwpizrHgQ9WJubJv0xREU+YG5yn1gPS4N
 aMMTKCAQFDba7iQQLKXUYvLz76+xyzW4VIvHVLx/SU86yPg9T7CwLpppipR8+zY5
 3BC4NUw/xLlS0LCYQGM8XmYgBiQ6lEAz/Y29bGlUg+LeYysjSgNSeoNbOs1M3kQp
 X0Hn7b28I1CjM2wZQ9GkT/ig+jhMvw27motnAe8vKood4ytfcor+dCCS13sE8fg=
 =F564
 -----END PGP SIGNATURE-----

Merge remote-tracking branch 'remotes/lalrae/tags/mips-20150612' into staging

MIPS patches 2015-06-12

Changes:
* improve dp8393x network card and rc4030 chipset emulation
* support misaligned R6 and MSA memory accesses
* support MIPS eXtended and Large Physical Addressing
* add Config5.FRE bit and ERETNC instruction (Config5.LLB)
* support ememsize on MALTA

# gpg: Signature made Fri Jun 12 09:38:11 2015 BST using RSA key ID 0B29DA6B
# gpg: Good signature from "Leon Alrae <leon.alrae@imgtec.com>"
# gpg: WARNING: This key is not certified with a trusted signature!
# gpg:          There is no indication that the signature belongs to the owner.
# Primary key fingerprint: 8DD3 2F98 5495 9D66 35D4  4FC0 5211 8E3C 0B29 DA6B

* remotes/lalrae/tags/mips-20150612: (29 commits)
  target-mips: enable XPA and LPA features
  target-mips: remove misleading comments in translate_init.c
  target-mips: add MTHC0 and MFHC0 instructions
  target-mips: add CP0.PageGrain.ELPA support
  target-mips: support Page Frame Number Extension field
  target-mips: extend selected CP0 registers to 64-bits in MIPS32
  target-mips: correct MFC0 for CP0.EntryLo in MIPS64
  net/dp8393x: fix hardware reset
  net/dp8393x: correctly reset in_use field
  net/dp8393x: add load/save support
  net/dp8393x: add PROM to store MAC address
  net/dp8393x: QOM'ify
  net/dp8393x: use dp8393x_ prefix for all functions
  net/dp8393x: do not use old_mmio accesses
  net/dp8393x: always calculate proper checksums
  dma/rc4030: convert to QOM
  dma/rc4030: use trace events instead of custom logging
  dma/rc4030: document register at offset 0x210
  dma/rc4030: do not use old_mmio accesses
  dma/rc4030: use AddressSpace and address_space_rw in users
  ...

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2015-06-12 12:49:40 +01:00
Dr. David Alan Gilbert
e3807054e2 qemu_ram_foreach_block: pass up error value, and down the ramblock name
check the return value of the function it calls and error if it's non-0
Fixup qemu_rdma_init_one_block that is the only current caller,
  and rdma_add_block the only function it calls using it.

Pass the name of the ramblock to the function; helps in debugging.

Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Amit Shah <amit.shah@redhat.com>
Reviewed-by: Michael R. Hines <mrhines@us.ibm.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
2015-06-12 06:54:01 +02:00
Yongbok Kim
3b4afc9e75 softmmu: Add probe_write()
Probe for whether the specified guest write access is permitted.
If it is not permitted then an exception will be taken in the same
way as if this were a real write access (and we will not return).
Otherwise the function will return, and there will be a valid
entry in the TLB for this access.

Signed-off-by: Yongbok Kim <yongbok.kim@imgtec.com>
Reviewed-by: Leon Alrae <leon.alrae@imgtec.com>
Signed-off-by: Leon Alrae <leon.alrae@imgtec.com>
2015-06-11 10:13:28 +01:00
Paolo Bonzini
f794aa4a2f target-i386: introduce cpu_get_mem_attrs
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2015-06-05 17:10:00 +02:00
Stefan Hajnoczi
5f2cb94688 memory: make cpu_physical_memory_sync_dirty_bitmap() fully atomic
The fast path of cpu_physical_memory_sync_dirty_bitmap() directly
manipulates the dirty bitmap.  Use atomic_xchg() to make the
test-and-clear atomic.

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-Id: <1417519399-3166-7-git-send-email-stefanha@redhat.com>
[Only do xchg on nonzero words. - Paolo]
Reviewed-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2015-06-05 17:10:00 +02:00
Stefan Hajnoczi
03eebc9e32 memory: replace cpu_physical_memory_reset_dirty() with test-and-clear
The cpu_physical_memory_reset_dirty() function is sometimes used
together with cpu_physical_memory_get_dirty().  This is not atomic since
two separate accesses to the dirty memory bitmap are made.

Turn cpu_physical_memory_reset_dirty() and
cpu_physical_memory_clear_dirty_range_type() into the atomic
cpu_physical_memory_test_and_clear_dirty().

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-Id: <1417519399-3166-6-git-send-email-stefanha@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2015-06-05 17:10:00 +02:00
Stefan Hajnoczi
20015f72bd migration: move dirty bitmap sync to ram_addr.h
The dirty memory bitmap is managed by ram_addr.h and copied to
migration_bitmap[] periodically during live migration.

Move the code to sync the bitmap to ram_addr.h where related code lives.

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-Id: <1417519399-3166-5-git-send-email-stefanha@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2015-06-05 17:10:00 +02:00
Stefan Hajnoczi
d114875b9a memory: use atomic ops for setting dirty memory bits
Use set_bit_atomic() and bitmap_set_atomic() so that multiple threads
can dirty memory without race conditions.

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-Id: <1417519399-3166-4-git-send-email-stefanha@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2015-06-05 17:10:00 +02:00
Paolo Bonzini
9460dee4b2 memory: do not touch code dirty bitmap unless TCG is enabled
cpu_physical_memory_set_dirty_lebitmap unconditionally syncs the
DIRTY_MEMORY_CODE bitmap.  This however is unused unless TCG is
enabled.

Reviewed-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2015-06-05 17:10:00 +02:00
Paolo Bonzini
e87f7778b6 exec: only check relevant bitmaps for cleanliness
Most of the time, not all bitmaps have to be marked as dirty;
do not do anything if the interesting ones are already dirty.
Previously, any clean bitmap would have cause all the bitmaps to be
marked dirty.

In fact, unless running TCG most of the time bitmap operations need
not be done at all, because memory_region_is_logging returns zero.
In this case, skip the call to cpu_physical_memory_range_includes_clean
altogether as well.

With this patch, cpu_physical_memory_set_dirty_range is called
unconditionally, so there need not be anymore a separate call to
xen_modified_memory.

Reviewed-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2015-06-05 17:10:00 +02:00
Paolo Bonzini
72b47e79ce exec: invert return value of cpu_physical_memory_get_clean, rename
While it is obvious that cpu_physical_memory_get_dirty returns true even if
a single page is dirty, the same is not true for cpu_physical_memory_get_clean;
one would expect that it returns true only if all the pages are clean, but
it actually looks for even one clean page.  (By contrast, the caller of that
function, cpu_physical_memory_range_includes_clean, has a good name).

To clarify, rename the function to cpu_physical_memory_all_dirty and return
true if _all_ the pages are dirty.  This is the opposite of the previous
meaning, because "all are 1" is the same as "not (any is 0)", so we have to
modify cpu_physical_memory_range_includes_clean as well.

Reviewed-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2015-06-05 17:09:59 +02:00
Paolo Bonzini
58d2707e87 exec: pass client mask to cpu_physical_memory_set_dirty_range
This cuts in half the cost of bitmap operations (which will become more
expensive when made atomic) during migration on non-VRAM regions.

Reviewed-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2015-06-05 17:09:59 +02:00
Paolo Bonzini
9564f52da7 cputlb: remove useless arguments to tlb_unprotect_code_phys, rename
These days modification of the TLB is done in notdirty_mem_write,
so the virtual address and env pointer as unnecessary.

The new name of the function, tlb_unprotect_code, is consistent with
tlb_protect_code.

Reviewed-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2015-06-05 17:09:59 +02:00
Paolo Bonzini
1652b97476 exec: move functions to translate-all.h
Remove them from the sundry exec-all.h header, since they are only used by
the TCG runtime in exec.c and user-exec.c.

Reviewed-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2015-06-05 17:09:59 +02:00
Paolo Bonzini
49dfcec403 ram_addr: tweaks to xen_modified_memory
Invoke xen_modified_memory from cpu_physical_memory_set_dirty_range_nocode;
it is akin to DIRTY_MEMORY_MIGRATION, so set it together with that bitmap.
The remaining call from invalidate_and_set_dirty's "else" branch will go
away soon.

Second, fix the second argument to the function in the
cpu_physical_memory_set_dirty_lebitmap call site.  That function is only used
by KVM, but it is better to be clean anyway.

Acked-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2015-06-05 17:09:59 +02:00
Paolo Bonzini
677e7805cf memory: track DIRTY_MEMORY_CODE in mr->dirty_log_mask
DIRTY_MEMORY_CODE is only needed for TCG.  By adding it directly to
mr->dirty_log_mask, we avoid testing for TCG everywhere a region is
checked for the enabled/disabled state of dirty logging.

Reviewed-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2015-06-05 17:09:59 +02:00
Paolo Bonzini
b2dfd71c48 memory: prepare for multiple bits in the dirty log mask
When the dirty log mask will also cover other bits than DIRTY_MEMORY_VGA,
some listeners may be interested in the overall zero/non-zero value of
the dirty log mask; others may be interested in the value of single bits.

For this reason, always call log_start/log_stop if bits have respectively
appeared or disappeared, and pass the old and new values of the dirty log
mask so that listeners can distinguish the kinds of change.

For example, KVM checks if dirty logging used to be completely disabled
(in log_start) or is now completely disabled (in log_stop).  On the
other hand, Xen has to check manually if DIRTY_MEMORY_VGA changed,
since that is the only bit it cares about.

Reviewed-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2015-06-05 17:09:59 +02:00
Paolo Bonzini
2d1a35bef0 memory: differentiate memory_region_is_logging and memory_region_get_dirty_log_mask
For now memory regions only track DIRTY_MEMORY_VGA individually, but
this will change soon.  To support this, split memory_region_is_logging
in two functions: one that returns a given bit from dirty_log_mask,
and one that returns the entire mask.  memory_region_is_logging gets an
extra parameter so that the compiler flags misuse.

While VGA-specific users (including the Xen listener!) will want to keep
checking that bit, KVM and vhost check for "any bit except migration"
(because migration is handled via the global start/stop listener
callbacks).

Reviewed-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2015-06-05 17:09:58 +02:00
Paolo Bonzini
dbddac6da0 memory: the only dirty memory flag for users is DIRTY_MEMORY_VGA
DIRTY_MEMORY_MIGRATION is triggered by memory_global_dirty_log_start
and memory_global_dirty_log_stop, so it cannot be used with
memory_region_set_log.

Specify this in the documentation and assert it.

Reviewed-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2015-06-05 17:09:58 +02:00
Paolo Bonzini
1de29aef17 softmmu: support up to 12 MMU modes
At 8k per TLB (for 64-bit host or target), 8 or more modes
make the TLBs bigger than 64k, and some RISC TCG backends do
not like that.  On the affected hosts, cut the TLB size in
half---there is still a measurable speedup on PPC with the
next patch.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Message-Id: <1424436345-37924-3-git-send-email-pbonzini@redhat.com>
Reviewed-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Alexander Graf <agraf@suse.de>
2015-06-03 23:56:56 +02:00
Pavel Fedin
38d40ff10f Add stream ID to MSI write
GICv3 ITS distinguishes between devices by using hardwired device IDs passed on the bus.
This patch implements passing these IDs in qemu.
SMMU is also known to use stream IDs, therefore this addition can also be useful for
implementing platforms with SMMU.

Signed-off-by: Pavel Fedin <p.fedin@samsung.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>

 Changes from v1:
- Added bus number to the stream ID
- Added stream ID not only to MSI-X, but also to plain MSI. Some common code was made into
msi_send_message() function.
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2015-05-31 20:29:02 +02:00
Paolo Bonzini
41063e1e7a exec: move rcu_read_lock/unlock to address_space_translate callers
Once address_space_translate will be called outside the BQL, the returned
MemoryRegion might disappear as soon as the RCU read-side critical section
ends.  Avoid this by moving the critical section to the callers.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Message-Id: <1426684909-95030-3-git-send-email-pbonzini@redhat.com>
2015-04-30 16:55:32 +02:00
Peter Maydell
06feaacfb4 - miscellaneous cleanups for TCG (Emilio) and NBD (Bogdan)
- next part in the thread-safe address_space_* saga: atomic access
   to the bounce buffer and the map_clients list, from Fam
 - optional support for linking with tcmalloc, also from Fam
 - reapplying Peter Crosthwaite's "Respect as_translate_internal
   length clamp" after fixing the SPARC fallout.
 - build system fix from Wei Liu
 - small acpi-build and ioport cleanup by myself
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2
 
 iQEcBAABCAAGBQJVQJd4AAoJEL/70l94x66DYFYH/3ifhqWZsd4dfJri0CGAHI4i
 SpPmNeouc8W+F/3lwf6Inrh5NnTgd5QzoUBMQaWVkQKwUiWls8g2mXkT3jo0iDqT
 /B40YXnZjNm20MixNaZmk9AsOF6OqPM8EMufau874k5zTlx3tCGAW1QD+I1N7WK7
 DfsFsIUD1svo2prn55fSoitMG1TIVPnpcklb4YGJRbAacQYUDhr5KAIhT1quDR2R
 93BvToyQmPqRQ4YKqnJLp8HAkL4FaJumfFZVvyh2cZvyaYGN/RVdi2Dw985dJDPX
 /z4enE4GCAs4RDw3lZ1RDbiZDqpT2ibFgASg/arX3SxzqHirOGvMdkOjO99r9j4=
 =aLjh
 -----END PGP SIGNATURE-----

Merge remote-tracking branch 'remotes/bonzini/tags/for-upstream' into staging

- miscellaneous cleanups for TCG (Emilio) and NBD (Bogdan)
- next part in the thread-safe address_space_* saga: atomic access
  to the bounce buffer and the map_clients list, from Fam
- optional support for linking with tcmalloc, also from Fam
- reapplying Peter Crosthwaite's "Respect as_translate_internal
  length clamp" after fixing the SPARC fallout.
- build system fix from Wei Liu
- small acpi-build and ioport cleanup by myself

# gpg: Signature made Wed Apr 29 09:34:00 2015 BST using RSA key ID 78C7AE83
# gpg: Good signature from "Paolo Bonzini <bonzini@gnu.org>"
# gpg:                 aka "Paolo Bonzini <pbonzini@redhat.com>"
# gpg: WARNING: This key is not certified with sufficiently trusted signatures!
# gpg:          It is not certain that the signature belongs to the owner.
# Primary key fingerprint: 46F5 9FBD 57D6 12E7 BFD4  E2F7 7E15 100C CD36 69B1
#      Subkey fingerprint: F133 3857 4B66 2389 866C  7682 BFFB D25F 78C7 AE83

* remotes/bonzini/tags/for-upstream: (22 commits)
  nbd/trivial: fix type cast for ioctl
  translate-all: use bitmap helpers for PageDesc's bitmap
  target-i386: disable LINT0 after reset
  Makefile.target: prepend $libs_softmmu to $LIBS
  milkymist: do not modify libs-softmmu
  configure: Add support for tcmalloc
  exec: Respect as_translate_internal length clamp
  ioport: reserve the whole range of an I/O port in the AddressSpace
  ioport: loosen assertions on emulation of 16-bit ports
  ioport: remove wrong comment
  ide: there is only one data port
  gus: clean up MemoryRegionPortio
  sb16: remove useless mixer_write_indexw
  sun4m: fix slavio sysctrl and led register sizes
  acpi-build: remove dependency from ram_addr.h
  memory: add memory_region_ram_resize
  dma-helpers: Fix race condition of continue_after_map_failure and dma_aio_cancel
  exec: Notify cpu_register_map_client caller if the bounce buffer is available
  exec: Protect map_client_list with mutex
  linux-user, bsd-user: Remove two calls to cpu_exec_init_all
  ...

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2015-04-30 12:04:11 +01:00
Paolo Bonzini
37d7c08413 memory: add memory_region_ram_resize
This is a simple MemoryRegion wrapper for qemu_ram_resize.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2015-04-27 18:24:18 +02:00
Fam Zheng
e95205e1f9 dma-helpers: Fix race condition of continue_after_map_failure and dma_aio_cancel
If DMA's owning thread cancels the IO while the bounce buffer's owning thread
is notifying the "cpu client list", a use-after-free happens:

     continue_after_map_failure               dma_aio_cancel
     ------------------------------------------------------------------
     aio_bh_new
                                              qemu_bh_delete
     qemu_bh_schedule (use after free)

Also, the old code doesn't run the bh in the right AioContext.

Fix both problems by passing a QEMUBH to cpu_register_map_client.

Signed-off-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Message-Id: <1426496617-10702-6-git-send-email-famz@redhat.com>
[Remove unnecessary forward declaration. - Paolo]
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2015-04-27 18:24:18 +02:00
Peter Maydell
0995bf8cd9 target-arm: Add user-mode transaction attribute
Add a transaction attribute indicating that a memory access is being
done from user-mode (unprivileged). This corresponds to an equivalent
signal in ARM AMBA buses.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
2015-04-26 16:49:25 +01:00
Peter Maydell
8bf5b6a9c1 target-arm: Honour NS bits in page tables
Honour the NS bit in ARM page tables:
 * when adding entries to the TLB, include the Secure/NonSecure
   transaction attribute
 * set the NS bit in the PAR when doing ATS operations

Note that we don't yet correctly use the NSTable bit to
cause the page table walk itself to use the right attributes.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
2015-04-26 16:49:25 +01:00
Peter Maydell
500131154d exec.c: Add new address_space_ld*/st* functions
Add new address_space_ld*/st* functions which allow transaction
attributes and error reporting for basic load and stores. These
are named to be in line with the address_space_read/write/rw
buffer operations.

The existing ld/st*_phys functions are now wrappers around
the new functions.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Edgar E. Iglesias <edgar.iglesias@xilinx.com>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
2015-04-26 16:49:24 +01:00
Peter Maydell
5c9eb0286c exec.c: Make address_space_rw take transaction attributes
Make address_space_rw take transaction attributes, rather
than always using the 'unspecified' attributes.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Edgar E. Iglesias <edgar.iglesias@xilinx.com>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
2015-04-26 16:49:24 +01:00
Peter Maydell
fadc1cbe85 Add MemTxAttrs to the IOTLB
Add a MemTxAttrs field to the IOTLB, and allow target-specific
code to set it via a new tlb_set_page_with_attrs() function;
pass the attributes through to the device when making IO accesses.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Edgar E. Iglesias <edgar.iglesias@xilinx.com>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
2015-04-26 16:49:24 +01:00
Peter Maydell
e469b22ffd Make CPU iotlb a structure rather than a plain hwaddr
Make the CPU iotlb a structure rather than a plain hwaddr;
this will allow us to add transaction attributes to it.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Edgar E. Iglesias <edgar.iglesias@xilinx.com>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
2015-04-26 16:49:23 +01:00
Peter Maydell
3b64349539 memory: Replace io_mem_read/write with memory_region_dispatch_read/write
Rather than retaining io_mem_read/write as simple wrappers around
the memory_region_dispatch_read/write functions, make the latter
public and change all the callers to use them, since we need to
touch all the callsites anyway to add MemTxAttrs and MemTxResult
support. Delete io_mem_read and io_mem_write entirely.

(All the callers currently pass MEMTXATTRS_UNSPECIFIED
and convert the return value back to bool or ignore it.)

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
2015-04-26 16:49:23 +01:00
Peter Maydell
cc05c43ad9 memory: Define API for MemoryRegionOps to take attrs and return status
Define an API so that devices can register MemoryRegionOps whose read
and write callback functions are passed an arbitrary pointer to some
transaction attributes and can return a success-or-failure status code.
This will allow us to model devices which:
 * behave differently for ARM Secure/NonSecure memory accesses
 * behave differently for privileged/unprivileged accesses
 * may return a transaction failure (causing a guest exception)
   for erroneous accesses

This patch defines the new API and plumbs the attributes parameter through
to the memory.c public level functions io_mem_read() and io_mem_write(),
where it is currently dummied out.

The success/failure response indication is also propagated out to
io_mem_read() and io_mem_write(), which retain the old-style
boolean true-for-error return.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Acked-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
2015-04-26 16:49:23 +01:00
Richard Henderson
42a268c241 tcg: Change translator-side labels to a pointer
This is improved type checking for the translators -- it's no longer
possible to accidentally swap arguments to the branch functions.

Note that the code generating backends still manipulate labels as int.

With notable exceptions, the scope of the change is just a few lines
for each target, so it's not worth building extra machinery to do this
change in per-target increments.

Cc: Peter Maydell <peter.maydell@linaro.org>
Cc: Edgar E. Iglesias <edgar.iglesias@gmail.com>
Cc: Michael Walle <michael@walle.cc>
Cc: Leon Alrae <leon.alrae@imgtec.com>
Cc: Anthony Green <green@moxielogic.com>
Cc: Jia Liu <proljc@gmail.com>
Cc: Alexander Graf <agraf@suse.de>
Cc: Aurelien Jarno <aurelien@aurel32.net>
Cc: Blue Swirl <blauwirbel@gmail.com>
Cc: Guan Xuetao <gxt@mprc.pku.edu.cn>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Max Filippov <jcmvbkbc@gmail.com>
Reviewed-by: Bastian Koppelmann <kbastian@mail.uni-paderborn.de>
Signed-off-by: Richard Henderson <rth@twiddle.net>
2015-03-13 12:28:18 -07:00
zhanghailiang
87a45cfee6 pc-dimm: add a function to calculate VM's current RAM size
The global parameter 'ram_size' does not take into account
the hotplugged memory.

In some codes, we use 'ram_size' as current VM's real RAM size,
which is not correct.

Add function 'get_current_ram_size' to calculate VM's current RAM size,
it will enumerate present memory devices and also plus ram_size.

Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com>
Signed-off-by: Luiz Capitulino <lcapitulino@redhat.com>
2015-03-04 13:00:04 -05:00
Peter Maydell
73104fd399 - vhost-scsi: add bootindex property
- RCU: fix MemoryRegion lifetime issues in PCI; document the rules;
 convert of AddressSpaceDispatch and RAMList
 - KVM: add kvm_exit reasons for aarch64
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2.0.22 (GNU/Linux)
 
 iQEcBAABAgAGBQJU4hugAAoJEL/70l94x66DZXEH/i72tOgvKZfAjfq2xmHXNEsr
 roCfTFIIjKK7feyW6YgwT5pgex6I5umFsO+uIyI/wbu8nDl/3NYEQBT4fR2cGfli
 GKeJOEu8kf+Zt8U+fbxyVQclbuU5S0Ujsg1fX4QXC4swB5fGLT2cRWJ5qd6hKBQs
 GflBuLa7h4eOzcTtOPpqRIwZ8mQE0uxv/hKq9kYLKHXJN2aWsiOls8KQ2CXj2yAl
 p6bMS5f0H0S/1hvQcQV9EazX7owlPIEet3AmSL1TC2sjJ8hrNGMBoFPtUys1uqjc
 B3CwuGi0JtWIduFYV9vZ/Ze4G7Y2iZlqc5vDxIl94d+iFmoHymDOi3mFUZ3H8XQ=
 =Lk9p
 -----END PGP SIGNATURE-----

Merge remote-tracking branch 'remotes/bonzini/tags/for-upstream' into staging

- vhost-scsi: add bootindex property
- RCU: fix MemoryRegion lifetime issues in PCI; document the rules;
convert of AddressSpaceDispatch and RAMList
- KVM: add kvm_exit reasons for aarch64

# gpg: Signature made Mon Feb 16 16:32:32 2015 GMT using RSA key ID 78C7AE83
# gpg: Good signature from "Paolo Bonzini <bonzini@gnu.org>"
# gpg:                 aka "Paolo Bonzini <pbonzini@redhat.com>"
# gpg: WARNING: This key is not certified with a trusted signature!
# gpg:          There is no indication that the signature belongs to the owner.
# Primary key fingerprint: 46F5 9FBD 57D6 12E7 BFD4  E2F7 7E15 100C CD36 69B1
#      Subkey fingerprint: F133 3857 4B66 2389 866C  7682 BFFB D25F 78C7 AE83

* remotes/bonzini/tags/for-upstream: (21 commits)
  Convert ram_list to RCU
  exec: convert ram_list to QLIST
  cosmetic changes preparing for the following patches
  exec: protect mru_block with RCU
  rcu: add g_free_rcu
  rcu: introduce RCU-enabled QLIST
  exec: RCUify AddressSpaceDispatch
  exec: make iotlb RCU-friendly
  exec: introduce cpu_reload_memory_map
  docs: clarify memory region lifecycle
  pci: split shpc_cleanup and shpc_free
  pcie: remove mmconfig memory leak and wrap mmconfig update with transaction
  memory: keep the owner of the AddressSpace alive until do_address_space_destroy
  rcu: run RCU callbacks under the BQL
  rcu: do not let RCU callbacks pile up indefinitely
  vhost-scsi: set the bootable value of channel/target/lun
  vhost-scsi: add a property for booting
  vhost-scsi: expose the TYPE_FW_PATH_PROVIDER interface
  vhost-scsi: add bootindex property
  qdev: support to get a device firmware path directly
  ...

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2015-02-24 13:58:18 +00:00
Mike Day
0dc3f44aca Convert ram_list to RCU
Allow "unlocked" reads of the ram_list by using an RCU-enabled QLIST.

The ramlist mutex is kept.  call_rcu callbacks are run with the iothread
lock taken, but that may change in the future.  Writers still take the
ramlist mutex, but they no longer need to assume that the iothread lock
is taken.

Readers of the list, instead, no longer require either the iothread
or ramlist mutex, but they need to use rcu_read_lock() and
rcu_read_unlock().

One place in arch_init.c was downgrading from write side to read side
like this:

    qemu_mutex_lock_iothread()
    qemu_mutex_lock_ramlist()
    ...
    qemu_mutex_unlock_iothread()
    ...
    qemu_mutex_unlock_ramlist()

and the equivalent idiom is:

    qemu_mutex_lock_ramlist()
    rcu_read_lock()
    ...
    qemu_mutex_unlock_ramlist()
    ...
    rcu_read_unlock()

Reviewed-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Mike Day <ncmike@ncultra.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2015-02-16 17:31:55 +01:00
Mike Day
0d53d9fe8a exec: convert ram_list to QLIST
QLIST has RCU-friendly primitives, so switch to it.

Reviewed-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Mike Day <ncmike@ncultra.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2015-02-16 17:30:20 +01:00
Mike Day
ae3a7047d0 cosmetic changes preparing for the following patches
Reviewed-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Mike Day <ncmike@ncultra.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2015-02-16 17:30:20 +01:00