Commit Graph

702 Commits

Author SHA1 Message Date
Peter Maydell
33836a7315 TCG patch queue:
Workaround macos assembler lossage.
 Eliminate tb_lock.
 Fix TB code generation overflow.
 -----BEGIN PGP SIGNATURE-----
 
 iQEcBAABAgAGBQJbJBZIAAoJEGTfOOivfiFfy0gH/1brodMhJbTS6/k9+FyXWEy5
 zYjCGKKlMZk//Y+4wcF5tXY/qDRNWk80j6KyxumNp3gCBehx6u59EEsrJRQaxBHm
 nYbDoE3Fy0J4KgRzdGmkYtl89XDK1++Ea9uL9N/stg2MSodzqoV6uudLYr/f+nRj
 4MkS+7BI+aJ4/XIKLU+/+cRo+5FdD0hNEabjlUxTOSrfJbr/YxbnVINX01A4yD6q
 LSzwLAEqpJehFBQjeSLu93ztrapj/1vEaguPOf04F6pXgOLpvSPlPahqwwk4qRwS
 OFgWwSPby3jrNLYZcufx2cY5pG3i4wDGK3z/B35hnDEGwYp1fNt6xdq+EzmHhaM=
 =ibt/
 -----END PGP SIGNATURE-----

Merge remote-tracking branch 'remotes/rth/tags/pull-tcg-20180615' into staging

TCG patch queue:

Workaround macos assembler lossage.
Eliminate tb_lock.
Fix TB code generation overflow.

# gpg: Signature made Fri 15 Jun 2018 20:40:56 BST
# gpg:                using RSA key 64DF38E8AF7E215F
# gpg: Good signature from "Richard Henderson <richard.henderson@linaro.org>"
# Primary key fingerprint: 7A48 1E78 868B 4DB6 A85A  05C0 64DF 38E8 AF7E 215F

* remotes/rth/tags/pull-tcg-20180615:
  tcg: Reduce max TB opcode count
  tcg: remove tb_lock
  translate-all: remove tb_lock mention from cpu_restore_state_from_tb
  cputlb: remove tb_lock from tlb_flush functions
  translate-all: protect TB jumps with a per-destination-TB lock
  translate-all: discard TB when tb_link_page returns an existing matching TB
  translate-all: introduce assert_no_pages_locked
  translate-all: add page_locked assertions
  translate-all: use per-page locking in !user-mode
  translate-all: move tb_invalidate_phys_page_range up in the file
  translate-all: work page-by-page in tb_invalidate_phys_range_1
  translate-all: remove hole in PageDesc
  translate-all: make l1_map lockless
  translate-all: iterate over TBs in a page with PAGE_FOR_EACH_TB
  tcg: move tb_ctx.tb_phys_invalidate_count to tcg_ctx
  tcg: track TBs with per-region BST's
  qht: return existing entry when qht_insert fails
  qht: require a default comparison function
  tcg/i386: Use byte form of xgetbv instruction

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2018-06-21 17:54:26 +01:00
Emilio G. Cota
0ac20318ce tcg: remove tb_lock
Use mmap_lock in user-mode to protect TCG state and the page descriptors.
In !user-mode, each vCPU has its own TCG state, so no locks needed.
Per-page locks are used to protect the page descriptors.

Per-TB locks are used in both modes to protect TB jumps.

Some notes:

- tb_lock is removed from notdirty_mem_write by passing a
  locked page_collection to tb_invalidate_phys_page_fast.

- tcg_tb_lookup/remove/insert/etc have their own internal lock(s),
  so there is no need to further serialize access to them.

- do_tb_flush is run in a safe async context, meaning no other
  vCPU threads are running. Therefore acquiring mmap_lock there
  is just to please tools such as thread sanitizer.

- Not visible in the diff, but tb_invalidate_phys_page already
  has an assert_memory_lock.

- cpu_io_recompile is !user-only, so no mmap_lock there.

- Added mmap_unlock()'s before all siglongjmp's that could
  be called in user-mode while mmap_lock is held.
  + Added an assert for !have_mmap_lock() after returning from
    the longjmp in cpu_exec, just like we do in cpu_exec_step_atomic.

Performance numbers before/after:

Host: AMD Opteron(tm) Processor 6376

                 ubuntu 17.04 ppc64 bootup+shutdown time

  700 +-+--+----+------+------------+-----------+------------*--+-+
      |    +    +      +            +           +           *B    |
      |         before ***B***                            ** *    |
      |tb lock removal ###D###                         ***        |
  600 +-+                                           ***         +-+
      |                                           **         #    |
      |                                        *B*          #D    |
      |                                     *** *         ##      |
  500 +-+                                ***           ###      +-+
      |                             * ***           ###           |
      |                            *B*          # ##              |
      |                          ** *          #D#                |
  400 +-+                      **            ##                 +-+
      |                      **           ###                     |
      |                    **           ##                        |
      |                  **         # ##                          |
  300 +-+  *           B*          #D#                          +-+
      |    B         ***        ###                               |
      |    *       **       ####                                  |
      |     *   ***      ###                                      |
  200 +-+   B  *B     #D#                                       +-+
      |     #B* *   ## #                                          |
      |     #*    ##                                              |
      |    + D##D#     +            +           +            +    |
  100 +-+--+----+------+------------+-----------+------------+--+-+
           1    8      16      Guest CPUs       48           64
  png: https://imgur.com/HwmBHXe

              debian jessie aarch64 bootup+shutdown time

  90 +-+--+-----+-----+------------+------------+------------+--+-+
     |    +     +     +            +            +            +    |
     |         before ***B***                                B    |
  80 +tb lock removal ###D###                              **D  +-+
     |                                                   **###    |
     |                                                 **##       |
  70 +-+                                             ** #       +-+
     |                                             ** ##          |
     |                                           **  #            |
  60 +-+                                       *B  ##           +-+
     |                                       **  ##               |
     |                                    ***  #D                 |
  50 +-+                               ***   ##                 +-+
     |                             * **   ###                     |
     |                           **B*  ###                        |
  40 +-+                     ****  # ##                         +-+
     |                   ****     #D#                             |
     |             ***B**      ###                                |
  30 +-+    B***B**        ####                                 +-+
     |    B *   *     # ###                                       |
     |     B       ###D#                                          |
  20 +-+   D  ##D##                                             +-+
     |      D#                                                    |
     |    +     +     +            +            +            +    |
  10 +-+--+-----+-----+------------+------------+------------+--+-+
          1     8     16      Guest CPUs        48           64
  png: https://imgur.com/iGpGFtv

The gains are high for 4-8 CPUs. Beyond that point, however, unrelated
lock contention significantly hurts scalability.

Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Emilio G. Cota <cota@braap.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2018-06-15 08:18:48 -10:00
Emilio G. Cota
194125e3eb translate-all: protect TB jumps with a per-destination-TB lock
This applies to both user-mode and !user-mode emulation.

Instead of relying on a global lock, protect the list of incoming
jumps with tb->jmp_lock. This lock also protects tb->cflags,
so update all tb->cflags readers outside tb->jmp_lock to use
atomic reads via tb_cflags().

In order to find the destination TB (and therefore its jmp_lock)
from the origin TB, we introduce tb->jmp_dest[].

I considered not using a linked list of jumps, which simplifies
code and makes the struct smaller. However, it unnecessarily increases
memory usage, which results in a performance decrease. See for
instance these numbers booting+shutting down debian-arm:
                      Time (s)  Rel. err (%)  Abs. err (s)  Rel. slowdown (%)
------------------------------------------------------------------------------
 before                  20.88          0.74      0.154512                 0.
 after                   20.81          0.38      0.079078        -0.33524904
 GTree                   21.02          0.28      0.058856         0.67049808
 GHashTable + xxhash     21.63          1.08      0.233604          3.5919540

Using a hash table or a binary tree to keep track of the jumps
doesn't really pay off, not only due to the increased memory usage,
but also because most TBs have only 0 or 1 jumps to them. The maximum
number of jumps when booting debian-arm that I measured is 35, but
as we can see in the histogram below a TB with that many incoming jumps
is extremely rare; the average TB has 0.80 incoming jumps.

n_jumps: 379208; avg jumps/tb: 0.801099
dist: [0.0,1.0)|▄█▁▁▁▁▁▁▁▁▁▁▁ ▁▁▁▁▁▁ ▁▁▁  ▁▁▁     ▁|[34.0,35.0]

Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Emilio G. Cota <cota@braap.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2018-06-15 08:18:48 -10:00
Emilio G. Cota
faa9372c07 translate-all: introduce assert_no_pages_locked
The appended adds assertions to make sure we do not longjmp with page
locks held. Note that user-mode has nothing to check, since page_locks
are !user-mode only.

Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Emilio G. Cota <cota@braap.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2018-06-15 07:42:55 -10:00
Emilio G. Cota
0b5c91f74f translate-all: use per-page locking in !user-mode
Groundwork for supporting parallel TCG generation.

Instead of using a global lock (tb_lock) to protect changes
to pages, use fine-grained, per-page locks in !user-mode.
User-mode stays with mmap_lock.

Sometimes changes need to happen atomically on more than one
page (e.g. when a TB that spans across two pages is
added/invalidated, or when a range of pages is invalidated).
We therefore introduce struct page_collection, which helps
us keep track of a set of pages that have been locked in
the appropriate locking order (i.e. by ascending page index).

This commit first introduces the structs and the function helpers,
to then convert the calling code to use per-page locking. Note
that tb_lock is not removed yet.

While at it, rename tb_alloc_page to tb_page_add, which pairs with
tb_page_remove.

Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Emilio G. Cota <cota@braap.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2018-06-15 07:42:55 -10:00
Emilio G. Cota
1e05197f24 translate-all: iterate over TBs in a page with PAGE_FOR_EACH_TB
This commit does several things, but to avoid churn I merged them all
into the same commit. To wit:

- Use uintptr_t instead of TranslationBlock * for the list of TBs in a page.
  Just like we did in (c37e6d7e "tcg: Use uintptr_t type for
  jmp_list_{next|first} fields of TB"), the rationale is the same: these
  are tagged pointers, not pointers. So use a more appropriate type.

- Only check the least significant bit of the tagged pointers. Masking
  with 3/~3 is unnecessary and confusing.

- Introduce the TB_FOR_EACH_TAGGED macro, and use it to define
  PAGE_FOR_EACH_TB, which improves readability. Note that
  TB_FOR_EACH_TAGGED will gain another user in a subsequent patch.

- Update tb_page_remove to use PAGE_FOR_EACH_TB. In case there
  is a bug and we attempt to remove a TB that is not in the list, instead
  of segfaulting (since the list is NULL-terminated) we will reach
  g_assert_not_reached().

Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Emilio G. Cota <cota@braap.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2018-06-15 07:42:55 -10:00
Emilio G. Cota
128ed2278c tcg: move tb_ctx.tb_phys_invalidate_count to tcg_ctx
Thereby making it per-TCGContext. Once we remove tb_lock, this will
avoid an atomic increment every time a TB is invalidated.

Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Emilio G. Cota <cota@braap.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2018-06-15 07:42:55 -10:00
Emilio G. Cota
be2cdc5e35 tcg: track TBs with per-region BST's
This paves the way for enabling scalable parallel generation of TCG code.

Instead of tracking TBs with a single binary search tree (BST), use a
BST for each TCG region, protecting it with a lock. This is as scalable
as it gets, since each TCG thread operates on a separate region.

The core of this change is the introduction of struct tcg_region_tree,
which contains a pointer to a GTree and an associated lock to serialize
accesses to it. We then allocate an array of tcg_region_tree's, adding
the appropriate padding to avoid false sharing based on
qemu_dcache_linesize.

Given a tc_ptr, we first find the corresponding region_tree. This
is done by special-casing the first and last regions first, since they
might be of size != region.size; otherwise we just divide the offset
by region.stride. I was worried about this division (several dozen
cycles of latency), but profiling shows that this is not a fast path.
Note that region.stride is not required to be a power of two; it
is only required to be a multiple of the host's page size.

Note that with this design we can also provide consistent snapshots
about all region trees at once; for instance, tcg_tb_foreach
acquires/releases all region_tree locks before/after iterating over them.
For this reason we now drop tb_lock in dump_exec_info().

As an alternative I considered implementing a concurrent BST, but this
can be tricky to get right, offers no consistent snapshots of the BST,
and performance and scalability-wise I don't think it could ever beat
having separate GTrees, given that our workload is insert-mostly (all
concurrent BST designs I've seen focus, understandably, on making
lookups fast, which comes at the expense of convoluted, non-wait-free
insertions/removals).

Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Emilio G. Cota <cota@braap.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2018-06-15 07:42:55 -10:00
Peter Maydell
2ef2f16781 Migration pull 2018-06-15
-----BEGIN PGP SIGNATURE-----
 
 iQIcBAABAgAGBQJbI9eNAAoJEAUWMx68W/3nL5cQAIU5FS3dsYplNgk8MUxEeyRE
 KX/v7b/BPua9DwqNPtre37YHO2ibS6k20nb7jLqj0YhDLhC0EJjeHY7g/u6JJr6w
 UGIqFfzPCiwnQ8wLbjYqohliaS72YMN33MukV1KnC0IpIJEzhXPHNG2FJzqoGobb
 MZbjbw+a8X/6uabzuc+7VdbshRXmLIwNPXPRqILvt5vPKc9GTEfRfR5F5eAZouZZ
 PZ+f8vsFeUHtecqQHLkBlGD5LTx00S/Q7qi8LcJf5UKYUhFc4hIxCBxCw8B+nNq5
 D1hlI5aojkBPlXcleuw6DOyHLFkUu1lxlO3ilo5AsVzpHj/5Gs+/FPAlda0hbe7a
 V3mU7Mq77Nl3zU651pUADkwE/QJ5vW52mr/u4MW4Xj8mTOV3DLCDaZEZZVlAiNua
 Afn9vYYIr4afO/MKdBGI9roOL8GXRrlnKHhYSgcjQ5GMoNdT5dzE+KvVS0lsm/qB
 E/F5fmgThDd82abvS2e5crflitOBedW4caewvVs0W/mrJIY0Yqg0rzootdIydlG0
 dvvCRlhuFIk6Ugk3hR0oj33+noRkzhxmnnjjaJzpAZlTORixXm9YNwUwI9LlwynB
 44h/11GDOQY4sYNYagHe17kUYRAZOfCimwJH0fuNiCfb7nYXtcUeKrgY3yy3s/Xl
 MkuHX0Swz0wM12Jt2PpC
 =QgO4
 -----END PGP SIGNATURE-----

Merge remote-tracking branch 'remotes/dgilbert/tags/pull-migration-20180615a' into staging

Migration pull 2018-06-15

# gpg: Signature made Fri 15 Jun 2018 16:13:17 BST
# gpg:                using RSA key 0516331EBC5BFDE7
# gpg: Good signature from "Dr. David Alan Gilbert (RH2) <dgilbert@redhat.com>"
# Primary key fingerprint: 45F5 C71B 4A0C B7FB 977A  9FA9 0516 331E BC5B FDE7

* remotes/dgilbert/tags/pull-migration-20180615a:
  migration: calculate expected_downtime with ram_bytes_remaining()
  migration/postcopy: Wake rate limit sleep on postcopy request
  migration: Wake rate limiting for urgent requests
  migration/postcopy: Add max-postcopy-bandwidth parameter
  migration: introduce migration_update_rates
  migration: fix counting xbzrle cache_miss_rate
  migration/block-dirty-bitmap: fix dirty_bitmap_load
  migration: Poison ramblock loops in migration
  migration: Fixes for non-migratable RAMBlocks
  typedefs: add QJSON

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2018-06-15 18:13:35 +01:00
Peter Maydell
1f871c5e6b exec.c: Handle IOMMUs in address_space_translate_for_iotlb()
Currently we don't support board configurations that put an IOMMU
in the path of the CPU's memory transactions, and instead just
assert() if the memory region fonud in address_space_translate_for_iotlb()
is an IOMMUMemoryRegion.

Remove this limitation by having the function handle IOMMUs.
This is mostly straightforward, but we must make sure we have
a notifier registered for every IOMMU that a transaction has
passed through, so that we can flush the TLB appropriately
when any of the IOMMUs change their mappings.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Message-id: 20180604152941.20374-5-peter.maydell@linaro.org
2018-06-15 15:23:34 +01:00
Peter Maydell
2c91bcf273 iommu: Add IOMMU index argument to translate method
Add an IOMMU index argument to the translate method of
IOMMUs. Since all of our current IOMMU implementations
support only a single IOMMU index, this has no effect
on the behaviour.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Message-id: 20180604152941.20374-4-peter.maydell@linaro.org
2018-06-15 15:23:34 +01:00
Peter Maydell
cb1efcf462 iommu: Add IOMMU index argument to notifier APIs
Add support for multiple IOMMU indexes to the IOMMU notifier APIs.
When initializing a notifier with iommu_notifier_init(), the caller
must pass the IOMMU index that it is interested in. When a change
happens, the IOMMU implementation must pass
memory_region_notify_iommu() the IOMMU index that has changed and
that notifiers must be called for.

IOMMUs which support only a single index don't need to change.
Callers which only really support working with IOMMUs with a single
index can use the result of passing MEMTXATTRS_UNSPECIFIED to
memory_region_iommu_attrs_to_index().

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Message-id: 20180604152941.20374-3-peter.maydell@linaro.org
2018-06-15 15:23:34 +01:00
Peter Maydell
21f402093c iommu: Add IOMMU index concept to IOMMU API
If an IOMMU supports mappings that care about the memory
transaction attributes, then it no longer has a unique
address -> output mapping, but more than one. We can
represent these using an IOMMU index, analogous to TCG's
mmu indexes.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Message-id: 20180604152941.20374-2-peter.maydell@linaro.org
2018-06-15 15:23:34 +01:00
Peter Maydell
afa4f6653d bswap: Add new stn_*_p() and ldn_*_p() memory access functions
There's a common pattern in QEMU where a function needs to perform
a data load or store of an N byte integer in a particular endianness.
At the moment this is handled by doing a switch() on the size and
calling the appropriate ld*_p or st*_p function for each size.

Provide a new family of functions ldn_*_p() and stn_*_p() which
take the size as an argument and do the switch() themselves.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20180611171007.4165-2-peter.maydell@linaro.org
2018-06-15 15:23:34 +01:00
Peter Maydell
2d54f19401 cputlb: Pass cpu_transaction_failed() the correct physaddr
The API for cpu_transaction_failed() says that it takes the physical
address for the failed transaction. However we were actually passing
it the offset within the target MemoryRegion. We don't currently
have any target CPU implementations of this hook that require the
physical address; fix this bug so we don't get confused if we ever
do add one.

Suggested-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20180611125633.32755-3-peter.maydell@linaro.org
2018-06-15 15:23:34 +01:00
Peter Maydell
ace4109011 cpu-defs.h: Document CPUIOTLBEntry 'addr' field
The 'addr' field in the CPUIOTLBEntry struct has a rather non-obvious
use; add a comment documenting it (reverse-engineered from what
the code that sets it is doing).

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20180611125633.32755-2-peter.maydell@linaro.org
2018-06-15 15:23:34 +01:00
Dr. David Alan Gilbert
343f632c70 migration: Poison ramblock loops in migration
The migration code should be using the
  RAMBLOCK_FOREACH_MIGRATABLE and qemu_ram_foreach_block_migratable
not the all-block versions;  poison them so that we can't accidentally
use them.

Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Message-Id: <20180605162545.80778-3-dgilbert@redhat.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Cédric Le Goater <clg@kaod.org>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
2018-06-15 14:40:56 +01:00
Peter Maydell
b74588a493 migration/next for 20180604
-----BEGIN PGP SIGNATURE-----
 
 iQIcBAABCAAGBQJbFLygAAoJEPSH7xhYctcjszUP/j7/21sHXwPyB5y4CQHsivLL
 18MnXTkjgSlue3fuPKI9ZzD2VZq19AtwarLuW3IFf15adKfeV0hhZsIMUmu0XTAz
 jgol8YYgoYlk3/on8Ux8PWV9RWKHh7vWAF7hjwJOnI5CFF7gXz6kVjUx03A7RSnB
 1a92lvgl5P8AbK5HZ/i08nOaAxCUWIVIoQeZf3BlZ5WdJE0j+d0Mvp/S8NEq5VPD
 P6oYJZy64QFGotmV1vtYD4G/Hqe8XBw2LoSVgjPSmQpd1m4CXiq0iyCxtC8HT+iq
 0P0eHATRglc5oGhRV6Mi5ixH2K2MvpshdyMKVAbkJchKBHn4k3m6jefhnUXuucwC
 fq9x7VIYzQ80S+44wJri683yUWP3Qmbb6mWYBb1L0jgbTkY1wx431IYMtDVv5q4R
 oXtbe77G6lWKHH0SFrU8TY/fqJvBwpat+QwOoNsluHVjzAzpOpcVokB8pLj592/1
 bHkDwvv7i+B37muOF3PFy++BoliwpRiNbVC5Btw4mnbOoY+AeJHMRBuhJHU6EyJv
 GOFiQ5w5XEXEu98giNqHcc0AC8w7iTRGI472UzpS0AG20wpc1/2jF0yM99stauwD
 oIu3MU4dQlz6QRy8KFnJShqjYNYuOs59USLiYM79ABa9J+JxJxxTAaqRbUFojQR6
 KN2i0QQJ+6MO4Skjbu/3
 =K0eJ
 -----END PGP SIGNATURE-----

Merge remote-tracking branch 'remotes/juanquintela/tags/migration/20180604' into staging

migration/next for 20180604

# gpg: Signature made Mon 04 Jun 2018 05:14:24 BST
# gpg:                using RSA key F487EF185872D723
# gpg: Good signature from "Juan Quintela <quintela@redhat.com>"
# gpg:                 aka "Juan Quintela <quintela@trasno.org>"
# Primary key fingerprint: 1899 FF8E DEBF 58CC EE03  4B82 F487 EF18 5872 D723

* remotes/juanquintela/tags/migration/20180604:
  migration: not wait RDMA_CM_EVENT_DISCONNECTED event after rdma_disconnect
  migration: remove unnecessary variables len in QIOChannelRDMA
  migration: Don't activate block devices if using -S
  migration: discard non-migratable RAMBlocks
  migration: introduce decompress-error-check

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2018-06-04 12:54:00 +01:00
Peter Maydell
163670542f tcg-next queue
-----BEGIN PGP SIGNATURE-----
 
 iQEcBAABAgAGBQJbEdLqAAoJEGTfOOivfiFfQaEH/Rq96S5bo94495KmRJY9e/jw
 lV321YYI7nx7sHtViG/B3iTkvnxzZPWcc7XbBMxyV5xmMQ/5zjS/ynZPFyy/cYRn
 zLM4W0SJ38EqhHTZpkkvw9Nle8UbNWKm5PgND2TyE4hmeuQ98OrQ6Y1GvP4MFpXs
 uQErbmMjYHMq7thbfCO6ulJjjEliRy3AJ2C3fCCCUgBQrJt6JeqbGr/Zzi2y88M9
 IhoK8RbJiWT2O5Tl95q2NOQvr11WbFlu/K0nuaVgbfTwd2tp3ygmRKPpeZ24qA52
 qtwgcIjWHHkkC5s1qaP8oW4FtoMQZdsaOwSOPw0ZBnG+VA7P/h33fWr9f5SistA=
 =UVdE
 -----END PGP SIGNATURE-----

Merge remote-tracking branch 'remotes/rth/tags/tcg-next-pull-request' into staging

tcg-next queue

# gpg: Signature made Sat 02 Jun 2018 00:12:42 BST
# gpg:                using RSA key 64DF38E8AF7E215F
# gpg: Good signature from "Richard Henderson <richard.henderson@linaro.org>"
# Primary key fingerprint: 7A48 1E78 868B 4DB6 A85A  05C0 64DF 38E8 AF7E 215F

* remotes/rth/tags/tcg-next-pull-request:
  tcg: Pass tb and index to tcg_gen_exit_tb separately

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2018-06-04 11:28:31 +01:00
Cédric Le Goater
b895de5027 migration: discard non-migratable RAMBlocks
On the POWER9 processor, the XIVE interrupt controller can control
interrupt sources using MMIO to trigger events, to EOI or to turn off
the sources. Priority management and interrupt acknowledgment is also
controlled by MMIO in the presenter sub-engine.

These MMIO regions are exposed to guests in QEMU with a set of 'ram
device' memory mappings, similarly to VFIO, and the VMAs are populated
dynamically with the appropriate pages using a fault handler.

But, these regions are an issue for migration. We need to discard the
associated RAMBlocks from the RAM state on the source VM and let the
destination VM rebuild the memory mappings on the new host in the
post_load() operation just before resuming the system.

To achieve this goal, the following introduces a new RAMBlock flag
RAM_MIGRATABLE which is updated in the vmstate_register_ram() and
vmstate_unregister_ram() routines. This flag is then used by the
migration to identify RAMBlocks to discard on the source. Some checks
are also performed on the destination to make sure nothing invalid was
sent.

This change impacts the boston, malta and jazz mips boards for which
migration compatibility is broken.

Signed-off-by: Cédric Le Goater <clg@kaod.org>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
2018-06-04 05:46:15 +02:00
Richard Henderson
07ea28b418 tcg: Pass tb and index to tcg_gen_exit_tb separately
Do the cast to uintptr_t within the helper, so that the compiler
can type check the pointer argument.  We can also do some more
sanity checking of the index argument.

Reviewed-by: Laurent Vivier <laurent@vivier.eu>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2018-06-01 15:15:27 -07:00
Peter Maydell
afd76ffba9 * Linux header upgrade (Peter)
* firmware.json definition (Laszlo)
 * IPMI migration fix (Corey)
 * QOM improvements (Alexey, Philippe, me)
 * Memory API cleanups (Jay, me, Tristan, Peter)
 * WHPX fixes and improvements (Lucian)
 * Chardev fixes (Marc-André)
 * IOMMU documentation improvements (Peter)
 * Coverity fixes (Peter, Philippe)
 * Include cleanup (Philippe)
 * -clock deprecation (Thomas)
 * Disable -sandbox unless CONFIG_SECCOMP (Yi Min Zhao)
 * Configurability improvements (me)
 -----BEGIN PGP SIGNATURE-----
 
 iQFIBAABCAAyFiEE8TM4V0tmI4mGbHaCv/vSX3jHroMFAlsRd2UUHHBib256aW5p
 QHJlZGhhdC5jb20ACgkQv/vSX3jHroPG8Qf+M85E8xAQ/bhs90tAymuXkUUsTIFF
 uI76K8eM0K3b2B+vGckxh1gyN5O3GQaMEDL7vITfqbX+EOH5U2lv8V9JRzf2YvbG
 Zahjd4pOCYzR0b9JENA1r5U/J8RntNrBNXlKmGTaXOaw9VCXlZyvgVd9CE3z/e2M
 0jSXMBdF4LB3UzECI24Va8ejJxdSiJcqXA2j3J+pJFxI698i+Z5eBBKnRdo5TVe5
 jl0TYEsbS6CLwhmbLXmt3Qhq+ocZn7YH9X3HjkHEdqDUeYWyT9jwUpa7OHFrIEKC
 ikWm9er4YDzG/vOC0dqwKbShFzuTpTJuMz5Mj4v8JjM/iQQFrp4afjcW2g==
 =RS/B
 -----END PGP SIGNATURE-----

Merge remote-tracking branch 'remotes/bonzini/tags/for-upstream' into staging

* Linux header upgrade (Peter)
* firmware.json definition (Laszlo)
* IPMI migration fix (Corey)
* QOM improvements (Alexey, Philippe, me)
* Memory API cleanups (Jay, me, Tristan, Peter)
* WHPX fixes and improvements (Lucian)
* Chardev fixes (Marc-André)
* IOMMU documentation improvements (Peter)
* Coverity fixes (Peter, Philippe)
* Include cleanup (Philippe)
* -clock deprecation (Thomas)
* Disable -sandbox unless CONFIG_SECCOMP (Yi Min Zhao)
* Configurability improvements (me)

# gpg: Signature made Fri 01 Jun 2018 17:42:13 BST
# gpg:                using RSA key BFFBD25F78C7AE83
# gpg: Good signature from "Paolo Bonzini <bonzini@gnu.org>"
# gpg:                 aka "Paolo Bonzini <pbonzini@redhat.com>"
# Primary key fingerprint: 46F5 9FBD 57D6 12E7 BFD4  E2F7 7E15 100C CD36 69B1
#      Subkey fingerprint: F133 3857 4B66 2389 866C  7682 BFFB D25F 78C7 AE83

* remotes/bonzini/tags/for-upstream: (56 commits)
  hw: make virtio devices configurable via default-configs/
  hw: allow compiling out SCSI
  memory: Make operations using MemoryRegionIoeventfd struct pass by pointer.
  char: Remove unwanted crlf conversion
  qdev: Remove DeviceClass::init() and ::exit()
  qdev: Simplify the SysBusDeviceClass::init path
  hw/i2c: Use DeviceClass::realize instead of I2CSlaveClass::init
  hw/i2c/smbus: Use DeviceClass::realize instead of SMBusDeviceClass::init
  target/i386/kvm.c: Remove compatibility shim for KVM_HINTS_REALTIME
  Update Linux headers to 4.17-rc6
  target/i386/kvm.c: Handle renaming of KVM_HINTS_DEDICATED
  scripts/update-linux-headers: Handle kernel license no longer being one file
  scripts/update-linux-headers: Handle __aligned_u64
  virtio-gpu-3d: Define VIRTIO_GPU_CAPSET_VIRGL2 elsewhere
  gdbstub: Prevent fd leakage
  docs/interop: add "firmware.json"
  ipmi: Use proper struct reference for KCS vmstate
  vmstate: Add a VSTRUCT type
  tcg: remove softfloat from --disable-tcg builds
  qemu-options: Mark the non-functional -clock option as deprecated
  ...

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2018-06-01 18:24:16 +01:00
Paolo Bonzini
257a7430e7 memory: get rid of memory_region_init_reservation
The function has been deprecated for 2.5 years, and there are just a handful
of users.  Convert them to memory_region_init_io with NULL callbacks,
and while at it pass the right device as the owner.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2018-06-01 14:15:10 +02:00
Peter Maydell
0330002cb5 memory.h: Fix typo in documentation comment
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Message-Id: <20180515134835.3409-1-peter.maydell@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2018-06-01 14:15:10 +02:00
Peter Maydell
7446eb07c1 Make address_space_get_iotlb_entry() take a MemTxAttrs argument
As part of plumbing MemTxAttrs down to the IOMMU translate method,
add MemTxAttrs as an argument to address_space_get_iotlb_entry().

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20180521140402.23318-12-peter.maydell@linaro.org
2018-05-31 16:32:35 +01:00
Peter Maydell
efa99a2ff8 Make flatview_translate() take a MemTxAttrs argument
As part of plumbing MemTxAttrs down to the IOMMU translate method,
add MemTxAttrs as an argument to flatview_translate(); all its
callers now have attrs available.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20180521140402.23318-11-peter.maydell@linaro.org
2018-05-31 16:32:35 +01:00
Peter Maydell
8372d38327 Make MemoryRegion valid.accepts callback take a MemTxAttrs argument
As part of plumbing MemTxAttrs down to the IOMMU translate method,
add MemTxAttrs as an argument to the MemoryRegion valid.accepts
callback. We'll need this for subpage_accepts().

We could take the approach we used with the read and write
callbacks and add new a new _with_attrs version, but since there
are so few implementations of the accepts hook we just change
them all.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20180521140402.23318-9-peter.maydell@linaro.org
2018-05-31 16:32:35 +01:00
Peter Maydell
6d7b9a6c3b Make memory_region_access_valid() take a MemTxAttrs argument
As part of plumbing MemTxAttrs down to the IOMMU translate method,
add MemTxAttrs as an argument to memory_region_access_valid().
Its callers either have an attrs value to hand, or don't care
and can use MEMTXATTRS_UNSPECIFIED.

The callsite in flatview_access_valid() is part of a recursive
loop flatview_access_valid() -> memory_region_access_valid() ->
 subpage_accepts() -> flatview_access_valid(); we make it pass
MEMTXATTRS_UNSPECIFIED for now, until the next several commits
have plumbed an attrs parameter through the rest of the loop
and we can add an attrs parameter to flatview_access_valid().

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20180521140402.23318-8-peter.maydell@linaro.org
2018-05-31 16:32:35 +01:00
Peter Maydell
fddffa4268 Make address_space_access_valid() take a MemTxAttrs argument
As part of plumbing MemTxAttrs down to the IOMMU translate method,
add MemTxAttrs as an argument to address_space_access_valid().
Its callers either have an attrs value to hand, or don't care
and can use MEMTXATTRS_UNSPECIFIED.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20180521140402.23318-6-peter.maydell@linaro.org
2018-05-31 16:32:35 +01:00
Peter Maydell
f26404fbee Make address_space_map() take a MemTxAttrs argument
As part of plumbing MemTxAttrs down to the IOMMU translate method,
add MemTxAttrs as an argument to address_space_map().
Its callers either have an attrs value to hand, or don't care
and can use MEMTXATTRS_UNSPECIFIED.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20180521140402.23318-5-peter.maydell@linaro.org
2018-05-31 16:32:35 +01:00
Peter Maydell
bc6b1cec84 Make address_space_translate{, _cached}() take a MemTxAttrs argument
As part of plumbing MemTxAttrs down to the IOMMU translate method,
add MemTxAttrs as an argument to address_space_translate()
and address_space_translate_cached(). Callers either have an
attrs value to hand, or don't care and can use MEMTXATTRS_UNSPECIFIED.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20180521140402.23318-4-peter.maydell@linaro.org
2018-05-31 14:50:52 +01:00
Peter Maydell
c874dc4f5e Make tb_invalidate_phys_addr() take a MemTxAttrs argument
As part of plumbing MemTxAttrs down to the IOMMU translate method,
add MemTxAttrs as an argument to tb_invalidate_phys_addr().
Its callers either have an attrs value to hand, or don't care
and can use MEMTXATTRS_UNSPECIFIED.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Message-id: 20180521140402.23318-3-peter.maydell@linaro.org
2018-05-31 14:50:52 +01:00
Peter Maydell
2ce931d012 memory.h: Improve IOMMU related documentation
Add more detail to the documentation for memory_region_init_iommu()
and other IOMMU-related functions and data structures.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Message-id: 20180521140402.23318-2-peter.maydell@linaro.org
2018-05-31 14:50:52 +01:00
Richard Henderson
6c2be133a7 tcg: Fix helper function vs host abi for float16
Depending on the host abi, float16, aka uint16_t, values are
passed and returned either zero-extended in the host register
or with garbage at the top of the host register.

The tcg code generator has so far been assuming garbage, as that
matches the x86 abi, but this is incorrect for other host abis.
Further, target/arm has so far been assuming zero-extended results,
so that it may store the 16-bit value into a 32-bit slot with the
high 16-bits already clear.

Rectify both problems by mapping "f16" in the helper definition
to uint32_t instead of (a typedef for) uint16_t.  This forces
the host compiler to assume garbage in the upper 16 bits on input
and to zero-extend the result on output.

Cc: qemu-stable@nongnu.org
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Tested-by: Laurent Desnogues <laurent.desnogues@gmail.com>
Message-id: 20180522175629.24932-1-richard.henderson@linaro.org
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2018-05-31 14:50:51 +01:00
Peter Maydell
4f71086665 gdbstub: Clarify what gdb_handlesig() is doing
gdb_handlesig()'s behaviour is not entirely obvious at first
glance. Add a doc comment for it, and also add a comment
explaining why it's ok for gdb_do_syscallv() to ignore
gdb_handlesig()'s return value. (Coverity complains about
this: CID 1390850.)

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Message-Id: <20180515181958.25837-1-peter.maydell@linaro.org>
Signed-off-by: Laurent Vivier <laurent@vivier.eu>
2018-05-25 10:10:55 +02:00
Peter Maydell
75578d6fce linux-user: Assert on bad type in thunk_type_align() and thunk_type_size()
In thunk_type_align() and thunk_type_size() we currently return
-1 if the value at the type_ptr isn't one of the TYPE_* values
we understand. However, this should never happen, and if it does
then the calling code will go confusingly wrong because none
of the callsites try to handle an error return. Switch to an
assertion instead, so that if this does somehow happen we'll have
a nice clear backtrace of what happened rather than a weird crash
or misbehaviour.

This also silences various Coverity complaints about not handling
the negative return value (CID 1005735, 1005736, 1005738, 1390582).

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Laurent Vivier <laurent@vivier.eu>
Message-Id: <20180514174616.19601-1-peter.maydell@linaro.org>
Signed-off-by: Laurent Vivier <laurent@vivier.eu>
2018-05-24 20:46:54 +02:00
Peter Maydell
f39ddb3a08 -----BEGIN PGP SIGNATURE-----
iQIcBAABAgAGBQJa+dImAAoJEPMMOL0/L748umMP/R9miqzm3/Pj1gssD/yA5lZE
 7rx/d14Sefo2T/L1kmg6vbbyOJyLnHf9DtTI/JCgOnC9otOQ1IKfeI4c0coJjI5R
 G0yNHswLB+VMIJ830ivF0QE4Z8f6K1UbBD9y+iQfpFk55rrT+dt9U3YJBrCY7bQ8
 Sk+2dFpk+oVh9oz3GPHRkWua/KGMDrqCjAOlkVCXyNZuJ8yp69wHWI2j81nFQv3v
 1AnXOm6bbjy9zwRuRUJ6yco2vu8uXyor/usA465C66A5XZ7xcrgKIZQge3hV31fo
 4BmnxEviMKM5dM013e2nk0Hh5YwG/Vv+t8YYgco79Mcy9QJIEO4J1JXedUYZ3FJ/
 JhV5nTcYNi1hQMkSxzXWoxyOMElwL8DaPQIZS7c++pOSXCS9oD/baBuPzC/dWBTH
 VThuCYd1EsBe8ZFkgph/oUMYZQHcS2/paGE1RuHlLXVOqd1k9v/d27yxngo1/wn7
 +zesaWkp9aQOC6pij23cgKAq8S1X8KeaOM0UK+KmMNSr1h9nQY146D3/SOqUfAfS
 2DzW7PBnUHFzi6uXE99ZiUSSWWIyIjJEgAOnqnZCrl5LbPCKnjp4/BYupxoUwAWF
 Nr9Gk34R2FhoqqJtplzYWxXfyyCFujQMXHUe/yYx5ITG18totKb+09GbeeSep8Is
 myxrDirmbaqCnj8zfd+M
 =yjsX
 -----END PGP SIGNATURE-----

Merge remote-tracking branch 'remotes/vivier2/tags/linux-user-for-2.13-pull-request' into staging

# gpg: Signature made Mon 14 May 2018 19:15:02 BST
# gpg:                using RSA key F30C38BD3F2FBE3C
# gpg: Good signature from "Laurent Vivier <lvivier@redhat.com>"
# gpg:                 aka "Laurent Vivier <laurent@vivier.eu>"
# gpg:                 aka "Laurent Vivier (Red Hat) <lvivier@redhat.com>"
# Primary key fingerprint: CD2F 75DD C8E3 A4DC 2E4F  5173 F30C 38BD 3F2F BE3C

* remotes/vivier2/tags/linux-user-for-2.13-pull-request:
  linux-user: correctly align types in thunking code
  linux-user: fix UNAME_MACHINE for sparc/sparc64
  linux-user: add sparc/sparc64 specific errno
  linux-user: fix conversion of flock/flock64 l_type field
  linux-user: update sparc/syscall_nr.h to linux header 4.16
  linux-user: fix flock/flock64 padding
  linux-user: define correct fcntl() values for sparc

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2018-05-15 10:04:22 +01:00
Laurent Vivier
f606e4d625 linux-user: correctly align types in thunking code
This is a follow up
of patch:

        commit c2e3dee6e0
        Author: Laurent Vivier <laurent@vivier.eu>
        Date:   Sun Feb 13 23:37:34 2011 +0100

            linux-user: Define target alignment size

In my case m68k aligns "int" on 2 not 4. You can check this with the
following program:

int main(void)
{
        struct rtentry rt;
        printf("rt_pad1 %ld %zd\n", offsetof(struct rtentry, rt_pad1),
                sizeof(rt.rt_pad1));
        printf("rt_dst %ld %zd\n", offsetof(struct rtentry, rt_dst),
                sizeof(rt.rt_dst));
        printf("rt_gateway %ld %zd\n", offsetof(struct rtentry, rt_gateway),
                sizeof(rt.rt_gateway));
        printf("rt_genmask %ld %zd\n", offsetof(struct rtentry, rt_genmask),
                sizeof(rt.rt_genmask));
        printf("rt_flags %ld %zd\n", offsetof(struct rtentry, rt_flags),
                sizeof(rt.rt_flags));
        printf("rt_pad2 %ld %zd\n", offsetof(struct rtentry, rt_pad2),
                sizeof(rt.rt_pad2));
        printf("rt_pad3 %ld %zd\n", offsetof(struct rtentry, rt_pad3),
                sizeof(rt.rt_pad3));
        printf("rt_pad4 %ld %zd\n", offsetof(struct rtentry, rt_pad4),
                sizeof(rt.rt_pad4));
        printf("rt_metric %ld %zd\n", offsetof(struct rtentry, rt_metric),
                sizeof(rt.rt_metric));
        printf("rt_dev %ld %zd\n", offsetof(struct rtentry, rt_dev),
                sizeof(rt.rt_dev));
        printf("rt_mtu %ld %zd\n", offsetof(struct rtentry, rt_mtu),
                sizeof(rt.rt_mtu));
        printf("rt_window %ld %zd\n", offsetof(struct rtentry, rt_window),
                sizeof(rt.rt_window));
        printf("rt_irtt %ld %zd\n", offsetof(struct rtentry, rt_irtt),
                sizeof(rt.rt_irtt));
}

And result is :

i386

rt_pad1 0 4
rt_dst 4 16
rt_gateway 20 16
rt_genmask 36 16
rt_flags 52 2
rt_pad2 54 2
rt_pad3 56 4
rt_pad4 62 2
rt_metric 64 2
rt_dev 68 4
rt_mtu 72 4
rt_window 76 4
rt_irtt 80 2

m68k

rt_pad1 0 4
rt_dst 4 16
rt_gateway 20 16
rt_genmask 36 16
rt_flags 52 2
rt_pad2 54 2
rt_pad3 56 4
rt_pad4 62 2
rt_metric 64 2
rt_dev 66 4
rt_mtu 70 4
rt_window 74 4
rt_irtt 78 2

This affects the "route" command :

WITHOUT this patch:

$ sudo route add -net default gw 10.0.3.1 window 1024 irtt 2 eth0
$ netstat -nr
Kernel IP routing table
Destination     Gateway         Genmask         Flags   MSS Window  irtt Iface
0.0.0.0         10.0.3.1        0.0.0.0         UG        0 67108866  32768 eth0
10.0.3.0        0.0.0.0         255.255.255.0   U         0 0          0 eth0

WITH this patch:

$ sudo route add -net default gw 10.0.3.1 window 1024 irtt 2 eth0
$ netstat -nr
Kernel IP routing table
Destination     Gateway         Genmask         Flags   MSS Window  irtt Iface
0.0.0.0         10.0.3.1        0.0.0.0         UG        0 1024       2 eth0
10.0.3.0        0.0.0.0         255.255.255.0   U         0 0          0 eth0

Signed-off-by: Laurent Vivier <laurent@vivier.eu>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Message-Id: <20180510205949.26455-1-laurent@vivier.eu>
2018-05-14 12:01:21 +02:00
Peter Maydell
9ba1733a76 * Don't silently truncate extremely long words in the command line
* dtc configure fixes
 * MemoryRegionCache second try
 * Deprecated option removal
 * add support for Hyper-V reenlightenment MSRs
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2.0.22 (GNU/Linux)
 
 iQEcBAABAgAGBQJa9Y2qAAoJEL/70l94x66Df8EIAI4pi+zf1mTlH0Koi+oqOg+d
 geBC6N9IA+n1p90XERnPbuiT19NjON2R1Z907SbzDkijxdNRoYUoQf7Z+ZBTENjn
 dYsVvgLYzajGLWWtJetPPaNFAqeF2z8B3lbVQnGVLzH5pQQ2NS1NJsvXQA2LslLs
 2ll1CJ2EEBhayoBSbHK+0cY85f+DUgK/T1imIV2T/rwcef9Rw218nvPfGhPBSoL6
 tI2xIOxz8bBOvZNg2wdxpaoPuDipBFu6koVVbaGSgXORg8k5CEcKNxInztufdELW
 KZK5ORa3T0uqu5T/GGPAfm/NbYVQ4aTB5mddshsXtKbBhnbSfRYvpVsR4kQB/Hc=
 =oC1r
 -----END PGP SIGNATURE-----

Merge remote-tracking branch 'remotes/bonzini/tags/for-upstream' into staging

* Don't silently truncate extremely long words in the command line
* dtc configure fixes
* MemoryRegionCache second try
* Deprecated option removal
* add support for Hyper-V reenlightenment MSRs

# gpg: Signature made Fri 11 May 2018 13:33:46 BST
# gpg:                using RSA key BFFBD25F78C7AE83
# gpg: Good signature from "Paolo Bonzini <bonzini@gnu.org>"
# gpg:                 aka "Paolo Bonzini <pbonzini@redhat.com>"
# Primary key fingerprint: 46F5 9FBD 57D6 12E7 BFD4  E2F7 7E15 100C CD36 69B1
#      Subkey fingerprint: F133 3857 4B66 2389 866C  7682 BFFB D25F 78C7 AE83

* remotes/bonzini/tags/for-upstream: (29 commits)
  rename included C files to foo.inc.c, remove osdep.h
  pc-dimm: fix error messages if no slots were defined
  build: Silence dtc directory creation
  shippable: Remove Debian 8 libfdt kludge
  configure: Display if libfdt is from system or git
  configure: Really use local libfdt if the system one is too old
  i386/kvm: add support for Hyper-V reenlightenment MSRs
  qemu-doc: provide details of supported build platforms
  qemu-options: Remove deprecated -no-kvm-irqchip
  qemu-options: Remove deprecated -no-kvm-pit-reinjection
  qemu-options: Bail out on unsupported options instead of silently ignoring them
  qemu-options: Remove remainders of the -tdf option
  qemu-options: Mark -virtioconsole as deprecated
  target/i386: sev: fix memory leaks
  opts: don't silently truncate long option values
  opts: don't silently truncate long parameter keys
  accel: use g_strsplit for parsing accelerator names
  update-linux-headers: drop hyperv.h
  qemu-thread: always keep the posix wrapper layer
  exec: reintroduce MemoryRegion caching
  ...

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2018-05-14 09:55:09 +01:00
Emilio G. Cota
b542683d77 translator: merge max_insns into DisasContextBase
While at it, use int for both num_insns and max_insns to make
sure we have same-type comparisons.

Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Michael Clark <mjc@sifive.com>
Signed-off-by: Emilio G. Cota <cota@braap.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2018-05-09 10:12:21 -07:00
Paolo Bonzini
48564041a7 exec: reintroduce MemoryRegion caching
MemoryRegionCache was reverted to "normal" address_space_* operations
for 2.9, due to lack of support for IOMMUs.  Reinstate the
optimizations, caching only the IOMMU translation at address_cache_init
but not the IOMMU lookup and target AddressSpace translation are not
cached; now that MemoryRegionCache supports IOMMUs, it becomes more widely
applicable too.

The inlined fast path is defined in memory_ldst_cached.inc.h, while the
slow path uses memory_ldst.inc.c as before.  The smaller fast path causes
a little code size reduction in MemoryRegionCache users:

    hw/virtio/virtio.o text size before: 32373
    hw/virtio/virtio.o text size after: 31941

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2018-05-09 00:13:38 +02:00
Paolo Bonzini
4269c82bf7 exec: move memory access declarations to a common header, inline *_phys functions
For now, this reduces the text size very slightly due to the newly-added
inlining:

   text size before: 9301965
   text size after: 9300645

Later, however, the declarations in include/exec/memory_ldst.inc.h will be
reused for the MemoryRegionCache slow path functions.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2018-05-09 00:13:38 +02:00
Laurent Vivier
7f254c5cb8 linux-user: remove useless padding in flock64 structure
Since commit 8efb2ed5ec ("linux-user: Correct signedness of
target_flock l_start and l_len fields"), flock64 structure uses
abi_llong for l_start and l_len in place of "unsigned long long"
this should force them to be aligned accordingly to the target
rules. So we can remove the padding field and the QEMU_PACKED
attribute.

I have compared the result of the following program before and
after the change:

    cat -> flock64_dump  <<EOF
    p/d sizeof(struct target_flock64)
    p/d &((struct target_flock64 *)0)->l_type
    p/d &((struct target_flock64 *)0)->l_whence
    p/d &((struct target_flock64 *)0)->l_start
    p/d &((struct target_flock64 *)0)->l_len
    p/d &((struct target_flock64 *)0)->l_pid
    quit
    EOF

    for file in build/all/*-linux-user/qemu-* ; do
    echo $file
    gdb -batch -nx -x flock64_dump $file 2> /dev/null
    done

The sizeof() changes because we remove the QEMU_PACKED.
The new size is 32 (except for i386 and m68k) and this is
the real size of "struct flock64" on the target architecture.

The following architectures differ:
aarch64_be, aarch64, alpha, armeb, arm, cris, hppa, nios2, or1k,
riscv32, riscv64, s390x.

For a subset of these architectures, I have checked with the following
program the new structure is the correct one:

  #include <stdio.h>
  #define __USE_LARGEFILE64
  #include <fcntl.h>

  int main(void)
  {
	  printf("struct flock64 %d\n", sizeof(struct flock64));
	  printf("l_type %d\n", &((struct flock64 *)0)->l_type);
	  printf("l_whence %d\n", &((struct flock64 *)0)->l_whence);
	  printf("l_start %d\n", &((struct flock64 *)0)->l_start);
	  printf("l_len %d\n", &((struct flock64 *)0)->l_len);
	  printf("l_pid %d\n", &((struct flock64 *)0)->l_pid);
  }

[I have checked aarch64, alpha, hppa, s390x]

For ARM, the target_flock64 becomes the EABI definition, so we need to
define the OABI one in place of the EABI one and use it when it is
needed.

I have also fixed the alignment value for sh4 (to align llong on 4 bytes)
(see c2e3dee6e0 "linux-user: Define target alignment size")
[We should check alignment properties for cris, nios2 and or1k]

Signed-off-by: Laurent Vivier <laurent@vivier.eu>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20180502215730.28162-1-laurent@vivier.eu>
2018-05-03 18:40:19 +02:00
Pavel Dovgalyuk
afd46fcad2 icount: fix cpu_restore_state_from_tb for non-tb-exit cases
In icount mode, instructions that access io memory spaces in the middle
of the translation block invoke TB recompilation.  After recompilation,
such instructions become last in the TB and are allowed to access io
memory spaces.

When the code includes instruction like i386 'xchg eax, 0xffffd080'
which accesses APIC, QEMU goes into an infinite loop of the recompilation.

This instruction includes two memory accesses - one read and one write.
After the first access, APIC calls cpu_report_tpr_access, which restores
the CPU state to get the current eip.  But cpu_restore_state_from_tb
resets the cpu->can_do_io flag which makes the second memory access invalid.
Therefore the second memory access causes a recompilation of the block.
Then these operations repeat again and again.

This patch moves resetting cpu->can_do_io flag from
cpu_restore_state_from_tb to cpu_loop_exit* functions.

It also adds a parameter for cpu_restore_state which controls restoring
icount.  There is no need to restore icount when we only query CPU state
without breaking the TB.  Restoring it in such cases leads to the
incorrect flow of the virtual time.

In most cases new parameter is true (icount should be recalculated).
But there are two cases in i386 and openrisc when the CPU state is only
queried without the need to break the TB.  This patch fixes both of
these cases.

Signed-off-by: Pavel Dovgalyuk <Pavel.Dovgaluk@ispras.ru>
Message-Id: <20180409091320.12504.35329.stgit@pasha-VirtualBox>
[rth: Make can_do_io setting unconditional; move from cpu_exec;
make cpu_loop_exit_{noexc,restore} call cpu_loop_exit.]
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2018-04-11 09:05:22 +10:00
KONRAD Frederic
1bb982b8fc gdbstub: send a termination packet instead of crashing gdb
Since the commit:
commit 4486e89c21
Author: Stefan Hajnoczi <stefanha@redhat.com>
Date:   Wed Mar 7 14:42:05 2018 +0000

    vl: introduce vm_shutdown()

GDB crashes when qemu exits (at least on sparc-softmmu):
Remote communication error.  Target disconnected.: Connection reset by peer.
Quitting: putpkt: write failed: Broken pipe.

So send a packet to exit GDB before we exit QEMU:
[Inferior 1 (Thread 0) exited normally]

Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: KONRAD Frederic <frederic.konrad@adacore.com>
Message-id: 1521538773-30802-1-git-send-email-frederic.konrad@adacore.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2018-03-27 21:16:27 +01:00
Peter Maydell
ed627b2ad3 virtio,vhost,pci,pc: features, cleanups
SRAT tables for DIMM devices
 new virtio net flags for speed/duplex
 post-copy migration support in vhost
 cleanups in pci
 
 Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
 -----BEGIN PGP SIGNATURE-----
 
 iQEcBAABAgAGBQJasR1rAAoJECgfDbjSjVRpOocH/R9A3g/TkpGjmLzJBrrX1NGO
 I/iq0ttHjqg4OBIChA4BHHjXwYUMs7XQn26B3efrk1otLAJhuqntZIIo3uU0WraA
 5J+4DT46ogs5rZWNzDCZ0zAkSaATDA6h9Nfh7TvPc9Q2WpcIT0cTa/jOtrxRc9Vq
 32hbUKtJSpNxRjwbZvk6YV21HtWo3Tktdaj9IeTQTN0/gfMyOMdgxta3+bymicbJ
 FuF9ybHcpXvrEctHhXHIL4/YVGEH/4shagZ4JVzv1dVdLeHLZtPomdf7+oc0+07m
 Qs+yV0HeRS5Zxt7w5blGLC4zDXczT/bUx8oln0Tz5MV7RR/+C2HwMOHC69gfpSc=
 =vomK
 -----END PGP SIGNATURE-----

Merge remote-tracking branch 'remotes/mst/tags/for_upstream' into staging

virtio,vhost,pci,pc: features, cleanups

SRAT tables for DIMM devices
new virtio net flags for speed/duplex
post-copy migration support in vhost
cleanups in pci

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>

# gpg: Signature made Tue 20 Mar 2018 14:40:43 GMT
# gpg:                using RSA key 281F0DB8D28D5469
# gpg: Good signature from "Michael S. Tsirkin <mst@kernel.org>"
# gpg:                 aka "Michael S. Tsirkin <mst@redhat.com>"
# Primary key fingerprint: 0270 606B 6F3C DF3D 0B17  0970 C350 3912 AFBE 8E67
#      Subkey fingerprint: 5D09 FD08 71C8 F85B 94CA  8A0D 281F 0DB8 D28D 5469

* remotes/mst/tags/for_upstream: (51 commits)
  postcopy shared docs
  libvhost-user: Claim support for postcopy
  postcopy: Allow shared memory
  vhost: Huge page align and merge
  vhost+postcopy: Wire up POSTCOPY_END notify
  vhost-user: Add VHOST_USER_POSTCOPY_END message
  libvhost-user: mprotect & madvises for postcopy
  vhost+postcopy: Call wakeups
  vhost+postcopy: Add vhost waker
  postcopy: postcopy_notify_shared_wake
  postcopy: helper for waking shared
  vhost+postcopy: Resolve client address
  postcopy-ram: add a stub for postcopy_request_shared_page
  vhost+postcopy: Helper to send requests to source for shared pages
  vhost+postcopy: Stash RAMBlock and offset
  vhost+postcopy: Send address back to qemu
  libvhost-user+postcopy: Register new regions with the ufd
  migration/ram: ramblock_recv_bitmap_test_byte_offset
  postcopy+vhost-user: Split set_mem_table for postcopy
  vhost+postcopy: Transmit 'listen' to slave
  ...

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>

# Conflicts:
#	scripts/update-linux-headers.sh
2018-03-20 15:48:34 +00:00
Dr. David Alan Gilbert
2ce16640b4 postcopy: use UFFDIO_ZEROPAGE only when available
Use a flag on the RAMBlock to state whether it has the
UFFDIO_ZEROPAGE capability, use it when it's available.

This allows the use of postcopy on tmpfs as well as hugepage
backed files.

Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2018-03-20 05:03:27 +02:00
Dr. David Alan Gilbert
f90bb71bfd qemu_ram_block_host_offset
Utility to give the offset of a host pointer within a RAMBlock
(assuming we already know it's in that RAMBlock)

Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2018-03-20 05:03:26 +02:00
Max Filippov
ebf9a3630c linux-user: fix mmap/munmap/mprotect/mremap/shmat
In linux-user QEMU that runs for a target with TARGET_ABI_BITS bigger
than L1_MAP_ADDR_SPACE_BITS an assertion in page_set_flags fires when
mmap, munmap, mprotect, mremap or shmat is called for an address outside
the guest address space. mmap and mprotect should return ENOMEM in such
case.

Change definition of GUEST_ADDR_MAX to always be the last valid guest
address. Account for this change in open_self_maps.
Add macro guest_addr_valid that verifies if the guest address is valid.
Add function guest_range_valid that verifies if address range is within
guest address space and does not wrap around. Use that macro in
mmap/munmap/mprotect/mremap/shmat for error checking.

Cc: qemu-stable@nongnu.org
Cc: Riku Voipio <riku.voipio@iki.fi>
Cc: Laurent Vivier <laurent@vivier.eu>
Signed-off-by: Max Filippov <jcmvbkbc@gmail.com>
Reviewed-by: Laurent Vivier <laurent@vivier.eu>
Message-Id: <20180307215010.30706-1-jcmvbkbc@gmail.com>
Signed-off-by: Laurent Vivier <laurent@vivier.eu>
2018-03-09 19:21:34 +01:00
Paolo Bonzini
b2a44fcad7 address_space_read: address_space_to_flatview needs RCU lock
address_space_read is calling address_space_to_flatview but it can
be called outside the RCU lock.  To fix it, push the rcu_read_lock/unlock
pair up from flatview_read_full to address_space_read's constant size
fast path and address_space_read_full.

Reviewed-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2018-03-06 14:01:28 +01:00