Commit Graph

744 Commits

Author SHA1 Message Date
Emilio G. Cota
fe656e3185 include: move exec/tb-hash-xx.h to qemu/xxhash.h
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Emilio G. Cota <cota@braap.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2018-12-17 06:04:44 +03:00
Emilio G. Cota
c971d8fa73 exec: introduce qemu_xxhash{2,4,5,6,7}
Before moving them all to include/qemu/xxhash.h.

Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Emilio G. Cota <cota@braap.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2018-12-17 06:04:44 +03:00
Marc-André Lureau
c26763f8ec memory: learn about non-volatile memory region
Add a new flag to mark memory region that are used as non-volatile, by
NVDIMM for example. That bit is propagated down to the flat view, and
reflected in HMP info mtree with a "nv-" prefix on the memory type.

This way, guest_phys_blocks_region_add() can skip the NV memory
regions for dumps and TCG memory clear in a following patch.

Cc: dgilbert@redhat.com
Cc: imammedo@redhat.com
Cc: pbonzini@redhat.com
Cc: guangrong.xiao@linux.intel.com
Cc: mst@redhat.com
Cc: xiaoguangrong.eric@gmail.com
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-Id: <20181003114454.5662-2-marcandre.lureau@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2018-11-06 21:35:05 +01:00
Richard Henderson
ab65110530 cputlb: Remove tlb_c.pending_flushes
This is essentially redundant with tlb_c.dirty.

Tested-by: Emilio G. Cota <cota@braap.org>
Reviewed-by: Emilio G. Cota <cota@braap.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2018-10-31 12:16:39 +00:00
Richard Henderson
3d1523ced6 cputlb: Filter flushes on already clean tlbs
Especially for guests with large numbers of tlbs, like ARM or PPC,
we may well not use all of them in between flush operations.
Remember which tlbs have been used since the last flush, and
avoid any useless flushing.

Tested-by: Emilio G. Cota <cota@braap.org>
Reviewed-by: Emilio G. Cota <cota@braap.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2018-10-31 12:16:35 +00:00
Richard Henderson
e09de0a20d cputlb: Count "partial" and "elided" tlb flushes
Our only statistic so far was "full" tlb flushes, where all mmu_idx
are flushed at the same time.

Now count "partial" tlb flushes where sets of mmu_idx are flushed,
but the set is not maximal.  Account one per mmu_idx flushed, as
that is the unit of work performed.

We don't actually count elided flushes yet, but go ahead and change
the interface presented to the monitor all at once.

Tested-by: Emilio G. Cota <cota@braap.org>
Reviewed-by: Emilio G. Cota <cota@braap.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2018-10-31 12:16:30 +00:00
Richard Henderson
d5363e5849 cputlb: Move env->vtlb_index to env->tlb_d.vindex
The rest of the tlb victim cache is per-tlb,
the next use index should be as well.

Tested-by: Emilio G. Cota <cota@braap.org>
Reviewed-by: Emilio G. Cota <cota@braap.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2018-10-31 12:16:12 +00:00
Richard Henderson
1308e02671 cputlb: Split large page tracking per mmu_idx
The set of large pages in the kernel is probably not the same
as the set of large pages in the application.  Forcing one
range to cover both will flush more often than necessary.

This allows tlb_flush_page_async_work to flush just the one
mmu_idx implicated, which in turn allows us to remove
tlb_check_page_and_flush_by_mmuidx_async_work.

Tested-by: Emilio G. Cota <cota@braap.org>
Reviewed-by: Emilio G. Cota <cota@braap.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2018-10-31 12:16:08 +00:00
Richard Henderson
60a2ad7d86 cputlb: Move cpu->pending_tlb_flush to env->tlb_c.pending_flush
Protect it with the tlb_lock instead of using atomics.
The move puts it in or near the same cacheline as the lock;
using the lock means we don't need a second atomic operation
in order to perform the update.  Which makes it cheap to also
update pending_flush in tlb_flush_by_mmuidx_async_work.

Tested-by: Emilio G. Cota <cota@braap.org>
Reviewed-by: Emilio G. Cota <cota@braap.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2018-10-31 12:16:02 +00:00
Richard Henderson
53d284554c cputlb: Move tlb_lock to CPUTLBCommon
This is the first of several moves to reduce the size of the
CPU_COMMON_TLB macro and improve some locality of refernce.

Tested-by: Emilio G. Cota <cota@braap.org>
Reviewed-by: Emilio G. Cota <cota@braap.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2018-10-31 12:15:28 +00:00
Peter Maydell
a2e002ff79 QEMU trivial patches collected between June and October 2018
(Thank you to Thomas Huth)
 
 v2: fix 32bit build with updated patch (v3) from Philippe Mathieu-Daudé
     built in a 32bit debian sid chroot
 -----BEGIN PGP SIGNATURE-----
 
 iQIcBAABAgAGBQJb2D8VAAoJEPMMOL0/L748dZEP/11pPehjPPYVxesxM++pFeuf
 2EOrLuOTkwlRX23itj2JHv8UTY3YZR9Z8kkF3SWe7qYfp4kB4dTEYjnJY5Im6fWQ
 TUbC9D9SivknOOPyQUtGXZQRN8D8m6V4hN2ZcoXC2M48GT23/uqUWBwCKYeHxdLf
 iJQFmhwDnXSZr+D0l9mpMK2vBsZ5ywcbne8GufTtrkz7Dq9A0nDWVc/XUEHzzahf
 C+6r2fRPjtImxIjhAGQeAEzOk5tYnqK/3kXjy6T4UygvnZw0pkAS1rIb3hvlzm1e
 kBlbA+pgL0kKumMmT9LBR4Os4hlL95URUF+BDNGa3EusImSL/wmhsawslQbfxVyv
 5at3VKIdvPXr7GQvmhaJ3dllXiQixX7A+axevkwyZkuIcYLnuhvh6bCR3ap+4mq/
 GRk4vwXStS6S8rDLAzo4GA4DsE4EDYJSnU13wMEaj1L9sYPVg1224AgCjnlIBbQa
 ntGD3lY7+nG5q1BeVfZXmpNZ4+N4TSpu2uEBxNvWY2/YkaouleQXJ8W4eFirB1Eo
 G8TN2fbroLcKgxhOlpvgFrfrgs8T5ZprpqQnvpE2h6M2Nu4JWJq4008q3uIPOwTy
 o9MrquqOjdG0+OBHr8Ji5HwDKex68NRQhl8BYhqtPhi/+XycDo47YSodNBfw2U/Q
 Ec9301/TQjBcvCBLEzrt
 =sHPv
 -----END PGP SIGNATURE-----

Merge remote-tracking branch 'remotes/vivier2/tags/qemu-trivial-for-3.1-pull-request' into staging

QEMU trivial patches collected between June and October 2018
(Thank you to Thomas Huth)

v2: fix 32bit build with updated patch (v3) from Philippe Mathieu-Daudé
    built in a 32bit debian sid chroot

# gpg: Signature made Tue 30 Oct 2018 11:23:01 GMT
# gpg:                using RSA key F30C38BD3F2FBE3C
# gpg: Good signature from "Laurent Vivier <lvivier@redhat.com>"
# gpg:                 aka "Laurent Vivier <laurent@vivier.eu>"
# gpg:                 aka "Laurent Vivier (Red Hat) <lvivier@redhat.com>"
# Primary key fingerprint: CD2F 75DD C8E3 A4DC 2E4F  5173 F30C 38BD 3F2F BE3C

* remotes/vivier2/tags/qemu-trivial-for-3.1-pull-request:
  milkymist-minimac2: Use qemu_log_mask(GUEST_ERROR) instead of error_report
  ppc: move at24c to its own CONFIG_ symbol
  hw/intc/gicv3: Remove useless parenthesis around DIV_ROUND_UP macro
  hw/pci-host: Remove useless parenthesis around DIV_ROUND_UP macro
  tests/bios-tables-test: Remove an useless cast
  xen: Use the PCI_DEVICE macro
  qobject: Catch another straggler for use of qdict_put_str()
  configure: Support pkg-config for zlib
  tests: Fix typos in comments and help message (found by codespell)
  cpu.h: fix a typo in comment
  linux-user: fix comment s/atomic_write/atomic_set/
  qemu-iotests: make 218 executable
  scripts/qemu.py: remove trailing quotes on docstring
  scripts/decodetree.py: remove unused imports
  docs/devel/testing.rst: add missing newlines after code block
  qemu-iotests: fix filename containing checks
  tests/tcg/README: fix location for lm32 tests
  memory.h: fix typos in comments
  vga_int: remove unused function protype
  configs/alpha: Remove unused CONFIG_PARALLEL_ISA switch

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2018-10-30 15:49:55 +00:00
Li Qiang
847b31f0d6 memory.h: fix typos in comments
Signed-off-by: Li Qiang <liq3ea@gmail.com>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Message-Id: <1539080467-2976-1-git-send-email-liq3ea@gmail.com>
[lv: s/types/typos/]
Signed-off-by: Laurent Vivier <laurent@vivier.eu>
2018-10-26 17:17:32 +02:00
Aleksandar Markovic
89a955e8df target/mips: Add disassembler support for nanoMIPS
Add disassembler support for nanoMIPS.

Reviewed-by: Stefan Markovic <smarkovic@wavecomp.com>
Signed-off-by: Matthew Fortune <matthew.fortune@mips.com>
Signed-off-by: Aleksandar Markovic <amarkovic@wavecomp.com>
2018-10-25 22:13:33 +02:00
Peter Maydell
b312532fd0 * RTC fixes (Artem)
* icount fixes (Artem)
 * rr fixes (Pavel, myself)
 * hotplug cleanup (Igor)
 * SCSI fixes (myself)
 * 4.20-rc1 KVM header update (myself)
 * coalesced PIO support (Peng Hao)
 * HVF fixes (Roman B.)
 * Hyper-V refactoring (Roman K.)
 * Support for Hyper-V IPI (Vitaly)
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2.0.22 (GNU/Linux)
 
 iQEcBAABAgAGBQJbycRuAAoJEL/70l94x66DGL4H/00Gu/+0dNlpxt6hYVaJ30jX
 vFCsZoglBJ060M8m0C9roTF7zdIgI/X0oxJKWNaxqCDD0GSL5oM1AfG0DCsEBq6X
 ApHYfBOh6mMWuB2qzV9QkK0b2u7+g9J8pQQYfZlU+QNtmUUmbzBxV4h7oqOoedJZ
 nTJrkYzBg88bLDXUAuFrnMhaktqzPvyhdD36vUX5Kc9Hk9R3krtEenc/XKfEJg+o
 n1DX9QeAWgi3MdhkhXSaNSnAu2k2+/qJDmOPk1r63ft5ZfaUKOaVecU06ioiEmrc
 KJd6EYeRvh2eIpbOCGSEVDrieGVBOPvqYg0ryWroxSveoPqJZh5ys9MdIjD+8zg=
 =4XhC
 -----END PGP SIGNATURE-----

Merge remote-tracking branch 'remotes/bonzini/tags/for-upstream' into staging

* RTC fixes (Artem)
* icount fixes (Artem)
* rr fixes (Pavel, myself)
* hotplug cleanup (Igor)
* SCSI fixes (myself)
* 4.20-rc1 KVM header update (myself)
* coalesced PIO support (Peng Hao)
* HVF fixes (Roman B.)
* Hyper-V refactoring (Roman K.)
* Support for Hyper-V IPI (Vitaly)

# gpg: Signature made Fri 19 Oct 2018 12:47:58 BST
# gpg:                using RSA key BFFBD25F78C7AE83
# gpg: Good signature from "Paolo Bonzini <bonzini@gnu.org>"
# gpg:                 aka "Paolo Bonzini <pbonzini@redhat.com>"
# Primary key fingerprint: 46F5 9FBD 57D6 12E7 BFD4  E2F7 7E15 100C CD36 69B1
#      Subkey fingerprint: F133 3857 4B66 2389 866C  7682 BFFB D25F 78C7 AE83

* remotes/bonzini/tags/for-upstream: (47 commits)
  replay: pass raw icount value to replay_save_clock
  target/i386: kvm: just return after migrate_add_blocker failed
  hyperv_testdev: add SynIC message and event testmodes
  hyperv: process POST_MESSAGE hypercall
  hyperv: add support for KVM_HYPERV_EVENTFD
  hyperv: process SIGNAL_EVENT hypercall
  hyperv: add synic event flag signaling
  hyperv: add synic message delivery
  hyperv: make overlay pages for SynIC
  hyperv: only add SynIC in compatible configurations
  hyperv: qom-ify SynIC
  hyperv:synic: split capability testing and setting
  i386: add hyperv-stub for CONFIG_HYPERV=n
  default-configs: collect CONFIG_HYPERV* in hyperv.mak
  hyperv: factor out arch-independent API into hw/hyperv
  hyperv: make hyperv_vp_index inline
  hyperv: split hyperv-proto.h into x86 and arch-independent parts
  hyperv: rename kvm_hv_sint_route_set_sint
  hyperv: make HvSintRoute reference-counted
  hyperv: address HvSintRoute by X86CPU pointer
  ...

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2018-10-19 19:01:07 +01:00
Peter Maydell
31e213e306 Queued tcg patches.
-----BEGIN PGP SIGNATURE-----
 
 iQEcBAABAgAGBQJbyXOoAAoJEGTfOOivfiFfqXwIAIGMswKK6a6/I/lG/QFLGEnM
 7fJp6wvPCOb4bRdo+u8f4WuWvquHjxkH7Xby+SwND8he0Xot3Ihysb0pdTfNcvB0
 r5nHMdqyktbXV9Dq/0q16UsvK26KcLMP6csoxwFgdy0hX9tvZoqnuWp/9ho9/F8C
 wlbawX3bbFbWD1mG3i3EYzHG8L7WLaVU8L1NzX8uuoxKVpambLhzPr/4hrtnp3Km
 p6b2RMvim09FuCtu3yW6Ouz5kza8/Fkz2800IodCsUP25L1v5lvumOuhgq94gc9S
 CYXGNIqsOFqPpsiB6Xu+8gHpvbA6BsygGbD4rb4V8ufnc2MxAS756qZKPNFBeGE=
 =v6+d
 -----END PGP SIGNATURE-----

Merge remote-tracking branch 'remotes/rth/tags/pull-tcg-20181018' into staging

Queued tcg patches.

# gpg: Signature made Fri 19 Oct 2018 07:03:20 BST
# gpg:                using RSA key 64DF38E8AF7E215F
# gpg: Good signature from "Richard Henderson <richard.henderson@linaro.org>"
# Primary key fingerprint: 7A48 1E78 868B 4DB6 A85A  05C0 64DF 38E8 AF7E 215F

* remotes/rth/tags/pull-tcg-20181018: (21 commits)
  cputlb: read CPUTLBEntry.addr_write atomically
  target/s390x: Check HAVE_ATOMIC128 and HAVE_CMPXCHG128 at translate
  target/s390x: Skip wout, cout helpers if op helper does not return
  target/s390x: Split do_cdsg, do_lpq, do_stpq
  target/s390x: Convert to HAVE_CMPXCHG128 and HAVE_ATOMIC128
  target/ppc: Convert to HAVE_CMPXCHG128 and HAVE_ATOMIC128
  target/arm: Check HAVE_CMPXCHG128 at translate time
  target/arm: Convert to HAVE_CMPXCHG128
  target/i386: Convert to HAVE_CMPXCHG128
  tcg: Split CONFIG_ATOMIC128
  tcg: Add tlb_index and tlb_entry helpers
  cputlb: serialize tlb updates with env->tlb_lock
  cputlb: fix assert_cpu_is_self macro
  exec: introduce tlb_init
  target/unicore32: remove tlb_flush from uc32_init_fn
  target/alpha: remove tlb_flush from alpha_cpu_initfn
  tcg: distribute tcg_time into TCG contexts
  tcg: plug holes in struct TCGProfile
  tcg: fix use of uninitialized variable under CONFIG_PROFILER
  tcg: access cpu->icount_decr.u16.high with atomics
  ...

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2018-10-19 16:17:32 +01:00
Peng Hao
e6d34aeea6 target-i386 : add coalesced_pio API
the primary API realization.

Signed-off-by: Peng Hao <peng.hao2@zte.com.cn>
Reviewed-by: Eduardo Habkost <ehabkost@redhat.com>
Message-Id: <1539795177-21038-3-git-send-email-peng.hao2@zte.com.cn>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2018-10-19 13:44:11 +02:00
Zhang Chen
13af18f222 COLO: Load dirty pages into SVM's RAM cache firstly
We should not load PVM's state directly into SVM, because there maybe some
errors happen when SVM is receving data, which will break SVM.

We need to ensure receving all data before load the state into SVM. We use
an extra memory to cache these data (PVM's ram). The ram cache in secondary side
is initially the same as SVM/PVM's memory. And in the process of checkpoint,
we cache the dirty pages of PVM into this ram cache firstly, so this ram cache
always the same as PVM's memory at every checkpoint, then we flush this cached ram
to SVM after we receive all PVM's state.

Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com>
Signed-off-by: Li Zhijian <lizhijian@cn.fujitsu.com>
Signed-off-by: Zhang Chen <zhangckid@gmail.com>
Signed-off-by: Zhang Chen <chen.zhang@intel.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
2018-10-19 11:15:03 +08:00
Emilio G. Cota
403f290c06 cputlb: read CPUTLBEntry.addr_write atomically
Updates can come from other threads, so readers that do not
take tlb_lock must use atomic_read to avoid undefined
behaviour (UB).

This completes the conversion to tlb_lock. This conversion results
on average in no performance loss, as the following experiments
(run on an Intel i7-6700K CPU @ 4.00GHz) show.

1. aarch64 bootup+shutdown test:

- Before:
 Performance counter stats for 'taskset -c 0 ../img/aarch64/die.sh' (10 runs):

       7487.087786      task-clock (msec)         #    0.998 CPUs utilized            ( +-  0.12% )
    31,574,905,303      cycles                    #    4.217 GHz                      ( +-  0.12% )
    57,097,908,812      instructions              #    1.81  insns per cycle          ( +-  0.08% )
    10,255,415,367      branches                  # 1369.747 M/sec                    ( +-  0.08% )
       173,278,962      branch-misses             #    1.69% of all branches          ( +-  0.18% )

       7.504481349 seconds time elapsed                                          ( +-  0.14% )

- After:
 Performance counter stats for 'taskset -c 0 ../img/aarch64/die.sh' (10 runs):

       7462.441328      task-clock (msec)         #    0.998 CPUs utilized            ( +-  0.07% )
    31,478,476,520      cycles                    #    4.218 GHz                      ( +-  0.07% )
    57,017,330,084      instructions              #    1.81  insns per cycle          ( +-  0.05% )
    10,251,929,667      branches                  # 1373.804 M/sec                    ( +-  0.05% )
       173,023,787      branch-misses             #    1.69% of all branches          ( +-  0.11% )

       7.474970463 seconds time elapsed                                          ( +-  0.07% )

2. SPEC06int:
                                              SPEC06int (test set)
                                           [Y axis: Speedup over master]
  1.15 +-+----+------+------+------+------+------+-------+------+------+------+------+------+------+----+-+
       |                                                                                                  |
   1.1 +-+.................................+++.............................+  tlb-lock-v2 (m+++x)       +-+
       |                                +++ |                   +++        tlb-lock-v3 (spinl|ck)         |
       |                    +++          |  |     +++    +++     |                           |            |
  1.05 +-+....+++...........####.........|####.+++.|......|.....###....+++...........+++....###.........+-+
       |      ###         ++#| #         |# |# ***### +++### +++#+#     |     +++     |     #|#    ###    |
     1 +-+++***+#++++####+++#++#++++++++++#++#+*+*++#++++#+#+****+#++++###++++###++++###++++#+#++++#+#+++-+
       |    *+* #    #++# ***  #   #### ***  # * *++# ****+# *| * # ****|#   |# #    #|#    #+#    # #    |
  0.95 +-+..*.*.#....#..#.*|*..#...#..#.*|*..#.*.*..#.*|.*.#.*++*.#.*++*+#.****.#....#+#....#.#..++#.#..+-+
       |    * * #    #  # *|*  #   #  # *|*  # * *  # *++* # *  * # *  * # * |* #  ++# #    # #  *** #    |
       |    * * #  ++#  # *+*  #   #  # *|*  # * *  # *  * # *  * # *  * # *++* # **** #  ++# #  * * #    |
   0.9 +-+..*.*.#...|#..#.*.*..#.++#..#.*|*..#.*.*..#.*..*.#.*..*.#.*..*.#.*..*.#.*.|*.#...|#.#..*.*.#..+-+
       |    * * #  ***  # * *  #  |#  # *+*  # * *  # *  * # *  * # *  * # *  * # *++* #   |# #  * * #    |
  0.85 +-+..*.*.#..*|*..#.*.*..#.***..#.*.*..#.*.*..#.*..*.#.*..*.#.*..*.#.*..*.#.*..*.#.****.#..*.*.#..+-+
       |    * * #  *+*  # * *  # *|*  # * *  # * *  # *  * # *  * # *  * # *  * # *  * # * |* #  * * #    |
       |    * * #  * *  # * *  # *+*  # * *  # * *  # *  * # *  * # *  * # *  * # *  * # * |* #  * * #    |
   0.8 +-+..*.*.#..*.*..#.*.*..#.*.*..#.*.*..#.*.*..#.*..*.#.*..*.#.*..*.#.*..*.#.*..*.#.*++*.#..*.*.#..+-+
       |    * * #  * *  # * *  # * *  # * *  # * *  # *  * # *  * # *  * # *  * # *  * # *  * #  * * #    |
  0.75 +-+--***##--***###-***###-***###-***###-***###-****##-****##-****##-****##-****##-****##--***##--+-+
 400.perlben401.bzip2403.gcc429.m445.gob456.hmme45462.libqua464.h26471.omnet473483.xalancbmkgeomean

  png: https://imgur.com/a/BHzpPTW

Notes:
- tlb-lock-v2 corresponds to an implementation with a mutex.
- tlb-lock-v3 corresponds to the current implementation, i.e.
  a spinlock and a single lock acquisition in tlb_set_page_with_attrs.

Signed-off-by: Emilio G. Cota <cota@braap.org>
Message-Id: <20181016153840.25877-1-cota@braap.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2018-10-18 19:46:53 -07:00
Richard Henderson
383beda9cf tcg: Add tlb_index and tlb_entry helpers
Isolate the computation of an index from an address into a
helper before we change that function.

Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
[ cota: convert tlb_vaddr_to_host; use atomic_read on addr_write ]
Signed-off-by: Emilio G. Cota <cota@braap.org>
Message-Id: <20181009175129.17888-2-cota@braap.org>
2018-10-18 18:58:10 -07:00
Emilio G. Cota
71aec3541d cputlb: serialize tlb updates with env->tlb_lock
Currently we rely on atomic operations for cross-CPU invalidations.
There are two cases that these atomics miss: cross-CPU invalidations
can race with either (1) vCPU threads flushing their TLB, which
happens via memset, or (2) vCPUs calling tlb_reset_dirty on their TLB,
which updates .addr_write with a regular store. This results in
undefined behaviour, since we're mixing regular and atomic ops
on concurrent accesses.

Fix it by using tlb_lock, a per-vCPU lock. All updaters of tlb_table
and the corresponding victim cache now hold the lock.
The readers that do not hold tlb_lock must use atomic reads when
reading .addr_write, since this field can be updated by other threads;
the conversion to atomic reads is done in the next patch.

Note that an alternative fix would be to expand the use of atomic ops.
However, in the case of TLB flushes this would have a huge performance
impact, since (1) TLB flushes can happen very frequently and (2) we
currently use a full memory barrier to flush each TLB entry, and a TLB
has many entries. Instead, acquiring the lock is barely slower than a
full memory barrier since it is uncontended, and with a single lock
acquisition we can flush the entire TLB.

Tested-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Emilio G. Cota <cota@braap.org>
Message-Id: <20181009174557.16125-6-cota@braap.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2018-10-18 18:58:10 -07:00
Emilio G. Cota
5005e2537d exec: introduce tlb_init
Paves the way for the addition of a per-TLB lock.

Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Emilio G. Cota <cota@braap.org>
Message-Id: <20181009174557.16125-4-cota@braap.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2018-10-18 18:58:10 -07:00
Peter Maydell
62a0db942d memory: Remove old_mmio accessors
Now that all the users of old_mmio MemoryRegion accessors
have been converted, we can remove the core code support.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Message-Id: <20180824170422.5783-2-peter.maydell@linaro.org>
Based-on: <20180802174042.29234-1-peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2018-10-02 19:09:14 +02:00
Hikaru Nishida
d5dbde4645 hostmem-file: make available memory-backend-file on POSIX-based hosts
Before this change, memory-backend-file object is valid for Linux hosts
only because hostmem-file.c is compiled only on Linux hosts.
However, other POSIX-based hosts (such as macOS) can support
memory-backend-file object in the same way as on Linux hosts.
This patch makes hostmem-file.c and related functions to be compiled on
all POSIX-based hosts to make available memory-backend-file on them.

Signed-off-by: Hikaru Nishida <hikarupsp@gmail.com>
Message-Id: <20180924123205.29651-1-hikarupsp@gmail.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2018-10-02 19:09:13 +02:00
Thomas Huth
a69dc537cc ppc: Remove deprecated ppcemb target
There is no known available OS for ppc around anymore that uses page
sizes below 4k, so it does not make much sense that we keep wasting
our time on building and testing the ppcemb-softmmu target. It has
been deprecated since two releases, and nobody complained, so let's
remove this now.

Signed-off-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2018-08-28 11:31:23 +10:00
Peter Maydell
659b11e7a7 linux-user fixes:
- netlink fixes (add missing types, fix MSG_TRUNC)
 - sh4 fix (tcg state)
 - sparc32plus fix (truncate address space to 32bit)
 - add x86_64 binfmt data
 -----BEGIN PGP SIGNATURE-----
 
 iQIcBAABAgAGBQJbeyOIAAoJEPMMOL0/L74843oQAJCbDedfagKvmhMBFxWqFsp6
 En7UuUjh6MtOgb++5W47RY4LoVi12IGys5qvXLM3+Gar1l5oFgQaG58jnsUgl4uO
 o+QnsM+KqsTnYlrlQOviY8US+9eNoMP/dp/sAwF0NbpQpKUTiiWv/QQ6B8YC/x5O
 yv016xn+9ul7HrS7H57ah4lrm5YJcFh54pnKMzW6f40ekPiXIrbKicgXKUbR9Fg4
 c1Kxqwo+rxGS4tZ6aB+RFvu5dQ8NMxX4DhQUYXL1H8JSMR+fxPY3nzYTNqyFUwu9
 Qb8wkf/sP4hPz3QIay/ha1ThmAJQJqJfrWDD9Kx5JrMF1YLFSR9dfx2lmjlgHjbr
 TsAkpKHSsM0azqnFlJ5khmEjC7aJSxmsd9PQwH0VOnmuszAej9a13E9A1kwdA54N
 JAzRBjuxO5Y2W7MXiqlfNI+XNBLa7BnXIRR1pa8icSHCyFfXxhQSsa80YF0JZ6JE
 j7ACiXkxmcMdJUjxRLL24rCERnanSwIHPjsxdVkJHaMaO+L0eiMH2ZcboQbTcnlK
 L6Pl0sD4kBBGlyN5V0MVLSMMWfm9OXyTSz8bAGUt7MV574oq6vyub37I44l6FXKK
 RPUMaSuFBOD3kaA2HP+bmRumrCHZ/eGhsmkFcquPxML/F+tXDAT6WCd2FuBGmA5c
 UkCSNXY6zdESgnc76G1u
 =wHSV
 -----END PGP SIGNATURE-----

Merge remote-tracking branch 'remotes/vivier2/tags/linux-user-for-3.1-pull-request' into staging

linux-user fixes:
- netlink fixes (add missing types, fix MSG_TRUNC)
- sh4 fix (tcg state)
- sparc32plus fix (truncate address space to 32bit)
- add x86_64 binfmt data

# gpg: Signature made Mon 20 Aug 2018 21:24:40 BST
# gpg:                using RSA key F30C38BD3F2FBE3C
# gpg: Good signature from "Laurent Vivier <lvivier@redhat.com>"
# gpg:                 aka "Laurent Vivier <laurent@vivier.eu>"
# gpg:                 aka "Laurent Vivier (Red Hat) <lvivier@redhat.com>"
# Primary key fingerprint: CD2F 75DD C8E3 A4DC 2E4F  5173 F30C 38BD 3F2F BE3C

* remotes/vivier2/tags/linux-user-for-3.1-pull-request:
  linux-user: add QEMU_IFLA_INFO_KIND nested type for tun
  linux-user: update netlink route types
  linux-user: introduce QEMU_RTA_* to use with rtattr_type_t
  linux-user: fix recvmsg()/recvfrom() with netlink and MSG_TRUNC
  sh4: fix use_icount with linux-user
  linux-user: fix 32bit g2h()/h2g()
  qemu-binfmt-conf.sh: add x86_64 target

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2018-08-21 11:36:15 +01:00
Peter Maydell
55f4e79d79 pc: fixes
This includes nvdimm persistence fixes queued before the release.
 
 Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
 -----BEGIN PGP SIGNATURE-----
 
 iQEcBAABAgAGBQJbepoTAAoJECgfDbjSjVRpLioH/3BPps8FLh4x2gZSq3B+u72O
 RYUA3I3TilEGyc9yf8o7e1Hf+pQAJBEmulcnKxXFVWZIJ1GVLPt4NZCMQGiPDnJL
 +RCT/Q64PUy09hRjddAasikrvXa4YOsRgBgJJToO7v9PSQSaU3fC7O3hNea7KcF/
 C4SSqkUgxyDhCCYHHblpKxFz/wtwy4ZaCGSdozIdmKNPJ6/ye8wOQ1Mq9e1Mwp18
 S6ilJub5IwB6aM2KVMmX4AFomF4u2cn153ts8fI+Dyo4/NE6P4+viDlz3BOBKdzm
 kmd49h6/n4Lenoo4oI1yNHSuIJJTVfvnoLu6rG7mPbQKgxNd1uN4KuUIygU5PCY=
 =Xcaj
 -----END PGP SIGNATURE-----

Merge remote-tracking branch 'remotes/mst/tags/for_upstream' into staging

pc: fixes

This includes nvdimm persistence fixes queued before the release.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>

# gpg: Signature made Mon 20 Aug 2018 11:38:11 BST
# gpg:                using RSA key 281F0DB8D28D5469
# gpg: Good signature from "Michael S. Tsirkin <mst@kernel.org>"
# gpg:                 aka "Michael S. Tsirkin <mst@redhat.com>"
# Primary key fingerprint: 0270 606B 6F3C DF3D 0B17  0970 C350 3912 AFBE 8E67
#      Subkey fingerprint: 5D09 FD08 71C8 F85B 94CA  8A0D 281F 0DB8 D28D 5469

* remotes/mst/tags/for_upstream:
  migration/ram: ensure write persistence on loading all data to PMEM.
  migration/ram: Add check and info message to nvdimm post copy.
  mem/nvdimm: ensure write persistence to PMEM in label emulation
  hostmem-file: add the 'pmem' option
  configure: add libpmem support
  memory, exec: switch file ram allocation functions to 'flags' parameters
  memory, exec: Expose all memory block related flags.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2018-08-21 10:23:53 +01:00
Peter Maydell
8c1c245378 memory: Remove MMIO request_ptr APIs
Remove the obsolete MMIO request_ptr APIs; they have no
users now.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Edgar E. Iglesias <edgar.iglesias@xilinx.com>
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Reviewed-by: KONRAD Frederic <frederic.konrad@adacore.com>
Message-id: 20180817114619.22354-3-peter.maydell@linaro.org
2018-08-20 11:24:32 +01:00
Laurent Vivier
3e23de1523 linux-user: fix 32bit g2h()/h2g()
sparc32plus has 64bit long type but only 32bit virtual address space.

For instance, "apt-get upgrade" failed because of a mmap()/msync()
sequence.

mmap() returned 0xff252000 but msync() used g2h(0xffffffffff252000)
to find the host address. The "(target_ulong)" in g2h() doesn't fix the
address because it is 64bit long.

This patch introduces an "abi_ptr" that is set to uint32_t
if the virtual address space is addressed using 32bit in the linux-user
case. It stays set to target_ulong with softmmu case.

Signed-off-by: Laurent Vivier <laurent@vivier.eu>
Message-Id: <20180814171217.14680-1-laurent@vivier.eu>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
[lv: added "%" in TARGET_ABI_FMT_ptr "%"PRIx64]
2018-08-17 13:56:33 +02:00
Peter Maydell
55a7cb144d accel/tcg: Check whether TLB entry is RAM consistently with how we set it up
We set up TLB entries in tlb_set_page_with_attrs(), where we have
some logic for determining whether the TLB entry is considered
to be RAM-backed, and thus has a valid addend field. When we
look at the TLB entry in get_page_addr_code(), we use different
logic for determining whether to treat the page as RAM-backed
and use the addend field. This is confusing, and in fact buggy,
because the code in tlb_set_page_with_attrs() correctly decides
that rom_device memory regions not in romd mode are not RAM-backed,
but the code in get_page_addr_code() thinks they are RAM-backed.
This typically results in "Bad ram pointer" assertion if the
guest tries to execute from such a memory region.

Fix this by making get_page_addr_code() just look at the
TLB_MMIO bit in the code_address field of the TLB, which
tlb_set_page_with_attrs() sets if and only if the addend
field is not valid for code execution.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Tested-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Message-id: 20180713150945.12348-1-peter.maydell@linaro.org
2018-08-14 17:17:19 +01:00
Junyan He
a4de8552b2 hostmem-file: add the 'pmem' option
When QEMU emulates vNVDIMM labels and migrates vNVDIMM devices, it
needs to know whether the backend storage is a real persistent memory,
in order to decide whether special operations should be performed to
ensure the data persistence.

This boolean option 'pmem' allows users to specify whether the backend
storage of memory-backend-file is a real persistent memory. If
'pmem=on', QEMU will set the flag RAM_PMEM in the RAM block of the
corresponding memory region. If 'pmem' is set while lack of libpmem
support, a error is generated.

Signed-off-by: Junyan He <junyan.he@intel.com>
Signed-off-by: Haozhong Zhang <haozhong.zhang@intel.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2018-08-10 13:29:39 +03:00
Junyan He
cbfc017103 memory, exec: switch file ram allocation functions to 'flags' parameters
As more flag parameters besides the existing 'share' are going to be
added to following functions
memory_region_init_ram_from_file
qemu_ram_alloc_from_fd
qemu_ram_alloc_from_file
let's switch them to use the 'flags' parameters so as to ease future
flag additions.

The existing 'share' flag is converted to the RAM_SHARED bit in ram_flags,
and other flag bits are ignored by above functions right now.

Signed-off-by: Junyan He <junyan.he@intel.com>
Signed-off-by: Haozhong Zhang <haozhong.zhang@intel.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
2018-08-10 13:29:39 +03:00
Junyan He
b0e5de9381 memory, exec: Expose all memory block related flags.
We need to use these flags in other files rather than just in exec.c,
For example, RAM_SHARED should be used when create a ram block from file.
We expose them the exec/memory.h

Signed-off-by: Junyan He <junyan.he@intel.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2018-08-10 13:29:39 +03:00
Peter Maydell
e8c858944e * IEC units series (Philippe)
* Hyper-V PV TLB flush (Vitaly)
 * git archive detection (Daniel)
 * host serial passthrough fix (David)
 * NPT support for SVM emulation (Jan)
 * x86 "info mem" and "info tlb" fix (Doug)
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2.0.22 (GNU/Linux)
 
 iQEcBAABAgAGBQJbOkI9AAoJEL/70l94x66DaA0IAIzJD+3hUdwDCqitlW65x/yX
 D+KAoX4Ytpz7+QOtcXC7BBUW3JwvHTS5sfuvaAqKWnqEXSDrQs4/gG2iEB1UJ3Ko
 hC2LHGKygdcD9k3vuQ2q2USOu08jEUYRvvjgHmD6lsyaAQ+cb2heAYz/SxQqbkkt
 qun6TFaWuTGBQF1qy0xjJitdPokGwFZgprlZyVmMId/yLlsbsFlwmGIJh/l1+zqw
 I4DBzRzuhAg/nLH9qVZ3LWOjH1H0MLPGBUG59w4GbIDpwRh1VZu+GTyAmAYaquHl
 dSHYweXywNTvhi0WLroP8SD0Nqf/ZObuSRtop60gqJuP3YAbPrBMeRTlsqoZIRE=
 =Xzc8
 -----END PGP SIGNATURE-----

Merge remote-tracking branch 'remotes/bonzini/tags/for-upstream' into staging

* IEC units series (Philippe)
* Hyper-V PV TLB flush (Vitaly)
* git archive detection (Daniel)
* host serial passthrough fix (David)
* NPT support for SVM emulation (Jan)
* x86 "info mem" and "info tlb" fix (Doug)

# gpg: Signature made Mon 02 Jul 2018 16:18:21 BST
# gpg:                using RSA key BFFBD25F78C7AE83
# gpg: Good signature from "Paolo Bonzini <bonzini@gnu.org>"
# gpg:                 aka "Paolo Bonzini <pbonzini@redhat.com>"
# Primary key fingerprint: 46F5 9FBD 57D6 12E7 BFD4  E2F7 7E15 100C CD36 69B1
#      Subkey fingerprint: F133 3857 4B66 2389 866C  7682 BFFB D25F 78C7 AE83

* remotes/bonzini/tags/for-upstream: (50 commits)
  tcg: simplify !CONFIG_TCG handling of tb_invalidate_*
  i386/monitor.c: make addresses canonical for "info mem" and "info tlb"
  target-i386: Add NPT support
  serial: Open non-block
  bsd-user: Use the IEC binary prefix definitions
  linux-user: Use the IEC binary prefix definitions
  tests/crypto: Use the IEC binary prefix definitions
  vl: Use the IEC binary prefix definitions
  monitor: Use the IEC binary prefix definitions
  cutils: Do not include "qemu/units.h" directly
  hw/rdma: Use the IEC binary prefix definitions
  hw/virtio: Use the IEC binary prefix definitions
  hw/vfio: Use the IEC binary prefix definitions
  hw/sd: Use the IEC binary prefix definitions
  hw/usb: Use the IEC binary prefix definitions
  hw/net: Use the IEC binary prefix definitions
  hw/i386: Use the IEC binary prefix definitions
  hw/ppc: Use the IEC binary prefix definitions
  hw/mips: Use the IEC binary prefix definitions
  hw/mips/r4k: Constify params_size
  ...

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2018-07-02 19:07:19 +01:00
Peter Maydell
334692bce7 tcg: Define and use new tlb_hit() and tlb_hit_page() functions
The condition to check whether an address has hit against a particular
TLB entry is not completely trivial. We do this in various places, and
in fact in one place (get_page_addr_code()) we have got the condition
wrong. Abstract it out into new tlb_hit() and tlb_hit_page() inline
functions (one for a known-page-aligned address and one for an
arbitrary address), and use them in all the places where we had the
condition correct.

This is a no-behaviour-change patch; we leave fixing the buggy
code in get_page_addr_code() to a subsequent patch.

Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Message-Id: <20180629162122.19376-2-peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2018-07-02 08:02:20 -07:00
Paolo Bonzini
c40d479207 tcg: simplify !CONFIG_TCG handling of tb_invalidate_*
There is no need for a stub, since tb_invalidate_phys_addr can be excised
altogether when TCG is disabled.  This is a bit cleaner since it avoids
using code that is clearly specific to user-mode emulation (it calls
mmap_lock/unlock) for the !CONFIG_TCG case.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2018-07-02 15:41:18 +02:00
Philippe Mathieu-Daudé
646f34fa54 tcg: Fix --disable-tcg build breakage
Fix the --disable-tcg breakage introduced by 8bca9a03ec:

    $ configure --disable-tcg
    [...]
    $ make -C i386-softmmu exec.o
    make: Entering directory 'i386-softmmu'
      CC      exec.o
    In file included from source/qemu/exec.c:62:0:
    source/qemu/include/exec/ram_addr.h:96:6: error: conflicting types for ‘tb_invalidate_phys_range’
     void tb_invalidate_phys_range(ram_addr_t start, ram_addr_t end);
          ^~~~~~~~~~~~~~~~~~~~~~~~
    In file included from source/qemu/exec.c:24:0:
    source/qemu/include/exec/exec-all.h:309:6: note: previous declaration of ‘tb_invalidate_phys_range’ was here
     void tb_invalidate_phys_range(target_ulong start, target_ulong end);
          ^~~~~~~~~~~~~~~~~~~~~~~~
    source/qemu/exec.c:1043:6: error: conflicting types for ‘tb_invalidate_phys_addr’
     void tb_invalidate_phys_addr(AddressSpace *as, hwaddr addr, MemTxAttrs attrs)
          ^~~~~~~~~~~~~~~~~~~~~~~
    In file included from source/qemu/exec.c:24:0:
    source/qemu/include/exec/exec-all.h:308:6: note: previous declaration of ‘tb_invalidate_phys_addr’ was here
     void tb_invalidate_phys_addr(target_ulong addr);
          ^~~~~~~~~~~~~~~~~~~~~~~
    make: *** [source/qemu/rules.mak:69: exec.o] Error 1
    make: Leaving directory 'i386-softmmu'

Tested to build x86_64-softmmu and i386-softmmu targets.

Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Message-id: 20180629200710.27626-1-f4bug@amsat.org
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2018-07-02 13:42:05 +01:00
Alexey Kardashevskiy
fc051ae6c4 memory/hmp: Print owners/parents in "info mtree"
This adds owners/parents (which are the same, just occasionally
owner==NULL) printing for memory regions; a new '-o' flag
enabled new output.

Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Message-Id: <20180604032511.6980-1-aik@ozlabs.ru>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2018-06-28 19:05:36 +02:00
Paolo Bonzini
8bca9a03ec move public invalidate APIs out of translate-all.{c,h}, clean up
Place them in exec.c, exec-all.h and ram_addr.h.  This removes
knowledge of translate-all.h (which is an internal header) from
several files outside accel/tcg and removes knowledge of
AddressSpace from translate-all.c (as it only operates on ram_addr_t).

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2018-06-28 19:05:30 +02:00
Peter Maydell
4a83bf2f33 migration/next for 20180627
-----BEGIN PGP SIGNATURE-----
 
 iQIcBAABCAAGBQJbM4jhAAoJEPSH7xhYctcjFkoP/RkE3RhO/3JV+DKYU5KQwcrV
 n6FcXwsMIJ9EsCWLaZl4R+7nlbPYb4xPVG0UJnG1ntqIhJ5gN6RHzfi4l1wGCKUT
 EBRq7C5mEUbUVup7RPO1bEDixydvsvNrSHCL8AzALaag4t5HWrjneLcnJhrSykHx
 WovFn3Zyi/3LzUKHTzSQWLUoEKSBqNVJ1Ar3kIKwNDfS+jOhiZ6/Bm29hLn9UH0X
 b/iFxvCcalL2y/ulacocaS2dMe6Dx3/TvhKde8sZuyez2JeGCFFicrty7TOFhpoV
 INHjOQ3P4KLCWvJw8vs0jQmw2kikFQhqBXbJPVi3/0hV0MH/Uj4cC78/aGTn+Dt3
 bNa63eBi6//2KUzgdA+tlrTsVrkifjIhd69a0keTy1oH0zQHRjKvJrgSu0Mcy1x3
 YLbPnsjdIUrzuhCQwZcBAuzJP73OW0Q/Y+RSulFUjqkO2IBtJSUkx+AeLews5uhz
 +q9xom4FjHZMaRLwz6tWYUXxyfRlRyDTiDxZtL/9I+6XZWRHKqA8+/QZf0l2whTm
 NfJz43J+uL7Dlq9u8aMX3e+qHGwGj6b7o3zB5+xBtiHYFZMg7qPuO42Lstubdexp
 Zngb+EZl1H4SmYnOLhSVSY+Gus3f2xI6JXpuZuy1wo/bgTwFXM2o6z16t9sobKYl
 RPusi0XEBu2/XkomJxYB
 =kc0K
 -----END PGP SIGNATURE-----

Merge remote-tracking branch 'remotes/juanquintela/tags/migration/20180627' into staging

migration/next for 20180627

# gpg: Signature made Wed 27 Jun 2018 13:53:53 BST
# gpg:                using RSA key F487EF185872D723
# gpg: Good signature from "Juan Quintela <quintela@redhat.com>"
# gpg:                 aka "Juan Quintela <quintela@trasno.org>"
# Primary key fingerprint: 1899 FF8E DEBF 58CC EE03  4B82 F487 EF18 5872 D723

* remotes/juanquintela/tags/migration/20180627:
  migration: fix crash in when incoming client channel setup fails
  postcopy: drop ram_pages parameter from postcopy_ram_incoming_init()
  migration: Stop sending whole pages through main channel
  migration: Remove not needed semaphore and quit
  migration: Wait for blocking IO
  migration: Start sending messages
  migration: Create ram_save_multifd_page
  migration: Create multifd_bytes ram_counter
  migration: Synchronize multifd threads with main thread
  migration: Add block where to send/receive packets
  migration: Multifd channels always wait on the sem
  migration: Add multifd traces for start/end thread
  migration: Abstract the number of bytes sent
  migration: Calculate mbps only during transfer time
  migration: Create multifd packet
  migration: Create multipage support

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2018-06-28 15:31:42 +01:00
David Hildenbrand
c136180c90 postcopy: drop ram_pages parameter from postcopy_ram_incoming_init()
Not needed. Don't expose last_ram_page().

Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20180620202736.21399-1-david@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
2018-06-27 13:28:31 +02:00
Emilio G. Cota
32c072341f trace: fix misreporting of TCG access sizes for user-space
trace_mem_build_info expects a size_shift for its first argument. Fix it.

Signed-off-by: Emilio G. Cota <cota@braap.org>
Message-id: 1527028012-21888-2-git-send-email-cota@braap.org
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2018-06-27 11:09:24 +01:00
Peter Maydell
55df6fcf54 tcg: Support MMU protection regions smaller than TARGET_PAGE_SIZE
Add support for MMU protection regions that are smaller than
TARGET_PAGE_SIZE. We do this by marking the TLB entry for those
pages with a flag TLB_RECHECK. This flag causes us to always
take the slow-path for accesses. In the slow path we can then
special case them to always call tlb_fill() again, so we have
the correct information for the exact address being accessed.

This change allows us to handle reading and writing from small
regions; we cannot deal with execution from the small region.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20180620130619.11362-2-peter.maydell@linaro.org
2018-06-26 17:50:41 +01:00
Peter Maydell
33836a7315 TCG patch queue:
Workaround macos assembler lossage.
 Eliminate tb_lock.
 Fix TB code generation overflow.
 -----BEGIN PGP SIGNATURE-----
 
 iQEcBAABAgAGBQJbJBZIAAoJEGTfOOivfiFfy0gH/1brodMhJbTS6/k9+FyXWEy5
 zYjCGKKlMZk//Y+4wcF5tXY/qDRNWk80j6KyxumNp3gCBehx6u59EEsrJRQaxBHm
 nYbDoE3Fy0J4KgRzdGmkYtl89XDK1++Ea9uL9N/stg2MSodzqoV6uudLYr/f+nRj
 4MkS+7BI+aJ4/XIKLU+/+cRo+5FdD0hNEabjlUxTOSrfJbr/YxbnVINX01A4yD6q
 LSzwLAEqpJehFBQjeSLu93ztrapj/1vEaguPOf04F6pXgOLpvSPlPahqwwk4qRwS
 OFgWwSPby3jrNLYZcufx2cY5pG3i4wDGK3z/B35hnDEGwYp1fNt6xdq+EzmHhaM=
 =ibt/
 -----END PGP SIGNATURE-----

Merge remote-tracking branch 'remotes/rth/tags/pull-tcg-20180615' into staging

TCG patch queue:

Workaround macos assembler lossage.
Eliminate tb_lock.
Fix TB code generation overflow.

# gpg: Signature made Fri 15 Jun 2018 20:40:56 BST
# gpg:                using RSA key 64DF38E8AF7E215F
# gpg: Good signature from "Richard Henderson <richard.henderson@linaro.org>"
# Primary key fingerprint: 7A48 1E78 868B 4DB6 A85A  05C0 64DF 38E8 AF7E 215F

* remotes/rth/tags/pull-tcg-20180615:
  tcg: Reduce max TB opcode count
  tcg: remove tb_lock
  translate-all: remove tb_lock mention from cpu_restore_state_from_tb
  cputlb: remove tb_lock from tlb_flush functions
  translate-all: protect TB jumps with a per-destination-TB lock
  translate-all: discard TB when tb_link_page returns an existing matching TB
  translate-all: introduce assert_no_pages_locked
  translate-all: add page_locked assertions
  translate-all: use per-page locking in !user-mode
  translate-all: move tb_invalidate_phys_page_range up in the file
  translate-all: work page-by-page in tb_invalidate_phys_range_1
  translate-all: remove hole in PageDesc
  translate-all: make l1_map lockless
  translate-all: iterate over TBs in a page with PAGE_FOR_EACH_TB
  tcg: move tb_ctx.tb_phys_invalidate_count to tcg_ctx
  tcg: track TBs with per-region BST's
  qht: return existing entry when qht_insert fails
  qht: require a default comparison function
  tcg/i386: Use byte form of xgetbv instruction

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2018-06-21 17:54:26 +01:00
Emilio G. Cota
0ac20318ce tcg: remove tb_lock
Use mmap_lock in user-mode to protect TCG state and the page descriptors.
In !user-mode, each vCPU has its own TCG state, so no locks needed.
Per-page locks are used to protect the page descriptors.

Per-TB locks are used in both modes to protect TB jumps.

Some notes:

- tb_lock is removed from notdirty_mem_write by passing a
  locked page_collection to tb_invalidate_phys_page_fast.

- tcg_tb_lookup/remove/insert/etc have their own internal lock(s),
  so there is no need to further serialize access to them.

- do_tb_flush is run in a safe async context, meaning no other
  vCPU threads are running. Therefore acquiring mmap_lock there
  is just to please tools such as thread sanitizer.

- Not visible in the diff, but tb_invalidate_phys_page already
  has an assert_memory_lock.

- cpu_io_recompile is !user-only, so no mmap_lock there.

- Added mmap_unlock()'s before all siglongjmp's that could
  be called in user-mode while mmap_lock is held.
  + Added an assert for !have_mmap_lock() after returning from
    the longjmp in cpu_exec, just like we do in cpu_exec_step_atomic.

Performance numbers before/after:

Host: AMD Opteron(tm) Processor 6376

                 ubuntu 17.04 ppc64 bootup+shutdown time

  700 +-+--+----+------+------------+-----------+------------*--+-+
      |    +    +      +            +           +           *B    |
      |         before ***B***                            ** *    |
      |tb lock removal ###D###                         ***        |
  600 +-+                                           ***         +-+
      |                                           **         #    |
      |                                        *B*          #D    |
      |                                     *** *         ##      |
  500 +-+                                ***           ###      +-+
      |                             * ***           ###           |
      |                            *B*          # ##              |
      |                          ** *          #D#                |
  400 +-+                      **            ##                 +-+
      |                      **           ###                     |
      |                    **           ##                        |
      |                  **         # ##                          |
  300 +-+  *           B*          #D#                          +-+
      |    B         ***        ###                               |
      |    *       **       ####                                  |
      |     *   ***      ###                                      |
  200 +-+   B  *B     #D#                                       +-+
      |     #B* *   ## #                                          |
      |     #*    ##                                              |
      |    + D##D#     +            +           +            +    |
  100 +-+--+----+------+------------+-----------+------------+--+-+
           1    8      16      Guest CPUs       48           64
  png: https://imgur.com/HwmBHXe

              debian jessie aarch64 bootup+shutdown time

  90 +-+--+-----+-----+------------+------------+------------+--+-+
     |    +     +     +            +            +            +    |
     |         before ***B***                                B    |
  80 +tb lock removal ###D###                              **D  +-+
     |                                                   **###    |
     |                                                 **##       |
  70 +-+                                             ** #       +-+
     |                                             ** ##          |
     |                                           **  #            |
  60 +-+                                       *B  ##           +-+
     |                                       **  ##               |
     |                                    ***  #D                 |
  50 +-+                               ***   ##                 +-+
     |                             * **   ###                     |
     |                           **B*  ###                        |
  40 +-+                     ****  # ##                         +-+
     |                   ****     #D#                             |
     |             ***B**      ###                                |
  30 +-+    B***B**        ####                                 +-+
     |    B *   *     # ###                                       |
     |     B       ###D#                                          |
  20 +-+   D  ##D##                                             +-+
     |      D#                                                    |
     |    +     +     +            +            +            +    |
  10 +-+--+-----+-----+------------+------------+------------+--+-+
          1     8     16      Guest CPUs        48           64
  png: https://imgur.com/iGpGFtv

The gains are high for 4-8 CPUs. Beyond that point, however, unrelated
lock contention significantly hurts scalability.

Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Emilio G. Cota <cota@braap.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2018-06-15 08:18:48 -10:00
Emilio G. Cota
194125e3eb translate-all: protect TB jumps with a per-destination-TB lock
This applies to both user-mode and !user-mode emulation.

Instead of relying on a global lock, protect the list of incoming
jumps with tb->jmp_lock. This lock also protects tb->cflags,
so update all tb->cflags readers outside tb->jmp_lock to use
atomic reads via tb_cflags().

In order to find the destination TB (and therefore its jmp_lock)
from the origin TB, we introduce tb->jmp_dest[].

I considered not using a linked list of jumps, which simplifies
code and makes the struct smaller. However, it unnecessarily increases
memory usage, which results in a performance decrease. See for
instance these numbers booting+shutting down debian-arm:
                      Time (s)  Rel. err (%)  Abs. err (s)  Rel. slowdown (%)
------------------------------------------------------------------------------
 before                  20.88          0.74      0.154512                 0.
 after                   20.81          0.38      0.079078        -0.33524904
 GTree                   21.02          0.28      0.058856         0.67049808
 GHashTable + xxhash     21.63          1.08      0.233604          3.5919540

Using a hash table or a binary tree to keep track of the jumps
doesn't really pay off, not only due to the increased memory usage,
but also because most TBs have only 0 or 1 jumps to them. The maximum
number of jumps when booting debian-arm that I measured is 35, but
as we can see in the histogram below a TB with that many incoming jumps
is extremely rare; the average TB has 0.80 incoming jumps.

n_jumps: 379208; avg jumps/tb: 0.801099
dist: [0.0,1.0)|▄█▁▁▁▁▁▁▁▁▁▁▁ ▁▁▁▁▁▁ ▁▁▁  ▁▁▁     ▁|[34.0,35.0]

Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Emilio G. Cota <cota@braap.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2018-06-15 08:18:48 -10:00
Emilio G. Cota
faa9372c07 translate-all: introduce assert_no_pages_locked
The appended adds assertions to make sure we do not longjmp with page
locks held. Note that user-mode has nothing to check, since page_locks
are !user-mode only.

Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Emilio G. Cota <cota@braap.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2018-06-15 07:42:55 -10:00
Emilio G. Cota
0b5c91f74f translate-all: use per-page locking in !user-mode
Groundwork for supporting parallel TCG generation.

Instead of using a global lock (tb_lock) to protect changes
to pages, use fine-grained, per-page locks in !user-mode.
User-mode stays with mmap_lock.

Sometimes changes need to happen atomically on more than one
page (e.g. when a TB that spans across two pages is
added/invalidated, or when a range of pages is invalidated).
We therefore introduce struct page_collection, which helps
us keep track of a set of pages that have been locked in
the appropriate locking order (i.e. by ascending page index).

This commit first introduces the structs and the function helpers,
to then convert the calling code to use per-page locking. Note
that tb_lock is not removed yet.

While at it, rename tb_alloc_page to tb_page_add, which pairs with
tb_page_remove.

Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Emilio G. Cota <cota@braap.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2018-06-15 07:42:55 -10:00
Emilio G. Cota
1e05197f24 translate-all: iterate over TBs in a page with PAGE_FOR_EACH_TB
This commit does several things, but to avoid churn I merged them all
into the same commit. To wit:

- Use uintptr_t instead of TranslationBlock * for the list of TBs in a page.
  Just like we did in (c37e6d7e "tcg: Use uintptr_t type for
  jmp_list_{next|first} fields of TB"), the rationale is the same: these
  are tagged pointers, not pointers. So use a more appropriate type.

- Only check the least significant bit of the tagged pointers. Masking
  with 3/~3 is unnecessary and confusing.

- Introduce the TB_FOR_EACH_TAGGED macro, and use it to define
  PAGE_FOR_EACH_TB, which improves readability. Note that
  TB_FOR_EACH_TAGGED will gain another user in a subsequent patch.

- Update tb_page_remove to use PAGE_FOR_EACH_TB. In case there
  is a bug and we attempt to remove a TB that is not in the list, instead
  of segfaulting (since the list is NULL-terminated) we will reach
  g_assert_not_reached().

Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Emilio G. Cota <cota@braap.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2018-06-15 07:42:55 -10:00
Emilio G. Cota
128ed2278c tcg: move tb_ctx.tb_phys_invalidate_count to tcg_ctx
Thereby making it per-TCGContext. Once we remove tb_lock, this will
avoid an atomic increment every time a TB is invalidated.

Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Emilio G. Cota <cota@braap.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2018-06-15 07:42:55 -10:00
Emilio G. Cota
be2cdc5e35 tcg: track TBs with per-region BST's
This paves the way for enabling scalable parallel generation of TCG code.

Instead of tracking TBs with a single binary search tree (BST), use a
BST for each TCG region, protecting it with a lock. This is as scalable
as it gets, since each TCG thread operates on a separate region.

The core of this change is the introduction of struct tcg_region_tree,
which contains a pointer to a GTree and an associated lock to serialize
accesses to it. We then allocate an array of tcg_region_tree's, adding
the appropriate padding to avoid false sharing based on
qemu_dcache_linesize.

Given a tc_ptr, we first find the corresponding region_tree. This
is done by special-casing the first and last regions first, since they
might be of size != region.size; otherwise we just divide the offset
by region.stride. I was worried about this division (several dozen
cycles of latency), but profiling shows that this is not a fast path.
Note that region.stride is not required to be a power of two; it
is only required to be a multiple of the host's page size.

Note that with this design we can also provide consistent snapshots
about all region trees at once; for instance, tcg_tb_foreach
acquires/releases all region_tree locks before/after iterating over them.
For this reason we now drop tb_lock in dump_exec_info().

As an alternative I considered implementing a concurrent BST, but this
can be tricky to get right, offers no consistent snapshots of the BST,
and performance and scalability-wise I don't think it could ever beat
having separate GTrees, given that our workload is insert-mostly (all
concurrent BST designs I've seen focus, understandably, on making
lookups fast, which comes at the expense of convoluted, non-wait-free
insertions/removals).

Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Emilio G. Cota <cota@braap.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2018-06-15 07:42:55 -10:00