Commit Graph

72 Commits

Author SHA1 Message Date
Emilio G. Cota
c5a49c63fa tcg: convert tb->cflags reads to tb_cflags(tb)
Convert all existing readers of tb->cflags to tb_cflags, so that we
use atomic_read and therefore avoid undefined behaviour in C11.

Note that the remaining setters/getters of the field are protected
by tb_lock, and therefore do not need conversion.

Luckily all readers access the field via 'tb->cflags' (so no foo.cflags,
bar->cflags in the code base), which makes the conversion easily
scriptable:

FILES=$(git grep 'tb->cflags' target include/exec/gen-icount.h \
	 accel/tcg/translator.c | cut -f1 -d':' | sort | uniq)

perl -pi -e 's/([^.>])tb->cflags/$1tb_cflags(tb)/g' $FILES
perl -pi -e 's/([a-z->.]*)(->|\.)tb->cflags/tb_cflags($1$2tb)/g' $FILES

Then manually fixed the few errors that checkpatch reported.

Compile-tested for all targets.

Suggested-by: Richard Henderson <rth@twiddle.net>
Reviewed-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Emilio G. Cota <cota@braap.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2017-10-24 13:53:41 -07:00
Richard Henderson
9b990ee5a3 tcg: Add CPUState cflags_next_tb
We were generating code during tb_invalidate_phys_page_range,
check_watchpoint, cpu_io_recompile, and (seemingly) discarding
the TB, assuming that it would magically be picked up during
the next iteration through the cpu_exec loop.

Instead, record the desired cflags in CPUState so that we request
the proper TB so that there is no more magic.

Reviewed-by: Emilio G. Cota <cota@braap.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2017-10-24 13:53:41 -07:00
Emilio G. Cota
4e2ca83e71 tcg: define CF_PARALLEL and use it for TB hashing along with CF_COUNT_MASK
This will enable us to decouple code translation from the value
of parallel_cpus at any given time. It will also help us minimize
TB flushes when generating code via EXCP_ATOMIC.

Note that the declaration of parallel_cpus is brought to exec-all.h
to be able to define there the "curr_cflags" inline.

Signed-off-by: Emilio G. Cota <cota@braap.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2017-10-24 13:53:41 -07:00
David Hildenbrand
f52bfb1214 accel/tcg: allow to invalidate a write TLB entry immediately
Background: s390x implements Low-Address Protection (LAP). If LAP is
enabled, writing to effective addresses (before any translation)
0-511 and 4096-4607 triggers a protection exception.

So we have subpage protection on the first two pages of every address
space (where the lowcore - the CPU private data resides).

By immediately invalidating the write entry but allowing the caller to
continue, we force every write access onto these first two pages into
the slow path. we will get a tlb fault with the specific accessed
addresses and can then evaluate if protection applies or not.

We have to make sure to ignore the invalid bit if tlb_fill() succeeds.

Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20171016202358.3633-2-david@redhat.com>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
2017-10-20 13:32:10 +02:00
Peter Maydell
a8b392ac9a * TCG 8-byte atomic accesses bugfix (Andrew)
* Report disk rotation rate (Daniel)
 * Report invalid scsi-disk block size configuration (Mark)
 * KVM and memory API MemoryListener fixes (David, Maxime, Peter Xu)
 * x86 CPU hotplug crash fix (Igor)
 * Load/store API documentation (Peter Maydell)
 * Small fixes by myself and Thomas
 * qdev DEVICE_DELETED deferral (Michael)
 -----BEGIN PGP SIGNATURE-----
 
 iQFIBAABCAAyFiEE8TM4V0tmI4mGbHaCv/vSX3jHroMFAlnnJUgUHHBib256aW5p
 QHJlZGhhdC5jb20ACgkQv/vSX3jHroMifwf/dTZwtGqvAV4+jezCiZ3MTknz39dM
 HOGnD3m2xy04QT5LHiwDmaLFXy1y/AUVQm79JMPN4dKoFvtruREoWUq8EU0FCsLZ
 PkdCbJuXKGiBYMRXkQQxeT8lAyaBQwZdc+O9mYuOrSGZOQscA7SxgClYmzVdVzcy
 ZNTqkuaw1NDIAapdfGv94WLza4Nb8XX8bFwohgkf4mLDXifhjYHQTbBTfB0NqPxH
 Rk3HU+wgYUCJRYXpvktESgzRo5sm1aozCRq3f0Y6RV12ylgF6GG4CyN7YcKRn8eh
 NZbyehHiF5YU2kuvO9SmAB+FqM2+aMtq8uuNuI1Nxgd222MOVaChyWc3jg==
 =gmUj
 -----END PGP SIGNATURE-----

Merge remote-tracking branch 'remotes/bonzini/tags/for-upstream' into staging

* TCG 8-byte atomic accesses bugfix (Andrew)
* Report disk rotation rate (Daniel)
* Report invalid scsi-disk block size configuration (Mark)
* KVM and memory API MemoryListener fixes (David, Maxime, Peter Xu)
* x86 CPU hotplug crash fix (Igor)
* Load/store API documentation (Peter Maydell)
* Small fixes by myself and Thomas
* qdev DEVICE_DELETED deferral (Michael)

# gpg: Signature made Wed 18 Oct 2017 10:56:24 BST
# gpg:                using RSA key 0xBFFBD25F78C7AE83
# gpg: Good signature from "Paolo Bonzini <bonzini@gnu.org>"
# gpg:                 aka "Paolo Bonzini <pbonzini@redhat.com>"
# Primary key fingerprint: 46F5 9FBD 57D6 12E7 BFD4  E2F7 7E15 100C CD36 69B1
#      Subkey fingerprint: F133 3857 4B66 2389 866C  7682 BFFB D25F 78C7 AE83

* remotes/bonzini/tags/for-upstream: (29 commits)
  scsi: reject configurations with logical block size > physical block size
  qdev: defer DEVICE_DEL event until instance_finalize()
  Revert "qdev: Free QemuOpts when the QOM path goes away"
  qdev: store DeviceState's canonical path to use when unparenting
  qemu-pr-helper: use new libmultipath API
  watch_mem_write: implement 8-byte accesses
  notdirty_mem_write: implement 8-byte accesses
  memory: reuse section_from_flat_range()
  kvm: simplify kvm_align_section()
  kvm: region_add and region_del is not called on updates
  kvm: fix error message when failing to unregister slot
  kvm: tolerate non-existing slot for log_start/log_stop/log_sync
  kvm: fix alignment of ram address
  memory: call log_start after region_add
  target/i386: trap on instructions longer than >15 bytes
  target/i386: introduce x86_ld*_code
  tco: add trace events
  docs/devel/loads-stores.rst: Document our various load and store APIs
  nios2: define tcg_env
  build: remove CONFIG_LIBDECNUMBER
  ...

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2017-10-19 15:38:07 +01:00
David Hildenbrand
a6ffc4232a kvm: simplify kvm_align_section()
Use ROUND_UP and simplify the code a bit.

Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20171016144302.24284-7-david@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2017-10-18 10:15:00 +02:00
David Hildenbrand
90ed4bcc3a kvm: region_add and region_del is not called on updates
Attributes are not updated via region_add()/region_del(). Attribute changes
lead to a delete first, followed by a new add.

If this would ever not be the case, we would get an error when trying to
register the new slot.

Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20171016144302.24284-6-david@redhat.com>
Tested-by: Joe Clifford <joeclifford@gmail.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2017-10-18 10:14:52 +02:00
David Hildenbrand
1c4fdabaf7 kvm: fix error message when failing to unregister slot
"overlapping" is a leftover, let's drop it.

Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20171016144302.24284-5-david@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2017-10-18 10:14:48 +02:00
David Hildenbrand
e377e87ca6 kvm: tolerate non-existing slot for log_start/log_stop/log_sync
If we want to trap every access to a section, we might not have a
slot. So let's just tolerate if we don't have one.

Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20171016144302.24284-4-david@redhat.com>
Tested-by: Joe Clifford <joeclifford@gmail.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2017-10-18 10:14:42 +02:00
David Hildenbrand
bbfd3017eb kvm: fix alignment of ram address
Fix the wrong calculation of the delta, used to align the ram address.

This only strikes if alignment has to be done.

Reported-by: Joe Clifford <joeclifford@gmail.com>
Fixes: 5ea69c2e36 ("kvm: factor out alignment of memory section")
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20171016144302.24284-3-david@redhat.com>
Tested-by: Joe Clifford <joeclifford@gmail.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2017-10-18 10:14:35 +02:00
Richard Henderson
de258eb07d tcg: Fix off-by-one in assert in page_set_flags
Most of the users of page_set_flags offset (page, page + len) as
the end points.  One might consider this an error, since the other
users do supply an endpoint as the last byte of the region.

However, the first thing that page_set_flags does is round end UP
to the start of the next page.  Which means computing page + len - 1
is in the end pointless.  Therefore, accept this usage and do not
assert when given the exact size of the vm as the endpoint.

Signed-off-by: Richard Henderson <rth@twiddle.net>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Message-Id: <20170708025030.15845-2-rth@twiddle.net>
Signed-off-by: Riku Voipio <riku.voipio@linaro.org>
2017-10-16 16:00:56 +03:00
Emilio G. Cota
e7e168f413 exec-all: extract tb->tc_* into a separate struct tc_tb
In preparation for adding tc.size to be able to keep track of
TB's using the binary search tree implementation from glib.

Reviewed-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Emilio G. Cota <cota@braap.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2017-10-10 07:37:10 -07:00
Emilio G. Cota
6eb062abd6 translate-all: define and use DEBUG_TB_CHECK_GATE
This prevents bit rot by ensuring the debug code is compiled when
building a user-mode target.

Unfortunately the helpers are user-mode-only so we cannot fully
get rid of the ifdef checks. Add a comment to explain this.

Suggested-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Emilio G. Cota <cota@braap.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2017-10-10 07:37:10 -07:00
Emilio G. Cota
dae9e03aed translate-all: define and use DEBUG_TB_INVALIDATE_GATE
This gets rid of an ifdef check while ensuring that the debug code
is compiled, which prevents bit rot.

Suggested-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Emilio G. Cota <cota@braap.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2017-10-10 07:37:10 -07:00
Emilio G. Cota
67a5b5d2f6 exec-all: introduce TB_PAGE_ADDR_FMT
And fix the following warning when DEBUG_TB_INVALIDATE is enabled
in translate-all.c:

  CC      mipsn32-linux-user/accel/tcg/translate-all.o
/data/src/qemu/accel/tcg/translate-all.c: In function ‘tb_alloc_page’:
/data/src/qemu/accel/tcg/translate-all.c:1201:16: error: format ‘%lx’ expects argument of type ‘long unsigned int’, but argument 2 has type ‘tb_page_addr_t {aka unsigned int}’ [-Werror=format=]
         printf("protecting code page: 0x" TARGET_FMT_lx "\n",
                ^
cc1: all warnings being treated as errors
/data/src/qemu/rules.mak:66: recipe for target 'accel/tcg/translate-all.o' failed
make[1]: *** [accel/tcg/translate-all.o] Error 1
Makefile:328: recipe for target 'subdir-mipsn32-linux-user' failed
make: *** [subdir-mipsn32-linux-user] Error 2
cota@flamenco:/data/src/qemu/build ((18f3fe1...) *$)$

Reviewed-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Emilio G. Cota <cota@braap.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2017-10-10 07:37:10 -07:00
Emilio G. Cota
424079c13b translate-all: define and use DEBUG_TB_FLUSH_GATE
This gets rid of some ifdef checks while ensuring that the debug code
is compiled, which prevents bit rot.

Suggested-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Emilio G. Cota <cota@braap.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2017-10-10 07:37:10 -07:00
Emilio G. Cota
84f1c148da exec-all: bring tb->invalid into tb->cflags
This gets rid of a hole in struct TranslationBlock.

Reviewed-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Emilio G. Cota <cota@braap.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2017-10-10 07:37:10 -07:00
Emilio G. Cota
f6bb84d531 tcg: consolidate TB lookups in tb_lookup__cpu_state
This avoids duplicating code. cpu_exec_step will also use the
new common function once we integrate parallel_cpus into tb->cflags.

Note that in this commit we also fix a race, described by Richard Henderson
during review. Think of this scenario with threads A and B:

   (A) Lookup succeeds for TB in hash without tb_lock
        (B) Sets the TB's tb->invalid flag
        (B) Removes the TB from tb_htable
        (B) Clears all CPU's tb_jmp_cache
   (A) Store TB into local tb_jmp_cache

Given that order of events, (A) will keep executing that invalid TB until
another flush of its tb_jmp_cache happens, which in theory might never happen.
We can fix this by checking the tb->invalid flag every time we look up a TB
from tb_jmp_cache, so that in the above scenario, next time we try to find
that TB in tb_jmp_cache, we won't, and will therefore be forced to look it
up in tb_htable.

Performance-wise, I measured a small improvement when booting debian-arm.
Note that inlining pays off:

 Performance counter stats for 'taskset -c 0 qemu-system-arm \
	-machine type=virt -nographic -smp 1 -m 4096 \
	-netdev user,id=unet,hostfwd=tcp::2222-:22 \
	-device virtio-net-device,netdev=unet \
	-drive file=jessie.qcow2,id=myblock,index=0,if=none \
	-device virtio-blk-device,drive=myblock \
	-kernel kernel.img -append console=ttyAMA0 root=/dev/vda1 \
	-name arm,debug-threads=on -smp 1' (10 runs):

Before:
      18714.917392 task-clock                #    0.952 CPUs utilized            ( +-  0.95% )
            23,142 context-switches          #    0.001 M/sec                    ( +-  0.50% )
                 1 CPU-migrations            #    0.000 M/sec
            10,558 page-faults               #    0.001 M/sec                    ( +-  0.95% )
    53,957,727,252 cycles                    #    2.883 GHz                      ( +-  0.91% ) [83.33%]
    24,440,599,852 stalled-cycles-frontend   #   45.30% frontend cycles idle     ( +-  1.20% ) [83.33%]
    16,495,714,424 stalled-cycles-backend    #   30.57% backend  cycles idle     ( +-  0.95% ) [66.66%]
    76,267,572,582 instructions              #    1.41  insns per cycle
                                             #    0.32  stalled cycles per insn  ( +-  0.87% ) [83.34%]
    12,692,186,323 branches                  #  678.186 M/sec                    ( +-  0.92% ) [83.35%]
       263,486,879 branch-misses             #    2.08% of all branches          ( +-  0.73% ) [83.34%]

      19.648474449 seconds time elapsed                                          ( +-  0.82% )

After, w/ inline (this patch):
      18471.376627 task-clock                #    0.955 CPUs utilized            ( +-  0.96% )
            23,048 context-switches          #    0.001 M/sec                    ( +-  0.48% )
                 1 CPU-migrations            #    0.000 M/sec
            10,708 page-faults               #    0.001 M/sec                    ( +-  0.81% )
    53,208,990,796 cycles                    #    2.881 GHz                      ( +-  0.98% ) [83.34%]
    23,941,071,673 stalled-cycles-frontend   #   44.99% frontend cycles idle     ( +-  0.95% ) [83.34%]
    16,161,773,848 stalled-cycles-backend    #   30.37% backend  cycles idle     ( +-  0.76% ) [66.67%]
    75,786,269,766 instructions              #    1.42  insns per cycle
                                             #    0.32  stalled cycles per insn  ( +-  1.24% ) [83.34%]
    12,573,617,143 branches                  #  680.708 M/sec                    ( +-  1.34% ) [83.33%]
       260,235,550 branch-misses             #    2.07% of all branches          ( +-  0.66% ) [83.33%]

      19.340502161 seconds time elapsed                                          ( +-  0.56% )

After, w/o inline:
      18791.253967 task-clock                #    0.954 CPUs utilized            ( +-  0.78% )
            23,230 context-switches          #    0.001 M/sec                    ( +-  0.42% )
                 1 CPU-migrations            #    0.000 M/sec
            10,563 page-faults               #    0.001 M/sec                    ( +-  1.27% )
    54,168,674,622 cycles                    #    2.883 GHz                      ( +-  0.80% ) [83.34%]
    24,244,712,629 stalled-cycles-frontend   #   44.76% frontend cycles idle     ( +-  1.37% ) [83.33%]
    16,288,648,572 stalled-cycles-backend    #   30.07% backend  cycles idle     ( +-  0.95% ) [66.66%]
    77,659,755,503 instructions              #    1.43  insns per cycle
                                             #    0.31  stalled cycles per insn  ( +-  0.97% ) [83.34%]
    12,922,780,045 branches                  #  687.702 M/sec                    ( +-  1.06% ) [83.34%]
       261,962,386 branch-misses             #    2.03% of all branches          ( +-  0.71% ) [83.35%]

      19.700174670 seconds time elapsed                                          ( +-  0.56% )

Reviewed-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Emilio G. Cota <cota@braap.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2017-10-10 07:37:10 -07:00
Emilio G. Cota
7f11636dbe tcg: remove addr argument from lookup_tb_ptr
It is unlikely that we will ever want to call this helper passing
an argument other than the current PC. So just remove the argument,
and use the pc we already get from cpu_get_tb_cpu_state.

This change paves the way to having a common "tb_lookup" function.

Reviewed-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Emilio G. Cota <cota@braap.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2017-10-10 07:37:10 -07:00
Emilio G. Cota
841710c78e cpu-exec: rename have_tb_lock to acquired_tb_lock in tb_find
Reusing the have_tb_lock name, which is also defined in translate-all.c,
makes code reviewing unnecessarily harder.

Avoid potential confusion by renaming the local have_tb_lock variable
to something else.

Reviewed-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Emilio G. Cota <cota@braap.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2017-10-10 07:37:10 -07:00
Emilio G. Cota
13e1094735 translate-all: make have_tb_lock static
It is only used by this object, and it's not exported to any other.

Reviewed-by: Richard Henderson <rth@twiddle.net>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Emilio G. Cota <cota@braap.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2017-10-10 07:37:10 -07:00
Emilio G. Cota
0aecede612 tcg: fix corruption of code_time profiling counter upon tb_flush
Whenever there is an overflow in code_gen_buffer (e.g. we run out
of space in it and have to flush it), the code_time profiling counter
ends up with an invalid value (that is, code_time -= profile_getclock(),
without later on getting += profile_getclock() due to the goto).

Fix it by using the ti variable, so that we only update code_time
when there is no overflow. Note that in case there is an overflow
we fail to account for the elapsed coding time, but this is quite rare
so we can probably live with it.

"info jit" before/after, roughly at the same time during debian-arm bootup:

- before:
Statistics:
TB flush count      1
TB invalidate count 4665
TLB flush count     998
JIT cycles          -615191529184601 (-256329.804 s at 2.4 GHz)
translated TBs      302310 (aborted=0 0.0%)
avg ops/TB          48.4 max=438
deleted ops/TB      8.54
avg temps/TB        32.31 max=38
avg host code/TB    361.5
avg search data/TB  24.5
cycles/op           -42014693.0
cycles/in byte      -121444900.2
cycles/out byte     -5629031.1
cycles/search byte     -83114481.0
  gen_interm time   -0.0%
  gen_code time     100.0%
optim./code time    -0.0%
liveness/code time  -0.0%
cpu_restore count   6236
  avg cycles        110.4

- after:
Statistics:
TB flush count      1
TB invalidate count 4665
TLB flush count     1010
JIT cycles          1996899624 (0.832 s at 2.4 GHz)
translated TBs      297961 (aborted=0 0.0%)
avg ops/TB          48.5 max=438
deleted ops/TB      8.56
avg temps/TB        32.31 max=38
avg host code/TB    361.8
avg search data/TB  24.5
cycles/op           138.2
cycles/in byte      398.4
cycles/out byte     18.5
cycles/search byte     273.1
  gen_interm time   14.0%
  gen_code time     86.0%
optim./code time    19.4%
liveness/code time  10.3%
cpu_restore count   6372
  avg cycles        111.0

Reviewed-by: Richard Henderson <rth@twiddle.net>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Emilio G. Cota <cota@braap.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2017-10-10 07:37:10 -07:00
Emilio G. Cota
83974cf4f8 cputlb: bring back tlb_flush_count under !TLB_DEBUG
Commit f0aff0f124 ("cputlb: add assert_cpu_is_self checks") buried
the increment of tlb_flush_count under TLB_DEBUG. This results in
"info jit" always (mis)reporting 0 TLB flushes when !TLB_DEBUG.

Besides, under MTTCG tlb_flush_count is updated by several threads,
so in order not to lose counts we'd either have to use atomic ops
or distribute the counter, which is more scalable.

This patch does the latter by embedding tlb_flush_count in CPUArchState.
The global count is then easily obtained by iterating over the CPU list.

Note that this change also requires updating the accessors to
tlb_flush_count to use atomic_read/set whenever there may be conflicting
accesses (as defined in C11) to it.

Reviewed-by: Richard Henderson <rth@twiddle.net>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Emilio G. Cota <cota@braap.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2017-10-10 07:37:10 -07:00
Greg Kurz
11748ba72e kvm: check KVM_CAP_NR_VCPUS with kvm_vm_check_extension()
On a modern server-class ppc host with the following CPU topology:

Architecture:          ppc64le
Byte Order:            Little Endian
CPU(s):                32
On-line CPU(s) list:   0,8,16,24
Off-line CPU(s) list:  1-7,9-15,17-23,25-31
Thread(s) per core:    1

If both KVM PR and KVM HV loaded and we pass:

        -machine pseries,accel=kvm,kvm-type=PR -smp 8

We expect QEMU to warn that this exceeds the number of online CPUs:

Warning: Number of SMP cpus requested (8) exceeds the recommended
 cpus supported by KVM (4)
Warning: Number of hotpluggable cpus requested (8) exceeds the
 recommended cpus supported by KVM (4)

but nothing is printed...

This happens because on ppc the KVM_CAP_NR_VCPUS capability is VM
specific  ndreally depends on the KVM type, but we currently use it
as a global capability. And KVM returns a fallback value based on
KVM HV being present. Maybe KVM on POWER shouldn't presume anything
as long as it doesn't have a VM, but in all cases, we should call
KVM_CREATE_VM first and use KVM_CAP_NR_VCPUS as a VM capability.

This patch hence changes kvm_recommended_vcpus() accordingly and
moves the sanity checking of smp_cpus after the VM creation.

It is okay for the other archs that also implement KVM_CAP_NR_VCPUS,
ie, mips, s390, x86 and arm, because they don't depend on the VM
being created or not.

Signed-off-by: Greg Kurz <groug@kaod.org>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Message-Id: <150600966286.30533.10909862523552370889.stgit@bahia.lan>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2017-10-02 14:38:06 +02:00
Greg Kurz
62dd4edaaf kvm: check KVM_CAP_SYNC_MMU with kvm_vm_check_extension()
On a server-class ppc host, this capability depends on the KVM type,
ie, HV or PR. If both KVM are present in the kernel, we will always
get the HV specific value, even if we explicitely requested PR on
the command line.

This can have an impact if we're using hugepages or a balloon device.

Since we've already created the VM at the time any user calls
kvm_has_sync_mmu(), switching to kvm_vm_check_extension() is
enough to fix any potential issue.

It is okay for the other archs that also implement KVM_CAP_SYNC_MMU,
ie, mips, s390, x86 and arm, because they don't depend on the VM being
created or not.

While here, let's cache the state of this extension in a bool variable,
since it has several users in the code, as suggested by Thomas Huth.

Signed-off-by: Greg Kurz <groug@kaod.org>
Message-Id: <150600965332.30533.14702405809647835716.stgit@bahia.lan>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2017-10-02 14:38:06 +02:00
Alex Bennée
8b81253332 accel/tcg/cputlb: avoid recursive BQL (fixes #1706296)
The mmio path (see exec.c:prepare_mmio_access) already protects itself
against recursive locking and it makes sense to do the same for
io_readx/writex. Otherwise any helper running in the BQL context will
assert when it attempts to write to device memory as in the case of
the bug report.

Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
CC: Richard Jones <rjones@redhat.com>
CC: Paolo Bonzini <bonzini@gnu.org>
CC: qemu-stable@nongnu.org
Message-Id: <20170921110625.9500-1-alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2017-09-25 11:23:30 -07:00
David Hildenbrand
3110cdbd8a kvm: drop wrong assertion creating problems with pflash
pflash toggles mr->romd_mode. So this assert does not always hold.

1) a device was added with !mr->romd_mode, therefore effectively not
   creating a kvm slot as we want to trap every access (add = false).
2) mr->romd_mode was toggled on before remove it. There is now
   actually no slot to remove and the assert is wrong.

So let's just drop the assert.

Reported-by: Gerd Hoffmann <kraxel@redhat.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20170920145025.19403-1-david@redhat.com>
Tested-by: Gerd Hoffmann <kraxel@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2017-09-21 12:40:08 +02:00
Philippe Mathieu-Daudé
4c44a007b5 accel/hax: move hax-stub.c to accel/stubs/
Suggested-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Message-Id: <20170913221149.30382-1-f4bug@amsat.org>
Reviewed-by: Stefan Weil <sw@weilnetz.de>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
2017-09-19 16:20:49 +02:00
Alistair Francis
8297be80f7 Convert multi-line fprintf() to warn_report()
Convert all the multi-line uses of fprintf(stderr, "warning:"..."\n"...
to use warn_report() instead. This helps standardise on a single
method of printing warnings to the user.

All of the warnings were changed using these commands:
  find ./* -type f -exec sed -i \
    'N; {s|fprintf(.*".*warning[,:] \(.*\)\\n"\(.*\));|warn_report("\1"\2);|Ig}' \
    {} +
  find ./* -type f -exec sed -i \
    'N;N; {s|fprintf(.*".*warning[,:] \(.*\)\\n"\(.*\));|warn_report("\1"\2);|Ig}' \
    {} +
  find ./* -type f -exec sed -i \
    'N;N;N; {s|fprintf(.*".*warning[,:] \(.*\)\\n"\(.*\));|warn_report("\1"\2);|Ig}' \
    {} +
  find ./* -type f -exec sed -i \
    'N;N;N;N {s|fprintf(.*".*warning[,:] \(.*\)\\n"\(.*\));|warn_report("\1"\2);|Ig}' \
    {} +
  find ./* -type f -exec sed -i \
    'N;N;N;N;N {s|fprintf(.*".*warning[,:] \(.*\)\\n"\(.*\));|warn_report("\1"\2);|Ig}' \
    {} +
  find ./* -type f -exec sed -i \
    'N;N;N;N;N;N {s|fprintf(.*".*warning[,:] \(.*\)\\n"\(.*\));|warn_report("\1"\2);|Ig}' \
    {} +
  find ./* -type f -exec sed -i \
    'N;N;N;N;N;N;N; {s|fprintf(.*".*warning[,:] \(.*\)\\n"\(.*\));|warn_report("\1"\2);|Ig}' \
    {} +

Indentation fixed up manually afterwards.

Some of the lines were manually edited to reduce the line length to below
80 charecters. Some of the lines with newlines in the middle of the
string were also manually edit to avoid checkpatch errrors.

The #include lines were manually updated to allow the code to compile.

Several of the warning messages can be improved after this patch, to
keep this patch mechanical this has been moved into a later patch.

Signed-off-by: Alistair Francis <alistair.francis@xilinx.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Kevin Wolf <kwolf@redhat.com>
Cc: Max Reitz <mreitz@redhat.com>
Cc: "Michael S. Tsirkin" <mst@redhat.com>
Cc: Igor Mammedov <imammedo@redhat.com>
Cc: Peter Maydell <peter.maydell@linaro.org>
Cc: Stefano Stabellini <sstabellini@kernel.org>
Cc: Anthony Perard <anthony.perard@citrix.com>
Cc: Richard Henderson <rth@twiddle.net>
Cc: Eduardo Habkost <ehabkost@redhat.com>
Cc: Aurelien Jarno <aurelien@aurel32.net>
Cc: Yongbok Kim <yongbok.kim@imgtec.com>
Cc: Cornelia Huck <cohuck@redhat.com>
Cc: Christian Borntraeger <borntraeger@de.ibm.com>
Cc: Alexander Graf <agraf@suse.de>
Cc: Jason Wang <jasowang@redhat.com>
Cc: David Gibson <david@gibson.dropbear.id.au>
Cc: Gerd Hoffmann <kraxel@redhat.com>
Acked-by: Cornelia Huck <cohuck@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Message-Id: <5def63849ca8f551630c6f2b45bcb1c482f765a6.1505158760.git.alistair.francis@xilinx.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2017-09-19 14:09:34 +02:00
David Hildenbrand
67548f0965 kvm: kvm_log_sync() is only called with known memory sections
Flatview will make sure that we can only end up in this function with
memory sections that correspond to exactly one slot. So we don't
have to iterate multiple times. There won't be overlapping slots but
only matching slots.

Properly align the section and look up the corresponding slot. This
heavily simplifies this function.

We can now get rid of kvm_lookup_overlapping_slot().

Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20170911174933.20789-7-david@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2017-09-19 14:09:34 +02:00
David Hildenbrand
343562e8fa kvm: kvm_log_start/stop are only called with known sections
Let's properly align the sections first and bail out if we would ever
get called with a memory section we don't know yet.

Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20170911174933.20789-6-david@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2017-09-19 14:09:33 +02:00
David Hildenbrand
f357f564be kvm: we never have overlapping slots in kvm_set_phys_mem()
The way flatview handles memory sections, we will never have overlapping
memory sections in kvm.

address_space_update_topology_pass() will make sure that we will only
get called for

a) an existing memory section for which we only update parameters
(log_start, log_stop).
b) an existing memory section we want to delete (region_del)
c) a brand new memory section we want to add (region_add)

We cannot have overlapping memory sections in kvm as we will first remove
the overlapping sections and then add the ones without conflicts.

Therefore we can remove the complexity for handling prefix and suffix
slots.

Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20170911174933.20789-5-david@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2017-09-19 14:09:33 +02:00
David Hildenbrand
2747e71672 kvm: use start + size for memory ranges
Convert kvm_lookup_matching_slot().

Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20170911174933.20789-4-david@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2017-09-19 14:09:33 +02:00
David Hildenbrand
5ea69c2e36 kvm: factor out alignment of memory section
Factor it out, so we can reuse it later.

Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20170911174933.20789-3-david@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2017-09-19 14:09:33 +02:00
David Hildenbrand
89de4b9138 kvm: require JOIN_MEMORY_REGIONS_WORKS
We already require DESTROY_MEMORY_REGION_WORKS, JOIN_MEMORY_REGIONS_WORKS
was added just half a year later.

In addition, with flatview overlapping memory regions are first
removed before adding the changed one. So we can't really detect joining
memory regions this way.

Let's just get rid of this special handling.

Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20170911174933.20789-2-david@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2017-09-19 14:09:33 +02:00
Philippe Mathieu-Daudé
a411d29637 accel/tcg: move USER code to user-exec.c
Suggested-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Message-Id: <20170912211934.20919-1-f4bug@amsat.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2017-09-17 06:52:19 -07:00
Philippe Mathieu-Daudé
10f7d4d53d accel/tcg: move atomic_template.h to accel/tcg/
Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Tested-by: Thomas Huth <thuth@redhat.com>
Message-Id: <20170911213328.9701-5-f4bug@amsat.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2017-09-17 06:52:19 -07:00
Philippe Mathieu-Daudé
61a3f8f6c0 accel/tcg: move tcg-runtime to accel/tcg/
Suggested-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Message-Id: <20170911213328.9701-4-f4bug@amsat.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2017-09-17 06:52:19 -07:00
Philippe Mathieu-Daudé
5841066668 accel/tcg: move user-exec to accel/tcg/
Suggested-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Message-Id: <20170911213328.9701-3-f4bug@amsat.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2017-09-17 06:52:19 -07:00
Thomas Huth
da1849c1eb accel/tcg: move softmmu_template.h to accel/tcg/
The header is only used by accel/tcg/cputlb.c so we can
move it to the accel/tcg/ folder, too.

Signed-off-by: Thomas Huth <thuth@redhat.com>
[PMD: reword commit title to match series]
Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Message-Id: <20170911213328.9701-2-f4bug@amsat.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2017-09-17 06:52:19 -07:00
Richard Henderson
57a269469d tcg: Infrastructure for managing constant pools
A new shared header tcg-pool.inc.c adds new_pool_label,
for registering a tcg_target_ulong to be emitted after
the generated code, plus relocation data to install a
pointer to the data.

A new pointer is added to the TCGContext, so that we
dump the constant pool as data, not code.

Signed-off-by: Richard Henderson <rth@twiddle.net>
2017-09-07 11:57:35 -07:00
Richard Henderson
a858339336 tcg: Move USE_DIRECT_JUMP discriminator to tcg/cpu/tcg-target.h
Replace the USE_DIRECT_JUMP ifdef with a TCG_TARGET_HAS_direct_jump
boolean test.  Replace the tb_set_jmp_target1 ifdef with an unconditional
function tb_target_set_jmp_target.

While we're touching all backends, add a parameter for tb->tc_ptr;
we're going to need it shortly for some backends.

Move tb_set_jmp_target and tb_add_jump from exec-all.h to cpu-exec.c.

This opens the possibility for TCG_TARGET_HAS_direct_jump to be
a runtime decision -- based on host cpu capabilities, the size of
code_gen_buffer, or a future debugging switch.

Signed-off-by: Richard Henderson <rth@twiddle.net>
2017-09-07 11:57:34 -07:00
Lluís Vilanova
bb2e0039dc tcg: Add generic translation framework
Reviewed-by: Emilio G. Cota <cota@braap.org>
Signed-off-by: Lluís Vilanova <vilanova@ac.upc.edu>
Message-Id: <150002073981.22386.9870422422367410100.stgit@frigg.lan>
[rth: Moved max_insns adjustment from tb_start to init_disas_context.
Removed pc_next return from translate_insn.
Removed tcg_check_temp_count from generic loop.
Moved gen_io_end to exactly match gen_io_start.
Use qemu_log instead of error_report for temporary leaks.
Moved TB size/icount assignments before disas_log.]
Signed-off-by: Richard Henderson <rth@twiddle.net>
2017-09-06 08:06:47 -07:00
Peter Maydell
04e3aabde3 cputlb: Support generating CPU exceptions on memory transaction failures
Call the new cpu_transaction_failed() hook at the places where
CPU generated code interacts with the memory system:
 io_readx()
 io_writex()
 get_page_addr_code()

Any access from C code (eg via cpu_physical_memory_rw(),
address_space_rw(), ld/st_*_phys()) will *not* trigger CPU exceptions
via cpu_transaction_failed().  Handling for transactions failures for
this kind of call should be done by using a function which returns a
MemTxResult and treating the failure case appropriately in the
calling code.

In an ideal world we would not generate CPU exceptions for
instruction fetch failures in get_page_addr_code() but instead wait
until the code translation process tried a load and it failed;
however that change would require too great a restructuring and
redesign to attempt at this point.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Edgar E. Iglesias <edgar.iglesias@xilinx.com>
2017-09-04 15:21:55 +01:00
Cornelia Huck
88c725c78e kvm: remove hard dependency on pci
The msi routing code in kvm calls some pci functions: provide
some stubs to enable builds without pci.

Also, to make this more obvious, guard them via a pci_available boolean
(which also can be reused in other places).

Fixes: e1d4fb2de ("kvm-irqchip: x86: add msi route notify fn")
Fixes: 767a554a0 ("kvm-all: Pass requester ID to MSI routing functions")
Reviewed-by: Pierre Morel <pmorel@linux.vnet.ibm.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Acked-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
2017-08-30 18:23:25 +02:00
Laurent Vivier
d73f024722 accel: cleanup error output
Only emit "XXX accelerator not found", if there are not
further accelerators listed. eg

   accel=kvm:tcg

doesn't print a "KVM accelerator not found" warning
when it falls back to tcg, but a

   accel=kvm

prints a warning, since no fallback is given.

Suggested-by: Daniel P. Berrange <berrange@redhat.com>
Suggested-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Laurent Vivier <lvivier@redhat.com>
Message-Id: <20170717144527.24534-1-lvivier@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2017-08-01 17:27:33 +02:00
Vladimir Sementsov-Ogievskiy
8908eb1a4a trace-events: fix code style: print 0x before hex numbers
The only exception are groups of numers separated by symbols
'.', ' ', ':', '/', like 'ab.09.7d'.

This patch is made by the following:

> find . -name trace-events | xargs python script.py

where script.py is the following python script:
=========================
 #!/usr/bin/env python

import sys
import re
import fileinput

rhex = '%[-+ *.0-9]*(?:[hljztL]|ll|hh)?(?:x|X|"\s*PRI[xX][^"]*"?)'
rgroup = re.compile('((?:' + rhex + '[.:/ ])+' + rhex + ')')
rbad = re.compile('(?<!0x)' + rhex)

files = sys.argv[1:]

for fname in files:
    for line in fileinput.input(fname, inplace=True):
        arr = re.split(rgroup, line)
        for i in range(0, len(arr), 2):
            arr[i] = re.sub(rbad, '0x\g<0>', arr[i])

        sys.stdout.write(''.join(arr))
=========================

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Acked-by: Cornelia Huck <cohuck@redhat.com>
Message-id: 20170731160135.12101-5-vsementsov@virtuozzo.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2017-08-01 12:13:07 +01:00
Lluís Vilanova
9c489ea6be tcg: Pass generic CPUState to gen_intermediate_code()
Needed to implement a target-agnostic gen_intermediate_code()
in the future.

Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Richard Henderson <rth@twiddle.net>
Reviewed-by: Alex Benneé <alex.benee@linaro.org>
Reviewed-by: Emilio G. Cota <cota@braap.org>
Signed-off-by: Lluís Vilanova <vilanova@ac.upc.edu>
Message-Id: <150002025498.22386.18051908483085660588.stgit@frigg.lan>
Signed-off-by: Richard Henderson <rth@twiddle.net>
2017-07-19 14:45:16 -07:00
Lluís Vilanova
61a67f71dd exec: [tcg] Use different TBs according to the vCPU's dynamic tracing state
Every vCPU now uses a separate set of TBs for each set of dynamic
tracing event state values. Each set of TBs can be used by any number of
vCPUs to maximize TB reuse when vCPUs have the same tracing state.

This feature is later used by tracetool to optimize tracing of guest
code events.

The maximum number of TB sets is defined as 2^E, where E is the number
of events that have the 'vcpu' property (their state is stored in
CPUState->trace_dstate).

For this to work, a change on the dynamic tracing state of a vCPU will
force it to flush its virtual TB cache (which is only indexed by
address), and fall back to the physical TB cache (which now contains the
vCPU's dynamic tracing state as part of the hashing function).

Signed-off-by: Lluís Vilanova <vilanova@ac.upc.edu>
Reviewed-by: Richard Henderson <rth@twiddle.net>
Reviewed-by: Emilio G. Cota <cota@braap.org>
Signed-off-by: Emilio G. Cota <cota@braap.org>
Message-id: 149915775266.6295.10060144081246467690.stgit@frigg.lan
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2017-07-17 13:11:05 +01:00
Emilio G. Cota
d40d3da00c translate-all: remove redundant !tcg_enabled check in dump_exec_info
This check is redundant because it is already performed by the only
caller of dump_exec_info -- the caller was updated by b7da97eef
("monitor: Check whether TCG is enabled before running the "info jit"
code").

Checking twice wouldn't necessarily be too bad, but here the check also
returns with tb_lock held. So we can either do the check before tb_lock is
acquired, or just get rid of it. Given that it is redundant, I am going
for the latter option.

Signed-off-by: Emilio G. Cota <cota@braap.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2017-07-14 12:04:42 +02:00