Commit Graph

1030 Commits

Author SHA1 Message Date
Roman Kapl
5ad4a2b75f exec: Add missing rcu_read_unlock
rcu_read_unlock was not called if the address_space_access_valid result is
negative.

This caused (at least) a problem when qemu on PPC/E500+TAP failed to terminate
properly and instead got stuck in a deadlock.

Signed-off-by: Roman Kapl <rka@sysgo.com>
Message-Id: <20170109110921.4931-1-rka@sysgo.com>
Cc: qemu-stable@nongnu.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2017-01-16 17:52:35 +01:00
Alex Bennée
d10eb08f5d cputlb: drop flush_global flag from tlb_flush
We have never has the concept of global TLB entries which would avoid
the flush so we never actually use this flag. Drop it and make clear
that tlb_flush is the sledge-hammer it has always been.

Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Richard Henderson <rth@twiddle.net>
[DG: ppc portions]
Acked-by: David Gibson <david@gibson.dropbear.id.au>
2017-01-13 14:24:37 +00:00
Jason Wang
052c8fa998 exec: introduce address_space_get_iotlb_entry()
This patch introduces a helper to query the iotlb entry for a
possible iova. This will be used by later device IOTLB API to enable
the capability for a dataplane (e.g vhost) to query the IOTLB.

Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Peter Crosthwaite <crosthwaite.peter@gmail.com>
Cc: Richard Henderson <rth@twiddle.net>
Acked-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2017-01-10 05:56:58 +02:00
Paolo Bonzini
1f4e496e1f exec: introduce MemoryRegionCache
Device models often have to perform multiple access to a single
memory region that is known in advance, but would to use "DMA-style"
functions instead of address_space_map/unmap.  This can happen
for example when the data has to undergo endianness conversion.
Introduce a new data structure to cache the result of
address_space_translate without forcing usage of a host address
like address_space_map does.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-12-22 16:00:23 +01:00
Paolo Bonzini
715c31ec8e exec: introduce address_space_extend_translation
This extracts the common part of address_space_map and
address_space_cache_init into a new function.

Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-12-22 16:00:23 +01:00
Paolo Bonzini
0ce265ffef exec: introduce memory_ldst.inc.c
Templatize the address_space_* and *_phys functions, so that we can add
similar functions in the next patch that work with a lightweight,
cache-like version of address_space_map/unmap.

Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-12-22 16:00:23 +01:00
Paolo Bonzini
2651efe7f5 exec: optimize remaining address_space_* cases
Do them right before the next patch generalizes them into a multi-included
file.

Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-12-22 16:00:22 +01:00
Peter Maydell
a9353fe897 exec.c: Fix breakpoint invalidation race
A bug (1647683) was reported showing a crash when removing
breakpoints.  The reproducer was bisected to 3359baad when tb_flush
was finally made thread safe.  While in MTTCG the locking in
breakpoint_invalidate would have prevented any problems, but
currently tb_lock() is a NOP for system emulation.

The race is between a tb_flush from the gdbstub and the
tb_invalidate_phys_addr() in breakpoint_invalidate().

Ideally we'd have actual locking here; for the moment the
simple fix is to do a full tb_flush() for a bp invalidate,
since that is thread-safe even if no lock is taken.

Reported-by: Julian Brown <julian@codesourcery.com>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 1481047629-7763-1-git-send-email-peter.maydell@linaro.org
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2016-12-06 20:21:46 +00:00
Stefan Hajnoczi
199a5bde46 * NBD bugfix (Changlong)
* NBD write zeroes support (Eric)
 * Memory backend fixes (Haozhong)
 * Atomics fix (Alex)
 * New AVX512 features (Luwei)
 * "make check" logging fix (Paolo)
 * Chardev refactoring fallout (Paolo)
 * Small checkpatch improvements (Paolo, Jeff)
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2
 
 iQExBAABCAAbBQJYGaRPFBxwYm9uemluaUByZWRoYXQuY29tAAoJEL/70l94x66D
 XKgH/RgNtosBTqJsmphkS7wACFAFOf7Uq46ajoKfB66Pt1J/++pFQg4TApPYkb7j
 KlKeKmXa7hb6+Jg8325H4zGkGno4kn2dE+OnznaB1xPKwiZVAMQVzQsagsEVqpno
 k/5PBVRptIiuHQKyU29Go0CxbWJBTH0O14S7rDK4YDF0YMnuT280HQOI3jdu1igV
 G/Q+CMgfk+yXf6GWHE8Z9sNq7n0ha8qgruA/X3NC7+pAvEsUcAP065zwLp9weYuK
 W1MU68L7Ub4tRo0SVf1HFkDUNdMv4T4hg+wpGe1GwthJWexHu9x0YAQBy60ykJb6
 NtHwjLwCUWtm7AiZD/btsOJPmjk=
 =+Dt/
 -----END PGP SIGNATURE-----

Merge remote-tracking branch 'remotes/bonzini/tags/for-upstream' into staging

* NBD bugfix (Changlong)
* NBD write zeroes support (Eric)
* Memory backend fixes (Haozhong)
* Atomics fix (Alex)
* New AVX512 features (Luwei)
* "make check" logging fix (Paolo)
* Chardev refactoring fallout (Paolo)
* Small checkpatch improvements (Paolo, Jeff)

# gpg: Signature made Wed 02 Nov 2016 08:31:11 AM GMT
# gpg:                using RSA key 0xBFFBD25F78C7AE83
# gpg: Good signature from "Paolo Bonzini <bonzini@gnu.org>"
# gpg:                 aka "Paolo Bonzini <pbonzini@redhat.com>"
# Primary key fingerprint: 46F5 9FBD 57D6 12E7 BFD4  E2F7 7E15 100C CD36 69B1
#      Subkey fingerprint: F133 3857 4B66 2389 866C  7682 BFFB D25F 78C7 AE83

* remotes/bonzini/tags/for-upstream: (30 commits)
  main-loop: Suppress I/O thread warning under qtest
  docs/rcu.txt: Fix minor typo
  vl: exit qemu on guest panic if -no-shutdown is not set
  checkpatch: allow spaces before parenthesis for 'coroutine_fn'
  x86: add AVX512_4VNNIW and AVX512_4FMAPS features
  slirp: fix CharDriver breakage
  qemu-char: do not forward events through the mux until QEMU has started
  nbd: Implement NBD_CMD_WRITE_ZEROES on client
  nbd: Implement NBD_CMD_WRITE_ZEROES on server
  nbd: Improve server handling of shutdown requests
  nbd: Refactor conversion to errno to silence checkpatch
  nbd: Support shorter handshake
  nbd: Less allocation during NBD_OPT_LIST
  nbd: Let client skip portions of server reply
  nbd: Let server know when client gives up negotiation
  nbd: Share common option-sending code in client
  nbd: Send message along with server NBD_REP_ERR errors
  nbd: Share common reply-sending code in server
  nbd: Rename struct nbd_request and nbd_reply
  nbd: Rename NbdClientSession to NBDClientSession
  ...

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2016-11-03 16:32:30 +00:00
Haozhong Zhang
1775f111ea exec.c: check memory backend file size with 'size' option
If the memory backend file is not large enough to hold the required 'size',
Qemu will report error and exit.

Signed-off-by: Haozhong Zhang <haozhong.zhang@intel.com>
Message-Id: <20161027042300.5929-3-haozhong.zhang@intel.com>
Reviewed-by: Eduardo Habkost <ehabkost@redhat.com>
Message-Id: <20161102010551.2723-1-haozhong.zhang@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-11-02 09:28:51 +01:00
Richard Henderson
1ee73216f4 log: Add locking to large logging blocks
Reuse the existing locking provided by stdio to keep in_asm, cpu,
op, op_opt, op_ind, and out_asm as contiguous blocks.

While it isn't possible to interleave e.g. in_asm or op_opt logs
because of the TB lock protecting all code generation, it is
possible to interleave cpu logs, or to interleave a cpu dump with
an out_asm dump.

For mingw32, we appear to have no viable solution for this.  The locking
functions are not properly exported from the system runtime library.

Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Richard Henderson <rth@twiddle.net>
2016-11-01 10:29:03 -06:00
Haozhong Zhang
d6af99c9f8 exec.c: do not truncate non-empty memory backend file
For '-object memory-backend-file,mem-path=foo,size=xyz', if the size of
file 'foo' does not match the given size 'xyz', the current QEMU will
truncate the file to the given size, which may corrupt the existing data
in that file. To avoid such data corruption, this patch disables
truncating non-empty backend files.

Signed-off-by: Haozhong Zhang <haozhong.zhang@intel.com>
Message-Id: <20161027042300.5929-2-haozhong.zhang@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-11-01 16:06:57 +01:00
Alex Bennée
f35e44e764 exec.c: ensure all AddressSpaceDispatch updates under RCU
The memory_dispatch field is meant to be protected by RCU so we should
use the correct primitives when accessing it. This race was flagged up
by the ThreadSanitizer.

Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Message-Id: <20161021153418.21571-1-alex.bennee@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-11-01 16:06:57 +01:00
Alex Bennée
ba051fb5e5 tcg: move locking for tb_invalidate_phys_page_range up
In the linux-user case all things that involve ''l1_map' and PageDesc
tweaks are protected by the memory lock (mmpa_lock). For SoftMMU mode
we previously relied on single threaded behaviour, with MTTCG we now use
the tb_lock().

As a result we need to do a little re-factoring  and push the taking of
this lock up the call tree. This requires a slightly different entry for
the SoftMMU and user-mode cases from tb_invalidate_phys_range.

This also means user-mode breakpoint insertion needs to take two locks
but it hadn't taken any previously so this is an improvement.

Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Message-Id: <20161027151030.20863-20-alex.bennee@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-10-31 15:00:25 +01:00
KONRAD Frederic
a5e998262f tcg: protect translation related stuff with tb_lock.
This protects all translation related work with tb_lock() too ensure
thread safety. This effectively serialises all code generation. In
addition to the code generation we also take the lock for TB
invalidation. This has a knock on effect of meaning tb_lock() is held
for modification of the SoftMMU TLB by non-self threads which will be
used in later patches.

Signed-off-by: KONRAD Frederic <fred.konrad@greensocs.com>
Message-Id: <1439220437-23957-8-git-send-email-fred.konrad@greensocs.com>
Signed-off-by: Emilio G. Cota <cota@braap.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
[AJB: moved into tree, clean-up history]
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Richard Henderson <rth@twiddle.net>
Message-Id: <20161027151030.20863-10-alex.bennee@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-10-31 10:51:16 +01:00
Richard Henderson
258dfaaad0 exec: Avoid direct references to Int128 parts
Reviewed-by: Emilio G. Cota <cota@braap.org>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <rth@twiddle.net>
2016-10-26 08:29:00 -07:00
Peter Maydell
c43e853afe x86 and CPU queue, 2016-10-24
x2APIC support to APIC code, cpu_exec_init() refactor on all
 architectures, and other x86 changes.
 -----BEGIN PGP SIGNATURE-----
 
 iQIcBAABCAAGBQJYDmYyAAoJECgHk2+YTcWmoSUP/2ga+b9YmPuyL7XC+12pff0I
 Z8gdjUzbMUNcCI0JMZCTGUJbs3BapLcnsA7ypmt88s9kG02WeDMhNx1BfYiAFgLU
 kPLQlXAM7awEdGagd3sTCiFojSUZ7GxYHjd5fuhPoOAXvXM8im6zJl18ZcsnStjO
 /J8JGoGDHq1XJlz+RIjnGamojJWCiO/+iiD+rFmVSic8zjHPDYq14sIk/QJX+DaF
 azLiOI6DAlX3kyrN5ZshhIRQ3COzzUMUSDF/ZaYHjudUco5MBnwj/oLQniTq+ZUd
 hCu7dr5TpLxI7q1yltyd0UIl/+aZGbE8tEvoXAtc735iK4m2CTckT7ql6x3xI+Ir
 PmpPgIswHqfCiCXm8imLj6ZI47kRA1x4x4AudLaNVKP7jO82485sS9HWpOadYsaU
 jvek2SqfqvH+vce4FzwlLEcXGDb73MT/XkIUvd7SfPIbs9umgdZc03U4SHfAWr0i
 lAIRs4Ym0AAS2WSE4E09wvdUUr9oxaQBMhw3JAiNmg7hLfyINTP+D/IhtlAVXXEA
 F9D7fky5lDwfKvIwPxPJbDD5bCBV9AmxhiahIhv3epu4Kg4orf1inkrx0IZWSbB0
 7+JZ7j8asuizfibkeZAN9rxVwmz32makJNsnjzZHlnaPxTvIDzvRkNceBnhC5vKq
 3yfxgl4agXmMjveraAtt
 =T2kg
 -----END PGP SIGNATURE-----

Merge remote-tracking branch 'remotes/ehabkost/tags/x86-pull-request' into staging

x86 and CPU queue, 2016-10-24

x2APIC support to APIC code, cpu_exec_init() refactor on all
architectures, and other x86 changes.

# gpg: Signature made Mon 24 Oct 2016 20:51:14 BST
# gpg:                using RSA key 0x2807936F984DC5A6
# gpg: Good signature from "Eduardo Habkost <ehabkost@redhat.com>"
# Primary key fingerprint: 5A32 2FD5 ABC4 D3DB ACCF  D1AA 2807 936F 984D C5A6

* remotes/ehabkost/tags/x86-pull-request:
  exec: call cpu_exec_exit() from a CPU unrealize common function
  exec: move cpu_exec_init() calls to realize functions
  exec: split cpu_exec_init()
  pc: q35: Bump max_cpus to 288
  pc: Require IRQ remapping and EIM if there could be x2APIC CPUs
  pc: Add 'etc/boot-cpus' fw_cfg file for machine with more than 255 CPUs
  Increase MAX_CPUMASK_BITS from 255 to 288
  pc: Clarify FW_CFG_MAX_CPUS usage comment
  pc: kvm_apic: Pass APIC ID depending on xAPIC/x2APIC mode
  pc: apic_common: Reset APIC ID to initial ID when switching into x2APIC mode
  pc: apic_common: Restore APIC ID to initial ID on reset
  pc: apic_common: Extend APIC ID property to 32bit
  pc: Leave max apic_id_limit only in legacy cpu hotplug code
  acpi: cphp: Force switch to modern cpu hotplug if APIC ID > 254
  pc: acpi: x2APIC support for SRAT table
  pc: acpi: x2APIC support for MADT table and _MAT method

Conflicts:
	target-arm/cpu.c

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2016-10-25 10:25:27 +01:00
Laurent Vivier
7bbc124e7e exec: call cpu_exec_exit() from a CPU unrealize common function
As cpu_exec_exit() mirrors the cpu_exec_realizefn(),
rename it as cpu_exec_unrealizefn().

Create and register a cpu_common_unrealizefn() function for
the CPU device class and call cpu_exec_unrealizefn() from
this function.

Remove cpu_exec_exit() from cpu_common_finalize()
(which mirrors init, not realize), and as x86_cpu_unrealizefn()
and ppc_cpu_unrealizefn() overwrite the device class unrealize function,
add a call to a parent_unrealize pointer.

Signed-off-by: Laurent Vivier <lvivier@redhat.com>
Reviewed-by: Eduardo Habkost <ehabkost@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
2016-10-24 17:29:16 -02:00
Laurent Vivier
ce5b1bbf62 exec: move cpu_exec_init() calls to realize functions
Modify all CPUs to call it from XXX_cpu_realizefn() function.

Remove all the cannot_destroy_with_object_finalize_yet as
unsafe references have been moved to cpu_exec_realizefn().
(tested with QOM command provided by commit 4c315c27)

for arm:

Setting of cpu->mp_affinity is moved from arm_cpu_initfn()
to arm_cpu_realizefn() as setting of cpu_index is now done
in cpu_exec_realizefn(). To avoid to overwrite an user defined
value, we set it to an invalid value by default, and update
it in realize function only if the value is still invalid.

Signed-off-by: Laurent Vivier <lvivier@redhat.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Eduardo Habkost <ehabkost@redhat.com>
Reviewed-by: Andrew Jones <drjones@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
2016-10-24 17:29:16 -02:00
Laurent Vivier
39e329e341 exec: split cpu_exec_init()
Put in cpu_exec_initfn() what initializes the CPU,
and leave in cpu_exec_init() what adds it to the environment.

As cpu_exec_initfn() is called by all XX_cpu_initfn(), call it
directly in cpu_common_initfn().
cpu_exec_init() is now a realize function, it will be renamed
to cpu_exec_realizefn() and moved to the XX_cpu_realizefn()
function in a following patch.

Signed-off-by: Laurent Vivier <lvivier@redhat.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Eduardo Habkost <ehabkost@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
2016-10-24 17:29:16 -02:00
Peter Maydell
20bccb82ff cpu: Support a target CPU having a variable page size
Support target CPUs having a page size which isn't knownn
at compile time. To use this, the CPU implementation should:
 * define TARGET_PAGE_BITS_VARY
 * not define TARGET_PAGE_BITS
 * define TARGET_PAGE_BITS_MIN to the smallest value it
   might possibly want for TARGET_PAGE_BITS
 * call set_preferred_target_page_bits() in its realize
   function to indicate the actual preferred target page
   size for the CPU (and report any error from it)

In CONFIG_USER_ONLY, the CPU implementation should continue
to define TARGET_PAGE_BITS appropriately for the guest
OS page size.

Machines which want to take advantage of having the page
size something larger than TARGET_PAGE_BITS_MIN must
set the MachineClass minimum_page_bits field to a value
which they guarantee will be no greater than the preferred
page size for any CPU they create.

Note that changing the target page size by setting
minimum_page_bits is a migration compatibility break
for that machine.

For debugging purposes, attempts to use TARGET_PAGE_SIZE
before it has been finally confirmed will assert.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <rth@twiddle.net>
2016-10-24 16:26:49 +01:00
Vijaya Kumar K
2615fabd42 exec.c: Remove static allocation of sub_section of sub_page
Allocate sub_section dynamically. Remove dependency
on TARGET_PAGE_SIZE to make run-time page size detection
for arm platforms.

Signed-off-by: Vijaya Kumar K <vijayak@cavium.com>
Message-id: 1465808915-4887-3-git-send-email-vijayak@caviumnetworks.com
[PMM: use flexible array member rather than separate malloc
 so we don't need an extra pointer deref when using it]
Reviewed-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2016-10-24 16:26:49 +01:00
Haozhong Zhang
8360668e69 exec.c: workaround regression caused by alignment change in d2f39ad
Commit d2f39ad "exec.c: Ensure right alignment also for file backed ram"
added an additional alignment requirement on the size of backend file
besides the previous page size. On x86, the alignment is changed from
4KB in QEMU 2.6 to 2MB in QEMU 2.7.

This change breaks certain usages in QEMU 2.7 on x86, e.g.
    -object memory-backend-file,id=mem1,mem-path=/tmp/,size=$SZ
    -device pc-dimm,id=dimm1,memdev=mem1
where $SZ is multiple of 4KB but not 2MB (e.g. 1023M). QEMU 2.7
reports the following error message and aborts:
qemu-system-x86_64: -device pc-dimm,memdev=mem1,id=nv1: backend memory size must be multiple of 0x200000

The same regression may also happen in other platforms as indicated by
Igor Mammedov. This change is however necessary for s390 according to
the commit message of d2f39ad, so we workaround the regression by taking
the change only on s390.

Signed-off-by: Haozhong Zhang <haozhong.zhang@intel.com>
Reported-by: "Xu, Anthony" <anthony.xu@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-10-24 15:46:11 +02:00
Dr. David Alan Gilbert
863e9621c5 RAMBlocks: Store page size
Store the page size in each RAMBlock, we need it later.

Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
2016-10-13 17:23:53 +02:00
Marc-André Lureau
efee678d6d exec: remove unused compacted argument
Since commit b35ba30f8f when it was introduced, phys_page_compact()
takes an unused compacted argument.

ubsan complains about it when launching qemu-x86_64 without arguments:
qemu/exec.c:310:5: runtime error: variable length array bound evaluates to non-positive value 0

Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2016-10-08 11:25:29 +03:00
Paolo Bonzini
267f685b8b cpus-common: move CPU list management to common code
Add a mutex for the CPU list to system emulation, as it will be used to
manage safe work.  Abstract manipulation of the CPU list in new functions
cpu_list_add and cpu_list_remove.

Reviewed-by: Richard Henderson <rth@twiddle.net>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-09-27 11:57:29 +02:00
Cao jin
c2cd627ddb kvm-all: drop kvm_setup_guest_memory
kvm_setup_guest_memory only does "madvise to QEMU_MADV_DONTFORK" and
is only called by ram_block_add, which actually is duplicate code.
Bonus: add simple comment for kvm_has_sync_mmu to make life easier.

Suggested-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Cao jin <caoj.fnst@cn.fujitsu.com>
Message-Id: <1473662096-32598-1-git-send-email-caoj.fnst@cn.fujitsu.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-09-13 19:09:43 +02:00
Igor Mammedov
3b8c1761f0 qtail: clean up direct access to tqe_prev field
instead of accessing tqe_prev field dircetly outside
of queue.h use macros to check if element is in list
and make sure that afer element is removed from list
tqe_prev field could be used to do the same check.

Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Message-Id: <1469450832-84343-1-git-send-email-imammedo@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-09-13 19:08:41 +02:00
Igor Mammedov
630eb0faf4 exec: Ensure the only one cpu_index allocation method is used
Make sure that cpu_index auto allocation isn't used in
combination with manual cpu_index assignment. And
dissallow out of order cpu removal if auto allocation
is in use.

Target that wishes to support out of order unplug should
switch to manual cpu_index assignment. Following patch
could be used as an example:
 (pc: init CPUState->cpu_index with index in possible_cpus[]))

Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
2016-09-05 16:01:55 -03:00
Igor Mammedov
056b68af77 fix qemu exit on memory hotplug when allocation fails at prealloc time
When adding hostmem backend at runtime, QEMU might exit with error:
  "os_mem_prealloc: Insufficient free host memory pages available to allocate guest RAM"

It happens due to os_mem_prealloc() not handling errors gracefully.

Fix it by passing errp argument so that os_mem_prealloc() could
report error to callers and undo performed allocation when
os_mem_prealloc() fails.

Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Message-Id: <1469008443-72059-1-git-send-email-imammedo@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-08-02 12:03:58 +02:00
Igor Mammedov
a07f953ef4 exec: Set cpu_index only if it's not been explictly set
It keeps the legacy behavior for all users that doesn't care
about stable cpu_index value, but would allow boards that
would support device_add/device_del to set stable cpu_index
that won't depend on order in which cpus are created/destroyed.

While at that simplify cpu_get_free_index() as cpu_index
generated by USER_ONLY and softmmu variants is the same
since none of the users support cpu-remove so far, except
of not yet released spapr/x86 device_add/delr, which
will be altered by follow up patches to set stable
cpu_index manually.

Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
2016-07-26 15:32:01 -03:00
Igor Mammedov
8b1b835035 exec: Don't use cpu_index to detect if cpu_exec_init()'s been called
Instead use QTAIL's tqe_prev field to detect if cpu's been
placed in list by cpu_exec_init() which is always set if
QTAIL element is in list.

Fixes SIGSEGV on failure path in case cpu_index is assigned
by board and cpu.relalize() fails before cpu_exec_init() is called.

In follow up patches, cpu_index will be assigned by boards that
support cpu hot(un)plug and need stable cpu_index that doesn't
depend on order cpus are created/removed.

Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Reported-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
2016-07-26 15:32:00 -03:00
Igor Mammedov
1bc7e522d9 exec: Reduce CONFIG_USER_ONLY ifdeffenery
Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
2016-07-26 15:31:58 -03:00
Peter Lieven
101420b886 exec: avoid realloc in phys_map_node_reserve
this is the first step in reducing the brk heap fragmentation
created by the map->nodes memory allocation. Since the introduction
of RCU the freeing of the PhysPageMaps is delayed so that sometimes
several hundred are allocated at the same time.

Even worse the memory for map->nodes is allocated and shortly
afterwards reallocated. Since the number of nodes it grows
to in the end is the same for all PhysPageMaps remember this value
and at least avoid the reallocation.

The large number of simultaneous allocations (about 450 x 70kB in
my configuration) has to be addressed later.

Signed-off-by: Peter Lieven <pl@kamp.de>
Message-Id: <1468577030-21097-1-git-send-email-pl@kamp.de>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-07-17 09:59:21 +02:00
Markus Armbruster
a9c94277f0 Use #include "..." for our own headers, <...> for others
Tracked down with an ugly, brittle and probably buggy Perl script.

Also move includes converted to <...> up so they get included before
ours where that's obviously okay.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Tested-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Richard Henderson <rth@twiddle.net>
2016-07-12 16:19:16 +02:00
Paolo Bonzini
02d0e09503 os-posix: include sys/mman.h
qemu/osdep.h checks whether MAP_ANONYMOUS is defined, but this check
is bogus without a previous inclusion of sys/mman.h.  Include it in
sysemu/os-posix.h and remove it from everywhere else.

Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-06-16 18:39:03 +02:00
Anthony PERARD
d6b6aec409 exec: Fix qemu_ram_block_from_host for Xen
Since f615f39 (exec: remove ram_addr argument from
qemu_ram_block_from_host), migration under Xen is likely to fail, with a
SEGV of QEMU. But the commit only reveal a bug with the calculation of
the offset value in qemu_ram_block_from_host().

This patch calculates the offset from the ram_addr as
qemu_ram_addr_from_host() will later calculate the ram_addr from the
offset.

Signed-off-by: Anthony PERARD <anthony.perard@citrix.com>
Acked-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Stefano Stabellini <sstabellini@kernel.org>
2016-06-13 11:50:20 +01:00
Peter Maydell
6886b98036 cpu-exec: Rename cpu_resume_from_signal() to cpu_loop_exit_noexc()
The function cpu_resume_from_signal() is now always called with a
NULL puc argument, and is rather misnamed since it is never called
from a signal handler. It is essentially forcing an exit to the
top level cpu loop but without raising any exception, so rename
it to cpu_loop_exit_noexc() and drop the useless unused argument.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Sergey Fedorov <sergey.fedorov@linaro.org>
Acked-by: Eduardo Habkost <ehabkost@redhat.com>
Acked-by: Riku Voipio <riku.voipio@linaro.org>
Message-id: 1463494687-25947-4-git-send-email-peter.maydell@linaro.org
2016-06-09 15:55:02 +01:00
Peter Maydell
500acc9c41 ppc patch queue for 2016-05-31
Here's another ppc patch queue.  This batch is all preliminaries
 towards two significant features:
 
 1) Full hypervisor-mode support for POWER8
     Patches 1-8 start fixing various bugs with TCG's handling of
     hypervisor mode
 
 2) CPU hotplug support
     Patches 9-12 make some preliminary fixes towards implementing CPU
     hotplug on ppc64 (and other non-x86 platforms).  These patches are
     actually to generic code, not ppc, but are included here with
     Paolo's ACK.
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQIcBAABAgAGBQJXTN1QAAoJEGw4ysog2bOSM4kP/2TKm/wkGo3nsGm7vG0CArs+
 JVIlTWI9Le7Cq5ijCkTwV9gjeG2CYz+Us2PCh2ZAoHpXgZtP7px2HRcDv07SbCnt
 SaCwCS+EGf3ZO9baQrzG0zfe8XrlJF+XXTejD2zWtOZw7sZ/4OPWF9KdcZbjWqFp
 PzJuXrpYOAaIyXyEPJSZFpHY+AC9NIblqHlUrKntPLLOYbqQBYP4IMxsUmOgu2IX
 rFK/5A8t20BJN0lbmx8JNKh0voorFpHY/hhaH/1T7rKxsRkKMh3VbYSxD6EYs3Uc
 nZ4ufQQW6C4CEFta3YHNwoClcsQUbnZQh3Ra+gKo9bXvqDzasVpq/mBgl3BDjGeG
 LQPSA6sfmEA8lqtRikVdgSgdXDnwy5YXJLVmIXeAIG1KHa6eRuUxC3o+ScOkcH3A
 ynLglCEBl9slsG9/yYkDcFW2u0t/txTBUvaxMfOQomAejrGjOLGZqnSWMd2UC4Gt
 KQRP+b7igkXC7+bfrpBPWKyYGvOOCukESw3OV90hLBIzOthI1dI5hO0Gj61C/nlI
 NXMRbx0qTztgj3tfSTTs6e9Ke8PEKnyXqgol0+t9Yxntlz28f2alTubUoyv/vZjx
 8J1IOlNms3PnO26TxUVBu7/KaCGCM25eTQgllbTx8rhaqin3wAH3dRc1RTlWWhwJ
 SgADl+MWf8sa7DkcxnZa
 =vAIf
 -----END PGP SIGNATURE-----

Merge remote-tracking branch 'remotes/dgibson/tags/ppc-for-2.7-20160531' into staging

ppc patch queue for 2016-05-31

Here's another ppc patch queue.  This batch is all preliminaries
towards two significant features:

1) Full hypervisor-mode support for POWER8
    Patches 1-8 start fixing various bugs with TCG's handling of
    hypervisor mode

2) CPU hotplug support
    Patches 9-12 make some preliminary fixes towards implementing CPU
    hotplug on ppc64 (and other non-x86 platforms).  These patches are
    actually to generic code, not ppc, but are included here with
    Paolo's ACK.

# gpg: Signature made Tue 31 May 2016 01:39:44 BST using RSA key ID 20D9B392
# gpg: Good signature from "David Gibson <david@gibson.dropbear.id.au>"
# gpg:                 aka "David Gibson (Red Hat) <dgibson@redhat.com>"
# gpg:                 aka "David Gibson (ozlabs.org) <dgibson@ozlabs.org>"
# gpg: WARNING: This key is not certified with sufficiently trusted signatures!
# gpg:          It is not certain that the signature belongs to the owner.
# Primary key fingerprint: 75F4 6586 AE61 A66C C44E  87DC 6C38 CACA 20D9 B392

* remotes/dgibson/tags/ppc-for-2.7-20160531:
  cpu: Add a sync version of cpu_remove()
  cpu: Reclaim vCPU objects
  exec: Do vmstate unregistration from cpu_exec_exit()
  exec: Remove cpu from cpus list during cpu_exec_exit()
  ppc: Add PPC_64H instruction flag to POWER7 and POWER8
  ppc: Get out of emulation on SMT "OR" ops
  ppc: Fix sign extension issue in mtmsr(d) emulation
  ppc: Change 'invalid' bit mask of tlbiel and tlbie
  ppc: tlbie, tlbia and tlbisync are HV only
  ppc: Do some batching of TCG tlb flushes
  ppc: Use split I/D mmu modes to avoid flushes on interrupts
  ppc: Remove MMU_MODEn_SUFFIX definitions

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2016-05-31 10:37:22 +01:00
Bharata B Rao
9dfeca7c6b exec: Do vmstate unregistration from cpu_exec_exit()
cpu_exec_init() does vmstate_register for the CPU device. This needs to be
undone from cpu_exec_exit(). This change is needed to support CPU hot
removal.

Signed-off-by: Bharata B Rao <bharata@linux.vnet.ibm.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Acked-by: Paolo Bonzini <pbonzini@redhat.com>
[dwg: added missing include to fix compile on some archs]
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2016-05-30 14:03:29 +10:00
Bharata B Rao
1c59eb39cf exec: Remove cpu from cpus list during cpu_exec_exit()
CPUState *cpu gets added to the cpus list during cpu_exec_init(). It
should be removed from cpu_exec_exit().

cpu_exec_exit() is called from generic CPU::instance_finalize and some
archs like PowerPC call it from CPU unrealizefn. So ensure that we
dequeue the cpu only once.

Now -1 value for cpu->cpu_index indicates that we have already dequeued
the cpu for CONFIG_USER_ONLY case also.

Signed-off-by: Bharata B Rao <bharata@linux.vnet.ibm.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Acked-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2016-05-30 13:22:20 +10:00
Paolo Bonzini
0878d0e11b exec: hide mr->ram_addr from qemu_get_ram_ptr users
Let users of qemu_get_ram_ptr and qemu_ram_ptr_length pass in an
address that is relative to the MemoryRegion.  This basically means
what address_space_translate returns.

Because the semantics of the second parameter change, rename the
function to qemu_map_ram_ptr.

Reviewed-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-05-29 09:11:12 +02:00
Paolo Bonzini
07bdaa4196 memory: split memory_region_from_host from qemu_ram_addr_from_host
Move the old qemu_ram_addr_from_host to memory_region_from_host and
make it return an offset within the region.  For qemu_ram_addr_from_host
return the ram_addr_t directly, similar to what it was before
commit 1b5ec23 ("memory: return MemoryRegion from qemu_ram_addr_from_host",
2013-07-04).

Reviewed-by: Marc-André Lureau <marcandre.lureau@gmail.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-05-29 09:11:12 +02:00
Paolo Bonzini
f615f39616 exec: remove ram_addr argument from qemu_ram_block_from_host
Of the two callers, one does not use it, and the other can compute
it itself based on the other output argument (offset) and the RAMBlock.

Reviewed-by: Marc-André Lureau <marcandre.lureau@gmail.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-05-29 09:11:12 +02:00
Paolo Bonzini
4ff87573df memory: remove qemu_get_ram_fd, qemu_set_ram_fd, qemu_ram_block_host_ptr
Remove direct uses of ram_addr_t and optimize memory_region_{get,set}_fd
now that a MemoryRegion knows its RAMBlock directly.

Reviewed-by: Marc-André Lureau <marcandre.lureau@gmail.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-05-29 09:11:12 +02:00
Paolo Bonzini
e4e697940d memory: remove unnecessary masking of MemoryRegion ram_addr
mr->ram_block->offset is already aligned to both host and target size
(see qemu_ram_alloc_internal).  Remove further masking as it is
unnecessary.

Reviewed-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-05-23 16:53:45 +02:00
Gonglei
ab0a995608 exec: adjust rcu_read_lock requirement
qemu_ram_unset_idstr() doesn't need rcu lock anymore,
meanwhile make the range of rcu lock in
qemu_ram_set_idstr() as small as possible.

Signed-off-by: Gonglei <arei.gonglei@huawei.com>
Message-Id: <1462845901-89716-3-git-send-email-arei.gonglei@huawei.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-05-23 16:53:44 +02:00
Gonglei
fa53a0e53e memory: drop find_ram_block()
On the one hand, we have already qemu_get_ram_block() whose function
is similar. On the other hand, we can directly use mr->ram_block but
searching RAMblock by ram_addr which is a kind of waste.

Signed-off-by: Gonglei <arei.gonglei@huawei.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Message-Id: <1462845901-89716-2-git-send-email-arei.gonglei@huawei.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-05-23 16:53:44 +02:00
Dominik Dingel
d2f39add72 exec.c: Ensure right alignment also for file backed ram
While in the anonymous ram case we already take care of the right alignment
such an alignment gurantee does not exist for file backed ram allocation.

Instead, pagesize is used for alignment. On s390 this is not enough for gmap,
as we need to satisfy an alignment up to segments.

Reported-by: Halil Pasic <pasic@linux.vnet.ibm.com>
Signed-off-by: Dominik Dingel <dingel@linux.vnet.ibm.com>

Message-Id: <1461585338-45863-1-git-send-email-dingel@linux.vnet.ibm.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-05-23 16:53:42 +02:00
Paolo Bonzini
df43d49cb8 hw: clean up hw/hw.h includes
Include qom/object.h and exec/memory.h instead of exec/ioport.h;
exec/ioport.h was almost everywhere required only for those two
includes, not for the content of the header itself.

Remove block/aio.h, everybody is already including it through
another path.

With this change, include/hw/hw.h is freed from qemu-common.h.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-05-19 16:42:30 +02:00
Paolo Bonzini
63c915526d cpu: move exec-all.h inclusion out of cpu.h
exec-all.h contains TCG-specific definitions.  It is not needed outside
TCG-specific files such as translate.c, exec.c or *helper.c.

One generic function had snuck into include/exec/exec-all.h; move it to
include/qom/cpu.h.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-05-19 16:42:29 +02:00
Paolo Bonzini
33c11879fd qemu-common: push cpu.h inclusion out of qemu-common.h
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-05-19 16:42:29 +02:00
Paolo Bonzini
741da0d38b hw: cannot include hw/hw.h from user emulation
All qdev definitions are available from other headers, user-mode
emulation does not need hw/hw.h.

By considering system emulation only, it is simpler to disentangle
hw/hw.h from NEED_CPU_H.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-05-19 16:42:28 +02:00
Emilio G. Cota
89fee74a0f tb: consistently use uint32_t for tb->flags
We are inconsistent with the type of tb->flags: usage varies loosely
between int and uint64_t. Settle to uint32_t everywhere, which is
superior to both: at least one target (aarch64) uses the most significant
bit in the u32, and uint64_t is wasteful.

Compile-tested for all targets.

Suggested-by: Laurent Desnogues <laurent.desnogues@gmail.com>
Suggested-by: Richard Henderson <rth@twiddle.net>
Tested-by: Edgar E. Iglesias <edgar.iglesias@xilinx.com>
Reviewed-by: Edgar E. Iglesias <edgar.iglesias@xilinx.com>
Reviewed-by: Laurent Desnogues <laurent.desnogues@gmail.com>
Signed-off-by: Emilio G. Cota <cota@braap.org>
Signed-off-by: Richard Henderson <rth@twiddle.net>
Message-Id: <1460049562-23517-1-git-send-email-cota@braap.org>
2016-05-12 14:06:40 -10:00
Marc-André Lureau
85bc2a1512 memory: fix segv on qemu_ram_free(block=0x0)
Since f1060c55bf, the pointer is directly passed to
qemu_ram_free(). However, on initialization failure, it may be called
with a NULL pointer. Return immediately in this case.

This fixes a SEGV when memory initialization failed, for example
permission denied on open backing store /dev/hugepages, with -object
memory-backend-file,mem-path=/dev/hugepages.

Program received signal SIGSEGV, Segmentation fault.
0x00005555556e67e7 in qemu_ram_free (block=0x0) at /home/elmarco/src/qemu/exec.c:1775

Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-Id: <1459250451-29984-1-git-send-email-marcandre.lureau@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-04-05 11:46:52 +02:00
Paolo Bonzini
5c3ece79cd exec: fix error handling in file_ram_alloc
One instance of double closing, and invalid close(-1) in some cases
of "goto error".

Reviewed-by: Laszlo Ersek <lersek@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-03-22 22:20:18 +01:00
Veronia Bahaa
f348b6d1a5 util: move declarations out of qemu-common.h
Move declarations out of qemu-common.h for functions declared in
utils/ files: e.g. include/qemu/path.h for utils/path.c.
Move inline functions out of qemu-common.h and into new files (e.g.
include/qemu/bcd.h)

Signed-off-by: Veronia Bahaa <veroniabahaa@gmail.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-03-22 22:20:17 +01:00
Markus Armbruster
da34e65cb4 include/qemu/osdep.h: Don't include qapi/error.h
Commit 57cb38b included qapi/error.h into qemu/osdep.h to get the
Error typedef.  Since then, we've moved to include qemu/osdep.h
everywhere.  Its file comment explains: "To avoid getting into
possible circular include dependencies, this file should not include
any other QEMU headers, with the exceptions of config-host.h,
compiler.h, os-posix.h and os-win32.h, all of which are doing a
similar job to this file and are under similar constraints."
qapi/error.h doesn't do a similar job, and it doesn't adhere to
similar constraints: it includes qapi-types.h.  That's in excess of
100KiB of crap most .c files don't actually need.

Add the typedef to qemu/typedefs.h, and include that instead of
qapi/error.h.  Include qapi/error.h in .c files that need it and don't
get it now.  Include qapi-types.h in qom/object.h for uint16List.

Update scripts/clean-includes accordingly.  Update it further to match
reality: replace config.h by config-target.h, add sysemu/os-posix.h,
sysemu/os-win32.h.  Update the list of includes in the qemu/osdep.h
comment quoted above similarly.

This reduces the number of objects depending on qapi/error.h from "all
of them" to less than a third.  Unfortunately, the number depending on
qapi-types.h shrinks only a little.  More work is needed for that one.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
[Fix compilation without the spice devel packages. - Paolo]
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-03-22 22:20:15 +01:00
Paolo Bonzini
39c350ee12 exec: fix early return from ram_block_add
After reporting an error, ram_block_add was going on with the registration
of the RAMBlock.  The visible effect is that it unlocked the ramlist
mutex twice.

Fixes: 528f46af6e
Reviewed-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-03-15 18:23:33 +01:00
Markus Armbruster
e1fb647199 exec: Fix memory allocation when memory path isn't on hugetlbfs
gethugepagesize() works reliably only when its argument is on
hugetlbfs.  When it's not, it returns the filesystem's "optimal
transfer block size", which may or may not be the actual page size
you'll get when you mmap().

If the value is too small or not a power of two, we fail
qemu_ram_mmap()'s assertions.  These were added in commit 794e8f3
(v2.5.0).  The bug's impact before that is currently unknown.  Seems
fairly unlikely at least when the normal page size is 4KiB.

Else, if the value is too large, we align more strictly than
necessary.

gethugepagesize() goes back to commit c902760 (v0.13).  That commit
clearly intended gethugepagesize() to be used on hugetlbfs only.  Not
only was it named accordingly, it also printed a warning when used on
anything else.  However, the commit neglected to spell out the
restriction in user documentation of -mem-path.

Commit bfc2a1a (v2.5.0) dropped the warning as bogus "because QEMU
functions perfectly well with the path on a regular tmpfs filesystem".
It sure does when you're sufficiently lucky.  In my testing, I was
lucky, too.

Fix by switching to qemu_fd_getpagesize().  Rename the variable
holding its result from hpagesize to page_size.

Cc: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-Id: <1457378754-21649-3-git-send-email-armbru@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-03-15 18:23:33 +01:00
Markus Armbruster
fd97fd4408 exec: Fix memory allocation when memory path names new file
Commit 8d31d6b extended file_ram_alloc() to accept file names in
addition to directory names.  Even though it passes O_CREAT to open(),
it actually works only for existing files.  Reproducer adapted from
the commit's qemu-doc.texi update:

    $ qemu-system-x86_64 -object memory-backend-file,size=2M,mem-path=/dev/hugepages/my-shmem-file,id=mb1
    qemu-system-x86_64: -object memory-backend-file,size=2M,mem-path=/dev/hugepages/my-shmem-file,id=mb1: failed to get page size of file /dev/hugepages/my-shmem-file: No such file or directory

This is because we first get the page size for @path, then open the
actual file.  Unwise even before the flawed commit, because the
directory could change in between, invalidating the page size.
Unlikely to bite in practice.

Rearrange the code to create the file (if necessary) before getting
its page size.  Carefully avoid TOCTTOU conditions with a method
suggested by Paolo Bonzini.

While there, replace "hugepages" by "guest RAM" in error messages,
because host memory backends can be used for purposes other than huge
pages, e.g. /dev/shm/ shared memory.  Help text of -mem-path agrees.

Cc: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-Id: <1457378754-21649-2-git-send-email-armbru@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-03-15 18:23:33 +01:00
Fam Zheng
729633c2bc exec: Introduce AddressSpaceDispatch.mru_section
Under heavy workloads the lookup will likely end up with the same
MemoryRegionSection from last time. Using a pointer to cache the result,
like ram_list.mru_block, significantly reduces cost of
address_space_translate.

During address space topology update, as->dispatch will be reallocated
so the pointer is invalidated automatically.

Perf reports a visible drop on the cpu usage, because phys_page_find is
not called.  Before:

   2.35%  qemu-system-x86_64       [.] phys_page_find
   0.97%  qemu-system-x86_64       [.] address_space_translate_internal
   0.95%  qemu-system-x86_64       [.] address_space_translate
   0.55%  qemu-system-x86_64       [.] address_space_lookup_region

After:

   0.97%  qemu-system-x86_64       [.] address_space_translate_internal
   0.97%  qemu-system-x86_64       [.] address_space_lookup_region
   0.84%  qemu-system-x86_64       [.] address_space_translate

Signed-off-by: Fam Zheng <famz@redhat.com>
Message-Id: <1456813104-25902-8-git-send-email-famz@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-03-07 13:26:37 +01:00
Fam Zheng
29cb533d8c exec: Factor out section_covers_addr
This will be shared by the next patch.

Also add a comment explaining the unobvious condition on "size.hi".

Signed-off-by: Fam Zheng <famz@redhat.com>
Message-Id: <1456813104-25902-7-git-send-email-famz@redhat.com>
[Small change to the comment. - Paolo]
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-03-07 13:26:37 +01:00
Fam Zheng
f1060c55bf exec: Pass RAMBlock pointer to qemu_ram_free
The only caller now knows exactly which RAMBlock to free, so it's not
necessary to do the lookup.

Reviewed-by: Gonglei <arei.gonglei@huawei.com>
Signed-off-by: Fam Zheng <famz@redhat.com>
Message-Id: <1456813104-25902-6-git-send-email-famz@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-03-07 13:26:37 +01:00
Fam Zheng
8e41fb63c5 memory: Drop MemoryRegion.ram_addr
All references to mr->ram_addr are replaced by
memory_region_get_ram_addr(mr) (except for a few assertions that are
replaced with mr->ram_block).

Reviewed-by: Gonglei <arei.gonglei@huawei.com>
Signed-off-by: Fam Zheng <famz@redhat.com>
Message-Id: <1456813104-25902-5-git-send-email-famz@redhat.com>
Acked-by: Laszlo Ersek <lersek@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-03-07 13:26:29 +01:00
Fam Zheng
0a75601853 memory: Move assignment to ram_block to memory_region_init_*
We don't force "const" qualifiers with pointers in QEMU, but it's still
good to keep a clean function interface. Assigning to mr->ram_block is
in this sense ugly - one initializer mutating its owning object's state.

Move it to memory_region_init_*, where mr->ram_addr is assigned.

Reviewed-by: Gonglei <arei.gonglei@huawei.com>
Signed-off-by: Fam Zheng <famz@redhat.com>
Message-Id: <1456813104-25902-3-git-send-email-famz@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-03-07 13:18:28 +01:00
Fam Zheng
528f46af6e exec: Return RAMBlock pointer from allocating functions
Previously we return RAMBlock.offset; now return the pointer to the
whole structure.

ram_block_add returns void now, error is completely passed with errp.

Reviewed-by: Gonglei <arei.gonglei@huawei.com>
Signed-off-by: Fam Zheng <famz@redhat.com>
Message-Id: <1456813104-25902-2-git-send-email-famz@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-03-07 13:18:28 +01:00
Gonglei
3655cb9c73 memory: optimize qemu_get_ram_ptr and qemu_ram_ptr_length
these two functions consume too much cpu overhead to
find the RAMBlock by ram address.

After this patch, we can pass the RAMBlock pointer
to them so that they don't need to find the RAMBlock
anymore most of the time. We can get better performance
in address translation processing.

Signed-off-by: Gonglei <arei.gonglei@huawei.com>
Message-Id: <1455935721-8804-3-git-send-email-arei.gonglei@huawei.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-02-25 16:11:29 +01:00
Gonglei
58eaa2174e exec: store RAMBlock pointer into memory region
Each RAM memory region has a unique corresponding RAMBlock.
In the current realization, the memory region only stored
the ram_addr which means the offset of RAM address space,
We need to qurey the global ram.list to find the ram block
by ram_addr if we want to get the ram block, which is very
expensive.

Now, we store the RAMBlock pointer into memory region
structure. So, if we know the mr, we can easily get the
RAMBlock.

Signed-off-by: Gonglei <arei.gonglei@huawei.com>
Message-Id: <1456130097-4208-2-git-send-email-arei.gonglei@huawei.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-02-25 16:11:26 +01:00
Peter Maydell
fc1ec1acff trivial patches for 2016-02-11
-----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQEcBAABAgAGBQJWvHuEAAoJEL7lnXSkw9fb2lAIAIMkxyMS6xtUFlASAA8yhPJe
 iWoWX8okweraRZKdjinG15dNsRMzOywL6zActb+roNHJ1+RICOpHYhpUtp1P5Kpd
 gGYq3ry9HoVZZLxw1dGlJoLZVwDGdkAZmMFl1nOo6vq3Rb0GR7Et6NS//4nzkDtp
 XcNrsOGbfq0ABo/sYvQMHoWgSfnGU/2mkiMDNQs5FRWtCnUXgRRlvoEuolUPzkDS
 SHovLVYVIFB9okLgYpUhj2NmvDCnAlGcOL0rBb7hGvnKgZu+vT2kFIej6InrLZGX
 hVcptzX4Msv2E40wVB2y2QyU0GyWg1CjN6J39UAxA/a0q5lh/fm6EZbtAeiSZ+8=
 =dGSI
 -----END PGP SIGNATURE-----

Merge remote-tracking branch 'remotes/mjt/tags/pull-trivial-patches-2016-02-11' into staging

trivial patches for 2016-02-11

# gpg: Signature made Thu 11 Feb 2016 12:16:04 GMT using RSA key ID A4C3D7DB
# gpg: Good signature from "Michael Tokarev <mjt@tls.msk.ru>"
# gpg:                 aka "Michael Tokarev <mjt@corpit.ru>"
# gpg:                 aka "Michael Tokarev <mjt@debian.org>"

* remotes/mjt/tags/pull-trivial-patches-2016-02-11:
  w32: include winsock2.h before windows.h
  Adds keycode 86 to the hid_usage_keys translation table.
  s390x: remove s390-zipl.rom
  Passthru CCID card: QOMify
  Emulated CCID card: QOMify
  ES1370: QOMify
  char: fix parameter name / type in BSD codepath
  qmp-spec: fix index in doc
  rdma: remove check on time_spent when calculating mbs
  qemu-sockets: simplify error handling
  cpu: cpu_save/cpu_load is no more
  qom: Correct object_property_get_int() description
  man: virtfs-proxy-helper: Rework awkward sentence
  remove libtool support

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2016-02-11 15:09:33 +00:00
Paolo Bonzini
945123a554 cpu: cpu_save/cpu_load is no more
Everything has been converted to vmstate.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2016-02-11 15:15:46 +03:00
Sergey Fedorov
568496c0c0 cpu: Add callback to check architectural watchpoint match
When QEMU watchpoint matches, that is not definitely an architectural
watchpoint match yet. If it is a stop-before-access watchpoint then that
is hardly possible to ignore it after throwing a TCG exception.

A special callback is introduced to check for architectural watchpoint
match before raising a TCG exception.

Signed-off-by: Sergey Fedorov <serge.fdrv@gmail.com>
Message-id: 1454256948-10485-2-git-send-email-serge.fdrv@gmail.com
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2016-02-11 11:17:32 +00:00
Stefan Hajnoczi
5b82b703b6 memory: RCU ram_list.dirty_memory[] for safe RAM hotplug
Although accesses to ram_list.dirty_memory[] use atomics so multiple
threads can safely dirty the bitmap, the data structure is not fully
thread-safe yet.

This patch handles the RAM hotplug case where ram_list.dirty_memory[] is
grown.  ram_list.dirty_memory[] is change from a regular bitmap to an
RCU array of pointers to fixed-size bitmap blocks.  Threads can continue
accessing bitmap blocks while the array is being extended.  See the
comments in the code for an in-depth explanation of struct
DirtyMemoryBlocks.

I have tested that live migration with virtio-blk dataplane works.

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-Id: <1453728801-5398-2-git-send-email-stefanha@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-02-09 15:45:26 +01:00
Paolo Bonzini
508127e243 log: do not unnecessarily include qom/cpu.h
Split the bits that require it to exec/log.h.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Denis V. Lunev <den@openvz.org>
Acked-by: Christian Borntraeger <borntraeger@de.ibm.com>
Message-id: 1452174932-28657-8-git-send-email-den@openvz.org
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2016-02-03 09:19:10 +00:00
Peter Maydell
7b31bbc2e6 exec: Clean up includes
Clean up includes so that osdep.h is included first and headers
which it implies are not included manually.

This commit was created with scripts/clean-includes.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 1453832250-766-4-git-send-email-peter.maydell@linaro.org
2016-01-29 15:07:22 +00:00
Peter Maydell
0b0571dd24 Xen 2016/01/21
-----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1.4.12 (GNU/Linux)
 
 iQIcBAABAgAGBQJWoQ5KAAoJEIlPj0hw4a6QP6sP/01U66Fv7ZzqxnV6U/6hJkOG
 X11S6KUHVdNoLyMB4RCyOV/zsF16ODZ9A1PI+qeq/1Po4zNLASWYAdWR7OinCiis
 ad5QGmHY2JmzLm2x8ivWZR1ZqQ+PTRWFH7eEFEROaI/IEyG1wL4bTkMLB0L6Ih74
 SMnMJg3Rkl8XhxdvVuE5JZ4f4ZTyPIk+0daMXIH9Q58XspblVNRjKAbotjte/zrj
 XmCIxVfu29NOIKD3F1n0Cw29OqCuyofbxWHk+SwT68fM8M8KcdnX1WGmfOXylXod
 JP0j2NRN07LgMfJv1K+QXPSNlFOZAMlzzXpOAnbb2AJceTTMMwTdkQb6aahFfMEL
 eyWabU+ZI8gemFePgWWdOipkrqtWlGvdyFLKLv42CR9jhVGNck8SBt01njLcOEsf
 TZjsuzPVxMmQvSYr7xcZgIFKwWkt3yUpOAKl6KS5PlerIezpJ1MtmB1ZmFF+Caui
 kGpC1tfIgdu3VHdlqASlc50BsAeqTdGzXI+KxTE/6raOnn+aUVIXrUzcdgV+Tgby
 52Fd9y83X65RXIgasNIvNpUEX+jc7FYdrBaO2graSBzpCWAzituyypOk4WEpHxIn
 da64hN9Z3i4BzLDZtaC05B8A0iWpckLOwbVWK1zblsdiJJAaOFVAU9cNl2Plxm8j
 cy8WC0FdEqLZxXpU0deB
 =+hRh
 -----END PGP SIGNATURE-----

Merge remote-tracking branch 'remotes/sstabellini/tags/xen-20160121' into staging

Xen 2016/01/21

# gpg: Signature made Thu 21 Jan 2016 16:58:50 GMT using RSA key ID 70E1AE90
# gpg: Good signature from "Stefano Stabellini <stefano.stabellini@eu.citrix.com>"

* remotes/sstabellini/tags/xen-20160121:
  Xen PCI passthru: convert to realize()
  Add Error **errp for xen_pt_config_init()
  Add Error **errp for xen_pt_setup_vga()
  Add Error **errp for xen_host_pci_device_get()
  Xen: use qemu_strtoul instead of strtol
  Change xen_host_pci_sysfs_path() to return void
  xen-pvdevice: convert to realize()
  xen-hvm: Clean up xen_ram_alloc() error handling
  xen-hvm: Clean up xen_hvm_init() error handling
  xenfb.c: avoid expensive loops when prod <= out_cons
  MAINTAINERS: update Xen files

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2016-01-21 17:21:08 +00:00
Peter Crosthwaite
6731d864f8 qom/cpu: Add MemoryRegion property
Add a MemoryRegion property, which if set is used to construct
the CPU's initial (default) AddressSpace.

Signed-off-by: Peter Crosthwaite <peter.crosthwaite@xilinx.com>
[PMM: code is moved from qom/cpu.c to exec.c to avoid having to
 make qom/cpu.o be a non-common object file; code to use the
 MemoryRegion and to default it to system_memory added.]
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Acked-by: Edgar E. Iglesias <edgar.iglesias@xilinx.com>
2016-01-21 14:15:06 +00:00
Peter Maydell
79ed041647 exec.c: Use correct AddressSpace in watch_mem_read and watch_mem_write
In the watchpoint access routines watch_mem_read and watch_mem_write,
find the correct AddressSpace to use from current_cpu and the memory
transaction attributes, rather than always assuming address_space_memory.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Acked-by: Edgar E. Iglesias <edgar.iglesias@xilinx.com>
2016-01-21 14:15:06 +00:00
Peter Maydell
5232e4c798 exec.c: Use cpu_get_phys_page_attrs_debug
Use cpu_get_phys_page_attrs_debug() when doing virtual-to-physical
conversions in debug related code, so that we can obtain the right
address space index and thus select the correct AddressSpace,
rather than always using cpu->as.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Acked-by: Edgar E. Iglesias <edgar.iglesias@xilinx.com>
2016-01-21 14:15:06 +00:00
Peter Maydell
651a5bc037 exec.c: Add cpu_get_address_space()
Add a function to return the AddressSpace for a CPU based on
its numerical index. (Callers outside exec.c don't have access
to the CPUAddressSpace struct so can't just fish it out of the
CPUState struct directly.)

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Acked-by: Edgar E. Iglesias <edgar.iglesias@xilinx.com>
2016-01-21 14:15:05 +00:00
Peter Maydell
a54c87b68a exec.c: Pass MemTxAttrs to iotlb_to_region so it uses the right AS
Pass the MemTxAttrs for the memory access to iotlb_to_region(); this
allows it to determine the correct AddressSpace to use for the lookup.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Acked-by: Edgar E. Iglesias <edgar.iglesias@xilinx.com>
2016-01-21 14:15:05 +00:00
Peter Maydell
d7898cda81 cputlb.c: Use correct address space when looking up MemoryRegionSection
When looking up the MemoryRegionSection for the new TLB entry in
tlb_set_page_with_attrs(), use cpu_asidx_from_attrs() to determine
the correct address space index for the lookup, and pass it into
address_space_translate_for_iotlb().

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Acked-by: Edgar E. Iglesias <edgar.iglesias@xilinx.com>
2016-01-21 14:15:05 +00:00
Peter Maydell
12ebc9a76d exec.c: Allow target CPUs to define multiple AddressSpaces
Allow multiple calls to cpu_address_space_init(); each
call adds an entry to the cpu->ases array at the specified
index. It is up to the target-specific CPU code to actually use
these extra address spaces.

Since this multiple AddressSpace support won't work with
KVM, add an assertion to avoid confusing failures.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Acked-by: Edgar E. Iglesias <edgar.iglesias@xilinx.com>
2016-01-21 14:15:04 +00:00
Peter Maydell
56943e8cc1 exec.c: Don't set cpu->as until cpu_address_space_init
Rather than setting cpu->as unconditionally in cpu_exec_init
(and then having target-i386 override this later), don't set
it until the first call to cpu_address_space_init.

This requires us to initialise the address space for
both TCG and KVM (KVM doesn't need the AS listener but
it does require cpu->as to be set).

For target CPUs which don't set up any address spaces (currently
everything except i386), add the default address_space_memory
in qemu_init_vcpu().

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Edgar E. Iglesias <edgar.iglesias@xilinx.com>
Acked-by: Edgar E. Iglesias <edgar.iglesias@xilinx.com>
2016-01-21 14:15:04 +00:00
Markus Armbruster
37aa7a0e2f xen-hvm: Clean up xen_ram_alloc() error handling
xen_ram_alloc() dies with hw_error() on error, even though its caller
ram_block_add() handles errors just fine.  Add an Error **errp
parameter and use it.

Leave case RUN_STATE_INMIGRATE alone, because that looks like some
kind of warning.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
2016-01-14 16:49:50 +00:00
Tetsuya Mukawa
56a571d9c8 ivshmem: Store file descriptor for vhost-user negotiation
If virtio-net driver allocates memory in ivshmem shared memory,
vhost-net will work correctly, but vhost-user will not work because
a fd of shared memory will not be sent to vhost-user backend.
This patch fixes ivshmem to store file descriptor of shared memory.
It will be used when vhost-user negotiates vhost-user backend.

Signed-off-by: Tetsuya Mukawa <mukawa@igel.co.jp>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2016-01-09 23:20:20 +02:00
Paolo Bonzini
3cc8f88499 memory: try to inline constant-length reads
memcpy can take a large amount of time for small reads and writes.
Handle the common case of reading s/g descriptors from memory (there
is no corresponding "write" case that is as common, because writes
often use address_space_st* functions) by inlining the relevant
parts of address_space_read into the caller.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2015-12-17 17:33:49 +01:00
Paolo Bonzini
a203ac702e memory: extract first iteration of address_space_read and address_space_write
We want to inline the case where there is only one iteration, because
then the compiler can also inline the memcpy.  As a start, extract
everything after the first address_space_translate call.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2015-12-17 17:33:49 +01:00
Paolo Bonzini
eb7eeb8862 memory: split address_space_read and address_space_write
Rather than dispatching on is_write for every iteration, make
address_space_rw call one of the two functions.  The amount of
duplicate logic is pretty small, and memory_access_is_direct can
be tweaked so that it inlines better in the callers.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2015-12-17 17:33:48 +01:00
Paolo Bonzini
e81bcda529 exec: make qemu_ram_ptr_length more similar to qemu_get_ram_ptr
Notably, use qemu_get_ram_block to enjoy the MRU optimization.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2015-12-17 17:33:48 +01:00
Paolo Bonzini
49b24afcb1 exec: always call qemu_get_ram_ptr within rcu_read_lock
Simplify the code and document the assumption.  The only caller
that is not within rcu_read_lock is memory_region_get_ram_ptr.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2015-12-17 17:33:48 +01:00
Paolo Bonzini
013a29424c qemu-log: introduce qemu_log_separate
In some cases, the same message is printed both on stderr and in the log.
Avoid duplicate output in the default case where stderr _is_ the log,
and standardize this to stderr+log where it used to use stdio+log.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2015-12-17 17:33:47 +01:00
Eduardo Habkost
2f3a2bb15e exec: Remove unnecessary RAM_FILE flag
The only code that sets RAMBlock.fd is file_ram_alloc(), and the only
code that calls file_ram_alloc() sets the RAM_FILE flag. That means the
flag is always set when RAMBlock.fd >= 0, and the munmap() call at
reclaim_ramblock() is dead code that never runs.

Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Message-Id: <1446847881-9385-1-git-send-email-ehabkost@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2015-12-17 15:24:34 +01:00
Eduardo Habkost
a29ac16632 exec: Eliminate qemu_ram_free_from_ptr()
Replace qemu_ram_free_from_ptr() with qemu_ram_free().

The only difference between qemu_ram_free_from_ptr() and
qemu_ram_free() is that g_free_rcu() is used instead of
call_rcu(reclaim_ramblock). We can safely replace it because:

* RAM blocks allocated by qemu_ram_alloc_from_ptr() always have
  RAM_PREALLOC set;
* reclaim_ramblock(block) will do nothing except g_free(block)
  if RAM_PREALLOC is set at block->flags.

Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Message-Id: <1446844805-14492-2-git-send-email-ehabkost@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2015-12-17 15:24:33 +01:00
Don Slutz
55b4e80b04 exec: Stop using memory after free
memory_region_unref(mr) can free memory.

For example I got:

Program received signal SIGSEGV, Segmentation fault.
[Switching to Thread 0x7f43280d4700 (LWP 4462)]
0x00007f43323283c0 in phys_section_destroy (mr=0x7f43259468b0)
    at /home/don/xen/tools/qemu-xen-dir/exec.c:1023
1023        if (mr->subpage) {
(gdb) bt
    at /home/don/xen/tools/qemu-xen-dir/exec.c:1023
    at /home/don/xen/tools/qemu-xen-dir/exec.c:1034
    at /home/don/xen/tools/qemu-xen-dir/exec.c:2205
(gdb) p mr
$1 = (MemoryRegion *) 0x7f43259468b0

And this change prevents this.

Signed-off-by: Don Slutz <Don.Slutz@Gmail.com>
Message-Id: <1448921464-21845-1-git-send-email-Don.Slutz@Gmail.com>
Cc: qemu-stable@nongnu.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2015-12-02 12:01:43 +01:00
Daniel P. Berrange
bfc2a1a1f4 exec: remove warning about mempath and hugetlbfs
The gethugepagesize() method in exec.c printed a warning if
the file path for "-mem-path" or "-object memory-backend-file"
was not on a hugetlbfs filesystem. This warning is bogus, because
QEMU functions perfectly well with the path on a regular tmpfs
filesystem. Use of hugetlbfs vs tmpfs is a choice for the management
application or end user to make as best fits their needs. As such it
is inappropriate for QEMU to have an opinion on whether the user's
choice is right or wrong in this case.

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Message-Id: <1448448749-1332-3-git-send-email-berrange@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2015-11-26 16:47:44 +01:00
Daniel P. Berrange
2c189a4e12 Revert "exec: silence hugetlbfs warning under qtest"
This reverts commit 1c7ba94a18.

That commit changed QEMU initialization order from

 - object-initial, chardev, qtest, object-late

to

 - chardev, qtest, object-initial, object-late

This breaks chardev setups which need to rely on objects
having been created. For example, when chardevs use TLS
encryption in the future, they need to have tls credential
objects created first.

This revert, restores the ordering introduced in

  commit f08f9271bf
  Author: Daniel P. Berrange <berrange@redhat.com>
  Date:   Wed May 13 17:14:04 2015 +0100

    vl: Create (most) objects before creating chardev backends

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Message-Id: <1448448749-1332-2-git-send-email-berrange@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2015-11-26 16:47:44 +01:00
Marc-André Lureau
1c7ba94a18 exec: silence hugetlbfs warning under qtest
vhost-user-test prints a warning. A test should not need to run on
hugetlbfs, let's silence the warning under qtest. The
condition can't check on qtest_enabled() since vhost-user-test actually
doesn't use qtest accel. However, qtest_driver() can be used, if
qtest_init() is called early enough. For that reason, move chardev and
qtest initialization early.

Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2015-11-19 15:26:05 +02:00
Dr. David Alan Gilbert
4ed023ce2a Round up RAMBlock sizes to host page sizes
RAMBlocks that are not a multiple of host pages in length
cause problems for postcopy (I've seen an ACPI table on aarch64
be 5k in length - i.e. 5x target-page), so round RAMBlock sizes
up to a host-page.

This potentially breaks migration compatibility due to changes
in RAMBlock sizes; however:
   1) x86 and s390 I think always have host=target page size
   2) When I've tried on Power the block sizes already seem aligned.
   3) I don't think there's anything else that maintains per-version
      machine-types for compatibility.

Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
2015-11-10 15:00:28 +01:00
Dr. David Alan Gilbert
e3dd74934f qemu_ram_block_by_name
Add a function to find a RAMBlock by name; use it in two
of the places that already open code that loop; we've
got another use later in postcopy.

Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
2015-11-10 14:51:48 +01:00
Dr. David Alan Gilbert
422148d3e5 qemu_ram_block_from_host
Postcopy sends RAMBlock names and offsets over the wire (since it can't
rely on the order of ramaddr being the same), and it starts out with
HVA fault addresses from the kernel.

qemu_ram_block_from_host translates a HVA into a RAMBlock, an offset
in the RAMBlock and the global ram_addr_t value.

Rewrite qemu_ram_addr_from_host to use qemu_ram_block_from_host.

Provide qemu_ram_get_idstr since its the actual name text sent on the
wire.

Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Amit Shah <amit.shah@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
2015-11-10 14:51:48 +01:00
Dr. David Alan Gilbert
038629a699 Provide runtime Target page information
The migration code generally is built target-independent, however
there are a few places where knowing the target page size would
avoid artificially moving stuff into migration/ram.c.

Provide 'qemu_target_page_bits()' that returns TARGET_PAGE_BITS
to other bits of code so that they can stay target-independent.

Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Amit Shah <amit.shah@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
2015-11-10 14:51:48 +01:00
Paolo Bonzini
68851b98e5 exec: avoid unnecessary cacheline bounce on ram_list.mru_block
Whenever the MRU cache hits for the list of RAM blocks, qemu_get_ram_block
does an unnecessary write that causes a processor cache line to bounce
from one core to another.  This causes a performance hit.

Reported-by: Emilio G. Cota <cota@braap.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2015-11-06 15:42:38 +03:00
Peter Maydell
9319738080 So here it is, let's see what happens.
-----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2
 
 iQEcBAABCAAGBQJWPHM6AAoJEL/70l94x66DK5YIAJTNthYWL8eNhQ1iek6CLlV+
 etVXm3JDmkV0zOfYVHLBb44VLZ6I1ocas+57F/kmz7SKpMLiI6bMXRxhTSkiO4D+
 3N36cWQf3fq+P0DmxuikMlYGz8V6QQ5PQE2xJKV0ZIWAkiqInxilkN3qt81sNR+A
 A9Ohom3sc0eGHyYJcVDK4krbnNSAZjIB2yMWperw61x+GYAhxjA02HPUgB32KK6q
 KrdnKmnRu9Cw6y4wTCbbDITJztPexZYsX2DOJh30wC0eNcE+MZ7J2im8Frpxe+Ml
 C8MUuvSqLOyeu9tUfrXGzd6kMtEKrmU+fh2nNbxJbtfowDjkW2jcIEgC0UjkGE4=
 =BF1q
 -----END PGP SIGNATURE-----

Merge remote-tracking branch 'remotes/bonzini/tags/for-upstream-replay' into staging

So here it is, let's see what happens.

# gpg: Signature made Fri 06 Nov 2015 09:30:34 GMT using RSA key ID 78C7AE83
# gpg: Good signature from "Paolo Bonzini <bonzini@gnu.org>"
# gpg:                 aka "Paolo Bonzini <pbonzini@redhat.com>"

* remotes/bonzini/tags/for-upstream-replay:
  replay: recording of the user input
  replay: command line options
  replay: replay blockers for devices
  replay: initialization and deinitialization
  replay: ptimer
  bottom halves: introduce bh call function
  replay: checkpoints
  icount: improve counting for record/replay
  replay: shutdown event
  replay: recording and replaying clock ticks
  replay: asynchronous events infrastructure
  replay: interrupts and exceptions
  cpu: replay instructions sequence
  cpu-exec: allow temporary disabling icount
  replay: introduce icount event
  replay: introduce mutex to protect the replay log
  replay: internal functions for replay log
  replay: global variables and function stubs

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2015-11-06 11:31:40 +00:00
Pavel Dovgalyuk
7615936ebf replay: initialization and deinitialization
This patch introduces the functions for enabling the record/replay and for
freeing the resources when simulator closes.

Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>

Signed-off-by: Pavel Dovgalyuk <pavel.dovgaluk@ispras.ru>
Message-Id: <20150917162507.8676.90232.stgit@PASHA-ISP.def.inno>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Pavel Dovgalyuk <Pavel.Dovgaluk@ispras.ru>
2015-11-06 10:16:03 +01:00
Pavel Fedin
8d31d6b65a backends/hostmem-file: Allow to specify full pathname for backing file
This allows to explicitly specify file name to use with the backend. This
is important when using it together with ivshmem in order to make it backed
by hugetlbfs. By default filename is autogenerated using mkstemp(), and the
file is unlink()ed after creation, effectively making it anonymous. This is
not very useful with ivshmem because it ends up in a memory which cannot be
accessed by something else.

Distinction between directory and file name is done by stat() check. If an
existing directory is given, the code keeps old behavior. Otherwise it
creates or opens a file with the given pathname.

Signed-off-by: Pavel Fedin <p.fedin@samsung.com>
Tested-by: Igor Skalkin <i.skalkin@samsung.com>
Message-Id: <004301d11166$9672fe30$c358fa90$@samsung.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2015-11-04 15:56:05 +01:00
Paolo Bonzini
680a4783dc memory: call begin, log_start and commit when registering a new listener
This ensures that cpu_reload_memory_map() is called as soon as
tcg_cpu_address_space_init() is called, and before cpu->memory_dispatch
is used.  qemu-system-s390x never changes the address spaces after
tcg_cpu_address_space_init() is called, and thus tcg_commit() is never
called.  This causes a SIGSEGV.

Because memory_map_init() will now call mem_commit(), we have to
initialize io_mem_* before address_space_memory and friends.

Reported-by: Philipp Kern <pkern@debian.org>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Fixes: 0a1c71cec6
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2015-11-04 15:56:01 +01:00
Igor Mammedov
cc57501dee file_ram_alloc: propagate error to caller instead of terminating QEMU
QEMU shouldn't exits from file_ram_alloc() if -mem-prealloc option is specified
and "object_add memory-backend-file,..." fails allocation during memory hotplug.

Propagate error to a caller and let it decide what to do with allocation failure.
That leaves QEMU alive if it can't create backend during hotplug time and
kills QEMU at startup time if backends or initial memory were misconfigured/
too large.

Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Message-Id: <1445274671-17704-1-git-send-email-imammedo@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2015-11-02 14:50:27 +01:00
Peter Maydell
ca3e40e233 vhost, pc, virtio features, fixes, cleanups
New features:
     VT-d support for devices behind a bridge
     vhost-user migration support
 
 Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQEcBAABAgAGBQJWKMrnAAoJECgfDbjSjVRpVL0H/iRc31o00QE4nWBRpxUpf8WJ
 V5RWE8qKkDgBha5bS5Nt4vs8K4jkkHGXCbmygMidWph96hUPK8/yHy1A/wmpBibB
 5hVSPDK8onavNGJwpaWDrkhd9OhKAaKOuu49T6+VWJGZY/uX5ayqmcN934y0NPUa
 4EhH5tyxPpYOYeW9i/VOMQ374gCJcpzYBMug4NJZRyFpfz/b2mzAQtoqw3EsPtB0
 vpVJ+fKiCyG39HFKQJW7cL12yBeXOoyhjfDxpumLqwLWMfmde+vJwTFx6wbechgV
 aU3jIdvUX8wHCNYaB937NsMaDALoGNqUjbpKnf+xD1w7xr9pwTzdyrGH3rpGLEE=
 =+G1+
 -----END PGP SIGNATURE-----

Merge remote-tracking branch 'remotes/mst/tags/for_upstream' into staging

vhost, pc, virtio features, fixes, cleanups

New features:
    VT-d support for devices behind a bridge
    vhost-user migration support

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>

# gpg: Signature made Thu 22 Oct 2015 12:39:19 BST using RSA key ID D28D5469
# gpg: Good signature from "Michael S. Tsirkin <mst@kernel.org>"
# gpg:                 aka "Michael S. Tsirkin <mst@redhat.com>"

* remotes/mst/tags/for_upstream: (37 commits)
  hw/isa/lpc_ich9: inject the SMI on the VCPU that is writing to APM_CNT
  i386: keep cpu_model field in MachineState uptodate
  vhost: set the correct queue index in case of migration with multiqueue
  piix: fix resource leak reported by Coverity
  seccomp: add memfd_create to whitelist
  vhost-user-test: check ownership during migration
  vhost-user-test: add live-migration test
  vhost-user-test: learn to tweak various qemu arguments
  vhost-user-test: wrap server in TestServer struct
  vhost-user-test: remove useless static check
  vhost-user-test: move wait_for_fds() out
  vhost: add migration block if memfd failed
  vhost-user: use an enum helper for features mask
  vhost user: add rarp sending after live migration for legacy guest
  vhost user: add support of live migration
  net: add trace_vhost_user_event
  vhost-user: document migration log
  vhost: use a function for each call
  vhost-user: add a migration blocker
  vhost-user: send log shm fd along with log_base
  ...

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2015-10-22 12:41:44 +01:00
Michael S. Tsirkin
794e8f301a exec: factor out duplicate mmap code
Anonymous and file-backed RAM allocation are now almost exactly the same.

Reduce code duplication by moving RAM mmap code out of oslib-posix.c and
exec.c.

Reported-by: Marc-André Lureau <mlureau@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Acked-by: Paolo Bonzini <pbonzini@redhat.com>

Tested-by: Thibaut Collet <thibaut.collet@6wind.com>
2015-10-21 09:24:44 +03:00
Peter Maydell
32857f4d5e exec.c: Collect AddressSpace related fields into a CPUAddressSpace struct
Gather up all the fields currently in CPUState which deal with the CPU's
AddressSpace into a separate CPUAddressSpace struct. This paves the way
for allowing the CPU to know about more than one AddressSpace.

The rearrangement also allows us to make the MemoryListener a directly
embedded object in the CPUAddressSpace (it could not be embedded in
CPUState because 'struct MemoryListener' isn't defined for the user-only
builds). This allows us to resolve the FIXME in tcg_commit() by going
directly from the MemoryListener to the CPUAddressSpace.

This patch extracts the actual update of the cached dispatch pointer
from cpu_reload_memory_map() (which is renamed accordingly to
cpu_reloading_memory_map() as it is only responsible for breaking
cpu-exec.c's RCU critical section now). This lets us keep the definition
of the CPUAddressSpace struct private to exec.c.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Message-Id: <1443709790-25180-4-git-send-email-peter.maydell@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2015-10-12 18:29:26 +02:00
Peter Maydell
0a1c71cec6 exec.c: Don't call cpu_reload_memory_map() from cpu_exec_init()
Currently we call cpu_reload_memory_map() from cpu_exec_init(),
but this is not necessary:
 * KVM doesn't use the data structures maintained by
   cpu_reload_memory_map() (the TLB and cpu->memory_dispatch)
 * for TCG, we will call this function via tcg_commit() either
   as soon as tcg_cpu_address_space_init() registers the listener,
   or when the first MemoryRegion is added to the AddressSpace
   if the AS is empty when we register the listener

The unnecessary call is awkward for adding support for multiple
address spaces per CPU, so drop it.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Edgar E. Iglesias <edgar.iglesias@gmail.com>
Message-Id: <1443709790-25180-2-git-send-email-peter.maydell@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2015-10-12 18:29:25 +02:00
Michael S. Tsirkin
8561c9244d exec: allocate PROT_NONE pages on top of RAM
This inserts a read and write protected page between RAM and QEMU
memory, for file-backend RAM.
This makes it harder to exploit QEMU bugs resulting from buffer
overflows in devices using variants of cpu_physical_memory_map,
dma_memory_map etc.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Acked-by: Paolo Bonzini <pbonzini@redhat.com>
2015-10-01 16:16:52 +03:00
Peter Crosthwaite
dfccc76023 include/exec: Move cputlb exec.c defs out
Move the architecture agnostic function prototypes for exec.c out of
cputlb.h to exec-all.h. This allows hiding of the arch specific
cputlb.h from exec.c which should be getting close to having no
architecture specifics. Prepares support for multi-arch, which will have
a minimal cpu.h that services exec.c but not cputlb.h.

Reviewed-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Peter Crosthwaite <crosthwaite.peter@gmail.com>
Message-Id: <b4fe754c58c860315e35d44430c26b1c967ce2c9.1441614289.git.crosthwaite.peter@gmail.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2015-09-16 17:33:33 +02:00
Peter Crosthwaite
bcae01e468 cputlb: Change tlb_set_dirty() arg to cpu
Change tlb_set_dirty() to accept a CPU instead of an env pointer. This
allows for removal of another CPUArchState usage from prototypes that
need to be QOMified.

Signed-off-by: Peter Crosthwaite <crosthwaite.peter@gmail.com>
Message-Id: <d2b1dcbe7945112989861d8ba7369449c11cc273.1441614289.git.crosthwaite.peter@gmail.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2015-09-16 17:33:33 +02:00
Peter Crosthwaite
9a13565d52 cputlb: move CPU_LOOP() for tlb_reset() to exec.c
To prepare for multi-arch, cputlb.c should only have awareness of one
single architecture. This means it should not have access to the full
CPU lists which may be heterogeneous. Instead, push the CPU_LOOP() up
to the one and only caller in exec.c.

Signed-off-by: Peter Crosthwaite <crosthwaite.peter@gmail.com>
Message-Id: <db06dc6c49f8970caaf116d0385f00ee10a56f2f.1441614289.git.crosthwaite.peter@gmail.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2015-09-16 17:33:33 +02:00
Andrey Smetanin
bac05aa9a7 cpu: Add crash_occurred flag into CPUState
CPUState::crash_occurred field inside CPUState marks
that guest crash occurred. This value is added into
cpu common migration subsection.

Signed-off-by: Andrey Smetanin <asmetanin@virtuozzo.com>
Signed-off-by: Denis V. Lunev <den@openvz.org>
CC: Paolo Bonzini <pbonzini@redhat.com>
CC: Andreas Färber <afaerber@suse.de>
Message-Id: <1435924905-8926-12-git-send-email-den@openvz.org>
[Document the new field. - Paolo]
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2015-09-16 17:33:32 +02:00
Peter Maydell
a2aa09e181 * Support for jemalloc
* qemu_mutex_lock_iothread "No such process" fix
 * cutils: qemu_strto* wrappers
 * iohandler.c simplification
 * Many other fixes and misc patches.
 
 And some MTTCG work (with Emilio's fixes squashed):
 * Signal-free TCG kick
 * Removing spinlock in favor of QemuMutex
 * User-mode emulation multi-threading fixes/docs
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2
 
 iQEcBAABCAAGBQJV8Tk7AAoJEL/70l94x66Ds3QH/3bi0RRR2NtKIXAQrGo5tfuD
 NPMu1K5Hy+/26AC6mEVNRh4kh7dPH5E4NnDGbxet1+osvmpjxAjc2JrxEybhHD0j
 fkpzqynuBN6cA2Gu5GUNoKzxxTmi2RrEYigWDZqCftRXBeO2Hsr1etxJh9UoZw5H
 dgpU3j/n0Q8s08jUJ1o789knZI/ckwL4oXK4u2KhSC7ZTCWhJT7Qr7c0JmiKReaF
 JEYAsKkQhICVKRVmC8NxML8U58O8maBjQ62UN6nQpVaQd0Yo/6cstFTZsRrHMHL3
 7A2Tyg862cMvp+1DOX3Bk02yXA+nxnzLF8kUe0rYo6llqDBDStzqyn1j9R0qeqA=
 =nB06
 -----END PGP SIGNATURE-----

Merge remote-tracking branch 'remotes/bonzini/tags/for-upstream' into staging

* Support for jemalloc
* qemu_mutex_lock_iothread "No such process" fix
* cutils: qemu_strto* wrappers
* iohandler.c simplification
* Many other fixes and misc patches.

And some MTTCG work (with Emilio's fixes squashed):
* Signal-free TCG kick
* Removing spinlock in favor of QemuMutex
* User-mode emulation multi-threading fixes/docs

# gpg: Signature made Thu 10 Sep 2015 09:03:07 BST using RSA key ID 78C7AE83
# gpg: Good signature from "Paolo Bonzini <bonzini@gnu.org>"
# gpg:                 aka "Paolo Bonzini <pbonzini@redhat.com>"

* remotes/bonzini/tags/for-upstream: (44 commits)
  cutils: work around platform differences in strto{l,ul,ll,ull}
  cpu-exec: fix lock hierarchy for user-mode emulation
  exec: make mmap_lock/mmap_unlock globally available
  tcg: comment on which functions have to be called with mmap_lock held
  tcg: add memory barriers in page_find_alloc accesses
  remove unused spinlock.
  replace spinlock by QemuMutex.
  cpus: remove tcg_halt_cond and tcg_cpu_thread globals
  cpus: protect work list with work_mutex
  scripts/dump-guest-memory.py: fix after RAMBlock change
  configure: Add support for jemalloc
  add macro file for coccinelle
  configure: factor out adding disas configure
  vhost-scsi: fix wrong vhost-scsi firmware path
  checkpatch: remove tests that are not relevant outside the kernel
  checkpatch: adapt some tests to QEMU
  CODING_STYLE: update mixed declaration rules
  qmp: Add example usage of strto*l() qemu wrapper
  cutils: Add qemu_strtoull() wrapper
  cutils: Add qemu_strtoll() wrapper
  ...

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2015-09-14 16:13:16 +01:00
Paolo Bonzini
f240eb6fdc remove qemu/tls.h
TLS is now required on all platforms, so DECLARE_TLS/DEFINE_TLS is not
needed anymore.  Removing it does not break Windows because of the
previous patch.

Reviewed-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2015-09-09 15:34:53 +02:00
Peter Maydell
6554f5c037 exec.c: Use pow2floor() rather than hand-calculation
Use pow2floor() to round down to the nearest power of 2,
rather than an inline calculation.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Message-id: 1437741192-20955-5-git-send-email-peter.maydell@linaro.org
2015-09-07 14:19:00 +01:00
Chen Hanxiao
9284f31994 exec: use macro ROUND_UP for alignment
Use ROUND_UP instead.

Signed-off-by: Chen Hanxiao <chenhanxiao@cn.fujitsu.com>
Message-Id: <1437707523-4910-1-git-send-email-chenhanxiao@cn.fujitsu.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2015-08-14 23:40:32 +02:00
Peter Maydell
0b8e2c1002 exec.c: Use atomic_rcu_read() to access dispatch in memory_region_section_get_iotlb()
When accessing the dispatch pointer in an AddressSpace within an RCU
critical section we should always use atomic_rcu_read(). Fix an
access within memory_region_section_get_iotlb() which was incorrectly
doing a direct pointer access.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Message-Id: <1437391637-31576-1-git-send-email-peter.maydell@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2015-07-23 07:37:38 +02:00
Peter Crosthwaite
4bad9e392e cpu: Change cpu_exec_init() arg to cpu, not env
The callers (most of them in target-foo/cpu.c) to this function all
have the cpu pointer handy. Just pass it to avoid an ENV_GET_CPU() from
core code (in exec.c).

Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Richard Henderson <rth@twiddle.net>
Cc: Peter Maydell <peter.maydell@linaro.org>
Cc: "Edgar E. Iglesias" <edgar.iglesias@gmail.com>
Cc: Eduardo Habkost <ehabkost@redhat.com>
Cc: Michael Walle <michael@walle.cc>
Cc: Leon Alrae <leon.alrae@imgtec.com>
Cc: Anthony Green <green@moxielogic.com>
Cc: Jia Liu <proljc@gmail.com>
Cc: Alexander Graf <agraf@suse.de>
Cc: Blue Swirl <blauwirbel@gmail.com>
Cc: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Cc: Bastian Koppelmann <kbastian@mail.uni-paderborn.de>
Cc: Guan Xuetao <gxt@mprc.pku.edu.cn>
Cc: Max Filippov <jcmvbkbc@gmail.com>
Reviewed-by: Andreas Färber <afaerber@suse.de>
Reviewed-by: Aurelien Jarno <aurelien@aurel32.net>
Signed-off-by: Peter Crosthwaite <crosthwaite.peter@gmail.com>
Signed-off-by: Andreas Färber <afaerber@suse.de>
2015-07-09 15:20:40 +02:00
Peter Crosthwaite
bbd77c180d translate-all: Change tb_flush() env argument to cpu
All of the core-code usages of this API have the cpu pointer handy so
pass it in. There are only 3 architecture specific usages (2 of which
are commented out) which can just use ENV_GET_CPU() locally to get the
cpu pointer. The reduces core code usage of the CPU env, which brings
us closer to common-obj'ing these core files.

Cc: Riku Voipio <riku.voipio@iki.fi>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Eduardo Habkost <ehabkost@redhat.com>
Acked-by: Eduardo Habkost <ehabkost@redhat.com>
Signed-off-by: Peter Crosthwaite <crosthwaite.peter@gmail.com>
Signed-off-by: Andreas Färber <afaerber@suse.de>
2015-07-09 15:20:40 +02:00
Bharata B Rao
b7bca73334 cpu: Convert cpu_index into a bitmap
Currently CPUState::cpu_index is monotonically increasing and a newly
created CPU always gets the next higher index. The next available
index is calculated by counting the existing number of CPUs. This is
fine as long as we only add CPUs, but there are architectures which
are starting to support CPU removal, too. For an architecture like PowerPC
which derives its CPU identifier (device tree ID) from cpu_index, the
existing logic of generating cpu_index values causes problems.

With the currently proposed method of handling vCPU removal by parking
the vCPU fd in QEMU
(Ref: http://lists.gnu.org/archive/html/qemu-devel/2015-02/msg02604.html),
generating cpu_index this way will not work for PowerPC.

This patch changes the way cpu_index is handed out by maintaining
a bit map of the CPUs that tracks both addition and removal of CPUs.

The CPU bitmap allocation logic is part of cpu_exec_init(), which is
called by instance_init routines of various CPU targets. Newly added
cpu_exec_exit() API handles the deallocation part and this routine is
called from generic CPU instance_finalize.

Note: This new CPU enumeration is for !CONFIG_USER_ONLY only.
CONFIG_USER_ONLY continues to have the old enumeration logic.

Signed-off-by: Bharata B Rao <bharata@linux.vnet.ibm.com>
Reviewed-by: Eduardo Habkost <ehabkost@redhat.com>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Peter Crosthwaite <peter.crosthwaite@xilinx.com>
Acked-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Peter Crosthwaite <crosthwaite.peter@gmail.com>
[AF: max_cpus -> MAX_CPUMASK_BITS]
Signed-off-by: Andreas Färber <afaerber@suse.de>
2015-07-09 15:20:40 +02:00
Bharata B Rao
5a790cc4b9 cpu: Add Error argument to cpu_exec_init()
Add an Error argument to cpu_exec_init() to let users collect the
error. This is in preparation to change the CPU enumeration logic
in cpu_exec_init(). With the new enumeration logic, cpu_exec_init()
can fail if cpu_index values corresponding to max_cpus have already
been handed out.

Since all current callers of cpu_exec_init() are from instance_init,
use error_abort Error argument to abort in case of an error.

Signed-off-by: Bharata B Rao <bharata@linux.vnet.ibm.com>
Reviewed-by: Eduardo Habkost <ehabkost@redhat.com>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Peter Crosthwaite <peter.crosthwaite@xilinx.com>
Acked-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Peter Crosthwaite <crosthwaite.peter@gmail.com>
Signed-off-by: Andreas Färber <afaerber@suse.de>
2015-07-09 15:20:40 +02:00
Eduardo Habkost
291135b5da cpu: Reorder cpu->as, cpu->thread_id, cpu->memory_dispatch init
Instead of initializing cpu->as, cpu->thread_id, and reloading memory
map while holding cpu_list_lock(), do it earlier, before locking the CPU
list and initializing cpu_index.

This allows the code handling cpu_index and global CPU list to be
isolated from the rest.

Cc: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Signed-off-by: Andreas Färber <afaerber@suse.de>
2015-07-09 15:20:39 +02:00
Eduardo Habkost
7c39163e38 cpu: Initialize breakpoint/watchpoint lists in cpu_common_initfn()
One small step in the simplification of cpu_exec_init().

Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Signed-off-by: Andreas Färber <afaerber@suse.de>
2015-07-09 15:20:39 +02:00
Eduardo Habkost
199fc85acd cpu: No need to zero-initialize CPUState::numa_node
QOM objects are already zero-filled when instantiated, there's no need
to explicitly set numa_node to 0.

Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Signed-off-by: Andreas Färber <afaerber@suse.de>
2015-07-09 15:20:39 +02:00
Li Zhijian
dd63169766 migration: extend migration_bitmap
Prevously, if we hotplug a device(e.g. device_add e1000) during
migration is processing in source side, qemu will add a new ram
block but migration_bitmap is not extended.
In this case, migration_bitmap will overflow and lead qemu abort
unexpectedly.

Signed-off-by: Li Zhijian <lizhijian@cn.fujitsu.com>
Signed-off-by: Wen Congyang <wency@cn.fujitsu.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
2015-07-07 14:54:56 +02:00
Paolo Bonzini
b242e0e0e2 exec: skip MMIO regions correctly in cpu_physical_memory_write_rom_internal
Loading the BIOS in the mac99 machine is interesting, because there is a
PROM in the middle of the BIOS region (from 16K to 32K).  Before memory
region accesses were clamped, when QEMU was asked to load a BIOS from
0xfff00000 to 0xffffffff it would put even those 16K from the BIOS file
into the region.  This is weird because those 16K were not actually
visible between 0xfff04000 and 0xfff07fff.  However, it worked.

After clamping was added, this also worked.  In this case, the
cpu_physical_memory_write_rom_internal function split the write in
three parts: the first 16K were copied, the PROM area (second 16K) were
ignored, then the rest was copied.

Problems then started with commit 965eb2f (exec: do not clamp accesses
to MMIO regions, 2015-06-17).  Clamping accesses is not done for MMIO
regions because they can overlap wildly, and MMIO registers can be
expected to perform full-width accesses based only on their address
(with no respect for adjacent registers that could decode to completely
different MemoryRegions).  However, this lack of clamping also applied
to the PROM area!  cpu_physical_memory_write_rom_internal thus failed
to copy the third range above, i.e. only copied the first 16K of the BIOS.

In effect, address_space_translate is expecting _something else_ to do
the clamping for MMIO regions if the incoming length is large.  This
"something else" is memory_access_size in the case of address_space_rw,
so use the same logic in cpu_physical_memory_write_rom_internal.

Reported-by: Alexander Graf <agraf@redhat.com>
Reviewed-by: Laurent Vivier <lvivier@redhat.com>
Tested-by: Laurent Vivier <lvivier@redhat.com>
Fixes: 965eb2f
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2015-07-06 14:59:11 +02:00
Jan Kiszka
4840f10eff memory: let address_space_rw/ld*/st* run outside the BQL
The MMIO case is further broken up in two cases: if the caller does not
hold the BQL on invocation, the unlocked one takes or avoids BQL depending
on the locking strategy of the target memory region and its coalesced
MMIO handling.  In this case, the caller should not hold _any_ lock
(a friendly suggestion which is disregarded by virtio-scsi-dataplane).

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Cc: Frederic Konrad <fred.konrad@greensocs.com>
Message-Id: <1434646046-27150-6-git-send-email-pbonzini@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2015-07-01 15:45:51 +02:00
Paolo Bonzini
125b380666 exec: pull qemu_flush_coalesced_mmio_buffer() into address_space_rw/ld*/st*
As memory_region_read/write_accessor will now be run also without BQL held,
we need to move coalesced MMIO flushing earlier in the dispatch process.

Cc: Frederic Konrad <fred.konrad@greensocs.com>
Message-Id: <1434646046-27150-5-git-send-email-pbonzini@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2015-07-01 15:45:50 +02:00
Paolo Bonzini
e4a511f8cc exec: clamp accesses against the MemoryRegionSection
Because the clamping was done against the MemoryRegion,
address_space_rw was effectively broken if a write spanned
multiple sections that are not linear in underlying memory
(with the memory not being under an IOMMU).

This is visible with the MIPS rc4030 IOMMU, which is implemented
as a series of alias memory regions that point to the actual RAM.

Tested-by: Hervé Poussineau <hpoussin@reactos.org>
Tested-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2015-06-19 12:27:14 +02:00
Paolo Bonzini
965eb2fcdf exec: do not clamp accesses to MMIO regions
It is common for MMIO registers to overlap, for example a 4 byte register
at 0xcf8 (totally random choice... :)) and a 1 byte register at 0xcf9.
If these registers are implemented via separate MemoryRegions, it is
wrong to clamp the accesses as the value written would be truncated.

Hence for these regions the effects of commit 23820db (exec: Respect
as_translate_internal length clamp, 2015-03-16, previously applied as
commit c3c1bb99) must be skipped.

Tested-by: Hervé Poussineau <hpoussin@reactos.org>
Tested-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2015-06-19 12:27:14 +02:00
Dr. David Alan Gilbert
e3807054e2 qemu_ram_foreach_block: pass up error value, and down the ramblock name
check the return value of the function it calls and error if it's non-0
Fixup qemu_rdma_init_one_block that is the only current caller,
  and rdma_add_block the only function it calls using it.

Pass the name of the ramblock to the function; helps in debugging.

Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Amit Shah <amit.shah@redhat.com>
Reviewed-by: Michael R. Hines <mrhines@us.ibm.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
2015-06-12 06:54:01 +02:00
Juan Quintela
5cd8cadae8 migration: Use normal VMStateDescriptions for Subsections
We create optional sections with this patch.  But we already have
optional subsections.  Instead of having two mechanism that do the
same, we can just generalize it.

For subsections we just change:

- Add a needed function to VMStateDescription
- Remove VMStateSubsection (after removal of the needed function
  it is just a VMStateDescription)
- Adjust the whole tree, moving the needed function to the corresponding
  VMStateDescription

Signed-off-by: Juan Quintela <quintela@redhat.com>
2015-06-12 06:53:57 +02:00
Stefan Hajnoczi
03eebc9e32 memory: replace cpu_physical_memory_reset_dirty() with test-and-clear
The cpu_physical_memory_reset_dirty() function is sometimes used
together with cpu_physical_memory_get_dirty().  This is not atomic since
two separate accesses to the dirty memory bitmap are made.

Turn cpu_physical_memory_reset_dirty() and
cpu_physical_memory_clear_dirty_range_type() into the atomic
cpu_physical_memory_test_and_clear_dirty().

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-Id: <1417519399-3166-6-git-send-email-stefanha@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2015-06-05 17:10:00 +02:00
Paolo Bonzini
e87f7778b6 exec: only check relevant bitmaps for cleanliness
Most of the time, not all bitmaps have to be marked as dirty;
do not do anything if the interesting ones are already dirty.
Previously, any clean bitmap would have cause all the bitmaps to be
marked dirty.

In fact, unless running TCG most of the time bitmap operations need
not be done at all, because memory_region_is_logging returns zero.
In this case, skip the call to cpu_physical_memory_range_includes_clean
altogether as well.

With this patch, cpu_physical_memory_set_dirty_range is called
unconditionally, so there need not be anymore a separate call to
xen_modified_memory.

Reviewed-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2015-06-05 17:10:00 +02:00
Paolo Bonzini
58d2707e87 exec: pass client mask to cpu_physical_memory_set_dirty_range
This cuts in half the cost of bitmap operations (which will become more
expensive when made atomic) during migration on non-VRAM regions.

Reviewed-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2015-06-05 17:09:59 +02:00
Paolo Bonzini
358653391b translate-all: remove unnecessary argument to tb_invalidate_phys_range
The is_cpu_write_access argument is always 0, remove it.

Reviewed-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2015-06-05 17:09:59 +02:00
Paolo Bonzini
845b6214a3 exec: use memory_region_get_dirty_log_mask to optimize dirty tracking
The memory API can now return the exact set of bitmaps that have to
be tracked.  Use it instead of the in_migration variable.

In the next patches, we will also use it to set only DIRTY_MEMORY_VGA
or DIRTY_MEMORY_MIGRATION if necessary.  This can make a difference
for dataplane, especially after the dirty bitmap is changed to use
more expensive atomic operations.

Of some interest is the change to stl_phys_notdirty.  When migration
was introduced, stl_phys_notdirty was changed to effectively behave
as stl_phys during migration.  In fact, if one looks at the function as it
was in the beginning (commit 8df1cd0, physical memory access functions,
2005-01-28), at the time the dirty bitmap was the equivalent of
DIRTY_MEMORY_CODE nowadays; hence, the function simply should not touch
the dirty code bits.  This patch changes it to do the intended thing.

Reviewed-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2015-06-05 17:09:59 +02:00
Paolo Bonzini
49dfcec403 ram_addr: tweaks to xen_modified_memory
Invoke xen_modified_memory from cpu_physical_memory_set_dirty_range_nocode;
it is akin to DIRTY_MEMORY_MIGRATION, so set it together with that bitmap.
The remaining call from invalidate_and_set_dirty's "else" branch will go
away soon.

Second, fix the second argument to the function in the
cpu_physical_memory_set_dirty_lebitmap call site.  That function is only used
by KVM, but it is better to be clean anyway.

Acked-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2015-06-05 17:09:59 +02:00
Paolo Bonzini
db94604b20 exec: optimize phys_page_set_level
phys_page_set_level is writing zeroes to a struct that has just been
filled in by phys_map_node_alloc.  Instead, tell phys_map_node_alloc
whether to fill in the page "as a leaf" or "as a non-leaf".

memcpy is faster than struct assignment, which copies each bitfield
individually.  A compiler bug (https://gcc.gnu.org/PR66391), and
small memcpys like this one are special-cased anyway, and optimized
to a register move, so just use the memcpy.

This cuts the cost of phys_page_set_level from 25% to 5% when
booting qboot.

Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2015-06-05 17:09:58 +02:00
Paolo Bonzini
41063e1e7a exec: move rcu_read_lock/unlock to address_space_translate callers
Once address_space_translate will be called outside the BQL, the returned
MemoryRegion might disappear as soon as the RCU read-side critical section
ends.  Avoid this by moving the critical section to the callers.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Message-Id: <1426684909-95030-3-git-send-email-pbonzini@redhat.com>
2015-04-30 16:55:32 +02:00
Peter Maydell
06feaacfb4 - miscellaneous cleanups for TCG (Emilio) and NBD (Bogdan)
- next part in the thread-safe address_space_* saga: atomic access
   to the bounce buffer and the map_clients list, from Fam
 - optional support for linking with tcmalloc, also from Fam
 - reapplying Peter Crosthwaite's "Respect as_translate_internal
   length clamp" after fixing the SPARC fallout.
 - build system fix from Wei Liu
 - small acpi-build and ioport cleanup by myself
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2
 
 iQEcBAABCAAGBQJVQJd4AAoJEL/70l94x66DYFYH/3ifhqWZsd4dfJri0CGAHI4i
 SpPmNeouc8W+F/3lwf6Inrh5NnTgd5QzoUBMQaWVkQKwUiWls8g2mXkT3jo0iDqT
 /B40YXnZjNm20MixNaZmk9AsOF6OqPM8EMufau874k5zTlx3tCGAW1QD+I1N7WK7
 DfsFsIUD1svo2prn55fSoitMG1TIVPnpcklb4YGJRbAacQYUDhr5KAIhT1quDR2R
 93BvToyQmPqRQ4YKqnJLp8HAkL4FaJumfFZVvyh2cZvyaYGN/RVdi2Dw985dJDPX
 /z4enE4GCAs4RDw3lZ1RDbiZDqpT2ibFgASg/arX3SxzqHirOGvMdkOjO99r9j4=
 =aLjh
 -----END PGP SIGNATURE-----

Merge remote-tracking branch 'remotes/bonzini/tags/for-upstream' into staging

- miscellaneous cleanups for TCG (Emilio) and NBD (Bogdan)
- next part in the thread-safe address_space_* saga: atomic access
  to the bounce buffer and the map_clients list, from Fam
- optional support for linking with tcmalloc, also from Fam
- reapplying Peter Crosthwaite's "Respect as_translate_internal
  length clamp" after fixing the SPARC fallout.
- build system fix from Wei Liu
- small acpi-build and ioport cleanup by myself

# gpg: Signature made Wed Apr 29 09:34:00 2015 BST using RSA key ID 78C7AE83
# gpg: Good signature from "Paolo Bonzini <bonzini@gnu.org>"
# gpg:                 aka "Paolo Bonzini <pbonzini@redhat.com>"
# gpg: WARNING: This key is not certified with sufficiently trusted signatures!
# gpg:          It is not certain that the signature belongs to the owner.
# Primary key fingerprint: 46F5 9FBD 57D6 12E7 BFD4  E2F7 7E15 100C CD36 69B1
#      Subkey fingerprint: F133 3857 4B66 2389 866C  7682 BFFB D25F 78C7 AE83

* remotes/bonzini/tags/for-upstream: (22 commits)
  nbd/trivial: fix type cast for ioctl
  translate-all: use bitmap helpers for PageDesc's bitmap
  target-i386: disable LINT0 after reset
  Makefile.target: prepend $libs_softmmu to $LIBS
  milkymist: do not modify libs-softmmu
  configure: Add support for tcmalloc
  exec: Respect as_translate_internal length clamp
  ioport: reserve the whole range of an I/O port in the AddressSpace
  ioport: loosen assertions on emulation of 16-bit ports
  ioport: remove wrong comment
  ide: there is only one data port
  gus: clean up MemoryRegionPortio
  sb16: remove useless mixer_write_indexw
  sun4m: fix slavio sysctrl and led register sizes
  acpi-build: remove dependency from ram_addr.h
  memory: add memory_region_ram_resize
  dma-helpers: Fix race condition of continue_after_map_failure and dma_aio_cancel
  exec: Notify cpu_register_map_client caller if the bounce buffer is available
  exec: Protect map_client_list with mutex
  linux-user, bsd-user: Remove two calls to cpu_exec_init_all
  ...

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2015-04-30 12:04:11 +01:00
Peter Crosthwaite
23820dbfc7 exec: Respect as_translate_internal length clamp
address_space_translate_internal will clamp the *plen length argument
based on the size of the memory region being queried. The iommu walker
logic in addresss_space_translate was ignoring this by discarding the
post fn call value of *plen. Fix by just always using *plen as the
length argument throughout the fn, removing the len local variable.

This fixes a bootloader bug when a single elf section spans multiple
QEMU memory regions.

Signed-off-by: Peter Crosthwaite <peter.crosthwaite@xilinx.com>
Message-Id: <1426570554-15940-1-git-send-email-peter.crosthwaite@xilinx.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2015-04-27 18:24:19 +02:00
Fam Zheng
e95205e1f9 dma-helpers: Fix race condition of continue_after_map_failure and dma_aio_cancel
If DMA's owning thread cancels the IO while the bounce buffer's owning thread
is notifying the "cpu client list", a use-after-free happens:

     continue_after_map_failure               dma_aio_cancel
     ------------------------------------------------------------------
     aio_bh_new
                                              qemu_bh_delete
     qemu_bh_schedule (use after free)

Also, the old code doesn't run the bh in the right AioContext.

Fix both problems by passing a QEMUBH to cpu_register_map_client.

Signed-off-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Message-Id: <1426496617-10702-6-git-send-email-famz@redhat.com>
[Remove unnecessary forward declaration. - Paolo]
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2015-04-27 18:24:18 +02:00
Fam Zheng
33b6c2edf6 exec: Notify cpu_register_map_client caller if the bounce buffer is available
The caller's workflow is like

    if (!address_space_map()) {
        ...
        cpu_register_map_client();
    }

If bounce buffer became available after address_space_map() but before
cpu_register_map_client(), the caller could miss it and has to wait for the
next bounce buffer notify, which may never happen in the worse case.

Just notify the list in cpu_register_map_client().

Signed-off-by: Fam Zheng <famz@redhat.com>
Message-Id: <1426496617-10702-5-git-send-email-famz@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2015-04-27 18:24:18 +02:00
Fam Zheng
38e047b50d exec: Protect map_client_list with mutex
So that accesses from multiple threads are safe.

Signed-off-by: Fam Zheng <famz@redhat.com>
Message-Id: <1426496617-10702-4-git-send-email-famz@redhat.com>
[Remove #if from cpu_exec_init_all. - Paolo]
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2015-04-27 18:24:17 +02:00