Note that even after this patch, most callers of address_space_*
functions must still be under the big QEMU lock, otherwise the memory
region returned by address_space_translate can disappear as soon as
address_space_translate returns. This will be fixed in the next part
of this series.
Reviewed-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
After the previous patch, TLBs will be flushed on every change to
the memory mapping. This patch augments that with synchronization
of the MemoryRegionSections referred to in the iotlb array.
With this change, it is guaranteed that iotlb_to_region will access
the correct memory map, even once the TLB will be accessed outside
the BQL.
Reviewed-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
This for now is a simple TLB flush. This can change later for two
reasons:
1) an AddressSpaceDispatch will be cached in the CPUState object
2) it will not be possible to do tlb_flush once the TCG-generated code
runs outside the BQL.
Reviewed-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
address_space_destroy_dispatch is called from an RCU callback and hence
outside the iothread mutex (BQL). However, after address_space_destroy
no new accesses can hit the destroyed AddressSpace so it is not necessary
to observe changes to the memory map. Move the memory_listener_unregister
call earlier, to make it thread-safe again.
Reported-by: Alex Williamson <alex.williamson@redhat.com>
Fixes: 374f2981d1
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Coverity flags this as "dereference after null check". Not quite a
dereference, since it will just EFAULT, but still nice to fix.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
The TARGET_HAS_ICE #define is intended to indicate whether a target-*
guest CPU implementation supports the breakpoint handling. However,
all our guest CPUs have that support (the only two which do not
define TARGET_HAS_ICE are unicore32 and openrisc, and in both those
cases the bp support is present and the lack of the #define is just
a bug). So remove the #define entirely: all new guest CPU support
should include breakpoint handling as part of the basic implementation.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <rth@twiddle.net>
Message-id: 1420484960-32365-1-git-send-email-peter.maydell@linaro.org
This makes ROM blocks resizeable. This infrastructure is required for other
functionality we have queued.
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1
iQEcBAABAgAGBQJUrme8AAoJECgfDbjSjVRpqmEH/1APnrphAi/CM6rxf2hPyvWj
f5yQDNXfeGxrHaW5vux6DvgHUkTng6KGBxz6XMSiwul6MeyRFNDqwbfMhSHjiIum
QkT//jqb5xux60kyTLXuIBTPok1SsKDtaTxbvZb0VmZrnkdYeI2CLa1Mq3cQUY0a
8DKnchQEM5lic9bxj+OuLiDFx8QYaMpQlUP9iIvNq6GjX+0zNsWvfPtkMTm00t93
lHKPvD2eVmrgfS5g+lkAwLDahLSjqwDc0YuLABOgDUFsZFz9GAUCHSpt0y8HEBwR
1NhGCfbnyyRl/1OSULtARGQ4Ddwm5dn1i5I4usoP5rLFS7FV5F7xhBu0IZlwgVA=
=pFmm
-----END PGP SIGNATURE-----
Merge remote-tracking branch 'remotes/mst/tags/for_upstream' into staging
pc: resizeable ROM blocks
This makes ROM blocks resizeable. This infrastructure is required for other
functionality we have queued.
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
# gpg: Signature made Thu 08 Jan 2015 11:19:24 GMT using RSA key ID D28D5469
# gpg: Good signature from "Michael S. Tsirkin <mst@kernel.org>"
# gpg: aka "Michael S. Tsirkin <mst@redhat.com>"
* remotes/mst/tags/for_upstream:
acpi-build: make ROMs RAM blocks resizeable
memory: API to allocate resizeable RAM MR
arch_init: support resizing on incoming migration
exec: qemu_ram_alloc_resizeable, qemu_ram_resize
exec: split length -> used_length/max_length
exec: cpu_physical_memory_set/clear_dirty_range
memory: add memory_region_set_size
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Add API to allocate "resizeable" RAM.
This looks just like regular RAM generally, but
has a special property that only a portion of it
(used_length) is actually used, and migrated.
This used_length size can change across reboots.
Follow up patches will change used_length for such blocks at migration,
making it easier to extend devices using such RAM (notably ACPI,
but in the future thinkably other ROMs) without breaking migration
compatibility or wasting ROM (guest) memory.
Device is notified on resize, so it can adjust if necessary.
qemu_ram_alloc_resizeable allocates this memory, qemu_ram_resize resizes
it.
Note: nothing prevents making all RAM resizeable in this way.
However, reviewers felt that only enabling this selectively will
make some class of errors easier to detect.
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
This patch allows us to distinguish between two
length values for each block:
max_length - length of memory block that was allocated
used_length - length of block used by QEMU/guest
Currently, we set used_length - max_length, unconditionally.
Follow-up patches allow used_length <= max_length.
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Make cpu_physical_memory_set/clear_dirty_range
behave symmetrically.
To clear range for a given client type only, add
cpu_physical_memory_clear_dirty_range_type.
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Otherwise fw_cfg accesses are split into 4-byte ones before they reach the
fw_cfg ops / handlers.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Laszlo Ersek <lersek@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Message-id: 1419250305-31062-6-git-send-email-pbonzini@redhat.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
In QEMU 2.2 the exception_index value was added to the migration stream
through a subsection. The default was set to 0, which is wrong and
should have been -1.
However, 2.2 does not have commit e511b4d (cpu-exec: reset exception_index
correctly, 2014-11-26), hence in 2.2 the exception_index is never used
and is set to -1 on the next call to cpu_exec. So we can change the
migration stream to make the default -1. The effects are:
- 2.2.1 -> 2.2.0: cpu->exception_index set incorrectly to 0 if it
were -1 on the source; then reset to -1 in cpu_exec. This is TCG
only; KVM does not use exception_index.
- 2.2.0 -> 2.2.1: cpu->exception_index set incorrectly to -1 if it
were 0 on the source; but it would be reset to -1 in cpu_exec anyway.
This is TCG only; KVM does not use exception_index.
- 2.2.1 -> 2.1: two bugs fixed: 1) can migrate backwards if
cpu->exception_index is set to -1; 2) should not migrate backwards
(but 2.2.0 allows it) if cpu->exception_index is set to 0
- 2.2.0 -> 2.3.0: 2.2.0 will send the subsection unnecessarily if
exception_index is -1, but that is not a problem. 2.3.0 will set
cpu->exception_index to -1 if it is 0 on the source, but this would
be anyway a problem for 2.2.0 -> 2.2.x migration (due to lack of
commit e511b4d in 2.2.x) so we can ignore it
- 2.2.1 -> 2.3.0: everything works.
In addition, play it safe and never send the subsection unless TCG
is in use. KVM does not use exception_index (PPC KVM stores values
in it for use in the subsequent call to ppc_cpu_do_interrupt, but
does not need it as soon as kvm_handle_debug returns). Xen and
qtest do not run any code for the CPU at all.
Reported-by: Igor Mammedov <imammedo@redhat.com>
Tested-by: Laurent Desnogues <laurent.desnogues@gmail.com>
Tested-by: Eduardo Habkost <ehabkost@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Message-id: 1418989994-17244-3-git-send-email-pbonzini@redhat.com
Cc: qemu-stable@nongnu.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
host pointer accesses force pointer math, let's
add a wrapper to make them safer.
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Amos Kong <akong@redhat.com>
Signed-off-by: Amit Shah <amit.shah@redhat.com>
introduce memory_region_get_alignment() that returns
underlying memory block alignment or 0 if it's not
relevant/implemented for backend.
Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
The code in invalidate_and_set_dirty() needs to handle addr/length
combinations which cross guest physical page boundaries. This can happen,
for example, when disk I/O reads large blocks into guest RAM which previously
held code that we have cached translations for. Unfortunately we were only
checking the clean/dirty status of the first page in the range, and then
were calling a tb_invalidate function which only handles ranges that don't
cross page boundaries. Fix the function to deal with multipage ranges.
The symptoms of this bug were that guest code would misbehave (eg segfault),
in particular after a guest reboot but potentially any time the guest
reused a page of its physical RAM for new code.
Cc: qemu-stable@nongnu.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Message-id: 1416167061-13203-1-git-send-email-peter.maydell@linaro.org
With commit 05068c0dfb 'exec.c: Relax restrictions on watchpoint length
and alignment' it's no longer possible to set 1-byte-long watchpoint
because of incorrect address range check.
Fix that by changing condition that checks for address wraparound.
Signed-off-by: Max Filippov <jcmvbkbc@gmail.com>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 1411016616-29879-1-git-send-email-jcmvbkbc@gmail.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
- Build: fixing block/iscsi.so and ranlib warnings on Mac OS X
- Migration fixes for x86
- The odd KVM patch.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2
iQIcBAABAgAGBQJUEXeWAAoJEBvWZb6bTYby4AwP/0Hh55A7QzkkzZ66y65zM+G5
dsgRcLjufHSRQHoNQqm6LOcicV3Ygc/X644EY6jnZCZxFh/fsWuTPqUDGxLAnxEc
2V0PkLRIScAMOPezzxvRy6/9hkG+UYM3ZOL5D9yxA9pGuBtttw7tkts19Vqf9WZc
NYG5TBDuEGM1c596Zpo7t10m+Oiw+Jyi5luLXsb4lh5ikdFPDrtJaf0AnFvR+ym0
HXlj2K/0vHNowUeLoo+oWnZsW8mLE6OyJhgfo1tJtsH1BR+lQJnBnQ4moq4Sl/Wz
+iht/4gtz34XwLILokFR6yiNrPe+MIryyv+FYxOD5loIdGVDtKMx30UkIE2/D933
6/n5i3GBLi9JapeT9gkKTxk/UVRPzJ1PK07RWevgNZNQyTGKAUGp+p48nSzMYX7V
7GFSy3Q8uqOR8g9n+t+RURxkoMNbhhw7v53Z3PPXPCALCMDzg9RARlW/nkfiExcZ
oThUjE/8xfMTQlN1SO5HTyQXEkYjtknZhfC7/KFvkWYMbCG0KBTf212Md0zlTNkj
+C6r8Gq4ZWVIc07QyKkoCMxB+a9Uhvy4T1PKuSlm6iu94zUgZRhdf/PlOXimhFqH
9GL67Tv15kpj05xCS6jDXjeMZ416/UKw91OcsiT1UUHcq7/rc+GBycd0ngV1UgnQ
di5V12IVt8JwdzFxMeCT
=GIKW
-----END PGP SIGNATURE-----
Merge remote-tracking branch 'remotes/bonzini/tags/for-upstream' into staging
- Memory: improve error reporting and avoid crashes on hotplug
- Build: fixing block/iscsi.so and ranlib warnings on Mac OS X
- Migration fixes for x86
- The odd KVM patch.
# gpg: Signature made Thu 11 Sep 2014 11:21:10 BST using RSA key ID 9B4D86F2
# gpg: Good signature from "Paolo Bonzini <pbonzini@redhat.com>"
# gpg: aka "Paolo Bonzini <bonzini@gnu.org>"
* remotes/bonzini/tags/for-upstream: (21 commits)
gdbstub: init mon_chr through qemu_chr_alloc
pckbd: adding new fields to vmstate
mc146818rtc: add missed field to vmstate
piix: do not set irq while loading vmstate
serial: fixing vmstate for save/restore
parallel: adding vmstate for save/restore
fdc: adding vmstate for save/restore
cpu: init vmstate for ticks and clock offset
apic_common: vapic_paddr synchronization fix
vl: use QLIST_FOREACH_SAFE to visit change state handlers
exec: add parameter errp to gethugepagesize
exec: report error when memory < hpagesize
hostmem-ram: don't exit qemu if size of memory-backend-ram is way too big
memory: add parameter errp to memory_region_init_rom_device
memory: add parameter errp to memory_region_init_ram
exec: add parameter errp to qemu_ram_alloc and qemu_ram_alloc_from_ptr
rules.mak: Fix DSO build by pulling in archive symbols
util: Don't link host-utils.o if it's empty
util: Move general qemu_getauxval to util/getauxval.c
trace: Only link generated-tracers.o with "simple" backend
...
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
If memory allocation fails when using the -mem-prealloc command-line
option, QEMU exits without printing any error information to
the user:
# qemu [...] -m 1G -mem-prealloc -mem-path /dev/hugepages
# echo $?
1
This commit adds an error message, so that we print instead:
# qemu [...] -m 1G -mem-prealloc -mem-path /dev/hugepages
qemu: unable to map backing store for hugepages: Cannot allocate memory
Signed-off-by: Luiz Capitulino <lcapitulino@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
When we check whether we've hit a watchpoint we know the address
that we were attempting to access and whether it was a read or a
write. Record this information in the CPUWatchpoint struct so that
target-specific code can report it to the guest.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <rth@twiddle.net>
We already provide dummy versions of the cpu_watchpoint_insert
and cpu_watchpoint_remove_all functions when CONFIG_USER_ONLY
is defined. Complete the set by providing cpu_watchpoint_remove
and cpu_watchpoint_remove_by_ref as well.
This allows target-* code using these functions to avoid
some ifdeffery.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <rth@twiddle.net>
The current implementation of watchpoints requires that they
have a power of 2 length which is not greater than TARGET_PAGE_SIZE
and that their address is a multiple of their length. Watchpoints
on ARM don't fit these restrictions, so change the implementation
so they can be relaxed.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <rth@twiddle.net>
Add parameter errp to gethugepagesize thus callers can handle errors.
If user adds a memory-backend-file object using object_add command,
specifying a non-existing directory for property mem-path, qemu will
core dump with message:
/nonexistingdir: No such file or directory
Bad ram offset fffffffffffff000
Aborted (core dumped)
This patch fixes the problem. With this patch, qemu reports an error
message like:
qemu-system-x86_64: -object memory-backend-file,mem-path=/nonexistingdir,id=mem-file0,size=128M:
failed to get page size of file /nonexistingdir: No such file or directory
Signed-off-by: Hu Tao <hutao@cn.fujitsu.com>
Reviewed-by: Peter Crosthwaite <peter.crosthwaite@xilinx.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Report an error when memory < hpagesize in file_ram_alloc() so callers
can handle the error.
If user adds a memory-backend-file object using object_add command,
specifying a size that is less than huge page size, qemu will core dump
with message:
Bad ram offset fffffffffffff000
Aborted (core dumped)
This patch fixes the problem. With this patch, qemu reports error
message like:
qemu-system-x86_64: -object memory-backend-file,mem-path=/hugepages,id=mem-file0,size=1M: memory
size 0x100000 must be equal to or larger than huge page size 0x200000
Signed-off-by: Hu Tao <hutao@cn.fujitsu.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Add parameter errp to qemu_ram_alloc and qemu_ram_alloc_from_ptr so that
we can handle errors.
Signed-off-by: Hu Tao <hutao@cn.fujitsu.com>
Reviewed-by: Peter Crosthwaite <peter.crosthwaite@xilinx.com>
[Assert ptr != NULL in memory_region_init_ram_ptr. - Paolo]
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
This patch adds a subsection with exception_index field to the VMState for
correct saving the CPU state.
Without this patch, simulator could miss the pending exception in the saved
virtual machine state.
Signed-off-by: Pavel Dovgalyuk <pavel.dovgaluk@ispras.ru>
Cc: qemu-stable@nongnu.org
Signed-off-by: Andreas Färber <afaerber@suse.de>
Add a bool variable is_write as a parameter to the translate function of
MemoryRegionIOMMUOps to indicate the operation of the access. It can be
used for correct fault reporting from within the callback.
Change the interface of related functions.
Signed-off-by: Le Tan <tamlokveer@gmail.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Memory changes for QOMification and automatic tracking of MR lifetime.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2.0.22 (GNU/Linux)
iQIcBAABAgAGBQJT8et9AAoJEBvWZb6bTYbyIJAQAI3AlLSe27xWoUGfQUgWH30z
Rt/pShHz3BJMfQpD79JfTH8u6uBpkQmKtflerNT7FhXN9ULDzNq+b/jRtke8nkuy
ctCt05FhhK00rfWpUoRue4XiCuvbizBU7MK0DI3yCyNdXQyYnFvgnvsJtlqox8Zh
J5HZcBJEmdCiWBxq7UPk0qBitp4PqNoy7jlD/Ex3m7fJN5WK2cyspQIT9zmhehVn
B8Nwp+RitDDbXbwm0r18col5rFr/6Nj6+dW1gr+7sVJDLNsmJEqC2l3Kgk0wbPkG
Uqwbih29me9PC9/L1VLGHY0ApKDQ8JGE0GrYgEg162hbhoxEHkjjoHMhDUfV6Pj8
NkqcjjWl11UUhgkNqrGafayXbBVnOiEglxy8uXCeq14y9Xd/gjK9Fz6MQvRSOjms
PFmaKknhdmpxh0DuZmTix7WBmKim8zOiCE0/vrAPvwx5L+d1bn5xh6yQvtVjBMpU
Sru3Mhdm9bL9dUDBgOM/G6WCxSTVLBlExOblcYkQh03MfabD7bfplcrKYPXt5ull
Y8YLjqkoIfoy5t0ErvtlpdBJjeEz99JXU+wLQ6NYHnzwzTV+oUtSaEph14mAFOcY
XkFKdoPDI9PnyEfvy4193du8z/dSbhu7sWgHWbTCQyrcaNnSaVhlH43NUC+p23YN
8vfEsVLd1X7MFkDBUmWp
=M+/m
-----END PGP SIGNATURE-----
Merge remote-tracking branch 'remotes/bonzini/tags/for-upstream' into staging
SCSI changes that enable sending vendor-specific commands via virtio-scsi.
Memory changes for QOMification and automatic tracking of MR lifetime.
# gpg: Signature made Mon 18 Aug 2014 13:03:09 BST using RSA key ID 9B4D86F2
# gpg: Good signature from "Paolo Bonzini <pbonzini@redhat.com>"
# gpg: aka "Paolo Bonzini <bonzini@gnu.org>"
* remotes/bonzini/tags/for-upstream:
mtree: remove write-only field
memory: Use canonical path component as the name
memory: Use memory_region_name for name access
memory: constify memory_region_name
exec: Abstract away ref to memory region names
loader: Abstract away ref to memory region names
tpm_tis: remove instance_finalize callback
memory: remove memory_region_destroy
memory: convert memory_region_destroy to object_unparent
ioport: split deletion and destruction
nic: do not destroy memory regions in cleanup functions
vga: do not dynamically allocate chain4_alias
sysbus: remove unused function sysbus_del_io
qom: object: move unparenting to the child property's release callback
qom: object: delete properties before calling instance_finalize
virtio-scsi: implement parse_cdb
scsi-block, scsi-generic: implement parse_cdb
scsi-block: extract scsi_block_is_passthrough
scsi-bus: introduce parse_cdb in SCSIDeviceClass and SCSIBusInfo
scsi-bus: prepare scsi_req_new for introduction of parse_cdb
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Use the function provided rather than spying on the struct.
Signed-off-by: Peter Crosthwaite <peter.crosthwaite@xilinx.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Devices that use address_space_rw to write large areas to memory
(as opposed to address_space_map/unmap) were broken with respect
to migration since fe680d0 (exec: Limit translation limiting in
address_space_translate to xen, 2014-05-07). Such devices include
IDE CD-ROMs.
The reason is that invalidate_and_set_dirty (called by address_space_rw
but not address_space_map/unmap) was only setting the dirty bit for
the first page in the translation.
To fix this, introduce cpu_physical_memory_set_dirty_range_nocode that
is the same as cpu_physical_memory_set_dirty_range except it does not
muck with the DIRTY_MEMORY_CODE bitmap. This function can be used if
the caller invalidates translations with tb_invalidate_phys_page_range.
There is another difference between cpu_physical_memory_set_dirty_range
and cpu_physical_memory_set_dirty_flag; the former includes a call
to xen_modified_memory. This is handled separately in
invalidate_and_set_dirty, and is not needed in other callers of
cpu_physical_memory_set_dirty_range_nocode, so leave it alone.
Just one nit: now that invalidate_and_set_dirty takes care of handling
multiple pages, there is no need for address_space_unmap to wrap it
in a loop. In fact that loop would now be O(n^2).
Reported-by: Dave Gilbert <dgilbert@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Tested-by: Gerd Hoffmann <kraxel@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
QOMify memory regions as an Object. The former init() and destroy()
routines become instance_init() and instance_finalize() resp.
memory_region_init() is re-implemented to be:
object_initialize() + set fields
memory_region_destroy() is re-implemented to call unparent().
Signed-off-by: Peter Crosthwaite <peter.crosthwaite@xilinx.com>
[Add newly-created MR as child, unparent on destruction. - Paolo]
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
We currently have a virtio_is_big_endian() helper that provides the target
endianness to the virtio code. As of today, the helper returns a fixed
compile-time value. Of course, this will have to change if we want to
support target endianness changes at run-time.
Let's move the TARGET_WORDS_BIGENDIAN bits out to a new helper and have
virtio_is_big_endian() implemented on top of it.
This patch doesn't change any functionality.
Signed-off-by: Greg Kurz <gkurz@linux.vnet.ibm.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Old code was affected by memory gaps which resulted in buffer pointers
pointing to address outside of the mapped regions.
Here we are introducing following changes:
- new function qemu_get_ram_block_host_ptr() returns host pointer
to the ram block, it is needed to calculate offset of specific
region in the host memory
- new field mmap_offset is added to the VhostUserMemoryRegion. It
contains offset where specific region starts in the mapped memory.
As there is stil no wider adoption of vhost-user agreement was made
that we will not bump version number due to this change
- other fileds in VhostUserMemoryRegion struct are not changed, as
they are all needed for usermode app implementation
- region data is not taken from ram_list.blocks anymore, instead we
use region data which is alredy calculated for use in vhost-net
- Now multiple regions can have same FD and user applicaton can call
mmap() multiple times with the same FD but with different offset
(user needs to take care for offset page alignment)
Signed-off-by: Damjan Marion <damarion@cisco.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Damjan Marion <damarion@cisco.com>
As a "utility", it only supported ppc, and in a way that other
tcg backends provided directly in tcg-target.h. Removing this
disparity is easier now that the two ppc backends are merged.
Tested-by: Tom Musta <tommusta@gmail.com>
Signed-off-by: Richard Henderson <rth@twiddle.net>
A new "share" property can be used with the "memory-file" backend to
map memory with MAP_SHARED instead of MAP_PRIVATE.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Hu Tao <hutao@cn.fujitsu.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
And allow preallocation of file-based memory even without -mem-prealloc.
Some care is necessary because -mem-prealloc does not allow disabling
preallocation for hostmem-file.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Hu Tao <hutao@cn.fujitsu.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Right now, -mem-path will fall back to RAM-based allocation in some
cases. This should never happen with "-object memory-file", prepare
the code by adding correct error propagation.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Hu Tao <hutao@cn.fujitsu.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
MST: drop \n at end of error messages
Like the previous patch did in exec.c, split memory_region_init_ram and
memory_region_init_ram_from_file, and push mem_path one step further up.
Other RAM regions than system memory will now be backed by regular RAM.
Also, boards that do not use memory_region_allocate_system_memory will
not support -mem-path anymore. This can be changed before the patches
are merged by migrating boards to use the function.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Hu Tao <hutao@cn.fujitsu.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Prepare for adding more flags. The "_MASK" suffix is unique, kill it.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Hu Tao <hutao@cn.fujitsu.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
So that backends can use it.
Since we need the page size for efficiency, move code to compute it
out of translate-all.c and into util/oslib-win32.c.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Hu Tao <hutao@cn.fujitsu.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Split the internal interface in exec.c to a separate function, and
push the check on mem_path up to memory_region_init_ram.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Hu Tao <hutao@cn.fujitsu.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Rather than use the global singleton.
Signed-off-by: Peter Crosthwaite <peter.crosthwaite@xilinx.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
After previous Peter patch, they are redundant. This way we don't
assign them except when needed. Once there, there were lots of case
where the ".fields" indentation was wrong:
.fields = (VMStateField []) {
and
.fields = (VMStateField []) {
Change all the combinations to:
.fields = (VMStateField[]){
The biggest problem (appart from aesthetics) was that checkpatch complained
when we copy&pasted the code from one place to another.
Signed-off-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
The address_space_translate() function cuts the returned plen (page size)
to hardcoded TARGET_PAGE_SIZE. This function can be used on pages bigger
than that so this limiting should not be used on such pages.
Since originally the limiting was introduced for XEN, we can safely
limit this piece of code to XEN. So does the patch.
Suggested-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
Commit 259186a7d2 (cpu: Move halted and
interrupt_request fields to CPUState) passed CPUState::env_ptr to
tlb_flush() directory rather than through a typed variable.
Commit 00c8cb0a36 (cputlb: Change
tlb_flush() argument to CPUState) now changed the argument type.
This was unnoticed by gcc because env_ptr is a void pointer.
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Andreas Färber <afaerber@suse.de>
Most targets were using offsetof(CPUFooState, breakpoints) to determine
how much of CPUFooState to clear on reset. Use the next field after
CPU_COMMON instead, if any, or sizeof(CPUFooState) otherwise.
Signed-off-by: Andreas Färber <afaerber@suse.de>
* remotes/kvm/uq/master:
target-i386: bugfix of Intel MPX
file_ram_alloc: unify mem-path,mem-prealloc error handling
kvm-all: exit in case max vcpus exceeded
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
This file does not depend on windows.h.
Signed-off-by: Stefan Weil <sw@weilnetz.de>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Andreas Färber <afaerber@suse.de>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
-mem-prealloc asks to preallocate memory residing on -mem-path path.
Currently QEMU exits in case:
- Memory file has been created but allocation via explicit write
fails.
And it fallbacks to malloc in case:
- Querying huge page size fails.
- Lack of sync MMU support.
- Open fails.
- mmap fails.
Have the same behaviour for all cases: fail in case -mem-path and
-mem-prealloc are specified for regions where the requested size is
suitable for hugepages.
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Commit 360e607 (address_space_translate: do not cross page boundaries,
2014-01-30) broke MMIO accesses in cases where the section is shorter
than the full register width. This can happen for example with the
Bochs DISPI registers, which are 16 bits wide but have only a 1-byte
long MemoryRegion (if you write to the "second byte" of the register
your access is discarded; it doesn't write only to half of the register).
Restrict the action of commit 360e607 to direct RAM accesses. This
is enough for Xen, since MMIO will not go through the mapcache.
Reported-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Cc: qemu-stable@nongnu.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Tested-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
The following commit:
commit 149f54b53b
Author: Paolo Bonzini <pbonzini@redhat.com>
Date: Fri May 24 12:59:37 2013 +0200
memory: add address_space_translate
breaks Xen support in QEMU, in particular the Xen mapcache. The effect
is that one Windows XP installation out of ten would end up with BSOD.
The reason is that after this commit l in address_space_rw can span a
page boundary, however qemu_get_ram_ptr still calls xen_map_cache asking
to map a single page (if block->offset == 0).
Fix the issue by reverting to the previous behaviour: do not return a
length from address_space_translate_internal that can span a page
boundary.
Also in address_space_translate do not ignore the length returned by
address_space_translate_internal.
This patch should be backported to QEMU 1.6.x.
Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
Signed-off-by: Anthony Perard <anthony.perard@citrix.com>
Tested-by: Paolo Bonzini <pbonzini@redhat.com>
Acked-by: Paolo Bonzini <pbonzini@redhat.com>
Cc: qemu-stable@nongnu.org
* qemu-kvm/uq/master:
kvm: always update the MPX model specific register
KVM: fix addr type for KVM_IOEVENTFD
KVM: Retry KVM_CREATE_VM on EINTR
mempath prefault: fix off-by-one error
kvm: x86: Separately write feature control MSR on reset
roms: Flush icache when writing roms to guest memory
target-i386: clear guest TSC on reset
target-i386: do not special case TSC writeback
target-i386: Intel MPX
Conflicts:
exec.c
aliguori: fix trivial merge conflict in exec.c
Signed-off-by: Anthony Liguori <aliguori@amazon.com>
All the functions that use ram_addr_t should be here.
Signed-off-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Orit Wasserman <owasserm@redhat.com>
Result was always 0, and not used anywhere. Once there, use bool type
for the parameter.
Signed-off-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Orit Wasserman <owasserm@redhat.com>
We have an end parameter in all the callers, and this make it coherent
with the rest of cpu_physical_memory_* functions, that also take a
length parameter.
Once here, move the start/end calculation to
tlb_reset_dirty_range_all() as we don't need it here anymore.
Signed-off-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Orit Wasserman <owasserm@redhat.com>
All uses except one really want the other meaning.
Signed-off-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Orit Wasserman <owasserm@redhat.com>
Now all functions use the same wording that bitops/bitmap operations
Signed-off-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Orit Wasserman <owasserm@redhat.com>
After all the previous patches, spliting the bitmap gets direct.
Note: For some reason, I have to move DIRTY_MEMORY_* definitions to
the beginning of memory.h to make compilation work.
Signed-off-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Orit Wasserman <owasserm@redhat.com>
Document it
Signed-off-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Orit Wasserman <owasserm@redhat.com>
So remove the flag argument and do it directly. After this change,
there is nothing else using cpu_physical_memory_set_dirty_flags() so
remove it.
Signed-off-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Orit Wasserman <owasserm@redhat.com>
Fix off-by-one error (noticed by Andrea Arcangeli).
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
We use the rom infrastructure to write firmware and/or initial kernel
blobs into guest address space. So we're basically emulating the cache
off phase on very early system bootup.
That phase is usually responsible for clearing the instruction cache for
anything it writes into cachable memory, to ensure that after reboot we
don't happen to execute stale bits from the instruction cache.
So we need to invalidate the icache every time we write a rom into guest
address space. We do not need to do this for every DMA since the guest
expects it has to flush the icache manually in that case.
This fixes random reboot issues on e5500 (booke ppc) for me.
Signed-off-by: Alexander Graf <agraf@suse.de>
We use the rom infrastructure to write firmware and/or initial kernel
blobs into guest address space. So we're basically emulating the cache
off phase on very early system bootup.
That phase is usually responsible for clearing the instruction cache for
anything it writes into cachable memory, to ensure that after reboot we
don't happen to execute stale bits from the instruction cache.
So we need to invalidate the icache every time we write a rom into guest
address space. We do not need to do this for every DMA since the guest
expects it has to flush the icache manually in that case.
This fixes random reboot issues on e5500 (booke ppc) for me.
Signed-off-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Every address space has its own nodes and sections, but
it uses the same global arrays of nodes/section.
This limits the number of devices that can be attached
to the guest to 20-30 devices. It happens because:
- The sections array is limited to 2^12 entries.
- The main memory has at least 100 sections.
- Each device address space is actually an alias to
main memory, multiplying its number of nodes/sections.
Remove the limitation by using separate arrays of
nodes and sections for each address space.
Signed-off-by: Marcel Apfelbaum <marcel.a@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
With the single exception of ppc with 16M pages,
we get the same number of levels
with L2_PAGE_SIZE = 10 as with L2_PAGE_SIZE = 9.
by doing this we reduce memory footprint of a single level
in the node memory map by 2x without runtime overhead.
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
As an alternative to commit 818f86b (exec: limit system memory
size, 2013-11-04) let's just make all address spaces 64-bit wide.
This eliminates problems with phys_page_find ignoring bits above
TARGET_PHYS_ADDR_SPACE_BITS and address_space_translate_internal
consequently messing up the computations.
In Luiz's reported crash, at startup gdb attempts to read from address
0xffffffffffffffe6 to 0xffffffffffffffff inclusive. The region it gets
is the newly introduced master abort region, which is as big as the PCI
address space (see pci_bus_init). Due to a typo that's only 2^63-1,
not 2^64. But we get it anyway because phys_page_find ignores the upper
bits of the physical address. In address_space_translate_internal then
diff = int128_sub(section->mr->size, int128_make64(addr));
*plen = int128_get64(int128_min(diff, int128_make64(*plen)));
diff becomes negative, and int128_get64 booms.
The size of the PCI address space region should be fixed anyway.
Reported-by: Luiz Capitulino <lcapitulino@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
At the moment, memory radix tree is already variable width, but it can
only skip the low bits of address.
This is efficient if we have huge memory regions but inefficient if we
are only using a tiny portion of the address space.
After we have built up the map, detect
configurations where a single L2 entry is valid.
We then speed up the lookup by skipping one or more levels.
In case any levels were skipped, we might end up in a valid section
instead of erroring out. We handle this by checking that
the address is in range of the resulting section.
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Extend skip to 6 bit. As page entry doesn't fit in 16 bit
any longer anyway, extend it to 32 bit.
This doubles node map memory requirements, but follow-up
patches will save this memory.
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
In preparation for dynamic radix tree depth support, rename is_leaf
field to skip, telling us how many bits to skip to next level.
Set to 0 for leaf.
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
The exec.c and translate-all.c radix trees are quite different, and
the exec.c one in particular is not limited to the CPU---it can be
used also by devices that do DMA, and in that case the address space
is not limited to TARGET_PHYS_ADDR_SPACE_BITS bits.
We want to make exec.c's radix trees 64-bit wide. As a first step,
stop sharing the constants between exec.c and translate-all.c.
exec.c gets P_L2_* constants, translate-all.c gets V_L2_*, for
consistency with the existing V_L1_* symbols. Though actually
in the softmmu case translate-all.c is also indexed by physical
addresses...
This patch has no semantic change.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
This reverts PCI master abort support - we'll want it
eventually but it exposes too many core bugs to be safe for 1.7.
This also reverts a recent exec.c change that was an
attempt to work-around some of these core bugs.
Also included are small fixes in pc and virtio,
and a core loader fix for PPC bamboo.
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.15 (GNU/Linux)
iQEcBAABAgAGBQJSf4ZyAAoJECgfDbjSjVRp9DIIAK7yEMa9ie5n3sInKH+xHT3R
Sf4uErqx55WfT/54dnLJPrs7DTfXblW+Qjnq/7RuaoJ32Dfshgxz64mPF+Lm2s3+
ghjdQrKo2YkdSbbxy+AnBNO4eHMSeUs/rM2yIfi7FZU0nwC7wNe1QpAN3UjM4yAF
5vE18xZE0Rxz/prXgofLtPHa1czvGPFk1qbS7Vag6HCSkfEI4N1Jxf9otDRV6KZP
9hX0kTvZyOKdbhccN05G4VCWwx5YUrpBsNSoph4Jx1aokEBoucr4sgE1FPDp0H9H
bJqDaAM2G5HNrDtIiDov5WOzRNT/ly011Q4mcaQh3va0pqUXttKCHgE1KRgn76I=
=iMNW
-----END PGP SIGNATURE-----
Merge remote-tracking branch 'mst/tags/for_anthony' into staging
pci, pc, virtio bug fixes
This reverts PCI master abort support - we'll want it
eventually but it exposes too many core bugs to be safe for 1.7.
This also reverts a recent exec.c change that was an
attempt to work-around some of these core bugs.
Also included are small fixes in pc and virtio,
and a core loader fix for PPC bamboo.
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
# gpg: Signature made Sun 10 Nov 2013 05:13:22 AM PST using RSA key ID D28D5469
# gpg: Can't check signature: public key not found
# By Michael S. Tsirkin (3) and others
# Via Michael S. Tsirkin
* mst/tags/for_anthony:
Revert "exec: limit system memory size"
Revert "hw/pci: partially handle pci master abort"
loader: drop return value for rom_add_blob_fixed
acpi-build: disable with -no-acpi
virtio-net: only delete bh that existed
Fix pc migration from qemu <= 1.5
Message-id: 1384159176-31662-1-git-send-email-mst@redhat.com
Signed-off-by: Anthony Liguori <aliguori@amazon.com>
This reverts commit 818f86b883.
This was a work-around for bugs elsewhere in the system,
exposed by commit a53ae8e934:
"hw/pci: partially handle pci master abort"
since that's reverted now, the work-around is not required for 1.7
anymore.
The proper fix is supporting full 64 bit addresses in the radix tree.
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Tested-by: Marcel Apfelbaum <marcel.a@redhat.com>
This fixes qemu abort with the following message:
include/qemu/int128.h:22: int128_get64: Assertion `!a.hi' failed.
which happens due to attempt to invalidate breakpoint by virtual address
for which get_phys_page_debug couldn't find mapping.
For more details see
http://lists.nongnu.org/archive/html/qemu-devel/2013-09/msg04582.html
Cc: qemu-stable@nongnu.org
Signed-off-by: Max Filippov <jcmvbkbc@gmail.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
The page table logic in exec.c assumes
that memory addresses are at most TARGET_PHYS_ADDR_SPACE_BITS.
But pci addresses are full 64 bit so if we try to render them ignoring
the extra bits, we get strange effects with sections overlapping each
other.
To fix, simply limit the system memory size to
1 << TARGET_PHYS_ADDR_SPACE_BITS,
pci addresses will be rendered within that.
Cc: qemu-stable@nongnu.org
Reported-by: Andreas Färber <afaerber@suse.de>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
This fixes a regression introduced by commit e3127ae0c, which kept the
allocation size of the bounce buffer limited to one page in order to
avoid unbounded allocations (as explained in the commit message of
6d16c2f88), but broke the reporting of the shortened bounce buffer to
the caller. The caller therefore assumes that the full requested size
was provided and causes memory corruption when writing beyond the end of
the actually allocated buffer.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
This is not needed since the RAM list is not modified anymore by
qemu_get_ram_ptr. Replace it with qemu_get_ram_block.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
phys_mem_alloc and its assigned values qemu_anon_ram_alloc and
legacy_s390_alloc must have identical argument lists.
legacy_s390_alloc uses the size parameter to call mmap, so size_t is
good enough for all of them.
This patch fixes compiler errors on i686 Linux hosts:
CC alpha-softmmu/exec.o
exec.c:752:51: error:
initialization from incompatible pointer type [-Werror]
exec.c: In function 'qemu_ram_alloc_from_ptr':
exec.c:1139:32: error:
comparison of distinct pointer types lacks a cast [-Werror]
exec.c: In function 'qemu_ram_remap':
exec.c:1283:21: error:
comparison of distinct pointer types lacks a cast [-Werror]
Signed-off-by: Stefan Weil <sw@weilnetz.de>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Message-id: 1380481005-32399-1-git-send-email-sw@weilnetz.de
Signed-off-by: Anthony Liguori <aliguori@amazon.com>
* Fix for X86CPU model field of qemu32/qemu64 CPU models
* Bug fix for longjmp on FreeBSD
* Removal of unused function
* Confinement of clone syscall infrastructure to linux-user
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2.0.19 (GNU/Linux)
iQIcBAABAgAGBQJSVTKzAAoJEPou0S0+fgE/7tYP/i5dgm6q7jSnhJcwzgHlCHDE
c0BTwnvFjdBdkuAARYb/soo0m9QWfsW/dgC4bG3rO5j3o84PLstMjiZSQch0pqM1
YhA0hYSiFjHrMcRk9FOwIECPIe+QcHZ79iNML+9G4K13D7qg36aJWISbVOWy24Dp
kj5D0wBBDNw032Oh/3z3EAK4U+vLc/+i4s8XjfwtbuBCCn7GMCE3mRnEqnf8ZX3o
H3Il3h/o+I3XQSzIJKXXyJZ5ZVXTtlj0z/0ShQXe8o8u1hINXE2Nf9lB6WG/6sh0
Y43d0uU/e9fWDer25j9yis9KfDNErgYyxlBMUA2X1+Rny5P0twjnnBr5GTAeKgSq
Kcux8Ov7W8cbVoM/px03rnynF9rbFbgmGlx82L+QsNMKWhjnEsfs6unpccpGhHR5
UuZX3ZPrmeHfjv0AZD/U2ya3jfrp0v+9gsTqy3QV1rCPbqPDcJ6jg8jzbPZYjEfa
/Zy0e/0O3sytSyiaAfBg3MzVPBxdzPcn0JjExJQV9BHsUlkZIVCZVMfePw1oIaf+
coyV4cT3hCe8LrSCzPZlRYP+1hIg41W4NicLbDxtS8lqgfRbcglvqw6NFdAM+NcB
z3heQ7IFstQ+pEINXQNy6bS8orv8F1VVvCtZaV+2pzB4TZzjPYuGsrqygre4QkLU
mtpN9BTfmSIjzyo6iYBv
=hQfy
-----END PGP SIGNATURE-----
Merge remote-tracking branch 'afaerber/tags/qom-cpu-for-anthony' into staging
QOM CPUState refactorings / X86CPU
* Fix for X86CPU model field of qemu32/qemu64 CPU models
* Bug fix for longjmp on FreeBSD
* Removal of unused function
* Confinement of clone syscall infrastructure to linux-user
# gpg: Signature made Wed 09 Oct 2013 03:40:51 AM PDT using RSA key ID 3E7E013F
# gpg: Can't check signature: public key not found
# By Andreas Färber (2) and others
# Via Andreas Färber
* afaerber/tags/qom-cpu-for-anthony:
cpu: Drop cpu_model_str from CPU_COMMON
cpu: Move cpu_copy() into linux-user
cputlb: Remove dead function tlb_update_dirty()
cpu-exec: Also reload CPUClass *cc after longjmp return in cpu_exec()
target-i386: Set model=6 on qemu64 & qemu32 CPU models
It is only used there and is deemed very fragile if not incorrect in its
current memcpy() form. Moving it into linux-user will allow to move
parts into target_cpu.h headers and only copy what the ABI mandates.
Signed-off-by: Andreas Färber <afaerber@suse.de>
Touched some error after enabling DEBUG_SUBPAGE.
Signed-off-by: Amos Kong <akong@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
# By Stefan Weil (8) and others
# Via Michael Tokarev
* mjt/trivial-patches:
tests/.gitignore: ignore test-throttle
exec: Fix broken build for MinGW (regression)
kvm: Fix compiler warning (clang)
tcg-sparc: Fix parenthesis warning
Makefile: Remove some more files when cleaning
target-i386: Fix segment cache dump
iov: avoid "orig_len may be used unitialized" warning
vscclient: remove unnecessary use of uninitialized variable
trace-events: Clean up with scripts/cleanup-trace-events.pl again
tci: Fix qemu-alpha on 32 bit hosts (wrong assertions)
*-user: Improve documentation for lock_user function
MAINTAINERS: Add missing entry to filelist for TCI target
translate-all: Fix formatting of dump output
*-user: Fix typo in comment (ulocking -> unlocking)
docs: Fix IO port number for CPU present bitmap.
q35: Fix typo in constant DEFUALT -> DEFAULT.
configure: Undefine _FORTIFY_SOURCE prior using it
Message-id: 1379696296-32105-1-git-send-email-mjt@msgid.tls.msk.ru
# By Alexey Kardashevskiy (3) and others
# Via Paolo Bonzini
* qemu-kvm/uq/master:
target-i386: add feature kvm_pv_unhalt
linux-headers: update to 3.12-rc1
target-i386: forward CPUID cache leaves when -cpu host is used
linux-headers: update to 3.11
kvm: fix traces to use %x instead of %d
kvmvapic: Clear also physical ROM address when entering INACTIVE state
kvmvapic: Enter inactive state on hardware reset
kvmvapic: Catch invalid ROM size
kvm irqfd: support direct msimessage to irq translation
fix steal time MSR vmsd callback to proper opaque type
kvm: warn if num cpus is greater than num recommended
cpu: Move cpu state syncs up into cpu_dump_state()
exec: always use MADV_DONTFORK
Message-id: 1379694292-1601-1-git-send-email-pbonzini@redhat.com
Commit 3435f39513 reduced the ifdeffery with
this result for MinGW:
exec.c: In function ‘qemu_ram_free’:
exec.c:1239:17: warning:
implicit declaration of function ‘munmap’ [-Wimplicit-function-declaration]
exec.c:1239:17: warning:
nested extern declaration of ‘munmap’ [-Wnested-externs]
exec.c:1239: undefined reference to `munmap'
Add some ifdeffery again to fix this.
Signed-off-by: Stefan Weil <sw@weilnetz.de>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
MADV_DONTFORK prevents fork to fail with -ENOMEM if the default
overcommit heuristics decides there's too much anonymous virtual
memory allocated. If the KVM secondary MMU is synchronized with MMU
notifiers or not, doesn't make a difference in that regard.
Secondly it's always more efficient to avoid copying the guest
physical address space in the fork child (so we avoid to mark all the
guest memory readonly in the parent and so we skip the establishment
and teardown of lots of pagetables in the child).
In the common case we can ignore the error if MADV_DONTFORK is not
available. Leave a second invocation that errors out in the KVM path
if MMU notifiers are missing and KVM is enabled, to abort in such
case.
Signed-off-by: Andrea Arcangeli <aarcange@redhat.com>
Tested-By: Benoit Canet <benoit@irqsave.net>
Acked-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Gleb Natapov <gleb@redhat.com>
We abort() on memory allocation failure. abort() is appropriate for
programming errors. Maybe most memory allocation failures are
programming errors, maybe not. But guest memory allocation failure
isn't, and aborting when the user asks for more memory than we can
provide is not nice. exit(1) instead, and do it in just one place, so
the error message is consistent.
Tested-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Andreas Färber <afaerber@suse.de>
Acked-by: Laszlo Ersek <lersek@redhat.com>
Acked-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
Acked-by: Christian Borntraeger <borntraeger@de.ibm.com>
Message-id: 1375276272-15988-8-git-send-email-armbru@redhat.com
Signed-off-by: Anthony Liguori <anthony@codemonkey.ws>
Another issue missed in commit fdec991 is -mem-path: it needs to be
rejected only for old S390 KVM, not for any S390. Not that I
personally care, but the ifdeffery in qemu_ram_alloc_from_ptr() annoys
me.
Note that this doesn't actually make -mem-path work, as the kernel
doesn't (yet?) support large pages in the host for KVM guests. Clean
it up anyway.
Thanks to Christian Borntraeger for pointing out the S390 kernel
limitations.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Acked-by: Laszlo Ersek <lersek@redhat.com>
Acked-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
Acked-by: Christian Borntraeger <borntraeger@de.ibm.com>
Message-id: 1375276272-15988-7-git-send-email-armbru@redhat.com
Signed-off-by: Anthony Liguori <anthony@codemonkey.ws>
Old S390 KVM wants guest RAM mapped in a peculiar way. Commit 6b02494
implemented that.
When qemu_ram_remap() got added in commit cd19cfa, its code carefully
mimicked the allocation code: peculiar way if defined(TARGET_S390X) &&
defined(CONFIG_KVM), else normal way.
For new S390 KVM, we actually want the normal way. Commit fdec991
changed qemu_ram_alloc_from_ptr() accordingly, but forgot to update
qemu_ram_remap(). If qemu_ram_alloc_from_ptr() maps RAM the normal
way, but qemu_ram_remap() remaps it the peculiar way, remapping
changes protection and flags, which it shouldn't.
Fortunately, this can't happen, as we never remap on S390.
Replace the incorrect code with an assertion.
Thanks to Christian Borntraeger for help with assessing the bug's
(non-)impact.
Acked-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Acked-by: Laszlo Ersek <lersek@redhat.com>
Acked-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
Message-id: 1375276272-15988-6-git-send-email-armbru@redhat.com
Signed-off-by: Anthony Liguori <anthony@codemonkey.ws>
Make it a generic hook rather than a KVM hook. Less code and
ifdeffery.
Since the only user of the hook is old S390 KVM, there's hope we can
get rid of it some day.
Acked-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Acked-by: Laszlo Ersek <lersek@redhat.com>
Acked-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
Message-id: 1375276272-15988-5-git-send-email-armbru@redhat.com
Signed-off-by: Anthony Liguori <anthony@codemonkey.ws>
Instead of spreading its ifdeffery everywhere, confine it to
qemu_ram_alloc_from_ptr(). Everywhere else, simply test block->fd,
which is non-negative exactly when block uses -mem-path.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Acked-by: Laszlo Ersek <lersek@redhat.com>
Acked-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
Message-id: 1375276272-15988-4-git-send-email-armbru@redhat.com
Signed-off-by: Anthony Liguori <anthony@codemonkey.ws>
With -mem-path, qemu_ram_alloc_from_ptr() first tries to allocate
accordingly, but when it fails, it falls back to normal allocation.
The fall back allocation code used to be effectively identical to the
"-mem-path not given" code, until it started to diverge in commit
432d268. I believe the code still works, but clean it up anyway: drop
the special fall back allocation code, and fall back to the ordinary
"-mem-path not given" code instead.
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Acked-by: Laszlo Ersek <lersek@redhat.com>
Acked-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
Message-id: 1375276272-15988-3-git-send-email-armbru@redhat.com
Signed-off-by: Anthony Liguori <anthony@codemonkey.ws>
Issues:
* We try to obey -mem-path even though it can't work with Xen.
* To implement -machine mem-merge, we call
memory_try_enable_merging(new_block->host, size). But with Xen,
new_block->host remains null. Oops.
Fix by separating Xen allocation from normal allocation.
Acked-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Acked-by: Laszlo Ersek <lersek@redhat.com>
Message-id: 1375276272-15988-2-git-send-email-armbru@redhat.com
Signed-off-by: Anthony Liguori <anthony@codemonkey.ws>
Accesses to unassigned io ports shall return -1 on read and be ignored
on write. Ensure these properties via dedicated ops, decoupling us from
the memory core's handling of unassigned accesses.
Cc: qemu-stable@nongnu.org
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
If offset_within_address_space falls in a page, then we register a
subpage. So check offset_within_address_space rather than
offset_within_region.
Cc: qemu-stable@nongnu.org
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Richard Henderson <rth@twiddle.net>
Cc: "Andreas Färber" <afaerber@suse.de>
Cc: Peter Maydell <peter.maydell@linaro.org>
Cc: Blue Swirl <blauwirbel@gmail.com>
Signed-off-by: Hu Tao <hutao@cn.fujitsu.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
The problem is introduced by commit 2332616 (exec: Support 64-bit
operations in address_space_rw, 2013-07-08). Before that commit,
memory_access_size would only return 1/2/4.
Since alignment is already handled above, reduce l to the largest
power of two that is smaller than l.
Cc: qemu-stable@nongnu.org
Reported-by: Oleksii Shevchuk <alxchk@gmail.com>
Tested-by: Oleksii Shevchuk <alxchk@gmail.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
It was introduced to loop over CPUs from target-independent code, but
since commit 182735efaf target-independent
CPUState is used.
A loop can be considered more efficient than function calls in a loop,
and CPU_FOREACH() hides implementation details just as well, so use that
instead.
Suggested-by: Markus Armbruster <armbru@redhat.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Andreas Färber <afaerber@suse.de>
Commit 1a1562f5ea prepared a VMSTATE_CPU()
macro for device-style VMStateDescription registration, but missed to
adapt cpu_exec_init(), so that the "cpu_common" VMStateDescription was
still registered for AlphaCPU (fe31e73742)
and OpenRISCCPU (da69721460). Fix this.
Cc: Richard Henderson <rth@twiddle.net>
Tested-by: Jia Liu <proljc@gmail.com>
Signed-off-by: Andreas Färber <afaerber@suse.de>
Passing a CPUState pointer instead of a CPUArchState pointer eliminates
the last target dependent data type in sysemu/kvm.h.
It also simplifies the code.
Signed-off-by: Stefan Weil <sw@weilnetz.de>
Acked-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Andreas Färber <afaerber@suse.de>
* riku/linux-user-for-upstream: (21 commits)
linux-user: Handle compressed ISA encodings when processing MIPS exceptions
linux-user: Unlock mmap_lock when resuming guest from page_unprotect
linux-user: Reset copied CPUs in cpu_copy() always
linux-user: Fix epoll on ARM hosts
linux-user: fix segmentation fault passing with h2g(x) != x
linux-user: Fix pipe syscall return for SPARC
linux-user: Fix target_stat and target_stat64 for OpenRISC
linux-user: Avoid conditional cpu_reset()
configure: Make NPTL non-optional
linux-user: Enable NPTL for x86-64
linux-user: Add i386 TLS setter
linux-user: Clean up handling of clone() argument order
linux-user: Add missing 'break' in i386 get_thread_area syscall
linux-user: Enable NPTL for m68k
linux-user: Enable NPTL for SPARC targets
linux-user: Enable NPTL for OpenRISC
linux-user: Move includes of target-specific headers to end of qemu.h
configure: Enable threading for unicore32-linux-user
configure: Enable threading on all ppc and mips linux-user targets
configure: Don't say target_nptl="no" if there is no linux-user target
...
Conflicts:
linux-user/main.c
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
When a new thread gets created, we need to reset non arch specific state to
get the new CPU into clean state.
However this reset should happen before the arch specific CPU contents get
copied over. Otherwise we end up having clean reset state in our newly created
thread.
Signed-off-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Riku Voipio <riku.voipio@linaro.org>
Change breakpoint_invalidate() argument to CPUState alongside.
Since all targets now assign a softmmu-only field, we can drop helpers
cpu_class_set_{do_unassigned_access,vmsd}() and device_class_set_vmsd().
Prepares for changing cpu_memory_rw_debug() argument to CPUState.
Acked-by: Max Filippov <jcmvbkbc@gmail.com> (for xtensa)
Signed-off-by: Andreas Färber <afaerber@suse.de>
Prepares for changing cpu_single_step() argument to CPUState.
Acked-by: Michael Walle <michael@walle.cc> (for lm32)
Signed-off-by: Andreas Färber <afaerber@suse.de>
access_size_min can be 1 because erroneous accesses must not crash
QEMU, they should trigger exceptions in the guest or just return
garbage (depending on the CPU). I am not sure I understand the
comment: placing a 4-byte field at the last byte of a region
makes no sense (unless impl.unaligned is true), and that is
why memory.c:access_with_adjusted_size does not bother with
minimums larger than the remaining length.
access_size_max can be mr->ops->valid.max_access_size because memory.c
can and will still break accesses bigger than
mr->ops->impl.max_access_size.
Reported-by: Markus Armbruster <armbru@redhat.com>
Tested-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Commit e3127ae0 introduced a problem where we're passing a
hwaddr* to qemu_ram_ptr_length() but it wants a ram_addr_t*;
this will cause problems on 32 bit hosts and in any case
provokes a clang warning on MacOSX:
CC arm-softmmu/exec.o
exec.c:2164:46: warning: incompatible pointer types passing 'hwaddr *'
(aka 'unsigned long long *') to parameter of type 'ram_addr_t *'
(aka 'unsigned long *')
[-Wincompatible-pointer-types]
return qemu_ram_ptr_length(raddr + base, plen);
^~~~
exec.c:1392:63: note: passing argument to parameter 'size' here
static void *qemu_ram_ptr_length(ram_addr_t addr, ram_addr_t *size)
^
Since this function is only used in one place, change its
prototype to pass a hwaddr* rather than a ram_addr_t*,
rather than contorting the calling code to get the type right.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Tested-by: Riku Voipio <riku.voipio@linaro.org>
Tested-by: Peter Crosthwaite <peter.crosthwaite@xilinx.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Honor the implementation maximum access size, and at least check
the minimum access size.
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Richard Henderson <rth@twiddle.net>
Since commit 878096eeb2 (cpu: Turn
cpu_dump_{state,statistics}() into CPUState hooks) CPUArchState is no
longer needed.
Add documentation and make the functions available through qemu/log.h
outside NEED_CPU_H to allow use in qom/cpu.c. Moving them to qom/cpu.h
was not yet possible due to convoluted include paths, so that some
devices grow an implicit and unneeded dependency on qom/cpu.h for now.
Acked-by: Michael Walle <michael@walle.cc> (for lm32)
Reviewed-by: Richard Henderson <rth@twiddle.net>
[AF: Simplified mb_cpu_do_interrupt() and do_interrupt_all() changes]
Signed-off-by: Andreas Färber <afaerber@suse.de>
Move next_cpu from CPU_COMMON to CPUState.
Move first_cpu variable to qom/cpu.h.
gdbstub needs to use CPUState::env_ptr for now.
cpu_copy() no longer needs to save and restore cpu_next.
Acked-by: Paolo Bonzini <pbonzini@redhat.com>
[AF: Rebased, simplified cpu_copy()]
Signed-off-by: Andreas Färber <afaerber@suse.de>
The previous two commits fixed bugs in -machine option queries. I
can't find fault with the remaining queries, but let's use
qemu_get_machine_opts() everywhere, for consistency, simplicity and
robustness.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-id: 1372943363-24081-7-git-send-email-armbru@redhat.com
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
It seems to be unused since several years (commit
be995c2764 in 2006).
Signed-off-by: Stefan Weil <sw@weilnetz.de>
Reviewed-by: Andreas Färber <afaerber@suse.de>
Message-id: 1373044036-14443-1-git-send-email-sw@weilnetz.de
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
cur_map is not used anymore; instead, each AddressSpaceDispatch
has its own nodes/sections pair. The priorities of the
MemoryListeners, and in the future RCU, guarantee that the
nodes/sections are not freed while they are still in use.
(In fact, next_map itself is not needed except to free the data on the
next update).
To avoid incorrect use, replace cur_map with a temporary copy that
is only valid while the topology is being updated. If you use it,
the name prev_map makes it clear that you're doing something weird.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
After this patch, AddressSpaceDispatch holds a constistent tuple of
(phys_map, nodes, sections). This will be important when updates
of the topology will run concurrently with reads.
cur_map is not used anymore except for freeing it at the end of the
topology update.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
This same treatment previously done to phys_node_map and phys_sections
is now applied to the dispatch field of AddressSpace. Topology updates
use as->next_dispatch while accesses use as->dispatch.
Reviewed-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
This will help having two copies of AddressSpaceDispatch during the
recreation of the radix tree (one being built, and one that is complete
and will be protected by RCU). We do not want to have to unregister and
re-register the listener.
Reviewed-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Currently, phys_node_map and phys_sections are shared by all
of the AddressSpaceDispatch. When updating mem topology, all
AddressSpaceDispatch will rebuild dispatch tables sequentially
on them. In order to prepare for RCU access, leave the old
memory map alive while the next one is being accessed.
When rebuilding, the new dispatch tables will build and lookup
next_map; after all dispatch tables are rebuilt, we can switch
to next_* and free the previous table.
Based on a patch from Liu Ping Fan.
Signed-off-by: Liu Ping Fan <qemulist@gmail.com>
Reviewed-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Sections like phys_section_unassigned always have fixed address
in phys_sections. Declared as macro, so we can use them
when having more than one phys_sections array.
Signed-off-by: Liu Ping Fan <pingfank@linux.vnet.ibm.com>
Signed-off-by: Liu Ping Fan <qemulist@gmail.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
The iothread mutex might be released between map and unmap, so the
mapped region might disappear.
Reviewed-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
First of all, rename "todo" to "done".
Second, clearly separate the case of done == 0 with the case of done != 0.
This will help handling reference counting in the next patch.
Third, this test:
if (memory_region_get_ram_addr(mr) + xlat != raddr + todo) {
does not guarantee that the memory region is the same across two iterations
of the while loop. For example, you could have two blocks:
A) size 640 K, mapped at physical address 0, ram_addr_t 0
B) size 64 K, mapped at physical address 0xa0000, ram_addr_t 0xa0000
then mapping 1 M starting at physical address zero will erroneously treat
B as the continuation of block A. qemu_ram_ptr_length ensures that no
invalid memory is accessed, but it is still a pointless complication of
the algorithm. The patch makes the logic clearer with an explicit test
that the memory region is the same.
Reviewed-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
After the next patch it would not be used elsewhere anyway. Also,
the _nofail and the standard versions of this function return different
things, which is confusing. Removing the function from the public headers
limits the confusion.
Reviewed-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Add ref/unref calls at the following places:
- places where memory regions are stashed by a listener and
used outside the BQL (including in Xen or KVM).
- memory_region_find callsites
- creation of aliases and containers (only the aliased/contained
region gets a reference to avoid loops)
- around calls to del_subregion/add_subregion, where the region
could disappear after the first call
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Do not bother visiting the radix tree when an address space is destroyed.
After the previous patch, this has become a pointless exercise. When
called from address_space_destroy_dispatch, all you're doing is zeroing
out a structure that will be freed as soon as you come back. When called
from mem_begin, when phys_page_set_level will call phys_map_node_alloc the
radix tree's array will be zeroed too.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
phys_sections_clear is invoked after the dispatch tree has been
destroyed. This leaves a window where phys_sections_nb > 0 but the
subpages are not valid anymore, which is a recipe for use-after-free
bugs.
Move the destruction of subpages in phys_sections_clear. We will
still destroy the subpages when an address space is cleaned up,
because address_space_destroy will clear as->root and commit the
change before it calls address_space_destroy_dispatch.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
The current ioport dispatcher is a complex beast, mostly due to the
need to deal with old portio interface users. But we can overcome it
without converting all portio users by embedding the required base
address of a MemoryRegionPortio access into that data structure. That
removes the need to have the additional MemoryRegionIORange structure
in the loop on every access.
To handle old portio memory ops, we simply install dispatching handlers
for portio memory regions when registering them with the memory core.
This removes the need for the old_portio field.
We can drop the additional aliasing of ioport regions and also the
special address space listener. cpu_in and cpu_out now simply call
address_space_read/write. And we can concentrate portio handling in a
single source file.
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Make cpustats monitor command available unconditionally.
Prepares for changing kvm_handle_internal_error() and kvm_cpu_exec()
arguments to CPUState.
Signed-off-by: Andreas Färber <afaerber@suse.de>
It no longer depends on CPUArchState, so move it to qom/cpu.c.
Prepares for changing GDBState::c_cpu to CPUState.
Signed-off-by: Andreas Färber <afaerber@suse.de>
To be used to embed common CPU state into CPU subclasses.
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Andreas Färber <afaerber@suse.de>
This is used during RDMA initialization in order to
transmit a description of all the RAM blocks to the
peer for later dynamic chunk registration purposes.
Reviewed-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Chegu Vinod <chegu_vinod@hp.com>
Tested-by: Chegu Vinod <chegu_vinod@hp.com>
Tested-by: Michael R. Hines <mrhines@us.ibm.com>
Signed-off-by: Michael R. Hines <mrhines@us.ibm.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
The "info mtree" command in QEMU console prints only "memory" and "I/O"
address spaces while there are actually a lot more other AddressSpace
structs created by PCI and VIO devices. Those devices do not normally
have names and therefore not present in "info mtree" output.
The patch fixes this.
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
The DMAContext is a simple pointer to an AddressSpace that is now always
already available. Make everyone hold the address space directly,
and clean up the DMA API to use the AddressSpace directly.
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
The translate function in the DMAContext is now always NULL.
Remove every reference to it.
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Add a new memory region type that translates addresses it is given,
then forwards them to a target address space. This is similar to
an alias, except that the mapping is more flexible than a linear
translation and trucation, and also less efficient since the
translation happens at runtime.
The implementation uses an AddressSpace mapping the target region to
avoid hierarchical dispatch all the way to the resolved region; only
iommu regions are looked up dynamically.
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Avi Kivity <avi.kivity@gmail.com>
[Modified to put translation in address_space_translate; assume
IOMMUs are not reachable from TCG. - Paolo]
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
So far, the size of all regions passed to listeners could fit in 64 bits,
because artificial regions (containers and aliases) are eliminated by
the memory core, leaving only device regions which have reasonable sizes
An IOMMU however cannot be eliminated by the memory core, and may have
an artificial size, hence we may need 65 bits to represent its size.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
When adding support for 2^64-byte sections, we will have to change
the structure of mem_add to avoid failures in int128_get64.
Reorganize the code now before introducing Int128.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Only address_space_translate_for_iotlb needs to return the section.
Every caller of address_space_translate now uses only section->mr,
return it directly.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
This will allow to add support for unaligned memory regions: the subpage
container region can activate unaligned support unconditionally because
the read/write handler will now ensure that accesses are split as
required by calling address_space_rw. We can furthermore drop the
special handling of RAM subpages, address_space_rw takes care of this
already.
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Except for the case of setting the IOTLB entry in TCG mode, we can avoid
the subpage dispatching handlers and do the resolution directly on
address_space_lookup_region. An IOTLB entry describes a full page, not
only the region that the first access to a sub-divided page may return.
This patch therefore introduces a special translation function,
address_space_translate_for_iotlb, that avoids the subpage resolutions.
In contrast, callers of the existing address_space_translate service
will now always receive the terminal memory region section. This will be
important for breaking the BQL and for enabling unaligned memory region.
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
This will be needed for some corner cases with para-virtual I/O ports.
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
This introduces a wrapper for phys_page_find (before we complicate
address_space_translate with IOMMU translation). This function will
also encapsulate locking and reference counting when we introduce
BQL-free dispatching.
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
The memory API allows a MemoryRegion's size to be 2^64, as a special
case (otherwise the size always fits in a 64 bit integer). This meant
that attempts to access address zero in a 2^64 sized region would
assert in address_space_translate():
#3 0x00007ffff3e4d192 in __GI___assert_fail#(assertion=0x555555a43f32
"!a.hi", file=0x555555a43ef0 "include/qemu/int128.h", line=18,
function=0x555555a4439f "int128_get64") at assert.c:103
#4 0x0000555555877642 in int128_get64 (a=...)
at include/qemu/int128.h:18
#5 0x00005555558782f2 in address_space_translate (as=0x55555668d140,
/addr=0, xlat=0x7fffafac9918, plen=0x7fffafac9920, is_write=false)
at exec.c:221
Fix this by doing the 'min' operation in 128 bit arithmetic
rather than 64 bit arithmetic (we know the result of the 'min'
definitely fits in 64 bits because one of the inputs did).
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
The memory API is able to split it in two 4-byte accesses.
Reviewed-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
The old-style IOMMU lets you check whether an access is valid in a
given DMAContext. There is no equivalent for AddressSpace in the
memory API, implement it with a lookup of the dispatch tree.
Reviewed-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
This will be used by address_space_access_valid too.
Reviewed-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
After the previous patches, this is a common test for all read/write
functions.
Reviewed-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
There is no need to use the special phys_section_rom section.
Reviewed-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Using phys_page_find to translate an AddressSpace to a MemoryRegionSection
is unwieldy. It requires to pass the page index rather than the address,
and later memory_region_section_addr has to be called. Replace
memory_region_section_addr with a function that does all of it: call
phys_page_find, compute the offset within the region, and check how
big the current mapping is. This way, a large flat region can be written
with a single lookup rather than a page at a time.
address_space_translate will also provide a single point where IOMMU
forwarding is implemented.
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
This provides the basics for detecting accesses to unassigned memory
as soon as they happen, and also for a simple implementation of
address_space_access_valid.
Reviewed-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
We will soon reach this case when doing (unaligned) accesses that
span partly past the end of memory. We do not want to crash in
that case.
unassigned_mem_ops and rom_mem_ops are now the same.
Reviewed-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
There is no reason to avoid a recompile before accessing unassigned
memory. In the end it will be treated as MMIO anyway.
Reviewed-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
It is never used, the IOTLB always goes through io_mem_notdirty.
In fact in softmmu_template.h, if it were, QEMU would crash just
below the tests, as soon as io_mem_read/write dispatches to
error_mem_read/write.
Reviewed-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
The radix tree is statically sized to fit TARGET_PHYS_ADDR_SPACE_BITS.
If a larger memory region is registered, it will overflow.
Fix by limiting any section in the radix tree to the supported size.
This problem was not observed earlier since artificial regions (containers
and aliases) are eliminated by the memory core, leaving only device regions
which have reasonable sizes. An IOMMU however cannot be eliminated by the
memory core, and may have an artificial size.
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Avi Kivity <avi.kivity@gmail.com>
[ Fail the build if TARGET_PHYS_ADDR_SPACE_BITS is too large - Paolo ]
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
While sized to 15 bits in PhysPageEntry, the ptr field is ORed into the
iotlb entries together with a page-aligned pointer. The ptr field must
not overflow into this page-aligned value, assert that it is smaller than
the page size.
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
See how we call memory_region_section_addr two lines below to
convert a physical address to a base address in the region.
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
We switched from qemu_memalign to mmap() but then we don't modify
qemu_vfree() to do a munmap() over free(). Which we cannot do
because qemu_vfree() frees memory allocated by qemu_{mem,block}align.
Introduce a new function that does the munmap(), luckily the size is
available in the RAMBlock.
Reported-by: Amos Kong <akong@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Amos Kong <akong@redhat.com>
Message-id: 1368454796-14989-3-git-send-email-pbonzini@redhat.com
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
This is preparatory to the introduction of a separate freeing API.
Reported-by: Amos Kong <akong@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Amos Kong <akong@redhat.com>
Message-id: 1368454796-14989-2-git-send-email-pbonzini@redhat.com
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
Wrapper to avoid open-coded loops and to make CPUState iteration
independent of CPUArchState.
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Signed-off-by: Andreas Färber <afaerber@suse.de>
Many of these should be cleaned up with proper qdev-/QOM-ification.
Right now there are many catch-all headers in include/hw/ARCH depending
on cpu.h, and this makes it necessary to compile these files per-target.
However, fixing this does not belong in these patches.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
find_ram_offset() does not handle size=0 gracefully. It hands out the
same RAMBlock offset multiple times, leading to obscure failures later
on.
Add an assert to warn early if something is incorrectly allocating a
zero size RAMBlock.
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
# By Andreas Färber (16) and Igor Mammedov (1)
# Via Andreas Färber
* afaerber/qom-cpu:
target-lm32: Update VMStateDescription to LM32CPU
target-arm: Override do_interrupt for ARMv7-M profile
cpu: Replace do_interrupt() by CPUClass::do_interrupt method
cpu: Pass CPUState to cpu_interrupt()
exec: Pass CPUState to cpu_reset_interrupt()
cpu: Move halted and interrupt_request fields to CPUState
target-cris/helper.c: Update Coding Style
target-i386: Update VMStateDescription to X86CPU
cpu: Introduce cpu_class_set_vmsd()
cpu: Register VMStateDescription through CPUState
stubs: Add a vmstate_dummy struct for CONFIG_USER_ONLY
vmstate: Make vmstate_register() static inline
target-sh4: Move PVR/PRR/CVR into SuperHCPUClass
target-sh4: Introduce SuperHCPU subclasses
cpus: Replace open-coded CPU loop in qmp_memsave() with qemu_get_cpu()
monitor: Use qemu_get_cpu() in monitor_set_cpu()
cpu: Fix qemu_get_cpu() to return NULL if CPU not found
Adds ramblocks' names to their backing files when using -mem-path. Eases
introspection and debugging.
Signed-off-by: Peter Feiner <peter@gridcentric.ca>
Message-id: 1362423265-15855-1-git-send-email-peter@gridcentric.ca
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
Move it to qom/cpu.h to avoid issues with include order.
Change pc_acpi_smi_interrupt() opaque to X86CPU.
Signed-off-by: Andreas Färber <afaerber@suse.de>
Move it to qom/cpu.c to avoid build failures depending on include order
of cpu-qom.h and exec/cpu-all.h.
Change opaques of various ..._irq_handler() functions to the
appropriate CPU type to facilitate using cpu_reset_interrupt().
Fix Coding Style issues while at it (missing braces, indentation).
Signed-off-by: Andreas Färber <afaerber@suse.de>
Both fields are used in VMState, thus need to be moved together.
Explicitly zero them on reset since they were located before
breakpoints.
Pass PowerPCCPU to kvmppc_handle_halt().
Signed-off-by: Andreas Färber <afaerber@suse.de>
In comparison to DeviceClass::vmsd, CPU VMState is split in two,
"cpu_common" and "cpu", and uses cpu_index as instance_id instead of -1.
Therefore add a CPU-specific CPUClass::vmsd field.
Unlike the legacy CPUArchState registration, rather register CPUState.
Signed-off-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Andreas Färber <afaerber@suse.de>
Reviewed-by: Eduardo Habkost <ehabkost@redhat.com>
Commit 55e5c2850 breaks CPU not found return value, and returns
CPU corresponding to the last non NULL env.
Fix it by returning CPU only if env is not NULL, otherwise CPU is
not found and function should return NULL.
Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Signed-off-by: Andreas Färber <afaerber@suse.de>
Fix some of the nasty TCG race conditions and crashes by implementing
cpu_exit() as setting a flag which is checked at the start of each TB.
This avoids crashes if a thread or signal handler calls cpu_exit()
while the execution thread is itself modifying the TB graph (which
may happen in system emulation mode as well as in linux-user mode
with a multithreaded guest binary).
This fixes the crashes seen in LP:668799; however there are another
class of crashes described in LP:1098729 which stem from the fact
that in linux-user with a multithreaded guest all threads will
use and modify the same global TCG date structures (including the
generated code buffer) without any kind of locking. This means that
multithreaded guest binaries are still in the "unsupported"
category.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
g_strdup_printf already handles OOM errors, so some error handling in
QEMU code can be removed.
Signed-off-by: Stefan Weil <sw@weilnetz.de>
Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
Move the declaration to qemu/cpu.h and add documentation.
The implementation still depends on CPUArchState for CPU iteration.
Signed-off-by: Andreas Färber <afaerber@suse.de>
Note that target-alpha accesses this field from TCG, now using a
negative offset. Therefore the field is placed last in CPUState.
Pass PowerPCCPU to [kvm]ppc_fixup_cpu() to facilitate this change.
Move common parts of mips cpu_state_reset() to mips_cpu_reset().
Acked-by: Richard Henderson <rth@twiddle.net> (for alpha)
[AF: Rebased onto ppc CPU subclasses and openpic changes]
Signed-off-by: Andreas Färber <afaerber@suse.de>
Add the new mutex that protects shared state between ram_save_live
and the iothread. If the iothread mutex has to be taken together
with the ramlist mutex, the iothread shall always be _outside_.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Umesh Deshpande <udeshpan@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Orit Wasserman <owasserm@redhat.com>
This will be used to detect if last_block might have become invalid
across different calls to ram_save_live.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Umesh Deshpande <udeshpan@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Orit Wasserman <owasserm@redhat.com>
Most of the time, only 2 items will be active (from/to for a string operation,
or code/data). But TCG guests likely won't have gigabytes of memory, so
this actually goes down to 1 item.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
Various header files rely on qemu-char.h including qemu-config.h or
main-loop.h, but they really do not need qemu-char.h at all (particularly
interesting is the case of the block layer!). Clean this up, and also
add missing inclusions of qemu-char.h itself.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
After allocating 32MB or more contiguous memory, huge pages
would seem to be ideal.
Signed-off-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
Define a new global dma_context_memory which is a DMAContext corresponding
to the global address_space_memory AddressSpace. This can be used by
sysbus peripherals like sysbus-ohci which need to do DMA.
In particular, use it in the sysbus-ohci device, which fixes a
segfault when attempting to use that device.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Peter Crosthwaite <peter.crosthwaite@xilinx.com>
* 'trivial-patches' of git://github.com/stefanha/qemu:
pc: Drop redundant test for ROM memory region
exec: make some functions static
target-ppc: make some functions static
ppc: add missing static
vnc: add missing static
vl.c: add missing static
target-sparc: make do_unaligned_access static
m68k: Return semihosting errno values correctly
cadence_uart: More debug information
Conflicts:
target-m68k/m68k-semi.c
Add GETPC_EXT which is used by MMU helpers to selectively calculate the code
address of accessing guest memory when called from a qemu_ld/st optimized code
or a C function. Currently, it supports only i386 and x86-64 hosts.
Signed-off-by: Yeongkyoon Lee <yeongkyoon.lee@samsung.com>
Signed-off-by: Blue Swirl <blauwirbel@gmail.com>