Commit Graph

325 Commits

Author SHA1 Message Date
Stefan Hajnoczi
db1923de60 exec: Remove debugging fprintf() that slipped into qemu_ram_alloc_from_ptr()
Remove the debugging fprintf() slipped in via the following commit:

    commit b2e0a138e7
    Author: Michael S. Tsirkin <mst@redhat.com>
    Date:   Mon Nov 22 19:52:34 2010 +0200

        migration: stable ram block ordering

Signed-off-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
2010-12-03 11:50:20 -06:00
Michael S. Tsirkin
b2e0a138e7 migration: stable ram block ordering
This makes ram block ordering under migration stable, ordered by offset.
This is especially useful for migration to exec, for debugging.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Tested-by: Jason Wang <jasowang@redhat.com>
2010-12-02 21:13:39 +02:00
Stefan Weil
055403b2a7 exec: Use fprintf_function for dump_exec_info (format checking)
fprintf_function uses format checking with GCC_FMT_ATTR.

It is declared in qemu-common.h and used in cpu-all.h
(which is included from cpu.h), so qemu-common.h must
be included earlier. Some redundant include statements
for standard include files were removed.

Fix also two format errors (ptrdiff_t needs %td).

Cc: Blue Swirl <blauwirbel@gmail.com>
Signed-off-by: Stefan Weil <weil@mail.berlios.de>
Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
2010-10-30 08:01:59 +00:00
Marcelo Tosatti
e890261f67 Export qemu_ram_addr_from_host
To be used by next patches.

Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
2010-10-20 16:15:04 -05:00
Stefan Weil
7fd3f49440 exec: Fix compilation error for debug code
is_softmmu was removed with commit
d4c430a80f,
so remove it now from debug code, too.

Fix also the format specifier for paddr
in the same line of code.

Cc: Blue Swirl <blauwirbel@gmail.com>
Signed-off-by: Stefan Weil <weil@mail.berlios.de>
Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
2010-10-03 06:41:09 +00:00
Andreas Färber
e78815a554 Introduce qemu_madvise()
vl.c has a Sun-specific hack to supply a prototype for madvise(),
but the call site has apparently moved to arch_init.c.

Haiku doesn't implement madvise() in favor of posix_madvise().
OpenBSD and Solaris 10 don't implement posix_madvise() but madvise().
MinGW implements neither.

Check for madvise() and posix_madvise() in configure and supply qemu_madvise()
as wrapper. Prefer madvise() over posix_madvise() due to flag availability.
Convert all callers to use qemu_madvise() and QEMU_MADV_*.

Note that on Solaris the warning is fixed by moving the madvise() prototype,
not by qemu_madvise() itself. It helps with porting though, and it simplifies
most call sites.

v7 -> v8:
* Some versions of MinGW have no sys/mman.h header. Reported by Blue Swirl.

v6 -> v7:
* Adopt madvise() rather than posix_madvise() semantics for returning errors.
* Use EINVAL in place of ENOTSUP.

v5 -> v6:
* Replace two leftover instances of POSIX_MADV_NORMAL with QEMU_MADV_INVALID.
  Spotted by Blue Swirl.

v4 -> v5:
* Introduce QEMU_MADV_INVALID, suggested by Alexander Graf.
  Note that this relies on -1 not being a valid advice value.

v3 -> v4:
* Eliminate #ifdefs at qemu_advise() call sites. Requested by Blue Swirl.
  This will currently break the check in kvm-all.c by calling madvise() with
  a supported flag, which will not fail. Ideas/patches welcome.

v2 -> v3:
* Reuse the *_MADV_* defines for QEMU_MADV_*. Suggested by Alexander Graf.
* Add configure check for madvise(), too.
  Add defines to Makefile, not QEMU_CFLAGS.
  Convert all callers, untested. Suggested by Blue Swirl.
* Keep Solaris' madvise() prototype around. Pointed out by Alexander Graf.
* Display configure check results.

v1 -> v2:
* Don't rely on posix_madvise() availability, add qemu_madvise().
  Suggested by Blue Swirl.

Signed-off-by: Andreas Färber <afaerber@opensolaris.org>
Cc: Blue Swirl <blauwirbel@gmail.com>
Cc: Alexander Graf <agraf@suse.de>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
2010-09-25 11:26:05 +00:00
Gleb Natapov
95c318f5e1 Fix segfault in mmio subpage handling code.
It is possible that subpage mmio is registered over existing memory
page. When this happens "memory" will have real memory address and not
index into io_mem array so next access to the page will generate
segfault. It is uncommon to have some part of a page to be accessed as
memory and some as mmio, but qemu shouldn't crash even when guest does
stupid things. So lets just pretend that the rest of the page is
unassigned if guest configure part of the memory page as mmio.

Signed-off-by: Gleb Natapov <gleb@redhat.com>
Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
2010-08-28 08:47:23 +00:00
Yoshiaki Tamura
6977dfe6af exec: remove code duplication in qemu_ram_alloc() and qemu_ram_alloc_from_ptr()
Since most of the code in qemu_ram_alloc() and
qemu_ram_alloc_from_ptr() are duplicated, let
qemu_ram_alloc_from_ptr() to switch by checking void *host, and change
qemu_ram_alloc() to a wrapper.

Signed-off-by: Yoshiaki Tamura <tamura.yoshiaki@lab.ntt.co.jp>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
2010-08-22 16:19:00 -05:00
Yoshiaki Tamura
9742bf26b1 exec: replace tabs by spaces.
Signed-off-by: Yoshiaki Tamura <tamura.yoshiaki@lab.ntt.co.jp>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
2010-08-22 16:19:00 -05:00
Cam Macdonell
84b89d782f Add qemu_ram_alloc_from_ptr function
Provide a function to add an allocated region of memory to the qemu RAM.

This patch is copied from Marcelo's qemu_ram_map() in qemu-kvm and given the
clearer name qemu_ram_alloc_from_ptr().

Signed-off-by: Cam Macdonell <cam@cs.ualberta.ca>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
2010-08-10 16:25:15 -05:00
Stefan Weil
24ab68ac72 Declare code_gen_ptr, code_gen_max_blocks 'static'
Both values are only used in exec.c, so there is no need
to make them globally available.

Signed-off-by: Stefan Weil <weil@mail.berlios.de>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
2010-07-22 05:52:10 +02:00
Blue Swirl
09d7ae9000 Fix warning about uninitialized variable
With gcc 4.2.1-sjlj (mingw32-2) I get this warning:
/src/qemu/exec.c: In function 'qemu_ram_alloc':
/src/qemu/exec.c:2777: warning: 'offset' may be used uninitialized in this function

Fix by initializing the variable.

Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
2010-07-07 19:37:53 +00:00
Alex Williamson
fb787f81e7 ramblocks: No more being lazy about duplicate names
Now that we have a working qemu_ram_free() and the primary runtime
user of it has been updated, don't be lenient about duplicate id strings.
We also shouldn't need to create them ondemand at the target.

Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
2010-07-06 10:36:29 -05:00
Alex Williamson
04b1665372 qemu_ram_free: Implement it
Now that we can support a ram_addr_t space with holes, we can implement
qemu_ram_free().

Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
2010-07-06 10:36:28 -05:00
Alex Williamson
cc9e98cb8f ramblocks: Make use of DeviceState pointer and BusInfo.get_dev_path
With these two pieces in place, we can start naming ramblocks.  When
the device is present and it lives on a bus that provides a device
path, we concatenate the path and the provided name.  Otherwise we
just use name.  The resulting id string must be unique.  For now we
assume an allocation for the same name and size is a device that has
been removed and reinserted and return the same block.  This will go
away once qemu_ram_free() is implemented.

Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
2010-07-06 10:36:28 -05:00
Alex Williamson
1724f04985 qemu_ram_alloc: Add DeviceState and name parameters
These will be used to generate unique id strings for ramblocks.  The name
field is required, the device pointer is optional as most callers don't
have a device.  When there's no device or the device isn't a child of
a bus implementing BusInfo.get_dev_path, the name should be unique for
the platform.

Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
2010-07-06 10:36:28 -05:00
Alex Williamson
0be71e324f savevm: Add DeviceState param
When available, we'd like to be able to access the DeviceState
when registering a savevm.  For buses with a get_dev_path()
function, this will allow us to create more unique savevm
id strings.

Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
2010-07-06 10:36:28 -05:00
Alex Williamson
d17b5288d9 Remove uses of ram.last_offset (aka last_ram_offset)
We currently need this either to allocate the next ram_addr_t for a
new block, or for total memory to be migrated.  Both of which we can
calculate without need of this to keep us in a contiguous address space.

Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
2010-07-06 10:36:27 -05:00
Jun Koi
bf298f83c3 A bit optimization for tlb_set_page()
This patch avoids handling write watchpoints on read-only memory access.
It also breaks the searching loop for watchpoint once the setup for
handling watchpoint later is done.

Signed-off-by: Jun Koi <junkoi2004@gmail.com>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
2010-06-30 20:25:55 +02:00
Alex Williamson
f471a17e9d ram_blocks: Convert to a QLIST
This makes the RAM block list easier to manipulate.  Also incorporate
relevant variables into the RAMList struct.

Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Acked-by: Chris Wright <chrisw@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
2010-06-14 11:12:53 -05:00
Richard Henderson
eba0b89379 tcg-s390: Allocate the code_gen_buffer near the main program.
This allows the use of direct calls to the helpers,
and a direct branch back to the epilogue.

Signed-off-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
2010-06-11 09:30:48 +02:00
Aurelien Jarno
239fda311a tcg: get rid of copy_size in TCGOpDef
copy_size is a left-over from the dyngen era, remove it.

Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
2010-06-09 16:10:50 +02:00
Richard Henderson
9002ec794e tcg: Initialize the prologue after GUEST_BASE is fixed.
This will allow backends to make intelligent choices about how
to implement GUEST_BASE.

Signed-off-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
2010-05-21 18:41:21 +02:00
Marcelo Tosatti
618a568da4 Fix -mem-path with hugetlbfs
Fallback to qemu_vmalloc in case file_ram_alloc fails.

Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
2010-05-11 14:02:21 -03:00
Richard Henderson
3cab721d0e Fill in unassigned mem read/write callbacks.
Implement the "functions may be omitted with NULL pointer"
interface mentioned in the function block comment by transforming
NULL entries in the read/write arrays into calls to the
unassigned_mem family of functions.

Signed-off-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
2010-05-07 16:58:38 +00:00
Michael S. Tsirkin
733f0b02c8 qemu: address todo comment in exec.c
exec.c has a comment 'XXX: optimize' for lduw_phys/stw_phys,
so let's do it, along the lines of stl_phys.

The reason to address 16 bit accesses specifically is that virtio relies
on these accesses to be done atomically, using memset as we do now
breaks this assumption, which is reported to cause qemu with kvm
to read wrong index values under stress.

https://bugzilla.redhat.com/show_bug.cgi?id=525323

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
2010-05-06 07:28:49 +02:00
Richard Henderson
3e0650a9c9 Fix zero-length write(2).
Signed-off-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
2010-05-06 06:45:12 +02:00
Paul Brook
2e9a5713f0 Remove PAGE_RESERVED
The usermode PAGE_RESERVED code is not required by the current mmap
implementation, and is already broken when guest_base != 0.
Unfortunately the bsd emulation still uses the old mmap implementation,
so we can't rip it out altogether.

Signed-off-by: Paul Brook <paul@codesourcery.com>
2010-05-05 16:32:59 +01:00
Richard Henderson
f64052478e Remove IO_MEM_SUBWIDTH.
Greatly simplify the subpage implementation by not supporting
multiple devices at the same address at different widths.  We
don't need full copies of mem_read/mem_write/opaque for each
address, only a single index back into the main io_mem_* arrays.

Signed-off-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
2010-04-25 12:59:33 +00:00
Jun Koi
24f7fb19b3 Cleanup dead code
This patch removes some dead code in exec.c

Signed-off-by: Jun Koi <junkoi2004@gmail.com>
Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
2010-04-11 20:15:01 +00:00
Aurelien Jarno
fd43690777 Revert "Avoid page_set_flags() assert in qemu-user host page protection code"
This reverts commit 01c0bef162.

(breaks build on 32-bit hosts)
2010-04-10 17:20:36 +02:00
Juergen Lock
01c0bef162 Avoid page_set_flags() assert in qemu-user host page protection code
V2 that uses endaddr = end-of-guest-address-space if !h2g_valid(endaddr)
after I found out that indeed works; and also disables the FreeBSD 6.x
/compat/linux/proc/self/maps fallback because it can return partial lines
if (at least I think that's the reason) the mappings change between
subsequent read() calls.

Signed-off-by: Juergen Lock <nox@jelal.kn-bremen.de>
Acked-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
2010-04-09 22:01:30 +02:00
Yoshiaki Tamura
f7c11b5350 Replace direct phys_ram_dirty access with wrapper functions.
Replaces direct phys_ram_dirty access with wrapper functions to prevent
direct access to the phys_ram_dirty bitmap.

Signed-off-by: Yoshiaki Tamura <tamura.yoshiaki@lab.ntt.co.jp>
Signed-off-by: OHMURA Kei <ohmura.kei@lab.ntt.co.jp>
Reviewed-by: Avi Kivity <avi@redhat.com>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
2010-04-08 11:11:21 +02:00
Paul Brook
355b194369 Split TLB addend and target_phys_addr_t
Historically the qemu tlb "addend" field was used for both RAM and IO accesses,
so needed to be able to hold both host addresses (unsigned long) and guest
physical addresses (target_phys_addr_t).  However since the introduction of
the iotlb field it has only been used for RAM accesses.

This means we can change the type of addend to unsigned long, and remove
associated hacks in the big-endian TCG backends.

We can also remove the host dependence from target_phys_addr_t.

Signed-off-by: Paul Brook <paul@codesourcery.com>
2010-04-05 00:28:53 +01:00
Aurelien Jarno
ebf50fb3b9 tcg: align static_code_gen_buffer to CODE_GEN_ALIGN
On ia64, the default memory alignement is not enough for a code
alignement. To fix that, force static_code_gen_buffer alignment
to CODE_GEN_ALIGN.

Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
2010-04-01 21:51:59 +02:00
Aurelien Jarno
45d679d643 linux-user: fix page_unprotect when host page size > target page size
When the host page size is bigger that the target one, unprotecting a
page should:
- mark all the target pages corresponding to the host page as writable
- invalidate all tb corresponding to the host page (and not the target
  page)

Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
2010-04-01 21:51:59 +02:00
Juergen Lock
f01576f185 Get bsd-user host page protection code working on FreeBSD hosts
Use kinfo_getvmmap(3) on FeeBSD >= 7.x and /compat/linux/proc on older
FreeBSD.  (kinfo_getvmmap is preferred since /compat/linux/proc is
usually only mounted on hosts also using the Linuxolator.)

This patch is a bit hacky because the includes needed for kinfo_getvmmap
conflict with other definitions in exec.c by default so I had to `trick
around' a little, but I built the result in FreeBSD 6.4-stable and
7.2-stable tbs and on 8-stable on the host so the hacks at least
should be stable.  (If this is a problem maybe we could also move the
kinfo_getvmmap invocations into a seperate source file but that would
be more work...)

Signed-off-by: Juergen Lock <nox@jelal.kn-bremen.de>
Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
2010-03-30 17:45:10 +00:00
Blue Swirl
29e922b61f Compile qemu-timer only once
Arrange various declarations so that also non-CPU code can access
them, adjust users.

Move CPU specific code to cpus.c.

Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
2010-03-29 19:24:00 +00:00
Aurelien Jarno
91dbed4ba1 exec: remove dead code
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
2010-03-28 18:47:25 +02:00
Michael Tokarev
6adc05492f be more specific in -mem-path error messages
Signed-Off-By: Michael Tokarev <mjt@tls.msk.ru>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
2010-03-27 15:26:36 +01:00
Paul Brook
d4c430a80f Large page TLB flush
QEMU uses a fixed page size for the CPU TLB.  If the guest uses large
pages then we effectively split these into multiple smaller pages, and
populate the corresponding TLB entries on demand.

When the guest invalidates the TLB by virtual address we must invalidate
all entries covered by the large page.  However the address used to
invalidate the entry may not be present in the QEMU TLB, so we do not
know which regions to clear.

Implementing a full vaiable size TLB is hard and slow, so just keep a
simple address/mask pair to record which addresses may have been mapped by
large pages.  If the guest invalidates this region then flush the
whole TLB.

Signed-off-by: Paul Brook <paul@codesourcery.com>
2010-03-17 02:44:41 +00:00
Paul Brook
7296abaccc Fix pagetable code
The multi-level pagetable code fails to iterate ove all entries because
of the L2_BITS v.s. L2_SIZE thinko.

Signed-off-by: Paul Brook <paul@codesourcery.com>
2010-03-14 14:58:46 +00:00
Blue Swirl
338e9e6ce5 Fix more wrong usermode virtual address types
Fixes warning:
  CC    sparc-bsd-user/exec.o
/src/qemu/exec.c: In function `page_check_range':
/src/qemu/exec.c:2375: warning: comparison is always true due to limited range of data type

Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
2010-03-13 09:48:08 +00:00
Paul Brook
b480d9b74d Fix usermode virtual address type
Usermode virtual addresses are abi_ulong, not target_ulong.

Signed-off-by: Paul Brook <paul@codesourcery.com>
2010-03-12 23:25:52 +00:00
Paul Brook
b3755a915e Disable phsyical memory handling in userspace emulation.
Code to handle physical memory access is not meaningful in usrmode emulation,
so disable it.

Signed-off-by: Paul Brook <paul@codesourcery.com>
2010-03-12 18:34:25 +00:00
Paul Brook
41c1b1c9eb Add tb_page_addr_t
The page tracking code in exec.c is used by both userspace and system
emulation.  Userspace emulation uses it to track virtual pages, and
system emulation to track ram pages.  Introduce a new type to hold this
kind of address.

Signed-off-by: Paul Brook <paul@codesourcery.com>
2010-03-12 17:23:50 +00:00
Richard Henderson
376a790970 Fix last page errors in page_check_range and page_set_flags.
The addr < end comparison prevents iterating over the last
page in the guest address space; an iteration based on
length avoids this problem.

At the same time, assert that the given address is in the
guest address space.

Signed-off-by: Richard Henderson <rth@twiddle.net>
2010-03-12 16:31:32 +00:00
Richard Henderson
5cd2c5b6ad Implement multi-level page tables.
Define L1_MAP_ADDR_SPACE_BITS to be either the virtual address size
(in user mode) or physical address size (in system mode), and use
that to size l1_map.  This rewrites page_find_alloc, page_flush_tb,
and walk_memory_regions.

Use TARGET_PHYS_ADDR_SPACE_BITS for the physical memory map based
off of l1_phys_map.  This rewrites page_phys_find_alloc and
phys_page_for_each.

Signed-off-by: Richard Henderson <rth@twiddle.net>
2010-03-12 16:31:09 +00:00
Richard Henderson
5270589032 Move TARGET_PHYS_ADDR_SPACE_BITS to target-*/cpu.h.
Removes a set of ifdefs from exec.c.

Introduce TARGET_VIRT_ADDR_SPACE_BITS for all targets other
than Alpha.  This will be used for page_find_alloc, which is
supposed to be using virtual addresses in the first place.

Signed-off-by: Richard Henderson <rth@twiddle.net>
2010-03-12 16:28:24 +00:00
Jan Kiszka
ea375f9ab8 KVM: Rework VCPU state writeback API
This grand cleanup drops all reset and vmsave/load related
synchronization points in favor of four(!) generic hooks:

- cpu_synchronize_all_states in qemu_savevm_state_complete
  (initial sync from kernel before vmsave)
- cpu_synchronize_all_post_init in qemu_loadvm_state
  (writeback after vmload)
- cpu_synchronize_all_post_init in main after machine init
- cpu_synchronize_all_post_reset in qemu_system_reset
  (writeback after system reset)

These writeback points + the existing one of VCPU exec after
cpu_synchronize_state map on three levels of writeback:

- KVM_PUT_RUNTIME_STATE (during runtime, other VCPUs continue to run)
- KVM_PUT_RESET_STATE   (on synchronous system reset, all VCPUs stopped)
- KVM_PUT_FULL_STATE    (on init or vmload, all VCPUs stopped as well)

This level is passed to the arch-specific VCPU state writing function
that will decide which concrete substates need to be written. That way,
no writer of load, save or reset functions that interact with in-kernel
KVM states will ever have to worry about synchronization again. That
also means that a lot of reasons for races, segfaults and deadlocks are
eliminated.

cpu_synchronize_state remains untouched, just as Anthony suggested. We
continue to need it before reading or writing of VCPU states that are
also tracked by in-kernel KVM subsystems.

Consequently, this patch removes many cpu_synchronize_state calls that
are now redundant, just like remaining explicit register syncs.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2010-03-04 00:29:28 -03:00