Commit Graph

276 Commits

Author SHA1 Message Date
Jan Kiszka
ea375f9ab8 KVM: Rework VCPU state writeback API
This grand cleanup drops all reset and vmsave/load related
synchronization points in favor of four(!) generic hooks:

- cpu_synchronize_all_states in qemu_savevm_state_complete
  (initial sync from kernel before vmsave)
- cpu_synchronize_all_post_init in qemu_loadvm_state
  (writeback after vmload)
- cpu_synchronize_all_post_init in main after machine init
- cpu_synchronize_all_post_reset in qemu_system_reset
  (writeback after system reset)

These writeback points + the existing one of VCPU exec after
cpu_synchronize_state map on three levels of writeback:

- KVM_PUT_RUNTIME_STATE (during runtime, other VCPUs continue to run)
- KVM_PUT_RESET_STATE   (on synchronous system reset, all VCPUs stopped)
- KVM_PUT_FULL_STATE    (on init or vmload, all VCPUs stopped as well)

This level is passed to the arch-specific VCPU state writing function
that will decide which concrete substates need to be written. That way,
no writer of load, save or reset functions that interact with in-kernel
KVM states will ever have to worry about synchronization again. That
also means that a lot of reasons for races, segfaults and deadlocks are
eliminated.

cpu_synchronize_state remains untouched, just as Anthony suggested. We
continue to need it before reading or writing of VCPU states that are
also tracked by in-kernel KVM subsystems.

Consequently, this patch removes many cpu_synchronize_state calls that
are now redundant, just like remaining explicit register syncs.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2010-03-04 00:29:28 -03:00
Marcelo Tosatti
c902760fb2 Add option to use file backed guest memory
Port qemu-kvm's -mem-path and -mem-prealloc options. These are useful
for backing guest memory with huge pages via hugetlbfs.

Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
CC: john cooper <john.cooper@redhat.com>
2010-03-04 00:28:47 -03:00
Paul Brook
c527ee8fc8 Avoid tlb_set_page in userspace emulation
tlb_set_page isn't meaningful for userspace emulation, so remove it.

Signed-off-by: Paul Brook <paul@codesourcery.com>
2010-03-01 04:40:29 +00:00
Paul Brook
c04b2b7899 Move subpage definitions
Move definitions for subpage handling into !CONFIG_USER_ONLY code.

Signed-off-by: Paul Brook <paul@codesourcery.com>
2010-03-01 04:40:22 +00:00
Paul Brook
a68fe89caf Remove bogus cpu_physical_memory_rw
Userspace doesn't have physical memory, so cpu_physical_memory_rw
makes no sense.  This is only used to implement cpu_memory_rw_debug, so
just implement that directly instead.

Signed-off-by: Paul Brook <paul@codesourcery.com>
2010-03-01 00:08:59 +00:00
Paul Brook
6d9a13042d Remove l1_phys_map from userspace emulation
Userspace emulation doesn't have a physical address space, so
l1_phys_map makes no sense. This code is never actually used, so don't
try and build it.

Signed-off-by: Paul Brook <paul@codesourcery.com>
2010-02-28 23:55:53 +00:00
Paul Brook
94df27fd2f Fix userspace breakpoint invalidation
Remove bogus virtual->physical address translation in
breakpoint_invalidate for userspace emulation.

Signed-off-by: Paul Brook <paul@codesourcery.com>
2010-02-28 23:47:45 +00:00
Michael S. Tsirkin
7b8f3b7834 kvm: move kvm to use memory notifiers
remove direct kvm calls from exec.c, make
kvm use memory notifiers framework instead.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Acked-by: Avi Kivity <avi@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
2010-02-09 16:56:13 -06:00
Michael S. Tsirkin
f6f3fbcab0 qemu: memory notifiers
This adds notifiers for phys memory changes: a set of callbacks that
vhost can register and update kernel accordingly.  Down the road, kvm
code can be switched to use these as well, instead of calling kvm code
directly from exec.c as is done now.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
2010-02-09 16:56:13 -06:00
Anthony Liguori
8217d94586 Merge remote branch 'qemu-kvm/uq/master' into staging-tmp 2010-02-08 10:06:54 -06:00
Riku Voipio
fd052bf63a linux-user: remove signal handler before calling abort()
Qemu may hang in host_signal_handler after qemu has done a
seppuku with cpu_abort(). But at this stage we are not really
interested in target process coredump anymore, so unregister
host_signal_handler to die grafefully.

Signed-off-by: Riku Voipio <riku.voipio@nokia.com>
2010-02-06 17:19:43 +01:00
Riku Voipio
cab1b4bdc7 fix locking error with current_tb
Signed-off-by: Riku Voipio <riku.voipio@nokia.com>
2010-02-06 17:19:43 +01:00
Paolo Bonzini
a484156557 exec.c: dead assignments
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
2010-02-05 18:13:10 +00:00
Sheng Yang
62a2744ca0 kvm: Flush coalesced MMIO buffer periodly
The default action of coalesced MMIO is, cache the writing in buffer, until:
1. The buffer is full.
2. Or the exit to QEmu due to other reasons.

But this would result in a very late writing in some condition.
1. The each time write to MMIO content is small.
2. The writing interval is big.
3. No need for input or accessing other devices frequently.

This issue was observed in a experimental embbed system. The test image
simply print "test" every 1 seconds. The output in QEmu meets expectation,
but the output in KVM is delayed for seconds.

Per Avi's suggestion, I hooked flushing coalesced MMIO buffer in VGA update
handler. By this way, We don't need vcpu explicit exit to QEmu to
handle this issue.

Signed-off-by: Sheng Yang <sheng@linux.intel.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2010-02-03 19:47:33 -02:00
Herve Poussineau
f8a83245d9 win32: pair qemu_memalign() with qemu_vfree()
Win32 suffers from a very big memory leak when dealing with SCSI devices.
Each read/write request allocates memory with qemu_memalign (ie
VirtualAlloc) but frees it with qemu_free (ie free).
Pair all qemu_memalign() calls with qemu_vfree() to prevent such leaks.

Signed-off-by: Herve Poussineau <hpoussin@reactos.org>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
2010-01-26 16:41:06 -06:00
Riku Voipio
f76cfe56d9 linux-user: enable tb unlinking when compiled with NPTL
Fixes receiving signals when guest code is being executed in a tight
loop. For an example, try interrupting the following code with ctrl-c.

http://nchipin.kos.to/test-loop.c

The tight loop is ofcourse brainless, but it is also exactly how the waitpid* testcases
are implemented.

Signed-off-by: Riku Voipio <riku.voipio@nokia.com>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
2009-12-19 19:45:26 +01:00
Riku Voipio
c6703b4761 Give a error when running out of iomem areas.
The limit of iomem areas is quite low. Without the
debug print, it is quite hard to figure out why more
devices are not getting registered.

Signed-off-by: Riku Voipio <riku.voipio@nokia.com>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
2009-12-18 23:23:56 +01:00
Juha Riihimäki
1e8b27ca85 Fix win32 log file location
/tmp doesn't exist under win32. Ease the pain of win32 development slightly.

From: Juha Riihimäki <juha.riihimaki@nokia.com>
Signed-off-by: Riku Voipio <riku.voipio@nokia.com>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
2009-12-18 23:23:56 +01:00
Alexander Graf
6b02494d64 Allocate physical memory in low virtual address space
KVM on S390x requires the virtual address space of the guest's RAM to be
within the first 256GB.

The general direction I'd like to see KVM on S390 move is that this requirement
is losened, but for now that's what we're stuck with.

So let's just hack up qemu_ram_alloc until KVM behaves nicely :-).

Signed-off-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
2009-12-05 17:36:02 +01:00
Aurelien Jarno
a167ba5085 Add support for GNU/kFreeBSD
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
2009-11-29 18:00:41 +01:00
Izik Eidus
ccb167e9d7 ksm support
Call MADV_MERGEABLE on guest memory allocations.  MADV_MERGABLE will be
available starting in Linux 2.6.32.  This system call registers a region of
virtual address space with Linux as a candidate for transparent memory
sharing.

Patchworks-ID: 35447
Signed-off-by: Izik Eidus <ieidus@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
2009-10-15 09:32:04 -05:00
Michael S. Tsirkin
8f2498f9f6 fix comment on cpu_register_physical_memory_offset
We don't require full pages in cpu_register_physical_memory,
except for RAM.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
2009-10-05 09:32:51 -05:00
Juan Quintela
d4bfa4d7c6 vmstate: remove const from pre_save() functions
Signed-off-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
2009-10-05 09:32:37 -05:00
Juan Quintela
e59fb3741b vmstate: add version_id argument to post_load
Signed-off-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
2009-10-05 09:32:36 -05:00
Anthony Liguori
c227f0995e Revert "Get rid of _t suffix"
In the very least, a change like this requires discussion on the list.

The naming convention is goofy and it causes a massive merge problem.  Something
like this _must_ be presented on the list first so people can provide input
and cope with it.

This reverts commit 99a0949b72.

Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
2009-10-01 16:12:16 -05:00
malc
99a0949b72 Get rid of _t suffix
Some not so obvious bits, slirp and Xen were left alone for the time
being.

Signed-off-by: malc <av1474@comtv.ru>
2009-10-01 22:45:02 +04:00
Blue Swirl
72cf2d4f0e Fix sys-queue.h conflict for good
Problem: Our file sys-queue.h is a copy of the BSD file, but there are
some additions and it's not entirely compatible. Because of that, there have
been conflicts with system headers on BSD systems. Some hacks have been
introduced in the commits 15cc923584,
f40d753718,
96555a96d7 and
3990d09adf but the fixes were fragile.

Solution: Avoid the conflict entirely by renaming the functions and the
file. Revert the previous hacks.

Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
2009-09-12 07:36:22 +00:00
Juan Quintela
e7f4eff7fb vmstate: port cpu_comon
Signed-off-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
2009-09-11 11:10:05 -05:00
Edgar E. Iglesias
faed1c2a23 microblaze: Trap on bus accesses to unmapped areas.
Signed-off-by: Edgar E. Iglesias <edgar.iglesias@gmail.com>
2009-09-03 13:25:09 +02:00
Avi Kivity
4c0960c0c4 kvm: Simplify cpu_synchronize_state()
cpu_synchronize_state() is a little unreadable since the 'modified'
argument isn't self-explanatory.  Simplify it by making it always
synchronize the kernel state into qemu, and automatically flush the
registers back to the kernel if they've been synchronized on this
exit.

Signed-off-by: Avi Kivity <avi@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
2009-08-27 20:35:30 -05:00
Blue Swirl
d60efc6b0d Make CPURead/WriteFunc structure 'const'
Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
2009-08-25 18:29:31 +00:00
Anthony Liguori
4a1418e07b Unbreak large mem support by removing kqemu
kqemu introduces a number of restrictions on the i386 target.  The worst is that
it prevents large memory from working in the default build.

Furthermore, kqemu is fundamentally flawed in a number of ways.  It relies on
the TSC as a time source which will not be reliable on a multiple processor
system in userspace.  Since most modern processors are multicore, this severely
limits the utility of kqemu.

kvm is a viable alternative for people looking to accelerate qemu and has the
benefit of being supported by the upstream Linux kernel.  If someone can
implement work arounds to remove the restrictions introduced by kqemu, I'm
happy to avoid and/or revert this patch.

N.B. kqemu will still function in the 0.11 series but this patch removes it from
the 0.12 series.

Paul, please Ack or Nack this patch.

Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
2009-08-24 08:02:55 -05:00
Blue Swirl
660f11be54 Fix Sparse warnings: "Using plain integer as NULL pointer"
Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
2009-07-31 21:16:51 +00:00
Juan Quintela
2f7bb8780a rename USE_NPTL to CONFIG_USE_NPTL
Signed-off-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
2009-07-27 14:10:55 -05:00
Filip Navara
bf65f53fba Remove setvbuf(<handle>, NULL, _IOLBF, 0) calls for Win32
On Win32 the setvbuf function requires the last parameter to be size between 2 and INT_MAX bytes, so the calls always failed. Since the whole point of the calls is to set line-buffered mode for the file handle and that's not supported on Win32 anyway, conditionally remove them.

Signed-off-by: Filip Navara <filip.navara@gmail.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
2009-07-27 14:09:15 -05:00
Blue Swirl
0bf9e31af1 Fix most warnings (errors with -Werror) when debugging is enabled
I used the following command to enable debugging:
perl -p -i -e 's/^\/\/#define DEBUG/#define DEBUG/g' * */* */*/*

Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
2009-07-20 17:19:25 +00:00
Igor Kovalenko
0873898472 tlb flush cleanup
Use static empty variable s_cputlb_empty_entry to clear entries,
also reset addend member when clearing entries.
This helps running with valgrind/memcheck

Signed-off-by: igor.v.kovalenko@gmail.com

--
Kind regards,
Igor V. Kovalenko
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
2009-07-16 17:28:50 -05:00
Blue Swirl
8167ee8839 Update to a hopefully more future proof FSF address
Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
2009-07-16 20:47:01 +00:00
Isaku Yamahata
34d5e948e8 cpu_unregister_map_client: fix memory leak.
fix memory leak in cpu_unregister_map_client() and cpu_notify_map_clients().

Signed-off-by: Isaku Yamahata <yamahata@valinux.co.jp>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
2009-06-29 14:18:06 -05:00
Stefan Weil
f8e2af11d9 Win32: Reduce section alignment for Windows.
Maximum alignment for Win32 is 16, so don't try
to set it to 32. Otherwise the compiler complains:

exec.c:102: warning: alignment of 'code_gen_prologue'
is greater than maximum object file alignment.  Using 16

Signed-off-by: Stefan Weil <weil@mail.berlios.de>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
2009-06-22 10:15:31 -05:00
Isaku Yamahata
cfde4bd931 exec.c: remove unnecessary #if NB_MMU_MODES
remove unnecessary #if NB_MMU_MODES by using loop.

Signed-off-by: Isaku Yamahata <yamahata@valinux.co.jp>
Acked-by: Edgar E. Iglesias <edgar.iglesias@gmail.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
2009-06-16 15:52:38 -05:00
Glauber Costa
950f147249 provide cpu_index to env mapping
There are some people interested in, given a cpu number,
pick its CPUState. KVM is an example, although not yet in tree.
This patch provides a way of doing that.

Signed-off-by: Glauber Costa <glommer@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
2009-06-16 15:36:47 -05:00
Avi Kivity
e9179ce1a0 Rearrange io_mem_init()
Move io_mem_init() downwards to avoid a forward declaration.  No code change.

Signed-off-by: Avi Kivity <avi@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
2009-06-16 15:18:38 -05:00
Avi Kivity
1eed09cb4a Remove io_index argument from cpu_register_io_memory()
The parameter is always zero except when registering the three internal
io regions (ROM, unassigned, notdirty).  Remove the parameter to reduce
the API's power, thus facilitating future change.

Signed-off-by: Avi Kivity <avi@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
2009-06-16 15:18:37 -05:00
Mika Westerberg
edf8e2af14 linux-user: implemented ELF coredump support for ARM target
When target process is killed with signal (such signal that
should dump core) a coredump file is created.  This file is
similar than coredump generated by Linux (there are few exceptions
though).

Riku Voipio: added support for rlimit

Signed-off-by: Mika Westerberg <mika.westerberg@iki.fi>
Signed-off-by: Riku Voipio <riku.voipio@iki.fi>
2009-06-16 16:56:28 +03:00
Nathan Froyd
1e9fa73016 fix gdbstub support for multiple threads in usermode, v3
When debugging multi-threaded programs, QEMU's gdb stub would report the
correct number of threads (the qfThreadInfo and qsThreadInfo packets).
However, the stub was unable to actually switch between threads (the T
packet), since it would report every thread except the first as being
dead.  Furthermore, the stub relied upon cpu_index as a reliable means
of assigning IDs to the threads.  This was a bad idea; if you have this
sequence of events:

initial thread created
new thread #1
new thread #2
thread #1 exits
new thread #3

thread #3 will have the same cpu_index as thread #1, which would confuse
GDB.  (This problem is partly due to the remote protocol not having a
good way to send thread creation/destruction events.)

We fix this by using the host thread ID for the identifier passed to GDB
when debugging a multi-threaded userspace program.  The thread ID might
wrap, but the same sort of problems with wrapping thread IDs would come
up with debugging programs natively, so this doesn't represent a
problem.

Signed-off-by: Nathan Froyd <froydnj@codesourcery.com>
2009-06-04 10:04:49 +01:00
Jan Kiszka
b0a46a333a kvm: Add missing bits to support live migration
This patch adds the missing hooks to allow live migration in KVM mode.
It adds proper synchronization before/after saving/restoring the VCPU
states (note: PPC is untested), hooks into
cpu_physical_memory_set_dirty_tracking() to enable dirty memory logging
at KVM level, and synchronizes that drity log into QEMU's view before
running ram_live_save().

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
2009-05-22 10:50:33 -05:00
Jan Kiszka
151f7749f2 kvm: Rework dirty bitmap synchronization
Extend kvm_physical_sync_dirty_bitmap() so that is can sync across
multiple slots. Useful for updating the whole dirty log during
migration. Moreover, properly pass down errors the whole call chain.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
2009-05-22 10:50:33 -05:00
Stuart Brady
ccbb4d44fc Fix typos in comments in exec.c
This patch fixes several typos in comments in exec.c:

            longet -> longer
       recommanded -> recommended
        ajustments -> adjustments
   inconsistancies -> inconsistencies
           phsical -> physical
       positionned -> positioned
       succesfully -> successfully
      regon_offset -> region_offset

and also:

      start_region -> start_addr

Signed-off-by: Stuart Brady <stuart.brady@gmail.com>
2009-05-03 21:58:28 +03:00
Jan Kiszka
6f0437e8de kvm: Avoid COW if KVM MMU is asynchronous
Avi Kivity wrote:
> Suggest wrapping in a function and hiding it deep inside kvm-all.c.
>

Done in v2:

---------->

If the KVM MMU is asynchronous (kernel does not support MMU_NOTIFIER),
we have to avoid COW for the guest memory. Otherwise we risk serious
breakage when guest pages change there physical locations due to COW
after fork. Seen when forking smbd during runtime via -smb.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
2009-05-01 09:44:11 -05:00