Enabling the I/O thread by default seems like an important part of declaring
1.0. Besides allowing true SMP support with KVM, the I/O thread means that the
TCG VCPU doesn't have to multiplex itself with the I/O dispatch routines which
currently requires a (racey) signal based alarm system.
I know there have been concerns about performance. I think so far the ones that
have come up (virtio-net) are most likely due to secondary reasons like
decreased batching.
I think we ought to force enabling I/O thread early in 1.0 development and
commit to resolving any lingering issues.
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
We can express the VCPU thread wakeup with the stop mechanism, saving
both qemu_system_ready and the qemu_system_cond. For KVM threads, we can
just enter the main loop as long as the thread is stopped. The central
TCG thread is better held back before the loop as there can be side
effects of the services called even when all CPUs are stopped.
Creating VCPUs in stopped state will also be required for proper CPU
hotplugging support.
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
In TCG mode, iothread and vcpus run in lock-step. So it's pointless to
send a signal from qemu_cpu_kick to the vcpu thread - if we got here,
the receiver already left the vcpu loop.
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
This conveys the intention better, and scales to more than >1
threads contending the mutex with the iothread (as long as all
of them have a "quiescent point" like the TCG thread has).
Also, on Mac OS X the fair_mutex somehow didn't work as intended
and deadlocked.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Tested-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
Both the signal thread (via sigwait()) and the cpu thread (via
a normal signal handler) were attempting to catch SIG_IPI.
This resulted in random freezes under Darwin.
This patch separates SIG_IPI from the rest of the signals handled
by the signal thread, because it is independently caught by the cpu
thread.
Signed-off-by: Alexandre Raymond <cerbere@gmail.com>
Acked-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
Changes since v1:
- take pthread_sigmask() out of the ifdef as it is now common
to both parts.
This fix effectively blocks, in the main thread, the signals handled
by signalfd or the compatibility signal thread.
This way, such signals are received synchronously in the main thread
through sigfd_handler() instead of triggering the signal handler
directly, asynchronously.
Signed-off-by: Alexandre Raymond <cerbere@gmail.com>
Acked-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
sigset_t, used by that header, is not available in mingw32 environments.
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com>
Add command line support for logging to a location other than /tmp/qemu.log.
With logging enabled (command line option -d), the log is written to
the hard-coded path /tmp/qemu.log. This patch adds support for writing
the log to a different location by passing the -D option.
Signed-off-by: Matthew Fernandez <matthew.fernandez@gmail.com>
Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
It is purely for icount-based virtual timers. And now that we got the
code right, rename the function to clarify the intended scope.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Tested-by: Edgar E. Iglesias <edgar.iglesias@gmail.com>
Signed-off-by: Edgar E. Iglesias <edgar.iglesias@gmail.com>
The previous patch however is not enough, because if the virtual CPU
goes to sleep waiting for a future timer interrupt to wake it up, qemu
deadlocks. The timer interrupt never comes because time is driven by
icount, but the vCPU doesn't run any insns.
You could say that VCPUs should never go to sleep in icount
mode if there is a pending vm_clock timer; rather time should
just warp to the next vm_clock event with no sleep ever taking place.
Even better, you can sleep for some time related to the
time left until the next event, to avoid that the warps are too visible
externally; for example, you could be sending network packets continously
instead of every 100ms.
This is what this patch implements. qemu_clock_warp is called: 1)
whenever a vm_clock timer is adjusted, to ensure the warp_timer is
synchronized; 2) at strategic points in the CPU thread, to make sure
the insn counter is synchronized before the CPU starts running.
In any case, the warp_timer is disabled while the CPU is running,
because the insn counter will then be making progress on its own.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Tested-by: Edgar E. Iglesias <edgar.iglesias@gmail.com>
Signed-off-by: Edgar E. Iglesias <edgar.iglesias@gmail.com>
The correct fix for -icount is to consider the biggest difference
between iothread and non-iothread modes. In the traditional model,
CPUs run _before_ the iothread calls select (or WaitForMultipleObjects
for Win32). In the iothread model, CPUs run while the iothread
isn't holding the mutex, i.e. _during_ those same calls.
So, the iothread should always block as long as possible to let
the CPUs run smoothly---the timeout might as well be infinite---and
either the OS or the CPU thread itself will let the iothread know
when something happens. At this point, the iothread wakes up and
interrupts the CPU.
This is exactly the approach that this patch takes: when cpu_exec_all
returns in -icount mode, and it is because a vm_clock deadline has
been met, it wakes up the iothread to process the timers. This is
really the "bulk" of fixing icount.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Tested-by: Edgar E. Iglesias <edgar.iglesias@gmail.com>
Signed-off-by: Edgar E. Iglesias <edgar.iglesias@gmail.com>
Here the int values fds[0], sigfd, s, sock and fd are converted
to void pointers which are later converted back to an int value.
These conversions should always use intptr_t instead of unsigned long.
They are needed for environments where sizeof(long) != sizeof(void *).
Signed-off-by: Stefan Weil <weil@mail.berlios.de>
Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
Based on patch by Glauber Costa:
To allow management applications like libvirt to apply CPU affinities to
the VCPU threads, expose their ID via info cpus. This patch provides the
pre-existing and used interface from qemu-kvm.
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
With in-kernel irqchip support enabled, the vcpu threads sleep in kernel
space while halted. Account for this difference in cpu_thread_is_idle.
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Commit 83f338f73e broke x86 hardware breakpoint emulation by moving the
debug exception handling out of cpu_exec. Fix this by moving all TCG
related bits back, only leaving the generic guest debugging parts in
cpus.c.
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
CC: TeLeMan <geleman@gmail.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
qemu_kvm_eat_signals requires POSIX support with realtime extensions for
sigtimedwait. Not all our target platforms provide this. Moreover,
undefined sigbus_reraise was referenced on non-Linux as well.
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
CC: Andreas Färber <andreas.faerber@web.de>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Whenever env->created becomes true, qemu_cpu_cond is signaled by
{kvm,tcg}_cpu_thread_fn.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
all_vcpus_paused can start returning true after penv->stopped changes
from 0 to 1. When this is done, qemu_pause_cond is always signaled.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
qemu_main_loop_start is the only place where qemu_system_ready is set
to 1.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
The following conditions can cause cpu_has_work(env) to become true:
- env->queued_work_first: run_on_cpu is already kicking the VCPU
- env->stop = 1: pause_all_vcpus is already kicking the VCPU
- env->stopped = 0: resume_all_vcpus is already kicking the VCPU
- vm_running = 1: vm_start is calling resume_all_vcpus
- env->halted = 0: see previous patch
- qemu_cpu_has_work(env): when it becomes true, board code should set
env->halted = 0 too.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
Sometimes vcpus are stopped directly without going through ->stop = 1.
Exit the VCPU execution loop in this case as well.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
We have qemu_cpu_self and qemu_thread_self. The latter is retrieving the
current thread, the former is checking for equality (using CPUState). We
also have qemu_thread_equal which is only used like qemu_cpu_self.
This refactors the interfaces, creating qemu_cpu_is_self and
qemu_thread_is_self as well ass qemu_thread_get_self.
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
Mixing up TCG bits with KVM already led to problems around eflags
emulation on x86. Moreover, quite some code that TCG requires on cpu
enty/exit is useless for KVM. So dispatch between tcg_cpu_exec and
kvm_cpu_exec as early as possible.
The core logic of cpu_halted from cpu_exec is added to
kvm_arch_process_irqchip_events. Moving away from cpu_exec makes
exception_index meaningless for KVM, we can simply pass the exit reason
directly (only "EXCP_DEBUG vs. rest" is relevant).
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
To prepare splitting up KVM and TCG CPU entry/exit, move the debug
exception into cpus.c and invoke cpu_handle_debug_exception on return
from qemu_cpu_exec.
This also allows to clean up the debug request signaling: We can assign
the job of informing main-loop to qemu_system_debug_request and stop the
calling cpu directly in cpu_handle_debug_exception. That means a debug
stop will now only be signaled via debug_requested and not additionally
via vmstop_requested.
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Instead of fiddling with debug_requested and vmstop_requested directly,
introduce qemu_system_debug_request and turn qemu_system_vmstop_request
into a public interface. This aligns those services with exiting ones in
vl.c.
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Define and use dedicated constants for vm_stop reasons, they actually
have nothing to do with the EXCP_* defines used so far. At this chance,
specify more detailed reasons so that VM state change handlers can
evaluate them.
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Avoid duplicate use of the function name cpu_has_work, it's confusing,
also their scope. Refactor cpu_has_work to cpu_thread_is_idle and do the
same with any_cpu_has_work.
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Pure interface cosmetics: Ensure that only kvm core services (as
declared in kvm.h) start with "kvm_". Prepend "qemu_" to those that
violate this rule in cpus.c. Also rename the corresponding tcg functions
for the sake of consistency.
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Introduce qemu_cpu_kick_self to send SIG_IPI to the calling VCPU
context. First user will be kvm.
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Currently, we only configure and process MCE-related SIGBUS events if
CONFIG_IOTHREAD is enabled. The groundwork is laid, we just need to
factor out the required handler registration and system configuration.
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
CC: Huang Ying <ying.huang@intel.com>
CC: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com>
CC: Jin Dongming <jin.dongming@np.css.fujitsu.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Found by Stefan Hajnoczi: There is a race in kvm_cpu_exec between
checking for exit_request on vcpu entry and timer signals arriving
before KVM starts to catch them. Plug it by blocking both timer related
signals also on !CONFIG_IOTHREAD and process those via signalfd.
As this fix depends on real signalfd support (otherwise the timer
signals only kick the compat helper thread, and the main thread hangs),
we need to detect the invalid constellation and abort configure.
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
CC: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Will be required for SIGBUS handling. For obvious reasons, this will
remain a nop on Windows hosts.
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Move qemu_kvm_eat_signals around and call it also when the IO-thread is
not used. Do not yet process SIGBUS, will be armed in a separate step.
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
We do not use the timeout, so drop its logic. As we always poll our
signals, we do not need to drop the global lock. Removing those calls
allows some further simplifications. Also fix the error processing of
sigpending at this chance.
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Block SIG_IPI, unblock it during KVM_RUN, just like in io-thread mode.
It's unused so far, but this infrastructure will be required for
self-IPIs and to process SIGBUS plus, in KVM mode, SIGIO and SIGALRM. As
Windows doesn't support signal services, we need to provide a stub for
the init function.
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>