Also delta in qemu_next_alarm_deadline is a 64 bit value so set the
default to INT64_MAX instead of INT32_MAX.
Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
Fix win32_rearm_timer and mm_rearm_timer: they should be able to handle
INT64_MAX as a delta parameter without overflowing.
Also, the next deadline in ms should be calculated rounding down rather
than up (see unix_rearm_timer and dynticks_rearm_timer).
Finally ChangeTimerQueueTimer takes an unsigned long and timeSetEvent
takes an unsigned int as delta, so cast the ms delta to the appropriate
unsigned integer.
Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
Basically, the main wait loop calls qemu_run_all_timers() unconditionally. The
first thing this routine used to do is to see if a timer had been serviced,
and then reset the loop timeout to the next deadline.
However, the new deadlines had not been calculated at that point, as
qemu_run_timers() had not been called yet for each of the clocks. So
qemu_rearm_alarm_timer() would end up with a negative or zero deadline, and
default to setting a 250us timeout for the loop.
As qemu_run_timers() is called for each clock, the real deadlines would be put
in place, but because a loop timeout was already set, the loop timeout would
not be changed.
Once that 250us timeout fired, the real deadline would be used for the
subsequent timeout.
For idle VMs, this effectively doubles the number of times through the loop,
doubling the number of select() system calls, timer calls, etc. putting added
scheduling pressure on the kernel. And under cgroups, this really causes a big
problem because the cgroup code does not scale well.
By simply running the timers before trying to rearm the timer, we always rearm
with a non-zero deadline, effectively halving the number of system calls.
Signed-off-by: Peter Portante <pportant@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
This patch combines qtest and -icount together to turn the vm_clock
into a source that can be fully managed by the client. To this end new
commands clock_step and clock_set are added. Hooking them with libqtest
is left as an exercise to the reader.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
Notifiers do not need to access both ends of the list, and using
a QLIST also simplifies the API.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
The non-dynticks timer variations are broken, so they can be
removed.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
These will be used when moving icount accounting to cpus.c.
Reviewed-by: Anthony Liguori <aliguori@us.ibm.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Today, when notifying a VM state change with vm_state_notify(),
we pass a VMSTOP macro as the 'reason' argument. This is not ideal
because the VMSTOP macros tell why qemu stopped and not exactly
what the current VM state is.
One example to demonstrate this problem is that vm_start() calls
vm_state_notify() with reason=0, which turns out to be VMSTOP_USER.
This commit fixes that by replacing the VMSTOP macros with a proper
state type called RunState.
Signed-off-by: Luiz Capitulino <lcapitulino@redhat.com>
Enabling the I/O thread by default seems like an important part of declaring
1.0. Besides allowing true SMP support with KVM, the I/O thread means that the
TCG VCPU doesn't have to multiplex itself with the I/O dispatch routines which
currently requires a (racey) signal based alarm system.
I know there have been concerns about performance. I think so far the ones that
have come up (virtio-net) are most likely due to secondary reasons like
decreased batching.
I think we ought to force enabling I/O thread early in 1.0 development and
commit to resolving any lingering issues.
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
Derived from kvm-tool patch
http://thread.gmane.org/gmane.comp.emulators.kvm.devel/74309
Ingo Molnar pointed out that sending the timer signal to the whole
process, just blocking it everywhere, is suboptimal with an increasing
number of threads. QEMU is also using this pattern so far.
Linux provides a (non-portable) way to restrict the signal to a single
thread: We can use SIGEV_THREAD_ID unless we are forced to emulate
signalfd via an additional thread. That case could theoretically be
optimized as well, but it doesn't look worth bothering.
Reviewed-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
QEMU_CLOCK_HOST is based on the system time which may jump backward in
case the admin or NTP adjusts it. RTC emulations and other device models
can suffer in this case as timers will stall for the period the clock
was tuned back.
This adds a detection mechanism that checks on every host clock readout
if the new time is before the last result. If that is the case a
notifier list is informed. Device models interested in this event can
register a notifier with the clock.
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
A timer that wakes up every millisecond puts a lot of stress on the
iothread. The large amount of IPIs causes very high context switch
activity, making emulation slow and the UI unusable. This is by the
way the same reason why the Windows timers were switched to dynticks.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Tested-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
dynticks will provide equally good timer granularity on all modern Linux
systems. This is more or less dead code these days.
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
Commit 68c23e5520 removed the
multimedia timer, but this timer is needed for certain
Linux kernels. Otherwise Linux boot stops with this error:
MP-BIOS bug: 8254 timer not connected to IO-APIC
So the multimedia timer is added again here.
Cc: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Stefan Weil <weil@mail.berlios.de>
The type casts are no longer needed after some small changes
in struct qemu_alarm_timer. This also improves readability
of the code.
Signed-off-by: Stefan Weil <weil@mail.berlios.de>
It is purely for icount-based virtual timers. And now that we got the
code right, rename the function to clarify the intended scope.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Tested-by: Edgar E. Iglesias <edgar.iglesias@gmail.com>
Signed-off-by: Edgar E. Iglesias <edgar.iglesias@gmail.com>
This reverts commits 225d02cd and c9f7383c. While some parts of
the latter could be saved, I preferred a smooth, complete revert.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Tested-by: Edgar E. Iglesias <edgar.iglesias@gmail.com>
Signed-off-by: Edgar E. Iglesias <edgar.iglesias@gmail.com>
The previous patch however is not enough, because if the virtual CPU
goes to sleep waiting for a future timer interrupt to wake it up, qemu
deadlocks. The timer interrupt never comes because time is driven by
icount, but the vCPU doesn't run any insns.
You could say that VCPUs should never go to sleep in icount
mode if there is a pending vm_clock timer; rather time should
just warp to the next vm_clock event with no sleep ever taking place.
Even better, you can sleep for some time related to the
time left until the next event, to avoid that the warps are too visible
externally; for example, you could be sending network packets continously
instead of every 100ms.
This is what this patch implements. qemu_clock_warp is called: 1)
whenever a vm_clock timer is adjusted, to ensure the warp_timer is
synchronized; 2) at strategic points in the CPU thread, to make sure
the insn counter is synchronized before the CPU starts running.
In any case, the warp_timer is disabled while the CPU is running,
because the insn counter will then be making progress on its own.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Tested-by: Edgar E. Iglesias <edgar.iglesias@gmail.com>
Signed-off-by: Edgar E. Iglesias <edgar.iglesias@gmail.com>
These patches are already not doing a great service to out-of-tree
modifications to QEMU. However, at least we can warn them by getting
rid of the old confusing functions, or otherwise causing compilation
errors. This patch removes qemu_get_clock; the previous one changed
qemu_new_timer's signature.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
This enables rt_clock timers to use nanosecond resolution, just by
using the _ns functions; there is really no reason to forbid that.
Migrated timers are all using vm_clock (of course; but I checked that
anyway) so the timers in the savevm files are already in nanosecond
resolution. So this patch makes no change to the migration format.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
This was done with:
sed -i 's/qemu_get_clock\>/qemu_get_clock_ns/' \
$(git grep -l 'qemu_get_clock\>' )
sed -i 's/qemu_new_timer\>/qemu_new_timer_ns/' \
$(git grep -l 'qemu_new_timer\>' )
after checking that get_clock and new_timer never occur twice
on the same line. There were no missed occurrences; however, even
if there had been, they would have been caught by the compiler.
There was exactly one false positive in qemu_run_timers:
- current_time = qemu_get_clock (clock);
+ current_time = qemu_get_clock_ns (clock);
which is of course not in this patch.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
This was done with:
sed -i '/get_clock\>.*rt_clock/s/get_clock\>/get_clock_ms/' \
$(git grep -l 'get_clock\>.*rt_clock' )
sed -i '/new_timer\>.*rt_clock/s/new_timer\>/new_timer_ms/' \
$(git grep -l 'new_timer\>.*rt_clock' )
after checking that get_clock and new_timer never occur twice
on the same line. There were no missed occurrences; however, even
if there had been, they would have been caught by the compiler.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Multimedia timers are only useful for compatibility with Windows NT 4.0
and earlier. Plus, the implementation in Wine is extremely heavyweight.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
The !use_icount code is the same for iothread and non-iothread,
except that the timeout is different. Since the timeout might as
well be infinite and is only masking bugs, use the higher value.
With this change the !use_icount code is handled equivalently
in qemu_icount_delta and qemu_calculate_timeout, and we rip it
out of the former.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Edgar E. Iglesias <edgar.iglesias@petalogix.com>
qemu_next_alarm_deadline() is needed by MinGW, too.
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Anthony Liguori <aliguori@us.ibm.com>
Acked-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Stefan Weil <weil@mail.berlios.de>
Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
This patch shows how using the correct formula for
qemu_next_deadline_dyntick can simplify the code of
host_alarm_handler and eliminate useless duplication.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
When the QEMU_CLOCK_HOST clock was added, computation of its
deadline was added to qemu_next_deadline, which is correct but
incomplete.
I noticed this by reading the very convoluted rules whereby
qemu_next_deadline_dyntick is computed, which miss QEMU_CLOCK_HOST
when use_icount is true. This patch inlines qemu_next_deadline
into qemu_next_deadline_dyntick, and then corrects the logic to skip
only QEMU_CLOCK_VIRTUAL when use_icount is true.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Cc: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
When using the iothread together with icount, make sure the
qemu_icount counter makes forward progress when the vcpu is
idle to avoid deadlocks.
Signed-off-by: Edgar E. Iglesias <edgar.iglesias@gmail.com>
Move timer init functions to a new file, qemu-timer-common.c. Make other
critical timer functions inlined to preserve performance in
qemu-timer.c, also move muldiv64() (used by the inline functions)
to qemu-timer.h.
Adjust block/raw-posix.c and simpletrace.c to use get_clock() directly.
Remove a similar/duplicate definition in qemu-tool.c.
Adjust hw/omap_clk.c to include qemu-timer.h because muldiv64() is used
there.
After this change, tracing can be used also for user code and
simpletrace on Win32.
Cc: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com>
Acked-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com>
Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
When available, we'd like to be able to access the DeviceState
when registering a savevm. For buses with a get_dev_path()
function, this will allow us to create more unique savevm
id strings.
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
Arrange various declarations so that also non-CPU code can access
them, adjust users.
Move CPU specific code to cpus.c.
Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
The period for Win32 timers is very short and always the same
independent of dynticks, so it's possible that the timer fires
before qemu_run_all_timers has reset alarm_timer->pending to zero.
Reset alarm_timer->pending before rearming.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>