The way the throttling algorithm works is that requests start being
throttled once the bucket level exceeds the burst limit. When we get
there the bucket leaks at the level set by the user (bkt->avg), and
that leak rate is what prevents guest I/O from exceeding the desired
limit.
If we don't allow bursts (i.e. bkt->max == 0) then we can start
throttling requests immediately. The problem with keeping the
threshold at 0 is that it only allows one request at a time, and as
soon as there's a bit of I/O from the guest every other request will
be throttled and performance will suffer considerably. That can even
make the guest unable to reach the throttle limit if that limit is
high enough, and that happens regardless of the block scheduler used
by the guest.
Increasing that threshold gives flexibility to the guest, allowing it
to perform short bursts of I/O before being throttled. Increasing the
threshold too much does not make a difference in the long run (because
it's the leak rate what defines the actual throughput) but it does
allow the guest to perform longer initial bursts and exceed the
throttle limit for a short while.
A burst value of bkt->avg / 10 allows the guest to perform 100ms'
worth of I/O at the target rate without being throttled.
Signed-off-by: Alberto Garcia <berto@igalia.com>
Message-id: 31aae6645f0d1fbf3860fb2b528b757236f0c0a7.1503580370.git.berto@igalia.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
throttle_config() cancels the timers of the calling BlockBackend. This
doesn't make sense because other BlockBackends in the group remain
untouched. There's no need to cancel the timers in the one specific
BlockBackend so let's not do that. Throttled requests will run as
scheduled and future requests will follow the new configuration. This
also allows a throttle group's configuration to be changed even when it
has no members.
Signed-off-by: Manos Pitsidianakis <el13635@mail.ntua.gr>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Clock type in throttling is currently inferred by the ThrottleTimer's
clock type even though it is a per-ThrottleGroup property; it doesn't
make sense to have different clock types in the same group. Moving this
to a field in ThrottleGroup can simplify some of the throttle functions.
Signed-off-by: Manos Pitsidianakis <el13635@mail.ntua.gr>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Throttling has a weird property that throttle_get_config() does not
always return the same throttling settings that were given with
throttle_config(). In other words, the set and get functions aren't
symmetric.
If .max is 0 then the throttling code assigns a default value of .avg /
10 in throttle_config(). This is an implementation detail of the
throttling algorithm. When throttle_get_config() is called the .max
value returned should still be 0.
Users are exposed to this quirk via "info block" or "query-block"
monitor commands. This has caused confusion because it looks like a bug
when an unexpected value is reported.
This patch hides the .max value adjustment in throttle_get_config() and
updates test-throttle.c appropriately.
Reported-by: Nini Gu <ngu@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Alberto Garcia <berto@igalia.com>
Message-id: 20170301115026.22621-4-stefanha@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Setting FOO_max to a value that is lower than FOO does not make
sense, and it produces odd results depending on the value of
FOO_max_length. Although the user should not set that configuration
in the first place it's better to reject it explicitly.
https://bugzilla.redhat.com/show_bug.cgi?id=1355665
Signed-off-by: Alberto Garcia <berto@igalia.com>
Reported-by: Gu Nini <ngu@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Message-id: 663d5aca406060e31f80d8113f77b6feee63b919.1469693110.git.berto@igalia.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
In a similar vein to commit ee2bdc33c9
("throttle: refuse bps_max/iops_max without bps/iops") it is likely that
the user made a configuration error if iops-size has been set but no
iops limit has been set.
Print an error message so the user can check their throttling
configuration. They should either remove iops-size if they don't want
any throttling or specify one of iops-total, iops-read, or iops-write.
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Alberto Garcia <berto@igalia.com>
Message-id: 1464828031-25601-1-git-send-email-stefanha@redhat.com
Commit 57cb38b included qapi/error.h into qemu/osdep.h to get the
Error typedef. Since then, we've moved to include qemu/osdep.h
everywhere. Its file comment explains: "To avoid getting into
possible circular include dependencies, this file should not include
any other QEMU headers, with the exceptions of config-host.h,
compiler.h, os-posix.h and os-win32.h, all of which are doing a
similar job to this file and are under similar constraints."
qapi/error.h doesn't do a similar job, and it doesn't adhere to
similar constraints: it includes qapi-types.h. That's in excess of
100KiB of crap most .c files don't actually need.
Add the typedef to qemu/typedefs.h, and include that instead of
qapi/error.h. Include qapi/error.h in .c files that need it and don't
get it now. Include qapi-types.h in qom/object.h for uint16List.
Update scripts/clean-includes accordingly. Update it further to match
reality: replace config.h by config-target.h, add sysemu/os-posix.h,
sysemu/os-win32.h. Update the list of includes in the qemu/osdep.h
comment quoted above similarly.
This reduces the number of objects depending on qapi/error.h from "all
of them" to less than a third. Unfortunately, the number depending on
qapi-types.h shrinks only a little. More work is needed for that one.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
[Fix compilation without the spice devel packages. - Paolo]
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
This patch adds support for burst periods to the throttling code.
With this feature the user can keep performing bursts as defined by
the LeakyBucket.max rate for a configurable period of time.
Signed-off-by: Alberto Garcia <berto@igalia.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
We can currently initialize ThrottleConfig by zeroing all its fields,
but this will change with the new fields to define the length of the
burst periods.
This patch introduces a new throttle_config_init() function and uses it
to replace all memset() calls that initialize ThrottleConfig directly.
Signed-off-by: Alberto Garcia <berto@igalia.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
There's no need to keep throttle_conflicting(), throttle_is_valid()
and throttle_max_is_missing_limit() as separate functions, so this
patch merges all three into one.
As a consequence, check_throttle_config() becomes redundant and can be
replaced with throttle_is_valid().
Signed-off-by: Alberto Garcia <berto@igalia.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
The caller does not need to set it, and this will allow us to refactor
this function later.
Signed-off-by: Alberto Garcia <berto@igalia.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
The caller does not need to set it, and this will allow us to refactor
this function later.
Signed-off-by: Alberto Garcia <berto@igalia.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
The caller does not need to set it, and this will allow us to refactor
this function later.
Signed-off-by: Alberto Garcia <berto@igalia.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
This function is only used internally in throttle.c
Signed-off-by: Alberto Garcia <berto@igalia.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Clean up includes so that osdep.h is included first and headers
which it implies are not included manually.
This commit was created with scripts/clean-includes.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 1454089805-5470-6-git-send-email-peter.maydell@linaro.org
extract_common_blockdev_options() uses qemu_opt_get_number() to parse
the bps/iops numbers to uint64_t, then converts to double and stores in
ThrottleConfig. The actual parsing is done by strtoull() in
parse_option_number(). Negative numbers are wrapped to large positive
ones, and stored.
We used to reject negative numbers since 7d81c1413c, but this regressed
when the option parsing code was changed later. Now fix this again.
This time, define an arbitrary large upper limit (1e15), and check the
values so both negative and impractically big numbers are caught and
reported.
Signed-off-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Alberto Garcia <berto@igalia.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
The bps_max/iops_max values are meaningless without corresponding
bps/iops values. Reported an error if bps_max/iops_max is given without
bps/iops.
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Alberto Garcia <berto@igalia.com>
Message-id: 1438683733-21111-2-git-send-email-stefanha@redhat.com
Commit e0cf11f31c ("timer: Use a single
definition of NSEC_PER_SEC for the whole codebase") renamed
NANOSECONDS_PER_SECOND to NSEC_PER_SEC.
On Mac OS X there is a <dispatch/time.h> system header which also
defines NSEC_PER_SEC. This causes compiler warnings.
Let's use the old name instead. It's longer but it doesn't clash.
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-id: 1436364609-7929-1-git-send-email-stefanha@redhat.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Group throttling will share ThrottleState between multiple bs.
As a consequence the ThrottleState will be accessed by multiple aio
context.
Timers are tied to their aio context so they must go out of the
ThrottleState structure.
This commit paves the way for each bs of a common ThrottleState to
have its own timer.
Signed-off-by: Benoit Canet <benoit.canet@nodalink.com>
Signed-off-by: Alberto Garcia <berto@igalia.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-id: 6cf9ea96d8b32ae2f8769cead38f68a6a0c8c909.1433779731.git.berto@igalia.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Block I/O throttling uses timers and currently always adds them to the
main loop. Throttling will break if bdrv_set_aio_context() is used to
move a BlockDriverState to a different AioContext.
This patch adds throttle_detach/attach_aio_context() interfaces so the
throttling timers and uses them to move timers to the new AioContext.
Note that bdrv_set_aio_context() already drains all requests so we're
sure no throttled requests are pending.
The test cases need to be updated since the throttle_init() interface
has changed.
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Benoit Canet <benoit@irqsave.net>
Implement the continuous leaky bucket algorithm devised on IRC as a separate
module.
Signed-off-by: Benoit Canet <benoit@irqsave.net>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>