qemu/include
David Gibson ed48c59875 virtio-balloon: Safely handle BALLOON_PAGE_SIZE < host page size
The virtio-balloon always works in units of 4kiB (BALLOON_PAGE_SIZE), but
we can only actually discard memory in units of the host page size.

Now, we handle this very badly: we silently ignore balloon requests that
aren't host page aligned, and for requests that are host page aligned we
discard the entire host page.  The latter can corrupt guest memory if its
page size is smaller than the host's.

The obvious choice would be to disable the balloon if the host page size is
not 4kiB.  However, that would break the special case where host and guest
have the same page size, but that's larger than 4kiB.  That case currently
works by accident[1] - and is used in practice on many production POWER
systems where 64kiB has long been the Linux default page size on both host
and guest.

To make the balloon safe, without breaking that useful special case, we
need to accumulate 4kiB balloon requests until we have a whole contiguous
host page to discard.

We could in principle do that across all guest memory, but it would require
a large bitmap to track.  This patch represents a compromise: we track
ballooned subpages for a single contiguous host page at a time.  This means
that if the guest discards all 4kiB chunks of a host page in succession,
we will discard it.  This is the expected behaviour in the (host page) ==
(guest page) != 4kiB case we want to support.

If the guest scatters 4kiB requests across different host pages, we don't
discard anything, and issue a warning.  Not ideal, but at least we don't
corrupt guest memory as the previous version could.

Warning reporting is kind of a compromise here.  Determining whether we're
in a problematic state at realize() time is tricky, because we'd have to
look at the host pagesizes of all memory backends, but we can't really know
if some of those backends could be for special purpose memory that's not
subject to ballooning.

Reporting only when the guest tries to balloon a partial page also isn't
great because if the guest page size happens to line up it won't indicate
that we're in a non ideal situation.  It could also cause alarming repeated
warnings whenever a migration is attempted.

So, what we do is warn the first time the guest attempts balloon a partial
host page, whether or not it will end up ballooning the rest of the page
immediately afterwards.

[1] Because when the guest attempts to balloon a page, it will submit
    requests for each 4kiB subpage.  Most will be ignored, but the one
    which happens to be host page aligned will discard the whole lot.

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Message-Id: <20190214043916.22128-6-david@gibson.dropbear.id.au>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2019-02-22 10:51:31 -05:00
..
block bdrv_query_image_info Error parameter added 2019-02-11 14:35:43 -06:00
chardev chardev: add a note about frontend sources and context switch 2019-02-13 16:46:39 +01:00
crypto Don't talk about the LGPL if the file is licensed under the GPL 2019-01-30 10:51:20 +01:00
disas target/mips: Add disassembler support for nanoMIPS 2018-10-25 22:13:33 +02:00
exec vhost-net: compile it on all targets that have virtio-net. 2019-02-21 12:28:01 -05:00
fpu include/fpu/softfloat: Fix compilation with Clang on s390x 2019-01-22 20:48:24 +00:00
hw virtio-balloon: Safely handle BALLOON_PAGE_SIZE < host page size 2019-02-22 10:51:31 -05:00
io io: add qio_task_wait_thread to join with a background thread 2019-02-12 17:35:56 +01:00
libdecnumber Clean up ill-advised or unusual header guards 2016-07-12 16:20:46 +02:00
migration vmstate: constify SaveVMHandlers 2019-01-23 15:51:47 +00:00
monitor monitor: Remove "x-oob", offer capability "oob" unconditionally 2018-12-12 10:28:27 +01:00
net slirp: improve send_packet() callback 2019-02-07 15:49:08 +02:00
qapi qapi: remove qmp_unregister_command() 2019-02-18 14:44:05 +01:00
qemu slirp: replace global polling with per-instance & notifier 2019-02-07 15:49:08 +02:00
qom arm: Clarify the logic of set_pc() 2019-02-01 14:55:46 +00:00
scsi avoid TABs in files that only contain a few 2019-01-11 15:46:56 +01:00
standard-headers * cpu-exec fixes (Emilio, Laurent) 2019-02-05 19:39:22 +00:00
sysemu qapi: make query-cpu-definitions depend on specific targets 2019-02-18 14:44:05 +01:00
ui kbd-state: use state tracker for gtk 2019-02-05 10:45:44 +01:00
elf.h pvh: Boot uncompressed kernel using direct boot ABI 2019-02-05 16:50:16 +01:00
glib-compat.h slirp: Move g_spawn_async_with_fds_qemu compatibility to slirp/ 2019-02-07 15:49:08 +02:00
qemu-common.h qemu-common.h: Update copyright string for 2019 2019-02-06 15:45:23 +01:00
qemu-io.h qemu-io: Let command functions return error code 2018-06-11 16:18:45 +02:00
trace-tcg.h trace: get rid of generated-events.h/generated-events.c 2016-10-12 09:54:52 +02:00