mirrors/qemu - qemu - SynapseOS git

Author	SHA1	Message	Date
Michal Privoznik	5d9a9a6170	backends/hostmem: Report error when memory size is unaligned If memory-backend-{file,ram} has a size that's not aligned to underlying page size it is not only wasteful, but also may lead to hard to debug behaviour. For instance, in case memory-backend-file and hugepages, madvise() and mbind() fail. Rightfully so, page is the smallest unit they can work with. And even though an error is reported, the root cause it not very clear: qemu-system-x86_64: Couldn't set property 'dump' on 'memory-backend-file': Invalid argument After this commit: qemu-system-x86_64: backend 'memory-backend-file' memory size must be multiple of 2 MiB Signed-off-by: Michal Privoznik <mprivozn@redhat.com> Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org> Tested-by: Mario Casquero <mcasquer@redhat.com> Message-ID: <b5b9f9c6bba07879fb43f3c6f496c69867ae3716.1717584048.git.mprivozn@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2024-06-08 10:33:38 +02:00
Xiaoyao Li	37662d85b0	HostMem: Add mechanism to opt in kvm guest memfd via MachineState Add a new member "guest_memfd" to memory backends. When it's set to true, it enables RAM_GUEST_MEMFD in ram_flags, thus private kvm guest_memfd will be allocated during RAMBlock allocation. Memory backend's @guest_memfd is wired with @require_guest_memfd field of MachineState. It avoid looking up the machine in phymem.c. MachineState::require_guest_memfd is supposed to be set by any VMs that requires KVM guest memfd as private memory, e.g., TDX VM. Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com> Reviewed-by: David Hildenbrand <david@redhat.com> Message-ID: <20240320083945.991426-8-michael.roth@amd.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2024-04-23 17:35:25 +02:00
Philippe Mathieu-Daudé	fdb63cf3b5	backends: Have HostMemoryBackendClass::alloc() handler return a boolean Following the example documented since commit `e3fe3988d7` ("error: Document Error API usage rules"), have HostMemoryBackendClass::alloc return a boolean indicating whether an error is set or not. Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org> Reviewed-by: Manos Pitsidianakis <manos.pitsidianakis@linaro.org> Reviewed-by: Gavin Shan <gshan@redhat.com> Message-Id: <20231120213301.24349-17-philmd@linaro.org>	2024-01-05 16:20:15 +01:00
Philippe Mathieu-Daudé	2d7a1eb6e6	backends: Use g_autofree in HostMemoryBackendClass::alloc() handlers In preparation of having HostMemoryBackendClass::alloc() handlers return a boolean, have them use g_autofree. Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org> Reviewed-by: Manos Pitsidianakis <manos.pitsidianakis@linaro.org> Reviewed-by: Gavin Shan <gshan@redhat.com> Message-Id: <20231120213301.24349-15-philmd@linaro.org>	2024-01-05 16:20:15 +01:00
David Hildenbrand	e92666b0ba	backends/hostmem-file: Add "rom" property to support VM templating with R/O files For now, "share=off,readonly=on" would always result in us opening the file R/O and mmap'ing the opened file MAP_PRIVATE R/O -- effectively turning it into ROM. Especially for VM templating, "share=off" is a common use case. However, that use case is impossible with files that lack write permissions, because "share=off,readonly=on" will not give us writable RAM. The sole user of ROM via memory-backend-file are R/O NVDIMMs, but as we have users (Kata Containers) that rely on the existing behavior -- malicious VMs should not be able to consume COW memory for R/O NVDIMMs -- we cannot change the semantics of "share=off,readonly=on" So let's add a new "rom" property with on/off/auto values. "auto" is the default and what most people will use: for historical reasons, to not change the old semantics, it defaults to the value of the "readonly" property. For VM templating, one can now use: -object memory-backend-file,share=off,readonly=on,rom=off,... But we'll disallow: -object memory-backend-file,share=on,readonly=on,rom=off,... because we would otherwise get an error when trying to mmap the R/O file shared and writable. An explicit error message is cleaner. We will also disallow for now: -object memory-backend-file,share=off,readonly=off,rom=on,... -object memory-backend-file,share=on,readonly=off,rom=on,... It's not harmful, but also not really required for now. Alternatives that were abandoned: * Make "unarmed=on" for the NVDIMM set the memory region container readonly. We would still see a change of ROM->RAM and possibly run into memslot limits with vhost-user. Further, there might be use cases for "unarmed=on" that should still allow writing to that memory (temporary files, system RAM, ...). * Add a new "readonly=on/off/auto" parameter for NVDIMMs. Similar issues as with "unarmed=on". * Make "readonly" consume "on/off/file" instead of being a 'bool' type. This would slightly changes the behavior of the "readonly" parameter: values like true/false (as accepted by a 'bool'type) would no longer be accepted. Message-ID: <20230906120503.359863-4-david@redhat.com> Acked-by: Markus Armbruster <armbru@redhat.com> Signed-off-by: David Hildenbrand <david@redhat.com>	2023-09-19 10:23:21 +02:00
David Hildenbrand	5c52a219bb	softmmu/physmem: Distinguish between file access mode and mmap protection There is a difference between how we open a file and how we mmap it, and we want to support writable private mappings of readonly files. Let's define RAM_READONLY and RAM_READONLY_FD flags, to replace the single "readonly" parameter for file-related functions. In memory_region_init_ram_from_fd() and memory_region_init_ram_from_file(), initialize mr->readonly based on the new RAM_READONLY flag. While at it, add some RAM_* flags we missed to add to the list of accepted flags in the documentation of some functions. No change in functionality intended. We'll make use of both flags next and start setting them independently for memory-backend-file. Message-ID: <20230906120503.359863-3-david@redhat.com> Acked-by: Peter Xu <peterx@redhat.com> Signed-off-by: David Hildenbrand <david@redhat.com>	2023-09-19 10:23:21 +02:00
Steve Sistare	b0182e537e	exec/memory: Introduce RAM_NAMED_FILE flag migrate_ignore_shared() is an optimization that avoids copying memory that is visible and can be mapped on the target. However, a memory-backend-ram or a memory-backend-memfd block with the RAM_SHARED flag set is not migrated when migrate_ignore_shared() is true. This is wrong, because the block has no named backing store, and its contents will be lost. To fix, ignore shared memory iff it is a named file. Define a new flag RAM_NAMED_FILE to distinguish this case. Signed-off-by: Steve Sistare <steven.sistare@oracle.com> Reviewed-by: Peter Xu <peterx@redhat.com> Message-Id: <1686151116-253260-1-git-send-email-steven.sistare@oracle.com> Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>	2023-06-13 11:28:58 +02:00
Alexander Graf	4b870dc4d0	hostmem-file: add offset option Add an option for hostmem-file to start the memory object at an offset into the target file. This is useful if multiple memory objects reside inside the same target file, such as a device node. In particular, it's useful to map guest memory directly into /dev/mem for experimentation. To make this work consistently, also fix up all places in QEMU that expect fd offsets to be 0. Signed-off-by: Alexander Graf <graf@amazon.com> Message-Id: <20230403221421.60877-1-graf@amazon.com> Acked-by: Markus Armbruster <armbru@redhat.com> Acked-by: Peter Xu <peterx@redhat.com> Reviewed-by: David Hildenbrand <david@redhat.com> Signed-off-by: David Hildenbrand <david@redhat.com>	2023-05-23 16:47:03 +02:00
Peter Maydell	b85ea5fa2f	include: Move qemu_madvise() and related #defines to new qemu/madvise.h The function qemu_madvise() and the QEMU_MADV_* constants associated with it are used in only 10 files. Move them out of osdep.h to a new qemu/madvise.h header that is included where it is needed. Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Message-id: 20220208200856.3558249-2-peter.maydell@linaro.org	2022-02-21 13:30:20 +00:00
David Hildenbrand	9181fb7043	hostmem: Wire up RAM_NORESERVE via "reserve" property Let's provide a way to control the use of RAM_NORESERVE via memory backends using the "reserve" property which defaults to true (old behavior). Only Linux currently supports clearing the flag (and support is checked at runtime, depending on the setting of "/proc/sys/vm/overcommit_memory"). Windows and other POSIX systems will bail out with "reserve=false". The target use case is virtio-mem, which dynamically exposes memory inside a large, sparse memory area to the VM. This essentially allows avoiding to set "/proc/sys/vm/overcommit_memory == 0") when using virtio-mem and also supporting hugetlbfs in the future. As really only Linux implements RAM_NORESERVE right now, let's expose the property only with CONFIG_LINUX. Setting the property to "false" will then only fail in corner cases -- for example on very old kernels or when memory overcommit was completely disabled by the admin. Reviewed-by: Peter Xu <peterx@redhat.com> Reviewed-by: Eduardo Habkost <ehabkost@redhat.com> Reviewed-by: Markus Armbruster <armbru@redhat.com> Acked-by: Eduardo Habkost <ehabkost@redhat.com> for memory backend and machine core Cc: Markus Armbruster <armbru@redhat.com> Cc: Eric Blake <eblake@redhat.com> Cc: Igor Mammedov <imammedo@redhat.com> Signed-off-by: David Hildenbrand <david@redhat.com> Message-Id: <20210510114328.21835-11-david@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2021-06-15 20:27:38 +02:00
Thomas Huth	4c386f8064	Do not include sysemu/sysemu.h if it's not really necessary Stop including sysemu/sysemu.h in files that don't need it. Signed-off-by: Thomas Huth <thuth@redhat.com> Message-Id: <20210416171314.2074665-2-thuth@redhat.com> Signed-off-by: Laurent Vivier <laurent@vivier.eu>	2021-05-02 17:24:50 +02:00
Michal Privoznik	def835f0da	hostmem: Don't report pmem attribute if unsupported When management applications (like Libvirt) want to check whether memory-backend-file.pmem is supported they can list object properties using 'qom-list-properties'. However, 'pmem' is declared always (and thus reported always) and only at runtime QEMU errors out if it was built without libpmem (and thus can not guarantee write persistence). This is suboptimal since we have ability to declare attributes at compile time. Signed-off-by: Michal Privoznik <mprivozn@redhat.com> Tested-by: Daniel Henrique Barboza <danielhb413@gmail.com> Reviewed-by: Daniel Henrique Barboza <danielhb413@gmail.com> Reviewed-by: Igor Mammedov <imammedo@redhat.com> Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1915216 Message-Id: <dfcc5dc7e2efc0283bc38e3036da2c0323621cdb.1611647111.git.mprivozn@redhat.com> Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>	2021-02-18 18:34:47 -05:00
Stefan Hajnoczi	86635aa4e9	hostmem-file: add readonly=on\|off option Let -object memory-backend-file work on read-only files when the readonly=on option is given. This can be used to share the contents of a file between multiple guests while preventing them from consuming Copy-on-Write memory if guests dirty the pages, for example. Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com> Reviewed-by: Liam Merwick <liam.merwick@oracle.com> Acked-by: Michael S. Tsirkin <mst@redhat.com> Message-Id: <20210104171320.575838-3-stefanha@redhat.com> Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>	2021-02-01 17:07:34 -05:00
Stefan Hajnoczi	369d6dc4de	memory: add readonly support to memory_region_init_ram_from_file() There is currently no way to open(O_RDONLY) and mmap(PROT_READ) when creating a memory region from a file. This functionality is needed since the underlying host file may not allow writing. Add a bool readonly argument to memory_region_init_ram_from_file() and the APIs it calls. Extend memory_region_init_ram_from_file() rather than introducing a memory_region_init_rom_from_file() API so that callers can easily make a choice between read/write and read-only at runtime without calling different APIs. No new RAMBlock flag is introduced for read-only because it's unclear whether RAMBlocks need to know that they are read-only. Pass a bool readonly argument instead. Both of these design decisions can be changed in the future. It just seemed like the simplest approach to me. Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com> Reviewed-by: Igor Mammedov <imammedo@redhat.com> Reviewed-by: Liam Merwick <liam.merwick@oracle.com> Acked-by: Michael S. Tsirkin <mst@redhat.com> Message-Id: <20210104171320.575838-2-stefanha@redhat.com> Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>	2021-02-01 17:07:34 -05:00
Eduardo Habkost	8063396bf3	Use OBJECT_DECLARE_SIMPLE_TYPE when possible This converts existing DECLARE_INSTANCE_CHECKER usage to OBJECT_DECLARE_SIMPLE_TYPE when possible. $ ./scripts/codeconverter/converter.py -i \ --pattern=AddObjectDeclareSimpleType $(git grep -l '' -- '*.[ch]') Signed-off-by: Eduardo Habkost <ehabkost@redhat.com> Acked-by: Paul Durrant <paul@xen.org> Message-Id: <20200916182519.415636-6-ehabkost@redhat.com> Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>	2020-09-18 14:12:32 -04:00
Eduardo Habkost	8110fa1d94	Use DECLARE_CHECKER macros Generated using: $ ./scripts/codeconverter/converter.py -i \ --pattern=TypeCheckMacro $(git grep -l '' -- '*.[ch]') Reviewed-by: Daniel P. Berrangé <berrange@redhat.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Message-Id: <20200831210740.126168-12-ehabkost@redhat.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Message-Id: <20200831210740.126168-13-ehabkost@redhat.com> Message-Id: <20200831210740.126168-14-ehabkost@redhat.com> Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>	2020-09-09 09:27:09 -04:00
Eduardo Habkost	db1015e92e	Move QOM typedefs and add missing includes Some typedefs and macros are defined after the type check macros. This makes it difficult to automatically replace their definitions with OBJECT_DECLARE_TYPE. Patch generated using: $ ./scripts/codeconverter/converter.py -i \ --pattern=QOMStructTypedefSplit $(git grep -l '' -- '.[ch]') which will split "typdef struct { ... } TypedefName" declarations. Followed by: $ ./scripts/codeconverter/converter.py -i --pattern=MoveSymbols \ $(git grep -l '' -- '.[ch]') which will: - move the typedefs and #defines above the type check macros - add missing #include "qom/object.h" lines if necessary Reviewed-by: Daniel P. Berrangé <berrange@redhat.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Message-Id: <20200831210740.126168-9-ehabkost@redhat.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Message-Id: <20200831210740.126168-10-ehabkost@redhat.com> Message-Id: <20200831210740.126168-11-ehabkost@redhat.com> Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>	2020-09-09 09:26:43 -04:00
Markus Armbruster	668f62ec62	error: Eliminate error_propagate() with Coccinelle, part 1 When all we do with an Error we receive into a local variable is propagating to somewhere else, we can just as well receive it there right away. Convert if (!foo(..., &err)) { ... error_propagate(errp, err); ... return ... } to if (!foo(..., errp)) { ... ... return ... } where nothing else needs @err. Coccinelle script: @rule1 forall@ identifier fun, err, errp, lbl; expression list args, args2; binary operator op; constant c1, c2; symbol false; @@ if ( ( - fun(args, &err, args2) + fun(args, errp, args2) \| - !fun(args, &err, args2) + !fun(args, errp, args2) \| - fun(args, &err, args2) op c1 + fun(args, errp, args2) op c1 ) ) { ... when != err when != lbl: when strict - error_propagate(errp, err); ... when != err ( return; \| return c2; \| return false; ) } @rule2 forall@ identifier fun, err, errp, lbl; expression list args, args2; expression var; binary operator op; constant c1, c2; symbol false; @@ - var = fun(args, &err, args2); + var = fun(args, errp, args2); ... when != err if ( ( var \| !var \| var op c1 ) ) { ... when != err when != lbl: when strict - error_propagate(errp, err); ... when != err ( return; \| return c2; \| return false; \| return var; ) } @depends on rule1 \|\| rule2@ identifier err; @@ - Error *err = NULL; ... when != err Not exactly elegant, I'm afraid. The "when != lbl:" is necessary to avoid transforming if (fun(args, &err)) { goto out } ... out: error_propagate(errp, err); even though other paths to label out still need the error_propagate(). For an actual example, see sclp_realize(). Without the "when strict", Coccinelle transforms vfio_msix_setup(), incorrectly. I don't know what exactly "when strict" does, only that it helps here. The match of return is narrower than what I want, but I can't figure out how to express "return where the operand doesn't use @err". For an example where it's too narrow, see vfio_intx_enable(). Silently fails to convert hw/arm/armsse.c, because Coccinelle gets confused by ARMSSE being used both as typedef and function-like macro there. Converted manually. Line breaks tidied up manually. One nested declaration of @local_err deleted manually. Preexisting unwanted blank line dropped in hw/riscv/sifive_e.c. Signed-off-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Message-Id: <20200707160613.848843-35-armbru@redhat.com>	2020-07-10 15:18:08 +02:00
Markus Armbruster	dcfe480544	error: Avoid unnecessary error_propagate() after error_setg() Replace error_setg(&err, ...); error_propagate(errp, err); by error_setg(errp, ...); Related pattern: if (...) { error_setg(&err, ...); goto out; } ... out: error_propagate(errp, err); return; When all paths to label out are that way, replace by if (...) { error_setg(errp, ...); return; } and delete the label along with the error_propagate(). When we have at most one other path that actually needs to propagate, and maybe one at the end that where propagation is unnecessary, e.g. foo(..., &err); if (err) { goto out; } ... bar(..., &err); out: error_propagate(errp, err); return; move the error_propagate() to where it's needed, like if (...) { foo(..., &err); error_propagate(errp, err); return; } ... bar(..., errp); return; and transform the error_setg() as above. In some places, the transformation results in obviously unnecessary error_propagate(). The next few commits will eliminate them. Bonus: the elimination of gotos will make later patches in this series easier to review. Candidates for conversion tracked down with this Coccinelle script: @@ identifier err, errp; expression list args; @@ - error_setg(&err, args); + error_setg(errp, args); ... when != err error_propagate(errp, err); Signed-off-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Message-Id: <20200707160613.848843-34-armbru@redhat.com>	2020-07-10 15:18:08 +02:00
Markus Armbruster	62a35aaa31	qapi: Use returned bool to check for failure, Coccinelle part The previous commit enables conversion of visit_foo(..., &err); if (err) { ... } to if (!visit_foo(..., errp)) { ... } for visitor functions that now return true / false on success / error. Coccinelle script: @@ identifier fun =~ "check_list\|input_type_enum\|lv_start_struct\|lv_type_bool\|lv_type_int64\|lv_type_str\|lv_type_uint64\|output_type_enum\|parse_type_bool\|parse_type_int64\|parse_type_null\|parse_type_number\|parse_type_size\|parse_type_str\|parse_type_uint64\|print_type_bool\|print_type_int64\|print_type_null\|print_type_number\|print_type_size\|print_type_str\|print_type_uint64\|qapi_clone_start_alternate\|qapi_clone_start_list\|qapi_clone_start_struct\|qapi_clone_type_bool\|qapi_clone_type_int64\|qapi_clone_type_null\|qapi_clone_type_number\|qapi_clone_type_str\|qapi_clone_type_uint64\|qapi_dealloc_start_list\|qapi_dealloc_start_struct\|qapi_dealloc_type_anything\|qapi_dealloc_type_bool\|qapi_dealloc_type_int64\|qapi_dealloc_type_null\|qapi_dealloc_type_number\|qapi_dealloc_type_str\|qapi_dealloc_type_uint64\|qobject_input_check_list\|qobject_input_check_struct\|qobject_input_start_alternate\|qobject_input_start_list\|qobject_input_start_struct\|qobject_input_type_any\|qobject_input_type_bool\|qobject_input_type_bool_keyval\|qobject_input_type_int64\|qobject_input_type_int64_keyval\|qobject_input_type_null\|qobject_input_type_number\|qobject_input_type_number_keyval\|qobject_input_type_size_keyval\|qobject_input_type_str\|qobject_input_type_str_keyval\|qobject_input_type_uint64\|qobject_input_type_uint64_keyval\|qobject_output_start_list\|qobject_output_start_struct\|qobject_output_type_any\|qobject_output_type_bool\|qobject_output_type_int64\|qobject_output_type_null\|qobject_output_type_number\|qobject_output_type_str\|qobject_output_type_uint64\|start_list\|visit_check_list\|visit_check_struct\|visit_start_alternate\|visit_start_list\|visit_start_struct\|visit_type_."; expression list args; typedef Error; Error err; @@ - fun(args, &err); - if (err) + if (!fun(args, &err)) { ... } A few line breaks tidied up manually. Signed-off-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Message-Id: <20200707160613.848843-19-armbru@redhat.com>	2020-07-10 15:18:08 +02:00
Markus Armbruster	d2623129a7	qom: Drop parameter @errp of object_property_add() & friends The only way object_property_add() can fail is when a property with the same name already exists. Since our property names are all hardcoded, failure is a programming error, and the appropriate way to handle it is passing &error_abort. Same for its variants, except for object_property_add_child(), which additionally fails when the child already has a parent. Parentage is also under program control, so this is a programming error, too. We have a bit over 500 callers. Almost half of them pass &error_abort, slightly fewer ignore errors, one test case handles errors, and the remaining few callers pass them to their own callers. The previous few commits demonstrated once again that ignoring programming errors is a bad idea. Of the few ones that pass on errors, several violate the Error API. The Error ** argument must be NULL, &error_abort, &error_fatal, or a pointer to a variable containing NULL. Passing an argument of the latter kind twice without clearing it in between is wrong: if the first call sets an error, it no longer points to NULL for the second call. ich9_pm_add_properties(), sparc32_ledma_realize(), sparc32_dma_realize(), xilinx_axidma_realize(), xilinx_enet_realize() are wrong that way. When the one appropriate choice of argument is &error_abort, letting users pick the argument is a bad idea. Drop parameter @errp and assert the preconditions instead. There's one exception to "duplicate property name is a programming error": the way object_property_add() implements the magic (and undocumented) "automatic arrayification". Don't drop @errp there. Instead, rename object_property_add() to object_property_try_add(), and add the obvious wrapper object_property_add(). Signed-off-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Paolo Bonzini <pbonzini@redhat.com> Message-Id: <20200505152926.18877-15-armbru@redhat.com> [Two semantic rebase conflicts resolved]	2020-05-15 07:07:58 +02:00
Igor Mammedov	4ebc74dbbf	hostmem: fix strict bind policy When option -mem-prealloc is used with one or more memory-backend objects, created backends may not obey configured bind policy or creation may fail after kernel attempts to move pages according to bind policy. Reason is in file_ram_alloc(), which will pre-allocate any descriptor based RAM if global mem_prealloc != 0 and that happens way before bind policy is applied to memory range. One way to fix it would be to extend memory_region_foo() API and add more invariants that could broken later due implicit dependencies that's hard to track. Another approach is to drop adhoc main RAM allocation and consolidate it around memory-backend. That allows to have single place that allocates guest RAM (main and memdev) in the same way and then global mem_prealloc could be replaced by backend's property[s] that will affect created memory-backend objects but only in correct order this time. With main RAM now converted to hostmem backends, there is no point in keeping global mem_prealloc around, so alias -mem-prealloc to "memory-backend.prealloc=on" machine compat[] property and make mem_prealloc a local variable to only stir registration of compat property. ) currently user accessible -global works only with DEVICE based objects and extra work is needed to make it work with hostmem backends. But that is convenience option and out of scope of this already huge refactoring. Hence machine compat properties were used. Signed-off-by: Igor Mammedov <imammedo@redhat.com> Message-Id: <20200219160953.13771-78-imammedo@redhat.com>	2020-02-19 16:50:02 +00:00
Igor Mammedov	900c0ba373	machine: alias -mem-path and -mem-prealloc into memory-foo backend Allow machine to opt in for hostmem backend based initial RAM even if user uses old -mem-path/prealloc options by providing MachineClass::default_ram_id Follow up patches will incrementally convert machines to new API, by dropping memory_region_allocate_system_memory() and setting default_ram_id that board used to use before conversion to keep migration stream the same. Signed-off-by: Igor Mammedov <imammedo@redhat.com> Message-Id: <20200219160953.13771-4-imammedo@redhat.com>	2020-02-19 16:49:53 +00:00
Stefan Hajnoczi	72d41eb4b8	memory: fetch pmem size in get_file_size() Neither stat(2) nor lseek(2) report the size of Linux devdax pmem character device nodes. Commit `314aec4a6e` ("hostmem-file: reject invalid pmem file sizes") added code to hostmem-file.c to fetch the size from sysfs and compare against the user-provided size=NUM parameter: if (backend->size > size) { error_setg(errp, "size property %" PRIu64 " is larger than " "pmem file \"%s\" size %" PRIu64, backend->size, fb->mem_path, size); return; } It turns out that exec.c:qemu_ram_alloc_from_fd() already has an equivalent size check but it skips devdax pmem character devices because lseek(2) returns 0: if (file_size > 0 && file_size < size) { error_setg(errp, "backing store %s size 0x%" PRIx64 " does not match 'size' option 0x" RAM_ADDR_FMT, mem_path, file_size, size); return NULL; } This patch moves the devdax pmem file size code into get_file_size() so that we check the memory size in a single place: qemu_ram_alloc_from_fd(). This simplifies the code and makes it more general. This also fixes the problem that hostmem-file only checks the devdax pmem file size when the pmem=on parameter is given. An unchecked size=NUM parameter can lead to SIGBUS in QEMU so we must always fetch the file size for Linux devdax pmem character device nodes. Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Message-Id: <20190830093056.12572-1-stefanha@redhat.com> Reviewed-by: Eduardo Habkost <ehabkost@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2019-09-16 12:32:21 +02:00
Stefan Hajnoczi	7faae95ebc	hostmem-file: fix pmem file size check Commit `314aec4a6e` ("hostmem-file: reject invalid pmem file sizes") added a file size check that verifies the hostmem object's size parameter against the actual devdax pmem file. This is useful because getting the size wrong results in confusing errors inside the guest. However, the code doesn't work properly for files where struct stat::st_size is zero. Hostmem-file's ->alloc() function returns early without setting an Error, causing the following assertion failure: qemu/memory.c:2215: memory_region_get_ram_ptr: Assertion `mr->ram_block' failed. This patch handles the case where qemu_get_pmem_size() returns 0 but there is no error. Fixes: `314aec4a6e` Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Message-Id: <20190823135632.25010-1-stefanha@redhat.com> Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>	2019-09-03 14:39:46 -03:00
Markus Armbruster	0b8fa32f55	Include qemu/module.h where needed, drop it from qemu-common.h Signed-off-by: Markus Armbruster <armbru@redhat.com> Message-Id: <20190523143508.25387-4-armbru@redhat.com> [Rebased with conflicts resolved automatically, except for hw/usb/dev-hub.c hw/misc/exynos4210_rng.c hw/misc/bcm2835_rng.c hw/misc/aspeed_scu.c hw/display/virtio-vga.c hw/arm/stm32f205_soc.c; ui/cocoa.m fixed up]	2019-06-12 13:18:33 +02:00
Stefan Hajnoczi	314aec4a6e	hostmem-file: reject invalid pmem file sizes Guests started with NVDIMMs larger than the underlying host file produce confusing errors inside the guest. This happens because the guest accesses pages beyond the end of the file. Check the pmem file size on startup and print a clear error message if the size is invalid. Fixes: https://bugzilla.redhat.com/show_bug.cgi?id=1669053 Cc: Wei Yang <richardw.yang@linux.intel.com> Cc: Zhang Yi <yi.z.zhang@linux.intel.com> Cc: Eduardo Habkost <ehabkost@redhat.com> Cc: Igor Mammedov <imammedo@redhat.com> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Message-Id: <20190214031004.32522-3-stefanha@redhat.com> Reviewed-by: Wei Yang <richardw.yang@linux.intel.com> Reviewed-by: Igor Mammedov <imammedo@redhat.com> Reviewed-by: Pankaj Gupta <pagupta@redhat.com> Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>	2019-03-11 10:44:19 -03:00
Igor Mammedov	5c7ba877ef	hostmem-file: simplify ifdef-s in file_backend_memory_alloc() cleanup file_backend_memory_alloc() by using one CONFIG_POSIX ifdef instead of several ones within the function to make it simpler to follow. Signed-off-by: Igor Mammedov <imammedo@redhat.com> Suggested-by: Wei Yang <richardw.yang@linux.intel.com> Reviewed-by: Wei Yang <richardw.yang@linux.intel.com> Message-Id: <20190213123858.24620-1-imammedo@redhat.com> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Message-Id: <20190214031004.32522-2-stefanha@redhat.com> [lv: s/hostmem/hostmem-file/] Signed-off-by: Laurent Vivier <laurent@vivier.eu>	2019-03-06 11:21:27 +01:00
Zhang Yi	21d1683690	hostmem: add more information in error messages When there are multiple memory backends in use, including the object type and property name in the error message can help users to locate the error. Signed-off-by: Haozhong Zhang <haozhong.zhang@intel.com> Signed-off-by: Zhang Yi <yi.z.zhang@linux.intel.com> Message-Id: <97d9193875747d8378c05b9e3b3cb39c1b7d2b4e.1546399191.git.yi.z.zhang@linux.intel.com> Reviewed-by: Eduardo Habkost <ehabkost@redhat.com> [ehabkost: reword commit message] Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>	2019-01-28 15:52:05 -02:00
Marc-André Lureau	fa0cb34d22	hostmem: use object id for memory region name with >= 4.0 hostmem-file and hostmem-memfd use the whole object path for the memory region name, and hostname-ram uses only the path component (the object id, or canonical path basename): qemu -m 1024 -object memory-backend-file,id=mem,size=1G,mem-path=/tmp/foo -numa node,memdev=mem -monitor stdio (qemu) info ramblock Block Name PSize Offset Used Total /objects/mem 4 KiB 0x0000000000000000 0x0000000040000000 0x0000000040000000 qemu -m 1024 -object memory-backend-memfd,id=mem,size=1G -numa node,memdev=mem -monitor stdio (qemu) info ramblock Block Name PSize Offset Used Total /objects/mem 4 KiB 0x0000000000000000 0x0000000040000000 0x0000000040000000 qemu -m 1024 -object memory-backend-ram,id=mem,size=1G -numa node,memdev=mem -monitor stdio (qemu) info ramblock Block Name PSize Offset Used Total mem 4 KiB 0x0000000000000000 0x0000000040000000 0x0000000040000000 For consistency, change to use object id for -file and -memfd as well with >= 4.0. Having a consistent naming allows to migrate to different hostmem backends. Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com> Reviewed-by: Igor Mammedov <imammedo@redhat.com> Acked-by: Eduardo Habkost <ehabkost@redhat.com>	2019-01-07 16:18:42 +04:00
Zhang Yi	87dc3ce60a	hostmem-file: remove object id from pmem error message We will never get the canonical path from the object before object_property_add_child. Signed-off-by: Zhang Yi <yi.z.zhang@linux.intel.com> Message-Id: <a6491f996827f4039c1a52198ed5dcc7727cb0f9.1540389255.git.yi.z.zhang@linux.intel.com> [ehabkost: reword commit message] Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>	2018-12-11 15:45:22 -02:00
Marc-André Lureau	86100290cb	hostmem: no need to check for host_memory_backend_mr_inited() in alloc() memfd_backend_memory_alloc/file_backend_memory_alloc both needlessly are are calling host_memory_backend_mr_inited() which creates an illusion that alloc could be called multiple times but it isn't, it's called once from UserCreatable complete(). Suggested-by: Igor Mammedov <imammedo@redhat.com> Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com> Reviewed-by: Igor Mammedov <imammedo@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2018-11-28 11:51:24 +01:00
Zhang Yi	a1f3bb1845	hostmem-file: fixed the memory leak while get pmem path. object_get_canonical_path_component() returns a string which must be freed using g_free(). Reported-by: Peter Maydell <peter.maydell@linaro.org> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Zhang Yi <yi.z.zhang@linux.intel.com> Message-Id: <7328fb16c394eaf5d65437d11c2a9343647b6d3d.1535471899.git.yi.z.zhang@linux.intel.com> Reviewed-by: Eduardo Habkost <ehabkost@redhat.com> Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>	2018-10-24 06:44:59 -03:00
Hikaru Nishida	d5dbde4645	hostmem-file: make available memory-backend-file on POSIX-based hosts Before this change, memory-backend-file object is valid for Linux hosts only because hostmem-file.c is compiled only on Linux hosts. However, other POSIX-based hosts (such as macOS) can support memory-backend-file object in the same way as on Linux hosts. This patch makes hostmem-file.c and related functions to be compiled on all POSIX-based hosts to make available memory-backend-file on them. Signed-off-by: Hikaru Nishida <hikarupsp@gmail.com> Message-Id: <20180924123205.29651-1-hikarupsp@gmail.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2018-10-02 19:09:13 +02:00
Junyan He	a4de8552b2	hostmem-file: add the 'pmem' option When QEMU emulates vNVDIMM labels and migrates vNVDIMM devices, it needs to know whether the backend storage is a real persistent memory, in order to decide whether special operations should be performed to ensure the data persistence. This boolean option 'pmem' allows users to specify whether the backend storage of memory-backend-file is a real persistent memory. If 'pmem=on', QEMU will set the flag RAM_PMEM in the RAM block of the corresponding memory region. If 'pmem' is set while lack of libpmem support, a error is generated. Signed-off-by: Junyan He <junyan.he@intel.com> Signed-off-by: Haozhong Zhang <haozhong.zhang@intel.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Reviewed-by: Igor Mammedov <imammedo@redhat.com> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>	2018-08-10 13:29:39 +03:00
Junyan He	cbfc017103	memory, exec: switch file ram allocation functions to 'flags' parameters As more flag parameters besides the existing 'share' are going to be added to following functions memory_region_init_ram_from_file qemu_ram_alloc_from_fd qemu_ram_alloc_from_file let's switch them to use the 'flags' parameters so as to ease future flag additions. The existing 'share' flag is converted to the RAM_SHARED bit in ram_flags, and other flag bits are ignored by above functions right now. Signed-off-by: Junyan He <junyan.he@intel.com> Signed-off-by: Haozhong Zhang <haozhong.zhang@intel.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Reviewed-by: Igor Mammedov <imammedo@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Reviewed-by: Richard Henderson <richard.henderson@linaro.org>	2018-08-10 13:29:39 +03:00
Marcel Apfelbaum	06329ccecf	mem: add share parameter to memory-backend-ram Currently only file backed memory backend can be created with a "share" flag in order to allow sharing guest RAM with other processes in the host. Add the "share" flag also to RAM Memory Backend in order to allow remapping parts of the guest RAM to different host virtual addresses. This is needed by the RDMA devices in order to remap non-contiguous QEMU virtual addresses to a contiguous virtual address range. Moved the "share" flag to the Host Memory base class, modified phys_mem_alloc to include the new parameter and a new interface memory_region_init_ram_shared_nomigrate. There are no functional changes if the new flag is not used. Reviewed-by: Eduardo Habkost <ehabkost@redhat.com> Signed-off-by: Marcel Apfelbaum <marcel@redhat.com>	2018-02-19 13:03:24 +02:00
Haozhong Zhang	9837684316	hostmem-file: add "align" option When mmap(2) the backend files, QEMU uses the host page size (getpagesize(2)) by default as the alignment of mapping address. However, some backends may require alignments different than the page size. For example, mmap a device DAX (e.g., /dev/dax0.0) on Linux kernel 4.13 to an address, which is 4K-aligned but not 2M-aligned, fails with a kernel message like [617494.969768] dax dax0.0: qemu-system-x86: dax_mmap: fail, unaligned vma (0x7fa37c579000 - 0x7fa43c579000, 0x1fffff) Because there is no common approach to get such alignment requirement, we add the 'align' option to 'memory-backend-file', so that users or management utils, which have enough knowledge about the backend, can specify a proper alignment via this option. Signed-off-by: Haozhong Zhang <haozhong.zhang@intel.com> Message-Id: <20171211072806.2812-2-haozhong.zhang@intel.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> [ehabkost: fixed typo, fixed error_setg() format string] Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>	2018-01-19 11:18:51 -02:00
Eduardo Habkost	11ae6ed8af	hostmem-file: Add "discard-data" option The new option can be used to indicate that the file contents can be destroyed and don't need to be flushed to disk when QEMU exits or when the memory backend object is removed. Internally, it will trigger a madvise(MADV_REMOVE) call when the memory backend is removed. Signed-off-by: Eduardo Habkost <ehabkost@redhat.com> Message-Id: <20170824192315.5897-4-ehabkost@redhat.com> [ehabkost: fixup: improved documentation] Reviewed-by: Daniel P. Berrange <berrange@redhat.com> Tested-by: Zack Cornelius <zack.cornelius@kove.net> Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>	2017-09-19 09:09:23 -03:00
Peter Xu	6f4c60e47f	hostmem: use host_memory_backend_mr_inited() where proper Use the new interface to boost readability. Signed-off-by: Peter Xu <peterx@redhat.com> Message-Id: <1489151370-15453-3-git-send-email-peterx@redhat.com> Reviewed-by: Eduardo Habkost <ehabkost@redhat.com> Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>	2017-04-20 15:22:41 -03:00
Eduardo Habkost	026ac483c7	hostmem-file: Register TYPE_MEMORY_BACKEND_FILE properties as class properties To do the conversion, the file_backend_class_init() was moved after the getter/setter functions. The old file_backend_instance_init() function was removed because it is not needed anymore. The NULL errp arguments on the property registration calls were changed to &error_abort. Reviewed-by: Igor Mammedov <imammedo@redhat.com> Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>	2016-10-17 15:48:40 -02:00
Marc-André Lureau	bc78a01319	hostmem-file: plug a small leak Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com> Message-Id: <1460566660-19241-1-git-send-email-marcandre.lureau@redhat.com> Reviewed-by: Igor Mammedov <imammedo@redhat.com> Signed-off-by: Markus Armbruster <armbru@redhat.com>	2016-04-15 17:56:06 +02:00
Gonglei	696b55017d	hostmem-file: fix memory leak Signed-off-by: Gonglei <arei.gonglei@huawei.com> Message-Id: <1456998223-12356-5-git-send-email-arei.gonglei@huawei.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2016-04-08 00:07:56 +02:00
Markus Armbruster	da34e65cb4	include/qemu/osdep.h: Don't include qapi/error.h Commit `57cb38b` included qapi/error.h into qemu/osdep.h to get the Error typedef. Since then, we've moved to include qemu/osdep.h everywhere. Its file comment explains: "To avoid getting into possible circular include dependencies, this file should not include any other QEMU headers, with the exceptions of config-host.h, compiler.h, os-posix.h and os-win32.h, all of which are doing a similar job to this file and are under similar constraints." qapi/error.h doesn't do a similar job, and it doesn't adhere to similar constraints: it includes qapi-types.h. That's in excess of 100KiB of crap most .c files don't actually need. Add the typedef to qemu/typedefs.h, and include that instead of qapi/error.h. Include qapi/error.h in .c files that need it and don't get it now. Include qapi-types.h in qom/object.h for uint16List. Update scripts/clean-includes accordingly. Update it further to match reality: replace config.h by config-target.h, add sysemu/os-posix.h, sysemu/os-win32.h. Update the list of includes in the qemu/osdep.h comment quoted above similarly. This reduces the number of objects depending on qapi/error.h from "all of them" to less than a third. Unfortunately, the number depending on qapi-types.h shrinks only a little. More work is needed for that one. Signed-off-by: Markus Armbruster <armbru@redhat.com> [Fix compilation without the spice devel packages. - Paolo] Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2016-03-22 22:20:15 +01:00
Peter Maydell	9c058332f3	backends: Clean up includes Clean up includes so that osdep.h is included first and headers which it implies are not included manually. This commit was created with scripts/clean-includes. Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Message-id: 1454089805-5470-5-git-send-email-peter.maydell@linaro.org	2016-02-04 17:01:04 +00:00
Daniel P. Berrange	ef1e1e0782	maint: avoid useless "if (foo) free(foo)" pattern The free() and g_free() functions both happily accept NULL on any platform QEMU builds on. As such putting a conditional 'if (foo)' check before calls to 'free(foo)' merely serves to bloat the lines of code. Signed-off-by: Daniel P. Berrange <berrange@redhat.com> Reviewed-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>	2015-09-11 10:21:38 +03:00
Jan Kiszka	c2cb2b041b	hostmem: Fix mem-path property name in error report The subtle difference between "property not found" and "property not set" is already confusing enough. Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com> Reviewed-by: Peter Crosthwaite <peter.crosthwaite@xilinx.com> Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>	2015-04-30 16:05:48 +03:00
Paolo Bonzini	dbcb898118	hostmem: add property to map memory with MAP_SHARED A new "share" property can be used with the "memory-file" backend to map memory with MAP_SHARED instead of MAP_PRIVATE. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Hu Tao <hutao@cn.fujitsu.com> Acked-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>	2014-06-19 18:44:20 +03:00
Paolo Bonzini	a35ba7be4b	hostmem: allow preallocation of any memory region And allow preallocation of file-based memory even without -mem-prealloc. Some care is necessary because -mem-prealloc does not allow disabling preallocation for hostmem-file. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Hu Tao <hutao@cn.fujitsu.com> Acked-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>	2014-06-19 18:44:20 +03:00
Paolo Bonzini	52330e1a96	hostmem: add file-based HostMemoryBackend Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Hu Tao <hutao@cn.fujitsu.com> Acked-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> MST: comment tweak	2014-06-19 18:44:20 +03:00

50 Commits