mirrors/qemu - qemu - SynapseOS git

Author	SHA1	Message	Date
Stefan Hajnoczi	ebdf417220	Merge tag 'pull-request-2023-10-20' of https://gitlab.com/thuth/qemu into staging * s390x CPU topology support * Simplify the KVM register synchronization code * Disable the analyze-migration.py test on s390x # -----BEGIN PGP SIGNATURE----- # # iQJFBAABCAAvFiEEJ7iIR+7gJQEY8+q5LtnXdP5wLbUFAmUyDYMRHHRodXRoQHJl # ZGhhdC5jb20ACgkQLtnXdP5wLbUlgBAAkF3dvW0vMcb653sCI5vt2GHIvQQtc2Rw # ghRRcTBZ7wyVxKHtqohCh7/byzDW5YEuCWUyLsc2oIz/84pc00VR/5Ng1EAxLAfe # mvzzjr4jX96SmoO0DbJpqJQXaUPNYdmoshbRL0I3wkIfGtkvGRM8zHZuYINOg0hw # bH6gWZ2QL/NFjXh0uAOaJB1+hRtPWvHD2rnVt0g9U9W5QhRxGJqti5YEaLBH7hh5 # RydsquRZ/E6uFw4pMjjvCxDaswPwejddrP2YeR5Fd5Zo+Kzp53r9Hf/eJwlZ8yFL # 5f1dRb19NZYpW1hZuJVOP8tkPydYxAM85vkUunI7Qg4gez5KI0Nz6hQozw6ufMlQ # r8L17fwQMsCrwcRypImYNXyyrtHlNH5Y8FjqTct8aK64Bw3e7Qqi7d3ybFAuYZ+D # k2EJ8Rlwhbg69h+Q+ucHx4NkYu9+2MFS6G7w5EcM6xl3WHSwUxh9orlEMsIkyHS3 # OMFMTr1jjfFdEN6EafhPwFE/xKglFF2Fe3u6NoR+5pkv3UA5Z87giitxoekYecpH # J96P3anORpWW75qvOF+nccqrd7OrUL1/yYdOyJh5Tkm0oCIeQ9E5extVf3Gne3E/ #
yWzr00GJRiHFO2qbGStgKHTQLItgQpccwNpSzEdgHCqwLbXl6e3Hoq42VIFOlbN/ # ZtgpyUkuYyQ= # =xDb+ # -----END PGP SIGNATURE----- # gpg: Signature made Thu 19 Oct 2023 22:17:55 PDT # gpg: using RSA key 27B88847EEE0250118F3EAB92ED9D774FE702DB5 # gpg: issuer "thuth@redhat.com" # gpg: Good signature from "Thomas Huth <th.huth@gmx.de>" [full] # gpg: aka "Thomas Huth <thuth@redhat.com>" [full] # gpg: aka "Thomas Huth <huth@tuxfamily.org>" [full] # gpg: aka "Thomas Huth <th.huth@posteo.de>" [unknown] # Primary key fingerprint: 27B8 8847 EEE0 2501 18F3 EAB9 2ED9 D774 FE70 2DB5 * tag 'pull-request-2023-10-20' of https://gitlab.com/thuth/qemu: (24 commits) tests/qtest/migration-test: Disable the analyze-migration.py test on s390x target/s390x/kvm: Simplify the GPRs, ACRs, CRs and prefix synchronization code target/s390x/kvm: Turn KVM_CAP_SYNC_REGS into a hard requirement tests/avocado: s390x cpu topology bad move tests/avocado: s390x cpu topology dedicated errors tests/avocado: s390x cpu topology test socket full tests/avocado: s390x cpu topology test dedicated CPU tests/avocado: s390x cpu topology entitlement tests tests/avocado: s390x cpu topology polarization tests/avocado: s390x cpu topology core docs/s390x/cpu topology: document s390x cpu topology qapi/s390x/cpu topology: add query-s390x-cpu-polarization command qapi/s390x/cpu topology: CPU_POLARIZATION_CHANGE QAPI event machine: adding s390 topology to info hotpluggable-cpus machine: adding s390 topology to query-cpu-fast qapi/s390x/cpu topology: set-cpu-topology qmp command target/s390x/cpu topology: activate CPU topology s390x/cpu topology: interception of PTF instruction s390x/cpu topology: resetting the Topology-Change-Report s390x/sclp: reporting the maximum nested topology entries ... Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>	2023-10-20 06:46:41 -07:00
Pierre Morel	0d177cdd2b	docs/s390x/cpu topology: document s390x cpu topology Add some basic examples for the definition of cpu topology in s390x. Signed-off-by: Pierre Morel <pmorel@linux.ibm.com> Co-developed-by: Nina Schoetterl-Glausch <nsg@linux.ibm.com> Reviewed-by: Thomas Huth <thuth@redhat.com> Signed-off-by: Nina Schoetterl-Glausch <nsg@linux.ibm.com> Message-ID: <20231016183925.2384704-15-nsg@linux.ibm.com> Signed-off-by: Thomas Huth <thuth@redhat.com>	2023-10-20 07:16:53 +02:00
Pierre Morel	154893a784	qapi/s390x/cpu topology: add query-s390x-cpu-polarization command The query-s390x-cpu-polarization qmp command returns the current CPU polarization of the machine. Signed-off-by: Pierre Morel <pmorel@linux.ibm.com> Reviewed-by: Thomas Huth <thuth@redhat.com> Reviewed-by: Nina Schoetterl-Glausch <nsg@linux.ibm.com> Co-developed-by: Nina Schoetterl-Glausch <nsg@linux.ibm.com> Acked-by: Markus Armbruster <armbru@redhat.com> Signed-off-by: Nina Schoetterl-Glausch <nsg@linux.ibm.com> Message-ID: <20231016183925.2384704-14-nsg@linux.ibm.com> Signed-off-by: Thomas Huth <thuth@redhat.com>	2023-10-20 07:16:53 +02:00
Pierre Morel	1cfe52b782	qapi/s390x/cpu topology: CPU_POLARIZATION_CHANGE QAPI event When the guest asks to change the polarization, this change is forwarded to the upper layer using QAPI. The upper layer is expected to make the appropriate decisions concerning CPU provisioning. Signed-off-by: Pierre Morel <pmorel@linux.ibm.com> Reviewed-by: Thomas Huth <thuth@redhat.com> Reviewed-by: Nina Schoetterl-Glausch <nsg@linux.ibm.com> Co-developed-by: Nina Schoetterl-Glausch <nsg@linux.ibm.com> Acked-by: Markus Armbruster <armbru@redhat.com> Signed-off-by: Nina Schoetterl-Glausch <nsg@linux.ibm.com> Message-ID: <20231016183925.2384704-13-nsg@linux.ibm.com> Signed-off-by: Thomas Huth <thuth@redhat.com>	2023-10-20 07:16:53 +02:00
Pierre Morel	ad2d1afc1d	machine: adding s390 topology to query-cpu-fast S390x provides two more topology attributes, entitlement and dedication. Let's add these CPU attributes to the QAPI command query-cpu-fast. Signed-off-by: Pierre Morel <pmorel@linux.ibm.com> Reviewed-by: Nina Schoetterl-Glausch <nsg@linux.ibm.com> Co-developed-by: Nina Schoetterl-Glausch <nsg@linux.ibm.com> Reviewed-by: Thomas Huth <thuth@redhat.com> Acked-by: Markus Armbruster <armbru@redhat.com> Signed-off-by: Nina Schoetterl-Glausch <nsg@linux.ibm.com> Message-ID: <20231016183925.2384704-11-nsg@linux.ibm.com> Signed-off-by: Thomas Huth <thuth@redhat.com>	2023-10-20 07:16:53 +02:00
Pierre Morel	a457c2ab5a	qapi/s390x/cpu topology: set-cpu-topology qmp command CPU attributes are modified through a monitor command. It allows moving a core inside the topology tree to optimize cache usage in case the host's hypervisor previously moved the CPU. The same command also allows modifying the CPU attribute modifiers, such as the polarization entitlement and the dedicated attribute, to notify the guest if the host admin modified the scheduling or dedication of a vCPU. With this knowledge the guest can optimize its usage of the vCPUs. The command is marked with the feature 'unstable' for the moment. Signed-off-by: Pierre Morel <pmorel@linux.ibm.com> Reviewed-by: Nina Schoetterl-Glausch <nsg@linux.ibm.com> Co-developed-by: Nina Schoetterl-Glausch <nsg@linux.ibm.com> Reviewed-by: Thomas Huth <thuth@redhat.com> Signed-off-by: Nina Schoetterl-Glausch <nsg@linux.ibm.com> Acked-by: Markus Armbruster <armbru@redhat.com> Message-ID: <20231016183925.2384704-10-nsg@linux.ibm.com> Signed-off-by: Thomas Huth <thuth@redhat.com>	2023-10-20 07:16:53 +02:00
Pierre Morel	f4f54b582f	target/s390x/cpu topology: handle STSI(15) and build the SYSIB On interception of STSI(15.1.x) the System Information Block (SYSIB) is built from the list of pre-ordered topology entries. Signed-off-by: Pierre Morel <pmorel@linux.ibm.com> Reviewed-by: Nina Schoetterl-Glausch <nsg@linux.ibm.com> Co-developed-by: Nina Schoetterl-Glausch <nsg@linux.ibm.com> Reviewed-by: Thomas Huth <thuth@redhat.com> Acked-by: Markus Armbruster <armbru@redhat.com> Signed-off-by: Nina Schoetterl-Glausch <nsg@linux.ibm.com> Message-ID: <20231016183925.2384704-5-nsg@linux.ibm.com> Signed-off-by: Thomas Huth <thuth@redhat.com>	2023-10-20 07:16:53 +02:00
Pierre Morel	5de1aff255	CPU topology: extend with s390 specifics S390 adds two new SMP levels, drawers and books to the CPU topology. S390 CPUs have specific topology features like dedication and entitlement. These indicate to the guest information on host vCPU scheduling and help the guest make better scheduling decisions. Add the new levels to the relevant QAPI structs. Add all the supported topology levels, dedication and entitlement as properties to S390 CPUs. Create machine-common.json so we can later include it in machine-target.json also. Signed-off-by: Pierre Morel <pmorel@linux.ibm.com> Reviewed-by: Nina Schoetterl-Glausch <nsg@linux.ibm.com> Co-developed-by: Nina Schoetterl-Glausch <nsg@linux.ibm.com> Reviewed-by: Thomas Huth <thuth@redhat.com> Signed-off-by: Nina Schoetterl-Glausch <nsg@linux.ibm.com> Message-ID: <20231016183925.2384704-3-nsg@linux.ibm.com> Signed-off-by: Thomas Huth <thuth@redhat.com>	2023-10-20 07:16:53 +02:00
Nina Schoetterl-Glausch	3da4aef81c	qapi: machine.json: change docs regarding CPU topology Clarify roles of different architectures. Also change things a bit in anticipation of additional members being added. Suggested-by: Markus Armbruster <armbru@redhat.com> Signed-off-by: Nina Schoetterl-Glausch <nsg@linux.ibm.com> Message-ID: <20231016183925.2384704-2-nsg@linux.ibm.com> Acked-by: Markus Armbruster <armbru@redhat.com> [thuth: Updated some comments according to suggestions from Markus] Signed-off-by: Thomas Huth <thuth@redhat.com>	2023-10-20 07:16:53 +02:00
Markus Armbruster	0a59c02b0c	qapi: Belatedly update CompatPolicy documentation for unstable Commit `57df0dff1a` (qapi: Extend -compat to set policy for unstable interfaces) neglected to update the "Limitation" paragraph to mention feature 'unstable' in addition to feature 'deprecated'. Do that now. Signed-off-by: Markus Armbruster <armbru@redhat.com> Message-ID: <20231009110449.4015601-1-armbru@redhat.com> Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>	2023-10-19 07:02:29 +02:00
Juan Quintela	e4ceec292f	migration: Improve json and formatting Reviewed-by: Markus Armbruster <armbru@redhat.com> Signed-off-by: Juan Quintela <quintela@redhat.com> Message-ID: <20231013104736.31722-2-quintela@redhat.com>	2023-10-17 09:25:13 +02:00
Peter Xu	8b2395970a	migration: Allow user to specify available switchover bandwidth Migration bandwidth is a very important value for live migration, because it is one of the major factors in deciding when to switch over to the destination in a precopy process. This value is currently estimated by QEMU during the whole live migration process by monitoring how fast we were sending the data. This would be the most accurate bandwidth in an ideal world, where we are always feeding unlimited data to the migration channel, so that the rate is limited only by the bandwidth that is available. However, in reality it may be very different, e.g., over a 10Gbps network we can see query-migrate showing a migration bandwidth of only a few tens of MB/s just because there are plenty of other things the migration thread might be doing. For example, the migration thread can be busy scanning zero pages, or it can be fetching dirty bitmaps from other external dirty sources (like vhost or KVM). It means we may not be pushing as much data as possible to the migration channel, so the bandwidth estimated from "how much data we sent in the channel" can sometimes be dramatically inaccurate. With that, the decision to switch over will be affected, by assuming that we may not be able to switch over at all with such a low bandwidth, when in reality we can. With that wrong bandwidth estimate, the migration may not even converge at all with the downtime specified, iterating forever. The issue is that QEMU itself may not be able to avoid those uncertainties when measuring the real "available migration bandwidth". At least not in any way I can think of so far. One way to fix this is, when the user is fully aware of the available bandwidth, to allow the user to help by providing an accurate value.
For example, if the user has a dedicated channel of 10Gbps for migration for this specific VM, the user can specify this bandwidth so QEMU can always do the calculation based on this fact, trusting the user's value for as long as it is specified. It may not be the exact bandwidth when switching over (in which case QEMU will push migration data as fast as possible), but it is much better than QEMU trying to guess wildly, especially when the guess is very wrong. A new parameter "avail-switchover-bandwidth" is introduced just for this. When the user specifies this parameter, instead of trusting the value estimated by QEMU itself (based on the QEMUFile send speed), QEMU trusts the user more by using this value to decide when to switch over, assuming that such bandwidth will be available then. Note that specifying this value will not throttle the bandwidth for switchover yet, so QEMU will always use the full bandwidth possible for sending switchover data, assuming that should always be the most important way to use the network at that time. This can resolve issues like "unconvergence migration" caused by a hilariously low "migration bandwidth" detected for whatever reason. Reported-by: Zhiyi Guo <zhguo@redhat.com> Reviewed-by: Joao Martins <joao.m.martins@oracle.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Signed-off-by: Peter Xu <peterx@redhat.com> Signed-off-by: Juan Quintela <quintela@redhat.com> Message-ID: <20231010221922.40638-1-peterx@redhat.com>	2023-10-17 09:14:32 +02:00
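The switchover decision described in the commit message above can be illustrated with a small sketch. This is not QEMU's implementation; the function names, the byte-based units and the "pending bytes" input are assumptions made for illustration. It only shows why a pessimistic bandwidth estimate can keep a migration from ever switching over, and how a user-supplied value changes the outcome.

```python
from typing import Optional


def expected_downtime_ms(pending_bytes: int, bandwidth_bytes_per_s: float) -> float:
    """Time needed to send the remaining dirty data, in milliseconds."""
    return pending_bytes / bandwidth_bytes_per_s * 1000.0


def should_switch_over(
    pending_bytes: int,
    estimated_bw_bytes_per_s: float,
    downtime_limit_ms: float,
    avail_switchover_bw_bytes_per_s: Optional[float] = None,
) -> bool:
    """Hypothetical decision rule: prefer the user-supplied bandwidth if given.

    Falls back to the measured estimate when the user value is absent.
    """
    if avail_switchover_bw_bytes_per_s:
        bandwidth = avail_switchover_bw_bytes_per_s
    else:
        bandwidth = estimated_bw_bytes_per_s
    return expected_downtime_ms(pending_bytes, bandwidth) <= downtime_limit_ms


if __name__ == "__main__":
    pending = 256 * 1024 * 1024   # 256 MiB still dirty (made-up figure)
    measured = 30e6               # "a few tens of MB/s" as in the commit text
    dedicated = 10e9 / 8          # 10 Gbps link expressed in bytes per second
    print(should_switch_over(pending, measured, 300.0))
    print(should_switch_over(pending, measured, 300.0, dedicated))
```

With the measured estimate alone the remaining data appears to need several seconds, so the sketch never switches over within a 300 ms limit; with the dedicated link value the same data fits in the limit.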
Peter Xu	c94143e587	migration: Display error in query-migrate irrelevant of status Display the error whenever it is set, regardless of FAILED status. E.g., it may also be applicable to the PAUSED stage of postcopy, to provide a hint on what has gone wrong. The error_mutex seems to have been overlooked when referencing the error; take it to be very safe. This will change QAPI behavior by showing the error message outside FAILED status, but that is intended and is not expected to break anyone. Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2018404 Reviewed-by: Fabiano Rosas <farosas@suse.de> Reviewed-by: Juan Quintela <quintela@redhat.com> Signed-off-by: Peter Xu <peterx@redhat.com> Signed-off-by: Juan Quintela <quintela@redhat.com> Message-ID: <20231004220240.167175-2-peterx@redhat.com>	2023-10-11 11:17:04 +02:00
Andrei Gudkov	320a6ccc76	migration/dirtyrate: use QEMU_CLOCK_HOST to report start-time Currently query-dirty-rate uses QEMU_CLOCK_REALTIME as the source for start-time field. This translates to clock_gettime(CLOCK_MONOTONIC), i.e. number of seconds since host boot. This is not very useful. The only reasonable use case of start-time I can imagine is to check whether previously completed measurements are too old or not. But this makes sense only if start-time is reported as host wall-clock time. This patch replaces source of start-time from QEMU_CLOCK_REALTIME to QEMU_CLOCK_HOST. Signed-off-by: Andrei Gudkov <gudkov.andrei@huawei.com> Reviewed-by: Hyman Huang <yong.huang@smartx.com> Message-Id: <399861531e3b24a1ecea2ba453fb2c3d129fb03a.1693905328.git.gudkov.andrei@huawei.com> Signed-off-by: Hyman Huang <yong.huang@smartx.com>	2023-10-10 08:04:12 +08:00
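The difference between the two clock sources in the commit above can be shown with a short sketch. It uses Python's standard `time` module rather than QEMU code; `is_stale` and its parameters are hypothetical names for the "is this measurement too old" check the commit message mentions, and it assumes start-time is reported in seconds of wall-clock time.

```python
import time
from typing import Optional


def clock_readings() -> dict:
    """Read a monotonic clock and the wall clock side by side.

    The monotonic value has an unspecified origin (commonly host boot),
    so it cannot be compared with timestamps taken elsewhere.
    """
    return {"monotonic_s": time.monotonic(), "wall_clock_s": time.time()}


def is_stale(start_time_s: float, max_age_s: float,
             now_s: Optional[float] = None) -> bool:
    """Return True if a measurement that began at start_time_s is too old.

    Only meaningful when start_time_s is wall-clock time (seconds since
    the Unix epoch), which is what the commit switches start-time to.
    """
    if now_s is None:
        now_s = time.time()
    return (now_s - start_time_s) > max_age_s


if __name__ == "__main__":
    print(clock_readings())
    print(is_stale(start_time_s=time.time() - 120, max_age_s=60))
```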
Andrei Gudkov	34a68001f1	migration/calc-dirty-rate: millisecond-granularity period This patch allows to measure dirty page rate for sub-second intervals of time. An optional argument is introduced -- calc-time-unit. For example: {"execute": "calc-dirty-rate", "arguments": {"calc-time": 500, "calc-time-unit": "millisecond"} } Millisecond granularity allows to make predictions whether migration will succeed or not. To do this, calculate dirty rate with calc-time set to max allowed downtime (e.g. 300ms), convert measured rate into volume of dirtied memory, and divide by network throughput. If the value is lower than max allowed downtime, then migration will converge. Measurement results for single thread randomly writing to a 1/4/24GiB memory region:
+----------------+-----------------------------------------------+
| calc-time      |                dirty rate MiB/s               |
| (milliseconds) +----------------+---------------+--------------+
|                | theoretical    | page-sampling | dirty-bitmap |
|                | (at 3M wr/sec) |               |              |
+----------------+----------------+---------------+--------------+
|                              1GiB                              |
+----------------+----------------+---------------+--------------+
|            100 |           6996 |          7100 |         3192 |
|            200 |           4606 |          4660 |         2655 |
|            300 |           3305 |          3280 |         2371 |
|            400 |           2534 |          2525 |         2154 |
|            500 |           2041 |          2044 |         1871 |
|            750 |           1365 |          1341 |         1358 |
|           1000 |           1024 |          1052 |         1025 |
|           1500 |            683 |           678 |          684 |
|           2000 |            512 |           507 |          513 |
+----------------+----------------+---------------+--------------+
|                              4GiB                              |
+----------------+----------------+---------------+--------------+
|            100 |          10232 |          8880 |         4070 |
|            200 |           8954 |          8049 |         3195 |
|            300 |           7889 |          7193 |         2881 |
|            400 |           6996 |          6530 |         2700 |
|            500 |           6245 |          5772 |         2312 |
|            750 |           4829 |          4586 |         2465 |
|           1000 |           3865 |          3780 |         2178 |
|           1500 |           2694 |          2633 |         2004 |
|           2000 |           2041 |          2031 |         1789 |
+----------------+----------------+---------------+--------------+
|                              24GiB                             |
+----------------+----------------+---------------+--------------+
|            100 |          11495 |          8640 |         5597 |
|            200 |          11226 |          8616 |         3527 |
|            300 |          10965 |          8386 |         2355 |
|            400 |          10713 |          8370 |         2179 |
|            500 |          10469 |          8196 |         2098 |
|            750 |           9890 |          7885 |         2556 |
|           1000 |           9354 |          7506 |         2084 |
|           1500 |           8397 |          6944 |         2075 |
|           2000 |           7574 |          6402 |         2062 |
+----------------+----------------+---------------+--------------+
Theoretical values are computed according to the following formula: size * (1 - (1-(4096/size))^(time*wps)) / (time * 2^20), where size is in bytes, time is in seconds, and wps is number of writes per second. Signed-off-by: Andrei Gudkov <gudkov.andrei@huawei.com> Reviewed-by: Hyman Huang <yong.huang@smartx.com> Message-Id: <d802e6b8053eb60fbec1a784cf86f67d9528e0a8.1693895970.git.gudkov.andrei@huawei.com> Signed-off-by: Hyman Huang <yong.huang@smartx.com>	2023-10-10 08:03:50 +08:00
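The theoretical dirty-rate formula and the convergence check from the commit message above can be written out as a short sketch. The formula is taken from the commit text, reading the exponent as time multiplied by wps and the divisor as time multiplied by 2^20; the function names and the `will_converge` helper are illustrative assumptions, not part of QEMU.

```python
PAGE_SIZE = 4096   # bytes, as used in the commit's formula
MIB = 2 ** 20


def theoretical_dirty_rate_mib_s(size_bytes: int, time_s: float, wps: float) -> float:
    """Expected dirty rate in MiB/s for uniformly random page writes.

    Each write hits one of size/4096 pages; after time*wps writes the
    fraction of pages touched at least once is 1 - (1 - 4096/size)^(time*wps).
    """
    touched_fraction = 1 - (1 - PAGE_SIZE / size_bytes) ** (time_s * wps)
    return size_bytes * touched_fraction / (time_s * MIB)


def will_converge(dirty_rate_mib_s: float, max_downtime_s: float,
                  throughput_mib_s: float) -> bool:
    """Prediction described in the commit: memory dirtied during one
    downtime window must be transferable within that same window."""
    dirtied_mib = dirty_rate_mib_s * max_downtime_s
    transfer_time_s = dirtied_mib / throughput_mib_s
    return transfer_time_s < max_downtime_s


if __name__ == "__main__":
    # 1 GiB region, 1 second, 3M writes per second: nearly every page is hit.
    print(round(theoretical_dirty_rate_mib_s(2 ** 30, 1.0, 3_000_000)))
    # Measured page-sampling rate for 1 GiB at 300 ms, over a made-up link speed.
    print(will_converge(3280, 0.3, throughput_mib_s=10 * 1024))
```

Note that the check reduces to comparing the dirty rate against the network throughput; measuring over the downtime window matters because, as the table shows, the rate depends strongly on the window length.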
Andrey Drobyshev via	52b10c9c0c	qemu-img: map: report compressed data blocks Right now "qemu-img map" reports compressed blocks as containing data but having no host offset. This is not very informative. Instead, let's add another boolean field named "compressed" in case JSON output mode is specified. This is achieved by utilizing new allocation status flag BDRV_BLOCK_COMPRESSED for bdrv_block_status(). Also update the expected qemu-iotests outputs to contain the new field. Signed-off-by: Andrey Drobyshev <andrey.drobyshev@virtuozzo.com> Message-ID: <20230907210226.953821-3-andrey.drobyshev@virtuozzo.com> Reviewed-by: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2023-09-20 17:46:01 +02:00
Stefan Hajnoczi	4907644841	Merge tag 'mem-2023-09-19' of https://github.com/davidhildenbrand/qemu into staging Hi, "Host Memory Backends" and "Memory devices" queue ("mem"): - Support and document VM templating with R/O files using a new "rom" parameter for memory-backend-file - Some cleanups and fixes around NVDIMMs and R/O file handling for guest RAM - Optimize ioeventfd updates by skipping address spaces that are not applicable # -----BEGIN PGP SIGNATURE----- # # iQJFBAABCAAvFiEEG9nKrXNcTDpGDfzKTd4Q9wD/g1oFAmUJdykRHGRhdmlkQHJl # ZGhhdC5jb20ACgkQTd4Q9wD/g1pf2w//akOUoYMuamySGjXtKLVyMKZkjIys+Ama # k2C0xzsWAHBP572ezwHi8uxf5j9kzAjsw6GxDZ7FAamD9MhiohkEvkecloBx6f/c # q3fVHblBNkG7v2urtf4+6PJtJvhzOST2SFXfWeYhO/vaA04AYCDgexv82JN3gA6B # OS8WyOX62b8wILPSY2GLZ8IqpE9XnOYZwzVBn6YB1yo7ZkYEfXO6cA8nykNuNcOE #
vppqDo7uVIX6317FWj8ygxmzFfOaj0WT2MT2XFzEIDfg8BInQN8HC4mTn0hcVKMa # N1y+eZH733CQKT+uNBRZ5YOeljOi4d6gEEyvkkA/L7e5D3Qg9hIdvHb4uryCFSWX # Vt07OP1XLBwCZFobOC6sg+2gtTZJxxYK89e6ZzEd0454S24w5bnEteRAaCGOP0XL # ww9xYULqhtZs55UC4rvZHJwdUAk1fIY4VqynwkeQXegvz6BxedNeEkJiiEU0Tizx # N2VpsxAJ7H/LLSFeZoCRESo4azrH6U4n7S/eS1tkCniFqibfe2yIQCDoJVfb42ec # gfg/vThCrDwHkIHzkMmoV8NndA7Q7SIkyMfYeEEBeZMeg8JzYll4DJEw/jQCacxh # KRUa+AZvGlTJUq0mkvyOVfLki+iaehoIUuY1yvMrmdWijPO8n3YybmP9Ljhr8VdR # 9MSYZe+I2v8= # =iraT # -----END PGP SIGNATURE----- # gpg: Signature made Tue 19 Sep 2023 06:25:45 EDT # gpg: using RSA key 1BD9CAAD735C4C3A460DFCCA4DDE10F700FF835A # gpg: issuer "david@redhat.com" # gpg: Good signature from "David Hildenbrand <david@redhat.com>" [unknown] # gpg: aka "David Hildenbrand <davidhildenbrand@gmail.com>" [full] # gpg: aka "David Hildenbrand <hildenbr@in.tum.de>" [unknown] # gpg: WARNING: The key's User ID is not certified with a trusted signature! # gpg: There is no indication that the signature belongs to the owner. 
# Primary key fingerprint: 1BD9 CAAD 735C 4C3A 460D FCCA 4DDE 10F7 00FF 835A * tag 'mem-2023-09-19' of https://github.com/davidhildenbrand/qemu: memory: avoid updating ioeventfds for some address_space machine: Improve error message when using default RAM backend id softmmu/physmem: Hint that "readonly=on,rom=off" exists when opening file R/W for private mapping fails docs: Start documenting VM templating docs: Don't mention "-mem-path" in multi-process.rst softmmu/physmem: Never return directories from file_ram_open() softmmu/physmem: Fail creation of new files in file_ram_open() with readonly=true softmmu/physmem: Bail out early in ram_block_discard_range() with readonly files softmmu/physmem: Remap with proper protection in qemu_ram_remap() backends/hostmem-file: Add "rom" property to support VM templating with R/O files softmmu/physmem: Distinguish between file access mode and mmap protection nvdimm: Reject writing label data to ROM instead of crashing QEMU Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>	2023-09-19 13:22:19 -04:00
David Hildenbrand	e92666b0ba	backends/hostmem-file: Add "rom" property to support VM templating with R/O files For now, "share=off,readonly=on" would always result in us opening the file R/O and mmap'ing the opened file MAP_PRIVATE R/O -- effectively turning it into ROM. Especially for VM templating, "share=off" is a common use case. However, that use case is impossible with files that lack write permissions, because "share=off,readonly=on" will not give us writable RAM. The sole user of ROM via memory-backend-file are R/O NVDIMMs, but as we have users (Kata Containers) that rely on the existing behavior -- malicious VMs should not be able to consume COW memory for R/O NVDIMMs -- we cannot change the semantics of "share=off,readonly=on" So let's add a new "rom" property with on/off/auto values. "auto" is the default and what most people will use: for historical reasons, to not change the old semantics, it defaults to the value of the "readonly" property. For VM templating, one can now use: -object memory-backend-file,share=off,readonly=on,rom=off,... But we'll disallow: -object memory-backend-file,share=on,readonly=on,rom=off,... because we would otherwise get an error when trying to mmap the R/O file shared and writable. An explicit error message is cleaner. We will also disallow for now: -object memory-backend-file,share=off,readonly=off,rom=on,... -object memory-backend-file,share=on,readonly=off,rom=on,... It's not harmful, but also not really required for now. Alternatives that were abandoned: * Make "unarmed=on" for the NVDIMM set the memory region container readonly. We would still see a change of ROM->RAM and possibly run into memslot limits with vhost-user. Further, there might be use cases for "unarmed=on" that should still allow writing to that memory (temporary files, system RAM, ...). * Add a new "readonly=on/off/auto" parameter for NVDIMMs. Similar issues as with "unarmed=on". * Make "readonly" consume "on/off/file" instead of being a 'bool' type. 
This would slightly change the behavior of the "readonly" parameter: values like true/false (as accepted by a 'bool' type) would no longer be accepted. Message-ID: <20230906120503.359863-4-david@redhat.com> Acked-by: Markus Armbruster <armbru@redhat.com> Signed-off-by: David Hildenbrand <david@redhat.com>	2023-09-19 10:23:21 +02:00
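The allowed and rejected combinations of "share", "readonly" and "rom" listed in the commit message above can be summarized in a small sketch. This mirrors only the rules stated in the commit text; `resolve_rom` is a hypothetical helper, not a QEMU function, and the error wording is invented.

```python
def resolve_rom(share: bool, readonly: bool, rom: str = "auto") -> bool:
    """Return True if the backend ends up as ROM, False if it is RAM.

    Raises ValueError for the combinations the commit message says
    are disallowed.
    """
    if rom not in ("on", "off", "auto"):
        raise ValueError("rom must be one of on/off/auto")
    if rom == "auto":
        # Historical semantics: "auto" follows the "readonly" property.
        return readonly
    rom_on = rom == "on"
    if rom_on and not readonly:
        # readonly=off,rom=on is disallowed for now (with share=on or off).
        raise ValueError("rom=on requires readonly=on")
    if not rom_on and readonly and share:
        # A R/O file cannot be mapped shared and writable.
        raise ValueError("share=on,readonly=on,rom=off is not supported")
    return rom_on


if __name__ == "__main__":
    # VM templating case: R/O file, private mapping, writable guest RAM.
    print(resolve_rom(share=False, readonly=True, rom="off"))
    # Default keeps the old behavior: R/O file becomes ROM.
    print(resolve_rom(share=False, readonly=True))
```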
Ilya Maximets	cb039ef3d9	net: add initial support for AF_XDP network backend AF_XDP is a network socket family that allows communication directly with the network device driver in the kernel, bypassing most or all of the kernel networking stack. In essence, the technology is pretty similar to netmap. But, unlike netmap, AF_XDP is Linux-native and works with any network interface without driver modifications. Unlike vhost-based backends (kernel, user, vdpa), AF_XDP doesn't require access to character devices or unix sockets. Only access to the network interface itself is necessary. This patch implements a network backend that communicates with the kernel by creating an AF_XDP socket. A chunk of userspace memory is shared between QEMU and the host kernel. 4 ring buffers (Tx, Rx, Fill and Completion) are placed in that memory along with a pool of memory buffers for the packet data. Data transmission is done by allocating one of the buffers, copying packet data into it and placing the pointer into the Tx ring. After transmission, the device will return the buffer via the Completion ring. On Rx, the device will take a buffer from a pre-populated Fill ring, write the packet data into it and place the buffer into the Rx ring. The AF_XDP network backend takes on the communication with the host kernel and the network interface and forwards packets to/from the peer device in QEMU. Usage example: -device virtio-net-pci,netdev=guest1,mac=00:16:35:AF:AA:5C -netdev af-xdp,ifname=ens6f1np1,id=guest1,mode=native,queues=1 An XDP program bridges the socket with a network interface. It can be attached to the interface in 2 different modes: 1. skb - this mode should work for any interface and doesn't require driver support, with the caveat of lower performance. 2. native - this does require support from the driver and allows bypassing skb allocation in the kernel and potentially using zero-copy while getting packets in/out of userspace. By default, QEMU will try to use native mode and fall back to skb.
Mode can be forced via 'mode' option. To force 'copy' even in native mode, use 'force-copy=on' option. This might be useful if there is some issue with the driver. Option 'queues=N' allows to specify how many device queues should be open. Note that all the queues that are not open are still functional and can receive traffic, but it will not be delivered to QEMU. So, the number of device queues should generally match the QEMU configuration, unless the device is shared with something else and the traffic re-direction to appropriate queues is correctly configured on a device level (e.g. with ethtool -N). 'start-queue=M' option can be used to specify from which queue id QEMU should start configuring 'N' queues. It might also be necessary to use this option with certain NICs, e.g. MLX5 NICs. See the docs for examples. In a general case QEMU will need CAP_NET_ADMIN and CAP_SYS_ADMIN or CAP_BPF capabilities in order to load default XSK/XDP programs to the network interface and configure BPF maps. It is possible, however, to run with no capabilities. For that to work, an external process with enough capabilities will need to pre-load default XSK program, create AF_XDP sockets and pass their file descriptors to QEMU process on startup via 'sock-fds' option. Network backend will need to be configured with 'inhibit=on' to avoid loading of the program. QEMU will need 32 MB of locked memory (RLIMIT_MEMLOCK) per queue or CAP_IPC_LOCK. There are few performance challenges with the current network backends. First is that they do not support IO threads. This means that data path is handled by the main thread in QEMU and may slow down other work or may be slowed down by some other work. This also means that taking advantage of multi-queue is generally not possible today. Another thing is that data path is going through the device emulation code, which is not really optimized for performance. The fastest "frontend" device is virtio-net. 
But it's not optimized for heavy traffic either, because it expects such use-cases to be handled via some implementation of vhost (user, kernel, vdpa). In practice, we have virtio notifications and rcu lock/unlock on a per-packet basis and not very efficient accesses to the guest memory. Communication channels between backend and frontend devices do not allow passing more than one packet at a time as well. Some of these challenges can be avoided in the future by adding better batching into device emulation or by implementing vhost-af-xdp variant. There are also a few kernel limitations. AF_XDP sockets do not support any kinds of checksum or segmentation offloading. Buffers are limited to a page size (4K), i.e. MTU is limited. Multi-buffer support implementation for AF_XDP is in progress, but not ready yet. Also, transmission in all non-zero-copy modes is synchronous, i.e. done in a syscall. That doesn't allow high packet rates on virtual interfaces. However, keeping in mind all of these challenges, current implementation of the AF_XDP backend shows a decent performance while running on top of a physical NIC with zero-copy support. Test setup: 2 VMs running on 2 physical hosts connected via ConnectX6-Dx card. Network backend is configured to open the NIC directly in native mode. The driver supports zero-copy. NIC is configured to use 1 queue. Inside a VM - iperf3 for basic TCP performance testing and dpdk-testpmd for PPS testing. 
iperf3 result: TCP stream : 19.1 Gbps dpdk-testpmd (single queue, single CPU core, 64 B packets) results: Tx only : 3.4 Mpps Rx only : 2.0 Mpps L2 FWD Loopback : 1.5 Mpps In skb mode the same setup shows much lower performance, similar to the setup where pair of physical NICs is replaced with veth pair: iperf3 result: TCP stream : 9 Gbps dpdk-testpmd (single queue, single CPU core, 64 B packets) results: Tx only : 1.2 Mpps Rx only : 1.0 Mpps L2 FWD Loopback : 0.7 Mpps Results in skb mode or over the veth are close to results of a tap backend with vhost=on and disabled segmentation offloading bridged with a NIC. Signed-off-by: Ilya Maximets <i.maximets@ovn.org> Reviewed-by: Daniel P. Berrangé <berrange@redhat.com> (docker/lcitool) Signed-off-by: Jason Wang <jasowang@redhat.com>	2023-09-18 14:36:13 +08:00
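The 'queues=N' and 'start-queue=M' options and the locked-memory requirement from the commit message above can be illustrated with a tiny sketch. The helper names are hypothetical, "32 MB" is read here as 32 MiB, and the queue range M..M+N-1 is my reading of "start configuring 'N' queues from queue id M"; none of this is taken from QEMU source.

```python
from typing import List

# Assumption: the commit's "32 MB" per queue is taken as 32 MiB.
MEMLOCK_PER_QUEUE_BYTES = 32 * 1024 * 1024


def queue_ids(start_queue: int, queues: int) -> List[int]:
    """Device queue ids that would be opened for start-queue=M, queues=N."""
    if start_queue < 0 or queues < 1:
        raise ValueError("start_queue must be >= 0 and queues >= 1")
    return list(range(start_queue, start_queue + queues))


def required_memlock_bytes(queues: int) -> int:
    """Locked memory (RLIMIT_MEMLOCK) needed when CAP_IPC_LOCK is absent."""
    if queues < 1:
        raise ValueError("queues must be >= 1")
    return queues * MEMLOCK_PER_QUEUE_BYTES


if __name__ == "__main__":
    print(queue_ids(start_queue=2, queues=3))
    print(required_memlock_bytes(queues=3) // (1024 * 1024), "MiB")
```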
Marc-André Lureau	32aa1f8dee	ui/vc: do not parse VC-specific options in Spice and GTK In commit `6f974c843c` ("gtk: overwrite the console.c char driver"), I shared the VC console parse handler with GTK. And later on in commit `d8aec9d9` ("display: add -display spice-app launching a Spice client"), I also used it to handle spice-app VC. This is not necessary, the VC console options (width/height/cols/rows) are specific, and unused by tty-level GTK/Spice VC. This is not a breaking change, as those options are still being parsed by QAPI ChardevVC. Adjust the documentation about it. Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com> Reviewed-by: Daniel P. Berrangé <berrange@redhat.com> Message-Id: <20230830093843.3531473-44-marcandre.lureau@redhat.com>	2023-09-04 14:57:37 +04:00
Hyman Huang(黄勇)	ef96537732	qapi: Craft the dirty-limit capability comment Signed-off-by: Hyman Huang(黄勇) <yong.huang@smartx.com> Message-ID: <169073570563.19893.2928364761104733482-2@git.sr.ht> Reviewed-by: Markus Armbruster <armbru@redhat.com> Signed-off-by: Markus Armbruster <armbru@redhat.com>	2023-08-02 09:33:38 +02:00
Hyman Huang(黄勇)	8abc81150f	qapi: Reformat the dirty-limit migration doc comments Reformat the dirty-limit migration doc comments to conform to current conventions as commit `a937b6aa73` (qapi: Reformat doc comments to conform to current conventions). Signed-off-by: Hyman Huang(黄勇) <yong.huang@smartx.com> Message-ID: <169073570563.19893.2928364761104733482-1@git.sr.ht> Reviewed-by: Markus Armbruster <armbru@redhat.com> [Whitespace tidied up] Signed-off-by: Markus Armbruster <armbru@redhat.com>	2023-08-02 09:33:34 +02:00
Richard Henderson	ccdd312676	QAPI patches patches for 2023-07-26 Merge tag 'pull-qapi-2023-07-26-v2' of https://repo.or.cz/qemu/armbru into staging # gpg: Signature made Wed 26 Jul 2023 05:52:05 AM PDT # gpg: using RSA key 354BC8B3D7EB2A6B68674E5F3870B400EB918653 # gpg: issuer "armbru@redhat.com" # gpg: Good signature from "Markus Armbruster <armbru@redhat.com>" [undefined] # gpg: aka "Markus Armbruster <armbru@pond.sub.org>" [undefined] # gpg: WARNING: This key is not certified with a trusted signature! # gpg: There is no indication that the signature belongs to the owner. # Primary key fingerprint: 354B C8B3 D7EB 2A6B 6867 4E5F 3870 B400 EB91 8653 * tag 'pull-qapi-2023-07-26-v2' of https://repo.or.cz/qemu/armbru: qapi: Reformat recent doc comments to conform to current conventions qapi/trace: Tidy up trace-event-get-state, -set-state documentation qapi/qdev: Tidy up device_add documentation qapi/block: Tidy up block-latency-histogram-set documentation qapi/block-core: Tidy up BlockLatencyHistogramInfo documentation Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2023-07-26 07:16:19 -07:00
Markus Armbruster	9e272073e1	qapi: Reformat recent doc comments to conform to current conventions Since commit `a937b6aa73` (qapi: Reformat doc comments to conform to current conventions), a number of comments not conforming to the current formatting conventions were added. No problem, just sweep the entire documentation once more. To check the generated documentation does not change, I compared the generated HTML before and after this commit with "wdiff -3". Finds no differences. Comparing with diff is not useful, as the reflown paragraphs are visible there. Signed-off-by: Markus Armbruster <armbru@redhat.com> Message-ID: <20230720071610.1096458-7-armbru@redhat.com>	2023-07-26 14:51:36 +02:00
Markus Armbruster	e27a9d628d	qapi/trace: Tidy up trace-event-get-state, -set-state documentation trace-event-set-state's explanation of how events are selected is under "Features". Doesn't belong there. Simply delete it, as it feels redundant with documentation of member @name. trace-event-get-state's explanation is under "Returns". Tolerable, but similarly redundant. Delete it, too. Cc: Alex Bennée <alex.bennee@linaro.org> Signed-off-by: Markus Armbruster <armbru@redhat.com> Message-ID: <20230720071610.1096458-5-armbru@redhat.com>	2023-07-26 14:51:36 +02:00
Markus Armbruster	a9c72efd6d	qapi/qdev: Tidy up device_add documentation The notes section comes out like this: Notes Additional arguments depend on the type. 1. For detailed information about this command, please refer to the ‘docs/qdev-device-use.txt’ file. 2. It’s possible to list device properties by running QEMU with the “-device DEVICE,help” command-line argument, where DEVICE is the device’s name The first item isn't numbered. Fix that: 1. Additional arguments depend on the type. 2. For detailed information about this command, please refer to the ‘docs/qdev-device-use.txt’ file. 3. It’s possible to list device properties by running QEMU with the “-device DEVICE,help” command-line argument, where DEVICE is the device’s name Signed-off-by: Markus Armbruster <armbru@redhat.com> Message-ID: <20230720071610.1096458-4-armbru@redhat.com> Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>	2023-07-26 14:51:36 +02:00
Markus Armbruster	e893b9e3b3	qapi/block: Tidy up block-latency-histogram-set documentation Examples come out like Example set new histograms for all io types with intervals [0, 10), [10, 50), [50, 100), [100, +inf): The sentence "set new histograms ..." starts with a lower case letter. Capitalize it. Same for the other examples. Signed-off-by: Markus Armbruster <armbru@redhat.com> Message-ID: <20230720071610.1096458-3-armbru@redhat.com> Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>	2023-07-26 14:51:36 +02:00
Markus Armbruster	dad3c9565d	qapi/block-core: Tidy up BlockLatencyHistogramInfo documentation Documentation for member @bin comes out like list of io request counts corresponding to histogram intervals. len("bins") = len("boundaries") + 1 For the example above, "bins" may be something like [3, 1, 5, 2], and corresponding histogram looks like: Note how the equation and the sentence following it run together. Replace the equation: list of io request counts corresponding to histogram intervals, one more element than "boundaries" has. For the example above, "bins" may be something like [3, 1, 5, 2], and corresponding histogram looks like: Cc: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru> Signed-off-by: Markus Armbruster <armbru@redhat.com> Message-ID: <20230720071610.1096458-2-armbru@redhat.com> [Off by one fixed]	2023-07-26 14:50:16 +02:00
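The relation between "boundaries" and "bins" described in the commit above can be illustrated with a minimal Python sketch. It is not QEMU code: the function name `histogram_bins` and the sample latencies are made up for illustration, and the boundaries `[10, 50, 100]` are taken from the intervals [0, 10), [10, 50), [50, 100), [100, +inf) quoted in the block-latency-histogram-set commit.

```python
from bisect import bisect_right


def histogram_bins(boundaries, latencies):
    """Count values per half-open interval; returns len(boundaries) + 1 bins."""
    bins = [0] * (len(boundaries) + 1)
    for value in latencies:
        # bisect_right places a value equal to a boundary into the next
        # interval, which matches the [low, high) convention.
        bins[bisect_right(boundaries, value)] += 1
    return bins


# Hypothetical latency samples chosen only for this illustration.
boundaries = [10, 50, 100]
samples = [1, 5, 9, 10, 55, 60, 70, 80, 99, 100, 250]
bins = histogram_bins(boundaries, samples)
print(bins)
```

With these made-up samples the bins have one more element than the boundaries, as the reworded documentation states.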
Juan Quintela	7b24d32634	migration: skipped field is really obsolete. Has returned zero for more than 10 years. Specifically we introduced the field in 1.5.0 commit `f1c72795af` Author: Peter Lieven <pl@kamp.de> Date: Tue Mar 26 10:58:37 2013 +0100 migration: do not sent zero pages in bulk stage during bulk stage of ram migration if a page is a zero page do not send it at all. the memory at the destination reads as zero anyway. even if there is an madvise with QEMU_MADV_DONTNEED at the target upon receipt of a zero page I have observed that the target starts swapping if the memory is overcommitted. it seems that the pages are dropped asynchronously. this patch also updates QMP to return the number of skipped pages in MigrationStats. but removed its usage in 1.5.3 commit `9ef051e553` Author: Peter Lieven <pl@kamp.de> Date: Mon Jun 10 12:14:19 2013 +0200 Revert "migration: do not sent zero pages in bulk stage" Not sending zero pages breaks migration if a page is zero at the source but not at the destination. This can e.g. happen if different BIOS versions are used at source and destination. It has also been reported that migration on pseries is completely broken with this patch. This effectively reverts commit `f1c72795af`. Reviewed-by: Daniel P. Berrangé <berrange@redhat.com> Message-ID: <20230612193344.3796-2-quintela@redhat.com> Signed-off-by: Juan Quintela <quintela@redhat.com>	2023-07-26 10:55:56 +02:00
Hyman Huang(黄勇)	15699cf542	migration: Extend query-migrate to provide dirty page limit info Extend query-migrate to provide throttle time and estimated ring full time when the dirty-limit capability is enabled, through which we can observe whether the dirty limit takes effect during live migration. Signed-off-by: Hyman Huang(黄勇) <yong.huang@smartx.com> Reviewed-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Message-ID: <168733225273.5845.15871826788879741674-8@git.sr.ht> Signed-off-by: Juan Quintela <quintela@redhat.com>	2023-07-26 10:55:56 +02:00
Hyman Huang(黄勇)	dc62395557	migration: Introduce dirty-limit capability Introduce the migration dirty-limit capability, which can be turned on before live migration and limits the dirty page rate during live migration. Introduce the migrate_dirty_limit function to help check whether the dirty-limit capability is enabled during live migration. Meanwhile, refactor vcpu_dirty_rate_stat_collect so that the period can be configured instead of hardcoded. The dirty-limit capability is kind of like auto-converge but uses the dirty limit instead of the traditional cpu-throttle to throttle the guest down. To enable this feature, turn on the dirty-limit capability before live migration using migrate-set-capabilities, and set the parameters "x-vcpu-dirty-limit-period", "vcpu-dirty-limit" suitably to speed up convergence. Signed-off-by: Hyman Huang(黄勇) <yong.huang@smartx.com> Acked-by: Peter Xu <peterx@redhat.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Message-Id: <168618975839.6361.17407633874747688653-4@git.sr.ht> Signed-off-by: Juan Quintela <quintela@redhat.com>	2023-07-26 10:55:56 +02:00
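The enabling steps named in the commit above (migrate-set-capabilities, then migrate-set-parameters) might look roughly like the following when built as QMP messages. This is a sketch under assumptions: the helper name `dirty_limit_commands` and the example values are hypothetical, the 1 to 1000 ms range comes from the x-vcpu-dirty-limit-period commit in this log, and the unit of "vcpu-dirty-limit" is assumed here to be MB/s.

```python
import json


def dirty_limit_commands(period_ms, limit_mb_per_s):
    """Build the two QMP messages that turn on and tune dirty-limit."""
    # Range taken from the x-vcpu-dirty-limit-period commit message.
    if not 1 <= period_ms <= 1000:
        raise ValueError("x-vcpu-dirty-limit-period must be within 1..1000 ms")
    return [
        {
            "execute": "migrate-set-capabilities",
            "arguments": {
                "capabilities": [{"capability": "dirty-limit", "state": True}]
            },
        },
        {
            "execute": "migrate-set-parameters",
            "arguments": {
                "x-vcpu-dirty-limit-period": period_ms,
                # Unit assumed to be MB/s; the value is illustrative only.
                "vcpu-dirty-limit": limit_mb_per_s,
            },
        },
    ]


commands = dirty_limit_commands(period_ms=1000, limit_mb_per_s=64)
for command in commands:
    print(json.dumps(command))
```

Each printed line is one JSON message that could be sent to a QMP monitor after the usual qmp_capabilities handshake.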
Hyman Huang(黄勇)	09f9ec9913	qapi/migration: Introduce vcpu-dirty-limit parameters Introduce the "vcpu-dirty-limit" migration parameter, used to limit the dirty page rate during live migration. "vcpu-dirty-limit" and "x-vcpu-dirty-limit-period" are two dirty-limit-related migration parameters, which can be set before and during live migration by qmp migrate-set-parameters. These two parameters are used to help implement the dirty page rate limit algorithm of migration. Signed-off-by: Hyman Huang(黄勇) <yong.huang@smartx.com> Acked-by: Peter Xu <peterx@redhat.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Message-Id: <168618975839.6361.17407633874747688653-3@git.sr.ht> Signed-off-by: Juan Quintela <quintela@redhat.com>	2023-07-26 10:55:56 +02:00
Hyman Huang(黄勇)	4d80785719	qapi/migration: Introduce x-vcpu-dirty-limit-period parameter Introduce the experimental migration parameter "x-vcpu-dirty-limit-period", which ranges from 1 to 1000 ms and makes the dirty rate calculation period configurable. The total live migration time currently varies with "x-vcpu-dirty-limit-period"; test results show that the optimal value lies between 500 ms and 1000 ms. "x-vcpu-dirty-limit-period" should be made stable once it proves that the best value cannot be determined by developer experiments. Signed-off-by: Hyman Huang(黄勇) <yong.huang@smartx.com> Reviewed-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Message-Id: <168618975839.6361.17407633874747688653-2@git.sr.ht> Signed-off-by: Juan Quintela <quintela@redhat.com>	2023-07-26 10:55:56 +02:00
Markus Armbruster	ff62c21016	qapi: Correct "eg." to "e.g." in documentation Signed-off-by: Markus Armbruster <armbru@redhat.com> Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>	2023-07-25 17:20:32 +03:00
Peter Maydell	7d1d6a0c19	QAPI patches patches for 2023-07-10 Merge tag 'pull-qapi-2023-07-10' of https://repo.or.cz/qemu/armbru into staging # gpg: Signature made Mon 10 Jul 2023 12:16:11 BST # gpg: using RSA key 354BC8B3D7EB2A6B68674E5F3870B400EB918653 # gpg: issuer "armbru@redhat.com" # gpg: Good signature from "Markus Armbruster <armbru@redhat.com>" [full] # gpg: aka "Markus Armbruster <armbru@pond.sub.org>" [full] # Primary key fingerprint: 354B C8B3 D7EB 2A6B 6867 4E5F 3870 B400 EB91 8653 * tag 'pull-qapi-2023-07-10' of https://repo.or.cz/qemu/armbru: migration.json: Don't use space before colon qapi: better docs for calc-dirty-rate and friends Signed-off-by: Peter Maydell <peter.maydell@linaro.org>	2023-07-24 18:06:25 +01:00
Marc-André Lureau	20c5124805	audio/pw: Pipewire->PipeWire case fix for user-visible text "PipeWire" is the correct case. Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com> Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org> Reviewed-by: Volker Rümelin <vr_qemu@t-online.de> Message-Id: <20230506163735.3481387-4-marcandre.lureau@redhat.com>	2023-07-17 15:22:56 +04:00
Juan Quintela	fd658a7b8c	migration.json: Don't use space before colon So all the file is consistent. Signed-off-by: Juan Quintela <quintela@redhat.com> Message-Id: <20230612191604.2219-1-quintela@redhat.com> Reviewed-by: Markus Armbruster <armbru@redhat.com> Signed-off-by: Markus Armbruster <armbru@redhat.com>	2023-07-10 07:47:36 +02:00
Andrei Gudkov	5034e3d4e8	qapi: better docs for calc-dirty-rate and friends Rewrote calc-dirty-rate documentation. Briefly described different modes of dirty page rate measurement. Added some examples. Fixed obvious grammar errors. Signed-off-by: Andrei Gudkov <gudkov.andrei@huawei.com> Message-Id: <fe7d32a621ebd69ef6974beb2499c0b5dccb9e19.1684854849.git.gudkov.andrei@huawei.com> Acked-by: Markus Armbruster <armbru@redhat.com> Acked-by: Peter Xu <peterx@redhat.com> [Prose tweaked and spacing corrected, as per review] Signed-off-by: Markus Armbruster <armbru@redhat.com>	2023-07-10 07:47:36 +02:00
Avihai Horon	6574232fff	migration: Add switchover ack capability Migration downtime estimation is calculated based on bandwidth and remaining migration data. This assumes that loading of migration data in the destination takes a negligible amount of time and that downtime depends only on network speed. While this may be true for RAM, it's not necessarily true for other migrated devices. For example, loading the data of a VFIO device in the destination might require from the device to allocate resources, prepare internal data structures and so on. These operations can take a significant amount of time which can increase migration downtime. This patch adds a new capability "switchover ack" that prevents the source from stopping the VM and completing the migration until an ACK is received from the destination that it's OK to do so. This can be used by migrated devices in various ways to reduce downtime. For example, a device can send initial precopy metadata to pre-allocate resources in the destination and use this capability to make sure that the pre-allocation is completed before the source VM is stopped, so it will have full effect. This new capability relies on the return path capability to communicate from the destination back to the source. The actual implementation of the capability will be added in the following patches. Signed-off-by: Avihai Horon <avihaih@nvidia.com> Reviewed-by: Peter Xu <peterx@redhat.com> Acked-by: Markus Armbruster <armbru@redhat.com> Tested-by: YangHang Liu <yanghliu@redhat.com> Acked-by: Alex Williamson <alex.williamson@redhat.com> Signed-off-by: Cédric Le Goater <clg@redhat.com>	2023-06-30 06:02:51 +02:00
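Because the commit above states that switchover ack relies on the return path capability, a management script could check that dependency before sending the request. The sketch below is only an illustration: `capabilities_command` is a hypothetical helper, and the local check mirrors the dependency described in the commit message rather than any specific QEMU error path.

```python
import json


def capabilities_command(enabled):
    """Build a migrate-set-capabilities message for the given capability names."""
    names = set(enabled)
    # Dependency described in the commit message: the destination sends its
    # ACK back to the source over the return path.
    if "switchover-ack" in names and "return-path" not in names:
        raise ValueError("switchover-ack requires return-path")
    return {
        "execute": "migrate-set-capabilities",
        "arguments": {
            "capabilities": [
                {"capability": name, "state": True} for name in sorted(names)
            ]
        },
    }


message = capabilities_command(["switchover-ack", "return-path"])
print(json.dumps(message))
```

Sorting the names only keeps the generated message deterministic; the order of capabilities in the request is not meaningful in this sketch.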
Marc-André Lureau	39324b4966	ui: add egl-headless support on win32 Make GBM optional for EGL code, and enable the build for win32. Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com> Message-Id: <20230606115658.677673-13-marcandre.lureau@redhat.com>	2023-06-27 17:08:56 +02:00
Fei Wu	1b65b4f54c	accel/tcg: remove CONFIG_PROFILER TBStats will be introduced to replace CONFIG_PROFILER totally, here remove all CONFIG_PROFILER related stuffs first. Signed-off-by: Vanderson M. do Rosario <vandersonmr2@gmail.com> Signed-off-by: Alex Bennée <alex.bennee@linaro.org> Signed-off-by: Fei Wu <fei2.wu@intel.com> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Message-Id: <20230607122411.3394702-2-fei2.wu@intel.com> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2023-06-26 17:33:00 +02:00
Jonathan Cameron	bafe030832	hw/cxl/events: Add injection of Memory Module Events These events include a copy of the device health information at the time of the event. Actually using the emulated device health would require a lot of controls to manipulate that state. Given the aim of this injection code is to just test the flows when events occur, inject the contents of the device health state as well. Future work may add more sophisticated device health emulation including direct generation of these records when events occur (such as a temperature threshold being crossed). That does not reduce the usefulness of this more basic generation of the events. Acked-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Fan Ni <fan.ni@samsung.com> Reviewed-by: Ira Weiny <ira.weiny@intel.com> Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Message-Id: <20230530133603.16934-8-Jonathan.Cameron@huawei.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>	2023-06-23 02:54:40 -04:00
Jonathan Cameron	b90a324eda	hw/cxl/events: Add injection of DRAM events Defined in CXL r3.0 8.2.9.2.1.2 DRAM Event Record, this event provides information related to DRAM devices. Example injection command in QMP: { "execute": "cxl-inject-dram-event", "arguments": { "path": "/machine/peripheral/cxl-mem0", "log": "informational", "flags": 1, "dpa": 1000, "descriptor": 3, "type": 3, "transaction-type": 192, "channel": 3, "rank": 17, "nibble-mask": 37421234, "bank-group": 7, "bank": 11, "row": 2, "column": 77, "correction-mask": [33, 44, 55,66] }} Acked-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Fan Ni <fan.ni@samsung.com> Reviewed-by: Ira Weiny <ira.weiny@intel.com> Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Message-Id: <20230530133603.16934-7-Jonathan.Cameron@huawei.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>	2023-06-22 18:55:14 -04:00
Ira Weiny	ea9b6d647f	hw/cxl/events: Add injection of General Media Events To facilitate testing provide a QMP command to inject a general media event. The event can be added to the log specified. Signed-off-by: Ira Weiny <ira.weiny@intel.com> Reviewed-by: Fan Ni <fan.ni@samsung.com> Acked-by: Markus Armbruster <armbru@redhat.com> Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Message-Id: <20230530133603.16934-6-Jonathan.Cameron@huawei.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>	2023-06-22 18:55:14 -04:00
Jonathan Cameron	9547754f40	hw/cxl: QMP based poison injection support Inject poison using QMP command cxl-inject-poison to add an entry to the poison list. For now, the poison is not returned for CXL.mem reads, but only via the mailbox command Get Poison List. So a normal memory read to an address that is on the poison list will not yet result in a synchronous exception (and similarly for partial cacheline writes). That is left for a future patch. See CXL rev 3.0, sec 8.2.9.8.4.1 Get Poison list (Opcode 4300h) Kernel patches to use this interface here: https://lore.kernel.org/linux-cxl/cover.1665606782.git.alison.schofield@intel.com/ To inject poison using QMP (telnet to the QMP port) { "execute": "qmp_capabilities" } { "execute": "cxl-inject-poison", "arguments": { "path": "/machine/peripheral/cxl-pmem0", "start": 2048, "length": 256 } } Adjust the path to select a device on your machine. Note that the poison list supported is kept short enough to avoid the complexity of the state machine that is needed to handle the MORE flag. Reviewed-by: Fan Ni <fan.ni@samsung.com> Reviewed-by: Ira Weiny <ira.weiny@intel.com> Acked-by: Markus Armbruster <armbru@redhat.com> Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Message-Id: <20230526170010.574-3-Jonathan.Cameron@huawei.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>	2023-06-22 18:55:14 -04:00
Philippe Mathieu-Daudé	c7b64948f8	meson: Replace CONFIG_SOFTMMU -> CONFIG_SYSTEM_ONLY Since we might have user emulation with softmmu, use the clearer 'CONFIG_SYSTEM_ONLY' key to check for system emulation. Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Message-Id: <20230613133347.82210-9-philmd@linaro.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2023-06-20 10:01:30 +02:00
Steve Sistare	b0182e537e	exec/memory: Introduce RAM_NAMED_FILE flag migrate_ignore_shared() is an optimization that avoids copying memory that is visible and can be mapped on the target. However, a memory-backend-ram or a memory-backend-memfd block with the RAM_SHARED flag set is not migrated when migrate_ignore_shared() is true. This is wrong, because the block has no named backing store, and its contents will be lost. To fix, ignore shared memory iff it is a named file. Define a new flag RAM_NAMED_FILE to distinguish this case. Signed-off-by: Steve Sistare <steven.sistare@oracle.com> Reviewed-by: Peter Xu <peterx@redhat.com> Message-Id: <1686151116-253260-1-git-send-email-steven.sistare@oracle.com> Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>	2023-06-13 11:28:58 +02:00
Michael Tokarev	40b89515d0	spelling: information 3 trivial fixes: 2 .json comments which goes to executables, and 1 .h file comment. Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>	2023-06-09 23:38:16 +03:00
Jean-Louis Dupond	42a2890a76	qcow2: add discard-no-unref option When we for example have a sparse qcow2 image and discard: unmap is enabled, there can be a lot of fragmentation in the image after some time, especially on VMs that do a lot of writes/deletes. This causes the qcow2 image to grow even over 110% of its virtual size, because the free gaps in the image get too small to allocate new contiguous clusters. So it allocates new space at the end of the image. Disabling discard is not an option, as discard is needed to keep the incremental backup size as low as possible. Without discard, the incremental backups would become large, as qemu thinks it's just dirty blocks but it doesn't know the blocks are unneeded. So we need to avoid fragmentation but also 'empty' the unneeded blocks in the image to have a small incremental backup. In addition, we also want to send the discards further down the stack, so the underlying blocks are still discarded. Therefore we introduce a new qcow2 option "discard-no-unref". When setting this option to true, discards will no longer have the qcow2 driver relinquish cluster allocations. Other than that, the request is handled as normal: All clusters in range are marked as zero, and, if pass-discard-request is true, it is passed further down the stack. The only difference is that the now-zero clusters are preallocated instead of being unallocated. This will avoid fragmentation on the qcow2 image. Fixes: https://gitlab.com/qemu-project/qemu/-/issues/1621 Signed-off-by: Jean-Louis Dupond <jean-louis@dupond.be> Message-Id: <20230605084523.34134-2-jean-louis@dupond.be> Reviewed-by: Hanna Czenczek <hreitz@redhat.com> Signed-off-by: Hanna Czenczek <hreitz@redhat.com>	2023-06-05 13:15:42 +02:00
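As a rough illustration of how the option from the commit above could be requested, the following sketch builds a blockdev-add QMP message for a qcow2 node with discard enabled and "discard-no-unref" set. The helper name `qcow2_blockdev_add`, the node name and the image path are hypothetical; the option names are the ones the commit message mentions or standard blockdev options, and the exact set accepted may depend on the QEMU version.

```python
import json


def qcow2_blockdev_add(node_name, filename, no_unref=True):
    """Build a blockdev-add message for a qcow2 image on a local file."""
    return {
        "execute": "blockdev-add",
        "arguments": {
            "driver": "qcow2",
            "node-name": node_name,
            # Discards stay enabled so incremental backups remain small.
            "discard": "unmap",
            # Keep clusters allocated (as zero clusters) on discard, which is
            # what the commit describes as avoiding fragmentation.
            "discard-no-unref": no_unref,
            "file": {"driver": "file", "filename": filename},
        },
    }


# Hypothetical node name and path, for illustration only.
request = qcow2_blockdev_add("disk0", "/var/lib/images/guest.qcow2")
print(json.dumps(request, indent=2))
```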
Eric Blake	bd1386cce1	cutils: Adjust signature of parse_uint[_full] It's already confusing that we have two very similar functions for wrapping the parse of a 64-bit unsigned value, differing mainly on whether they permit leading '-'. Adjust the signature of parse_uint() and parse_uint_full() to be like all of qemu_strto(): put the result parameter last, use the same types (uint64_t and unsigned long long have the same width, but are not always the same type), and mark endptr const (this latter change only affects the rare caller of parse_uint). Adjust all callers in the tree. While at it, note that since cutils.c already includes: QEMU_BUILD_BUG_ON(sizeof(int64_t) != sizeof(long long)); we are guaranteed that the result of parse_uint cannot exceed UINT64_MAX (or the build would have failed), so we can drop pre-existing dead comparisons in opts-visitor.c that were never false. Reviewed-by: Hanna Czenczek <hreitz@redhat.com> Message-Id: <20230522190441.64278-8-eblake@redhat.com> [eblake: Drop dead code spotted by Markus] Signed-off-by: Eric Blake <eblake@redhat.com>	2023-06-02 12:27:19 -05:00


1676 Commits