qemu/include/sysemu
Anthony Harivel 0418f90809 Add support for RAPL MSRs in KVM/Qemu
Starting with the "Sandy Bridge" generation, Intel CPUs provide a RAPL
interface (Running Average Power Limit) for advertising the accumulated
energy consumption of various power domains (e.g. CPU packages, DRAM,
etc.).

The consumption is reported via MSRs (model specific registers) like
MSR_PKG_ENERGY_STATUS for the CPU package power domain. These MSRs are
64 bits registers that represent the accumulated energy consumption in
micro Joules. They are updated by microcode every ~1ms.

For now, KVM always returns 0 when the guest requests the value of
these MSRs. Use the KVM MSR filtering mechanism to allow QEMU handle
these MSRs dynamically in userspace.

To limit the amount of system calls for every MSR call, create a new
thread in QEMU that updates the "virtual" MSR values asynchronously.

Each vCPU has its own vMSR to reflect the independence of vCPUs. The
thread updates the vMSR values with the ratio of energy consumed of
the whole physical CPU package the vCPU thread runs on and the
thread's utime and stime values.

All other non-vCPU threads are also taken into account. Their energy
consumption is evenly distributed among all vCPUs threads running on
the same physical CPU package.

To overcome the problem that reading the RAPL MSR requires priviliged
access, a socket communication between QEMU and the qemu-vmsr-helper is
mandatory. You can specified the socket path in the parameter.

This feature is activated with -accel kvm,rapl=true,path=/path/sock.sock

Actual limitation:
- Works only on Intel host CPU because AMD CPUs are using different MSR
  adresses.

- Only the Package Power-Plane (MSR_PKG_ENERGY_STATUS) is reported at
  the moment.

Signed-off-by: Anthony Harivel <aharivel@redhat.com>
Link: https://lore.kernel.org/r/20240522153453.1230389-4-aharivel@redhat.com
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2024-07-22 19:19:37 +02:00
..
accel-blocker.h bulk: Do not declare function prototypes using 'extern' keyword 2023-08-31 19:47:43 +02:00
accel-ops.h sysemu: add set_virtual_time to accel ops 2024-06-24 10:14:34 +01:00
arch_init.h target/nios2: Remove the deprecated Nios II target 2024-04-24 16:03:38 +02:00
balloon.h qapi: Restrict balloon-related commands to machine code 2020-09-29 15:41:35 +02:00
block-backend-common.h block: drain from main loop thread in bdrv_co_yield_to_drain() 2023-05-30 17:32:02 +02:00
block-backend-global-state.h block: Mark bdrv_first_blk() and bdrv_is_root_node() GRAPH_RDLOCK 2023-10-12 16:31:33 +02:00
block-backend-io.h util/defer-call: move defer_call() to util/ 2023-10-31 15:41:42 +01:00
block-backend.h include/sysemu/block-backend: split header into I/O and global state (GS) API 2022-03-04 18:18:25 +01:00
block-ram-registrar.h block: add BlockRAMRegistrar 2022-10-26 14:56:42 -04:00
blockdev.h include/sysemu/blockdev.h: global state API 2022-03-04 18:18:25 +01:00
cpu-throttle.h cpu-throttle: new module, extracted from cpus.c 2020-07-10 18:04:49 -04:00
cpu-timers-internal.h system: Rename softmmu/ directory as system/ 2023-10-08 21:08:08 +02:00
cpu-timers.h sysemu: add set_virtual_time to accel ops 2024-06-24 10:14:34 +01:00
cpus.h cpus: Remove unused smp_cores/smp_threads declarations 2023-10-12 00:37:39 +03:00
cryptodev-vhost-user.h cryptodev: Fix Lesser GPL version number 2020-10-27 16:48:49 +01:00
cryptodev-vhost.h include/: spelling fixes 2023-09-08 13:08:52 +03:00
cryptodev.h include/: spelling fixes 2023-09-08 13:08:52 +03:00
device_tree.h kconfig: allow compiling out QEMU device tree code per target 2024-05-10 15:45:15 +02:00
dirtylimit.h migration: Extend query-migrate to provide dirty page limit info 2023-07-26 10:55:56 +02:00
dirtyrate.h include: Include headers where needed 2023-01-08 01:54:22 -05:00
dma.h hw/dma: Let dma_buf_read() / dma_buf_write() propagate MemTxResult 2022-01-18 12:56:29 +01:00
dump-arch.h dump: Add arch cleanup function 2023-11-14 10:42:32 +01:00
dump.h dump: Allow directly outputting raw kdump format 2023-11-02 18:05:02 +04:00
event-loop-base.h Don't include headers already included by qemu/osdep.h 2023-02-08 07:28:05 +01:00
host_iommu_device.h HostIOMMUDevice: Introduce get_page_size_mask() callback 2024-07-09 11:50:37 +02:00
hostmem.h backends/hostmem: Report error when memory size is unaligned 2024-06-08 10:33:38 +02:00
hvf_int.h hvf: Makes assert_hvf_ok report failed expression 2024-06-08 10:33:38 +02:00
hvf.h exec: Rename NEED_CPU_H -> COMPILING_PER_TARGET 2024-04-26 09:49:51 +02:00
hw_accel.h accel: Remove HAX accelerator 2023-08-31 19:46:43 +02:00
iommufd.h backends/iommufd: Introduce helper function iommufd_backend_get_device_info() 2024-06-24 23:15:30 +02:00
iothread.h include/: spelling fixes 2023-09-08 13:08:52 +03:00
kvm_int.h Add support for RAPL MSRs in KVM/Qemu 2024-07-22 19:19:37 +02:00
kvm_xen.h hw/xen: select kernel mode for per-vCPU event channel upcall vector 2023-11-06 10:03:45 +00:00
kvm.h kvm: move target-dependent interrupt routing out of kvm-all.c 2024-05-03 15:47:48 +02:00
memory_mapping.h memory: follow Error API guidelines 2023-10-19 23:13:27 +02:00
numa.h numa: remove types from typedefs.h 2024-05-03 15:47:48 +02:00
nvmm.h exec: Rename NEED_CPU_H -> COMPILING_PER_TARGET 2024-04-26 09:49:51 +02:00
os-posix.h qemu_init: increase NOFILE soft limit on POSIX 2024-02-09 12:47:58 +00:00
os-win32.h qemu_init: increase NOFILE soft limit on POSIX 2024-02-09 12:47:58 +00:00
qtest.h qtest: move qtest_{get, set}_virtual_clock to accel/qtest/qtest.c 2024-06-24 10:14:56 +01:00
replay.h system/replay: Restrict icount to system emulation 2024-01-19 12:28:59 +01:00
reset.h hw/core/reset: Implement qemu_register_reset via qemu_register_resettable 2024-02-27 13:01:42 +00:00
rng-random.h Use OBJECT_DECLARE_SIMPLE_TYPE when possible 2020-09-18 14:12:32 -04:00
rng.h qom: Remove module_obj_name parameter from OBJECT_DECLARE* macros 2020-09-18 14:12:32 -04:00
rtc.h rtc: Use time_t for passing and returning time offsets 2023-08-31 09:45:18 +01:00
runstate-action.h system: Rename softmmu/ directory as system/ 2023-10-08 21:08:08 +02:00
runstate.h hw/misc/pvpanic: add support for normal shutdowns 2024-07-01 17:16:04 -04:00
seccomp.h sandbox: disable -sandbox if CONFIG_SECCOMP undefined 2018-06-01 13:44:15 +02:00
stats.h include/: spelling fixes 2023-09-08 13:08:52 +03:00
sysemu.h stubs: remove obsolete stubs 2024-04-18 11:17:27 +02:00
tcg.h accel: Document generic accelerator headers 2023-06-28 13:55:35 +02:00
tpm_backend.h include/: spelling fixes 2023-09-08 13:08:52 +03:00
tpm_util.h tpm: Fix Lesser GPL version number 2020-11-15 16:44:18 +01:00
tpm.h sysemu/tpm: Clean up global variable shadowing 2023-10-06 13:27:48 +02:00
vhost-user-backend.h qom: Remove module_obj_name parameter from OBJECT_DECLARE* macros 2020-09-18 14:12:32 -04:00
watchdog.h watchdog: remove -watchdog option 2022-09-29 11:40:28 +02:00
whpx.h exec: Rename NEED_CPU_H -> COMPILING_PER_TARGET 2024-04-26 09:49:51 +02:00
xen-mapcache.h xen: mapcache: Pass the ram_addr offset to xen_map_cache() 2024-06-09 20:16:14 +02:00
xen.h xen: mapcache: Add support for grant mappings 2024-06-09 20:16:14 +02:00