qemu/hw
Yanan Wang 864c3b5c32 hw/core/machine: Introduce CPU cluster topology support
The new Cluster-Aware Scheduling support has landed in Linux 5.16,
which has been proved to benefit the scheduling performance (e.g.
load balance and wake_affine strategy) on both x86_64 and AArch64.

So now in Linux 5.16 we have four-level arch-neutral CPU topology
definition like below and a new scheduler level for clusters.
struct cpu_topology {
    int thread_id;
    int core_id;
    int cluster_id;
    int package_id;
    int llc_id;
    cpumask_t thread_sibling;
    cpumask_t core_sibling;
    cpumask_t cluster_sibling;
    cpumask_t llc_sibling;
}

A cluster generally means a group of CPU cores which share L2 cache
or other mid-level resources, and it is the shared resources that
is used to improve scheduler's behavior. From the point of view of
the size range, it's between CPU die and CPU core. For example, on
some ARM64 Kunpeng servers, we have 6 clusters in each NUMA node,
and 4 CPU cores in each cluster. The 4 CPU cores share a separate
L2 cache and a L3 cache tag, which brings cache affinity advantage.

In virtualization, on the Hosts which have pClusters (physical
clusters), if we can design a vCPU topology with cluster level for
guest kernel and have a dedicated vCPU pinning. A Cluster-Aware
Guest kernel can also make use of the cache affinity of CPU clusters
to gain similar scheduling performance.

This patch adds infrastructure for CPU cluster level topology
configuration and parsing, so that the user can specify cluster
parameter if their machines support it.

Signed-off-by: Yanan Wang <wangyanan55@huawei.com>
Message-Id: <20211228092221.21068-3-wangyanan55@huawei.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
[PMD: Added '(since 7.0)' to @clusters in qapi/machine.json]
Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com>
2021-12-31 13:42:39 +01:00
..
9pfs 9pfs: use P9Array in v9fs_walk() 2021-10-27 14:45:22 +02:00
acpi failover: fix unplug pending detection 2021-11-28 17:03:52 -05:00
adc hw/adc: Add basic Aspeed ADC model 2021-10-12 08:20:08 +02:00
alpha hw/alpha: Provide a PCI-ISA bridge device node 2021-06-28 07:27:32 -07:00
arm dma: Let dma_memory_read/write() take MemTxAttrs argument 2021-12-30 17:16:32 +01:00
audio pci: Let ld*_pci_dma() propagate MemTxResult 2021-12-31 01:05:27 +01:00
avr hw/avr: Realize AVRCPU qdev object using qdev_realize() 2021-12-17 10:43:24 +01:00
block virtio-blk: Fix clean up of host notifiers for single MR transaction. 2021-12-06 14:21:14 +00:00
char Fix STM32F2XX USART data register readout 2021-12-15 10:11:34 +00:00
core hw/core/machine: Introduce CPU cluster topology support 2021-12-31 13:42:39 +01:00
cpu cpu/core: Fix "help" of CPU core device types 2021-04-09 16:05:16 -04:00
cris Do not include exec/address-spaces.h if it's not really necessary 2021-05-02 17:24:51 +02:00
display dma: Let dma_memory_map() take MemTxAttrs argument 2021-12-30 17:16:32 +01:00
dma dma: Let dma_memory_read/write() take MemTxAttrs argument 2021-12-30 17:16:32 +01:00
gpio hw: aspeed_gpio: Fix GPIO array indexing 2021-10-12 08:20:08 +02:00
hppa docs: fix references to docs/devel/tracing.rst 2021-06-02 06:51:09 +02:00
hyperv dma: Let dma_memory_map() take MemTxAttrs argument 2021-12-30 17:16:32 +01:00
i2c aspeed/i2c: QOMify AspeedI2CBus 2021-10-12 08:20:08 +02:00
i386 dma: Let dma_memory_read/write() take MemTxAttrs argument 2021-12-30 17:16:32 +01:00
ide dma: Let dma_buf_read() take MemTxAttrs argument 2021-12-31 01:05:27 +01:00
input hw/input/lasips2: Fix typos in function names 2021-10-31 21:05:40 +01:00
intc dma: Let ld*_dma() propagate MemTxResult 2021-12-31 01:05:27 +01:00
ipack qbus: Rename qbus_create_inplace() to qbus_init() 2021-09-30 13:42:10 +01:00
ipmi ipmi/sim: fix watchdog_expired data type error in IPMIBmcSim struct 2021-07-08 14:15:01 -05:00
isa vt82c686: Add a method to VIA_ISA to raise ISA interrupts 2021-10-18 00:41:36 +02:00
m68k m68k pull request 20211109 2021-11-09 13:16:56 +01:00
mem hw/mem/pc-dimm: Restrict NUMA-specific code to NUMA machines 2021-11-11 03:13:05 -05:00
microblaze hw/microblaze: Replace drive_get_next() by drive_get() 2021-12-15 08:38:16 +01:00
mips hw/mips/boston: Fix load_elf() error detection 2021-12-06 11:57:36 +01:00
misc dma: Let dma_memory_read/write() take MemTxAttrs argument 2021-12-30 17:16:32 +01:00
net pci: Let ld*_pci_dma() propagate MemTxResult 2021-12-31 01:05:27 +01:00
nios2 Do not include cpu.h if it's not really necessary 2021-05-02 17:24:51 +02:00
nubus qbus: Rename qbus_create_inplace() to qbus_init() 2021-09-30 13:42:10 +01:00
nvme dma: Let dma_buf_read() take MemTxAttrs argument 2021-12-31 01:05:27 +01:00
nvram dma: Let st*_dma() take MemTxAttrs argument 2021-12-31 01:05:27 +01:00
openrisc Do not include exec/address-spaces.h if it's not really necessary 2021-05-02 17:24:51 +02:00
pci Fix bad overflow check in hw/pci/pcie.c 2021-11-29 08:49:36 -05:00
pci-bridge qdev: Make DeviceState.id independent of QemuOpts 2021-10-15 16:06:35 +02:00
pci-host dma: Let dma_memory_read/write() take MemTxAttrs argument 2021-12-30 17:16:32 +01:00
pcmcia hw/pcmcia: Do not register PCMCIA type if not required 2021-05-02 17:24:50 +02:00
ppc ppc/pnv: Use QOM hierarchy to scan PEC PHB4 devices 2021-12-17 17:57:19 +01:00
rdma qapi: introduce x-query-rdma QMP command 2021-11-02 15:55:14 +00:00
remote hw/remote/proxy: Categorize Wireless devices as 'Network' ones 2021-10-04 09:47:26 +02:00
riscv hw/riscv: Use load address rather than entry point for fw_dynamic next_addr 2021-12-20 14:53:31 +10:00
rtc hw/rtc/pl031: Send RTC_CHANGE QMP event 2021-11-15 18:53:00 +00:00
rx hw/rx/rx-gdbsim: Do not accept invalid memory size 2021-05-03 10:07:41 +02:00
s390x s390x/pci: add supported DT information to clp response 2021-12-17 09:12:37 +01:00
scsi pci: Let ld*_pci_dma() propagate MemTxResult 2021-12-31 01:05:27 +01:00
sd dma: Let dma_memory_read/write() take MemTxAttrs argument 2021-12-30 17:16:32 +01:00
sensor hw/misc: Add Infineon DPS310 sensor model 2021-09-20 08:50:59 +02:00
sh4 hw/intc/sh_intc: Inline and drop sh_intc_source() function 2021-10-30 18:39:37 +02:00
smbios hw/smbios: support for type 41 (onboard devices extended information) 2021-05-14 10:26:18 -04:00
sparc sun4m: fix setting CPU id when more than one CPU is present 2021-09-08 11:09:45 +01:00
sparc64 hw: Replace trivial drive_get_next() by drive_get() 2021-12-15 08:38:16 +01:00
ssi aspeed/smc: Use a container for the flash mmio address space 2021-10-22 09:52:17 +02:00
timer hw/timer/sh_timer: Remove use of hw_error 2021-10-30 18:39:37 +02:00
tpm tpm: mark correct memory region range dirty when clearing RAM 2021-10-02 08:43:21 +02:00
tricore hw/tricore: fix inclusion of tricore_testboard 2021-07-20 20:10:21 +02:00
usb pci: Let ld*_pci_dma() take MemTxAttrs argument 2021-12-31 01:05:27 +01:00
vfio vfio: Fix memory leak of hostwin 2021-11-17 11:25:55 -07:00
virtio dma: Let dma_memory_map() take MemTxAttrs argument 2021-12-30 17:16:32 +01:00
watchdog watchdog: remove select_watchdog_action 2021-11-02 15:57:27 +01:00
xen pci: Export pci_for_each_device_under_bus*() 2021-11-01 19:36:11 -04:00
xenpv meson: convert hw/arch* 2020-08-21 06:30:33 -04:00
xtensa Do not include exec/address-spaces.h if it's not really necessary 2021-05-02 17:24:51 +02:00
Kconfig hw/arm: xlnx-zcu102: Add Xilinx eFUSE device 2021-09-30 13:42:10 +01:00
meson.build sensor: Move hardware sensors from misc to a sensor directory 2021-06-17 07:10:32 -05:00