83bcae9820
The "Memory Proximity Domain Attributes" structure of the ACPI HMAT has a "Processor Proximity Domain Valid" flag that is currently always set because Qemu -numa requires an initiator=X value when hmat=on. Unsetting this flag allows to create more complex memory topologies by having multiple best initiators for a single memory target. This patch allows -numa without initiator=X when hmat=on by keeping the default value MAX_NODES in numa_state->nodes[i].initiator. All places reading numa_state->nodes[i].initiator already check whether it's different from MAX_NODES before using it. Tested with qemu-system-x86_64 -accel kvm \ -machine pc,hmat=on \ -drive if=pflash,format=raw,file=./OVMF.fd \ -drive media=disk,format=qcow2,file=efi.qcow2 \ -smp 4 \ -m 3G \ -object memory-backend-ram,size=1G,id=ram0 \ -object memory-backend-ram,size=1G,id=ram1 \ -object memory-backend-ram,size=1G,id=ram2 \ -numa node,nodeid=0,memdev=ram0,cpus=0-1 \ -numa node,nodeid=1,memdev=ram1,cpus=2-3 \ -numa node,nodeid=2,memdev=ram2 \ -numa hmat-lb,initiator=0,target=0,hierarchy=memory,data-type=access-latency,latency=10 \ -numa hmat-lb,initiator=0,target=0,hierarchy=memory,data-type=access-bandwidth,bandwidth=10485760 \ -numa hmat-lb,initiator=0,target=1,hierarchy=memory,data-type=access-latency,latency=20 \ -numa hmat-lb,initiator=0,target=1,hierarchy=memory,data-type=access-bandwidth,bandwidth=5242880 \ -numa hmat-lb,initiator=0,target=2,hierarchy=memory,data-type=access-latency,latency=30 \ -numa hmat-lb,initiator=0,target=2,hierarchy=memory,data-type=access-bandwidth,bandwidth=1048576 \ -numa hmat-lb,initiator=1,target=0,hierarchy=memory,data-type=access-latency,latency=20 \ -numa hmat-lb,initiator=1,target=0,hierarchy=memory,data-type=access-bandwidth,bandwidth=5242880 \ -numa hmat-lb,initiator=1,target=1,hierarchy=memory,data-type=access-latency,latency=10 \ -numa hmat-lb,initiator=1,target=1,hierarchy=memory,data-type=access-bandwidth,bandwidth=10485760 \ -numa hmat-lb,initiator=1,target=2,hierarchy=memory,data-type=access-latency,latency=30 \ -numa hmat-lb,initiator=1,target=2,hierarchy=memory,data-type=access-bandwidth,bandwidth=1048576 which reports NUMA node2 at same distance from both node0 and node1 as seen in lstopo: Machine (2966MB total) + Package P#0 NUMANode P#2 (979MB) Group0 NUMANode P#0 (980MB) Core P#0 + PU P#0 Core P#1 + PU P#1 Group0 NUMANode P#1 (1007MB) Core P#2 + PU P#2 Core P#3 + PU P#3 Before this patch, we had to add ",initiator=X" to "-numa node,nodeid=2,memdev=ram2". The lstopo output difference between initiator=1 and no initiator is: @@ -1,10 +1,10 @@ Machine (2966MB total) + Package P#0 + NUMANode P#2 (979MB) Group0 NUMANode P#0 (980MB) Core P#0 + PU P#0 Core P#1 + PU P#1 Group0 NUMANode P#1 (1007MB) - NUMANode P#2 (979MB) Core P#2 + PU P#2 Core P#3 + PU P#3 Corresponding changes in the HMAT MPDA structure: @@ -49,10 +49,10 @@ [078h 0120 2] Structure Type : 0000 [Memory Proximity Domain Attributes] [07Ah 0122 2] Reserved : 0000 [07Ch 0124 4] Length : 00000028 -[080h 0128 2] Flags (decoded below) : 0001 - Processor Proximity Domain Valid : 1 +[080h 0128 2] Flags (decoded below) : 0000 + Processor Proximity Domain Valid : 0 [082h 0130 2] Reserved1 : 0000 -[084h 0132 4] Attached Initiator Proximity Domain : 00000001 +[084h 0132 4] Attached Initiator Proximity Domain : 00000080 [088h 0136 4] Memory Proximity Domain : 00000002 [08Ch 0140 4] Reserved2 : 00000000 [090h 0144 8] Reserved3 : 0000000000000000 Final HMAT SLLB structures: [0A0h 0160 2] Structure Type : 0001 [System Locality Latency and Bandwidth Information] [0A2h 0162 2] Reserved : 0000 [0A4h 0164 4] Length : 00000040 [0A8h 0168 1] Flags (decoded below) : 00 Memory Hierarchy : 0 [0A9h 0169 1] Data Type : 00 [0AAh 0170 2] Reserved1 : 0000 [0ACh 0172 4] Initiator Proximity Domains # : 00000002 [0B0h 0176 4] Target Proximity Domains # : 00000003 [0B4h 0180 4] Reserved2 : 00000000 [0B8h 0184 8] Entry Base Unit : 0000000000002710 [0C0h 0192 4] Initiator Proximity Domain List : 00000000 [0C4h 0196 4] Initiator Proximity Domain List : 00000001 [0C8h 0200 4] Target Proximity Domain List : 00000000 [0CCh 0204 4] Target Proximity Domain List : 00000001 [0D0h 0208 4] Target Proximity Domain List : 00000002 [0D4h 0212 2] Entry : 0001 [0D6h 0214 2] Entry : 0002 [0D8h 0216 2] Entry : 0003 [0DAh 0218 2] Entry : 0002 [0DCh 0220 2] Entry : 0001 [0DEh 0222 2] Entry : 0003 [0E0h 0224 2] Structure Type : 0001 [System Locality Latency and Bandwidth Information] [0E2h 0226 2] Reserved : 0000 [0E4h 0228 4] Length : 00000040 [0E8h 0232 1] Flags (decoded below) : 00 Memory Hierarchy : 0 [0E9h 0233 1] Data Type : 03 [0EAh 0234 2] Reserved1 : 0000 [0ECh 0236 4] Initiator Proximity Domains # : 00000002 [0F0h 0240 4] Target Proximity Domains # : 00000003 [0F4h 0244 4] Reserved2 : 00000000 [0F8h 0248 8] Entry Base Unit : 0000000000000001 [100h 0256 4] Initiator Proximity Domain List : 00000000 [104h 0260 4] Initiator Proximity Domain List : 00000001 [108h 0264 4] Target Proximity Domain List : 00000000 [10Ch 0268 4] Target Proximity Domain List : 00000001 [110h 0272 4] Target Proximity Domain List : 00000002 [114h 0276 2] Entry : 000A [116h 0278 2] Entry : 0005 [118h 0280 2] Entry : 0001 [11Ah 0282 2] Entry : 0005 [11Ch 0284 2] Entry : 000A [11Eh 0286 2] Entry : 0001 Signed-off-by: Brice Goglin <Brice.Goglin@inria.fr> Signed-off-by: Hesham Almatary <hesham.almatary@huawei.com> Reviewed-by: Jingqi Liu <jingqi.liu@intel.com> Message-Id: <20221027100037.251-2-hesham.almatary@huawei.com> Tested-by: Yicong Yang <yangyicong@hisilicon.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> |
||
---|---|---|
.. | ||
bus.c | ||
clock-vmstate.c | ||
clock.c | ||
cpu-common.c | ||
cpu-sysemu.c | ||
fw-path-provider.c | ||
generic-loader.c | ||
gpio.c | ||
guest-loader.c | ||
guest-loader.h | ||
hotplug-stubs.c | ||
hotplug.c | ||
irq.c | ||
Kconfig | ||
loader-fit.c | ||
loader.c | ||
machine-hmp-cmds.c | ||
machine-qmp-cmds.c | ||
machine-smp.c | ||
machine.c | ||
meson.build | ||
nmi.c | ||
null-machine.c | ||
numa.c | ||
or-irq.c | ||
platform-bus.c | ||
ptimer.c | ||
qdev-clock.c | ||
qdev-fw.c | ||
qdev-hotplug.c | ||
qdev-prop-internal.h | ||
qdev-properties-system.c | ||
qdev-properties.c | ||
qdev.c | ||
register.c | ||
reset.c | ||
resettable.c | ||
split-irq.c | ||
stream.c | ||
sysbus-fdt.c | ||
sysbus.c | ||
trace-events | ||
trace.h | ||
uboot_image.h | ||
vm-change-state-handler.c | ||
vmstate-if.c |