qemu/target-i386
Longpeng(Mike) 14c985cffa target-i386: present virtual L3 cache info for vcpus
Some software algorithms are based on the hardware's cache info, for example,
for x86 linux kernel, when cpu1 want to wakeup a task on cpu2, cpu1 will trigger
a resched IPI and told cpu2 to do the wakeup if they don't share low level
cache. Oppositely, cpu1 will access cpu2's runqueue directly if they share llc.
The relevant linux-kernel code as bellow:

	static void ttwu_queue(struct task_struct *p, int cpu)
	{
		struct rq *rq = cpu_rq(cpu);
		......
		if (... && !cpus_share_cache(smp_processor_id(), cpu)) {
			......
			ttwu_queue_remote(p, cpu); /* will trigger RES IPI */
			return;
		}
		......
		ttwu_do_activate(rq, p, 0); /* access target's rq directly */
		......
	}

In real hardware, the cpus on the same socket share L3 cache, so one won't
trigger a resched IPIs when wakeup a task on others. But QEMU doesn't present a
virtual L3 cache info for VM, then the linux guest will trigger lots of RES IPIs
under some workloads even if the virtual cpus belongs to the same virtual socket.

For KVM, there will be lots of vmexit due to guest send IPIs.
The workload is a SAP HANA's testsuite, we run it one round(about 40 minuates)
and observe the (Suse11sp3)Guest's amounts of RES IPIs which triggering during
the period:
        No-L3           With-L3(applied this patch)
cpu0:	363890		44582
cpu1:	373405		43109
cpu2:	340783		43797
cpu3:	333854		43409
cpu4:	327170		40038
cpu5:	325491		39922
cpu6:	319129		42391
cpu7:	306480		41035
cpu8:	161139		32188
cpu9:	164649		31024
cpu10:	149823		30398
cpu11:	149823		32455
cpu12:	164830		35143
cpu13:	172269		35805
cpu14:	179979		33898
cpu15:	194505		32754
avg:	268963.6	40129.8

The VM's topology is "1*socket 8*cores 2*threads".
After present virtual L3 cache info for VM, the amounts of RES IPIs in guest
reduce 85%.

For KVM, vcpus send IPIs will cause vmexit which is expensive, so it can cause
severe performance degradation. We had tested the overall system performance if
vcpus actually run on sparate physical socket. With L3 cache, the performance
improves 7.2%~33.1%(avg:15.7%).

Signed-off-by: Longpeng(Mike) <longpeng2@huawei.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2016-09-09 20:58:34 +03:00
..
arch_dump.c x86: Clean up includes 2016-01-29 15:07:22 +00:00
arch_memory_mapping.c x86: Clean up includes 2016-01-29 15:07:22 +00:00
bpt_helper.c cpu-exec: Rename cpu_resume_from_signal() to cpu_loop_exit_noexc() 2016-06-09 15:55:02 +01:00
cc_helper_template.h target-i386: Implement BLSR, BLSMSK, BLSI 2013-02-18 15:52:05 -08:00
cc_helper.c target-i386: Perform set/reset_inhibit_irq inline 2016-02-13 07:59:59 +11:00
cpu-qom.h target-i386: make cpu-qom.h not target specific 2016-05-19 13:08:04 +02:00
cpu.c target-i386: present virtual L3 cache info for vcpus 2016-09-09 20:58:34 +03:00
cpu.h target-i386: present virtual L3 cache info for vcpus 2016-09-09 20:58:34 +03:00
excp_helper.c cpu: move exec-all.h inclusion out of cpu.h 2016-05-19 16:42:29 +02:00
fpu_helper.c coccinelle: Remove unnecessary variables for function return value 2016-06-20 16:38:13 +02:00
gdbstub.c qemu-common: push cpu.h inclusion out of qemu-common.h 2016-05-19 16:42:29 +02:00
helper.c target-i386: Move user-mode exception actions out of user-exec.c 2016-06-09 15:55:02 +01:00
helper.h target-i386: implement PKE for TCG 2016-03-24 14:01:08 +01:00
hyperv.c event-notifier: Add "is_external" parameter 2016-04-22 16:43:56 +02:00
hyperv.h Clean up header guards that don't match their file name 2016-07-12 16:19:16 +02:00
int_helper.c cpu: move exec-all.h inclusion out of cpu.h 2016-05-19 16:42:29 +02:00
kvm_i386.h kvm: x86: add support for KVM_CAP_SPLIT_IRQCHIP 2015-12-17 17:33:47 +01:00
kvm-stub.c qemu-common: push cpu.h inclusion out of qemu-common.h 2016-05-19 16:42:29 +02:00
kvm.c target-i386: kvm: Report kvm_pv_unhalt as unsupported w/o kernel_irqchip 2016-08-16 08:49:53 -03:00
machine.c target-i386: kvm: Add basic Intel LMCE support 2016-07-07 15:25:16 -03:00
Makefile.objs target-i386: Enable control registers for MPX 2016-02-13 07:59:59 +11:00
mem_helper.c Fix confusing argument names in some common functions 2016-07-12 13:06:08 +01:00
misc_helper.c cpu: move exec-all.h inclusion out of cpu.h 2016-05-19 16:42:29 +02:00
monitor.c x86: Clean up includes 2016-01-29 15:07:22 +00:00
mpx_helper.c cpu: move exec-all.h inclusion out of cpu.h 2016-05-19 16:42:29 +02:00
ops_sse_header.h target-i386: Rename struct XMMReg to ZMMReg 2016-01-21 12:47:15 -02:00
ops_sse.h target-i386: Rename XMM_[BWLSDQ] helpers to ZMM_* 2016-01-21 12:47:16 -02:00
seg_helper.c target-i386: Add comment about do_interrupt_user() next_eip argument 2016-06-09 15:55:02 +01:00
shift_helper_template.h target-i386: compute eflags outside rcl/rcr helper 2013-02-18 15:03:56 -08:00
smm_helper.c target-i386: Enable control registers for MPX 2016-02-13 07:59:59 +11:00
svm_helper.c cpu: move exec-all.h inclusion out of cpu.h 2016-05-19 16:42:29 +02:00
svm.h Clean up ill-advised or unusual header guards 2016-07-12 16:20:46 +02:00
TODO target-i386: fix {min,max}{pd,ps,sd,ss} SSE2 instructions 2012-01-11 09:55:28 +01:00
trace-events trace-events: fix first line comment in trace-events 2016-08-12 10:36:01 +01:00
translate.c target-i386: fix typo in xsetbv implementation 2016-08-02 12:03:58 +02:00