Serialize rdtsc with lfence, mfence or cpuid to read the TSC more precisely.

x86/x86/tsc.c rev. 1.67 reduced the cache problem and gave a big improvement,
but there is still room for more. I measured the effect of lfence, mfence,
cpuid and rdtscp. The impact on TSC skew and/or drift is:

  AMD:   mfence > rdtscp > cpuid > lfence-serialize > lfence = nomodify
  Intel: lfence > rdtscp > cpuid > nomodify

So mfence is the best on AMD and lfence is the best on Intel. If the CPU has
no SSE2, we can use cpuid.

NOTE:
- An AMD document says the DE_CFG_LFENCE_SERIALIZE bit can be used for
  serializing, but it's not so good.
- On Intel i386 (not amd64), the improvement seems to be very small.
- The rdtscp instruction can be used as a serializing instruction + rdtsc,
  but it's not as good as [lm]fence. Both the Intel and AMD documents say
  that the latency of rdtscp is larger than that of rdtsc, so I suspect the
  difference in the results comes from that.
2020-06-15 12:09:23 +03:00
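The three serialization strategies named in the commit message can be sketched in user-space C with GCC/Clang extended inline assembly. This is only an illustration under stated assumptions: it targets x86 (i386 or amd64), the names `rdtsc_lfence`, `rdtsc_mfence` and `rdtsc_cpuid` are hypothetical, and it is not the NetBSD implementation behind `cpu_counter_lfence()` and friends.

```c
#include <stdint.h>

/*
 * Hypothetical sketch, x86 only: each helper issues one serializing or
 * fencing instruction immediately before rdtsc, so that the counter is
 * not read ahead of earlier instructions.
 */

/* lfence + rdtsc: the variant the commit message found best on Intel. */
static inline uint64_t
rdtsc_lfence(void)
{
	uint32_t lo, hi;

	__asm__ volatile("lfence\n\trdtsc"
	    : "=a"(lo), "=d"(hi) : : "memory");
	return ((uint64_t)hi << 32) | lo;
}

/* mfence + rdtsc: the variant the commit message found best on AMD. */
static inline uint64_t
rdtsc_mfence(void)
{
	uint32_t lo, hi;

	__asm__ volatile("mfence\n\trdtsc"
	    : "=a"(lo), "=d"(hi) : : "memory");
	return ((uint64_t)hi << 32) | lo;
}

/* cpuid + rdtsc: usable on CPUs without SSE2 (no lfence/mfence). */
static inline uint64_t
rdtsc_cpuid(void)
{
	uint32_t lo, hi;

	/* cpuid leaf 0 overwrites eax, ebx, ecx and edx. */
	__asm__ volatile("xorl %%eax, %%eax\n\tcpuid\n\trdtsc"
	    : "=a"(lo), "=d"(hi) : : "ebx", "ecx", "memory");
	return ((uint64_t)hi << 32) | lo;
}
```

Calling any of these twice on the same CPU and subtracting the results gives an elapsed cycle count. The relative quality of the variants is whatever the measurements in the commit message report; the sketch itself makes no claim about it.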
/* $NetBSD: cpu_counter.h,v 1.7 2020/06/15 09:09:23 msaitoh Exp $ */

/*-
 * Copyright (c) 2000, 2008 The NetBSD Foundation, Inc.
 * All rights reserved.
 *
 * This code is derived from software contributed to The NetBSD Foundation
 * by Bill Sommerfeld.
 *
 * Redistribution and use in source and binary forms, with or without
 * modification, are permitted provided that the following conditions
 * are met:
 * 1. Redistributions of source code must retain the above copyright
 *    notice, this list of conditions and the following disclaimer.
 * 2. Redistributions in binary form must reproduce the above copyright
 *    notice, this list of conditions and the following disclaimer in the
 *    documentation and/or other materials provided with the distribution.
 *
 * THIS SOFTWARE IS PROVIDED BY THE NETBSD FOUNDATION, INC. AND CONTRIBUTORS
 * ``AS IS'' AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED
 * TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR
 * PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE FOUNDATION OR CONTRIBUTORS
 * BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR
 * CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF
 * SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS
 * INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN
 * CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE)
 * ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE
 * POSSIBILITY OF SUCH DAMAGE.
 */

#ifndef _X86_CPU_COUNTER_H_
#define _X86_CPU_COUNTER_H_

#ifdef _KERNEL

#include <sys/lwp.h>

extern uint64_t cpu_frequency(struct cpu_info *);
extern int cpu_hascounter(void);
extern uint64_t (*cpu_counter)(void);
extern uint32_t (*cpu_counter32)(void);

extern uint64_t cpu_counter_cpuid(void);
extern uint64_t cpu_counter_lfence(void);
extern uint64_t cpu_counter_mfence(void);
extern uint32_t cpu_counter32_cpuid(void);
extern uint32_t cpu_counter32_lfence(void);
extern uint32_t cpu_counter32_mfence(void);

#endif /* _KERNEL */

#endif /* !_X86_CPU_COUNTER_H_ */
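The header exposes `cpu_counter` and `cpu_counter32` as function pointers alongside `_cpuid`, `_lfence` and `_mfence` variants, which suggests that one variant is chosen at boot. A minimal sketch of the selection rule stated in the commit message (mfence on AMD, lfence on Intel, cpuid without SSE2) might look like the following. The enum names and `tsc_pick_serialize()` are hypothetical, and the cpuid fallback for other vendors is an assumption that the commit message does not cover.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical names, for illustration only. */
enum tsc_vendor { TSC_VENDOR_INTEL, TSC_VENDOR_AMD, TSC_VENDOR_OTHER };
enum tsc_serialize { TSC_SER_CPUID, TSC_SER_LFENCE, TSC_SER_MFENCE };

/*
 * Pick the instruction used to serialize rdtsc, following the ranking
 * reported in the commit message.
 */
static enum tsc_serialize
tsc_pick_serialize(enum tsc_vendor vendor, bool has_sse2)
{
	/* lfence and mfence were introduced with SSE2. */
	if (!has_sse2)
		return TSC_SER_CPUID;

	switch (vendor) {
	case TSC_VENDOR_AMD:
		return TSC_SER_MFENCE;	/* best on AMD */
	case TSC_VENDOR_INTEL:
		return TSC_SER_LFENCE;	/* best on Intel */
	default:
		/* Assumption: cpuid as a conservative default elsewhere. */
		return TSC_SER_CPUID;
	}
}
```

A caller would then point `cpu_counter` and `cpu_counter32` at the matching pair, for example `cpu_counter_mfence` and `cpu_counter32_mfence` when the result is `TSC_SER_MFENCE`.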