__cpu_simple_unlock(): add a note about memory ordering and why this is
correct, contrary to Intel's documentation.
commit f48eb2511b
parent 711e88f6f4
@@ -1,11 +1,11 @@
-/*	$NetBSD: lock.h,v 1.11 2005/12/28 19:09:30 perry Exp $	*/
+/*	$NetBSD: lock.h,v 1.12 2006/12/18 07:34:42 ad Exp $	*/
 
 /*-
- * Copyright (c) 2000 The NetBSD Foundation, Inc.
+ * Copyright (c) 2000, 2006 The NetBSD Foundation, Inc.
  * All rights reserved.
  *
  * This code is derived from software contributed to The NetBSD Foundation
- * by Jason R. Thorpe.
+ * by Jason R. Thorpe and Andrew Doran.
  *
  * Redistribution and use in source and binary forms, with or without
  * modification, are permitted provided that the following conditions
@@ -101,6 +101,58 @@ __cpu_simple_lock_try(__cpu_simple_lock_t *lockp)
 	return (r);
 }
 
+/*
+ * Note on x86 memory ordering
+ *
+ * When releasing a lock we must ensure that no stores or loads from within
+ * the critical section are re-ordered by the CPU to occur outside of it:
+ * they must have completed and be visible to other processors once the lock
+ * has been released.
+ *
+ * NetBSD usually runs with the kernel mapped (via MTRR) in a WB (write
+ * back) memory region.  In that case, memory ordering on x86 platforms
+ * looks like this:
+ *
+ * i386		All loads/stores occur in instruction sequence.
+ *
+ * i486		All loads/stores occur in instruction sequence.  In
+ * Pentium	exceptional circumstances, loads can be re-ordered around
+ *		stores, but for the purposes of releasing a lock it does
+ *		not matter.  Stores may not be immediately visible to other
+ *		processors as they can be buffered.  However, since the
+ *		stores are buffered in order the lock release will always be
+ *		the last operation in the critical section that becomes
+ *		visible to other CPUs.
+ *
+ * Pentium Pro	The "Intel 64 and IA-32 Architectures Software Developer's
+ * onwards	Manual" volume 3A (order number 253668-022US) says that (1)
+ *		"Reads can be carried out speculatively and in any order"
+ *		and (2) "Reads can pass buffered stores, but the processor
+ *		is self-consistent.".  This would be a problem for the below,
+ *		and would mandate a locked instruction cycle or load fence
+ *		before releasing the simple lock.
+ *
+ *		The "Intel Pentium 4 Processor Optimization" guide (order
+ *		number 248966) says: "Loads can be moved before stores that
+ *		occurred earlier in the program if they are not predicted to
+ *		load from the same linear address.".  This is not a problem
+ *		since the only loads that can be re-ordered take place once
+ *		the lock has been released via a store.
+ *
+ *		The above two documents seem to contradict each other,
+ *		however with the exception of early steppings of the Pentium
+ *		Pro, the second document is closer to the truth: a store
+ *		will always act as a load fence for all loads that precede
+ *		the store in instruction order.
+ *
+ *		Again, note that stores can be buffered and will not always
+ *		become immediately visible to other CPUs: they are however
+ *		buffered in order.
+ *
+ * AMD64	Stores occur in order and are buffered.  Loads can be
+ *		reordered, however stores act as load fences, meaning that
+ *		loads can not be reordered around stores.
+ */
 static __inline void
 __cpu_simple_unlock(__cpu_simple_lock_t *lockp)
 {
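
The hunk ends at the opening brace, so the body of __cpu_simple_unlock()
is not shown here. As a minimal sketch of the release idiom the new comment
justifies (hypothetical names throughout, not the NetBSD source), the unlock
can be a compiler barrier followed by a plain store, with no locked
instruction or fence on the release path:

/* Hypothetical stand-ins for the __cpu_simple_lock_t machinery. */
typedef volatile unsigned char demo_lock_t;
#define	DEMO_UNLOCKED	0

static inline void
demo_simple_unlock(demo_lock_t *lockp)
{
	/*
	 * Compiler-only barrier: stops the compiler from sinking
	 * critical-section loads/stores below the releasing store.
	 * Per the ordering rules described in the comment above, the
	 * CPU itself needs no fence here on x86/amd64 with WB memory.
	 */
	__asm volatile("" ::: "memory");
	*lockp = DEMO_UNLOCKED;		/* plain store releases the lock */
}

Keeping the release path fence-free is the point of the note: a locked
instruction or fence on every unlock would add avoidable cost to each
critical section.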
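
Had the Developer's Manual wording been literally true for these CPUs, the
comment notes that release would instead need "a locked instruction cycle
or load fence". A sketch of that more conservative variant (again
hypothetical, reusing the stand-ins above; LFENCE requires SSE2):

static inline void
demo_simple_unlock_fenced(demo_lock_t *lockp)
{
	/*
	 * LFENCE completes all preceding loads before later instructions
	 * execute, so no read from the critical section can appear to
	 * pass the store below.  A lock-prefixed RMW such as
	 * "lock; addl $0,(%esp)" would serve on pre-SSE2 processors.
	 */
	__asm volatile("lfence" ::: "memory");
	*lockp = DEMO_UNLOCKED;
}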