- Improved and (almost) completed winpr's critical section implementation
- Replaced WaitForSingleObject locking with critical sections
Note:
WaitForSingleObject should _never_ be used for granular low-contention
locks as it _always_ enters the kernel, even when the lock is free.
Just replacing the WaitForSingleObject locking in Bufferpool with
EnterCriticalSection boosts multithreaded RFX decoder performance
by almost 400% on Win32.
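
A minimal sketch of the locking pattern described above, not the actual
Bufferpool code. ExamplePool and the example_pool_* functions are
hypothetical names used only for illustration. Off Windows, a small
pthread-based stand-in is used so the sketch builds; it is an assumption
for this example and not winpr's implementation. An uncontended
critical section can be acquired in user mode, which is why it suits
short, low-contention locks.

```c
#include <stddef.h>

#ifdef _WIN32
#include <windows.h>
#else
/* Stand-in so the sketch builds off Windows. NOT winpr's implementation. */
#include <pthread.h>
typedef pthread_mutex_t CRITICAL_SECTION;
static void InitializeCriticalSection(CRITICAL_SECTION* cs) { pthread_mutex_init(cs, NULL); }
static void EnterCriticalSection(CRITICAL_SECTION* cs) { pthread_mutex_lock(cs); }
static void LeaveCriticalSection(CRITICAL_SECTION* cs) { pthread_mutex_unlock(cs); }
static void DeleteCriticalSection(CRITICAL_SECTION* cs) { pthread_mutex_destroy(cs); }
#endif

/* Hypothetical pool that only counts free buffers. */
typedef struct
{
	/* Before: HANDLE mutex, locked via WaitForSingleObject(mutex, INFINITE)
	 * and unlocked via ReleaseMutex(mutex). */
	CRITICAL_SECTION lock;
	int available;
} ExamplePool;

static void example_pool_init(ExamplePool* pool, int count)
{
	InitializeCriticalSection(&pool->lock);
	pool->available = count;
}

/* Returns 1 if a buffer was taken, 0 if the pool is empty. */
static int example_pool_take(ExamplePool* pool)
{
	int taken = 0;

	EnterCriticalSection(&pool->lock);
	if (pool->available > 0)
	{
		pool->available--;
		taken = 1;
	}
	LeaveCriticalSection(&pool->lock);

	return taken;
}

/* Hands one buffer back to the pool. */
static void example_pool_return(ExamplePool* pool)
{
	EnterCriticalSection(&pool->lock);
	pool->available++;
	LeaveCriticalSection(&pool->lock);
}

static void example_pool_free(ExamplePool* pool)
{
	DeleteCriticalSection(&pool->lock);
}
```

The critical section is held only around the counter update, so the
locked region stays as short as possible.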
(pull #1388 - cherry picked from commit 81ef251fc8f345c544d459e6a47a8479ff550e8a)