Some component of the encoder chain (I suspect the rlgr encoder) expects
the output buffer to be zeroed. The multithreaded RemoteFX encoder uses
wStreams from the StreamPool, which are reused and therefore not zeroed.
For now, we clear the stream in order to prevent data corruption.
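A minimal sketch of the idea, assuming a reused buffer described by a
pointer and a capacity. The sketch_buffer type and sketch_buffer_clear
are hypothetical names for illustration only, not the wStream or
StreamPool API:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a pooled, reused output buffer. */
typedef struct {
    unsigned char *data;
    size_t capacity;
} sketch_buffer;

/* Zero the whole buffer before handing it to an encoder that
 * assumes a zeroed output area. */
void sketch_buffer_clear(sketch_buffer *buf)
{
    assert(buf != NULL && buf->data != NULL);
    memset(buf->data, 0, buf->capacity);
}
```

Clearing the full capacity on every reuse costs a memset per frame, which
is why this is only meant as a stopgap until the component that depends
on zeroed memory is identified.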
(cherry picked from commit ccc5d1b279eb76721adde44576f8226c418abd23)
* support for ndk version r8d+
* improved x86_64 host machine support
* support non-release NDK layouts
(cherry picked from commit 553f7c24f75c5ff59d61aeb3c3da96b18c2ef9d2)
Use InitializeCriticalSectionAndSpinCount instead of InitializeCriticalSection.
Using spin counts for critical sections of short duration enables the calling
thread to avoid the wait operation in most situations which can dramatically
improve the overall performance on multiprocessor systems.
On Linux this change has no effect: the new winpr critical section
implementation ignores the SpinCount field there, because the NPTL
synchronization primitives are implemented with futex system calls, which
are extremely fast and already have this optimization built in.
However, on Mac OS X this change improved the overall performance of the
multithreaded RemoteFX decoder by 25 percent.
I've used a SpinCount of 4000 which avoided 99 percent of the wait calls.
This value is also used by Microsoft's heap manager for its per-heap
critical sections.
Note: This change requires pull request #1397 to be merged.
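To illustrate what a spin count buys, here is a portable sketch using a
C11 atomic_flag. spin_try_acquire and spin_release are hypothetical
helpers, not winpr or Win32 functions; the real code simply passes the
spin count to InitializeCriticalSectionAndSpinCount:

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical helper: make one attempt plus up to spin_count retries
 * in user space. Returns true if the lock was taken while spinning,
 * false if the caller would have to fall back to a blocking wait. */
bool spin_try_acquire(atomic_flag *lock, unsigned long spin_count)
{
    for (unsigned long i = 0; i <= spin_count; i++) {
        if (!atomic_flag_test_and_set_explicit(lock, memory_order_acquire))
            return true;
    }
    return false;
}

/* Release the lock taken by spin_try_acquire. */
void spin_release(atomic_flag *lock)
{
    atomic_flag_clear_explicit(lock, memory_order_release);
}
```

For critical sections that are held only briefly, the holder usually
releases the lock before the spin budget is used up, so the expensive
wait is skipped.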
(cherry picked from commit 3a58934eb2abf539ed229e97a5c9dd0a1dac9164)
- Complete implementation including recursion support
- Added an intensive ctest (TestSynchCritical)
- Struct members are used exactly as Windows uses them internally:
  LockCount starts at -1, RecursionCount at 0
- Same performance optimizations as Windows uses internally:
  - Fast lock acquisition path using CAS -> SpinCount -> wait
  - SpinCount automatically disabled on uniprocessor systems
  - On Linux SpinCount is disabled because it provided no advantage over NPTL/futex in any test
Support for CRITICAL_SECTION's DebugInfo is not yet included (but trivial to add).
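The acquisition path listed above can be sketched roughly as follows.
This is a simplified illustration, not the winpr code: the sketch_cs
type and its functions are hypothetical, the thread identity is passed
in by the caller as a nonzero id, waiters are not counted in LockCount,
and the final wait step is only indicated by a false return value.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical, simplified stand-in for CRITICAL_SECTION. */
typedef struct {
    atomic_long LockCount;     /* -1 means unlocked */
    long RecursionCount;       /* 0 means not held */
    atomic_ulong OwningThread; /* caller-supplied id, 0 means none */
    unsigned long SpinCount;
} sketch_cs;

void sketch_cs_init(sketch_cs *cs, unsigned long spin_count)
{
    atomic_init(&cs->LockCount, -1);
    cs->RecursionCount = 0;
    atomic_init(&cs->OwningThread, 0);
    cs->SpinCount = spin_count;
}

/* Returns true if `self` holds the lock afterwards (fresh or recursive). */
bool sketch_cs_try_enter(sketch_cs *cs, unsigned long self)
{
    long expected = -1;
    /* Fast path: CAS LockCount from -1 (free) to 0 (held once). */
    if (atomic_compare_exchange_strong(&cs->LockCount, &expected, 0)) {
        atomic_store(&cs->OwningThread, self);
        cs->RecursionCount = 1;
        return true;
    }
    /* Recursive entry by the current owner. */
    if (atomic_load(&cs->OwningThread) == self) {
        atomic_fetch_add(&cs->LockCount, 1);
        cs->RecursionCount++;
        return true;
    }
    return false;
}

/* CAS -> SpinCount -> (wait): returns false where a real
 * implementation would block in the kernel. */
bool sketch_cs_enter_spinning(sketch_cs *cs, unsigned long self)
{
    if (sketch_cs_try_enter(cs, self))
        return true;
    for (unsigned long i = 0; i < cs->SpinCount; i++) {
        if (sketch_cs_try_enter(cs, self))
            return true;
    }
    return false;
}

void sketch_cs_leave(sketch_cs *cs, unsigned long self)
{
    assert(atomic_load(&cs->OwningThread) == self);
    assert(cs->RecursionCount > 0);
    if (--cs->RecursionCount == 0)
        atomic_store(&cs->OwningThread, 0);
    atomic_fetch_sub(&cs->LockCount, 1);
}
```

With these conventions a fully released section is back at LockCount -1
and RecursionCount 0, the same values it starts with.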
(cherry picked from commit 2b25b4a52014d160aea370a7afba9aa1cff18c3f)
The WaitForSingleObject call on TilePool's event uses a zero time-out
interval and the event is a manual-reset event, so no locking or waiting
is involved anyway. Queue_Dequeue may well return NULL regardless of the
WaitForSingleObject call, and that case is already handled correctly.
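The pattern being relied on looks roughly like this: the dequeue itself
reports an empty pool by returning NULL and the caller falls back to a
fresh allocation, so no prior wait is needed. The sketch_pool type and
its functions are hypothetical and not thread-safe; they are not the
TilePool or Queue API.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define SKETCH_POOL_CAPACITY 8

/* Hypothetical fixed-size pool of reusable items. */
typedef struct {
    void *items[SKETCH_POOL_CAPACITY];
    size_t count;
} sketch_pool;

/* Returns a pooled item, or NULL when the pool is empty. */
void *sketch_pool_dequeue(sketch_pool *pool)
{
    if (pool->count == 0)
        return NULL;
    return pool->items[--pool->count];
}

/* Reuse a pooled item if there is one, otherwise allocate a new one. */
void *sketch_pool_take(sketch_pool *pool, size_t item_size)
{
    void *item = sketch_pool_dequeue(pool);
    if (item == NULL)
        item = malloc(item_size);
    return item;
}

/* Returns 1 if the item was kept for reuse, 0 if it was freed. */
int sketch_pool_return(sketch_pool *pool, void *item)
{
    if (pool->count < SKETCH_POOL_CAPACITY) {
        pool->items[pool->count++] = item;
        return 1;
    }
    free(item);
    return 0;
}
```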
(cherry picked from commit 938a0890a37c219b2f28936e5a33d46e6b74ee5a)
- Improved and (almost) completed winpr's critical section implementation
- Replaced WaitForSingleObject locking with critical sections
Note:
WaitForSingleObject should _never_ be used for granular low-contention
locks as it _always_ enters the kernel.
Just replacing WaitForSingleObject locking in BufferPool with
EnterCriticalSection boosts the multithreaded rfx decoder
performance by almost 400% on win32.
(pull #1388 - cherry picked from commit 81ef251fc8f345c544d459e6a47a8479ff550e8a)
Frame markers are not really implemented. Just call SendFrameAcknowledge on
SURFACECMD_FRAMEACTION_END if settings->FrameAcknowledge > 0.
This fixes issue #1352