Fix race condition in COMMIT PREPARED causing orphaned 2PC files

COMMIT PREPARED removes on-disk 2PC files near its end, but the flag
tracking whether a file is on disk was read from shared memory without
holding the two-phase state lock.

Because of that, there was a small window where a second backend doing a
PREPARE TRANSACTION could reuse the GlobalTransaction put back into the
2PC free list by the COMMIT PREPARED, overwriting the "ondisk" flag read
afterwards by the COMMIT PREPARED to decide if its on-disk two-phase
state file should be removed, preventing the file deletion.

This commit fixes the issue by reading the "ondisk" flag of the
GlobalTransaction while holding the two-phase state lock, rather than
from shared memory after its entry has been added to the free list.

Orphaned two-phase state files flushed to disk after a checkpoint are
discarded at the beginning of recovery.  However, a truncation of
pg_xact/ would make the startup process issue a FATAL when it cannot
read the SLRU page holding the state of the transaction whose 2PC file
was orphaned, which is a necessary step to decide if the 2PC file should
be removed or not.  Manually removing the file would be necessary in
this case.

Issue introduced by effe7d9552dd, so backpatch all the way down.

Mea culpa.

Author: wuchengwen
Discussion: https://postgr.es/m/tencent_A7F059B5136A359625C7B2E4A386B3C3F007@qq.com
Backpatch-through: 12
Michael Paquier 2024-10-01 15:44:07 +09:00 committed by Muhammad Usama
parent b9629211a8
commit 139b2d549e


@@ -1505,6 +1505,7 @@ FinishPreparedTransaction(const char *gid, bool isCommit)
 	GlobalTransaction gxact;
 	PGPROC *proc;
 	TransactionId xid;
+	bool ondisk;
 	char *buf;
 	char *bufptr;
 	TwoPhaseFileHeader *hdr;
@@ -1657,6 +1658,12 @@ FinishPreparedTransaction(const char *gid, bool isCommit)
 	PredicateLockTwoPhaseFinish(xid, isCommit);
+	/*
+	 * Read this value while holding the two-phase lock, as the on-disk 2PC
+	 * file is physically removed after the lock is released.
+	 */
+	ondisk = gxact->ondisk;
 	/* Clear shared memory state */
 	RemoveGXact(gxact);
@@ -1672,7 +1679,7 @@ FinishPreparedTransaction(const char *gid, bool isCommit)
 	/*
 	 * And now we can clean up any files we may have left.
 	 */
-	if (gxact->ondisk)
+	if (ondisk)
 		RemoveTwoPhaseFile(xid, true);
 	MyLockedGxact = NULL;
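
The race described in the commit message can be illustrated with a small standalone model. This is only a sketch, not PostgreSQL code: `Entry`, `release_entry`, `reuse_entry`, `finish_buggy` and `finish_fixed` are hypothetical names, locking is omitted, and the concurrent PREPARE TRANSACTION is simulated deterministically in a single thread by reusing the entry right after it is put back on the free list. Each `finish_*` function returns whether the on-disk file would be removed.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a shared-memory 2PC entry (not the real GlobalTransaction). */
typedef struct Entry
{
	bool		ondisk;		/* true if a state file exists on disk */
	struct Entry *next_free;
} Entry;

static Entry *free_list = NULL;

/* Put an entry back on the free list, as finishing a prepared transaction would. */
static void
release_entry(Entry *e)
{
	assert(e != NULL);
	e->next_free = free_list;
	free_list = e;
}

/* Simulate another backend reusing a free entry for a new prepared transaction. */
static Entry *
reuse_entry(void)
{
	Entry	   *e = free_list;

	if (e != NULL)
	{
		free_list = e->next_free;
		e->ondisk = false;	/* the new transaction has no file on disk yet */
	}
	return e;
}

/* Old ordering: the flag is read after the entry was released and reused. */
static bool
finish_buggy(Entry *e)
{
	release_entry(e);
	reuse_entry();			/* the window in which the entry gets reused */
	return e->ondisk;		/* stale: describes the new transaction */
}

/* Fixed ordering: snapshot the flag before the entry can be reused. */
static bool
finish_fixed(Entry *e)
{
	bool		ondisk = e->ondisk;

	release_entry(e);
	reuse_entry();
	return ondisk;			/* still describes the finished transaction */
}
```

In this model, an entry with `ondisk` set to true makes `finish_buggy` report that no file needs removing, which corresponds to the orphaned 2PC file, while `finish_fixed` still reports that the file must be removed.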