mirror of https://github.com/postgres/postgres
Fix parallel BRIN builds with synchronized scans
The brinbuildCallbackParallel callback used by parallel BRIN builds did not consider that the parallel table scan may be synchronized, starting from an arbitrary block and then wrapping around.

If this happened and the scan actually did wrap around, tuples from the beginning of the table were added to the last range produced by the same worker. The index would be missing a range at the beginning of the table, while the last range would be too wide. This would not produce incorrect query results, but it would be less efficient.

Fixed by checking for both past and future ranges in the callback. The worker may produce multiple summaries for the same page range, but the leader will merge them as if the summaries came from different workers.

Discussion: https://postgr.es/m/c2ee7d69-ce17-43f2-d1a0-9811edbda6e6%40enterprisedb.com
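The wrap-around case described in the commit message can be illustrated with a small standalone sketch. This is not the PostgreSQL code: `RangeState`, `leaves_range_old`, `leaves_range_fixed`, and `count_summaries` are hypothetical names, and the range bookkeeping is simplified to just the two fields the condition reads. It only shows why a one-sided check misses a scan that wraps back to block 0.

```c
#include <stdbool.h>
#include <stdint.h>

typedef uint32_t BlockNumber;

/* Hypothetical, simplified stand-in for the build-state fields used here. */
typedef struct RangeState
{
    BlockNumber curr_range_start;   /* first block of the range being built */
    BlockNumber pages_per_range;    /* number of pages in one BRIN range */
} RangeState;

/* Old condition: only notices blocks past the end of the current range. */
static bool
leaves_range_old(const RangeState *s, BlockNumber blk)
{
    return blk > s->curr_range_start + s->pages_per_range - 1;
}

/* Fixed condition: also notices blocks before the start (scan wrapped). */
static bool
leaves_range_fixed(const RangeState *s, BlockNumber blk)
{
    return (blk < s->curr_range_start) ||
           (blk > s->curr_range_start + s->pages_per_range - 1);
}

/*
 * Count the summaries one worker would emit for the given block sequence.
 * Assumption for this sketch: on leaving a range, the worker emits one
 * summary and realigns to the range containing the new block.
 */
static int
count_summaries(const BlockNumber *blocks, int nblocks,
                BlockNumber pages_per_range, bool fixed)
{
    RangeState  s = {0, pages_per_range};
    bool        have_range = false;
    int         count = 0;

    for (int i = 0; i < nblocks; i++)
    {
        BlockNumber blk = blocks[i];

        if (!have_range)
        {
            /* first tuple seen: start the range containing this block */
            s.curr_range_start = blk - (blk % pages_per_range);
            have_range = true;
        }
        else if (fixed ? leaves_range_fixed(&s, blk)
                       : leaves_range_old(&s, blk))
        {
            count++;            /* emit the summary for the finished range */
            s.curr_range_start = blk - (blk % pages_per_range);
        }
        /* otherwise the tuple is folded into the current range */
    }

    if (have_range)
        count++;                /* emit the final, still-open range */

    return count;
}
```

With four pages per range and a scan that visits blocks 8, 9, 10, 11, 0, 1, 2, the old condition never fires on the wrap, so blocks 0 to 2 are folded into the range starting at block 8. The fixed condition closes that range and starts a new one at block 0.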
parent 6c63bcbf3c
commit cb44a8345e
@@ -1040,16 +1040,22 @@ brinbuildCallbackParallel(Relation index,
     thisblock = ItemPointerGetBlockNumber(tid);
 
     /*
-     * If we're in a block that belongs to a future range, summarize what
+     * If we're in a block that belongs to a different range, summarize what
      * we've got and start afresh. Note the scan might have skipped many
      * pages, if they were devoid of live tuples; we do not create emptry BRIN
      * ranges here - the leader is responsible for filling them in.
+     *
+     * Unlike serial builds, parallel index builds allow synchronized seqscans
+     * (because that's what parallel scans do). This means the block may wrap
+     * around to the beginning of the relation, so the condition needs to
+     * check for both future and past ranges.
      */
-    if (thisblock > state->bs_currRangeStart + state->bs_pagesPerRange - 1)
+    if ((thisblock < state->bs_currRangeStart) ||
+        (thisblock > state->bs_currRangeStart + state->bs_pagesPerRange - 1))
     {
 
         BRIN_elog((DEBUG2,
-                   "brinbuildCallback: completed a range: %u--%u",
+                   "brinbuildCallbackParallel: completed a range: %u--%u",
                    state->bs_currRangeStart,
                    state->bs_currRangeStart + state->bs_pagesPerRange));
 
@@ -1201,7 +1207,9 @@ brinbuild(Relation heap, Relation index, IndexInfo *indexInfo)
     {
         /*
          * Now scan the relation. No syncscan allowed here because we want
-         * the heap blocks in physical order.
+         * the heap blocks in physical order (we want to produce the ranges
+         * starting from block 0, and the callback also relies on this to not
+         * generate summary for the same range twice).
          */
         reltuples = table_index_build_scan(heap, index, indexInfo, false, true,
                                            brinbuildCallback, (void *) state, NULL);
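The commit message notes that, after the fix, a single worker may emit more than one summary for the same page range and that the leader merges them as if they came from different workers. A minimal sketch of such a merge, assuming a minmax-style summary over integers, might look like the following. `MinMaxSummary` and `union_summaries` are hypothetical names for illustration and do not reproduce the leader's actual merge code.

```c
#include <stdint.h>

/* Hypothetical minmax-style summary of one page range. */
typedef struct MinMaxSummary
{
    uint32_t    range_start;    /* first block of the summarized range */
    int32_t     min;            /* smallest value seen in the range */
    int32_t     max;            /* largest value seen in the range */
} MinMaxSummary;

/*
 * Merge two summaries that describe the same page range. The result
 * covers every value covered by either input, so it does not matter
 * whether the inputs came from one worker or from two.
 */
static MinMaxSummary
union_summaries(MinMaxSummary a, MinMaxSummary b)
{
    MinMaxSummary result = a;

    if (b.min < result.min)
        result.min = b.min;
    if (b.max > result.max)
        result.max = b.max;

    return result;
}
```

Because the union is commutative and associative in this sketch, the order in which duplicate summaries reach the leader does not change the merged result.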