Fix bug in nbtree array primitive scan scheduling.
A bug in nbtree's handling of primitive index scan scheduling could lead to wrong answers when a scrollable cursor was used with an index scan that had a SAOP index qual. Wrong answers were only possible when the scan direction changed after a primitive scan was scheduled, but before _bt_next was asked to fetch the next tuple in line (i.e. for things to break, _bt_next had to be denied the opportunity to step off the page in the same direction as the one used when the primscan was scheduled). Furthermore, the issue only occurred when the page in question happened to be the first page to be visited by the entire top-level scan; the issue hinged upon the cursor backing up to the absolute beginning of the key space that it returns tuples from (fetching in the opposite scan direction across a "primitive scan boundary" always worked correctly). To fix, make _bt_next unset the "needs primitive index scan" flag when it detects that the current scan direction is not the one that was used by _bt_readpage back when the primitive scan in question was scheduled. This fixes the cases that are known to be faulty, and also seems like a good idea on general robustness grounds. Affected scrollable cursor cases now avoid a spurious primitive index scan when they fetch backwards to the absolute start of the key space to be visited by their cursor. Fetching backwards now only returns those tuples at the start of the scan, as expected. It'll also be okay to once again fetch forwards from the start at that point, since the scan will be left in a state that's exactly consistent with the state it was in before any tuples were ever fetched, as expected. Oversight in commit 5bf748b8, which enhanced nbtree ScalarArrayOp execution. Author: Peter Geoghegan <pg@bowt.ie> Discussion: https://postgr.es/m/CAH2-Wznv49bFsE2jkt4GuZ0tU2C91dEST=50egzjY2FeOcHL4Q@mail.gmail.com Backpatch: 17-, where commit 5bf748b8 first appears.
This commit is contained in:
parent
2a326aa2a2
commit
37e92c4fd0
@ -1628,6 +1628,7 @@ _bt_readpage(IndexScanDesc scan, ScanDirection dir, OffsetNumber offnum,
|
||||
* corresponding need for the left-link, since splits always go right.
|
||||
*/
|
||||
so->currPos.nextPage = opaque->btpo_next;
|
||||
so->currPos.dir = dir;
|
||||
|
||||
/* initialize tuple workspace to empty */
|
||||
so->currPos.nextTupleOffset = 0;
|
||||
@ -2083,16 +2084,24 @@ _bt_steppage(IndexScanDesc scan, ScanDirection dir)
|
||||
* In effect, btrestpos leaves advancing the arrays up to the first
|
||||
* _bt_readpage call (that takes place after it has restored markPos).
|
||||
*/
|
||||
Assert(so->markPos.dir == dir);
|
||||
if (so->needPrimScan)
|
||||
{
|
||||
if (ScanDirectionIsForward(dir))
|
||||
if (ScanDirectionIsForward(so->currPos.dir))
|
||||
so->markPos.moreRight = true;
|
||||
else
|
||||
so->markPos.moreLeft = true;
|
||||
}
|
||||
}
|
||||
|
||||
/*
|
||||
* Cancel primitive index scans that were scheduled when the call to
|
||||
* _bt_readpage for currPos happened to use the opposite direction to the
|
||||
* one that we're stepping in now. (It's okay to leave the scan's array
|
||||
* keys as-is, since the next _bt_readpage will advance them.)
|
||||
*/
|
||||
if (so->currPos.dir != dir)
|
||||
so->needPrimScan = false;
|
||||
|
||||
if (ScanDirectionIsForward(dir))
|
||||
{
|
||||
/* Walk right to the next page with data */
|
||||
@ -2653,7 +2662,6 @@ _bt_endpoint(IndexScanDesc scan, ScanDirection dir)
|
||||
static inline void
|
||||
_bt_initialize_more_data(BTScanOpaque so, ScanDirection dir)
|
||||
{
|
||||
so->currPos.dir = dir;
|
||||
if (so->needPrimScan)
|
||||
{
|
||||
Assert(so->numArrayKeys);
|
||||
|
@ -2426,8 +2426,10 @@ new_prim_scan:
|
||||
/*
|
||||
* End this primitive index scan, but schedule another.
|
||||
*
|
||||
* Note: If the scan direction happens to change, this scheduled primitive
|
||||
* index scan won't go ahead after all.
|
||||
* Note: We make a soft assumption that the current scan direction will
|
||||
* also be used within _bt_next, when it is asked to step off this page.
|
||||
* It is up to _bt_next to cancel this scheduled primitive index scan
|
||||
* whenever it steps to a page in the direction opposite currPos.dir.
|
||||
*/
|
||||
pstate->continuescan = false; /* Tell _bt_readpage we're done... */
|
||||
so->needPrimScan = true; /* ...but call _bt_first again */
|
||||
|
Loading…
x
Reference in New Issue
Block a user