postgres

Author	SHA1	Message	Date
Michael Paquier	5d4298e75f	pg_stat_statements: Avoid some locking during PGSS entry scans A single PGSS entry's spinlock is used to be able to modify "counters" without holding pgss->lock exclusively, as mentioned at the top of pg_stat_statements.c and within pgssEntry. Within a single pgssEntry, stats_since and minmax_stats_since are never modified without holding pgss->lock exclusively, so there is no need to hold an entry's spinlock when reading stats_since and minmax_stats_since, as done when scanning all the PGSS entries for function calls of pg_stat_statements(). This also restores the consistency between the code and the comments about the entry's spinlock usage. This change is a performance improvement (it can be argued that this is a logic bug), so there is no need for a backpatch. This saves two instructions from being read while holding an entry's spinlock. Author: Karina Litskevich Reviewed-by: Michael Paquier, wenhui qiu Discussion: https://postgr.es/m/CACiT8ibhCmzbcOxM0v4pRLH3abk-95LPkt7_uC2JMP+miPjxsg@mail.gmail.com	2024-11-11 09:02:30 +09:00
Peter Eisentraut	9be4e5d293	Remove unused #include's from contrib, pl, test .c files as determined by IWYU Similar to commit dbbca2cf299, but for contrib, pl, and src/test/. Reviewed-by: Alvaro Herrera <alvherre@alvh.no-ip.org> Discussion: https://www.postgresql.org/message-id/flat/0df1d5b1-8ca8-4f84-93be-121081bde049%40eisentraut.org	2024-10-28 08:02:17 +01:00
Michael Paquier	6b652e6ce8	Set query ID for inner queries of CREATE TABLE AS and DECLARE Some utility statements contain queries that can be planned and executed: CREATE TABLE AS and DECLARE CURSOR. This commit adds query ID computation for the inner queries executed by these two utility commands, with and without EXPLAIN. This change leads to four new callers of JumbleQuery() and post_parse_analyze_hook() so as extensions can decide what to do with this new data. Previously, extensions relying on the query ID, like pg_stat_statements, were not able to track these nested queries as the query_id was 0. For pg_stat_statements, this commit leads to additions under !toplevel when pg_stat_statements.track is set to "all", as shown in its regression tests. The output of EXPLAIN for these two utilities gains a "Query Identifier" if compute_query_id is enabled. Author: Anthonin Bonnefoy Reviewed-by: Michael Paquier, Jian He Discussion: https://postgr.es/m/CAO6_XqqM6S9bQ2qd=75W+yKATwoazxSNhv5sjW06fjGAtHbTUA@mail.gmail.com	2024-10-28 09:03:20 +09:00
Michael Paquier	499edb0974	Track more precisely query locations for nested statements Previously, a Query generated through the transform phase would have unset stmt_location, tracking the starting point of a query string. Extensions relying on the statement location to extract its relevant parts in the source text string would fallback to use the whole statement instead, leading to confusing results like in pg_stat_statements for queries relying on nested queries, like: - EXPLAIN, with top-level and nested query using the same query string, and a query ID coming from the nested query when the non-top-level entry. - Multi-statements, with only partial portions of queries being normalized. - COPY TO with a query, SELECT or DMLs. This patch improves things by keeping track of the statement locations and propagate it to Query during transform, allowing PGSS to only show the relevant part of the query for nested query. This leads to less bloat in entries for non-top-level entries, as queries can now be grouped within the same (toplevel, queryid) duos in pg_stat_statements. The result gives a stricter one-one mapping between query IDs and its query strings. The regression tests introduced in 45e0ba30fc40 produce differences reflecting the new logic. Author: Anthonin Bonnefoy Reviewed-by: Michael Paquier, Jian He Discussion: https://postgr.es/m/CAO6_XqqM6S9bQ2qd=75W+yKATwoazxSNhv5sjW06fjGAtHbTUA@mail.gmail.com	2024-10-24 09:29:54 +09:00
Tom Lane	14e5680eee	Improve parser's reporting of statement start locations. Up to now, the parser's reporting of a statement's stmt_location included any preceding whitespace or comments. This isn't really desirable but was done to avoid accounting honestly for nonterminals that reduce to empty. It causes problems for pg_stat_statements, which partially compensates by manually stripping whitespace, but is not bright enough to strip /*-style comments. There will be more problems with an upcoming patch to improve reporting of errors in extension scripts, so it's time to do something about this. The thing we have to do to make it work right is to adjust YYLLOC_DEFAULT to scan the inputs of each production to find the first one that has a valid location (i.e., did not reduce to empty). In theory this adds a little bit of per-reduction overhead, but in practice it's negligible. I checked by measuring the time to run raw_parser() on the contents of information_schema.sql, and there was basically no change. Having done that, we can rely on any nonterminal that didn't reduce to completely empty to have a correct starting location, and we don't need the kluges the stmtmulti production formerly used. This should have a side benefit of allowing parse error reports to include an error position in some cases where they formerly failed to do so, due to trying to report the position of an empty nonterminal. I did not go looking for an example though. The one previously known case where that could happen (OptSchemaEltList) no longer needs the kluge it had; but I rather doubt that that was the only case. Discussion: https://postgr.es/m/ZvV1ClhnbJLCz7Sm@msg.df7cb.de	2024-10-22 11:26:05 -04:00
Michael Paquier	45e0ba30fc	pg_stat_statements: Add tests for nested queries with level tracking There have never been any regression tests in PGSS for various query patterns for nested queries combined with level tracking, like: - Multi-statements. - CREATE TABLE AS - CREATE/REFRESH MATERIALIZED VIEW - DECLARE CURSOR - EXPLAIN, with a subset of the above supported. - COPY. All the tests added here track historical, sometimes confusing, existing behaviors. For example, EXPLAIN stores two PGSS entries with the same top-level query string but two different query IDs as one is calculated for the top-level EXPLAIN (this part is right) and a second one for the inner query in the EXPLAIN (this part is not right). A couple of patches are under discussion to improve the situation, and all the tests added here will prove useful to evaluate the changes discussed. Author: Anthonin Bonnefoy Reviewed-by: Michael Paquier, Jian He Discussion: https://postgr.es/m/CAO6_XqqM6S9bQ2qd=75W+yKATwoazxSNhv5sjW06fjGAtHbTUA@mail.gmail.com	2024-10-22 13:05:51 +09:00
Michael Paquier	cf54a2c002	pg_stat_statements: Add columns to track parallel worker activity The view pg_stat_statements gains two columns: - parallel_workers_to_launch, the number of parallel workers planned to be launched. - parallel_workers_launched, the number of parallel workers actually launched. The ratio of both columns offers hints that parallel workers are lacking on a per-statement basis, requiring some tuning, in coordination with "calls", the number of times a query is executed. As of now, these numbers are tracked within Gather and GatherMerge nodes. They could be extended to utilities that make use of parallel workers (parallel btree and brin, VACUUM). The module is bumped to 1.12. Author: Guillaume Lelarge Discussion: https://postgr.es/m/CAECtzeWtTGOK0UgKXdDGpfTVSa5bd_VbUt6K6xn8P7X+_dZqKw@mail.gmail.com	2024-10-09 08:30:45 +09:00
Peter Eisentraut	10b721821d	Use macro to define the number of enum values Refactoring in the interest of code consistency, a follow-up to 2e068db56e31. The argument against inserting a special enum value at the end of the enum definition is that a switch statement might generate a compiler warning unless it has a default clause. Aleksander Alekseev, reviewed by Michael Paquier, Dean Rasheed, Peter Eisentraut Discussion: https://postgr.es/m/CAJ7c6TMsiaV5urU_Pq6zJ2tXPDwk69-NKVh4AMN5XrRiM7N%2BGA%40mail.gmail.com	2024-10-01 09:30:24 -04:00
Michael Paquier	dc68515968	Show values of SET statements as constants in pg_stat_statements This is a continuation of work like 11c34b342bd7, done to reduce the bloat of pg_stat_statements by applying more normalization to query entries. This commit is able to detect and normalize values in VariableSetStmt, resulting in: SET conf_param = $1 Compared to other parse nodes, VariableSetStmt is embedded in much more places in the parser, impacting many query patterns in pg_stat_statements. A custom jumble function is used, with an extra field in the node to decide if arguments should be included in the jumbling or not, a location field being not enough for this purpose. This approach allows for a finer tuning. Clauses relying on one or more keywords are not normalized, for example: * DEFAULT * FROM CURRENT * List of keywords. SET SESSION CHARACTERISTICS AS TRANSACTION, where it is critical to differentiate different sets of options, is a good example of why normalization should not happen. Some queries use VariableSetStmt for some subclauses with SET, that also have their values normalized: - ALTER DATABASE - ALTER ROLE - ALTER SYSTEM - CREATE/ALTER FUNCTION ba90eac7a995 has added test coverage for most of the existing SET patterns. The expected output of these tests shows the difference this commit creates. Normalization could be perhaps applied to more portions of the grammar but what is done here is conservative, and good enough as a starting point. Author: Greg Sabino Mullane, Michael Paquier Discussion: https://postgr.es/m/36e5bffe-e989-194f-85c8-06e7bc88e6f7@amazon.com Discussion: https://postgr.es/m/B44FA29D-EBD0-4DD9-ABC2-16F1CB087074@amazon.com Discussion: https://postgr.es/m/CAKAnmmJtJY2jzQN91=2QAD2eAJAA-Per61eyO48-TyxEg-q0Rg@mail.gmail.com	2024-09-30 14:02:00 +09:00
Michael Paquier	ba90eac7a9	pg_stat_statements: Expand tests for SET statements There are many grammar flavors that depend on the parse node VariableSetStmt. This closes the gap in pg_stat_statements by providing test coverage for what should be a large majority of them, improving more the work begun in de2aca288569. This will be used to ease the evaluation of a path towards more normalization of SET queries with query jumbling. Note that SET NAMES (grammar from the standard, synonym of SET client_encoding) is omitted on purpose, this could use UTF8 with a conditional script where UTF8 is supported, but that does not seem worth the maintenance cost for the sake of these tests. The author has submitted most of these in a TAP test (filled in any holes I could spot), still queries in a SQL file of pg_stat_statements is able to achieve the same goal while being easier to look at when testing normalization patterns. Author: Greg Sabino Mullane, Michael Paquier Discussion: https://postgr.es/m/CAKAnmmJtJY2jzQN91=2QAD2eAJAA-Per61eyO48-TyxEg-q0Rg@mail.gmail.com	2024-09-25 10:04:44 +09:00
Michael Paquier	933848d16d	Add missing query ID reporting in extended query protocol This commit adds query ID reports for two code paths when processing extended query protocol messages: - When receiving a bind message, setting it to the first Query retrieved from a cached cache. - When receiving an execute message, setting it to the first PlannedStmt stored in a portal. An advantage of this method is that this is able to cover all the types of portals handled in the extended query protocol, particularly these two when the report done in ExecutorStart() is not enough (neither is an addition in ExecutorRun(), actually, for the second point): - Multiple execute messages, with multiple ExecutorRun(). - Portal with execute/fetch messages, like a query with a RETURNING clause and a fetch size that stores the tuples in a first execute message going though ExecutorStart() and ExecuteRun(), followed by one or more execute messages doing only fetches from the tuplestore created in the first message. This corresponds to the case where execute_is_fetch is set, for example. Note that the query ID reporting done in ExecutorStart() is still necessary, as an EXECUTE requires it. Query ID reporting is optimistic and more calls to pgstat_report_query_id() don't matter as the first report takes priority except if the report is forced. The comment in ExecutorStart() is adjusted to reflect better the reality with the extended query protocol. The test added in pg_stat_statements is a courtesy of Robert Haas. This uses psql's \bind metacommand, hence this part is backpatched down to v16. Reported-by: Kaido Vaikla, Erik Wienhold Author: Sami Imseih Reviewed-by: Jian He, Andrei Lepikhov, Michael Paquier Discussion: https://postgr.es/m/CA+427g8DiW3aZ6pOpVgkPbqK97ouBdf18VLiHFesea2jUk3XoQ@mail.gmail.com Discussion: https://postgr.es/m/CA+TgmoZxtnf_jZ=VqBSyaU8hfUkkwoJCJ6ufy4LGpXaunKrjrg@mail.gmail.com Discussion: https://postgr.es/m/1391613709.939460.1684777418070@office.mailbox.org Backpatch-through: 14	2024-09-18 09:59:09 +09:00
Tom Lane	94537982ec	Remove obsolete comment in pg_stat_statements. Commit 76db9cb63 removed the use of multiple nesting counters, but missed one comment describing that arrangement. Back-patch to v17 where 76db9cb63 came in, just to avoid confusion. Julien Rouhaud Discussion: https://postgr.es/m/gfcwh3zjxc2vygltapgo7g6yacdor5s4ynr234b6v2ohhuvt7m@gr42joxalenw	2024-09-14 11:42:31 -04:00
Michael Paquier	7b1ddbae36	pg_stat_statements: Add tests with extended query protocol There are currently no tests in the tree checking that queries using the extended query protocol are able to map with their query ID. This can be achieved for some paths of the extended query protocol with the psql meta-commands \bind or \bind_named, so let's add some tests based on both. I have found that to be a useful addition while working on a different issue. Discussion: https://postgr.es/m/ZuEt6MOEBSlifBfn@paquier.xyz	2024-09-13 09:41:06 +09:00
Nathan Bossart	8928817769	Remove volatile qualifiers from pg_stat_statements.c. Prior to commit 0709b7ee72, which changed the spinlock primitives to function as compiler barriers, access to variables within a spinlock-protected section required using a volatile pointer, but that is no longer necessary. Reviewed-by: Bertrand Drouvot, Michael Paquier Discussion: https://postgr.es/m/Zqkv9iK7MkNS0KaN%40nathan	2024-08-06 10:56:37 -05:00
Fujii Masao	97f2bc5aa5	pg_stat_statements: Add regression test for privilege handling. This commit adds a regression test to verify that pg_stat_statements correctly handles privileges, improving its test coverage. Author: Keisuke Kuroda Reviewed-by: Michael Paquier, Fujii Masao Discussion: https://postgr.es/m/2224ccf2e12c41ccb81702ef3303d5ac@nttcom.co.jp	2024-07-24 20:54:51 +09:00
Michael Paquier	c145f321b6	Propagate query IDs of utility statements in functions For utility statements defined within a function, the query tree is copied to a PlannedStmt as utility commands do not require planning. However, the query ID was missing from the information passed down. This leads to plugins relying on the query ID like pg_stat_statements to not be able to track utility statements within function calls. Tests are added to check this behavior, depending on pg_stat_statements.track. This is an old bug. Now, query IDs for utilities are compiled using their parsed trees rather than the query string since v16 (3db72ebcbe20), leading to less bloat with utilities, so backpatch down only to this version. Author: Anthonin Bonnefoy Discussion: https://postgr.es/m/CAO6_XqrGp-uwBqi3vBPLuRULKkddjC7R5QZCgsFren=8E+m2Sg@mail.gmail.com Backpatch-through: 16	2024-07-19 10:21:01 +09:00
Peter Eisentraut	17974ec259	Revise GUC names quoting in messages again After further review, we want to move in the direction of always quoting GUC names in error messages, rather than the previous (PG16) wildly mixed practice or the intermittent (mid-PG17) idea of doing this depending on how possibly confusing the GUC name is. This commit applies appropriate quotes to (almost?) all mentions of GUC names in error messages. It partially supersedes a243569bf65 and 8d9978a7176, which had moved things a bit in the opposite direction but which then were abandoned in a partial state. Author: Peter Smith <smithpb2250@gmail.com> Discussion: https://www.postgresql.org/message-id/flat/CAHut%2BPv-kSN8SkxSdoHano_wPubqcg5789ejhCDZAcLFceBR-w%40mail.gmail.com	2024-05-17 11:44:26 +02:00
Nathan Bossart	3b42bdb471	Use new overflow-safe integer comparison functions. Commit 6b80394781 introduced integer comparison functions designed to be as efficient as possible while avoiding overflow. This commit makes use of these functions in many of the in-tree qsort() comparators to help ensure transitivity. Many of these comparator functions should also see a small performance boost. Author: Mats Kindahl Reviewed-by: Andres Freund, Fabrízio de Royes Mello Discussion: https://postgr.es/m/CA%2B14426g2Wa9QuUpmakwPxXFWG_1FaY0AsApkvcTBy-YfS6uaw%40mail.gmail.com	2024-02-16 14:05:36 -06:00
Bruce Momjian	29275b1d17	Update copyright for 2024 Reported-by: Michael Paquier Discussion: https://postgr.es/m/ZZKTDPxBBMt3C0J9@paquier.xyz Backpatch-through: 12	2024-01-03 20:49:05 -05:00
Peter Eisentraut	8ce5aef2e6	Make Perl warnings fatal in newly added TAP tests New TAP tests added by commits 9a17be1e and 4710b67d missed to convert warnings to FATAL. This commit fixes that. Author: Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com>	2024-01-03 13:16:55 +01:00
Peter Eisentraut	1141e29b61	Revert "pg_stat_statements: Add coverage for entry_dealloc()" This reverts commit 742f6b3e6df980d7dafa4a18a165d285483d5f0e. The new test failed on big-endian platforms. Discussion: https://www.postgresql.org/message-id/flat/40d1e4f2-835f-448f-a541-8ff5db75bf3d@eisentraut.org	2023-12-31 15:30:57 +01:00
Peter Eisentraut	4710b67d4d	pg_stat_statements: Add TAP test for testing restarts This tests that pg_stat_statement contents are successfully kept across restart. (This similar to src/test/recovery/t/029_stats_restart.pl for the stats collector.) Reviewed-by: Michael Paquier <michael@paquier.xyz> Reviewed-by: Julien Rouhaud <rjuju123@gmail.com> Discussion: https://www.postgresql.org/message-id/flat/40d1e4f2-835f-448f-a541-8ff5db75bf3d@eisentraut.org	2023-12-30 20:18:23 +01:00
Peter Eisentraut	742f6b3e6d	pg_stat_statements: Add coverage for entry_dealloc() This involves creating more than pg_stat_statements.max entries and checking that the limit is kept and the least used entries are kicked out. Reviewed-by: Michael Paquier <michael@paquier.xyz> Reviewed-by: Julien Rouhaud <rjuju123@gmail.com> Discussion: https://www.postgresql.org/message-id/flat/40d1e4f2-835f-448f-a541-8ff5db75bf3d@eisentraut.org	2023-12-30 20:18:23 +01:00
Peter Eisentraut	3e527aeeed	pg_stat_statements: Add test coverage for pg_stat_statements_reset_1_7 Run pg_stat_statements_reset() once while the appropriate extension version is installed. Reviewed-by: Michael Paquier <michael@paquier.xyz> Discussion: https://www.postgresql.org/message-id/flat/40d1e4f2-835f-448f-a541-8ff5db75bf3d@eisentraut.org	2023-12-27 10:48:01 +01:00
Peter Eisentraut	3727b8d0e3	pg_stat_statements: Add test coverage for pg_stat_statements_1_8() This requires reading pg_stat_statements at least once while the 1.8 version of the extension is installed. Reviewed-by: Michael Paquier <michael@paquier.xyz> Discussion: https://www.postgresql.org/message-id/flat/40d1e4f2-835f-448f-a541-8ff5db75bf3d@eisentraut.org	2023-12-27 10:48:01 +01:00
Alexander Korotkov	dc9f8a7983	Track statement entry timestamp in contrib/pg_stat_statements This patch adds 'stats_since' and 'minmax_stats_since' columns to the pg_stat_statements view and pg_stat_statements() function. The new min/max reset mode for the pg_stat_stetments_reset() function is controlled by the parameter minmax_only. 'stat_since' column is populated with the current timestamp when a new statement is added to the pg_stat_statements hashtable. It provides clean information about statistics collection time intervals for each statement. Besides it can be used by sampling solutions to detect situations when a statement was evicted and stored again between samples. Such a sampling solution could derive any pg_stat_statements statistic values for an interval between two samples with the exception of all min/max statistics. To address this issue this patch adds the ability to reset min/max statistics independently of the statement reset using the new minmax_only parameter of the pg_stat_statements_reset(userid oid, dbid oid, queryid bigint, minmax_only boolean) function. The timestamp of such reset is stored in the minmax_stats_since field for each statement. pg_stat_statements_reset() function now returns the timestamp of a reset as the result. Discussion: https://postgr.es/m/flat/72e80e7b160a6eb189df9ef6f068cce3765d37f8.camel%40moonset.ru Author: Andrei Zubkov Reviewed-by: Julien Rouhaud, Hayato Kuroda, Yuki Seino, Chengxi Sun Reviewed-by: Anton Melnikov, Darren Rush, Michael Paquier, Sergei Kornilov Reviewed-by: Alena Rybakina, Andrei Lepikhov	2023-11-27 02:52:17 +02:00
Alexander Korotkov	6ab1dbd26b	Add NOT NULL checking of pg_stat_statements_reset() in tests This is preliminary patch. It adds NOT NULL checking for the result of pg_stat_statements_reset() function. It is needed for upcoming patch "Track statement entry timestamp" that will change the result type of this function to the timestamp of a reset performed. Discussion: https://postgr.es/m/flat/72e80e7b160a6eb189df9ef6f068cce3765d37f8.camel%40moonset.ru Author: Andrei Zubkov Reviewed-by: Julien Rouhaud, Hayato Kuroda, Yuki Seino, Chengxi Sun Reviewed-by: Anton Melnikov, Darren Rush, Michael Paquier, Sergei Kornilov Reviewed-by: Alena Rybakina, Andrei Lepikhov	2023-11-27 02:52:17 +02:00
Michael Paquier	108161bcb9	pg_stat_statements: Remove duplicated tests for SET statements This looks like a copy-paste mistake introduced in de2aca288569, that has added checks for more patterns of SET statements while ignoring the original test block that existed. Backpatch down to where this has been introduced, as this shaves some cycles. Author: Sergei Kornilov Discussion: https://postgr.es/m/5689421699428803@mail-sendbernar-production-main-46.myt.yp-c.yandex.net Backpatch-through: 16	2023-11-09 10:04:31 +09:00
Tom Lane	76db9cb636	Fix some issues with tracking nesting level in pg_stat_statements. When we decide that we don't want to track execution time of a specific planner or ProcessUtility call, we still have to increment the nesting depth, or we'll make the wrong determination of whether we are at top level when considering nested statements. (PREPARE and EXECUTE are exceptions, for reasons explained in the code.) Counting planner nesting depth separately from executor nesting depth was a mistake: it causes us to make the wrong determination of whether we are at top level when considering nested statements that get executed during planning (as a result of constant-folding of functions, for example). Merge those counters into one. In passing, get rid of the PGSS_HANDLED_UTILITY macro in favor of explicitly listing statement types. It seems somewhat coincidental that PREPARE and EXECUTE are handled alike in each of the places where that was used: the reasoning tends to be different for each one. Thus, the macro seems as likely to encourage future bugs as prevent them, since it's quite unclear whether any future statement type that might need special-casing here would also need the same choices at each spot. Sergei Kornilov, Julien Rouhaud, and Tom Lane, per bug #17552 from Maxim Boguk. This is pretty clearly a bug fix, but it's also a behavioral change that might surprise somebody, so no back-patch. Discussion: https://postgr.es/m/17552-213b534c56ab5d02@postgresql.org	2023-11-08 12:01:28 -05:00
Peter Eisentraut	611806cd72	Add trailing commas to enum definitions Since C99, there can be a trailing comma after the last value in an enum definition. A lot of new code has been introducing this style on the fly. Some new patches are now taking an inconsistent approach to this. Some add the last comma on the fly if they add a new last value, some are trying to preserve the existing style in each place, some are even dropping the last comma if there was one. We could nudge this all in a consistent direction if we just add the trailing commas everywhere once. I omitted a few places where there was a fixed "last" value that will always stay last. I also skipped the header files of libpq and ecpg, in case people want to use those with older compilers. There were also a small number of cases where the enum type wasn't used anywhere (but the enum values were), which ended up confusing pgindent a bit, so I left those alone. Discussion: https://www.postgresql.org/message-id/flat/386f8c45-c8ac-4681-8add-e3b0852c1620%40eisentraut.org	2023-10-26 09:20:54 +02:00
Michael Paquier	5147ab1dd3	pg_stat_statements: Add local_blk_{read\|write}_time This commit adds to pg_stat_statements the two new fields for local buffers introduced by 295c36c0c1fa, adding the time spent to read and write these blocks. These are similar to what is done for temp and shared blocks. This information available only if track_io_timing is enabled. Like for 5a3423ad8ee17, no version bump is required in the module. Author: Nazir Bilal Yavuz Reviewed-by: Robert Haas, Melanie Plageman Discussion: https://postgr.es/m/CAN55FZ19Ss279mZuqGbuUNxka0iPbLgYuOQXqAKewrjNrp27VA@mail.gmail.com	2023-10-19 14:03:31 +09:00
Michael Paquier	13d00729d4	Rename I/O timing statistics columns to shared_blk_{read\|write}_time These two counters, defined in BufferUsage to track respectively the time spent while reading and writing blocks have historically only tracked data related to shared buffers, when track_io_timing is enabled. An upcoming patch to add specific counters for local buffers will take advantage of this rename as it has come up that no data is currently tracked for local buffers, and tracking local and shared buffers using the same fields would be inconsistent with the treatment done for temp buffers. Renaming the existing fields clarifies what the block type of each stats field is. pg_stat_statement is updated to reflect the rename. No extension version bump is required as 5a3423ad8ee17 has done one, affecting v17~. Author: Nazir Bilal Yavuz Reviewed-by: Robert Haas, Melanie Plageman Discussion: https://postgr.es/m/CAN55FZ19Ss279mZuqGbuUNxka0iPbLgYuOQXqAKewrjNrp27VA@mail.gmail.com	2023-10-19 11:26:40 +09:00
Daniel Gustafsson	c5a032e518	Update oldextversions test for pg_stat_statements 1.11 Commit 5a3423ad8e updated pg_stat_statements to 1.11 due to added columns in the view for jit deform counters. Fixing oldextversion was however missed in that commit. This adds a test for the 1.11 version view definition. Author: Nazir Bilal Yavuz <byavuz81@gmail.com> Discussion: https://postgr.es/m/CAN55FZ2Y7c2VEg4S1KDXFcaHjyF8=KZa_rMV6WOe+KwE-KE97g@mail.gmail.com	2023-10-13 13:50:18 +02:00
Michael Paquier	11c34b342b	Show parameters of CALL as constants in pg_stat_statements This commit changes the query jumbling of CallStmt so as its IN/OUT parameters are able to show up as constants with a parameter symbol in pg_stat_statements, like: CALL proc1($1, $2); CALL proc2($1, $2, $3); The transformed FuncExpr is used in the query ID computation instead of the FuncCall generated by the parser, so as it is sensitive to the OID of the procedure and its list of input arguments. The output arguments are handled in a separate list in CallStmt, which is also included in the computation. Tests are added to pg_stat_statements to show how this affects CALL with IN/OUT parameters as well as overloaded functions. Like 638d42a3c520 or 31de7e60da34, this improves the monitoring of workloads with a lot of CALL statements, preventing unnecessary bloat when these use different input (or event output) values. Author: Sami Imseih Discussion: https://postgr.es/m/B44FA29D-EBD0-4DD9-ABC2-16F1CB087074@amazon.com	2023-09-28 15:17:55 +09:00
Andres Freund	7369798a83	Fix tracking of temp table relation extensions as writes Karina figured out that I (Andres) confused BufferUsage.temp_blks_written with BufferUsage.local_blks_written in fcdda1e4b5. Tests in core PG can't easily test this, as BufferUsage is just used for EXPLAIN (ANALYZE, BUFFERS) and pg_stat_statements. Thus this commit adds tests for this to pg_stat_statements. Reported-by: Karina Litskevich <litskevichkarina@gmail.com> Author: Karina Litskevich <litskevichkarina@gmail.com> Author: Andres Freund <andres@anarazel.de> Discussion: https://postgr.es/m/CACiT8ibxXA6+0amGikbeFhm8B84XdQVo6D0Qfd1pQ1s8zpsnxQ@mail.gmail.com Backpatch: 16-, where fcdda1e4b5 was merged	2023-09-13 19:14:09 -07:00
Daniel Gustafsson	5a3423ad8e	Add JIT deform_counter generation_counter includes time spent on both JIT:ing expressions and tuple deforming which are configured independently via options jit_expressions and jit_tuple_deforming. As they are combined in the same counter it's not apparent what fraction of time the tuple deforming takes. This adds deform_counter dedicated to tuple deforming, which allows seeing more directly the influence jit_tuple_deforming is having on the query. The counter is exposed in EXPLAIN and pg_stat_statements bumpin pg_stat_statements to 1.11. Author: Dmitry Dolgov <9erthalion6@gmail.com> Reviewed-by: Pavel Stehule <pavel.stehule@gmail.com> Reviewed-by: Daniel Gustafsson <daniel@yesql.se> Discussion: https://postgr.es/m/20220612091253.eegstkufdsu4kfls@erthalion.local	2023-09-08 15:05:12 +02:00
Michael Paquier	bb45156f34	Show names of DEALLOCATE as constants in pg_stat_statements This commit switches query jumbling so as prepared statement names are treated as constants in DeallocateStmt. A boolean field is added to DeallocateStmt to make a distinction between ALL and named prepared statements, as "name" was used to make this difference before, NULL meaning DEALLOCATE ALL. Prior to this commit, DEALLOCATE was not tracked in pg_stat_statements, for the reason that it was not possible to treat its name parameter as a constant. Now that query jumbling applies to all the utility nodes, this reason does not apply anymore. Like 638d42a3c520, this can be a huge advantage for monitoring where prepared statement names are randomly generated, preventing bloat in pg_stat_statements. A couple of tests are added to track the new behavior. Author: Dagfinn Ilmari Mannsåker, Michael Paquier Reviewed-by: Julien Rouhaud Discussion: https://postgr.es/m/ZMhT9kNtJJsHw6jK@paquier.xyz	2023-08-27 17:27:44 +09:00
Michael Paquier	638d42a3c5	Show GIDs of two-phase commit commands as constants in pg_stat_statements This relies on the "location" field added to TransactionStmt in 31de7e6, now applied to the "gid" field used by 2PC commands. These commands are now reported like: COMMIT PREPARED $1 PREPARE TRANSACTION $1 ROLLBACK PREPARED $1 Applying constants for these commands is a huge advantage for workloads that rely a lot on 2PC commands with different GIDs. Some tests are added to track the new behavior. Reviewed-by: Julien Rouhaud Discussion: https://postgr.es/m/ZMhT9kNtJJsHw6jK@paquier.xyz	2023-08-12 10:44:15 +09:00
Michael Paquier	31de7e60da	Show savepoint names as constants in pg_stat_statements In pg_stat_statements, savepoint names now show up as constants with a parameter symbol, using as base query string the one added as a new entry to the PGSS hash table, leading to: RELEASE $1 ROLLBACK TO $1 SAVEPOINT $1 Applying constants to these query parts is a huge advantage for workloads that generate randomly savepoint points, like ORMs (Django is at the origin of this patch). The ODBC driver is a second layer that likes a lot savepoints, though it does not use a random naming pattern. A "location" field is added to TransactionStmt, now set only for savepoints. The savepoint name is ignored by the query jumbling. The location can be extended to other query patterns, if required, like 2PC commands. Some tests are added to pg_stat_statements for all the query patterns supported by the parser. ROLLBACK, ROLLBACK TO SAVEPOINT and ROLLBACK TRANSACTION TO SAVEPOINT have the same Node representation, so all these are equivalents. The same happens for RELEASE and RELEASE SAVEPOINT. Author: Greg Sabino Mullane Discussion: https://postgr.es/m/CAKAnmm+2s9PA4OaumwMJReWHk8qvJ_-g1WqxDRDAN1BSUfxyTw@mail.gmail.com	2023-07-27 09:42:33 +09:00
Michael Paquier	bc8e9a6a25	pg_stat_statements: Fix second comment related to entry resets This should have been part of dc73db6, but it got lost in the mix. Oversight in 6b4d23f. Author: Japin Li Discussion: https://postgr.es/m/MEYP282MB1669FC91C764E277821936D3B624A@MEYP282MB1669.AUSP282.PROD.OUTLOOK.COM Backpatch-through: 14	2023-06-29 09:17:26 +09:00
Michael Paquier	dc73db6743	pg_stat_statements: Fix incorrect comment with entry resets Oversight in 6b4d23f. Author: Japin Li, Richard Guo Discussion: https://postgr.es/m/MEYP282MB1669FC91C764E277821936D3B624A@MEYP282MB1669.AUSP282.PROD.OUTLOOK.COM Backpatch-through: 14	2023-06-29 08:04:47 +09:00
Heikki Linnakangas	876d17d62f	Fix comment on clearing padding. Author: Japin Li Discussion: https://www.postgresql.org/message-id/MEYP282MB16696317B5DA7D0D92306149B627A@MEYP282MB1669.AUSP282.PROD.OUTLOOK.COM	2023-06-27 10:12:25 +03:00
Michael Paquier	1d477a907e	Fix row tracking in pg_stat_statements with extended query protocol pg_stat_statements relies on EState->es_processed to count the number of rows processed by ExecutorRun(). This proves to be a problem under the extended query protocol when the result of a query is fetched through more than one call of ExecutorRun(), as es_processed is reset each time ExecutorRun() is called. This causes pg_stat_statements to report the number of rows calculated in the last execute fetch, rather than the global sum of all the rows processed. As pquery.c tells, this is a problem when a portal does not use holdStore. For example, DMLs with RETURNING would report a correct tuple count as these do one execution cycle when the query is first executed to fill in the portal's store with one ExecutorRun(), feeding on the portal's store for each follow-up execute fetch depending on the fetch size requested by the client. The fix proposed for this issue is simple with the addition of an extra counter in EState that's preserved across multiple ExecutorRun() calls, incremented with the value calculated in es_processed. This approach is not back-patchable, unfortunately. Note that libpq does not currently give any way to control the fetch size when using the extended v3 protocol, meaning that in-core testing is not possible yet. This issue can be easily verified with the JDBC driver, though, with autocommit disabled. Hence, having in-core tests requires more features, left for future discussion: - At least two new libpq routines splitting PQsendQueryGuts(), one for the bind/describe and a second for a series of execute fetches with a custom fetch size, likely in a fashion similar to what JDBC does. - A psql meta-command for the execute phase. This part is not strictly mandatory, still it could be handy. Reported-by: Andrew Dunstan (original discovery by Simon Siggs) Author: Sami Imseih Reviewed-by: Tom Lane, Michael Paquier Discussion: https://postgr.es/m/EBE6C507-9EB6-4142-9E4D-38B1673363A7@amazon.com Discussion: https://postgr.es/m/c90890e7-9c89-c34f-d3c5-d5c763a34bd8@dunslane.net	2023-04-06 09:29:03 +09:00
Michael Paquier	daa8365a90	Reflect normalization of query strings for utilities in pg_stat_statements Applying normalization changes how the following query strings are reflected in pg_stat_statements, by showing Const nodes with a dollar-signed parameter as this is how such queries are structured internally once parsed: - DECLARE - EXPLAIN - CREATE MATERIALIZED VIEW - CREATE TABLE AS More normalization could be done in the future depending on the parts where query jumbling is applied (like A_Const nodes?), the changes being reflected in the regression tests in majority created in de2aca2. This just allows the basics to work for utility queries using Const nodes. Reviewed-by: Bertrand Drouvot Discussion: https://postgr.es/m/Y+MRdEq9W9XVa2AB@paquier.xyz	2023-03-08 15:00:50 +09:00
Michael Paquier	9a714b9d6e	Improve cleanup phases in regression tests of pg_stat_statements As shaped, two DROP ROLE queries included in "user_activity" were showing in the reports for "wal". The intention is to keep each test isolated and independent, so this is incorrect. This commit adds some calls to pg_stat_statements_reset() to clean up the statistics once each test finishes, so as there are no risks of overlap in the reports for individial scenarios. The addition in "user_activity" fixes the output of "wal". The new resets done in "level_tracking" and "utility" are added for consistency with the rest, though they do not affect the stats generated in the other tests. Oversight in d0028e3. Reported-by: Andrei Zubkov Discussion: https://postgr.es/m/7beb722dd016bf54f1c78bfd6d44a684e28da624.camel@moonset.ru	2023-03-07 08:58:13 +09:00
Michael Paquier	d0028e35a0	Refactor more the regression tests of pg_stat_statements This commit expands more the refactoring of the regression tests of pg_stat_statements, with tests moved out of pg_stat_statements.sql into separate files. The following file structure is now used: - select is mostly the former pg_stat_statements.sql, renamed. - dml for INSERT/UPDATE/DELETE and MERGE - user_activity, to test role-level checks and stat resets. - wal, to check the WAL generation after some queries. Like e8dbdb1, there is no change in terms of code coverage or results, and this finishes the split I was aiming for in these tests. Most of the tests used "test" of "pgss_test" as names for the tables used, these are renamed to less generic names. Reviewed-by: Bertrand Drouvot Discussion: https://postgr.es/m/Y/7Y9U/y/keAW3qH@paquier.xyz	2023-03-03 08:46:11 +09:00
Michael Paquier	de2aca2885	Expand regression tests of pg_stat_statements for utility queries This commit adds more coverage for utility statements so as it is possible to track down all the effects of query normalization done for all the queries that use either Const or A_Const nodes, which are the nodes where normalization makes the most sense as they apply to constants (well, most of the time, really). This set of queries is extracted from an analysis done while looking at full dumps of the regression database when applying different levels of normalization to either Const or A_Const nodes for utilities, as of a minimal set of these, for: - All relkinds (CREATE, ALTER, DROP) - Policies - Cursors - Triggers - Types - Rules - Statistics - CALL - Transaction statements (isolation level, options) - EXPLAIN - COPY Note that pg_stat_statements is not switched yet to show any normalization for utilities, still it improves the default coverage of the query jumbling code (not by as much as enabling query jumbling on the main regression test suite, though): - queryjumblefuncs.funcs.c: 36.8% => 48.5% - queryjumblefuncs.switch.c: 33.2% => 43.1% Reviewed-by: Bertrand Drouvot Discussion: https://postgr.es/m/Y+MRdEq9W9XVa2AB@paquier.xyz	2023-02-20 10:16:51 +09:00
Michael Paquier	e8dbdb15db	Refactor tests of pg_stat_statements for planning, utility and level tracking pg_stat_statements.sql acts as the main file for all the core tests of the module, but things have become complicated to follow over the years as some of the sub-scenarios tested in this file rely on assumptions that come from completely different areas of it, like a GUC setup or a relation created previously. For example, row tracking for CTAS/COPY was looking at the number of plans, which was not necessary, or level tracking was mixed with checks on planner counts. This commit refactors the tests of pg_stat_statements, by moving test cases out of pg_stat_statements.sql into their own file, as of: - Planning-related tests in planning.sql, for [re]plan counts and top-level handling. These depend on pg_stat_statements.track_planning. - Utilities in utility.sql (pg_stat_statements.track_utility), that includes now the tests for: -- Row tracking for CTAS, CREATE MATERIALIZED VIEW, COPY. -- Basic utility statements. -- SET statements. - Tracking level, depending on pg_stat_statements.track. This part has been looking at scenarios with DO blocks, PL functions and SQL functions. pg_stat_statements.sql (still named the same for now) still includes some checks for role-level tracking and WAL generation metrics, that ought to become independent in the long term for clarity. While on it, this switches the order of the attributes when querying pg_stat_statements, the query field becoming last. This makes much easier the tracking of changes related to normalization, as queries are the only variable-length attributes queried (unaligned mode would be one extra choice, but that reduces the checks on the other fields). Test scenarios and their results match exactly with what was happening before this commit in terms of calls, number of plans, number of rows, cached data or level tracking, so this has no effect on the coverage in terms of what is produced by the reports in the table pg_stat_statements. A follow-up patch will extend more the tests of pg_stat_statements around utilities, so this split creates a foundation for this purpose, without complicating more pg_stat_statements.sql. Reviewed-by: Bertrand Drouvot Discussion: https://postgr.es/m/Y+MRdEq9W9XVa2AB@paquier.xyz	2023-02-20 09:28:29 +09:00
Michael Paquier	9ba37b2cb6	Include values of A_Const nodes in query jumbling Like the implementation for node copy, write and read, this node requires a custom implementation so as the query jumbling is able to consider the correct value assigned to it, depending on its type (int, float, bool, string, bitstring). Based on a dump of pg_stat_statements from the regression database, this would confuse the query jumbling of the following queries: - SET. - COPY TO with SELECT queries. - START TRANSACTION with different isolation levels. - ALTER TABLE with default expressions. - CREATE TABLE with partition bounds. Note that there may be a long-term argument in tracking the location of such nodes so as query strings holding such nodes could be normalized, but this is left as a separate discussion. Oversight in 3db72eb. Discussion: https://postgr.es/m/Y9+HuYslMAP6yyPb@paquier.xyz	2023-02-07 09:03:54 +09:00
Michael Paquier	3db72ebcbe	Generate code for query jumbling through gen_node_support.pl This commit changes the query jumbling code in queryjumblefuncs.c to be generated automatically based on the information of the nodes in the headers of src/include/nodes/ by using gen_node_support.pl. This approach offers many advantages: - Support for query jumbling for all the utility statements, based on the state of their parsed Nodes and not only their query string. This will greatly ease the switch to normalize the information of some DDLs, like SET or CALL for example (this is left unchanged and should be part of a separate discussion). With this feature, the number of entries stored for utilities in pg_stat_statements is reduced (for example now "CHECKPOINT" and "checkpoint" mean the same thing with the same query ID). - Documentation of query jumbling directly in the structure definition of the nodes. Since this code has been introduced in pg_stat_statements and then moved to code, the reasons behind the choices of what should be included in the jumble are rather sparse. Note that some explanation is added for the most relevant parts, as a start. - Overall code reduction and more consistency with the other parts generating read, write and copy depending on the nodes. The query jumbling is controlled by a couple of new node attributes, documented in nodes/nodes.h: - custom_query_jumble, to mark a Node as having a custom implementation. - no_query_jumble, to ignore entirely a Node. - query_jumble_ignore, to ignore a field in a Node. - query_jumble_location, to mark a location in a Node, for normalization. This can apply only to int fields, with "location" in their name (only Const as of this commit). There should be no compatibility impact on pg_stat_statements, as the new code applies the jumbling to the same fields for each node (its regression tests have no modification, for one). Some benchmark of the query jumbling between HEAD and this commit for SELECT and DMLs has proved that this new code does not cause a performance regression, with computation times close for both methods. For utility queries, the new method is slower than the previous method of calculating a hash of the query string, though we are talking about extra ns-level changes based on what I measured, which is unnoticeable even for OLTP workloads as a query ID is calculated once per query post-parse analysis. Author: Michael Paquier Reviewed-by: Peter Eisentraut Discussion: https://postgr.es/m/Y5BHOUhX3zTH/ig6@paquier.xyz	2023-01-31 15:24:05 +09:00

1 2 3 4 5 ...

253 Commits