Re: [PATCH] Tracking statements entry timestamp in pg_stat_statements - Mailing list pgsql-hackers
From | Andrei Zubkov |
---|---|
Subject | Re: [PATCH] Tracking statements entry timestamp in pg_stat_statements |
Date | |
Msg-id | d80d62f28597fff78321350b1d9f38b2ed54c52b.camel@moonset.ru Whole thread Raw |
In response to | Re: [PATCH] Tracking statements entry timestamp in pg_stat_statements (Tomas Vondra <tomas.vondra@enterprisedb.com>) |
List | pgsql-hackers |
Hi Tomas, On Wed, 2023-01-18 at 17:29 +0100, Tomas Vondra wrote: > I took a quick look at this patch, to see if there's something we > want/can get into v16. The last version was submitted about 9 months > ago, and it doesn't apply cleanly anymore, but the bitrot is fairly > minor. Not sure there's still interest, though. Thank you for your attention to this patch! I'm still very interest in this patch. And I think I'll try to rebase this patch during a week or two if it seems possible to get it in 16.. > > I'd probably call it "created_at" or something like > that, but that's a minor detail. Or maybe stats_reset, which is what > we > use in pgstat? Yes there is some naming issue. My thought was the following: - "stats_reset" is not quite correct here, because the statement entry moment if definitely not a reset. The field named just as it means - this is time of the moment from which statistics is collected for this particular entry. - "created_at" perfectly matches the purpose of the field, but seems not such self-explaining to me. > > But is the second timestamp for the min/max fields really useful? > AFAIK > to perform analysis, people take regular pg_stat_statements > snapshots, > which works fine for counters (calculating deltas) but not for gauges > (which need a reset, to track fresh values). But people analyzing > this > are already resetting the whole entry, and so the snapshots already > are > tracking deltas. > > So I'm not convinced actually need the second timestamp. The main purpose of the patch is to provide means to collecting solutions to avoid the reset of pgss at all. Just like it happens for the pg_stat_ views. The only really need of reset is that we can't be sure that observing statement was not evicted and come back since last sample. Right now we only can do a whole reset on each sample and see how many entries will be in pgss hashtable on the next sample - how close this value to the max. If there is a plenty space in hashtable we can hope that there was not evictions since last sample. However there could be reset performed by someone else and we are know nothing about this. Having a timestamp in stats_since field we are sure about how long this statement statistics is tracked. That said sampling solution can totally avoid pgss resets. Avoiding such resets means avoiding interference between monitoring solutions. But if no more resets is done we can't track min/max values, because they still needs a reset and we can do nothing with such resets - they are necessary. However I still want to know when min/max reset was performed. This will help to detect possible interference on such resets. > > > A couple more comments: > > 1) I'm not sure why the patch is adding tests of permissions on the > pg_stat_statements_reset function? > > 2) If we want the second timestamp, shouldn't it also cover resets of > the mean values, not just min/max? I think that mean values shouldn't be target for a partial reset because the value for mean values can be easily reconstructed by the sampling solution without a reset. > > 3) I don't understand why the patch is adding "IS NOT NULL AS t" to > various places in the regression tests. The most of tests was copied from the previous version. I'll recheck them. > > 4) I rather dislike the "minmax" naming, because that's often used in > other contexts (for BRIN indexes), and as I mentioned maybe it should > also cover the "mean" fields. Agreed, but I couldn't make it better. Other versions seemed worse to me... > > Regards, Andrei Zubkov
pgsql-hackers by date: