Re: Making jsonb_agg() faster - Mailing list pgsql-hackers

From Chao Li
Subject Re: Making jsonb_agg() faster
Date
Msg-id 8A521EB7-2716-4E01-8505-98755B78D8EF@gmail.com
Whole thread Raw
In response to Re: Making jsonb_agg() faster  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-hackers

> On Nov 4, 2025, at 02:40, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>
> Chao Li <li.evan.chao@gmail.com> writes:
>>> On Nov 3, 2025, at 04:56, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>>> PFA v3, rebased over 8a27d418f, no substantive changes whatever.
>
>> I am on vacation this week, I only find a hour in the evening and did an eyeball review without actually tracing and
testingthis patch set. 
>
> Thanks for looking at it!
>
>> In elsewhere of the patch, you all use “JsonbInState ps = {0}” to do the initialization, only this place uses
memset().Can we keep consistent and use {0} here also. I see there is a “continue” and a “break” before the memset(),
maybeyou want to avoid unnecessary initialization. I guess that is a tiny saving, but if you think that saving is
worthwhile,I’m fine with keeping the current code. 
>
> I intentionally made the 0001 patch as mechanical as possible,
> and in particular did not second-guess existing decisions about
> whether to initialize a variable in the declaration or later.
> So I'm not excited about doing it differently in this one place.
> There are some other memsets in the same area that could also
> be replaced by "= {0}" if we cared to, but that seems like
> material for a separate cleanup patch.
>
>> Instead of hard-coding 12, can we use "sizeof(TimeTzADT)” for better readability?
>
> No, because that's not the same :-(.  sizeof() would produce 16,
> on most machines anyway, thanks to alignment.  But 12 is the
> type's size according to pg_type.
>
>> You left two “ugly hack” comments. Maybe add a short description for why they are hack and what can be improved in
thefuture. 
>
> It's the same ugly hack as before, I just formatted the code more
> legibly ... I didn't stop to look at whether there's a better way to
> do it.  Again, that seems like material for a different patch.
>
> regards, tom lane

I got some time today, so I ran some performance tests, which show me a significant performance improvement from this
patch.

To run the test, I disabled asserts and built with -O2. The tests ran on a MacBook M4.

To prepare for some data:
```
DROP TABLE IF EXISTS jtest;
CREATE TABLE jtest AS SELECT g AS id, (g % 1000) AS grp, g AS ival, (g::text || '_str') AS sval, now() + (g || '
seconds')::intervalAS tsval FROM generate_series(1, 5_000_000) g; 
VACUUM ANALYZE jtest;
```

Then I ran the following tests against both master and a patched branch:

1. Jsonb_agg

Patched:
```
evantest=# SELECT grp, jsonb_agg(jsonb_build_object('id', id, 'ival', ival, 'sval', sval, 'tsval', tsval)) FROM jtest
GROUPBY grp; 
Time: 5810.921 ms (00:05.811)
evantest=# SELECT grp, jsonb_agg(jsonb_build_object('id', id, 'ival', ival, 'sval', sval, 'tsval', tsval)) FROM jtest
GROUPBY grp; 
Time: 5904.465 ms (00:05.904)
evantest=# SELECT grp, jsonb_agg(jsonb_build_object('id', id, 'ival', ival, 'sval', sval, 'tsval', tsval)) FROM jtest
GROUPBY grp; 
Time: 5857.314 ms (00:05.857)
```

Master:
```
evantest=# SELECT grp, jsonb_agg(jsonb_build_object('id', id, 'ival', ival, 'sval', sval, 'tsval', tsval)) FROM jtest
GROUPBY grp; 
Time: 7310.648 ms (00:07.311)
evantest=# SELECT grp, jsonb_agg(jsonb_build_object('id', id, 'ival', ival, 'sval', sval, 'tsval', tsval)) FROM jtest
GROUPBY grp; 
Time: 7333.428 ms (00:07.333)
evantest=# SELECT grp, jsonb_agg(jsonb_build_object('id', id, 'ival', ival, 'sval', sval, 'tsval', tsval)) FROM jtest
GROUPBY grp; 
Time: 7257.666 ms (00:07.258)
```

Roughly 20% improvement.

2. Jsonb_object_agg

Patched:
```
evantest=# SELECT jsonb_object_agg(id, sval) FROM jtest;
Time: 767.258 ms
evantest=# SELECT jsonb_object_agg(id, sval) FROM jtest;
Time: 760.665 ms
evantest=# SELECT jsonb_object_agg(id, sval) FROM jtest;
Time: 772.326 ms
```

Master:
```
evantest=# SELECT jsonb_object_agg(id, sval) FROM jtest;
Time: 1298.433 ms (00:01.298)
evantest=# SELECT jsonb_object_agg(id, sval) FROM jtest;
Time: 1286.029 ms (00:01.286)
evantest=# SELECT jsonb_object_agg(id, sval) FROM jtest;
Time: 1283.624 ms (00:01.284)
```

Roughly 40% improvement.

3. To_jsonb

Patched:
```
evantest=# SELECT to_jsonb(ival), to_jsonb(sval), to_jsonb(tsval) FROM jtest;
Time: 2270.958 ms (00:02.271)
evantest=# SELECT to_jsonb(ival), to_jsonb(sval), to_jsonb(tsval) FROM jtest;
Time: 2313.483 ms (00:02.313)
evantest=# SELECT to_jsonb(ival), to_jsonb(sval), to_jsonb(tsval) FROM jtest;
Time: 2298.716 ms (00:02.299)
```

Master:
```
evantest=# SELECT to_jsonb(ival), to_jsonb(sval), to_jsonb(tsval) FROM jtest;
Time: 2490.771 ms (00:02.491)
evantest=# SELECT to_jsonb(ival), to_jsonb(sval), to_jsonb(tsval) FROM jtest;
Time: 2514.326 ms (00:02.514)
evantest=# SELECT to_jsonb(ival), to_jsonb(sval), to_jsonb(tsval) FROM jtest;
Time: 2509.924 ms (00:02.510)
```

Roughly 8% improvement.

So, overall, this patch has done a great performance improvement.

Best regards,
--
Chao Li (Evan)
HighGo Software Co., Ltd.
https://www.highgo.com/







pgsql-hackers by date:

Previous
From: Daniel Gustafsson
Date:
Subject: Re: Changing the state of data checksums in a running cluster
Next
From: Heikki Linnakangas
Date:
Subject: Re: LISTEN/NOTIFY bug: VACUUM sets frozenxid past a xid in async queue