Re: COPY FREEZE and setting PD_ALL_VISIBLE/visibility map bits - Mailing list pgsql-hackers
| From | Anastasia Lubennikova |
|---|---|
| Subject | Re: COPY FREEZE and setting PD_ALL_VISIBLE/visibility map bits |
| Date | |
| Msg-id | 58359211-c505-1ee6-c5fd-78db84afb46f@postgrespro.ru |
| In response to | Re: COPY FREEZE and setting PD_ALL_VISIBLE/visibility map bits (Tomas Vondra <tomas.vondra@enterprisedb.com>) |
| Responses | Re: COPY FREEZE and setting PD_ALL_VISIBLE/visibility map bits |
| List | pgsql-hackers |
On 12.01.2021 22:30, Tomas Vondra wrote:
> Thanks. These patches seem to resolve the TOAST table issue, freezing
> it as expected. I think the code duplication is not an issue, but I
> wonder why heap_insert uses this condition:
>
> /*
> * ...
> *
> * No need to update the visibilitymap if it had all_frozen bit set
> * before this insertion.
> */
> if (all_frozen_set && ((vmstatus & VISIBILITYMAP_ALL_FROZEN) == 0))
>
> while heap_multi_insert only does this:
>
> if (all_frozen_set) { ... }
>
> I haven't looked at the details, but shouldn't both do the same thing?
I added this check to heap_insert() to avoid unneeded calls to
visibilitymap_set(). When tuples are inserted one by one, the first
insertion into a page sets the bit and every later insertion into the same
page can skip the call, so visibilitymap_set() runs only once per page.
In my understanding, heap_multi_insert() fills a page with a whole batch
in a single call, so it doesn't need this optimization.
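Roughly, the two call sites differ like this (a condensed sketch with the
visibilitymap_set() argument lists elided, not the literal patch code):

```c
/*
 * heap_insert(): called once per tuple, so several consecutive calls can
 * land on the same page.  Re-checking the already-fetched vmstatus lets
 * every call after the first skip the redundant visibilitymap_set().
 */
if (all_frozen_set && (vmstatus & VISIBILITYMAP_ALL_FROZEN) == 0)
    visibilitymap_set( /* rel, block, buffers, cutoff xid, flags */ );

/*
 * heap_multi_insert(): a single call fills the page with a whole batch,
 * so the bit is set at most once per page anyway and no re-check is needed.
 */
if (all_frozen_set)
    visibilitymap_set( /* rel, block, buffers, cutoff xid, flags */ );
```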
>
>
> However, I've also repeated the test counting all-frozen pages in both
> the main table and TOAST table, and I get this:
>
> patched
> =======
>
> select count(*) from pg_visibility((select reltoastrelid from pg_class
> where relname = 't'));
>
> count
> --------
> 100002
> (1 row)
>
>
> select count(*) from pg_visibility((select reltoastrelid from pg_class
> where relname = 't')) where not all_visible;
>
> count
> --------
> 0
> (1 row)
>
> That is - all TOAST pages are frozen (as expected, which is good). But
> now there are 100002 pages, not just 100000 pages. That is, we're now
> creating 2 extra pages, for some reason. I recall Pavan reported a
> similar issue with every 32768-th page not being properly filled, but
> I'm not sure if that's the same issue.
>
>
> regards
>
As Pavan correctly figured out earlier, the problem is that
RelationGetBufferForTuple() moves on to a new page, losing the free space
remaining in the current block:
> ... I see that a relcache invalidation arrives
> after 1st and then after every 32672th block is filled. That clears the
> rel->rd_smgr field and we lose the information about the saved target
> block. The code then moves to extend the relation again and thus skips the
> previously less-than-half filled block, losing the free space in that block.
The reason for this cache invalidation is the vm_extend() call, which
happens every 32672 blocks.
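That interval is simply the number of heap blocks covered by one visibility
map page, so each vm_extend() adds a VM page and sends a relcache
invalidation. A quick back-of-the-envelope check (assuming the default 8 kB
BLCKSZ and the usual 24-byte page header, per visibilitymap.c):

```c
#include <stdio.h>

int main(void)
{
    int blcksz = 8192;       /* default BLCKSZ */
    int page_header = 24;    /* SizeOfPageHeaderData */
    int blocks_per_byte = 4; /* 2 VM bits (all-visible, all-frozen) per heap block */

    /* heap blocks tracked by one VM page: (8192 - 24) * 4 = 32672 */
    printf("%d\n", (blcksz - page_header) * blocks_per_byte);
    return 0;
}
```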
RelationGetBufferForTuple() does try to fall back to the last page, but for
some reason that code sits under the 'use_fsm' check, and COPY FROM doesn't
use the FSM (see TABLE_INSERT_SKIP_FSM):
```c
/*
 * If the FSM knows nothing of the rel, try the last page before we
 * give up and extend.  This avoids one-tuple-per-page syndrome during
 * bootstrapping or in a recently-started system.
 */
if (targetBlock == InvalidBlockNumber)
{
    BlockNumber nblocks = RelationGetNumberOfBlocks(relation);

    if (nblocks > 0)
        targetBlock = nblocks - 1;
}
```
I think we can run this fallback regardless of 'use_fsm'. With this change,
the number of TOAST relation pages is correct. The patch is attached.
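For reference, the shape of the change might look roughly like this (a
sketch of the relevant part of RelationGetBufferForTuple() in hio.c, not
the attached patch verbatim):

```c
if (use_fsm)
{
    /* Ask the FSM for a page with enough free space, as before. */
    targetBlock = GetPageWithFreeSpace(relation, len + saveFreeSpace);
}

/*
 * Last-page fallback, now outside the use_fsm branch: even when the FSM
 * is skipped (e.g. COPY with TABLE_INSERT_SKIP_FSM), try the last
 * existing page before extending, so the relcache invalidation from
 * vm_extend() doesn't make us abandon that page's remaining free space.
 */
if (targetBlock == InvalidBlockNumber)
{
    BlockNumber nblocks = RelationGetNumberOfBlocks(relation);

    if (nblocks > 0)
        targetBlock = nblocks - 1;
}
```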
--
Anastasia Lubennikova
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company