Re: COPY FREEZE and setting PD_ALL_VISIBLE/visibility map bits - Mailing list pgsql-hackers

From Jeff Janes
Subject Re: COPY FREEZE and setting PD_ALL_VISIBLE/visibility map bits
Msg-id CAMkU=1zduDsROXQ7xy=Td4YDHSVr_ziW2rTKGvf8tKknDTS00A@mail.gmail.com
In response to COPY FREEZE and setting PD_ALL_VISIBLE/visibility map bits  (Pavan Deolasee <pavan.deolasee@gmail.com>)
Responses Re: COPY FREEZE and setting PD_ALL_VISIBLE/visibility map bits
List pgsql-hackers
On Thu, Feb 21, 2019 at 1:05 AM Pavan Deolasee <pavan.deolasee@gmail.com> wrote:
Hi,

Jeff Janes raised an issue [1] about PD_ALL_VISIBLE not being set correctly while loading data via COPY FREEZE and had also posted a draft patch.

I now have what I think is a more complete patch. I took a slightly different approach: instead of setting the PD_ALL_VISIBLE bit initially and then not clearing it during insertion, we now recheck the page for all-frozen, all-visible tuples just before switching to a new page. This allows us to then also set the visibility map bit, like we do in vacuumlazy.c.

Some special treatment is required to handle the last page before bulk insert is shut down. We could have chosen not to do anything special for the last page and let it remain unfrozen, but I thought it makes sense to take that extra effort so that we can completely freeze the table and set all VM bits at the end of COPY FREEZE.

Let me know what you think.

Hi Pavan, thanks for picking this up.

After doing a truncation and '\copy ... with (freeze)' of a table with long data, I find that the associated toast table has a handful of unfrozen blocks.  I don't know if that is an actual problem, but it does seem a bit odd, and thus suspicious.

perl -le 'print join "", map rand(), 1..500 foreach 1..1000000' > foo

create table foobar1 (x text);
begin;
truncate foobar1;
\copy foobar1 from foo with (freeze)
commit;
select all_visible,all_frozen,pd_all_visible, count(*) from pg_visibility('pg_toast.pg_toast_25085') group by 1,2,3;
 all_visible | all_frozen | pd_all_visible |  count  
-------------+------------+----------------+---------
 f           | f          | f              |      18
 t           | t          | t              | 530,361
(2 rows)
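
The toast table name in the query above is derived from the table's OID, so it will differ on other installations. A sketch of the same check that looks up the toast relation through pg_class.reltoastrelid instead of hard-coding it, assuming the pg_visibility extension is available and the table is named foobar1 as above:

```sql
-- pg_visibility() comes from the pg_visibility contrib extension.
create extension if not exists pg_visibility;

-- Resolve foobar1's toast relation via pg_class.reltoastrelid,
-- then summarize the per-block visibility flags as in the query above.
select v.all_visible, v.all_frozen, v.pd_all_visible, count(*)
from pg_class c,
     pg_visibility(c.reltoastrelid::regclass) v
where c.oid = 'foobar1'::regclass
group by 1,2,3;
```

The counts will vary from run to run because the input file is random, so no particular output is implied here.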

Cheers,

Jeff 
