Home > mailing lists

Re: Freeze avoidance of very large table. - Mailing list pgsql-hackers

From	Masahiko Sawada
Subject	Re: Freeze avoidance of very large table.
Date	December 17, 2015 18:18:32
Msg-id	CAD21AoBScUD4k_QWrYGRmbXVruiekPY=2BY2Fxhqq55a+tzUxg@mail.gmail.com Whole thread Raw
In response to	Re: Freeze avoidance of very large table. (Michael Paquier <michael.paquier@gmail.com>)
Responses	Re: Freeze avoidance of very large table.
List	pgsql-hackers

Tree view

On Thu, Dec 17, 2015 at 11:47 AM, Michael Paquier
<michael.paquier@gmail.com> wrote:
> On Thu, Dec 10, 2015 at 3:31 AM, Robert Haas <robertmhaas@gmail.com> wrote:
>> On Mon, Nov 30, 2015 at 12:58 PM, Bruce Momjian <bruce@momjian.us> wrote:
>>> On Mon, Nov 30, 2015 at 10:48:04PM +0530, Masahiko Sawada wrote:
>>>> On Sun, Nov 29, 2015 at 2:21 AM, Jeff Janes <jeff.janes@gmail.com> wrote:
>>>> > On Tue, Nov 24, 2015 at 3:13 PM, Masahiko Sawada <sawada.mshk@gmail.com> wrote:
>>>> >>
>>>> >> Yeah, we need to consider to compute checksum if enabled.
>>>> >> I've changed the patch, and attached.
>>>> >> Please review it.
>>>> >
>>>> > Thanks for the update.  This now conflicts with the updates doesn to
>>>> > fix pg_upgrade out-of-space issue on Windows. I've fixed (I think) the
>>>> > conflict in order to do some testing, but I'd like to get an updated
>>>> > patch from the author in case I did it wrong.  I don't want to find
>>>> > bugs that I just introduced myself.
>>>> >
>>>>
>>>> Thank you for having a look.
>>>
>>> I would not bother mentioning this detail in the pg_upgrade manual page:
>>>
>>> +   Since the format of visibility map has been changed in version 9.6,
>>> +   <application>pg_upgrade</> creates and rewrite new <literal>'_vm'</literal>
>>> +   file even if upgrading from 9.5 or before to 9.6 or later with link mode (-k).
>>
>> Really?  I know we don't always document things like this, but it
>> seems like a good idea to me that we do so.
>
> Just going though v30...
>
> +    frozen. The whole-table freezing is occuerred only when all pages happen to
> +    require freezing to freeze rows. In other cases such as where
>
> I am not really getting the meaning of this sentence. Shouldn't this
> be reworded something like:
> "Freezing occurs on the whole table once all pages of this relation require it."
>
> +    <structfield>relfrozenxid</> is more than
> <varname>vacuum_freeze_table_age</>
> +    transcations old, where <command>VACUUM</>'s <literal>FREEZE</>
> option is used,
> +    <command>VACUUM</> can skip the pages that all tuples on the page
> itself are
> +    marked as frozen.
> +    When all pages of table are eventually marked as frozen by
> <command>VACUUM</>,
> +    after it's finished <literal>age(relfrozenxid)</> should be a little more
> +    than the <varname>vacuum_freeze_min_age</> setting that was used (more by
> +    the number of transcations started since the <command>VACUUM</> started).
> +    If the advancing of <structfield>relfrozenxid</> is not happend until
> +    <varname>autovacuum_freeze_max_age</> is reached, an autovacuum will soon
> +    be forced for the table.
>
> s/transcations/transactions.
>
> +     <entry><structfield>n_frozen_page</></entry>
> +     <entry><type>integer</></entry>
> +     <entry>Number of frozen pages</entry>
> n_frozen_pages?
>
> make check with pg_upgrade is failing for me:
> Visibility map rewriting test failed
> make: *** [check] Error 1

make check with pg_upgrade is done successfully on my environment.
Could you give me more information about this?

> -GetVisibilityMapPins(Relation relation, Buffer buffer1, Buffer buffer2,
> +GetVisibilitymapPins(Relation relation, Buffer buffer1, Buffer buffer2,
> This looks like an unrelated change.
>
>   * Clearing a visibility map bit is not separately WAL-logged.  The callers
>   * must make sure that whenever a bit is cleared, the bit is cleared on WAL
> - * replay of the updating operation as well.
> + * replay of the updating operation as well.  And all-frozen bit must be
> + * cleared with all-visible at the same time.
> This could be reformulated. This is just an addition on top of the
> existing paragraph.
>
> + * The visibility map has the all-frozen bit which indicates all tuples on
> + * corresponding page has been completely frozen, so the visibility map is also
> "have been completely frozen".
>
> -/* Mapping from heap block number to the right bit in the visibility map */
> -#define HEAPBLK_TO_MAPBLOCK(x) ((x) / HEAPBLOCKS_PER_PAGE)
> -#define HEAPBLK_TO_MAPBYTE(x) (((x) % HEAPBLOCKS_PER_PAGE) /
> HEAPBLOCKS_PER_BYTE)
> Those two declarations are just noise in the patch: those definitions
> are unchanged.
>
> -       elog(DEBUG1, "vm_clear %s %d", RelationGetRelationName(rel), heapBlk);
> +       elog(DEBUG1, "vm_clear %s block %d",
> RelationGetRelationName(rel), heapBlk);
> This may be better as a separate patch.

I've attached 001 patch for this separately.

>
> -visibilitymap_count(Relation rel)
> +visibilitymap_count(Relation rel, BlockNumber *all_frozen)
> I think that this routine would gain in clarity if reworked as follows:
> visibilitymap_count(Relation rel, BlockNumber *all_visible,
> BlockNumber *all_frozen)
>
> +               /*
> +                * Report ANALYZE to the stats collector, too.
> However, if doing
> +                * inherited stats we shouldn't report, because the
> stats collector only
> +                * tracks per-table stats.
> +                */
> +               if (!inh)
> +                       pgstat_report_analyze(onerel, totalrows,
> totaldeadrows, relallfrozen);
> Here we already know that this is working in the non-inherited code
> path. As a whole, why that? Why isn't relallfrozen passed as an
> argument of vac_update_relstats and then inserted in pg_class? Maybe I
> am missing something..

IIUC, as per discussion, the number of frozen pages should not be
inserted into pg_class. Because it's not information used by query
planning like relallvisible, repages.

> +        * mxid full-table scan limit. During full scan, we could skip some pags
> +        * according to all-frozen bit of visibility map.
> s/pags/pages
>
> +        * Also, skipping even a single page accorinding to all-visible bit of
> s/accorinding/according.
>
> So, lazy_scan_heap is the central and really vital portion of the patch...
>
> +                               /* Check whether this tuple is alrady
> frozen or not */
> s/alrady/already
>
> -heap_page_is_all_visible(Relation rel, Buffer buf, TransactionId
> *visibility_cutoff_xid)
> +heap_page_is_all_visible(Relation rel, Buffer buf, TransactionId
> *visibility_cutoff_xid,
> +                                                bool *all_frozen)
> I think you would want to change that to heap_page_visible_status that
> returns *all_visible as well. At least it seems to be a more
> consistent style of routine.
>
> + * The format of visibility map is changed with this 9.6 commit,
> + */
> +#define VISIBILITY_MAP_FROZEN_BIT_CAT_VER 201512021
> It looks a bit strange to have a different flag for the vm with the
> new frozen bit. Couldn't we merge that into a unique version number? I
> guess that we should just ask for a vm rewrite anyway in any case if
> pg_upgrade is used on the version of pg with the new vm format, no?

Thank you for your review.
Please find the attached latest v31 patches.

>
> Sawada-san, are you planning to continue working on that? At this
> stage it seems that this patch is not in commitable state and should
> at best be moved to next CF, or at worst returned with feedback.

Yes, of course.
This patch should be marked as "Move to next CF".

Regards,

--
Masahiko Sawada

Attachment

pgsql-hackers by date:

From: Robert Haas
Date: 17 December 2015, 18:12:29
Subject: Re: Inaccurate results from numeric ln(), log(), exp() and pow()

From: Andres Freund
Date: 17 December 2015, 18:20:31
Subject: Re: Fwd: Cluster "stuck" in "not accepting commands to avoid wraparound data loss"

Re: Freeze avoidance of very large table. - Mailing list pgsql-hackers

Attachment

Previous

Next