Re: Freeze avoidance of very large table. - Mailing list pgsql-hackers
From | Masahiko Sawada |
---|---|
Subject | Re: Freeze avoidance of very large table. |
Date | |
Msg-id | CAD21AoCTJ4n9qm09S5yucvoDDHHcOJporTFXWbj4ax=NxJ5EPQ@mail.gmail.com Whole thread Raw |
In response to | Re: Freeze avoidance of very large table. (Kyotaro HORIGUCHI <horiguchi.kyotaro@lab.ntt.co.jp>) |
Responses |
Re: Freeze avoidance of very large table.
|
List | pgsql-hackers |
On Thu, Nov 5, 2015 at 6:03 PM, Kyotaro HORIGUCHI <horiguchi.kyotaro@lab.ntt.co.jp> wrote: > Hello, I had a look on v21 patch. > > Though I haven't looked the whole of the patch, I'd like to show > you some comments only for visibilitymap.c and a part of the > documentation. > > > 1. Patch application > > patch command complains about offsets for heapam.c at current > master. > > 2. visitibilymap_test() > > - if (visibilitymap_test(rel, blkno, &vmbuffer)) > + if (visibilitymap_test(rel, blkno, &vmbuffer, VISIBILITYMAP_ALL_VISIBLE) > > The old VM was a simple bitmap so the name _test and the > function are proper but now the bitmap is quad state so it'd be > better chainging the function. Alghough it is not so expensive > to call it twice successively, it is a bit uneasy for me doing > so. One possible shape would be like the following. > > lazy_vacuum_page() > > int vmstate = visibilitymap_get_status(rel, blkno, &vmbuffer); > > if (!(vmstate & VISIBILITYMAP_ALL_VISIBLE)) > > ... > > if (all_frozen && !(vmstate & VISIBILITYMAP_ALL_FROZEN)) > > ... > > if (flags != vmstate) > > visibilitymap_set(...., flags); > > and defining two macros for indivisual tests, > > > #define VM_ALL_VISIBLE(r, b, v) ((vm_get_status((r), (b), (v)) & .._VISIBLE) != 0) > > if (VM_ALL_VISIBLE(rel, blkno, &vmbuffer)) > and > > if (VM_ALL_FROZEN(rel, blkno, &vmbuffer)) > > How about this? > > > 3. visibilitymap.c > > - HEAPBLK_TO_MAPBIT > > In visibilitymap_clear and other functions, mapBit means > mapDualBit in the patch, and mapBit always appears in the form > "mapBit * BITS_PER_HEAPBLOCK". So, it'd be better to change the > definition of HEAPBLK_TO_MAPBIT so that it calculates really > the bit position in a byte. > > - #define HEAPBLK_TO_MAPBIT(x) ((x) % HEAPBLOCKS_PER_BYTE) > + #define HEAPBLK_TO_MAPBIT(x) (((x) % HEAPBLOCKS_PER_BYTE) * BITS_PER_HEAPBLOCK) > > > - visibilitymap_count() > > The third argument all_frozen is not necessary in some > usage. So this interface would be preferable to be as > following, > > BlockNumber > visibilitymap_count(Relation rel, BlockNumber *all_frozen) > { > BlockNumber all_visible = 0; > ... > if (all_frozen) > *all_frozen = 0; > ... something like ... > > - visibilitymap_set() > > The check for ALL_VISIBLE is duplicate in the following > assertion. > > > Assert((flags & VISIBILITYMAP_ALL_VISIBLE) || > > (flags & (VISIBILITYMAP_ALL_VISIBLE | VISIBILITYMAP_ALL_FROZEN))); > > > > 4. documentation > > - 18.11.1 Statement Hehavior > > A typo. > > > VACUUM performs *a* aggressive freezing > > However I am not a fluent English speaker, and such > wordsmithing would be done by someone else, I feel that > "eager/greedy" is more suitable for this meaning.., > nevertheless, the term "whole-table freezing" that you wrote > elsewhere in this patch would be usable. > > "VACUUM performs a whole-table freezing" > > All "a table scan/sweep"s and something has the similar > meaning would be better be changed to "a whole-table > freezing" > > In similar manner, "tuples/rows that are marked as frozen" > could be replaced with "unfrozen tuples/rows". > > - 23.1.5 Preventing Transaction ID Wraparound Failures > > "The whole table is scanned only when all pages happen to > require vacuuming to remove dead row versions." > > This description looks a bit out-of-point. "the whole table > scan" in the original description is what is triggered by > relfrozenxid so the correspondent in the revised description > is "the whole-table freezing", maybe. > > "The whole-table feezing takes place when > <structfield>relfrozenxid</> is more than > <varname>vacuum_freeze_table_age</> transactions old or when > <command>VACUUM</>'s <literal>FREEZE</> option is used. The > whole-table freezing scans all unfreezed pages." > > The last sentence might be unnecessary. > > > - 63.4 Visibility Map > > "pages contain only tuples that are marked as frozen" would be > enough to be "pages contain only frozen tuples" > > and according to the discussion upthread, we might be good to > have some desciption that the name is historically omitting > the aspect of freezemap. > > > At Sat, 31 Oct 2015 18:07:32 +0530, Amit Kapila <amit.kapila16@gmail.com> wrote in <CAA4eK1+aTdaSwG3u+y8fXxn67Kkj0T1KzRsFDLEi=tQvTYgFrQ@mail.gmail.com> > amit.kapila16> On Fri, Oct 30, 2015 at 6:03 AM, Masahiko Sawada <sawada.mshk@gmail.com> >> Couple of more review comments: >> ------------------------------------------------------ >> >> 1. >> @@ -615,6 +617,8 @@ typedef struct PgStat_StatTabEntry >> PgStat_Counter n_dead_tuples; >> PgStat_Counter >> changes_since_analyze; >> >> + int32 n_frozen_pages; >> + >> PgStat_Counter blocks_fetched; >> PgStat_Counter >> blocks_hit; >> >> As you are changing above structure, you need to update >> PGSTAT_FILE_FORMAT_ID, refer below code: >> #define PGSTAT_FILE_FORMAT_ID 0x01A5BC9D >> >> 2. It seems that n_frozen_page is not initialized/updated properly >> for toast tables: >> >> Try with below steps: >> >> postgres=# create table t4(c1 int, c2 text); >> CREATE TABLE >> postgres=# select oid, relname from pg_class where relname like '%t4%'; >> oid | relname >> -------+--------- >> 16390 | t4 >> (1 row) >> >> >> postgres=# select oid, relname from pg_class where relname like '%16390%'; >> oid | relname >> -------+---------------------- >> 16393 | pg_toast_16390 >> 16395 | pg_toast_16390_index >> (2 rows) >> >> postgres=# select relname,seq_scan,n_tup_ins,last_vacuum,n_frozen_page from >> pg_s >> tat_all_tables where relname like '%16390%'; >> relname | seq_scan | n_tup_ins | last_vacuum | n_frozen_page >> ----------------+----------+-----------+-------------+--------------- >> pg_toast_16390 | 1 | 0 | | -842150451 >> (1 row) >> >> Note that I have tested above scenario on my Windows 7 m/c. >> >> 3. >> * visibilitymap.c >> * bitmap for tracking visibility of heap tuples >> >> I think above needs to be changed to: >> bitmap for tracking visibility and frozen state of heap tuples >> >> >> 4. >> a. >> /* >> - * If we froze any tuples, mark the buffer dirty, and write a WAL >> - * record recording the changes. We must log the changes to be >> - * crash-safe against future truncation of CLOG. >> + * If we froze any tuples then we mark the buffer dirty, and write a WAL >> >> b. >> - * Release any remaining pin on visibility map page. >> + * Release any remaining pin on visibility map. >> >> c. >> * We do update relallvisible even in the corner case, since if the table >> - * is all-visible >> we'd definitely like to know that. But clamp the value >> - * to be not more than what we're setting >> relpages to. >> + * is all-visible we'd definitely like to know that. >> + * But clamp the value to be not more >> than what we're setting relpages to. >> >> I don't think you need to change above comments. >> >> 5. >> + * Even if scan_all is set so far, we could skip to scan some pages >> + * according by all-frozen >> bit of visibility amp. >> >> /according by/according to >> /amp/map >> >> I suggested to modify comment as below: >> During full scan, we could skip some pages according to all-frozen >> bit of visibility map. >> >> Also no need to start this in new line, start from where the >> previous line of comment ends. >> >> 6. >> /* >> * lazy_scan_heap() -- scan an open heap relation >> * >> * This routine prunes each page in the >> heap, which will among other >> * things truncate dead tuples to dead line pointers, defragment the >> * >> page, and set commit status bits (see heap_page_prune). It also builds >> * lists of dead >> tuples and pages with free space, calculates statistics >> * on the number of live tuples in the >> heap, and marks pages as >> * all-visible if appropriate. >> >> Modify above function header as: >> >> all-visible, all-frozen >> >> 7. >> lazy_scan_heap() >> { >> .. >> >> if (PageIsEmpty(page)) >> { >> empty_pages++; >> freespace = >> PageGetHeapFreeSpace(page); >> >> /* empty pages are always all-visible */ >> if (!PageIsAllVisible(page)) >> .. >> } >> >> Don't we need to ensure that empty pages should get marked as >> all-frozen? >> >> 8. >> lazy_scan_heap() >> { >> .. >> /* >> * As of PostgreSQL 9.2, the visibility map bit should never be set if >> * the page- >> level bit is clear. However, it's possible that the bit >> * got cleared after we checked it >> and before we took the buffer >> * content lock, so we must recheck before jumping to the conclusion >> * that something bad has happened. >> */ >> else if (all_visible_according_to_vm >> && !PageIsAllVisible(page) >> && visibilitymap_test(onerel, blkno, &vmbuffer, >> VISIBILITYMAP_ALL_VISIBLE)) >> { >> elog(WARNING, "page is not marked all-visible >> but visibility map bit is set in relation \"%s\" page %u", >> relname, blkno); >> visibilitymap_clear(onerel, blkno, vmbuffer); >> } >> >> /* >> * >> It's possible for the value returned by GetOldestXmin() to move >> * backwards, so it's not wrong for >> us to see tuples that appear to >> * not be visible to everyone yet, while PD_ALL_VISIBLE is already >> * set. The real safe xmin value never moves backwards, but >> * GetOldestXmin() is >> conservative and sometimes returns a value >> * that's unnecessarily small, so if we see that >> contradiction it just >> * means that the tuples that we think are not visible to everyone yet >> * actually are, and the PD_ALL_VISIBLE flag is correct. >> * >> * There should never >> be dead tuples on a page with PD_ALL_VISIBLE >> * set, however. >> */ >> else >> if (PageIsAllVisible(page) && has_dead_tuples) >> { >> elog(WARNING, "page >> containing dead tuples is marked as all-visible in relation \"%s\" page %u", >> >> relname, blkno); >> PageClearAllVisible(page); >> MarkBufferDirty(buf); >> visibilitymap_clear(onerel, blkno, vmbuffer); >> } >> >> .. >> } >> >> I think both the above cases could happen for frozen state >> as well, unless you think otherwise, we need similar handling >> for frozen bit. > Thank you for reviewing the patch. I changed the patch so that the visibility map become the page info map, in source code and documentation. And fixed review comments I received. Attached v22 patch. > I think both the above cases could happen for frozen state > as well, unless you think otherwise, we need similar handling > for frozen bit. It's not happen the situation where is all-frozen and not all-visible, and the bits of visibility map are cleared at the same time, page flags are as well. So I think it's enough to handle only all-visible situation. Am I missing something? > 2. visitibilymap_test() > > - if (visibilitymap_test(rel, blkno, &vmbuffer)) > + if (visibilitymap_test(rel, blkno, &vmbuffer, VISIBILITYMAP_ALL_VISIBLE) > > The old VM was a simple bitmap so the name _test and the > function are proper but now the bitmap is quad state so it'd be > better chainging the function. Alghough it is not so expensive > to call it twice successively, it is a bit uneasy for me doing > so. One possible shape would be like the following. > > lazy_vacuum_page() > > int vmstate = visibilitymap_get_status(rel, blkno, &vmbuffer); > > if (!(vmstate & VISIBILITYMAP_ALL_VISIBLE)) > > ... > > if (all_frozen && !(vmstate & VISIBILITYMAP_ALL_FROZEN)) > > ... > > if (flags != vmstate) > > visibilitymap_set(...., flags); > > and defining two macros for indivisual tests, > > > #define VM_ALL_VISIBLE(r, b, v) ((vm_get_status((r), (b), (v)) & .._VISIBLE) != 0) > > if (VM_ALL_VISIBLE(rel, blkno, &vmbuffer)) > and > > if (VM_ALL_FROZEN(rel, blkno, &vmbuffer)) > > How about this? I agree. I've changed so. Regards, -- Masahiko Sawada
Attachment
pgsql-hackers by date: