Re: FSM rewrite: doc changes - Mailing list pgsql-hackers
From | Heikki Linnakangas |
---|---|
Subject | Re: FSM rewrite: doc changes |
Date | |
Msg-id | 48E0D85C.2090205@enterprisedb.com Whole thread Raw |
In response to | Re: FSM rewrite: doc changes (Tom Lane <tgl@sss.pgh.pa.us>) |
Responses |
Re: FSM rewrite: doc changes
|
List | pgsql-hackers |
Tom Lane wrote: > FreeSpaceMapTruncateRel seems to have a bug in its early-exit test: in the > case where the number of FSM blocks stays the same, it fails to zero out slots > in the last block. I also think it's got an off-by-one problem in figuring > the number of FSM blocks: for the normal case where the new heap end is in > the middle of a FSM block, shouldn't new_nfsmblocks be one larger than it > is? The case where nblocks is an exact multiple of SlotsPerFSMPage would > need to be special-cased to be exactly correct, though I see no real harm in > letting the FSM be left one page too big in that case. > > fsm_truncate_avail seems quite broken: it's clearing the whole page always. Yep, I noticed these myself on Friday after sending the patch.. > I do not like the kluge in heap_xlog_clean one bit, and think it's unnecessary > anyway since we are not relying on the FSM to be accurate. Suggest reverting > the heapam.c changes except for heap_sync(). The point is to have a more up-to-date FSM after recovery. PITR recovery in a warm stand-by server in particular. I'll take it out for now, but it needs more discussion. In fact, I think we should update the FSM even more aggressively, on inserts and updates as well vacuums. Probably not on all inserts and updates, though, to keep the overhead minimal, but there's a tradeoff somewhere in between. > The new search algorithm in fsm_search_avail still doesn't work. Consider > what happens when the target is the rightmost slot on the page; it certainly > won't wrap properly. Crap, you're right. > In fsm_rebuild_page, surely we needn't check "if (lchild < NodesPerPage)". Yes, we do. There can be is a number of completely unused upper nodes on the right. The upper levels of the tree are complete, but the bottom level is missing enough nodes to make room for the page header. So the tree looks something like this: X X X X X X X X X X X X . . . Where . is a missing node. The parents that miss both children will always be zero. > This test in fsm_space_needed_to_cat: > if (needed >= (FSM_CATEGORIES - 1) * FSM_CAT_STEP) > elog(ERROR, "invalid FSM request size"); > reveals a rather fundamental problem: it is clearly possible > for this test to fail on valid request sizes, because the page > header overhead is less than FSM_CAT_STEP (especially if BLCKSZ > is more than 8K). I'm not sure about a really clean solution > here. We could offset the needed_to_cat and avail_to_cat > calculations so that category 255 corresponds exactly to the > maximum possible free space, but that requires assuming that FSM > knows exactly what that is, which is a bit unpleasant. Thoughts? Hmph. The other alternative is to use 2 bytes instead of one per page, and track the free space exactly. But I'd rather not do that just to deal with the very special case of huge requests. Or we could just return -1 instead of throwing an error. Requests higher than the limit would then always have to extend the heap. That's not good, but I think we already have that problem for tuples of exactly MaxHeapTupleSize bytes. Since PageGetFreeSpace subtracts the size of a new line pointer, only newly extended pages that have never had any tuples on them have enough space, as determined by PagetGetFreeSpace, to fit a tuple of MaxHeapTupleSize bytes. -- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com
pgsql-hackers by date: