Thread: BUG #18616: Long-running hash index build can not be interrupted
The following bug has been logged on the website: Bug reference: 18616 Logged by: Alexander Lakhin Email address: exclusion@gmail.com PostgreSQL version: 17rc1 Operating system: Ubuntu 22.04 Description: The following script: CREATE TABLE t(i int); INSERT INTO t SELECT 1 FROM generate_series(1, 10000000); SET maintenance_work_mem = '1GB'; SET statement_timeout = '90s'; CREATE INDEX hi ON t USING hash (i); reaches a state, when the backend can not be interrupted. gdb shows that the code execution loops inside: #0 _h_indexbuild (...) at hashsort.c:151 #1 0x0000557f3671cbf3 in hashbuild (...) at hash.c:183 ... with tups_done increasing slowly (given the total number of tuples). Reproduced on all supported versions.
Hi Pavel,
13.09.2024 17:51, Pavel Borisov wrote:
13.09.2024 17:51, Pavel Borisov wrote:
Reproduced on all supported versions.I was unable to reproduce it on my machine with these settings. Tried statement timeouts 30-120s. Index build appears to be in interruptible phase on my system.But could you check in your environment with the following patch?
Thank you for paying attention to this!
Yes, the patch works for me. The query is interrupted as expected.
But I wonder, why don't you the see same?
Could you please show the server log, with the following change applied and
log_statement = 'all'
backtrace_functions = 'ProcessInterrupts'
?
--- a/src/backend/access/hash/hashsort.c
+++ b/src/backend/access/hash/hashsort.c
@@ -125,6 +125,7 @@ _h_indexbuild(HSpool *hspool, Relation heapRel)
uint32 hashkey = 0;
#endif
+elog(LOG, "_h_indexbuild() start");
tuplesort_performsort(hspool->sortstate);
while ((itup = tuplesort_getindextuple(hspool->sortstate, true)) != NULL)
@@ -151,4 +152,5 @@ _h_indexbuild(HSpool *hspool, Relation heapRel)
pgstat_progress_update_param(PROGRESS_CREATEIDX_TUPLES_DONE,
++tups_done);
}
+elog(LOG, "_h_indexbuild() end; tups_done: %ld", tups_done);
}
Best regards,
Alexander
Alexander Lakhin <exclusion@gmail.com> writes: > 13.09.2024 17:51, Pavel Borisov wrote: >> I was unable to reproduce it on my machine with these settings. Tried statement timeouts 30-120s. Index build appears >> to be in interruptible phase on my system. > Yes, the patch works for me. The query is interrupted as expected. > But I wonder, why don't you the see same? I see the same result as Alexander: the query fails to time out after the expected 90 seconds, and it's looping in _h_indexbuild. Looking at hashbuild, the effective sort_threshold depends on NBuffers, so maybe if you have that set to a high enough value it fails to go into the sort path? If I use SET maintenance_work_mem = '128MB'; instead of the suggested 1GB, I don't see the problem. regards, tom lane
I wrote: > Looking at hashbuild, the effective sort_threshold depends on > NBuffers, so maybe if you have that set to a high enough value > it fails to go into the sort path? If I use > SET maintenance_work_mem = '128MB'; > instead of the suggested 1GB, I don't see the problem. Oh, false alarm: that test in hashbuild takes basically the min of maintenance_work_mem and NBuffers, so that with default NBuffers of 128MB, there's no difference here between those two settings. The reason I see a difference in behavior seems to be that with maintenance_work_mem = 1GB, the tuple sorting step completes faster, allowing control to reach _h_indexbuild before the 10sec timeout I was testing with. With the smaller maintenance_work_mem setting, we're still sorting when it times out --- and there are CHECK_FOR_INTERRUPTS calls in the sort code. regards, tom lane