Re: pgsql: Support partition pruning at execution time - Mailing list pgsql-committers
From | David Rowley |
---|---|
Subject | Re: pgsql: Support partition pruning at execution time |
Date | |
Msg-id | CAKJS1f8o2Yd=rOP=Et3A0FWgF+gSAOkFSU6eNhnGzTPV7nN8sQ@mail.gmail.com |
In response to | Re: pgsql: Support partition pruning at execution time (David Rowley <david.rowley@2ndquadrant.com>) |
Responses | Re: pgsql: Support partition pruning at execution time |
List | pgsql-committers |
On 11 April 2018 at 18:58, David Rowley <david.rowley@2ndquadrant.com> wrote:
> On 10 April 2018 at 08:55, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> Alvaro Herrera <alvherre@alvh.no-ip.org> writes:
>>> David Rowley wrote:
>>>> Okay, I've written and attached a fix for this. I'm not 100% certain
>>>> that this is the cause of the problem on pademelon, but the code does
>>>> look wrong, so needs to be fixed. Hopefully, it'll make pademelon
>>>> happy, if not I'll think a bit harder about what might be causing that
>>>> instability.
>>
>>> Pushed it just now. Let's see what happens with pademelon now.
>>
>> I've had pademelon's host running a "make installcheck" loop all day
>> trying to reproduce the problem. I haven't gotten a bite yet (although
>> at 15+ minutes per cycle, this isn't a huge number of tests). I think
>> we were remarkably (un)lucky to see the problem so quickly after the
>> initial commit, and I'm afraid pademelon isn't going to help us prove
>> much about whether this was the same issue.
>>
>> This does remind me quite a bit though of the ongoing saga with the
>> postgres_fdw test instability. Given the frequency with which that's
>> failing in the buildfarm, you would not think it's impossible to
>> reproduce outside the buildfarm, and yet I'm here to tell you that
>> it's pretty damn hard. I haven't succeeded yet, and that's not for
>> lack of trying. Could there be something about the buildfarm
>> environment that makes these sorts of things more likely?
>
> coypu just demonstrated that this was not the cause of the problem [1]
>
> I'll study the code a bit more and see if I can think why this might
> be happening.
>
> [1] https://buildfarm.postgresql.org/cgi-bin/show_stage_log.pl?nm=coypu&dt=2018-04-11%2004%3A17%3A38&stg=install-check-C

I've spent a bit of time tonight trying to dig into this problem to
see if I can figure out what's going on.

I ended up running the following script on both a Linux x86_64 machine
and also a power8 machine.
#!/bin/bash
for x in {1..1000}
do
    echo "$x";
    for i in {1..1000}
    do
        psql -d postgres -f test.sql -o test.out
        diff -u test.out test.expect
    done
done

I was unable to recreate this problem after about 700k loops on the
Linux machine and 130k loops on the power8.

I've emailed the owner of coypu to ask if it would be possible to get
access to the machine, or have him run the script to see if it does
actually fail. Currently waiting to hear back.

I've made another pass over the nodeAppend.c code and I'm unable to
see what might cause this, although I did discover a bug where
first_partial_plan is not set taking into account that some subplans
may have been pruned away during executor init. The only thing I think
this would cause is for parallel workers to not properly help out with
some partial plans if some earlier subplans were pruned. I can see no
reason for this to have caused this particular issue since the
first_partial_plan would be 0 with and without the attached fix.

Tom, would there be any chance you could run the above script for a
while on pademelon to see if it can in fact reproduce the problem?
coypu did show this problem in the install check, so I don't think it
will need the other concurrent tests to fail.

If you can recreate it, after adjusting the expected output, does the
problem still exist in 5c0675215?

I also checked which other tests perform an EXPLAIN ANALYZE of a plan
with a Parallel Append and I see there are none. So I've not ruled out
that this is an existing bug. git grep "explain.*analyze" also does
not show much outside of the partition_prune tests either.

--
 David Rowley                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services
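To illustrate the first_partial_plan issue described above, here is a minimal sketch of how an index into the original subplan list can go stale once some subplans are pruned. It is not the code from nodeAppend.c or from the attached fix. The function name first_valid_partial is hypothetical, and the sketch assumes that non-partial subplans are listed before partial ones. It is written in bash only to match the script in the message.

```bash
#!/bin/bash
# Hypothetical helper, not PostgreSQL code.
# Usage: first_valid_partial ORIG_FIRST_PARTIAL SURVIVING_ORIG_INDEX...
# Prints the position, within the surviving subplans, of the first one
# whose original index is >= ORIG_FIRST_PARTIAL. If no partial subplan
# survived, prints the number of surviving subplans (one past the end).
# Assumption: surviving indexes are given in ascending order and
# non-partial subplans precede partial ones in the original list.
first_valid_partial() {
    local orig_first_partial=$1
    shift
    local new_index=0
    local orig_index
    for orig_index in "$@"; do
        if (( orig_index >= orig_first_partial )); then
            echo "$new_index"
            return 0
        fi
        new_index=$(( new_index + 1 ))
    done
    echo "$new_index"
}

# Original list: subplans 0 and 1 are non-partial, 2..4 are partial.
# Subplans 1 and 2 are pruned, leaving original indexes 0, 3 and 4.
first_valid_partial 2 0 3 4   # prints 1; the stale value 2 would skip subplan 3

# All subplans partial: the answer is 0 with or without the adjustment.
first_valid_partial 0 1 2     # prints 0
```

The second call matches the observation in the message: when first_partial_plan is already 0, pruning cannot change it, so this bug would not explain the unstable test output.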
Attachment