Thread: BUG #15594: Unstable tests in contrib/test_decoding/output_iso/
The following bug has been logged on the website: Bug reference: 15594 Logged by: Alexander Lakhin Email address: exclusion@gmail.com PostgreSQL version: 11.1 Operating system: Ubuntu 18.04 Description: Some tests in contrib/test_decoding/output_iso/ occasionally fail. To reproduce it reliable you can do: # sed 's/ mxact / $(shell printf "mxact %.0s" `seq 1000` ) /' -i contrib/test_decoding/Makefile # make check -C contrib/test_decoding/ ... test delayed_startup ... ok test ondisk_startup ... ok test concurrent_ddl_dml ... ok test oldest_xmin ... ok test snapshot_transfer ... ok ============== shutting down postmaster ============== ========================== 19 of 1005 tests failed. ========================== The differences that caused some tests to fail can be viewed in the file ".../contrib/test_decoding/output_iso/regression.diffs". A copy of the test summary that you see above is saved in the file ".../contrib/test_decoding/output_iso/regression.out". The diff file contains: --- .../contrib/test_decoding/expected/delayed_startup.out 2018-02-08 16:57:42.816744192 +0300 +++ .../contrib/test_decoding/output_iso/results/delayed_startup.out 2019-01-16 17:46:33.138962021 +0300 @@ -12,6 +12,8 @@ step s2start: SELECT data FROM pg_logical_slot_get_changes('isolation_slot', NULL, NULL, 'include-xids', 'false'); data +BEGIN +COMMIT step s1b: BEGIN ISOLATION LEVEL SERIALIZABLE; step s1w: INSERT INTO do_write DEFAULT VALUES; step s1c: COMMIT; ====================================================================== ... I observe this issue with the tests: mxact, delayed_startup, and ondisk_startup. Best regards, Alexander
On 2019-01-16 14:56:34 +0000, PG Bug reporting form wrote: > The following bug has been logged on the website: > > Bug reference: 15594 > Logged by: Alexander Lakhin > Email address: exclusion@gmail.com > PostgreSQL version: 11.1 > Operating system: Ubuntu 18.04 > Description: > > Some tests in contrib/test_decoding/output_iso/ occasionally fail. > To reproduce it reliable you can do: > # sed 's/ mxact / $(shell printf "mxact %.0s" `seq 1000` ) /' -i > contrib/test_decoding/Makefile > # make check -C contrib/test_decoding/ > > ... > test delayed_startup ... ok > test ondisk_startup ... ok > test concurrent_ddl_dml ... ok > test oldest_xmin ... ok > test snapshot_transfer ... ok > ============== shutting down postmaster ============== > > ========================== > 19 of 1005 tests failed. > ========================== > The differences that caused some tests to fail can be viewed in the > file ".../contrib/test_decoding/output_iso/regression.diffs". A copy of the > test summary that you see > above is saved in the file > ".../contrib/test_decoding/output_iso/regression.out". > > The diff file contains: > --- .../contrib/test_decoding/expected/delayed_startup.out 2018-02-08 > 16:57:42.816744192 +0300 > +++ .../contrib/test_decoding/output_iso/results/delayed_startup.out > 2019-01-16 17:46:33.138962021 +0300 > @@ -12,6 +12,8 @@ > step s2start: SELECT data FROM > pg_logical_slot_get_changes('isolation_slot', NULL, NULL, 'include-xids', > 'false'); > data > > +BEGIN > +COMMIT > step s1b: BEGIN ISOLATION LEVEL SERIALIZABLE; > step s1w: INSERT INTO do_write DEFAULT VALUES; > step s1c: COMMIT; > > ====================================================================== > ... > > > I observe this issue with the tests: mxact, delayed_startup, and > ondisk_startup. Interesting - I've not seen that on the buildfarm before, or in local tests with a lot of repetitions. I assume the problem vanishes if you add 'skip-empty-xacts', '1' to the options of pg_logical_slot_get_changes()? Greetings, Andres Freund
16.01.2019 22:09, Andres Freund wrote: > On 2019-01-16 14:56:34 +0000, PG Bug reporting form wrote: >> I observe this issue with the tests: mxact, delayed_startup, and >> ondisk_startup. > Interesting - I've not seen that on the buildfarm before, or in local tests > with a lot of repetitions. I assume the problem vanishes if you add 'skip-empty-xacts', '1' > to the options of pg_logical_slot_get_changes()? > > Greetings, > > Andres Freund Thank you, Andres. Yes, it helps. After applying the attached patch the issue is gone. Just to note that mxact.out and ondisk_startup.out expected an empty transaction. If it's not important, then the fix is ready. Best regards, Alexander
Attachment
On 16/01/2019 20:09, Andres Freund wrote: > On 2019-01-16 14:56:34 +0000, PG Bug reporting form wrote: >> The following bug has been logged on the website: >> >> Bug reference: 15594 >> Logged by: Alexander Lakhin >> Email address: exclusion@gmail.com >> PostgreSQL version: 11.1 >> Operating system: Ubuntu 18.04 >> Description: >> >> Some tests in contrib/test_decoding/output_iso/ occasionally fail. >> To reproduce it reliable you can do: >> # sed 's/ mxact / $(shell printf "mxact %.0s" `seq 1000` ) /' -i >> contrib/test_decoding/Makefile >> # make check -C contrib/test_decoding/ >> >> ... >> test delayed_startup ... ok >> test ondisk_startup ... ok >> test concurrent_ddl_dml ... ok >> test oldest_xmin ... ok >> test snapshot_transfer ... ok >> ============== shutting down postmaster ============== >> >> ========================== >> 19 of 1005 tests failed. >> ========================== > >> The differences that caused some tests to fail can be viewed in the >> file ".../contrib/test_decoding/output_iso/regression.diffs". A copy of the >> test summary that you see >> above is saved in the file >> ".../contrib/test_decoding/output_iso/regression.out". >> >> The diff file contains: >> --- .../contrib/test_decoding/expected/delayed_startup.out 2018-02-08 >> 16:57:42.816744192 +0300 >> +++ .../contrib/test_decoding/output_iso/results/delayed_startup.out >> 2019-01-16 17:46:33.138962021 +0300 >> @@ -12,6 +12,8 @@ >> step s2start: SELECT data FROM >> pg_logical_slot_get_changes('isolation_slot', NULL, NULL, 'include-xids', >> 'false'); >> data >> >> +BEGIN >> +COMMIT >> step s1b: BEGIN ISOLATION LEVEL SERIALIZABLE; >> step s1w: INSERT INTO do_write DEFAULT VALUES; >> step s1c: COMMIT; >> >> ====================================================================== >> ... >> >> >> I observe this issue with the tests: mxact, delayed_startup, and >> ondisk_startup. > > Interesting - I've not seen that on the buildfarm before, or in local tests > with a lot of repetitions. I assume the problem vanishes if you add 'skip-empty-xacts', '1' > to the options of pg_logical_slot_get_changes()? > I bet those empty transactions are from autovacuum. The probability of hitting this could be exaggerated by the fact that the test is run within the same instance 1000x rather than running the make check 1000x so we produce plenty of catalog bloat for autovacuum to clean. -- Petr Jelinek http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
17.01.2019 13:25, Petr Jelinek wrote: > On 16/01/2019 20:09, Andres Freund wrote: >> On 2019-01-16 14:56:34 +0000, PG Bug reporting form wrote: >>> The following bug has been logged on the website: >>> >>> Bug reference: 15594 >>> Logged by: Alexander Lakhin >>> Email address: exclusion@gmail.com >>> PostgreSQL version: 11.1 >>> Operating system: Ubuntu 18.04 >>> Description: >>> >>> Some tests in contrib/test_decoding/output_iso/ occasionally fail. >>> To reproduce it reliable you can do: >>> # sed 's/ mxact / $(shell printf "mxact %.0s" `seq 1000` ) /' -i >>> contrib/test_decoding/Makefile >>> # make check -C contrib/test_decoding/ >>> >>> ... >>> >>> I observe this issue with the tests: mxact, delayed_startup, and >>> ondisk_startup. >> Interesting - I've not seen that on the buildfarm before, or in local tests >> with a lot of repetitions. I assume the problem vanishes if you add 'skip-empty-xacts', '1' >> to the options of pg_logical_slot_get_changes()? >> > I bet those empty transactions are from autovacuum. The probability of > hitting this could be exaggerated by the fact that the test is run > within the same instance 1000x rather than running the make check 1000x > so we produce plenty of catalog bloat for autovacuum to clean. > Yes, it's almost impossible to hit this with `make check`, but with `make installcheck` the probability is much higher as we use an existing instance. I've reproduced it with `make check` just to make it simple. Best regards, Alexander
On 2019-01-17 13:42:40 +0300, Alexander Lakhin wrote: > 17.01.2019 13:25, Petr Jelinek wrote: > > On 16/01/2019 20:09, Andres Freund wrote: > >> On 2019-01-16 14:56:34 +0000, PG Bug reporting form wrote: > >>> The following bug has been logged on the website: > >>> > >>> Bug reference: 15594 > >>> Logged by: Alexander Lakhin > >>> Email address: exclusion@gmail.com > >>> PostgreSQL version: 11.1 > >>> Operating system: Ubuntu 18.04 > >>> Description: > >>> > >>> Some tests in contrib/test_decoding/output_iso/ occasionally fail. > >>> To reproduce it reliable you can do: > >>> # sed 's/ mxact / $(shell printf "mxact %.0s" `seq 1000` ) /' -i > >>> contrib/test_decoding/Makefile > >>> # make check -C contrib/test_decoding/ > >>> > >>> ... > >>> > >>> I observe this issue with the tests: mxact, delayed_startup, and > >>> ondisk_startup. > >> Interesting - I've not seen that on the buildfarm before, or in local tests > >> with a lot of repetitions. I assume the problem vanishes if you add 'skip-empty-xacts', '1' > >> to the options of pg_logical_slot_get_changes()? > >> > > I bet those empty transactions are from autovacuum. The probability of > > hitting this could be exaggerated by the fact that the test is run > > within the same instance 1000x rather than running the make check 1000x > > so we produce plenty of catalog bloat for autovacuum to clean. > > > Yes, it's almost impossible to hit this with `make check`, but with > `make installcheck` the probability is much higher as we use an existing > instance. > I've reproduced it with `make check` just to make it simple. I'll apply the fix for this, but please note that installcheck *intentionally* is disabled for test_decoding. Greetings, Andres Freund
17.01.2019 20:39, Andres Freund wrote: > On 2019-01-17 13:42:40 +0300, Alexander Lakhin wrote: >> 17.01.2019 13:25, Petr Jelinek wrote: >>> On 16/01/2019 20:09, Andres Freund wrote: >>>> On 2019-01-16 14:56:34 +0000, PG Bug reporting form wrote: >>>>> The following bug has been logged on the website: >>>>> >>>>> Bug reference: 15594 >>>>> Logged by: Alexander Lakhin >>>>> Email address: exclusion@gmail.com >>>>> PostgreSQL version: 11.1 >>>>> Operating system: Ubuntu 18.04 >>>>> Description: >>>>> >>>>> Some tests in contrib/test_decoding/output_iso/ occasionally fail. >>>>> To reproduce it reliable you can do: >>>>> # sed 's/ mxact / $(shell printf "mxact %.0s" `seq 1000` ) /' -i >>>>> contrib/test_decoding/Makefile >>>>> # make check -C contrib/test_decoding/ >>>>> >>>>> ... >>>>> >>>>> I observe this issue with the tests: mxact, delayed_startup, and >>>>> ondisk_startup. >>>> Interesting - I've not seen that on the buildfarm before, or in local tests >>>> with a lot of repetitions. I assume the problem vanishes if you add 'skip-empty-xacts', '1' >>>> to the options of pg_logical_slot_get_changes()? >>>> >>> I bet those empty transactions are from autovacuum. The probability of >>> hitting this could be exaggerated by the fact that the test is run >>> within the same instance 1000x rather than running the make check 1000x >>> so we produce plenty of catalog bloat for autovacuum to clean. >>> >> Yes, it's almost impossible to hit this with `make check`, but with >> `make installcheck` the probability is much higher as we use an existing >> instance. >> I've reproduced it with `make check` just to make it simple. > I'll apply the fix for this, but please note that installcheck > *intentionally* is disabled for test_decoding. Thanks, Andres! I see a comment in Makefile: # Disabled because these tests require "wal_level=logical", which # typical installcheck users do not have (e.g. buildfarm clients). So if I use for the installcheck'ed instance a config with the required wal_level, is it ok to perform `make installcheck-force -C contrib/test_decoding`? Or maybe there is some other difficulty with installcheck? Best regards, Alexander