Thread: Old Postgresql version on i7-1165g7
Good day, hackers.

I've got an HP ProBook 640g8 with an i7-1165g7. I installed Ubuntu 20.04 LTS on it and started to play with the PostgreSQL sources.

Occasionally I found that I'm not able to `make check` old PostgreSQL versions, at least 9.6 and 10. They fail at the initdb stage, in the call to postgres.

Unmodified PostgreSQL 9.6.8 and 10.0 fail in the bootstrap stage:

    running bootstrap script ... 2021-04-09 12:33:26.424 MSK [161121] FATAL: could not find tuple for opclass 1
    2021-04-09 12:33:26.424 MSK [161121] PANIC: cannot abort transaction 1, it was already committed
    Aborted (core dumped)
    child process exited with exit code 134

Our modified custom version of 9.6 fails with a segmentation fault inside libc's __strncmp_avx2 during post-bootstrap:

    Program terminated with signal SIGSEGV, Segmentation fault.
    #0 __strncmp_avx2 ()
    #1 0x0000557168a7eeda in nameeq
    #2 0x0000557168b4c4a0 in FunctionCall2Coll
    #3 0x0000557168659555 in heapgettup_pagemode
    #4 0x000055716865a617 in heap_getnext
    #5 0x0000557168678cf1 in systable_getnext
    #6 0x0000557168b5651c in GetDatabaseTuple
    #7 0x0000557168b574a4 in InitPostgres
    #8 0x00005571689dcb7d in PostgresMain
    #9 0x00005571688844d5 in main

I've bisected between REL_11_0 and "Rename pg_rewind's copy_file_range()" and found 372728b0d49552641f0ea83d9d2e08817de038fa:

> Replace our traditional initial-catalog-data format with a better
> design.

https://git.postgresql.org/gitweb/?p=postgresql.git;a=commit;h=372728b0d49552641f0ea83d9d2e08817de038fa

This is the first commit where `make check` doesn't fail during initdb on my machine. Therefore 02f3e558f21c0fbec9f94d5de9ad34f321eb0e57 is the last one where `make check` fails.

I've tried gcc9, gcc10 and clang10. I've configured either without parameters or with `CFLAGS=-O0 ./configure --enable-debug`.

The problem doesn't happen on 10th-generation Intel CPUs (i7-10510U and i9-10900K). Unfortunately, I have no friends or colleagues with an 11th-generation Intel CPU, so I can't tell whether this is a bug of the 11th generation or of the particular CPU installed in this notebook.

It would be great if someone with an i7-11* could try `make check` and report whether it fails too.

With regards,
Yura Sokolov
PostgresPro
Yura Sokolov wrote on 2021-04-09 16:28:
> The problem doesn't happen on 10th-generation Intel CPUs (i7-10510U
> and i9-10900K).
> Unfortunately, I have no friends or colleagues with an 11th-generation
> Intel CPU, so I can't tell whether this is a bug of the 11th
> generation or of the particular CPU installed in this notebook.
>
> It would be great if someone with an i7-11* could try `make check`
> and report whether it fails too.

BTW, the problem remains in Debian stable (10.4) inside Docker on the same machine.

> With regards,
> Yura Sokolov
> PostgresPro
On Fri, Apr 09, 2021 at 04:28:25PM +0300, Yura Sokolov wrote:
> Good day, hackers.
>
> I've got an HP ProBook 640g8 with an i7-1165g7. I installed Ubuntu
> 20.04 LTS on it and started to play with the PostgreSQL sources.
>
> Occasionally I found that I'm not able to `make check` old PostgreSQL
> versions.

Do you mean that HEAD works consistently, but v9.6 and v10 sometimes work and sometimes fail?

> #5 0x0000557168678cf1 in systable_getnext
> #6 0x0000557168b5651c in GetDatabaseTuple
> #7 0x0000557168b574a4 in InitPostgres
> #8 0x00005571689dcb7d in PostgresMain
> #9 0x00005571688844d5 in main
>
> I've bisected between REL_11_0 and "Rename pg_rewind's
> copy_file_range()" and found 372728b0d49552641f0ea83d9d2e08817de038fa
> > Replace our traditional initial-catalog-data format with a better
> > design.
>
> https://git.postgresql.org/gitweb/?p=postgresql.git;a=commit;h=372728b0d49552641f0ea83d9d2e08817de038fa
>
> This is the first commit where `make check` doesn't fail during initdb
> on my machine. Therefore 02f3e558f21c0fbec9f94d5de9ad34f321eb0e57 is
> the last one where `make check` fails.

This doesn't make much sense or help much, since 372728b doesn't actually change the catalogs, or any .c file.

> I've tried gcc9, gcc10 and clang10.
> I've configured either without parameters or with
> `CFLAGS=-O0 ./configure --enable-debug`.

You used make clean too, right?

I would also use --enable-cassert, since it might catch problems you'd otherwise miss. If that doesn't expose anything, maybe try to #define USE_VALGRIND in src/include/pg_config_manual.h, and run with valgrind --trace-children=yes.

--
Justin
Justin Pryzby <pryzby@telsasoft.com> writes:
> On Fri, Apr 09, 2021 at 04:28:25PM +0300, Yura Sokolov wrote:
>> Occasionally I found that I'm not able to `make check` old PostgreSQL
>> versions.

>> I've bisected between REL_11_0 and "Rename pg_rewind's
>> copy_file_range()" and found 372728b0d49552641f0ea83d9d2e08817de038fa
>>> Replace our traditional initial-catalog-data format with a better
>>> design.
>> This is the first commit where `make check` doesn't fail during
>> initdb on my machine.

> This doesn't make much sense or help much, since 372728b doesn't
> actually change the catalogs, or any .c file.

It could make sense if some part of the toolchain that was previously used to generate postgres.bki doesn't work right on that machine. Overall though I'd have thought that 372728b would increase not decrease our toolchain footprint. It also seems unlikely that a recent Ubuntu release would contain toolchain bugs that we hadn't already heard about.

> You used make clean too, right?

Really, when bisecting, you need to use "make distclean" or even "git clean -dfx" between steps, or you may get bogus results, because our makefiles aren't that great about tracking dependencies, especially when you move backwards in the history.

So perhaps a more plausible theory is that this bisection result is wrong because you weren't careful enough.

regards, tom lane
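[Editor's sketch] The advice above about cleaning between bisection steps can be sketched as a small shell function. This is only an illustration under stated assumptions: the function name `bisect_step` and its two path arguments are hypothetical, the build is assumed to be out-of-tree (as described later in the thread), and the configure flags are just the ones mentioned in this discussion. Status 125 is the value `git bisect run` interprets as "skip this commit".

```shell
#!/bin/sh
# Hypothetical helper: one bisection step with a thorough cleanup.
# $1 = source tree (a git checkout), $2 = separate build directory.
# Keep the script that calls this OUTSIDE the source tree, because
# "git clean -dfx" would delete it along with everything else untracked.
bisect_step() {
    src=$1
    build=$2
    [ -n "$src" ] && [ -n "$build" ] || return 125

    # Remove untracked AND ignored files (e.g. stale generated headers).
    git -C "$src" clean -dfxq || return 125

    # Start from an empty build directory.
    rm -rf "$build" || return 125
    mkdir -p "$build" || return 125
    cd "$build" || return 125

    # A commit that cannot even be configured is skipped (125), not judged.
    "$src/configure" --enable-debug --enable-cassert >/dev/null || return 125

    # The status of "make check" is the verdict for this commit.
    make -s check
}
```

`git bisect run` treats status 0 as good/old and 1 to 127 (except 125) as bad/new. In this thread the older commits are the failing ones, so a real run would need the status inverted, ideally together with custom terms such as `git bisect start --term-old=broken --term-new=fixed`; the sketch leaves that out.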
Tom Lane wrote on 2021-04-13 17:45:
> Justin Pryzby <pryzby@telsasoft.com> writes:
>> You used make clean too, right?
>
> Really, when bisecting, you need to use "make distclean" or even
> "git clean -dfx" between steps, or you may get bogus results,
> because our makefiles aren't that great about tracking dependencies,
> especially when you move backwards in the history.
>
> So perhaps a more plausible theory is that this bisection result
> is wrong because you weren't careful enough.
>
> regards, tom lane

Sorry for missing this mail for a week.

I believe I cleaned before each step, since I'm building in an external directory and cleanup is just `rm * -r`. But I'll repeat the bisection tomorrow to be sure.

I don't think it is really a PostgreSQL or toolchain bug. I believe it is some corner case that was changed in the new Intel CPU.

With regards,
Yura Sokolov.
Yura Sokolov wrote on 2021-04-18 23:29:
> Tom Lane wrote on 2021-04-13 17:45:
>> Really, when bisecting, you need to use "make distclean" or even
>> "git clean -dfx" between steps, or you may get bogus results,
>> because our makefiles aren't that great about tracking dependencies,
>> especially when you move backwards in the history.

Yep, "git clean -dfx" did the job. "make distclean" didn't, btw.

I had a "src/backend/catalog/schemapg.h" file in the source tree, generated by "make submake-generated-headers" on REL_13_0. It was not shown by "git status", so I didn't notice its existence. It was deleted neither by "make distclean" nor by the "git clean -dx" I tried before. Only "git clean -dfx" deletes it.

Thank you for the suggestion, Tom. You've saved my sanity.

Regards,
Yura Sokolov.
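[Editor's sketch] The behaviour described in the last message can be reproduced in a throwaway repository. The demo below is an assumption-laden illustration, not the actual PostgreSQL tree: the `.gitignore` entry and the file contents are made up to mimic a stale generated header. It relies on two documented git behaviours: plain `git status` does not list ignored files, and `git clean` refuses to delete anything without `-f` while `clean.requireForce` is true, which is its default (pinned explicitly here so a local configuration cannot change the result). That default is a likely reason why "git clean -dx" left the file in place.

```shell
#!/bin/sh
# Throwaway repository; file names only mimic the situation in the thread.
demo=$(mktemp -d)
cd "$demo" || exit 1
git init -q .
echo 'schemapg.h' > .gitignore
mkdir -p src/backend/catalog
echo '/* stale generated header */' > src/backend/catalog/schemapg.h

# Plain status does not mention the ignored header ...
git status --short
# ... but --ignored lists ignored paths, prefixed with "!!".
git status --short --ignored

# Dry run: -n lists what "-dfx" would delete without deleting anything.
git clean -ndx

# Without -f, git refuses to delete, so the file survives.
git -c clean.requireForce=true clean -dx 2>/dev/null || echo "refused without -f"
survived=no
[ -e src/backend/catalog/schemapg.h ] && survived=yes

# With -f, untracked and ignored files are removed.
git clean -dfxq
removed=no
[ -e src/backend/catalog/schemapg.h ] || removed=yes

echo "survived=$survived removed=$removed"
```

Running `git status --ignored` or `git clean -ndx` before a bisection step is a cheap way to spot leftovers like this one.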