Re: Non-reproducible AIO failure - Mailing list pgsql-hackers
From | Andres Freund |
---|---|
Subject | Re: Non-reproducible AIO failure |
Date | |
Msg-id | kzasuj5yb7o2fbii43pmxwdvbk5z7j2kbwwigfmdgv4vve2tkl@wvtxubexck2r Whole thread Raw |
In response to | Re: Non-reproducible AIO failure (Tom Lane <tgl@sss.pgh.pa.us>) |
Responses |
Re: Non-reproducible AIO failure
Re: Non-reproducible AIO failure |
List | pgsql-hackers |
Hi, On 2025-06-06 15:21:13 -0400, Tom Lane wrote: > Konstantin Knizhnik <knizhnik@garret.ru> writes: > > There is really essential difference in code generated by clang 15 > > (working) and 16 (not working). > > It's a mistake to think that this is a compiler bug. The C standard > explicitly allows compilers to use word-wide operations to access > bit-field struct members. Such accesses may fetch or overwrite other > bit-fields in the same word, using a non-atomic read/modify/write > sequence. I don't have a recent C standard at hand, but I found this > quote on the web [1]: > > The C Standard, 3.17, paragraph 3 [ISO/IEC 9899:2024], states > > Note 2 to entry: A bit-field and an adjacent non-bit-field member > are in separate memory locations. The same applies to two > bit-fields, if one is declared inside a nested structure > declaration and the other is not, or if the two are separated by a > zero-length bit-field declaration, or if they are separated by a > non-bit-field member declaration. It is not safe to concurrently > update two non-atomic bit-fields in the same structure if all > members declared between them are also (nonzero-length) > bit-fields, no matter what the sizes of those intervening > bit-fields happen to be. > > So it's our code that is busted. No doubt, what is happening is > that process A is fetching two fields, modifying one of them, > and storing the word back (with the observed value of the other > field) concurrently with some other process trying to update > the other field. So the other process's update is lost. There shouldn't be any concurrent accesses here, so I don't really see how the above would explain the problem (the IO can only ever be modified by one backend, initially the "owning backend", then, when submitted, by the IO worker, and then again by the backend). Alexander's latest post has a good example (due to the added logging) At the start of reclaim (looks sane): !!!pgaio_io_reclaim [91056]| ioh: 0x1066d15d0, ioh->op: 170, ioh->state: 6, ioh->result: 8192, ioh->num_callbacks: 2, ioh->generation:29202 At the end of reclaim (looks sane), note ioh->op = 0 !!!pgaio_io_reclaim [91056]| ioh: 0x1066d15d0, ioh->op: 0, ioh->generation: 29203 Just a bit later (broken): !!!AsyncReadBuffers [91056] (1)| blocknum: 73, ioh: 0x1066d15d0, ioh->op: 170, ioh->state: 1, ioh->result: 0, ioh->num_callbacks:0, ioh->generation: 29203 Note that ioh->op went back to 170, despite the code that sets ioh->op not yet having been called. However the increment of the increment of ioh->generation survived. Between the second "!!!pgaio_io_reclaim" and "!!!AsyncReadBuffers" there definitely aren't any other processes accessing things. > If we want to preserve the compactness of this struct, we have > to use primitive int types not bit-fields. I think we have to do that, if for no other reason than how bad the generated code is. But I would still like to understand what actually causes the problem here - I'm far from sure it's actually the bitfield. Greetings, Andres Freund
pgsql-hackers by date: