Thread: BUG #14208: Inconsistent code modification - 3
VGhlIGZvbGxvd2luZyBidWcgaGFzIGJlZW4gbG9nZ2VkIG9uIHRoZSB3ZWJz aXRlOgoKQnVnIHJlZmVyZW5jZTogICAgICAxNDIwOApMb2dnZWQgYnk6ICAg ICAgICAgIFBldHJ1LUZsb3JpbiBNaWhhbmNlYQpFbWFpbCBhZGRyZXNzOiAg ICAgIHBldHJ1bUBnbWFpbC5jb20KUG9zdGdyZVNRTCB2ZXJzaW9uOiA5LjQu NApPcGVyYXRpbmcgc3lzdGVtOiAgIE1hY09TWApEZXNjcmlwdGlvbjogICAg ICAgIAoKRmlsZTogcG9zdGdyZXNxbC05LjQuNC9zcmMvYmFja2VuZC9yZXBs aWNhdGlvbi9sb2dpY2FsL3Jlb3JkZXJidWZmZXIuYw0KRnVuY3Rpb246IFJl b3JkZXJCdWZmZXJJbnRlclRYTkluaXQNCkxpbmU6IDg3MA0KDQpUaGUgbGlu ZSBpcw0KDQppZiAodHhuLT5uZW50cmllcyAhPSB0eG4tPm5lbnRyaWVzX21l bSkNCg0KQnV0IHNob3VsZG4ndCBiZSB0aGVyZSBjdXJfdHhuIGluc3RlYWQg b2YgdHhuPw0KDQpJIGRvIG5vdCBrbm93IGV4YWN0bHkgdGhlIHNlbWFudGlj cyBvZiB0aGUgY29kZSBiZWNhdXNlIEkgZGV0ZWN0ZWQgdGhlCnByb2JsZW0g d2l0aCBhIENvZGVTb25hciBwcm90b3R5cGUgcGx1Z2luLiBIb3dldmVyLCBs ZXQgbWUgZXhwbGFpbiB3aHkgSQp0aGluayBpdCBpcyBhIHByb2JsZW06DQoN ClRoZSBsaW5lIDg3MCBpcyBwYXJ0IG9mIGEgcHJvY2Vzc2luZyBzdGVwIChs aW5lcyA4NjYtODgzKSBwZXJmb3JtZWQgZm9yIGVhY2gKc3ViLXRyYW5zYWN0 aW9uIG9mIGEgdHJhbnNhY3Rpb24uIEp1c3QgYmVmb3JlIHRoaXMgY29kZSBm cmFnbWVudCwgdGhlIHNhbWUKcHJvY2Vzc2luZyBzdGVwIGlzIHBlcmZvcm1l ZCBmb3IgdGhlIHRvcGxldmVsIHRyYW5zYWN0aW9uIChsaW5lcyA4NDEtODU3 KS4KVGhlIGNoZWNrIGF0IDg0NSBpcyBleGFjdGx5IGFzIHRoZSBvbmUgaW4g ODcwLiANCg0KSG93ZXZlciwgdGhlIGJ1ZmZlciBvZiBhIHN1Yi10cmFuc2Fj dGlvbiBpcyBjdXJfdHhuIChhbmQgbm90IHR4bikgYW5kIHRodXMsCkkgdGhp bmsgY3VyX3R4biBzaG91bGQgYmUgdXNlZCBpbiA4NzAuIE1vcmVvdmVyLCBp biB0aGUgZG9jIGNvbW1lbnQgb2YKbmVudHJpZXMgZmllbGQgaXQgaXMgc3Bl Y2lmaWVkIHRoYXQgIkNoYW5nZXMgaW4gc3VidHJhbnNhY3Rpb25zIGFyZSAq bm90KgppbmNsdWRlZCBidXQgdHJhY2tlZCBzZXBhcmF0ZWx5Ii4gVGh1cyBh Z2FpbiwgaXQgbG9va3MgdGhhdCB0aGUgbmVudHJpZXMKZmllbGQgZm9yIGEg c3ViLXRyYW5zYWN0aW9uIHNob3VsZCBiZSB1c2VkIGluIGxpbmUgODcwIGFu ZCBub3QgdGhlIG9uZSBvZgp0aGUgdG9wbGV2ZWwgdHJhbnNhY3Rpb24uCgo=
petrum@gmail.com writes: > File: postgresql-9.4.4/src/backend/replication/logical/reorderbuffer.c > Function: ReorderBufferInterTXNInit > Line: 870 > The line is > if (txn->nentries != txn->nentries_mem) > But shouldn't be there cur_txn instead of txn? Actually, the function is ReorderBufferIterTXNInit, and in HEAD this is line 963, but yeah that looks pretty broken. Andres, do you concur? Or maybe the logic needs to be different for subtransactions? > I do not know exactly the semantics of the code because I detected the > problem with a CodeSonar prototype plugin. Seems like a cool tool. regards, tom lane
On 2016-06-22 11:45:16 -0400, Tom Lane wrote: > petrum@gmail.com writes: > > File: postgresql-9.4.4/src/backend/replication/logical/reorderbuffer.c > > Function: ReorderBufferInterTXNInit > > Line: 870 > > > The line is > > if (txn->nentries != txn->nentries_mem) > > But shouldn't be there cur_txn instead of txn? > > Actually, the function is ReorderBufferIterTXNInit, and in HEAD this > is line 963, but yeah that looks pretty broken. Andres, do you > concur? Ugh, yes, that looks broken. In a way that can very likely lead to wrong data being returned :(. I assume an empty toplevel transaction + subtransactions with spilled-to-disk contents will be bad. > Or maybe the logic needs to be different for subtransactions? > > > I do not know exactly the semantics of the code because I detected the > > problem with a CodeSonar prototype plugin. > > Seems like a cool tool. Indeed. What heuristic lead to detecting this? I can think of some, but they all owuld have significant false-positive rates.
> On 24 Jun 2016, at 01:53, Andres Freund <andres@anarazel.de> wrote: >=20 > On 2016-06-22 11:45:16 -0400, Tom Lane wrote: >> petrum@gmail.com writes: >>> File: = postgresql-9.4.4/src/backend/replication/logical/reorderbuffer.c >>> Function: ReorderBufferInterTXNInit >>> Line: 870 >>=20 >>> The line is >>> if (txn->nentries !=3D txn->nentries_mem) >>> But shouldn't be there cur_txn instead of txn? >>=20 >> Actually, the function is ReorderBufferIterTXNInit, and in HEAD this >> is line 963, but yeah that looks pretty broken. Andres, do you >> concur? >=20 > Ugh, yes, that looks broken. In a way that can very likely lead to = wrong > data being returned :(. I assume an empty toplevel transaction + > subtransactions with spilled-to-disk contents will be bad. >=20 >=20 >> Or maybe the logic needs to be different for subtransactions? >>=20 >>> I do not know exactly the semantics of the code because I detected = the >>> problem with a CodeSonar prototype plugin. >>=20 >> Seems like a cool tool. >=20 > Indeed. What heuristic lead to detecting this? I can think of some, = but > they all owuld have significant false-positive rates. Thank you :). I cannot disclose this. The tool I develop is a CodeSonar = (www.grammatech.com) plugin. It is work in progress still under refinement.
Andres Freund <andres@anarazel.de> writes: > On 2016-06-22 11:45:16 -0400, Tom Lane wrote: >> Actually, the function is ReorderBufferIterTXNInit, and in HEAD this >> is line 963, but yeah that looks pretty broken. Andres, do you >> concur? > Ugh, yes, that looks broken. In a way that can very likely lead to wrong > data being returned :(. I assume an empty toplevel transaction + > subtransactions with spilled-to-disk contents will be bad. Actually, doesn't this mean spilled subtransactions will *always* be lost? Whether or not the toplevel transaction is empty, by the time we get here it would have nentries == nentries_mem, no? Anyway, fix pushed. I did not try to devise a regression test case. regards, tom lane
On 2016-06-30 12:40:19 -0400, Tom Lane wrote: > Andres Freund <andres@anarazel.de> writes: > > On 2016-06-22 11:45:16 -0400, Tom Lane wrote: > >> Actually, the function is ReorderBufferIterTXNInit, and in HEAD this > >> is line 963, but yeah that looks pretty broken. Andres, do you > >> concur? > > > Ugh, yes, that looks broken. In a way that can very likely lead to wrong > > data being returned :(. I assume an empty toplevel transaction + > > subtransactions with spilled-to-disk contents will be bad. > > Actually, doesn't this mean spilled subtransactions will *always* be lost? > Whether or not the toplevel transaction is empty, by the time we get here > it would have nentries == nentries_mem, no? Not, if the top-level transaction spilled to disk. > Anyway, fix pushed. Thanks!. > I did not try to devise a regression test case. I'll try to add some. Andres
Andres Freund <andres@anarazel.de> writes: > On 2016-06-30 12:40:19 -0400, Tom Lane wrote: >> Whether or not the toplevel transaction is empty, by the time we get here >> it would have nentries == nentries_mem, no? > Not, if the top-level transaction spilled to disk. But doesn't the code stanza just above this loop pull that spillage back in? It's certainly doing *something* to txn->nentries_mem. regards, tom lane
On 2016-06-30 12:51:51 -0400, Tom Lane wrote: > Andres Freund <andres@anarazel.de> writes: > > On 2016-06-30 12:40:19 -0400, Tom Lane wrote: > >> Whether or not the toplevel transaction is empty, by the time we get here > >> it would have nentries == nentries_mem, no? > > > Not, if the top-level transaction spilled to disk. > > But doesn't the code stanza just above this loop pull that spillage > back in? Do you mean the following? /* add toplevel transaction if it contains changes */ if (txn->nentries > 0) { ReorderBufferChange *cur_change; if (txn->nentries != txn->nentries_mem) ReorderBufferRestoreChanges(rb, txn, &state->entries[off].fd, &state->entries[off].segno); If so, sure, it pulls changes back in, but only the first static const Size max_changes_in_memory = 4096; ones. We should never reconstruct a whole large transaction in memory... - Andres
Andres Freund <andres@anarazel.de> writes: > On 2016-06-30 12:51:51 -0400, Tom Lane wrote: >> But doesn't the code stanza just above this loop pull that spillage >> back in? > If so, sure, it pulls changes back in, but only the first > static const Size max_changes_in_memory = 4096; > ones. We should never reconstruct a whole large transaction in memory... OK, so the failure case is not "empty top level transaction", but "top level transaction small enough to not have spilled", plus a spilled subtransaction, correct? regards, tom lane
On June 30, 2016 9:59:07 AM PDT, Tom Lane <tgl@sss.pgh.pa.us> wrote: >Andres Freund <andres@anarazel.de> writes: >> On 2016-06-30 12:51:51 -0400, Tom Lane wrote: >>> But doesn't the code stanza just above this loop pull that spillage >>> back in? > >> If so, sure, it pulls changes back in, but only the first >> static const Size max_changes_in_memory = 4096; >> ones. We should never reconstruct a whole large transaction in >memory... > >OK, so the failure case is not "empty top level transaction", but >"top level transaction small enough to not have spilled", plus a >spilled subtransaction, correct? That's how it looks from afar, without having investigated in depth. -- Sent from my Android device with K-9 Mail. Please excuse my brevity.