Thread: dump/restore doesn't preserve row ordering?
I happened to notice, while experimenting with the data set used in the SPGIST-for-inet thread, that loading the supplied pg_dump script and immediately dumping it does not reproduce the row order appearing in the original dump script. I thought maybe this had something to do with the heap_multi_insert() optimization that COPY uses nowadays, but disabling that didn't change it. Further experimentation says it's been like that since 8.4; 8.3 is the last version that reproduces the source row order in this test. I can't immediately think of a reason for this. In everyday updates you could theorize about effects like autovacuum asynchonously updating the FSM, but surely the FSM should have no impact on where COPY puts stuff when loading into an empty table. I know we've had complaints in the past about dump/reload failing to preserve row order; that's the reason why pg_dump takes care to turn off synchronize_seqscans. And it seems bad from a testing standpoint too. So this bothers me. But I have no idea what's causing it. Does this ring any bells for anyone? regards, tom lane
On 2016-08-23 17:22:03 -0400, Tom Lane wrote: > I happened to notice, while experimenting with the data set used > in the SPGIST-for-inet thread, that loading the supplied pg_dump > script and immediately dumping it does not reproduce the row order > appearing in the original dump script. I thought maybe this had > something to do with the heap_multi_insert() optimization that > COPY uses nowadays, but disabling that didn't change it. Further > experimentation says it's been like that since 8.4; 8.3 is the > last version that reproduces the source row order in this test. That's http://archives.postgresql.org/message-id/CAE2gYzxv8YKEd4O%2B9HUYuQ%3DQMH4pwt9n9cmU-OchV-%3DN8Q7yXQ%40mail.gmail.com ? > I can't immediately think of a reason for this. In everyday > updates you could theorize about effects like autovacuum > asynchonously updating the FSM, but surely the FSM should have no > impact on where COPY puts stuff when loading into an empty table. It seems possible that a larger row didn't fit on a page anymore, then later when a new page was is needed for a smaller row, the earlier page is found again. Due to RelationGetBufferForTuple() updating the fsm when an old target buffer is present: /* * Update FSM as to condition of this page, and ask for another page *to try. */ targetBlock = RecordAndGetPageWithFreeSpace(relation, targetBlock, pageFreeSpace, len + saveFreeSpace); that looks like it's even possible without a concurrent autovacuum. Andres
Andres Freund <andres@anarazel.de> writes: > On 2016-08-23 17:22:03 -0400, Tom Lane wrote: >> I can't immediately think of a reason for this. In everyday >> updates you could theorize about effects like autovacuum >> asynchonously updating the FSM, but surely the FSM should have no >> impact on where COPY puts stuff when loading into an empty table. > It seems possible that a larger row didn't fit on a page anymore, then > later when a new page was is needed for a smaller row, the earlier page > is found again. Due to RelationGetBufferForTuple() updating the fsm > when an old target buffer is present: Ah. That matches the symptoms --- small groups of rows are getting relocated, seems like. And there's definitely a wide range of row lengths in this data. It's interesting that nobody has complained about this behavior. Maybe the old fogies are all gone ... regards, tom lane
On Tue, Aug 23, 2016 at 8:43 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote: > It's interesting that nobody has complained about this behavior. We have been known to emphasize that unless you have an ORDER BY clause at the outermost level of a query, there is no guarantee about the order of rows returned. In general, even with autovacuum off and no DML running, the same query can return rows in a different order because of synchronized scans -- so I would tend to view the lack of complaints as a sign of the maturity and awareness of the user base. Of course, it can still be an inconvenience for testing, but that's a different issue. -- Kevin Grittner EDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company