Re: VM corruption on standby - Mailing list pgsql-hackers

From Aleksander Alekseev
Subject Re: VM corruption on standby
Date
Msg-id CAJ7c6TOtYagmAm+f4B3JEWoahG3bocoBNe1Gvdrjejo5MMMC1g@mail.gmail.com
In response to Re: VM corruption on standby  (Aleksander Alekseev <aleksander@tigerdata.com>)
List pgsql-hackers
Hi,

> If my understanding is correct, we should make a WAL record with the
> XLH_LOCK_ALL_FROZEN_CLEARED flag *before* we modify the VM but within
> the same critical section [...]
>
> A draft patch is attached. It makes the test pass and doesn't seem to
> break any other tests.
>
> Thoughts?

So that we don't forget: assuming I'm right about the cause of the
issue, we might want to recheck the order of the visibilitymap_* and
XLog* calls in the following functions too:

- heap_multi_insert
- heap_delete
- heap_update
- heap_lock_tuple
- heap_lock_updated_tuple_rec

From a quick look, all of the functions named above modify the VM
before writing the corresponding WAL record. This can cause a similar
issue:

1. The VM is modified.
2. The VM page is evicted asynchronously, before the WAL record is written.
3. kill -9
4. The VM ends up in a different state on the primary and on the standby.
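
To make the sequence above concrete, here is a minimal toy model in C. It is
not PostgreSQL code: ToyNode, ToyOrder and toy_states_match_after_crash are
hypothetical names invented for this sketch, and it assumes that a WAL record,
once written, survives the crash and reaches the standby. It only shows why
the order of the two steps matters when an eviction and a crash land between
them.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model only: one all-frozen VM bit and a one-record WAL. */
typedef struct
{
    bool vm_in_memory;   /* bit as seen in the buffer cache */
    bool vm_on_disk;     /* bit as persisted on disk */
    bool wal_has_clear;  /* "bit cleared" WAL record was written */
} ToyNode;

typedef enum { VM_THEN_WAL, WAL_THEN_VM } ToyOrder;

/*
 * Simulate clearing the bit on the primary, with an eviction and a crash
 * between the first and second step. Returns true if the primary (after
 * crash recovery) and the standby (after replay) agree on the bit.
 */
bool
toy_states_match_after_crash(ToyOrder order)
{
    ToyNode primary = { true, true, false };
    bool    standby_bit = true;

    /* Both nodes start with the same VM state. */
    assert(primary.vm_on_disk == standby_bit);

    /* First step of the operation; the second step never runs. */
    if (order == VM_THEN_WAL)
        primary.vm_in_memory = false;   /* VM modified, nothing logged yet */
    else
        primary.wal_has_clear = true;   /* logged, VM not modified yet */

    /* The VM page is written out asynchronously. */
    primary.vm_on_disk = primary.vm_in_memory;

    /* kill -9 happens here: the in-memory state is lost. */

    /* Crash recovery on the primary and replay on the standby. */
    if (primary.wal_has_clear)
    {
        primary.vm_on_disk = false;
        standby_bit = false;
    }

    return primary.vm_on_disk == standby_bit;
}
```

In this model, toy_states_match_after_crash(VM_THEN_WAL) returns false (the
primary has the bit cleared on disk while the standby never hears about it),
whereas toy_states_match_after_crash(WAL_THEN_VM) returns true.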


