Thread: FATAL: could not receive data from WAL stream
Hi guys,
I have a slave server running Postgres 9.2 with streaming replication and WAL archiving on an Amazon EC2 instance.
The Postgres logs are showing me this error:
restored log file "000000020000179A000000F8" from archive
invalid record length at 179A/F8FFF3D0
WAL segment `/var/lib/pgsql/9.2/archive/00000003.history` not found
streaming replication successfully connected to primary
FATAL: could not receive data from WAL stream: FATAL: requested WAL segment 000000020000179A000000F8 has already been removed
However, the file 000000020000179A000000F8 is inside the /var/lib/pgsql/9.2/archive directory:
postgres@devops:/var/lib/pgsql/9.2/archive$ ls -la | grep 000000020000179A000000F8
-rw------- 1 postgres postgres 16777216 Sep 16 05:16 000000020000179A000000F8
It's an Ubuntu instance, so my recovery.conf is:
/etc/postgresql/9.2/main/recovery.conf:
restore_command = 'exec /var/lib/pgsql/bin/restore_wal_segment.bash "/var/lib/pgsql/9.2/wal_archive/%f" "%p"'
archive_cleanup_command = '/var/lib/postgresql/bin/pg_archivecleaup_mv.bash'
recovery_target_timeline = 'latest'
standby_mode = on
primary_conninfo = 'host=IP_MY_SLAVE port=5432 user=replicator application_name=devops'
What could be happening, given that the file is there?
Thanks
Patrick
On Tue, Sep 20, 2016 at 12:38 PM, Patrick B <patrickbakerbr@gmail.com> wrote:
> However, 000000020000179A000000F8 file is inside /var/lib/pgsql/9.2/archive directory:

Do you mean to say that the WAL file "000000020000179A000000F8" is available @ "/var/lib/pgsql/9.2/archive" location?
Regards,
Venkata B N
Fujitsu Australia
2016-09-20 15:14 GMT+12:00 Venkata B Nagothi <nag1010@gmail.com>:
> Do you mean to say that the WAL file "000000020000179A000000F8" is
> available @ "/var/lib/pgsql/9.2/archive" location?
Yes.....
2016-09-20 16:29 GMT+12:00 Lucas Possamai <drum.lucas@gmail.com>:
> Yes.....
Oops... sorry... sent to the wrong email
2016-09-20 15:14 GMT+12:00 Venkata B Nagothi <nag1010@gmail.com>:
> Do you mean to say that the WAL file "000000020000179A000000F8" is
> available @ "/var/lib/pgsql/9.2/archive" location?
Yes!
On Tue, Sep 20, 2016 at 1:30 PM, Patrick B <patrickbakerbr@gmail.com> wrote:
> 2016-09-20 15:14 GMT+12:00 Venkata B Nagothi <nag1010@gmail.com>:
>> Do you mean to say that the WAL file "000000020000179A000000F8" is
>> available @ "/var/lib/pgsql/9.2/archive" location ?
>
> Yes!

Timeline 2 has visibly reached its end at segment
000000020000179A000000F8 and it cannot find in the archive the history
file to see from which timeline it needs to fetch afterwards. As the
timeline file cannot be found, it then attempts to fetch the segment
that it thinks is complete from the master itself.

Didn't you trigger a promotion which would make the master reach the
timeline 3? And are you sure that 00000003.history is not in the
archives?
--
Michael
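To make the timeline reasoning in the message above concrete, here is a minimal Python sketch (not part of the original thread) that decodes a WAL segment file name and maps an LSN to the segment containing it. It assumes the default 16 MiB segment size, which matches the 16777216-byte file in the original post. The function names are made up for illustration and are not a PostgreSQL API.

```python
import re

# Assumption: default WAL segment size of 16 MiB (consistent with the
# 16777216-byte archive file shown in the original post).
WAL_SEGMENT_SIZE = 16 * 1024 * 1024

_SEGMENT_RE = re.compile(r"^[0-9A-F]{24}$")


def parse_wal_segment(name):
    """Split a 24-hex-digit WAL file name into (timeline, log, segment)."""
    if not _SEGMENT_RE.match(name):
        raise ValueError("not a WAL segment file name: %r" % name)
    return int(name[0:8], 16), int(name[8:16], 16), int(name[16:24], 16)


def segment_for_lsn(timeline, lsn):
    """Return the WAL file name holding an LSN such as '179A/F8FFF3D0'."""
    high, low = (int(part, 16) for part in lsn.split("/"))
    return "%08X%08X%08X" % (timeline, high, low // WAL_SEGMENT_SIZE)


def history_file_name(timeline):
    """Return the timeline history file name for the given timeline ID."""
    return "%08X.history" % timeline


if __name__ == "__main__":
    tli, log, seg = parse_wal_segment("000000020000179A000000F8")
    print("timeline=%d log=%X seg=%X" % (tli, log, seg))
    # The "invalid record length" position from the log:
    print(segment_for_lsn(tli, "179A/F8FFF3D0"))
    # The history file the standby looks for when checking for a newer timeline:
    print(history_file_name(tli + 1))
```

With the values from the log, the first eight hex digits give timeline 2, the LSN 179A/F8FFF3D0 falls inside that same segment, and the next timeline's history file is the 00000003.history that the standby reported as not found.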
2016-09-20 16:46 GMT+12:00 Michael Paquier <michael.paquier@gmail.com>:
> Didn't you trigger a promotion which would make the master reach the
> timeline 3? And are you sure that 00000003.history is not in the
> archives?
The server went down, and when it came back online I got those errors.
I also found this in the logs: systemd1: Removed slice User Slice of postgres.
I believe something happened with the postgres user, and when the server came back online it started Postgres from a new path that did not include recovery.conf, so the server might have been promoted to master.
This means I'll have to rebuild the DB, right?
Patrick
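Before rebuilding, it may be worth confirming whether a promotion actually happened. The following is a minimal Python sketch (not from the thread) that looks for the usual on-disk traces. In 9.2, recovery.conf is renamed to recovery.done when recovery ends, and a server that has switched timelines keeps an NNNNNNNN.history file in pg_xlog. The `promotion_hints` helper and the data directory path are assumptions for illustration; only the archive path comes from the original post.

```python
import os
import re

HISTORY_RE = re.compile(r"^[0-9A-F]{8}\.history$")


def _history_files(directory):
    """List timeline history files in a directory (empty if unreadable)."""
    try:
        return sorted(f for f in os.listdir(directory) if HISTORY_RE.match(f))
    except OSError:
        return []


def promotion_hints(data_dir, archive_dir):
    """Collect file-system hints about whether a standby was promoted."""
    return {
        # Present while the server is still configured as a standby.
        "recovery_conf_present": os.path.exists(
            os.path.join(data_dir, "recovery.conf")),
        # Present after recovery finished and the server was promoted.
        "recovery_done_present": os.path.exists(
            os.path.join(data_dir, "recovery.done")),
        # Any history file here means this server knows a newer timeline.
        "history_in_pg_xlog": _history_files(os.path.join(data_dir, "pg_xlog")),
        "history_in_archive": _history_files(archive_dir),
    }


if __name__ == "__main__":
    # Assumption: a typical Ubuntu data directory; adjust to the real one.
    hints = promotion_hints("/var/lib/postgresql/9.2/main",
                            "/var/lib/pgsql/9.2/archive")
    for key, value in sorted(hints.items()):
        print("%s: %s" % (key, value))
```

Run against the server that was restarted, a 00000003.history in pg_xlog without a recovery.conf would be consistent with the promotion suspected above. Treat the output as a hint only, since it checks files and not the running server's state.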