Copy/delete issue - Mailing list pgsql-general
From | Herouth Maoz |
---|---|
Subject | Copy/delete issue |
Date | |
Msg-id | 4948BA90.5010300@unicell.co.il Whole thread Raw |
Responses |
Re: Copy/delete issue
|
List | pgsql-general |
I have a strange situation that occurs every now and again.
We have a reports system that gathers all the data from our various production systems during the night, where we can run heavy reports on it without loading the production databases.
I have two shell scripts that do this nightly transfer of data. The production database is Sybase. So I have a shell script that scans a list of tables and databases and dumps them into a format suitable for postgres COPY. After it dumps everything, another shell script scans the same list, and loads each dump file into the proper table.
The shell script first runs psql with a DELETE command. For transaction tables (ones where data accumulates by date) the records for two days are deleted, and for non-transaction tables (ones that have records that might change but don't accumulate based on time) it's DELETE without WHERE.
I run psql with ON_ERROR_STOP and check the exit status. If the DELETE failed, I should get an error status, so I do not proceed to the copy.
Then I run psql again, with ON_ERROR_STOP, and run a \copy command that loads the data to the same table.
For some reason, once in a while, that fails. Always on the same table - violating the unique constraint of the primary key. Now, this is impossible because there was a successful delete beforehand, as I said, and the data comes from a database where that same primary key is enforced. Moreover, when I re-run the script, everything runs fine.
This happens at least once a week - always with the same table.
Can anybody think of a reason why psql will not report an error on deletion? Or why it would tell me that a constraint has been violated when loading the same data 5 minutes later works fine?
Thanks,
Herouth
Here is the relevant shell code (the relevant table has "*' in the file for datefield):
# The names of the tables are stored in a text file
exec 4<$TABLES_FILE
dstamp N "Starting postgres load" >> $LOAD_LOG
while read -u 4 ignored1 ignored2 local_table datefield
do
dstamp N "Now loading $local_table" >> $LOAD_LOG
filename="$DUMPDIR/$local_table.tsv"
# Stop if the dump file does not exist.
if [ ! -f "$filename" ]
then
errexit "Dump file not found for table: $local_table" 1 >> $LOAD_LOG
fi
# If the datefield contains "*", it means the table contents are fully
# replaced, otherwise use this as the field on which to limit the deletion.
if [ "$datefield" = "*" ]
then
CMD="DELETE FROM $local_table"
else
CMD="DELETE FROM $local_table WHERE $datefield >= current_date - 2"
fi
# Run the deletion command
echo -e "\\\\set ON_ERROR_STOP\\n$CMD;" | $PSQLCMD -q -f - > $TMPFILE 2>&1
# Report errors and stop the loop if any occured
rc=$?
if [ "$rc" != "0" ]
then
# Copy the error output, properly formatted, to the log file
sed "s/^/$(date +%Y-%m-%d%t%T) E /" $TMPFILE >> $LOAD_LOG
# Send mail message about the failure
rm -f $TMPFILE
errexit "Deletion failed with status $rc on table: $local_table" $rc >> $LOAD_LOG
fi
# Now run the load command
echo -e "\\\\set ON_ERROR_STOP\\n\\\\copy $local_table from $filename" | $PSQLCMD -q -f - > $TMPFILE 2>&1
rc=$?
# Check for errors and report
if [ "$rc" != "0" ]
then
# Copy the error output, properly formatted, to the log file
sed "s/^/$(date +%Y-%m-%d%t%T) E /" $TMPFILE >> $LOAD_LOG
# Send mail message about the failure
rm -f $TMPFILE
errexit "Copy failed with status $rc on table: $local_table" $rc >> $LOAD_LOG
fi
# Remove the dump file, as well as the output file from the psql command
rm -f "$filename"
# Update statistics with the ANALYZE command
dstamp N "Updating statistics for $local_table" >> $LOAD_LOG
echo -e "\\\\set ON_ERROR_STOP\\nANALYZE $local_table;" | $PSQLCMD -q -f - > $TMPFILE 2>&1
# Report errors and stop the loop if any occured
rc=$?
if [ "$rc" != "0" ]
then
# Copy the error output, properly formatted, to the log file
sed "s/^/$(date +%Y-%m-%d%t%T) E /" $TMPFILE >> $LOAD_LOG
# Send mail message about the failure
rm -f $TMPFILE
errexit "ANALYZE failed with status $rc on table: $local_table" $rc >> $LOAD_LOG
fi
done
We have a reports system that gathers all the data from our various production systems during the night, where we can run heavy reports on it without loading the production databases.
I have two shell scripts that do this nightly transfer of data. The production database is Sybase. So I have a shell script that scans a list of tables and databases and dumps them into a format suitable for postgres COPY. After it dumps everything, another shell script scans the same list, and loads each dump file into the proper table.
The shell script first runs psql with a DELETE command. For transaction tables (ones where data accumulates by date) the records for two days are deleted, and for non-transaction tables (ones that have records that might change but don't accumulate based on time) it's DELETE without WHERE.
I run psql with ON_ERROR_STOP and check the exit status. If the DELETE failed, I should get an error status, so I do not proceed to the copy.
Then I run psql again, with ON_ERROR_STOP, and run a \copy command that loads the data to the same table.
For some reason, once in a while, that fails. Always on the same table - violating the unique constraint of the primary key. Now, this is impossible because there was a successful delete beforehand, as I said, and the data comes from a database where that same primary key is enforced. Moreover, when I re-run the script, everything runs fine.
This happens at least once a week - always with the same table.
Can anybody think of a reason why psql will not report an error on deletion? Or why it would tell me that a constraint has been violated when loading the same data 5 minutes later works fine?
Thanks,
Herouth
Here is the relevant shell code (the relevant table has "*' in the file for datefield):
# The names of the tables are stored in a text file
exec 4<$TABLES_FILE
dstamp N "Starting postgres load" >> $LOAD_LOG
while read -u 4 ignored1 ignored2 local_table datefield
do
dstamp N "Now loading $local_table" >> $LOAD_LOG
filename="$DUMPDIR/$local_table.tsv"
# Stop if the dump file does not exist.
if [ ! -f "$filename" ]
then
errexit "Dump file not found for table: $local_table" 1 >> $LOAD_LOG
fi
# If the datefield contains "*", it means the table contents are fully
# replaced, otherwise use this as the field on which to limit the deletion.
if [ "$datefield" = "*" ]
then
CMD="DELETE FROM $local_table"
else
CMD="DELETE FROM $local_table WHERE $datefield >= current_date - 2"
fi
# Run the deletion command
echo -e "\\\\set ON_ERROR_STOP\\n$CMD;" | $PSQLCMD -q -f - > $TMPFILE 2>&1
# Report errors and stop the loop if any occured
rc=$?
if [ "$rc" != "0" ]
then
# Copy the error output, properly formatted, to the log file
sed "s/^/$(date +%Y-%m-%d%t%T) E /" $TMPFILE >> $LOAD_LOG
# Send mail message about the failure
rm -f $TMPFILE
errexit "Deletion failed with status $rc on table: $local_table" $rc >> $LOAD_LOG
fi
# Now run the load command
echo -e "\\\\set ON_ERROR_STOP\\n\\\\copy $local_table from $filename" | $PSQLCMD -q -f - > $TMPFILE 2>&1
rc=$?
# Check for errors and report
if [ "$rc" != "0" ]
then
# Copy the error output, properly formatted, to the log file
sed "s/^/$(date +%Y-%m-%d%t%T) E /" $TMPFILE >> $LOAD_LOG
# Send mail message about the failure
rm -f $TMPFILE
errexit "Copy failed with status $rc on table: $local_table" $rc >> $LOAD_LOG
fi
# Remove the dump file, as well as the output file from the psql command
rm -f "$filename"
# Update statistics with the ANALYZE command
dstamp N "Updating statistics for $local_table" >> $LOAD_LOG
echo -e "\\\\set ON_ERROR_STOP\\nANALYZE $local_table;" | $PSQLCMD -q -f - > $TMPFILE 2>&1
# Report errors and stop the loop if any occured
rc=$?
if [ "$rc" != "0" ]
then
# Copy the error output, properly formatted, to the log file
sed "s/^/$(date +%Y-%m-%d%t%T) E /" $TMPFILE >> $LOAD_LOG
# Send mail message about the failure
rm -f $TMPFILE
errexit "ANALYZE failed with status $rc on table: $local_table" $rc >> $LOAD_LOG
fi
done
pgsql-general by date: