BUG #16300: Text line order corruption with COPY command - Mailing list pgsql-bugs
From | PG Bug reporting form |
---|---|
Subject | BUG #16300: Text line order corruption with COPY command |
Msg-id | 16300-b952db3f81f7f40d@postgresql.org |
Responses | Re: BUG #16300: Text line order corruption with COPY command |
List | pgsql-bugs |
The following bug has been logged on the website:

Bug reference:      16300
Logged by:          Hans Buschmann
Email address:      buschmann@nidsa.net
PostgreSQL version: 12.2
Operating system:   Windows Server 2019 64bit
Description:

A reproducible line order corruption occurs when copying a quite large test file into Postgres.

I was trying to import and parse a big .xml file (about 41 MB, 643407 lines) into a simple import table using the following sequence:

    create database x86db template=template0 encoding 'UTF8' lc_collate='C';
    \c x86db
    create table uops_imp2 (
    cline varchar
    ) ;
    copy uops_imp2 from 'N:/downloads/uops_info_instructions_200226.xml';

or

    copy uops_imp2 from '/usr/local/hb/uops_info_instructions_200226.xml';

This was tested on different machines, under Windows Server 2019 64bit with Postgres 12.2 and under Fedora 31 x86-64 with Postgres 12.1, respectively:

    x86db=# select version ();
                              version
    ------------------------------------------------------------
     PostgreSQL 12.2, compiled by Visual C++ build 1914, 64-bit
    (1 row)

    x86db=# select version ();
                                                    version
    --------------------------------------------------------------------------------------------------------
     PostgreSQL 12.1 on x86_64-pc-linux-gnu, compiled by gcc (GCC) 9.2.1 20190827 (Red Hat 9.2.1-1), 64-bit
    (1 row)

The order of the lines in the original file was verified with two different editors under Windows:

    notepad++ 7.8.5 x64
    notepad (built in), with the status line turned on to show line numbers

Here are lines 627365 through 627392 (the correct original):

    <doc TP="1.0"/>
    </architecture>
    </instruction>
    <instruction asm="VPMADDWD" category="AVX512" cpl="3" evex="1" extension="AVX512EVEX" iclass="VPMADDWD" iform="VPMADDWD_ZMMi32_MASKmskw_ZMMi16_MEMi16_AVX512" isa-set="AVX512BW_512" mask="0" string="VPMADDWD (ZMM, ZMM, M512)" zeroing="0">
    <operand idx="1" name="REG0" type="reg" w="1" width="512" xtype="i32">ZMM0,ZMM1,ZMM2,ZMM3,ZMM4,ZMM5,ZMM6,ZMM7,ZMM8,ZMM9,ZMM10,ZMM11,ZMM12,ZMM13,ZMM14,ZMM15,ZMM16,ZMM17,ZMM18,ZMM19,ZMM20,ZMM21,ZMM22,ZMM23,ZMM24,ZMM25,ZMM26,ZMM27,ZMM28,ZMM29,ZMM30,ZMM31</operand>
    <operand idx="2" name="REG2" r="1" type="reg" width="512" xtype="i16">ZMM0,ZMM1,ZMM2,ZMM3,ZMM4,ZMM5,ZMM6,ZMM7,ZMM8,ZMM9,ZMM10,ZMM11,ZMM12,ZMM13,ZMM14,ZMM15,ZMM16,ZMM17,ZMM18,ZMM19,ZMM20,ZMM21,ZMM22,ZMM23,ZMM24,ZMM25,ZMM26,ZMM27,ZMM28,ZMM29,ZMM30,ZMM31</operand>
    <operand idx="3" memory-prefix="zmmword ptr" name="MEM0" r="1" type="mem" width="512" xtype="i16"/>
    <architecture name="SKX">
    <IACA TP="0.50" TP_ports="0.50" fusion_occurred="1" ports="1*p05+1*p23" uops="2" version="2.3"/>
    <IACA TP="0.50" TP_ports="0.50" fusion_occurred="1" ports="1*p05+1*p23" uops="2" version="3.0"/>
    <measurement TP="0.54" TP_ports="0.50" ports="1*p05+1*p23" uops="2" uops_retire_slots="1">
    <latency cycles="5" start_op="2" target_op="1"/>
    <latency cycles_addr="13" cycles_addr_is_upper_bound="1" cycles_addr_same_reg="14" cycles_addr_same_reg_is_upper_bound="1" cycles_mem="10" cycles_mem_is_upper_bound="1" start_op="3" target_op="1"/>
    </measurement>
    </architecture>
    <architecture name="CNL">
    <measurement TP="1.00" TP_ports="1.00" ports="1*p0+1*p23" uops="2" uops_retire_slots="1">
    <latency cycles="5" start_op="2" target_op="1"/>
    <latency cycles_addr="13" cycles_addr_is_upper_bound="1" cycles_mem="10" cycles_mem_is_upper_bound="1" start_op="3" target_op="1"/>
    </measurement>
    </architecture>
    <architecture name="ICL">
    <measurement TP="1.00" TP_ports="1.00" ports="1*p0+1*p23" uops="2" uops_retire_slots="1">
    <latency cycles="5" start_op="2" target_op="1"/>
    <latency cycles_addr="13" cycles_addr_is_upper_bound="1" cycles_mem="10" cycles_mem_is_upper_bound="1" start_op="3" target_op="1"/>
    </measurement>
    <doc TP="1.0"/>
    </architecture>
    </instruction>

When querying the table with

    select * from uops_imp2 offset 627365 limit 27;

I get a different part of the original lines, with another line mangled in between (see ###):

    x86db=#
    x86db=# select * from uops_imp2 offset 627365 limit 27;
     cline
    -----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
     <latency cycles="5" start_op="4" target_op="1"/>
     </measurement>
     <doc TP="1.0"/>
     </architecture>
     </instruction>
     <instruction asm="VPMADDWD" category="AVX512" cpl="3" evex="1" extension="AVX512EVEX" iclass="VPMADDWD" iform="VPMADDWD_ZMMi32_MASKmskw_ZMMi16_MEMi16_AVX512" isa-set="AVX512BW_512" mask="0" string="VPMADDWD (ZMM, ZMM, M512)" zeroing="0">
     <operand idx="1" name="REG0" type="reg" w="1" width="512" xtype="i32">ZMM0,ZMM1,ZMM2,ZMM3,ZMM4,ZMM5,ZMM6,ZMM7,ZMM8,ZMM9,ZMM10,ZMM11,ZMM12,ZMM13,ZMM14,ZMM15,ZMM16,ZMM17,ZMM18,ZMM19,ZMM20,ZMM21,ZMM22,ZMM23,ZMM24,ZMM25,ZMM26,ZMM27,ZMM28,ZMM29,ZMM30,ZMM31</operand>
     ### <latency cycles="6" start_op="2" target_op="1"/>
     <operand idx="2" name="REG2" r="1" type="reg" width="512" xtype="i16">ZMM0,ZMM1,ZMM2,ZMM3,ZMM4,ZMM5,ZMM6,ZMM7,ZMM8,ZMM9,ZMM10,ZMM11,ZMM12,ZMM13,ZMM14,ZMM15,ZMM16,ZMM17,ZMM18,ZMM19,ZMM20,ZMM21,ZMM22,ZMM23,ZMM24,ZMM25,ZMM26,ZMM27,ZMM28,ZMM29,ZMM30,ZMM31</operand>
     <operand idx="3" memory-prefix="zmmword ptr" name="MEM0" r="1" type="mem" width="512" xtype="i16"/>
     <architecture name="SKX">
     <IACA TP="0.50" TP_ports="0.50" fusion_occurred="1" ports="1*p05+1*p23" uops="2" version="2.3"/>
     <IACA TP="0.50" TP_ports="0.50" fusion_occurred="1" ports="1*p05+1*p23" uops="2" version="3.0"/>
     <measurement TP="0.54" TP_ports="0.50" ports="1*p05+1*p23" uops="2" uops_retire_slots="1">
     <latency cycles="5" start_op="2" target_op="1"/>
     <latency cycles_addr="13" cycles_addr_is_upper_bound="1" cycles_addr_same_reg="14" cycles_addr_same_reg_is_upper_bound="1" cycles_mem="10" cycles_mem_is_upper_bound="1" start_op="3" target_op="1"/>
     </measurement>
     </architecture>
     <architecture name="CNL">
     <measurement TP="1.00" TP_ports="1.00" ports="1*p0+1*p23" uops="2" uops_retire_slots="1">
     <latency cycles="5" start_op="2" target_op="1"/>
     <latency cycles_addr="13" cycles_addr_is_upper_bound="1" cycles_mem="10" cycles_mem_is_upper_bound="1" start_op="3" target_op="1"/>
     </measurement>
     </architecture>
     <architecture name="ICL">
     <measurement TP="1.00" TP_ports="1.00" ports="1*p0+1*p23" uops="2" uops_retire_slots="1">
     <latency cycles="5" start_op="2" target_op="1"/>
    (27 rows)

In all cases I tried, the original order of the lines was not preserved, and the disorder was the same each time.

The count of all lines seems correct:

    x86db=# select count(*) from uops_imp2;
     count
    --------
     643407
    (1 row)

The same error occurred when using \copy on the psql client side.

To reproduce: the XML file can be downloaded directly from https://uops.info/xml.html by choosing the file instructions.xml.

I have not analyzed other regions of line order corruption further, because that is very difficult when you can't rely on Postgres COPY.

I fear similar problems could occur when restoring a pg_dump file, which also relies on COPY commands.

Thanks in advance

Hans Buschmann
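
A select without an ORDER BY clause does not promise any particular row order, so the query above cannot by itself show in which order COPY read the lines. One possible way to check this is to store the line number next to each line before loading. The following is only a minimal sketch under stated assumptions: the function name number_lines is hypothetical and not part of the report, and the output is assumed to be loaded with COPY in its default text format (tab as column delimiter, backslash as escape character).

```python
import io


def number_lines(src, dst):
    """Copy text lines from src to dst, prefixing each with its
    1-based line number and a tab. Returns the number of lines written.

    Hypothetical helper for illustration only.
    """
    count = 0
    for count, line in enumerate(src, start=1):
        # Drop the line terminator (LF or CRLF).
        text = line.rstrip("\r\n")
        # Escape characters that are special in COPY's text format:
        # backslash first, then tab and any remaining carriage return.
        text = (
            text.replace("\\", "\\\\")
            .replace("\t", "\\t")
            .replace("\r", "\\r")
        )
        dst.write(f"{count}\t{text}\n")
    return count


# Small in-memory demonstration instead of the real 41 MB file.
demo_in = io.StringIO("<a>\n<b/>\n</a>\n")
demo_out = io.StringIO()
number_lines(demo_in, demo_out)
```

The numbered file could then be loaded into a two-column table (for example a hypothetical uops_imp3 with columns lineno integer and cline varchar) and queried with "order by lineno", so that the result can be compared with the editor's line numbers regardless of where the rows are physically stored.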