Re: BUG #18624: Memory Leak Issue with PostgreSQL Connection During COPY Command Execution. - Mailing list pgsql-bugs

From Abhishek Mittal
Subject Re: BUG #18624: Memory Leak Issue with PostgreSQL Connection During COPY Command Execution.
Date
Msg-id CAEHmedAWf-7LBuKNA_CDnuQq+NQo0=uzzcMyk017EAf71kxMdg@mail.gmail.com
Whole thread Raw
In response to Re: BUG #18624: Memory Leak Issue with PostgreSQL Connection During COPY Command Execution.  (Junwang Zhao <zhjwpku@gmail.com>)
List pgsql-bugs
Hi Junwangm,

Could you please try with the program, which was provided by me.

Regards
Abhishek Mittal

On Thu, Sep 19, 2024 at 6:01 PM Junwang Zhao <zhjwpku@gmail.com> wrote:
Hi Abhishek,

On Thu, Sep 19, 2024 at 7:04 PM Abhishek Mittal <abhmittal@gmail.com> wrote:
>
> Hi Junwang,
>
> Please find the table schema below along with the attached sample CSV file.
> test=> \d my_table
>                        Table "public.my_table"
>      Column     |          Type           | Collation | Nullable | Default
> ----------------+-------------------------+-----------+----------+---------
>  id             | integer                 |           |          |
>  uindex         | integer                 |           |          |
>  pindex         | integer                 |           |          |
>  hid            | integer                 |           |          |
>  hcode          | bigint                  |           |          |
>  len            | integer                 |           |          |
>  uname          | character varying(4096) |           |          |
>  plen           | integer                 |           |          |
>  pname          | character varying(8192) |           |          |
>
> Note: There are no indexes on any of the columns.
>
> Regards
> Abhishek Mittal
>
> On Thu, Sep 19, 2024 at 2:57 PM Junwang Zhao <zhjwpku@gmail.com> wrote:
>>
>> On Thu, Sep 19, 2024 at 4:28 PM PG Bug reporting form
>> <noreply@postgresql.org> wrote:
>> >
>> > The following bug has been logged on the website:
>> >
>> > Bug reference:      18624
>> > Logged by:          Abhishek Mittal
>> > Email address:      abhmittal@gmail.com
>> > PostgreSQL version: 12.16
>> > Operating system:   ubuntu0.20.04.1
>> > Description:
>> >
>> > We are currently experiencing a memory leak issue with the PostgreSQL
>> > connection process while executing the COPY command multiple times on the
>> > same connection.
>> >
>> > Issue Details:
>> > 1. We establish a connection using the following command:
>> > psql <DB_NAME> <USER_NAME>
>> >
>> > 2. We retrieve the process ID associated with this connection from the
>> > pg_stat_activity table using the query:
>> > SELECT pid, query FROM pg_stat_activity;
>> >
>> > 3. We then monitor this process ID using the top command:
>> > top -p <PID>
>> >
>> > 4. We observe that the RES (resident) memory reported by the top command
>> > continuously increases. This memory increase only occurs when we run queries
>> > on this connection. If no queries are executed, the RES memory remains
>> > stable.
>> >
>> > 5. We attempted to clear the cache by executing the DISCARD ALL command
>> > after each query, but the issue persists.
>> >
>> > Additionally, we tested the process using a C program with libpg, and the
>> > issue remains the same.
>> >
>> > Additional Information:
>> > We are executing only the COPY command to copy a CSV file into a database
>> > table.
>> > In our scenario, the RES memory usage has reached approximately 10 GB.
>> >
>> > Could you please provide guidance or recommend a solution to address this
>> > memory leak issue?
>> >
>> >
>> > Please find the sample C program below.
>> > int db_copy_into_db(PGconn *conn, char *buf, int read_bytes)
>> > {
>> >   PGresult *res;
>> >   char copy_cmd[1024];
>> >   int ret;
>> >
>> >   /*If read_bytes are 0 then no need for further processing */
>> >   if(!read_bytes)
>> >     return 0;
>> >
>> >   sprintf(copy_cmd, "COPY %s FROM STDIN WITH DELIMITER ',' NULL AS '' CSV",
>> > "my_table");
>> >   res = PQexec(conn, copy_cmd);
>> >   if (PQresultStatus(res) != PGRES_COPY_IN)
>> >   {
>> >     fprintf(stderr,"COPY command for %s table failed: %s\n",
>> > "urltable_1114", PQerrorMessage(conn));
>> >     conn = NULL;
>> >     return -1;
>> >   }
>> >   //fprintf(stdout ,"COPY command is started\n");
>> >
>> >   ret = PQputCopyData(conn, buf, read_bytes);
>> >   //fprintf(stdout, "ret = %d\n", ret);
>> >   if(ret == -1)
>> >   {
>> >     fprintf(stderr, "PQputCopyData command failed: %s\n",
>> > PQerrorMessage(conn));
>> >     conn = NULL;
>> >     return -1;
>> >   }
>> >
>> >   /* Ends the COPY_IN operation successfully if errormsg is NULL.
>> >    * If errormsg is not NULL then the COPY is forced to fail,
>> >    * with the string pointed to by errormsg used as the error message. */
>> >   if(PQputCopyEnd(conn, NULL) != 1)
>> >   {
>> >     fprintf(stderr,"PQputCopyEnd() command for %s table failed: %s\n",
>> > "urltable_1114", PQerrorMessage(conn));
>> >     conn = NULL;
>> >   }
>> >   PQclear(res);
>> >   res = PQexec(conn, "DISCARD ALL");
>> >
>> >   if (PQresultStatus(res) != PGRES_COMMAND_OK)
>> >   {
>> >     fprintf(stderr, "DISCARD ALL command failed: %s",
>> > PQerrorMessage(conn));
>> >   }
>> >   PQclear(res);
>> >   return 0;
>> > }
>> >
>> > void read_csv_and_copy_inio_db( char *csv_file, PGconn *conn)
>> > {
>> >  char* read_buf = malloc(sizeof(char) * 1024 * 1024 * 20);
>> >  int read_bytes;
>> >
>> >  /* 1. open file
>> >   * read bufer
>> >   */
>> >   int fd;
>> >   int status;
>> >   fd = open(csv_file, O_RDONLY);
>> >   while (1)
>> >   {
>> >     //seek to byke 0
>> >     status = lseek(fd, 0, SEEK_SET);
>> >     if(status == -1)
>> >       fprintf(stderr, "Failed to seek offset file\n");
>> >     else
>> >       printf("Seeking to 0\n");
>> >     read_bytes = read(fd, read_buf, 1024 * 1024 * 20);
>> >     db_copy_into_db(conn, read_buf, read_bytes);
>> >
>> >     getc(stdin);
>> >   }
>> > }
>> >
>> > int main()
>> > {
>> >   char csv_file[1024 * 10];
>> >   char *conn_info = "dbname=my_db user=abhi";
>> >   PGconn *conn = PQconnectdb(conn_info);
>> >   if(PQstatus(conn) != CONNECTION_OK)
>> >   {
>> >     fprintf(stderr, "Connection to database failed: %s\n",
>> > PQerrorMessage(conn));
>> >     if(conn)
>> >     {
>> >       PQfinish(conn);
>> >       conn = NULL;
>> >     }
>> >     return -1;
>> >   }
>> >   sprintf(csv_file,"/tmp/my_table.csv");
>> >   read_csv_and_copy_inio_db(csv_file, conn);
>> > }
>> >
>>
>> Can you please provide the table schema and my_table.csv file?
>>
>> --
>> Regards
>> Junwang Zhao
>
>
>
> --
> amittal

I tried to reproduce the bug you reported against postgresql master
but no luck, I built from source with asan, I didn't observe any leak.

I did not test against 12.16 yet because I got some compiling issues.

I use the following command btw, not sure if this is enough to
reproduce the bug.

select 'COPY my_table from ''/tmp/my_table.csv'' CSV' FROM
generate_series(0,10000) \gexec

--
Regards
Junwang Zhao


--
amittal

pgsql-bugs by date:

Previous
From: Junwang Zhao
Date:
Subject: Re: BUG #18624: Memory Leak Issue with PostgreSQL Connection During COPY Command Execution.
Next
From: Andrei Lepikhov
Date:
Subject: Volatile functions under Memoize node