Re: [PATCH] Simple progress reporting for COPY command - Mailing list pgsql-hackers
From | Josef Šimánek |
---|---|
Subject | Re: [PATCH] Simple progress reporting for COPY command |
Date | |
Msg-id | CAFp7Qwop_j94NaZ4OoOSNpCiLY3m+eq6WO3RmCWCw7Y0tOL_Fw@mail.gmail.com Whole thread Raw |
In response to | Re: [PATCH] Simple progress reporting for COPY command (Josef Šimánek <josef.simanek@gmail.com>) |
Responses |
Re: [PATCH] Simple progress reporting for COPY command
|
List | pgsql-hackers |
I'm attaching the whole patch since commitfest failed to ingest the last incremental on CI. út 5. 1. 2021 v 2:32 odesílatel Josef Šimánek <josef.simanek@gmail.com> napsal: > > út 5. 1. 2021 v 0:46 odesílatel Tomas Vondra > <tomas.vondra@enterprisedb.com> napsal: > > > > Hi, > > > > I did take a quick look today, and I have a couple minor comments: > > Hi! Thanks for your time. > > > 1) The catalog sgml docs seem to mention bytes_processed twice (one of > > that should be bytes_total), and line_processed (should be "lines_"). > > Fixed in attached patch. > > > > > 2) I'm not quite sure about not including any info about the command. > > For example pg_stat_progress_create_index includes info about the > > command, although it's not very detailed. Not sure how useful would it > > be to show just COPY FROM / COPY TO, without more details. > > > > It's probably possible to extract this from pg_stat_activity, but that > > may be rather cumbersome (parsing arbitrary SQL and all that). Also, > > what if the COPY is called from a function etc.? > > Any idea where to discuss this? My usecase is really simple. I would > like to be able to see progress of COPY command by pid. There is a lot > of COPY info inside CopyToStateData and CopyFromStateData structs. The > only limitation I see is support of only int64 values for progress > reporting columns. I'm not sure if it is easily possible to expose for > example filename this way. > > > > > 3) I wonder if we should have something like lines_estimated. For COPY > > TO it's pretty simple to calculate it as > > > > bytes_total / (bytes_processed / lines_processed) > > > > but what about > > > > COPY (query) TO file > > > > In that case we don't know the total amount of bytes / rows, so we can't > > calculate any estimate. So maybe we could peek into the query plan? But > > I agree this is something we can add later. > > If I remember well one of the original ideas was to be able to pass > custom bytes_total (or lines_total) by COPY options to be stored in > copy progress. I can add this in some following patch if still > welcomed. > > For estimates I would prefer to add an additional column to not mix > those two together (or at least boolean estimated = true/false and > reuse bytes_total column). If query estimates are welcomed, I can take > a look at how to reach the query plan and expose those numbers when > the query is used to estimated_lines and potentially estimated_bytes > columns. It would be probably a little tricky to calculate > estimated_bytes for some column types. > > Also currently only COPY FROM file supports bytes_total (it is 0 for > all other scenarios). I think it should be possible for PostgreSQL to > know the actual amount of lines query returns for some kind of > queries, but I have no idea where to look at this. If that's possible > to get, it would be one of the next steps to introduce additional > column lines_total. > > > > > 4) This comment is a bit confusing, as it mixes "total" and "processed". > > I'd just say "number of bytes processed so far" instead. > > > Fixed in attached patch. > > > > Other than that, it seems fine. I'm sure we could add more features, but > > it seems like a good start - I plan to get this committed once I get a > > patch fixing the docs issues. > > Patch is attached, it should be applied to the top of the previous > patch. Overall patch (having both patches merged together) could be > found at https://patch-diff.githubusercontent.com/raw/simi/postgres/pull/6.patch. > > > regards > > > > -- > > Tomas Vondra > > EnterpriseDB: http://www.enterprisedb.com > > The Enterprise PostgreSQL Company
Attachment
pgsql-hackers by date: