Re: speed up full table scan using psql - Mailing list pgsql-general

From: Adrian Klaver
Subject: Re: speed up full table scan using psql
Date:
Msg-id: d6b0d93a-e0fe-c92a-bec1-1f4de5627952@aklaver.com
In response to: Re: speed up full table scan using psql (Lian Jiang <jiangok2006@gmail.com>)
List: pgsql-general
On 5/31/23 13:57, Lian Jiang wrote:
> The command is: psql $db_url -c "copy (select row_to_json(x_tmp_uniq) 
> from public.mytable x_tmp_uniq) to stdout"
> postgres version:  14.7
> Does this mean COPY and java CopyManager may not help since my psql 
> command already uses copy?

I don't think the issue is COPY itself, but the row_to_json(x_tmp_uniq) conversion.
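
If you want to confirm that, one quick check is to time both forms from
psql (a sketch; the table name is yours, '/dev/null' is just a throwaway
target):

   \timing on
   \copy public.mytable to '/dev/null' with (format csv)
   \copy (select row_to_json(x) from public.mytable x) to '/dev/null'

If the second form takes much longer, the time is going into building the
JSON, not into the scan or the transfer.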

This:

https://towardsdatascience.com/spark-essentials-how-to-read-and-write-data-with-pyspark-5c45e29227cd

indicates Spark can use CSV as an input source.

Given that, I would just COPY the data out as CSV.
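
Something along these lines (untested here; it reuses your $db_url, and
whether you want the header option depends on how you point Spark at the
file):

   psql $db_url -c "copy public.mytable to stdout with (format csv, header)" > mytable.csv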

> 
> Regarding pg_dump, it does not support json format which means extra 
> work is needed to convert the supported format to jsonl (or parquet) so 
> that they can be imported into snowflake. Still exploring but want to 
> call it out early. Maybe 'custom' format can be parquet?
> 
> 
> Thanks
> Lian

-- 
Adrian Klaver
adrian.klaver@aklaver.com



