Re: Make COPY extendable in order to support Parquet and other formats - Mailing list pgsql-hackers

From Aleksander Alekseev
Subject Re: Make COPY extendable in order to support Parquet and other formats
Msg-id CAJ7c6TPcsFScSneXHJShZAfatcYS-VqX+TtVU8TAmHwnVoTioQ@mail.gmail.com
In response to Re: Make COPY extendable in order to support Parquet and other formats  (Ashutosh Bapat <ashutosh.bapat.oss@gmail.com>)
Responses Re: Make COPY extendable in order to support Parquet and other formats
List pgsql-hackers
Hi Ashutosh,

> An extension just for COPY to/from parquet looks limited in
> functionality. Shouldn't this be viewed as an FDW or Table AM support
> for parquet or other formats? Of course the later is much larger in
> scope compared to the first one. But there may already be efforts
> underway
> https://www.postgresql.org/about/news/parquet-s3-fdw-01-was-newly-released-2179/

Many thanks for sharing your thoughts on this!

We are using parquet_fdw [2] but this is a read-only FDW.

What users typically need is to dump their data as fast as possible in
a given format and either upload it to the cloud as historical data
or transfer it to another system (Spark, etc.). The data can be
accessed later if needed, as read-only data.

Note that when accessing the historical data with parquet_fdw you
basically have zero ingestion time.

Another possible use case is transferring data to PostgreSQL from
another source. Here the requirements are similar - the data should be
dumped as fast as possible from the source, transferred over the
network and imported as fast as possible.

In other words, I'm personally not aware of use cases where somebody
needs a complete read/write FDW or TableAM implementation for formats
like Parquet, ORC, etc. Also, to my knowledge, these formats are not
particularly optimized for that kind of read/write access.

[2]: https://github.com/adjust/parquet_fdw

-- 
Best regards,
Aleksander Alekseev


