Re: zheap: a new storage format for PostgreSQL - Mailing list pgsql-hackers
From | Amit Kapila |
---|---|
Subject | Re: zheap: a new storage format for PostgreSQL |
Date | |
Msg-id | CAA4eK1L=mnOnJLP8WwQ5Q2PsHNGrEnwY5aZro2XDyxt15iHB4A@mail.gmail.com |
In response to | Re: zheap: a new storage format for PostgreSQL (Alexander Korotkov <a.korotkov@postgrespro.ru>) |
Responses |
Re: zheap: a new storage format for PostgreSQL
|
List | pgsql-hackers |
On Fri, Mar 2, 2018 at 4:05 PM, Alexander Korotkov <a.korotkov@postgrespro.ru> wrote:
> On Fri, Mar 2, 2018 at 1:31 PM, Amit Kapila <amit.kapila16@gmail.com> wrote:
>>
>> On Fri, Mar 2, 2018 at 1:50 PM, Tsunakawa, Takayuki
>> <tsunakawa.takay@jp.fujitsu.com> wrote:
>> > From: Amit Kapila [mailto:amit.kapila16@gmail.com]
>> >> At EnterpriseDB, we (me and some of my colleagues) are working from more
>> >> than a year on the new storage format in which only the latest version of
>> >> the data is kept in main storage and the old versions are moved to an undo
>> >> log. We call this new storage format "zheap". To be clear, this proposal
>> >> is for PG-12.
>> >
>> > Wonderful! BTW, what "z" stand for? Ultimate?
>> >
>>
>> There is no special meaning to 'z'. We have discussed quite a few
>> names (like newheap, nheap, zheap and some more on those lines), but
>> zheap sounds better. IIRC, one among Robert or Thomas has come up
>> with this name.
>
>
> I would propose "zero-bloat heap" disambiguation of zheap. Seems like fair
> enough explanation for me without need to rename :)
>

It has been a while since we last posted progress on this project, so
here is an update. It covers the features that were not working (as
mentioned in Readme.md) when the branch was published.

1. TID Scans are working now.
2. Insert .. On Conflict is working now.
3. Tuple locking is working, with the restriction that if there are
more concurrent lockers on a page than there are transaction slots on
that page, some of the lockers will wait until others commit. We are
working on a solution that extends the number of transaction slots
onto a separate set of pages, which exist in the heap but contain only
transaction data. There are also some corner cases where it doesn't
work for rollbacks.
4. Foreign keys are working.
5. Vacuum/Autovacuum is working.
6. Rollback of prepared transactions is working.

Apart from this, we have fixed some other open issues.
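To make the restriction in item 3 concrete, here is a toy model in
Python. It is only an illustration and is not code from the zheap
branch: the class name Page, the methods lock() and commit(), the
first-come-first-served wait queue, and the slot count of 2 are all
assumptions made for the sketch. It shows only the behaviour described
above, namely that once every transaction slot on a page is taken,
further lockers wait until an earlier one commits.

```python
from collections import deque


class Page:
    """Toy page with a fixed number of transaction slots (hypothetical model)."""

    def __init__(self, num_slots):
        self.num_slots = num_slots
        self.holders = set()    # transactions that currently own a slot
        self.waiters = deque()  # transactions waiting for a slot, oldest first

    def lock(self, xid):
        """Return True if xid holds a slot, False if it has to wait."""
        if xid in self.holders:
            return True
        if len(self.holders) < self.num_slots:
            self.holders.add(xid)
            return True
        if xid not in self.waiters:
            self.waiters.append(xid)
        return False

    def commit(self, xid):
        """Free xid's slot and give it to the oldest waiter, if there is one."""
        if xid not in self.holders:
            return
        self.holders.remove(xid)
        if self.waiters:
            self.holders.add(self.waiters.popleft())


page = Page(num_slots=2)
results = [page.lock(xid) for xid in (101, 102, 103)]  # third locker must wait
page.commit(101)                                       # frees a slot for 103
```

In this model the third locker gets its slot only after transaction
101 commits. Moving the extra slots to separate pages, as mentioned in
item 3, would remove the wait; that part is not modelled here.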
I think we need to start separate threads to discuss some of the
designs (as Thomas has already done for undo logs [1]), but it is also
okay to discuss them on this thread.

One specific area where we need input is the testing of this new heap.
As of now, we test it with a GUC parameter (storage_engine): when it
is set to zheap, all the regression tests create their tables in zheap
and the operations are zheap-specific. This basically works, but the
results differ from the expected output in some cases: (a) in-place
updates cause rows to be printed in a different order, (b) ctid-based
tests give different results because zheap has a metapage and TPD
pages, (c) \d+ shows storage_engine as an option, etc. We work around
this either by creating a separate .out file for zheap or, sometimes,
by masking the output that is expected to differ (for example, we
don't compare the additional storage_engine option in the output of
\d+). I know this is not the best way to test a new storage engine,
but for now it has helped us a lot. I think we need some generic way
to test new storage engines. I am not sure whether it is good to
discuss it here or whether it belongs on the Pluggable API thread.

Any thoughts?

[1] - https://www.postgresql.org/message-id/CAEepm%3D2EqROYJ_xYz4v5kfr4b0qw_Lq_6Pe8RTEC8rx3upWsSQ%40mail.gmail.com

--
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com