Re: [PATCH] [v8.5] Security checks on largeobjects - Mailing list pgsql-hackers
From | KaiGai Kohei |
---|---|
Subject | Re: [PATCH] [v8.5] Security checks on largeobjects |
Date | |
Msg-id | 4A5FD58B.5070201@ak.jp.nec.com |
In response to | Re: [PATCH] [v8.5] Security checks on largeobjects (KaiGai Kohei <kaigai@ak.jp.nec.com>) |
Responses | Re: [PATCH] [v8.5] Security checks on largeobjects |
List | pgsql-hackers |
I have summarized the design proposal and the issues we currently have.
I would like to hear any comments on the proposal.
In particular, the choice of snapshot is a headache for me.

----------------
This project tries to solve two items listed at:
  http://wiki.postgresql.org/wiki/Todo#Binary_Data

 * Add security checks for large objects
 * Allow read/write into TOAST values like large objects

= Introduction =

We need to associate metadata with each largeobject to implement
security checks for largeobjects. However, the current data structure of
largeobjects is not suitable for managing per-object metadata (such as
owner identifier, database ACLs, ...), because a largeobject is stored
as separate page frames in the pg_largeobject system catalog.
Thus, we need to revise the data structure used to manage a largeobject.

An interesting fact is the similarity in data structure between a TOAST
table and pg_largeobject.
A TOAST relation is declared as follows:

  pg_toast_%u (
      chunk_id    oid,
      chunk_seq   int4,
      chunk_data  bytea,
      unique(chunk_id, chunk_seq)
  )

The definition of pg_largeobject is as follows:

  pg_largeobject (
      loid        oid,
      pageno      int4,
      data        bytea,
      unique(loid, pageno)
  )

They have an identical data structure, so it is quite natural to use the
TOAST mechanism to store the page frames of a largeobject.

= Design =

In my plan, the role of pg_largeobject will change to managing the
metadata of largeobjects, and it will be redefined as follows:

  CATALOG(pg_largeobject,2613)
  {
      Oid         loowner;    /* OID of the owner */
      Oid         lonsp;      /* OID of the namespace */
      aclitem     loacl[1];   /* ACL of the largeobject */
      Blob        lodata;     /* Contents of the largeobject */
  } FormData_pg_largeobject;

For access control purposes, the ownership, namespace and ACLs are
necessary. In addition, Blob is a new type which supports reading and
writing a part of its TOAST value.
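As a rough illustration of what partial access over a (chunk_id, chunk_seq, chunk_data) layout means, here is a minimal in-memory sketch in Python. It is not PostgreSQL code: the ChunkStore class, its method names and the tiny CHUNK_SIZE are all hypothetical, and it assumes chunks are contiguous (no sparse objects). It only shows how a byte range maps onto chunk numbers, so that a small write touches a single chunk instead of rewriting the whole value.

```python
CHUNK_SIZE = 4  # hypothetical and deliberately tiny, to keep the example readable


class ChunkStore:
    """In-memory stand-in for a relation keyed by (chunk_id, chunk_seq)."""

    def __init__(self):
        self.rows = {}  # (chunk_id, chunk_seq) -> bytes

    def read(self, chunk_id, offset, length):
        """Return up to `length` bytes starting at `offset`."""
        out = bytearray()
        end = offset + length
        seq = offset // CHUNK_SIZE
        while seq * CHUNK_SIZE < end:
            chunk = self.rows.get((chunk_id, seq))
            if chunk is None:  # ran past the last stored chunk
                break
            lo = max(offset - seq * CHUNK_SIZE, 0)
            hi = min(end - seq * CHUNK_SIZE, CHUNK_SIZE)
            out += chunk[lo:hi]
            seq += 1
        return bytes(out)

    def write(self, chunk_id, offset, data):
        """Overwrite bytes in place; return the chunk_seq values touched."""
        touched = []
        pos = 0
        while pos < len(data):
            seq, start = divmod(offset + pos, CHUNK_SIZE)
            n = min(CHUNK_SIZE - start, len(data) - pos)
            chunk = bytearray(self.rows.get((chunk_id, seq), b""))
            if len(chunk) < start + n:  # zero-fill when extending a chunk
                chunk.extend(b"\0" * (start + n - len(chunk)))
            chunk[start:start + n] = data[pos:pos + n]
            self.rows[(chunk_id, seq)] = bytes(chunk)
            touched.append(seq)
            pos += n
        return touched
```

For example, after storing ten bytes under one identifier, overwriting two bytes at offset 5 reports that only chunk 1 was rewritten, and a read of the full range returns the value with just those two bytes changed.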
The current lo_xxx() interfaces will act as wrapper functions that access
the pg_largeobject.lodata identified by a largeobject handler.
The loread(), lowrite() and similar interfaces will support partial
accesses on the Blob type. This enables user-defined relations to contain
large data using the TOAST mechanism, with reasonable resource consumption.
(Note that TOAST currently replaces all of the chunks with the same
identifier, even if only a single byte changes.)

= Interfaces =

== New type ==

We need a new variable-length type that has the following features, to
allow users partial accesses.

 * It always uses an external TOAST table, independent of its size.
   If toasted data is stored inline, we cannot update it independently
   of the main table.
   This does not prevent partial reads, but they are meaningless because
   inlined data is small enough.

 * It always stores the data without any compression.
   We cannot easily compute the required data offset within compressed
   data. All of the toasted data would need to be uncompressed, for both
   reader and writer access.

== lo_xxx() interfaces ==

New versions of loread() and lowrite() are necessary to access a part of
toasted data within user-defined tables. They can be defined as follows:

  loread(Blob data, int32 offset, int32 length)
  lowrite(Blob data, int32 offset, Bytea data)

== GRANT/REVOKE ==

When we access traditional largeobjects, reader permission (SELECT) or
writer permission (UPDATE) should be checked on each access.

The GRANT/REVOKE statements are enhanced as follows:

  GRANT SELECT ON LARGE OBJECT 1234 TO kaigai;

It allows "kaigai" to read largeobject 1234.

= Issues =

== New pg_type.typstorage ==

The variable-length data always needs to be stored in external storage
and uncompressed. None of the existing typstorage values satisfies this
requirement, so we need to add a new pg_type.typstorage strategy.
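To make the gap concrete, the following Python sketch tabulates the four existing typstorage codes ('p' PLAIN, 'e' EXTERNAL, 'm' MAIN, 'x' EXTENDED) against the two requirements. It is a simplified model, not PostgreSQL code: the Strategy fields and the meets_blob_requirements() helper are names made up for this illustration, and the PROPOSED entry is hypothetical since no code letter has been chosen.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Strategy:
    may_compress: bool     # value may be stored compressed
    may_go_external: bool  # value may be moved to the TOAST relation
    always_external: bool  # value is moved out regardless of its size


# Existing pg_type.typstorage codes, in simplified form.
EXISTING = {
    "p": Strategy(may_compress=False, may_go_external=False, always_external=False),
    "e": Strategy(may_compress=False, may_go_external=True, always_external=False),
    # MAIN goes out of line only as a last resort.
    "m": Strategy(may_compress=True, may_go_external=True, always_external=False),
    "x": Strategy(may_compress=True, may_go_external=True, always_external=False),
}

# Hypothetical new strategy: always external, never compressed.
PROPOSED = Strategy(may_compress=False, may_go_external=True, always_external=True)


def meets_blob_requirements(s):
    """True if values are always out of line and never compressed."""
    return s.always_external and not s.may_compress
```

In this model 'e' comes closest, since it never compresses, but small values still stay inline, so none of the existing codes passes the check.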
The new typstorage strategy forces the following:

 - It always stores the given varlena data in an external toast relation.
 - It always stores the given varlena data without any compression.

This costs some performance, so the existing Text or Bytea types will
remain more suitable for variable-length data that is not very large.

== Snapshot ==

The largeobject interface uses SnapshotNow for writable accesses, and
GetActiveSnapshot() for read-only accesses, but toast_fetch_datum()
uses SnapshotToast to scan the toast relation.

It seems to me that SnapshotToast depends on the assumption that tuples
within a TOAST relation never have multiple versions.
When we update a toast value, the TOAST mechanism inserts the whole
variable-length datum with a new chunk_id, and the older chunks are
removed by toast_delete_datum().
The TOAST pointer is updated to the new chunk_id, and its visibility
is under MVCC control.

The source code comment at HeapTupleSatisfiesToast() says the following:

  /*
   * HeapTupleSatisfiesToast
   *      True iff heap tuple is valid as a TOAST row.
   *
   * This is a simplified version that only checks for VACUUM moving conditions.
   * It's appropriate for TOAST usage because TOAST really doesn't want to do
   * its own time qual checks; if you can see the main table row that contains
   * a TOAST reference, you should be able to see the TOASTed value.  However,
   * vacuuming a TOAST table is independent of the main table, and in case such
   * a vacuum fails partway through, we'd better do this much checking.
   *
   * Among other things, this means you can't do UPDATEs of rows in a TOAST
   * table.
   */

If a largeobject-like interface is available to update a part of a TOAST
value, we cannot keep this assumption.

As a first step, I plan to use the result of GetActiveSnapshot() to fetch
toasted values. Are there any problems that can be expected?

-- 
OSS Platform Development Division, NEC
KaiGai Kohei <kaigai@ak.jp.nec.com>