Re: custom function for converting human readable sizes to bytes - Mailing list pgsql-hackers
From | Vitaly Burovoy |
---|---|
Subject | Re: custom function for converting human readable sizes to bytes |
Date | |
Msg-id | CAKOSWNm+SSgGKYsv09n_6HhfWzmRr+cSjpgfhBPFf5nNPfJ5JQ@mail.gmail.com Whole thread Raw |
In response to | Re: custom function for converting human readable sizes to bytes (Pavel Stehule <pavel.stehule@gmail.com>) |
Responses |
Re: custom function for converting human readable sizes to bytes
|
List | pgsql-hackers |
Hello! I have reviewed this patch. It applies and compiles cleanly at the current master 1129c2b0ad2732f301f696ae2cf98fb063a4c1f8 and implements the behavior reached by a consensus. All size units (the same as used in the GUC) are supported. The documentation is present and describes behavior of the function. The code is mostly clear and commented enough, but I doubt about design (see p.3). Regression tests are present (see p.II). pg_proc.h has changed, so the CATALOG_VERSION_NO must be also changed. Code comments, error message string, DESCR in pg_proc.h and doc changes need proof-reading by a native English speaker, which I am not. === The last patch fixes only part of my last notes, so I can only return it to the next round with the same notes. The list of my notes has not been done (all points listed below was mentioned in my last review[1]): 1. The documentation has duplicate parts of size units (in table and in the text). I think that one of them should be deleted (from the table). 2. Test has a note about sharing(using) size units via(from) memory_unit_conversion_table and a requirement to update the documentation when new units are supported, but the note will not be seen because there is no tests for that fail for units ("PB", "EB", "ZB") which will change in the future by the "memory_unit_conversion_table". 3. Code design is looked a little odd for me. 3a. There is an attempt to check a correctness of a numeric value instead of using numeric_in which does the same thing. In the last case the code looks more compact and nice (see the full explanation in the previous review[1] and both PoC there). Possibly that way is chosen to emit its own error messages ("invalid size:" instead of "invalid input syntax for type numeric:"), but it still leads errors like 'invalid unit: "+ kB"' for input value "+912+ kB" (see "dbsize.out"). 3b. There is an extra copying from the buffer "str" to the "buffer" (it can be avoided and shown in an PoC "single_buffer.c"). 3c. There is freeing buffers "str" and "buffer" that looks as a counter-design since MemoryContext frees them at once with other expired buffers (like "num" from numeric_in and numeric_mul; also "mul_num" from int8_numeric) === While I was writing that text I found five additional notes. Both parts (above and below) are important (see concusion). I. The new version of the patch (pg-size-bytes-10.patch) has the simplest typo in an updated (from v9) row "inavalid size". II. There is no test for an empty input value, but it works as expected (an error 'invalid size: ""'). III. There is no support of 'bytes' unit, it seems such behavior got majority approval[2]. IV. Code design of the function "parse_memory_unit" is returning "bool" value and set "multiplier" via pointer. But while Tom Lane was reviewed my patch[3] in the similar case he changed code that now returns (see NonFiniteTimestampTzPart) not a flag, but a value (where zero still means an error). Be honest it looks better, but the current code is written like nearby functions. V. The documentation lacks a note that the base of the "pg_size_bytes" is 1024 whereas the base of the "pg_size_pretty" is 1000. === Concusion: I *can't* mark this patch as "Ready for Committer" with all previous notes, and I ask experienced hackers for a help. I'm not a strong Postgres hacker yet and sending to the next round with the same notes seems a nitpicking for me. So questions to the hackers: a. Are the rest of the notes from the previous review[1] (and listed in pp.1-3c) a nitpicking or should be done? b. Does the function in PoC "single_buffer.c" look more complex or easer? c. Is it worth for p.II to change error message like 'Empty input value is not a valid size for pg_size_bytes: "%s"'? d. Should the function "pg_size_bytes" handle "bytes" as a size unit (see p.III)? There was no disagreement, but if I saw that thread before I became a reviewer and saw silent approval I would have argued. Moreover supporting of negative sizes was explained[4] by a symmetry for the "pg_size_pretty", which should include all size units produced by the "pg_size_pretty". On the other hand those functions can't be symmetric due to difference in bases (see p.V and [5]) e. What way is preferred for p.IV? f. If p.V is actual, how to write it in the documentation (be honest I've no idea)? Should a description of the "pg_size_pretty" be changed too? [1]http://www.postgresql.org/message-id/CAKOSWNn=p8GX-P5Y-B4t4dg-aAHaTup+CrG41hQ9BeobZwX3KQ@mail.gmail.com [2]http://www.postgresql.org/message-id/CACACo5QW7fFsFfhKsTjtYcP4QF3Oh9zA14SC6Z3DXx2yssJjSw@mail.gmail.com [3]http://git.postgresql.org/pg/commitdiff/647d87c56ab6da70adb753c08d7cdf7ee905ea8a [4]http://www.postgresql.org/message-id/CA+TgmoZFomG4eYorZZGf7pzotG9PxpUhtQvxLfsKiM4iZH8KRQ@mail.gmail.com [5]http://www.postgresql.org/message-id/CA+TgmoYgS_xixUQy+xyw9futtMrTbC_S0c1C0_wOBwVbNf8aZA@mail.gmail.com
pgsql-hackers by date: