Re: explanation for random_page_cost is outdated - Mailing list pgsql-docs
From | yigong hu |
---|---|
Subject | Re: explanation for random_page_cost is outdated |
Date | |
Msg-id | CAOxFffcourucFqSk+tZA13ErS3XRYkDy6EeaPff4AvHGiEEuug@mail.gmail.com Whole thread Raw |
In response to | Re: explanation for random_page_cost is outdated (Олег Самойлов <splarv@ya.ru>) |
Responses |
Re: explanation for random_page_cost is outdated
|
List | pgsql-docs |
Sorry to hijack the thread, I also recently have similar observation that the statement about random_page_cost on SSD is ambiguous. The current document says that
> Storage that has a low random read cost relative to sequential, e.g. solid-state drives, might also be better modeled with a lower value for random_page_cost.
However, this statement does not clarify what values might be good. For some workload, the default value 4.0 would cause bad performance and lowering random_page_cost to a value 3.0 or 2.0 does not solve the performance problem. Only when the random_page_cost is lowered to below 1.2 will the bad performance be mitigated. Thus, I would suggest elaborating on this description further as:
> Storage that has a low random read cost relative to sequential, e.g. solid-state drives, might also be better modeled with a value that is close to 1 for random_page_cost.
Detail:
I run the PostgreSQL 11 on an SSD hardware. The database has two small tables with 6MB and 16MB separately. The pgbench runs a select join query in 1 min. The result shows that when the random_page_cost is 1, the average latency is 14ms. When the random_page_cost is 1.5, 2, 3 or 4, the average latency is 26ms. This result suggests that setting random_page_cost to a value larger than 1.5 would cause almost 2x latency. If I increase the 6MB table to 60MB and rerun the sysbench, the result shows that when the random_page_cost is 1, the average latency is 13ms. When the random_page_cost is 1.5,2,3 or 4, the average latency is 17ms.
I attached my testing script, the postgresql configuration file, and planner output.
> Storage that has a low random read cost relative to sequential, e.g. solid-state drives, might also be better modeled with a lower value for random_page_cost.
However, this statement does not clarify what values might be good. For some workload, the default value 4.0 would cause bad performance and lowering random_page_cost to a value 3.0 or 2.0 does not solve the performance problem. Only when the random_page_cost is lowered to below 1.2 will the bad performance be mitigated. Thus, I would suggest elaborating on this description further as:
> Storage that has a low random read cost relative to sequential, e.g. solid-state drives, might also be better modeled with a value that is close to 1 for random_page_cost.
Detail:
I run the PostgreSQL 11 on an SSD hardware. The database has two small tables with 6MB and 16MB separately. The pgbench runs a select join query in 1 min. The result shows that when the random_page_cost is 1, the average latency is 14ms. When the random_page_cost is 1.5, 2, 3 or 4, the average latency is 26ms. This result suggests that setting random_page_cost to a value larger than 1.5 would cause almost 2x latency. If I increase the 6MB table to 60MB and rerun the sysbench, the result shows that when the random_page_cost is 1, the average latency is 13ms. When the random_page_cost is 1.5,2,3 or 4, the average latency is 17ms.
I attached my testing script, the postgresql configuration file, and planner output.
On Mon, Apr 27, 2020 at 3:19 AM Олег Самойлов <splarv@ya.ru> wrote:
Yep. Unclear. What parameter is recommended for SSD? Lower? 3? 2? 1?
Much better will be write: if you use SSD set 1.
Олег
> 19 марта 2020 г., в 23:56, Bruce Momjian <bruce@momjian.us> написал(а):
>
> On Thu, Feb 27, 2020 at 02:48:44PM +0000, PG Doc comments form wrote:
>> The following documentation comment has been logged on the website:
>>
>> Page: https://www.postgresql.org/docs/12/runtime-config-query.html
>> Description:
>>
>> Explanation for random_page_cost is rather outdated, because it did only for
>> case of mechanical hdd. But all modern database servers, which I know, made
>> upon SSD. Do or not do default value for random_page_cost equal to 1 is the
>> question, but, IMHO, at list in the documentation about random_page_cost
>> need to add in a speculation about SSD.
>>
>> It's important because a business programming now is mostly web programming.
>> Most database is poorly designed by web programmer, tables looked like a
>> primary key and a huge json (containing all) with large gin index upon it.
>> Now I am seeing a table with a GIN index 50% of the table size. The database
>> is on SSD, of cause. With default random_page_cost=4 GIN index don't used
>> by planner, but with random_page_cost=1 the result may be not excellent, but
>> acceptable for web programmers.
>
> Does this sentence in the random_page_cost docs unclear or not have enough
> visibility:
>
> https://www.postgresql.org/docs/12/runtime-config-query.html#RUNTIME-CONFIG-QUERY-CONSTANTS
>
> Storage that has a low random read cost relative to sequential, e.g.
> solid-state drives, might also be better modeled with a lower value for
> random_page_cost.
>
> --
> Bruce Momjian <bruce@momjian.us> https://momjian.us
> EnterpriseDB https://enterprisedb.com
>
> + As you are, so once was I. As I am, so you will be. +
> + Ancient Roman grave inscription +
Attachment
pgsql-docs by date: