Re: patch - per-tablespace random_page_cost/seq_page_cost - Mailing list pgsql-hackers
From | Robert Haas |
---|---|
Subject | Re: patch - per-tablespace random_page_cost/seq_page_cost |
Date | |
Msg-id | 603c8f070912171815s43071f77o41d9f12b3b54c31c@mail.gmail.com Whole thread Raw |
In response to | Re: patch - per-tablespace random_page_cost/seq_page_cost (Robert Haas <robertmhaas@gmail.com>) |
Responses |
Re: patch - per-tablespace random_page_cost/seq_page_cost
|
List | pgsql-hackers |
On Thu, Nov 26, 2009 at 4:25 PM, Robert Haas <robertmhaas@gmail.com> wrote: > On Sat, Nov 14, 2009 at 4:58 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote: >> I don't think there's even a >> solid consensus right now on which GUCs people would want to set at the >> tablespace level. > > This seems like an important point that we need to nail down. The > original motivation for this patch was based on seq_page_cost and > random_page_cost, to cover the case where, for example, one tablespace > is on an SSD and another tablespace is on a RAID array. > > Greg Stark proposed adding effective_io_concurrency, and that makes > plenty of sense to me, but I'm sort of disinclined to attempt to > implement that as part of this patch because I have no familiarity > with that part of the code and no hardware that I can use to test > either the current behavior or the modified behavior. Since I'm > recoding this to use the reloptions mechanism, a patch to add support > for that should be pretty easy to write as a follow-on patch once this > goes in. > > Any other suggestions? Going once... going twice... since no one has suggested anything or spoken against the proposal above, I'm just going to implement seq_page_cost and random_page_cost for now. > Current version of patch is attached. I've revised it to use the > reloptions stuff, but I don't think it's committable as-is because it > currently thinks that extracting options from a pg_tablespace tuple is > a cheap operation, which was true in the non-reloptions-based > implementation but is less true now. At least, some benchmarking > needs to be done to figure out whether and to what extent this is an > issue. Per the email that I just sent a few minutes ago, there doesn't appear to be a performance impact in doing this even in a relatively stupid way - every call that requires seq_page_cost and/or random_page_cost results in a syscache lookup and then uses the relcache machinery to parse the returned array. That leaves the question of what the most elegant design is here. Tom suggested upthread that we should tag every RelOptInfo - and, presumably, IndexOptInfo, though it wasn't discussed - with this information. I don't however much like the idea of adding identically named members in both places. Should the number of options expand in the future, this will become silly very quickly. One option is to define a struct with seq_page_cost and random_page_cost that is then included in RelOptInfo and IndexOptInfo. It would seem to make sense to make the struct, rather than a pointer to the struct, the member, because it makes the copyfuncs/equalfuncs stuff easier to handle, and there's not really any benefit in incurring more palloc overhead. However, I'm sort of inclined to go ahead and invent a mini-cache for tablespaces. It avoids the (apparently insignificant) overhead of reparsing the array multiple times, but it also avoids bloating RelOptInfo and IndexOptInfo with more members than really necessary. It seems like a good idea to add one member to those structures anyway, for reltablespace, but copying all the values into every one we create just seems silly. Admittedly there are only two values right now, but again we may want to add more someday, and caching at the tablespace level just seems like the right way to do it. Thoughts? ...Robert
pgsql-hackers by date: