Re: pgbench: new feature allowing to launch shell commands - Mailing list pgsql-hackers
From | Greg Smith |
---|---|
Subject | Re: pgbench: new feature allowing to launch shell commands |
Date | |
Msg-id | 4B160F7C.20700@2ndquadrant.com Whole thread Raw |
List | pgsql-hackers |
Here's a first round of review of the patch submitted at http://archives.postgresql.org/message-id/c64c5f8b0910252350w42f7ea13g52a6a88a86143374@mail.gmail.com This is moving along nicely, and is now working (with some hacking) for the one use case I wanted it for. High-level summary of what I think needs to get done before this is ready to commit: 1) Needs tab/space formatting cleaned up 2) "Execution of meta-command failed" errors are a small but serious problem 3) Should consider how :variable interpretation should work in a \[set]shell call 4) General code cleanup, including possible refactoring 5) Update pgbench docs to cover new calls. I hoped to find time to help with this, it looks like I'm not going to have it in the near future. 6) Should do basic performance regression testing to confirm this patch doesn't impact pgbench results that don't use the new feature. This I'll take care of, I'm not particularly worried about that based on what's been changed so far. Details: This revision is much improved over the earlier ones. No problems applying the patch, and the basics work as expected. One bit of code formatting nitpicking: there are some spots where the code doesn't line up horizontally as expected. I'm seeing lines spaced across with tabs mixed with ones where spaces were used, at least one line where I think you formatted presuming 8-character tabs. That's kind of painful to expect a committer to clean up. See http://wiki.postgresql.org/wiki/Developer_FAQ#What.27s_the_formatting_style_used_in_PostgreSQL_source_code.3F for more details about expectations here. Attached are the scripts I used for testing the feature: skewed-acct.pl : perl script that takes in the number of accounts and picks one with a skewed Pareto-ish distribution: 80% of the time, one from the first 20% of the accounts is picked. skewed-select.sql : pgbench script that calls the script skewed-test.sh : Little test program to confirm that the Perl code works as it's supposed to, for anyone who wants to hack on it for some reason (my Perl is miserable) Here's what I did to test: sudo ln -s `pwd`/skewed-acct.pl /usr/local/bin createdb pgbench pgbench -i -s 10 pgbench pgbench -T 10 -j 4 -c 8 -S pgbench pgbench -T 10 -j 4 -c 8 -f skewed-select.sql pgbench Baseline gives me about 20K selects/second, with the shell call in the middle that dropped to around 400/second. Since we know shell calls are expensive that's not too much of a surprise. It does make an interesting statement about the overhead here though--if you want to instrument something you expect to execute thousands of times a second, that's probably not going to be possible using these calls. I didn't try yet to see how small I could make the shell overhead smaller (using a simpler script instead of the current Perl one, minimizing the number of characters in the pgbench script) To watch what was happening, I needed to toggle on "log_statement=all". That will confirm that the skew was working properly, which is good to see because it didn't as I originally wrote it. There are two functional problems that jumped right out at me with the implementation. First, sometimes the call fails. On every run, I'm getting one or more of these (note that each of the 4 threads has two clients running against it, 0 and 1): setshell: error getting parameterClient 0 aborted in state 1. Execution of meta-command failed. setshell: error getting parameterClient 1 aborted in state 1. Execution of meta-command failed. Right near the end. Am guessing there's some last-transaction cleanup going awry. Don't know why that is, and will be curious if it can be reproduced. A much more insidious issues is that shell and setshell don't do interpretation of pgbench variables like ":naccounts". The way I originally wrote skewed-select.sql (which you can still see commented out in the attached version) it looked like this: \set naccounts 100000 * :scale \setshell aid skewed-acct.pl :naccounts SELECT abalance FROM pgbench_accounts WHERE aid = :aid; This didn't work through--that was calling the script with the literal text ":naccounts" rather than the value for that. Similarly, when I tried to add some debugging bits to the end like this: \shell echo :aid That just printed ":aid" rather than the value. I would like to see both of these work as written above (the versions commented out in the attached sample). Now, it's possible to work around that limitation in some cases--the attached version just hard-codes in the number of accounts given the scale I was testing at. This isn't really ideal though. Unfortunately for you, the way :variable decoding is handling inside pgbench right now doesn't even have to consider things like quoting. The variable decoding is done at the exact places it makes sense at, rather than being done more globally. That means the existing replacement logic can't be re-used for this feature. And we can certainly expect that shell commands might have a ":" in them anyway, so doing this right ends up leading toward things like making "::" be a ":" that doesn't get substituted. Maybe this whole issue can be avoided as just being more complicated than this feature justifies. I think people will be surprised, and find this not as useful as it could be, unless this is done though. Also, this doesn't work \setshell :aid skewed-acct.pl 1000000 Which I think will surprise people. If there's a : as the first character of the second field here, it should just get clipped off. As for more general code review, one thing that jumped out at me was this: if (fgets(res, 64, respipe) == NULL) This should use sizeof(res) instead of hard-coding its size like that here. So far as general structure goes, actually adding the variable substitution complexity to the shell/setshell code is going to make it even more complicated, and it already sticks out as badly fitting into doCustom as it is. I suspect this capability will make for a really unmanagable result. Maybe refactor the new shell stuff a bit into another subroutine now that you've finished the main guts? doCustom feels like it's grown way beyond its original purpose here with this patch, it was on the edge of that before you started--and these two new commands are by far the most complicated. -- Greg Smith 2ndQuadrant Baltimore, MD PostgreSQL Training, Services and Support greg@2ndQuadrant.com www.2ndQuadrant.com \set naccounts 100000 * :scale -- This doesn't work: --\setshell aid skewed-acct.pl :naccounts -- Have to hard-code the value instead: \setshell :aid skewed-acct.pl 1000000 SELECT abalance FROM pgbench_accounts WHERE aid = :aid; -- This doesn't work as expected either -- \shell echo :aid
Attachment
pgsql-hackers by date: