vary read_only in SPI calls? or poke at the on-entry snapshot? - Mailing list pgsql-hackers
From | Chapman Flack |
---|---|
Subject | vary read_only in SPI calls? or poke at the on-entry snapshot? |
Date | |
Msg-id | 5BA30616.70207@anastigmatix.net Whole thread Raw |
Responses |
Re: vary read_only in SPI calls? or poke at the on-entry snapshot?
|
List | pgsql-hackers |
SOME BACKGROUND: The code for PL/Java functions resides in 'class images'. These are made available, in advance, by loading them from 'jar files' into some tables in the database. When a PL/Java function is called, if the JVM hasn't loaded its class image yet, the JVM calls PL/Java's loader, which does an SPI query to retrieve the image. This SPI query, only needed on the first reference, is effectively part of the activity of the PL/Java function that has been called. Among other things, that means the read_only SPI parameter is passed as true if the PL/Java function in question is declared IMMUTABLE. (A purist reaction to this might be to say no PL/Java function ever deserves to be declared IMMUTABLE then, as the first call can generate a database query to fetch the function code. But it has been that way a long time, and as a practical matter, I'm not sure how much less "immutable" that is than whatever happens to load/map/retrieve the needed code in other PLs. It still strikes me as reasonable to declare a function immutable as long as that's a fair description of what the function itself does.) A headache does result in one corner case though. THE HEADACHE: The operation of making class images available in advance, from a jar file, is done by the install_jar() function. The function reads images from the jar file and stores them in the database. It can then also execute SQL commands contained in the file. These can be declarations of the functions and types and any other actions appropriate when installing the jar. They can include queries that use the types and functions that just got declared. A query that uses a newly-declared type will illustrate the headache, because it results in a nested call into PL/Java for the type I/O function, which is (typically) declared immutable. If it's the first use of the class, the JVM will need to call the loader. The loader does an SPI query to retrieve the image. Because this takes place 'during' a call to an 'immutable' function, read_only => true is passed to SPI. One effect of read_only => true in SPI is to rely on a snapshot from the very start of the operation---when the jar file just loaded wasn't there yet! Therefore, the loader fails to find the class. So, this headache only affects references to the classes in a jar from the install commands contained in the same jar, and only when some of those install commands refer to immutable functions, and only when such commands occur before any call to a stable or volatile function that happens to be defined in the same class. It can be worked around by reordering commands in the jar or by adding a no-op function declared stable or volatile and adding an early query that calls it. But these are hacks, and I would like to address the root issue so hacks aren't necessary. PASS read_only => false FROM THE CLASS LOADER? One obvious approach would be to just arrange for read_only always to be false in SPI queries originating from the class loader, regardless of the volatility declaration on the function on whose behalf the loader is operating. As to the convention of setting read_only based on the function's volatility, the manual says "While authors of C functions are able to violate this convention, it's unlikely to be a good idea to do so." Maybe this is an example where an exception is reasonable. So, a function declared immutable would generate read_only SPI queries for all of its normal behavior, but its first call might first generate a non-read_only query from the class loader, followed by read_only ones doing the function's own business. SOMEHOW ADVANCE THE ON-ENTRY SNAPSHOT DURING install_jar? An alternative could be to make the install_jar function smarter, since it is only during the operation of install_jar that there is any issue here. The function does two things in sequence: 1. Load the class images from the jar file 2. Execute the SQL commands in the file install_jar has not much control or insight into step 2; it is just reading query text and passing it to SPI, and we'll see the problem if one of those commands refers to a not-yet-loaded class from step 1. But install_jar knows exactly what it's doing in step 1. It populates the tables of class images, using ordinary, non-read_only SPI calls. It could even CommandCounterIncrement the snapshot those calls are using---only that's futile, because that snapshot gets popped at the end of step 1, leaving the former ActiveSnapshot, where the counter is unchanged. Any SPI calls made in step 2 will either get a new snapshot (in the read_only false case), or rely on the current ActiveSnapshot (in the read_only true case), which is the one that's too old to succeed. Would it be unprecedented / be unreasonable / break anything for the install_jar function to simply force a CommandCounterIncrement at the end of step 1 (after its temporary snapshot has been popped, so the former/on-entry ActiveSnapshot gets the increment)? This really strikes me as the less blunt, more surgical solution to the headache. The crux of the problem is just the persistence, during install_jar, of a snapshot where the jar you told it to install is still missing. This proposed fix would violate (in this one narrow circumstance) the documented visibility rule for read_only SPI calls, but the documented rule just isn't useful for this case. But is there any reason this wouldn't work, or could break something? DECISION TIME ... 1. fiddle the loader to always pass read_only => false to SPI calls, regardless of the volatility of the function it is loading for. 2. leave the loader alone, and adjust install_jar (an infrequent operation) to do something heretical with its on-entry snapshot. I'm leaning toward 2. What say you? Is there a 3 or a 4 that I'm overlooking? Thanks, -Chap
pgsql-hackers by date: