Re: Hibernate / other ORM and pg_advisory_lock - Mailing list pgsql-jdbc
From | Craig Ringer
---|---
Subject | Re: Hibernate / other ORM and pg_advisory_lock
Date |
Msg-id | 47F9D312.20200@postnewspapers.com.au
In response to | Re: Hibernate / other ORM and pg_advisory_lock (Andrew <archa@pacific.net.au>)
Responses | Re: Hibernate / other ORM and pg_advisory_lock
List | pgsql-jdbc
I'm sorry if I've been communicating all this poorly. Even if I've been muddling you and convincing you that I'm quite crazy, it's been a helpful conversation for me ;-) . I hope my explanations below make a bit more sense than the prior ones.

Andrew wrote:

>> Of course, even with advisory locking it's always *possible* that someone else might sneak in an update. They might not check the lock. It might be another app that doesn't know about the advisory locks. Etc. So it's always necessary to use optimistic or pessimistic transaction-level locking/versioning as well, to ensure there are no update collisions, and it's necessary to be able to handle failed updates.
>
> So back to pessimistic locking, which I believe we both had agreed was not a desirable behaviour, particularly in a user interaction model?

The reference to pessimistic locking was a bit of a red herring. What I meant was simply: "If advisory locking is being used, it is still necessary to rely on normal locking methods to ensure that conflicting updates are detected or prevented."

In other words, advisory locking in this situation is used purely to help the user out, giving them advance warning about potential concurrent updates and letting them avoid wasting time and effort. It does not replace normal methods for ensuring data consistency and integrity. It interacts with optimistic locking only in that it's intended to reduce or eliminate user-visible conflicts: advisory locking can be used to detect and avoid situations that would otherwise result in an optimistic locking conflict requiring user action like data re-entry. It can't (or IMO shouldn't) replace optimistic locking.

An advisory lock allows information about object "busy-ness" to be communicated among clients outside a transactional scope. This means that conflicts can be detected and prevented before they happen, instead of waiting until all the work is done and catching an optimistic locking conflict.

Often we do not care about a conflict, as the operation can be re-done behind the user's back without them knowing or caring - incrementing some value, for example. In other cases there might be time-consuming or difficult-to-repeat changes being made that the program cannot simply retry without asking the user how to handle it - changing a notes field on a customer record, for example. In those cases it's much nicer to catch the conflict BEFORE the user does all the work. That's where advisory locking is useful. (Yes, I know a notes/history field can just be broken down into multiple entries in a separate table. It's an *example*, because the main areas of concern in the real system would require way too much background.)

In a system designed for advanced users I'd quite possibly just let both users proceed and tell the user who lost the race to redo their changes and try again. I might even provide a "merge/diff" style summary of the conflicts between their changes and the winning change. This is, however, not a system for advanced users, and for them it's a *lot* nicer to be told somebody else is editing something complicated and be able to wait. I wouldn't expose a normal user working with a content management system to a merge conflict, source revision control system style. I don't know what your users are like, but mine have a hard time saving email attachments, let alone interpreting change conflict summaries. I'd prefer to tell them "This item is being edited by another user. You can try again later, or wait for them to finish."
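To make that concrete, here's a minimal sketch of the "check before editing" idea, done over a dedicated JDBC connection separate from the one the ORM uses. The class, the connection handling, and the choice of the record's id as the advisory lock key are mine for illustration; they're not from the system described above.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;

/**
 * Illustrative sketch: warn the user before they start editing a record
 * that another client is already working on. The lock is purely advisory -
 * optimistic locking still guards the actual update.
 */
public class EditLockHelper {
    private final Connection lockConn; // dedicated connection, NOT the ORM's

    public EditLockHelper(String url, String user, String password) throws SQLException {
        this.lockConn = DriverManager.getConnection(url, user, password);
    }

    /** Returns true if we now hold the advisory edit lock for this record. */
    public boolean tryLock(long recordId) throws SQLException {
        try (PreparedStatement ps =
                 lockConn.prepareStatement("SELECT pg_try_advisory_lock(?)")) {
            ps.setLong(1, recordId);
            try (ResultSet rs = ps.executeQuery()) {
                rs.next();
                return rs.getBoolean(1);
            }
        }
    }

    /** Releases the advisory lock when the user closes the editor. */
    public void unlock(long recordId) throws SQLException {
        try (PreparedStatement ps =
                 lockConn.prepareStatement("SELECT pg_advisory_unlock(?)")) {
            ps.setLong(1, recordId);
            try (ResultSet rs = ps.executeQuery()) {
                // true if we actually held the lock; ignored in this sketch
            }
        }
    }
}
```

When the user opens an editor the UI calls tryLock(); if it returns false they get the "being edited by another user" message instead of an edit form, and unlock() runs when they close the editor. Because the lock lives on its own connection, it survives across the ORM's transactions and goes away automatically if that connection is closed.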
The situations I'm facing in this application are similar to the content-management example above, with similar UI implications. Additionally, in this situation almost all updates to the areas where advisory locking will be useful will be done through the tool I'm now working on, which will honour the advisory locks. There will rarely, if ever, be updates to this data from DB clients that don't know or care about the advisory lock, so users will rarely if ever get an error due to a conflicting update if I use advisory locking to help avoid it happening.

The advisory locking can even be done from a separate JDBC connection (as in the sketch above), avoiding any interaction at all with the ORM tools. It's essentially the same as using a middle layer solely for inter-client messaging, but without the need for the middle layer. That's one less thing to break, to my mind.

In most areas advisory locking is neither necessary nor useful. It's only in some particular situations, similar to the examples I've given, that it's likely to be of use to improve the way multi-user interaction works. In all other areas using optimistic locking alone should be fine.

As for mixed access by the existing apps and the new tools, I'm now pretty sure the trigger-based versioning approach will be fine (there's a rough sketch of what I mean further down). I'm going to do a fair bit of testing of course, but it all looks quite reasonable. It'll just ensure that existing apps that aren't aware of the row versioning still increment the version, and thus play well with the ORM tools. The trigger will never change the effect of a statement run by the ORM tools (because they always increment the version themselves), and is invisible to them. I'll drop a short note here when that testing is done, as I imagine it might be of interest to others.

> The ideal situation would be if the DB was only accessed through a single shared access layer by all clients using a multi-tiered decoupled design, though I acknowledge that is not always possible, particularly with many thick client architectures that are tightly coupled, two-tiered designs at best.

I see a lot of this, and I'm still not sure I understand the thinking behind it. The DB is a shared access system. It's designed to handle concurrency, and does a very good job of it. Adding another layer to hide the DB seems to be adding unnecessary complexity unless it can provide something extra, like preventing or warning about concurrent changes to objects outside the scope of DB transactions. However, that's handled just as well by row versioning without adding another layer to the system. Unless the additional layer implements the "business logic", using the DB as a dumb storage engine, I just can't see much of a rationale for its existence.

The proper division of work between app and DB is of course a long-standing philosophical argument that won't be resolved in a hurry. Ditto the utility/necessity of "n-tier" application design. All I can say is that in this specific case the DB already handles the business logic, so the midlayer would be superfluous. Furthermore, the additional layer will either be tied to a particular technology (eg Java RPC) or rely on clumsy schemes like XML-RPC for communication.

I can see the point for really big and complex systems that mix many different data stores. For smaller stuff where the single DB is already the center of the system and doing a good job there, I don't see the point. It strikes me as being "Enterprise-y" for the sake of it.
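Coming back to the trigger-based versioning approach mentioned earlier, here's a rough sketch of the kind of trigger I have in mind, installed through JDBC since that's the context here. The "customer" table and integer "version" column are made-up names for illustration; the real schema obviously differs.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.SQLException;
import java.sql.Statement;

/**
 * Illustrative sketch: a BEFORE UPDATE trigger that bumps the optimistic
 * locking version column only when the client left it untouched. Legacy
 * apps that know nothing about versioning still increment it; ORM-issued
 * updates that already incremented it pass through unchanged.
 */
public class InstallVersionTrigger {
    public static void main(String[] args) throws SQLException {
        try (Connection conn = DriverManager.getConnection(
                 "jdbc:postgresql://localhost/mydb", "user", "password");
             Statement st = conn.createStatement()) {

            st.execute(
                "CREATE OR REPLACE FUNCTION bump_version() RETURNS trigger AS $$\n" +
                "BEGIN\n" +
                "  -- Only act when the UPDATE did not change the version itself\n" +
                "  IF NEW.version IS NOT DISTINCT FROM OLD.version THEN\n" +
                "    NEW.version := OLD.version + 1;\n" +
                "  END IF;\n" +
                "  RETURN NEW;\n" +
                "END;\n" +
                "$$ LANGUAGE plpgsql");

            st.execute(
                "CREATE TRIGGER customer_bump_version\n" +
                "  BEFORE UPDATE ON customer\n" +
                "  FOR EACH ROW EXECUTE PROCEDURE bump_version()");
        }
    }
}
```

The point is that an UPDATE from a legacy app that never touches "version" gets it bumped automatically, while an ORM UPDATE that already incremented it is passed through exactly as issued, so the trigger stays invisible to the ORM.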
> I can understand how you would release the advisory lock by running another SQL statement in another thread to do so, but I do not know if that is of much use to you.

I assume you meant cancelling an attempted acquisition of the advisory lock, rather than releasing the lock? Releasing it so it can be acquired by another connection is just another ordinary SELECT pg_advisory_unlock(blah). Or closing the JDBC connection, of course.

If there's an advisory lock acquisition attempt blocked and the user wants to give up, I can just cancel the statement trying to take out the advisory lock using the JDBC Statement.cancel() method - see the attached code in my other reply. That's something that's easily triggered by user interaction in the UI thread, and involves very little fuss. It might not work well with some other JDBC drivers; I've seen reports of cancel() not doing anything with MySQL's JDBC driver, for example. For this app that's a non-issue, though - it's tied to Pg anyway.

> But from my understanding, that query with the back end database process will still be running and the DB connection will still be in use, including all of the associated locks and DB threads. So a subsequent attempt on a long running query will also block.

Not if you use JDBC to cancel the query. There's no guarantee that a big, complex query will be terminated immediately by the backend, but for something simple like an attempt to acquire an advisory lock, cancelling is immediate and quite clean, at least according to my testing.

> You have to remember that at the ORM level (which is just a wrapper to the JDBC), or at the JDBC level, you do not have fine grain control of what is happening in the back end, and in my mind you should be looking at what JDBC provides, not what the underlying DB may be able to provide, and in using pg advisory locks, you are mixing the two.

It's true that using advisory locking would involve mixing two locking models. That could get very ugly if they had to interact in any sort of complex way - say, if there were stored procedures or triggers that used both advisory locking and the ordinary transaction-level locking done by Pg. To me, that says "avoid mixing advisory locking operations in with other operations that might take normal transaction-level locks" rather than "don't use advisory locking".

In this case the advisory locking can even be done from a separate JDBC connection, avoiding any interaction at all with the ORM tools. Locks can be taken out, tested, and released only before any work is done or after all work is committed. It's essentially the same as using a middle layer solely for inter-client messaging, but without the need for the middle layer.

-- Craig Ringer
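For reference, here is a minimal sketch of the cancel-from-another-thread approach discussed above. It is not the code attached to the other reply; the connection details and lock key are illustrative only.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;
import java.sql.SQLException;

/**
 * Illustrative sketch: one thread blocks in SELECT pg_advisory_lock(...);
 * if the user gives up waiting, another thread calls Statement.cancel()
 * and the blocked executeQuery() returns with an SQLException instead of
 * waiting indefinitely.
 */
public class CancellableLockWait {
    public static void main(String[] args) throws Exception {
        Connection conn = DriverManager.getConnection(
            "jdbc:postgresql://localhost/mydb", "user", "password");

        final PreparedStatement ps =
            conn.prepareStatement("SELECT pg_advisory_lock(?)");
        ps.setLong(1, 42L); // lock key for the record being edited

        // Worker thread: blocks until the lock is granted or the statement
        // is cancelled. In a real app another session would already hold
        // the lock, so this call would actually block.
        Thread worker = new Thread(() -> {
            try {
                ps.executeQuery();
                System.out.println("Lock acquired - open the editor.");
            } catch (SQLException e) {
                // With PostgreSQL's JDBC driver a cancelled statement shows
                // up here as a "canceling statement due to user request" error.
                System.out.println("Gave up waiting: " + e.getMessage());
            }
        });
        worker.start();

        // Simulate the user pressing "Cancel" in the UI after two seconds.
        Thread.sleep(2000);
        ps.cancel();

        worker.join();
        conn.close();
    }
}
```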