Re: POSIX shared memory support - Mailing list pgsql-patches
From | Chris Marcellino |
---|---|
Subject | Re: POSIX shared memory support |
Date | |
Msg-id | B667A14D-EFA2-497B-9C5F-FEC70AFD7D57@apple.com |
In response to | Re: POSIX shared memory support (Tom Lane <tgl@sss.pgh.pa.us>) |
Responses | Re: POSIX shared memory support |
List | pgsql-patches |
On Feb 26, 2007, at 10:43 PM, Tom Lane wrote:

> Chris Marcellino <cmarcellino@apple.com> writes:
>> The System V shared memory facilities provide a method to determine
>> who is attached to a shared memory segment.
>> This is used to prevent backends that were orphaned by crashed or
>> killed database processes from corrupting the database as it is
>> restarted. The same effect can be achieved with using the POSIX APIs,
>
> ... except that it can't ...
>
>> but since the POSIX library does not
>> have a way to check who is attached to a segment, atomic segment
>> creation must be used to ensure exclusive access to the database.
>
> How does that fix the problem? If you can't actually tell whether
> someone is attached to an existing segment, then you're still up against
> the basic rock-and-a-hard-place issue: either you assume there is no one
> there (and corrupt your database if you're wrong) or you assume there is
> someone there (and force manual intervention by the DBA to recover after
> postmaster crashes). Neither of these alternatives is really acceptable.

Ignoring the case where backends are still alive in the database, since
they would require intervention or patience either way, there are two
options:

1) There is a postmaster/backend still running and you try to start
   another postmaster: the unique segment cannot be closed and atomically
   recreated and will fail as it does in the current implementation.
2) There are no errant processes still in the database: the segment can
   be closed and atomically recreated.

Try making a build with the patch, then start a postmaster for a given
folder, delete the lock file and start another postmaster (on a different
port) in that folder.

Please let me know if I am overlooking something.

>
>> In order for this to work, the key name used to open and create the
>> shared memory segment must be unique for each
>> data directory.
>> This is done by using a strong hash of the canonical
>> form of the data directory’s pathname.
>
> "Strong hash" is not a guarantee, even if you could promise that you
> could get a unique canonical path, which I doubt you can. In any case
> this fails if the DBA decides to rename the directory on the fly (don't
> laugh; not only are there instances of that in our archives, there are
> people opining that we need to allow it --- even with the postmaster
> still running).

Strong hash is an effective guarantee that many computing paradigms are
based upon. The collision rate is astronomically small, and can be made
astronomically smaller with longer hashes. (For MD5 there would need to be
10^15 postmasters on a server before a collision is likely, and they all
would need to have crashed and left backends in the database, etc.)

True, renaming is a problem that I had not anticipated at all. Now that
you mention it, hard links might be an issue on some machines that don't
canonicalize them to a unique path, since that isn't required by the POSIX
docs. Oh, the horrible degenerate cases. Good point though.

Perhaps there is some other unique identifying feature of a given
database. A per-database persistent UUID would fit nicely here. It could
just be the shmem key.

>
>> This also removes any risk of other applications, or other databases’
>> memory segments colliding with the current shared memory
>> segment, which conveniently simplifies the logic.
>
> How exactly does it remove that risk?

This is fruitless due to the renaming issue, but the hash isn't an issue.
I'm not sure that a hex string beginning with \pg_xxxxx is any less
readable than the shmem id integers that are generated ad-hoc by the
current implementation.

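For readers following the thread, here is a minimal sketch of the naming and atomic-creation scheme being debated. It is not code from the patch: it is written in Python for brevity, the helper names `segment_name` and `create_exclusive` are hypothetical, and the `pg_` prefix layout and the truncation of the digest (to stay under the short name limits some platforms impose) are assumptions made only for this illustration. Python's `shared_memory.SharedMemory(create=True)` fails with `FileExistsError` if the name already exists, which stands in for the exclusive-create behaviour described above.

```python
import hashlib
import os
from multiprocessing import shared_memory


def segment_name(data_dir: str) -> str:
    """Derive a segment name from the canonical data directory path.

    Assumption: MD5 of the realpath, truncated to 24 hex digits so the
    name stays short; truncation weakens the collision bound.
    """
    canonical = os.path.realpath(data_dir)
    digest = hashlib.md5(canonical.encode("utf-8")).hexdigest()
    return "pg_" + digest[:24]


def create_exclusive(data_dir: str, size: int = 4096):
    """Create the segment atomically; return None if it already exists."""
    try:
        return shared_memory.SharedMemory(
            name=segment_name(data_dir), create=True, size=size
        )
    except FileExistsError:
        # Someone else created a segment under this name and it has not
        # been unlinked yet, so this caller must not proceed.
        return None
```

A second `create_exclusive()` call for the same directory returns `None` until the first segment is unlinked. Note that the sketch inherits the weakness raised in this thread: renaming the directory, or reaching it through a path that canonicalizes differently, yields a different name.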
> I think you're wishfully-thinking
> that if you are creating an unreadable hash value then there will never
> be any collisions against someone else with the same touching faith that
> *his* unreadable hash values will never collide with anyone else's.

I'm flattered that you hold my coding abilities with such devout
conviction, but I assure you that cryptography, even in this limited use,
is based in rational thought :). In addition, the astronomically unlikely
collision isn't a risk as the database can't be damaged. The admin would
then need to clear the lockfile, after he won the lottery twice and was
struck by lightning in his overturned car.

> Doesn't give me a lot of comfort.
> Not that it matters, since the
> approach is broken even if this specific assumption were sustainable.

Postmasters failing to load don't give me much comfort either, and that
isn't a pipe dream. I suppose that the renaming issue relegates this patch
to situations where the database cannot be renamed or hard linked to and
started more than once, yet which require it in order to start up
databases without restarting and without needing to control how many
other databases are consuming shmem on the same box.

Thanks for the reply,
Chris Marcellino

>
> regards, tom lane
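As a rough check on the collision estimate discussed above, the standard birthday approximation can be evaluated directly. This is a sketch under the idealized assumption that hash outputs are uniformly random 128-bit values (which says nothing about deliberately constructed MD5 collisions); the function names are hypothetical. Under that assumption, 10^15 distinct data directories give a collision probability of roughly 1.5e-9, and even odds are only reached at roughly 2e19 directories, so the 10^15 figure is conservative.

```python
import math


def collision_probability(n: int, bits: int) -> float:
    """Birthday approximation: P ~= 1 - exp(-n(n-1) / 2^(bits+1))."""
    return -math.expm1(-n * (n - 1) / 2.0 ** (bits + 1))


def count_for_even_odds(bits: int) -> float:
    """Number of hashed items at which P reaches 1/2:
    n ~= sqrt(2 * ln 2 * 2^bits)."""
    return math.sqrt(2.0 * math.log(2.0) * 2.0 ** bits)


if __name__ == "__main__":
    print(collision_probability(10**15, 128))
    print(count_for_even_odds(128))
```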