Thread: pg_stat_database shows userid as OID
Hello hackers, In the pg_stat_activity view, the usesysid is shown as having type Oid. However pg_shadow says it's an integer. Is there a reason? Looks like a bug. This patch seems to corrects this issue, but I don't know if there's something else involved. Index: src/include/catalog/pg_proc.h =================================================================== RCS file: /projects/cvsroot/pgsql-server/src/include/catalog/pg_proc.h,v retrieving revision 1.276 diff -c -r1.276 pg_proc.h *** src/include/catalog/pg_proc.h 2002/11/08 17:27:03 1.276 --- src/include/catalog/pg_proc.h 2002/11/16 23:18:44 *************** *** 2738,2744 **** DESCR("Statistics: PID of backend"); DATA(insert OID = 1938 ( pg_stat_get_backend_dbid PGNSP PGUID12 f f t f s 1 26 "23" pg_stat_get_backend_dbid - _null_ )); DESCR("Statistics: Database ID of backend"); ! DATA(insert OID = 1939 ( pg_stat_get_backend_userid PGNSP PGUID 12 f f t f s 1 26 "23" pg_stat_get_backend_userid- _null_ )); DESCR("Statistics: User ID of backend"); DATA(insert OID = 1940 ( pg_stat_get_backend_activity PGNSP PGUID 12 f f t f s 1 25 "23" pg_stat_get_backend_activity - _null_ )); DESCR("Statistics:Current query of backend"); --- 2738,2744 ---- DESCR("Statistics: PID of backend"); DATA(insert OID = 1938 ( pg_stat_get_backend_dbid PGNSP PGUID12 f f t f s 1 26 "23" pg_stat_get_backend_dbid - _null_ )); DESCR("Statistics: Database ID of backend"); ! DATA(insert OID = 1939 ( pg_stat_get_backend_userid PGNSP PGUID 12 f f t f s 1 23 "23" pg_stat_get_backend_userid- _null_ )); DESCR("Statistics: User ID of backend"); DATA(insert OID = 1940 ( pg_stat_get_backend_activity PGNSP PGUID 12 f f t f s 1 25 "23" pg_stat_get_backend_activity - _null_ )); DESCR("Statistics:Current query of backend"); Index: src/backend/utils/adt/pgstatfuncs.c =================================================================== RCS file: /projects/cvsroot/pgsql-server/src/backend/utils/adt/pgstatfuncs.c,v retrieving revision 1.8 diff -c -r1.8 pgstatfuncs.c *** src/backend/utils/adt/pgstatfuncs.c 2002/08/20 04:47:52 1.8 --- src/backend/utils/adt/pgstatfuncs.c 2002/11/16 23:18:44 *************** *** 272,278 **** if ((beentry = pgstat_fetch_stat_beentry(beid)) == NULL) PG_RETURN_NULL(); ! PG_RETURN_OID(beentry->userid); } --- 272,278 ---- if ((beentry = pgstat_fetch_stat_beentry(beid)) == NULL) PG_RETURN_NULL(); ! PG_RETURN_INT32(beentry->userid); } -- Alvaro Herrera (<alvherre[a]dcc.uchile.cl>) "El miedo atento y previsor es la madre de la seguridad" (E. Burke)
Alvaro Herrera <alvherre@dcc.uchile.cl> writes: > In the pg_stat_activity view, the usesysid is shown as having type Oid. > However pg_shadow says it's an integer. Is there a reason? There's been disagreement for a long time over whether userids should be OIDs or ints. If you want to introduce consistency then it's going to take a lot more than a one-line patch. (First you'll need to convince the partisans involved which answer is the right one.) > Looks like a bug. Not as long as OID is 4 bytes. I'd recommend not making any piecemeal changes, especially not when there's not yet a consensus which way to converge. regards, tom lane
On Sun, Nov 17, 2002 at 01:16:29PM -0500, Tom Lane wrote: > Alvaro Herrera <alvherre@dcc.uchile.cl> writes: > > In the pg_stat_activity view, the usesysid is shown as having type Oid. > > However pg_shadow says it's an integer. Is there a reason? > > There's been disagreement for a long time over whether userids should be > OIDs or ints. If you want to introduce consistency then it's going to > take a lot more than a one-line patch. (First you'll need to convince > the partisans involved which answer is the right one.) Oh, I see. I wasn't aware of this. I don't really know which answer is "the right one". I don't care a lot about this thing either, but I'll keep it into my list of amusements, and will probably even dig into the archives sometime. -- Alvaro Herrera (<alvherre[a]dcc.uchile.cl>) Criptografía: Poderosa técnica algorítmica de codificación que es empleada en la creación de manuales de computadores.
Tom Lane wrote: > Alvaro Herrera <alvherre@dcc.uchile.cl> writes: > > In the pg_stat_activity view, the usesysid is shown as having type Oid. > > However pg_shadow says it's an integer. Is there a reason? > > There's been disagreement for a long time over whether userids should be > OIDs or ints. If you want to introduce consistency then it's going to > take a lot more than a one-line patch. (First you'll need to convince > the partisans involved which answer is the right one.) > > > Looks like a bug. > > Not as long as OID is 4 bytes. > > I'd recommend not making any piecemeal changes, especially not when > there's not yet a consensus which way to converge. Well, seems we should make it consistent at least. Let's decide and make it done. I think some wanted it to be an int so they could use the same unix uid for pg_shadow, but I think we aren't using that idea much anymore. However, right now, it looks like the super user is '1', and other users start numbering from 100. That at least suggests int rather than oid. I am not particular in what we choose, but I do think there is a good argument to make it consistent. -- Bruce Momjian | http://candle.pha.pa.us pgman@candle.pha.pa.us | (610) 359-1001+ If your life is a hard drive, | 13 Roberts Road + Christ can be your backup. | Newtown Square, Pennsylvania19073
In looking at the CLUSTER ALL patch I have applied, I am now wondering why the ALL keyword is used. When we do VACUUM, we don't use ALL. VACUUM vacuums all tables. Shouldn't' CLUSTER alone do the same thing. And what about REINDEX? That seems to have a different syntax from the other two. Seems there should be some consistency. -- Bruce Momjian | http://candle.pha.pa.us pgman@candle.pha.pa.us | (610) 359-1001+ If your life is a hard drive, | 13 Roberts Road + Christ can be your backup. | Newtown Square, Pennsylvania19073
Bruce Momjian <pgman@candle.pha.pa.us> writes: > Tom Lane wrote: >> I'd recommend not making any piecemeal changes, especially not when >> there's not yet a consensus which way to converge. > Well, seems we should make it consistent at least. I think the original argument stemmed from the idea that we ought to use pg_shadow's OID column as the user identifier (eliminating usesysid per se). This seems like a good idea at first but I think it has a couple of fatal problems: * disappearance of pg_shadow.usesysid column will doubtless break some applications * if we use OIDthen it's much more difficult to support explicit assignment of userid > I think some wanted it to be an int so they could use the > same unix uid for pg_shadow, but I think we aren't using that idea much > anymore. I don't think anyone worries about making usesysid match /etc/passwd anymore, but nonetheless CREATE USER WITH SYSID is still an essential capability. What if you drop a user accidentally while he still owns objects? You *must* be able to recreate him with the same sysid as before. pg_depend cannot save us from this kind of mistake, either, since users span databases. So it seems to me that we must keep pg_shadow.usesysid as a separate column and not try to make it the OID of pg_shadow. Given that decision, the argument for making it be type OID seems very weak, so I'd lean to the "use int4" camp myself. But I'm not sure everyone agrees. I think Peter was strongly in favor of OID when he was revising the session-authorization code (that's why it all uses OID for user IDs...) As far as the actual C code goes, I'd lean to creating new typedefs UserId and GroupId (or some such names) and making all the routine and variable declarations use those, and not either OID or int4. But I'm not excited about promoting these typedefs into actual SQL types, as was done for TransactionId and CommandId; the payback seems much less than the effort needed. regards, tom lane
Bruce Momjian <pgman@candle.pha.pa.us> writes: > In looking at the CLUSTER ALL patch I have applied, I am now wondering > why the ALL keyword is used. When we do VACUUM, we don't use ALL. > VACUUM vacuums all tables. Shouldn't' CLUSTER alone do the same thing. I agree, lose the ALL. > And what about REINDEX? That seems to have a different syntax from the > other two. Seems there should be some consistency. We don't have a REINDEX ALL, and I'm not in a hurry to invent one. (Especially, I'd not want to see Alvaro spending time on that instead of fixing the underlying btree-compaction problem ;-)) regards, tom lane
On Sun, Nov 17, 2002 at 04:42:01PM -0500, Tom Lane wrote: > Bruce Momjian <pgman@candle.pha.pa.us> writes: > > In looking at the CLUSTER ALL patch I have applied, I am now wondering > > why the ALL keyword is used. When we do VACUUM, we don't use ALL. > > VACUUM vacuums all tables. Shouldn't' CLUSTER alone do the same thing. > > I agree, lose the ALL. Well, in my original patch (the one submitted just when 7.3 was going into beta) there was no ALL. I decided to put it in for subsequent patches for no good reason. > > And what about REINDEX? That seems to have a different syntax from the > > other two. Seems there should be some consistency. > > We don't have a REINDEX ALL, and I'm not in a hurry to invent one. > (Especially, I'd not want to see Alvaro spending time on that instead > of fixing the underlying btree-compaction problem ;-)) Actually, I'm planning to do the freelist thing, then the btree compaction and then replace the current REINDEX code with the compaction code, probably including some means to do REINDEX ALL. It makes me really proud to hear such a note of confidence in my work. Thank you very much. -- Alvaro Herrera (<alvherre[a]dcc.uchile.cl>) Officer Krupke, what are we to do? Gee, officer Krupke, Krup you! (West Side Story, "Gee, Officer Krupke")
Alvaro Herrera <alvherre@dcc.uchile.cl> writes: > Actually, I'm planning to do the freelist thing, then the btree > compaction and then replace the current REINDEX code with the compaction ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ > code, probably including some means to do REINDEX ALL. Uh ... no. The primary purpose of REINDEX is to recover from corrupted indexes, so it has to be based on a rebuild strategy not a compaction strategy. If you want to add a REINDEX ALL for completeness, go ahead, but I think the need for it will be vanishingly small once vacuum compacts btrees properly. regards, tom lane
I totally agree with what you have said. Peter, can you clarify your reasoning for OID for user/group id? --------------------------------------------------------------------------- Tom Lane wrote: > Bruce Momjian <pgman@candle.pha.pa.us> writes: > > Tom Lane wrote: > >> I'd recommend not making any piecemeal changes, especially not when > >> there's not yet a consensus which way to converge. > > > Well, seems we should make it consistent at least. > > I think the original argument stemmed from the idea that we ought to use > pg_shadow's OID column as the user identifier (eliminating usesysid per > se). This seems like a good idea at first but I think it has a couple > of fatal problems: > * disappearance of pg_shadow.usesysid column will doubtless break some > applications > * if we use OID then it's much more difficult to support explicit > assignment of userid > > > I think some wanted it to be an int so they could use the > > same unix uid for pg_shadow, but I think we aren't using that idea much > > anymore. > > I don't think anyone worries about making usesysid match /etc/passwd > anymore, but nonetheless CREATE USER WITH SYSID is still an essential > capability. What if you drop a user accidentally while he still owns > objects? You *must* be able to recreate him with the same sysid as > before. pg_depend cannot save us from this kind of mistake, either, > since users span databases. > > So it seems to me that we must keep pg_shadow.usesysid as a separate > column and not try to make it the OID of pg_shadow. > > Given that decision, the argument for making it be type OID seems very > weak, so I'd lean to the "use int4" camp myself. But I'm not sure > everyone agrees. I think Peter was strongly in favor of OID when he > was revising the session-authorization code (that's why it all uses OID > for user IDs...) > > As far as the actual C code goes, I'd lean to creating new typedefs > UserId and GroupId (or some such names) and making all the routine > and variable declarations use those, and not either OID or int4. > But I'm not excited about promoting these typedefs into actual SQL > types, as was done for TransactionId and CommandId; the payback seems > much less than the effort needed. > > regards, tom lane > > ---------------------------(end of broadcast)--------------------------- > TIP 6: Have you searched our list archives? > > http://archives.postgresql.org > -- Bruce Momjian | http://candle.pha.pa.us pgman@candle.pha.pa.us | (610) 359-1001+ If your life is a hard drive, | 13 Roberts Road + Christ can be your backup. | Newtown Square, Pennsylvania19073
Tom Lane wrote: > Bruce Momjian <pgman@candle.pha.pa.us> writes: > > In looking at the CLUSTER ALL patch I have applied, I am now wondering > > why the ALL keyword is used. When we do VACUUM, we don't use ALL. > > VACUUM vacuums all tables. Shouldn't' CLUSTER alone do the same thing. > > I agree, lose the ALL. Good. I can take care of that or someone can submit a patch. > > And what about REINDEX? That seems to have a different syntax from the > > other two. Seems there should be some consistency. > > We don't have a REINDEX ALL, and I'm not in a hurry to invent one. > (Especially, I'd not want to see Alvaro spending time on that instead > of fixing the underlying btree-compaction problem ;-)) My point for REINDEX was a little different. The man pages shows: REINDEX { DATABASE | TABLE | INDEX } <replaceable class="PARAMETER">name</replaceable> [ FORCE ] where we don't have ALL but we do have DATABASE. Do we need that tri-valued secodn field for reindex because you can reindex a table _or_ and index, and hence DATABASE makes sense? I am just asking. -- Bruce Momjian | http://candle.pha.pa.us pgman@candle.pha.pa.us | (610) 359-1001+ If your life is a hard drive, | 13 Roberts Road + Christ can be your backup. | Newtown Square, Pennsylvania19073
On Sun, Nov 17, 2002 at 06:43:38PM -0500, Bruce Momjian wrote: > Tom Lane wrote: > > Bruce Momjian <pgman@candle.pha.pa.us> writes: > > > And what about REINDEX? That seems to have a different syntax from the > > > other two. Seems there should be some consistency. > > > > We don't have a REINDEX ALL, and I'm not in a hurry to invent one. > > (Especially, I'd not want to see Alvaro spending time on that instead > > of fixing the underlying btree-compaction problem ;-)) > > My point for REINDEX was a little different. The man pages shows: > > REINDEX { DATABASE | TABLE | INDEX } <replaceable > class="PARAMETER">name</replaceable> [ FORCE ] > > where we don't have ALL but we do have DATABASE. Do we need that > tri-valued secodn field for reindex because you can reindex a table _or_ > and index, and hence DATABASE makes sense? I am just asking. REINDEX DATABASE is for system indexes only, it's not the same that one would think of REINDEX alone (which is all indexes on all tables, isn't it?). What I don't understand is what are the parameters in the ReindexDatabase function for. For example, the boolean all is always false in tcop/utility.c (and there are no other places that the function is called). Also, the database name is checked to be equal to a "constant" value, the database name that the standalone backend is connected to. Why are those useful? -- Alvaro Herrera (<alvherre[a]dcc.uchile.cl>) "No renuncies a nada. No te aferres a nada"
Alvaro Herrera <alvherre@dcc.uchile.cl> writes: > What I don't understand is what are the parameters in the > ReindexDatabase function for. For example, the boolean all is always > false in tcop/utility.c (and there are no other places that the function > is called). Also, the database name is checked to be equal to a > "constant" value, the database name that the standalone backend is > connected to. Why are those useful? Well, passing all=true would implement REINDEX ALL ... As for the database name, we could perhaps change the syntax to just REINDEX DATABASE; not sure if it's worth the trouble. regards, tom lane
> In looking at the CLUSTER ALL patch I have applied, I am now wondering > why the ALL keyword is used. When we do VACUUM, we don't use ALL. > VACUUM vacuums all tables. Shouldn't' CLUSTER alone do the same thing. > And what about REINDEX? That seems to have a different syntax from the > other two. Seems there should be some consistency. Yeah - I agree! Chris
Alvaro Herrera wrote: > > On Sun, Nov 17, 2002 at 06:43:38PM -0500, Bruce Momjian wrote: > > Tom Lane wrote: > > > Bruce Momjian <pgman@candle.pha.pa.us> writes: > > > > > And what about REINDEX? That seems to have a different > > > > syntax from the other two. Seems there should be some > > > > consistency. > > > > > > We don't have a REINDEX ALL, and I'm not in a hurry to invent one. > > > (Especially, I'd not want to see Alvaro spending time on that > > > instead of fixing the underlying btree-compaction problem ;-)) > > > > My point for REINDEX was a little different. The man pages shows: > > > > REINDEX { DATABASE | TABLE | INDEX } <replaceable > > class="PARAMETER">name</replaceable> [ FORCE ] > > > > where we don't have ALL but we do have DATABASE. Do we need that > > tri-valued secodn field for reindex because you can reindex a > > table _or_ and index, and hence DATABASE makes sense? I am just > > asking. > > REINDEX DATABASE is for system indexes only, it's not the same that one > would think of REINDEX alone (which is all indexes on all tables, isn't > it?). Probably You don't understand the initial purpose of REINDEX. It isn't an SQL standard at all and was intended to recover corrupted system indexes. It's essentially an unsafe operation and so the operation was inhibited other than under standalone postgres. I also made the command a little hard to use to avoid unexpected invocations e.g. REINDEX DATABASE requires an unnecessary database name parameter or FORCE is still needed though it's a requisite parameter now. REINDEX is also used to compact indexes now. It's good but the purpose is different from the initial one and we would have to reorganize the functionalities e.g. the table data isn't needed to compact the indexes etc. > What I don't understand is what are the parameters in the > ReindexDatabase function for. For example, the boolean all > is always false in tcop/utility.c (and there are no other > places that the function is called). I intended to implement the *true* case also then but haven't done it yet, sorry. regards, Hiroshi Inouehttp://w2422.nsk.ne.jp/~inoue/
Alvaro Herrera wrote: > On Sun, Nov 17, 2002 at 04:42:01PM -0500, Tom Lane wrote: > > Bruce Momjian <pgman@candle.pha.pa.us> writes: > > > In looking at the CLUSTER ALL patch I have applied, I am now wondering > > > why the ALL keyword is used. When we do VACUUM, we don't use ALL. > > > VACUUM vacuums all tables. Shouldn't' CLUSTER alone do the same thing. > > > > I agree, lose the ALL. > > Well, in my original patch (the one submitted just when 7.3 was going > into beta) there was no ALL. I decided to put it in for subsequent > patches for no good reason. I have updated CVS to make the syntax CLUSTER rather than CLUSTER ALL. -- Bruce Momjian | http://candle.pha.pa.us pgman@candle.pha.pa.us | (610) 359-1001+ If your life is a hard drive, | 13 Roberts Road + Christ can be your backup. | Newtown Square, Pennsylvania19073
Does anyone want userid to be an OID? Peter? Anyone? If not, I will add it to the TODO list or work on the patch myself. --------------------------------------------------------------------------- Tom Lane wrote: > Bruce Momjian <pgman@candle.pha.pa.us> writes: > > Tom Lane wrote: > >> I'd recommend not making any piecemeal changes, especially not when > >> there's not yet a consensus which way to converge. > > > Well, seems we should make it consistent at least. > > I think the original argument stemmed from the idea that we ought to use > pg_shadow's OID column as the user identifier (eliminating usesysid per > se). This seems like a good idea at first but I think it has a couple > of fatal problems: > * disappearance of pg_shadow.usesysid column will doubtless break some > applications > * if we use OID then it's much more difficult to support explicit > assignment of userid > > > I think some wanted it to be an int so they could use the > > same unix uid for pg_shadow, but I think we aren't using that idea much > > anymore. > > I don't think anyone worries about making usesysid match /etc/passwd > anymore, but nonetheless CREATE USER WITH SYSID is still an essential > capability. What if you drop a user accidentally while he still owns > objects? You *must* be able to recreate him with the same sysid as > before. pg_depend cannot save us from this kind of mistake, either, > since users span databases. > > So it seems to me that we must keep pg_shadow.usesysid as a separate > column and not try to make it the OID of pg_shadow. > > Given that decision, the argument for making it be type OID seems very > weak, so I'd lean to the "use int4" camp myself. But I'm not sure > everyone agrees. I think Peter was strongly in favor of OID when he > was revising the session-authorization code (that's why it all uses OID > for user IDs...) > > As far as the actual C code goes, I'd lean to creating new typedefs > UserId and GroupId (or some such names) and making all the routine > and variable declarations use those, and not either OID or int4. > But I'm not excited about promoting these typedefs into actual SQL > types, as was done for TransactionId and CommandId; the payback seems > much less than the effort needed. > > regards, tom lane > > ---------------------------(end of broadcast)--------------------------- > TIP 6: Have you searched our list archives? > > http://archives.postgresql.org > -- Bruce Momjian | http://candle.pha.pa.us pgman@candle.pha.pa.us | (610) 359-1001+ If your life is a hard drive, | 13 Roberts Road + Christ can be your backup. | Newtown Square, Pennsylvania19073