Thread: [HACKERS] Shaky coding for vacuuming partitioned relations
Somebody inserted this into vacuum.c's get_rel_oids():

    tuple = SearchSysCache1(RELOID, ObjectIdGetDatum(relid));
    if (!HeapTupleIsValid(tuple))
        elog(ERROR, "cache lookup failed for relation %u", relid);

apparently without having read the very verbose comment two lines above,
which points out that we're not taking any lock on the target relation.
So, if that relation is concurrently being dropped, you're likely to
get "cache lookup failed for relation NNNN" rather than anything more
user-friendly.

A minimum-change fix would be to replace the elog() with an ereport
that produces the same "relation does not exist" error you'd have
gotten from RangeVarGetRelid, had the concurrent DROP TABLE committed
a few microseconds earlier.  But that feels like it's band-aiding
around the problem.

What I'm wondering about is changing the RangeVarGetRelid call to take
ShareUpdateExclusiveLock rather than no lock.  That would protect the
syscache lookup, and it would also make the find_all_inheritors call
a lot more meaningful.

If we're doing a VACUUM, the ShareUpdateExclusiveLock would be dropped
as soon as we close the caller's transaction, and then we'd acquire
the same or stronger lock inside vacuum_rel().  So that seems fine.
If we're doing an ANALYZE, then the lock would continue to be held
and analyze_rel would merely be acquiring it an extra time, so we'd
actually be removing a race-condition failure scenario for ANALYZE.
This would mean a few more cycles in lock management, but since this
only applies to a manual VACUUM or ANALYZE that specifies a table
name, I'm not too concerned about that.

Thoughts?

			regards, tom lane
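For concreteness, the minimum-change fix described above would look
roughly like this (a sketch only, assuming the caller's RangeVar is in
scope as "vacrel"; the exact wording could differ):

    tuple = SearchSysCache1(RELOID, ObjectIdGetDatum(relid));
    if (!HeapTupleIsValid(tuple))
        ereport(ERROR,
                (errcode(ERRCODE_UNDEFINED_TABLE),
                 errmsg("relation \"%s\" does not exist",
                        vacrel->relname)));

That mirrors the error RangeVarGetRelid itself would have raised, but,
as noted above, it only papers over the missing lock.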
On Sat, Sep 23, 2017 at 4:13 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Somebody inserted this into vacuum.c's get_rel_oids():
>
>     tuple = SearchSysCache1(RELOID, ObjectIdGetDatum(relid));
>     if (!HeapTupleIsValid(tuple))
>         elog(ERROR, "cache lookup failed for relation %u", relid);
>
> apparently without having read the very verbose comment two lines above,
> which points out that we're not taking any lock on the target relation.
> So, if that relation is concurrently being dropped, you're likely to
> get "cache lookup failed for relation NNNN" rather than anything more
> user-friendly.

This was overlooked during the review of 3c3bb99, by multiple people
including me.  elog() messages should never be something users can
face, and the same goes for cache lookup failures.

> A minimum-change fix would be to replace the elog() with an ereport
> that produces the same "relation does not exist" error you'd have
> gotten from RangeVarGetRelid, had the concurrent DROP TABLE committed
> a few microseconds earlier.  But that feels like it's band-aiding
> around the problem.

Yeah, that's not right.  This is still a cache lookup error in the end.

> What I'm wondering about is changing the RangeVarGetRelid call to take
> ShareUpdateExclusiveLock rather than no lock.  That would protect the
> syscache lookup, and it would also make the find_all_inheritors call
> a lot more meaningful.
>
> If we're doing a VACUUM, the ShareUpdateExclusiveLock would be dropped
> as soon as we close the caller's transaction, and then we'd acquire
> the same or stronger lock inside vacuum_rel().  So that seems fine.
> If we're doing an ANALYZE, then the lock would continue to be held
> and analyze_rel would merely be acquiring it an extra time, so we'd
> actually be removing a race-condition failure scenario for ANALYZE.
> This would mean a few more cycles in lock management, but since this
> only applies to a manual VACUUM or ANALYZE that specifies a table
> name, I'm not too concerned about that.

I am +1 on that.  Testing such a thing, I am not seeing anything wrong
either.  The call to find_all_inheritors should also use
ShareUpdateExclusiveLock, and the locking on top of RangeVarGetRelid()
needs to be reworked.

Attached is a proposed patch.

> Thoughts?

As long as I don't forget... Another thing currently broken on HEAD and
REL_10_STABLE is that the OIDs of partitioned tables are used, but the
RangeVar of the parent is used for error reports.  This leads to
incorrect reports if a partition goes away in the middle of autovacuum,
as only information about the parent is reported to the user.  I am not
saying that this needs to be fixed in REL_10_STABLE at this stage
though, as this would require some refactoring similar to what the
patch adding support for VACUUM with multiple relations does.  But I
digress here.
--
Michael
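The locking rework being proposed amounts to something like the
following in get_rel_oids() (a sketch under assumed variable names, not
the attached patch verbatim):

    /* Lock the named table, so a concurrent DROP cannot invalidate relid. */
    relid = RangeVarGetRelid(vacrel, ShareUpdateExclusiveLock, false);

    /*
     * With the parent locked, its inheritance tree cannot change under
     * us, so take the same lock on each child while collecting them.
     */
    oid_list = list_concat(oid_list,
                           find_all_inheritors(relid,
                                               ShareUpdateExclusiveLock,
                                               NULL));

RangeVarGetRelid() and find_all_inheritors() both accept a lock mode,
so the change is mostly a matter of passing ShareUpdateExclusiveLock
where no lock was requested before.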
On 2017/09/25 12:10, Michael Paquier wrote:
> On Sat, Sep 23, 2017 at 4:13 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> Somebody inserted this into vacuum.c's get_rel_oids():
>>
>>     tuple = SearchSysCache1(RELOID, ObjectIdGetDatum(relid));
>>     if (!HeapTupleIsValid(tuple))
>>         elog(ERROR, "cache lookup failed for relation %u", relid);
>>
>> apparently without having read the very verbose comment two lines above,
>> which points out that we're not taking any lock on the target relation.
>> So, if that relation is concurrently being dropped, you're likely to
>> get "cache lookup failed for relation NNNN" rather than anything more
>> user-friendly.
>
> This was overlooked during the review of 3c3bb99, by multiple people
> including me.  elog() messages should never be something users can
> face, and the same goes for cache lookup failures.

Agreed, that was an oversight.

>> A minimum-change fix would be to replace the elog() with an ereport
>> that produces the same "relation does not exist" error you'd have
>> gotten from RangeVarGetRelid, had the concurrent DROP TABLE committed
>> a few microseconds earlier.  But that feels like it's band-aiding
>> around the problem.
>
> Yeah, that's not right.  This is still a cache lookup error in the end.

Also agreed that fixing the locking here would be a better solution.

>> What I'm wondering about is changing the RangeVarGetRelid call to take
>> ShareUpdateExclusiveLock rather than no lock.  That would protect the
>> syscache lookup, and it would also make the find_all_inheritors call
>> a lot more meaningful.
>>
>> If we're doing a VACUUM, the ShareUpdateExclusiveLock would be dropped
>> as soon as we close the caller's transaction, and then we'd acquire
>> the same or stronger lock inside vacuum_rel().  So that seems fine.
>> If we're doing an ANALYZE, then the lock would continue to be held
>> and analyze_rel would merely be acquiring it an extra time, so we'd
>> actually be removing a race-condition failure scenario for ANALYZE.
>> This would mean a few more cycles in lock management, but since this
>> only applies to a manual VACUUM or ANALYZE that specifies a table
>> name, I'm not too concerned about that.
>
> I am +1 on that.  Testing such a thing, I am not seeing anything wrong
> either.  The call to find_all_inheritors should also use
> ShareUpdateExclusiveLock, and the locking on top of RangeVarGetRelid()
> needs to be reworked.
>
> Attached is a proposed patch.

Hmm, I'm not sure if we need to lock the partitions, too.  Locks taken
by find_all_inheritors() will be gone once the already-running
transaction is ended by the caller (vacuum()).  get_rel_oids() should
just lock the table mentioned in the command (the parent table), so
that find_all_inheritors() returns the correct partition list, as Tom
perhaps alluded to when he said "it would also make the
find_all_inheritors call a lot more meaningful", but...

When vacuum() iterates over that list and calls vacuum_rel() for each
relation in it, it might be the case that a relation in the list is no
longer a partition of the parent.  So, one might say that it would get
vacuumed unnecessarily.  Perhaps that's fine?

>> Thoughts?
>
> As long as I don't forget... Another thing currently broken on HEAD and
> REL_10_STABLE is that the OIDs of partitioned tables are used, but the
> RangeVar of the parent is used for error reports.  This leads to
> incorrect reports if a partition goes away in the middle of autovacuum,
> as only information about the parent is reported to the user.

Oh, you're right.  The original RangeVar (corresponding to the table
mentioned in the command) refers to the parent table.

> I am not saying that this needs to be fixed in REL_10_STABLE at this
> stage though, as this would require some refactoring similar to what
> the patch adding support for VACUUM with multiple relations does.

I think it would be better to fix that independently somehow.  VACUUM
on partitioned tables is handled by calling vacuum_rel() on individual
partitions.  It would be nice if vacuum_rel() didn't have to refer to
the RangeVar.  Could we not use get_rel_name(relid) in place of
relation->relname?  I haven't seen the other patch yet though, so maybe
I'm missing something.

Thanks,
Amit
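For illustration, the get_rel_name() idea would look roughly like this
at a reporting site (variable names and message wording are
assumptions; note the NULL result when the relation is already gone,
which is the sticking point raised in the reply below):

    char       *relname = get_rel_name(relid);

    if (relname == NULL)
        return;                 /* concurrently dropped; nothing to report */

    ereport(LOG,
            (errmsg("skipping vacuum of \"%s\" --- lock not available",
                    relname)));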
On Mon, Sep 25, 2017 at 4:42 PM, Amit Langote
<Langote_Amit_f8@lab.ntt.co.jp> wrote:
> On 2017/09/25 12:10, Michael Paquier wrote:
> Hmm, I'm not sure if we need to lock the partitions, too.  Locks taken
> by find_all_inheritors() will be gone once the already-running
> transaction is ended by the caller (vacuum()).  get_rel_oids() should
> just lock the table mentioned in the command (the parent table), so
> that find_all_inheritors() returns the correct partition list, as Tom
> perhaps alluded to when he said "it would also make the
> find_all_inheritors call a lot more meaningful", but...

It is actually important to hold locks on the child tables until the
end of the transaction, so as to fill in correctly the RangeVar
associated with each partition when scanning them.

> When vacuum() iterates over that list and calls vacuum_rel() for each
> relation in it, it might be the case that a relation in the list is no
> longer a partition of the parent.  So, one might say that it would get
> vacuumed unnecessarily.  Perhaps that's fine?

I am personally fine with that.  Any checks would complicate the code
for not much additional value.

>> I am not saying that this needs to be fixed in REL_10_STABLE at this
>> stage though, as this would require some refactoring similar to what
>> the patch adding support for VACUUM with multiple relations does.
>
> I think it would be better to fix that independently somehow.  VACUUM
> on partitioned tables is handled by calling vacuum_rel() on individual
> partitions.  It would be nice if vacuum_rel() didn't have to refer to
> the RangeVar.  Could we not use get_rel_name(relid) in place of
> relation->relname?  I haven't seen the other patch yet though, so maybe
> I'm missing something.

The RangeVar is used for error reporting, and once you reach
vacuum_rel(), the relation and its OID may be gone.  So you need to
save the information of the RangeVar beforehand, when scanning the
relations.  That's one reason behind having the VacuumRelation stuff
discussed on the thread for multiple-table VACUUM; this way you store,
for each item to be vacuumed:
- a RangeVar,
- a list of columns,
- an OID saved by vacuum, deduced from the RangeVar.
That's quite some refactoring, so my opinion is that this ship has
sailed already for REL_10_STABLE, and that we had better live with that
in 10.
--
Michael
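The VacuumRelation node being discussed on the multiple-table VACUUM
thread bundles those three pieces roughly like this (a sketch based on
the description above; field names are illustrative, not a committed
header):

    typedef struct VacuumRelation
    {
        NodeTag     type;
        RangeVar   *relation;   /* table name as given in the command */
        Oid         oid;        /* table's OID, resolved from the RangeVar */
        List       *va_cols;    /* list of column names, or NIL for all */
    } VacuumRelation;

Carrying the RangeVar alongside the resolved OID is what lets a later
error report name the right table even after its catalog entry is gone.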
On Mon, Sep 25, 2017 at 12:10 PM, Michael Paquier
<michael.paquier@gmail.com> wrote:
> On Sat, Sep 23, 2017 at 4:13 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> Somebody inserted this into vacuum.c's get_rel_oids():
>>
>>     tuple = SearchSysCache1(RELOID, ObjectIdGetDatum(relid));
>>     if (!HeapTupleIsValid(tuple))
>>         elog(ERROR, "cache lookup failed for relation %u", relid);
>>
>> apparently without having read the very verbose comment two lines above,
>> which points out that we're not taking any lock on the target relation.
>> So, if that relation is concurrently being dropped, you're likely to
>> get "cache lookup failed for relation NNNN" rather than anything more
>> user-friendly.
>
> This was overlooked during the review of 3c3bb99, by multiple people
> including me.  elog() messages should never be something users can
> face, and the same goes for cache lookup failures.

FWIW, the same thing can happen when specifying an invalid replication
origin name to pg_replication_origin_advance() and
pg_replication_origin_progress().  These should probably be fixed as
well.

>> A minimum-change fix would be to replace the elog() with an ereport
>> that produces the same "relation does not exist" error you'd have
>> gotten from RangeVarGetRelid, had the concurrent DROP TABLE committed
>> a few microseconds earlier.  But that feels like it's band-aiding
>> around the problem.
>
> Yeah, that's not right.  This is still a cache lookup error in the end.
>
>> What I'm wondering about is changing the RangeVarGetRelid call to take
>> ShareUpdateExclusiveLock rather than no lock.  That would protect the
>> syscache lookup, and it would also make the find_all_inheritors call
>> a lot more meaningful.
>>
>> If we're doing a VACUUM, the ShareUpdateExclusiveLock would be dropped
>> as soon as we close the caller's transaction, and then we'd acquire
>> the same or stronger lock inside vacuum_rel().  So that seems fine.
>> If we're doing an ANALYZE, then the lock would continue to be held
>> and analyze_rel would merely be acquiring it an extra time, so we'd
>> actually be removing a race-condition failure scenario for ANALYZE.
>> This would mean a few more cycles in lock management, but since this
>> only applies to a manual VACUUM or ANALYZE that specifies a table
>> name, I'm not too concerned about that.
>
> I am +1 on that.  Testing such a thing, I am not seeing anything wrong
> either.  The call to find_all_inheritors should also use
> ShareUpdateExclusiveLock, and the locking on top of RangeVarGetRelid()
> needs to be reworked.
>
> Attached is a proposed patch.

FWIW, I agree with the approach of this patch, and I found no problems
with it.

Regards,

--
Masahiko Sawada
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center
On Mon, Sep 25, 2017 at 7:57 PM, Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> FWIW, the same thing can happen when specifying an invalid replication
> origin name to pg_replication_origin_advance() and
> pg_replication_origin_progress().  These should probably be fixed as
> well.

I spawned a thread about that stuff three weeks ago:
https://www.postgresql.org/message-id/CAB7nPqQtPg%2BLKKtzdKN26judHcvPZ0s1gNigzOT4j8CYuuuBYg%40mail.gmail.com
With a patch registered where it should be:
https://commitfest.postgresql.org/15/1290/
So I would suggest keeping the discussion about this problem in its own
thread.

(Bonus: run ALTER TABLE queries in parallel with multiple sessions and
enjoy the result.)
--
Michael
On Mon, Sep 25, 2017 at 8:22 PM, Michael Paquier
<michael.paquier@gmail.com> wrote:
> On Mon, Sep 25, 2017 at 7:57 PM, Masahiko Sawada <sawada.mshk@gmail.com> wrote:
>> FWIW, the same thing can happen when specifying an invalid replication
>> origin name to pg_replication_origin_advance() and
>> pg_replication_origin_progress().  These should probably be fixed as
>> well.
>
> I spawned a thread about that stuff three weeks ago:
> https://www.postgresql.org/message-id/CAB7nPqQtPg%2BLKKtzdKN26judHcvPZ0s1gNigzOT4j8CYuuuBYg%40mail.gmail.com
> With a patch registered where it should be:
> https://commitfest.postgresql.org/15/1290/
> So I would suggest keeping the discussion about this problem in its own
> thread.

Thank you for the information!  I'll look at the thread and review the
patch.

> (Bonus: run ALTER TABLE queries in parallel with multiple sessions and
> enjoy the result.)

Will try it :-)

Regards,

--
Masahiko Sawada
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center
Amit Langote <Langote_Amit_f8@lab.ntt.co.jp> writes:
> On 2017/09/25 12:10, Michael Paquier wrote:
>> As long as I don't forget... Another thing currently broken on HEAD and
>> REL_10_STABLE is that the OIDs of partitioned tables are used, but the
>> RangeVar of the parent is used for error reports.  This leads to
>> incorrect reports if a partition goes away in the middle of autovacuum,
>> as only information about the parent is reported to the user.

> Oh, you're right.  The original RangeVar (corresponding to the table
> mentioned in the command) refers to the parent table.

Yeah, I'd noticed that while reviewing the vacuum-multiple-tables patch.
My thought about fixing it was to pass a null RangeVar when handling a
table we'd identified through inheritance or pg_class-scanning, to
indicate that this wasn't a table named in the original command.  This
only works conveniently if you decide that it's appropriate to silently
ignore relation_open failure on such table OIDs, but I think it is.

Not sure about whether we ought to try to fix that in v10.  It's a
mostly-cosmetic problem in what ought to be an infrequent corner case,
so it might not be worth taking risks for post-RC1.  OTOH, I would
not be surprised to get bug reports about it down the road.

			regards, tom lane
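A minimal sketch of that convention in vacuum_rel(), assuming the v10
signature (illustrative only; the log message wording here is made up):

    static bool
    vacuum_rel(Oid relid, RangeVar *relation, int options,
               VacuumParams *params)
    {
        Relation    onerel;

        /* try_relation_open() returns NULL instead of raising an error */
        onerel = try_relation_open(relid, ShareUpdateExclusiveLock);
        if (onerel == NULL)
        {
            /*
             * A null RangeVar means the table was found via inheritance
             * expansion or a pg_class scan, not named in the command, so
             * a concurrent drop can be ignored silently.
             */
            if (relation != NULL)
                ereport(LOG,
                        (errmsg("skipping vacuum of \"%s\" --- relation no longer exists",
                                relation->relname)));
            return false;
        }

        /* ... normal vacuum processing of onerel would continue here ... */
        relation_close(onerel, ShareUpdateExclusiveLock);
        return true;
    }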
On 9/24/17, 10:12 PM, "Michael Paquier" <michael.paquier@gmail.com> wrote:
> Attached is a proposed patch.

The patch seems reasonable to me, and I haven't encountered any issues
in my tests, even after applying the vacuum-multiple-relations patch on
top of it.

+ * Take a lock here for the relation lookup. If ANALYZE or VACUUM spawn
+ * multiple transactions, the lock taken here will be gone once the
+ * current transaction running commits, which could cause the relation
+ * to be gone, or the RangeVar might not refer to the OID looked up here.

I think this could be slightly misleading.  Perhaps it would be more
accurate to say that the lock will be gone any time vacuum() creates a
new transaction (either in vacuum_rel() or when use_own_xacts is true).

Nathan
On Mon, Sep 25, 2017 at 11:32 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Yeah, I'd noticed that while reviewing the vacuum-multiple-tables patch.
> My thought about fixing it was to pass a null RangeVar when handling a
> table we'd identified through inheritance or pg_class-scanning, to
> indicate that this wasn't a table named in the original command.  This
> only works conveniently if you decide that it's appropriate to silently
> ignore relation_open failure on such table OIDs, but I think it is.
>
> Not sure about whether we ought to try to fix that in v10.  It's a
> mostly-cosmetic problem in what ought to be an infrequent corner case,
> so it might not be worth taking risks for post-RC1.  OTOH, I would
> not be surprised to get bug reports about it down the road.

Something like that looks like a good compromise for v10.  I would
rather see a more complete fix on HEAD though, with each relation name
reported correctly.  The information provided would be useful for users
when autovacuum skips a relation because no lock could be taken on it.
--
Michael
On Tue, Sep 26, 2017 at 1:47 AM, Bossart, Nathan <bossartn@amazon.com> wrote:
> On 9/24/17, 10:12 PM, "Michael Paquier" <michael.paquier@gmail.com> wrote:
>> Attached is a proposed patch.
>
> The patch seems reasonable to me, and I haven't encountered any issues
> in my tests, even after applying the vacuum-multiple-relations patch on
> top of it.

Thanks for the review, Nathan!

> + * Take a lock here for the relation lookup. If ANALYZE or VACUUM spawn
> + * multiple transactions, the lock taken here will be gone once the
> + * current transaction running commits, which could cause the relation
> + * to be gone, or the RangeVar might not refer to the OID looked up here.
>
> I think this could be slightly misleading.  Perhaps it would be more
> accurate to say that the lock will be gone any time vacuum() creates a
> new transaction (either in vacuum_rel() or when use_own_xacts is true).

The comment in the proposed patch matches as closely as possible what
is currently on HEAD, so I would still go with something close to that.
--
Michael
On 9/25/17, 6:51 PM, "Michael Paquier" <michael.paquier@gmail.com> wrote:
>> + * Take a lock here for the relation lookup. If ANALYZE or VACUUM spawn
>> + * multiple transactions, the lock taken here will be gone once the
>> + * current transaction running commits, which could cause the relation
>> + * to be gone, or the RangeVar might not refer to the OID looked up here.
>>
>> I think this could be slightly misleading.  Perhaps it would be more
>> accurate to say that the lock will be gone any time vacuum() creates a
>> new transaction (either in vacuum_rel() or when use_own_xacts is true).
>
> The comment in the proposed patch matches as closely as possible what
> is currently on HEAD, so I would still go with something close to that.

Sure.  This is just a minor point, and I could see the argument that
your phrasing is more concise, anyway.

Nathan
On Tue, Sep 26, 2017 at 8:48 AM, Michael Paquier
<michael.paquier@gmail.com> wrote:
> On Mon, Sep 25, 2017 at 11:32 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> Yeah, I'd noticed that while reviewing the vacuum-multiple-tables patch.
>> My thought about fixing it was to pass a null RangeVar when handling a
>> table we'd identified through inheritance or pg_class-scanning, to
>> indicate that this wasn't a table named in the original command.  This
>> only works conveniently if you decide that it's appropriate to silently
>> ignore relation_open failure on such table OIDs, but I think it is.
>>
>> Not sure about whether we ought to try to fix that in v10.  It's a
>> mostly-cosmetic problem in what ought to be an infrequent corner case,
>> so it might not be worth taking risks for post-RC1.  OTOH, I would
>> not be surprised to get bug reports about it down the road.
>
> Something like that looks like a good compromise for v10.  I would
> rather see a more complete fix on HEAD though, with each relation name
> reported correctly.  The information provided would be useful for users
> when autovacuum skips a relation because no lock could be taken on it.

Actually, perhaps this should be tracked as an open item?  The simple
fix leads down the path of accepting that no information is better than
misleading information in this case.
--
Michael
On 2017/09/25 18:37, Michael Paquier wrote:
> On Mon, Sep 25, 2017 at 4:42 PM, Amit Langote
> <Langote_Amit_f8@lab.ntt.co.jp> wrote:
>> Hmm, I'm not sure if we need to lock the partitions, too.  Locks taken
>> by find_all_inheritors() will be gone once the already-running
>> transaction is ended by the caller (vacuum()).  get_rel_oids() should
>> just lock the table mentioned in the command (the parent table), so
>> that find_all_inheritors() returns the correct partition list, as Tom
>> perhaps alluded to when he said "it would also make the
>> find_all_inheritors call a lot more meaningful", but...
>
> It is actually important to hold locks on the child tables until the
> end of the transaction, so as to fill in correctly the RangeVar
> associated with each partition when scanning them.

I think that's right, although I don't see any new RangeVar created
under vacuum() at the moment.  Maybe you're referring to Nathan's
patch, which perhaps does that.

>> When vacuum() iterates over that list and calls vacuum_rel() for each
>> relation in it, it might be the case that a relation in the list is no
>> longer a partition of the parent.  So, one might say that it would get
>> vacuumed unnecessarily.  Perhaps that's fine?
>
> I am personally fine with that.  Any checks would complicate the code
> for not much additional value.

Tend to agree here.

>>> I am not saying that this needs to be fixed in REL_10_STABLE at this
>>> stage though, as this would require some refactoring similar to what
>>> the patch adding support for VACUUM with multiple relations does.
>>
>> I think it would be better to fix that independently somehow.  VACUUM
>> on partitioned tables is handled by calling vacuum_rel() on individual
>> partitions.  It would be nice if vacuum_rel() didn't have to refer to
>> the RangeVar.  Could we not use get_rel_name(relid) in place of
>> relation->relname?  I haven't seen the other patch yet though, so maybe
>> I'm missing something.
>
> The RangeVar is used for error reporting, and once you reach
> vacuum_rel(), the relation and its OID may be gone.  So you need to
> save the information of the RangeVar beforehand, when scanning the
> relations.  That's one reason behind having the VacuumRelation stuff
> discussed on the thread for multiple-table VACUUM; this way you store,
> for each item to be vacuumed:
> - a RangeVar,
> - a list of columns,
> - an OID saved by vacuum, deduced from the RangeVar.
> That's quite some refactoring, so my opinion is that this ship has
> sailed already for REL_10_STABLE, and that we had better live with that
> in 10.

Yeah, your simple patch seems enough for v10.

Thanks,
Amit
On 2017/09/26 9:51, Michael Paquier wrote:
> On Tue, Sep 26, 2017 at 8:48 AM, Michael Paquier
> <michael.paquier@gmail.com> wrote:
>> On Mon, Sep 25, 2017 at 11:32 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>>> Yeah, I'd noticed that while reviewing the vacuum-multiple-tables patch.
>>> My thought about fixing it was to pass a null RangeVar when handling a
>>> table we'd identified through inheritance or pg_class-scanning, to
>>> indicate that this wasn't a table named in the original command.  This
>>> only works conveniently if you decide that it's appropriate to silently
>>> ignore relation_open failure on such table OIDs, but I think it is.
>>>
>>> Not sure about whether we ought to try to fix that in v10.  It's a
>>> mostly-cosmetic problem in what ought to be an infrequent corner case,
>>> so it might not be worth taking risks for post-RC1.  OTOH, I would
>>> not be surprised to get bug reports about it down the road.
>>
>> Something like that looks like a good compromise for v10.  I would
>> rather see a more complete fix on HEAD though, with each relation name
>> reported correctly.  The information provided would be useful for users
>> when autovacuum skips a relation because no lock could be taken on it.
>
> Actually, perhaps this should be tracked as an open item?  The simple
> fix leads down the path of accepting that no information is better than
> misleading information in this case.

+1.

Thanks,
Amit
On Tue, Sep 26, 2017 at 10:54 AM, Amit Langote
<Langote_Amit_f8@lab.ntt.co.jp> wrote:
> I think that's right, although I don't see any new RangeVar created
> under vacuum() at the moment.  Maybe you're referring to Nathan's
> patch, which perhaps does that.

Yes, you can check what it does here (last version):
766556DD-AA3C-42F7-ACF4-5DC97F41F151@amazon.com
--
Michael
On Tue, Sep 26, 2017 at 10:55 AM, Amit Langote
<Langote_Amit_f8@lab.ntt.co.jp> wrote:
> On 2017/09/26 9:51, Michael Paquier wrote:
>> On Tue, Sep 26, 2017 at 8:48 AM, Michael Paquier
>> <michael.paquier@gmail.com> wrote:
>>> On Mon, Sep 25, 2017 at 11:32 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>>>> Yeah, I'd noticed that while reviewing the vacuum-multiple-tables patch.
>>>> My thought about fixing it was to pass a null RangeVar when handling a
>>>> table we'd identified through inheritance or pg_class-scanning, to
>>>> indicate that this wasn't a table named in the original command.  This
>>>> only works conveniently if you decide that it's appropriate to silently
>>>> ignore relation_open failure on such table OIDs, but I think it is.
>>>>
>>>> Not sure about whether we ought to try to fix that in v10.  It's a
>>>> mostly-cosmetic problem in what ought to be an infrequent corner case,
>>>> so it might not be worth taking risks for post-RC1.  OTOH, I would
>>>> not be surprised to get bug reports about it down the road.
>>>
>>> Something like that looks like a good compromise for v10.  I would
>>> rather see a more complete fix on HEAD though, with each relation name
>>> reported correctly.  The information provided would be useful for users
>>> when autovacuum skips a relation because no lock could be taken on it.
>>
>> Actually, perhaps this should be tracked as an open item?  The simple
>> fix leads down the path of accepting that no information is better than
>> misleading information in this case.
>
> +1.

Let's track it then, and spawn a separate thread with a patch.  Do you
want to work on it, or should I?  The solution proposed by Tom seems
like the correct answer.  I am adding an item for now; we can always
link it to a thread later on.  Let's also track the problem that has
been reported on this thread.
--
Michael
On 2017/09/26 11:14, Michael Paquier wrote:
> On Tue, Sep 26, 2017 at 10:55 AM, Amit Langote wrote:
>> On 2017/09/26 9:51, Michael Paquier wrote:
>>> On Tue, Sep 26, 2017 at 8:48 AM, Michael Paquier wrote:
>>>> Something like that looks like a good compromise for v10.  I would
>>>> rather see a more complete fix on HEAD though, with each relation name
>>>> reported correctly.  The information provided would be useful for users
>>>> when autovacuum skips a relation because no lock could be taken on it.
>>>
>>> Actually, perhaps this should be tracked as an open item?  The simple
>>> fix leads down the path of accepting that no information is better than
>>> misleading information in this case.
>>
>> +1.
>
> Let's track it then, and spawn a separate thread with a patch.  Do you
> want to work on it, or should I?  The solution proposed by Tom seems
> like the correct answer.  I am adding an item for now; we can always
> link it to a thread later on.

I assume you mean Tom's solution wherein we pass a null RangeVar for
tables that were not mentioned in the command (that is, for partitions
of a partitioned table that was mentioned in the command, or for tables
read from pg_class when a user runs VACUUM without mentioning any table
name).  Please feel free to come up with the patch for that, if you
have time.  I'll be glad to review.

> Let's also track the problem that has been reported on this thread.

Yes.  Just to be clear, the original problem this thread was started
for is that get_rel_oids() may emit a less-than-user-friendly "cache
lookup failed for relation NNNN" error, which it shouldn't.  We should
fix the locking as the patch you posted does, to avoid hitting this
syscache lookup error in the first place.

Thanks,
Amit
On 2017/09/26 11:12, Michael Paquier wrote:
> On Tue, Sep 26, 2017 at 10:54 AM, Amit Langote
> <Langote_Amit_f8@lab.ntt.co.jp> wrote:
>> I think that's right, although I don't see any new RangeVar created
>> under vacuum() at the moment.  Maybe you're referring to Nathan's
>> patch, which perhaps does that.
>
> Yes, you can check what it does here (last version):
> 766556DD-AA3C-42F7-ACF4-5DC97F41F151@amazon.com

Thanks, will look.

Regards,
Amit
On Fri, Sep 22, 2017 at 03:13:10PM -0400, Tom Lane wrote:
> Somebody inserted this into vacuum.c's get_rel_oids():
>
>     tuple = SearchSysCache1(RELOID, ObjectIdGetDatum(relid));
>     if (!HeapTupleIsValid(tuple))
>         elog(ERROR, "cache lookup failed for relation %u", relid);
>
> apparently without having read the very verbose comment two lines above,
> which points out that we're not taking any lock on the target relation.
> So, if that relation is concurrently being dropped, you're likely to
> get "cache lookup failed for relation NNNN" rather than anything more
> user-friendly.
>
> A minimum-change fix would be to replace the elog() with an ereport
> that produces the same "relation does not exist" error you'd have
> gotten from RangeVarGetRelid, had the concurrent DROP TABLE committed
> a few microseconds earlier.  But that feels like it's band-aiding
> around the problem.
>
> What I'm wondering about is changing the RangeVarGetRelid call to take
> ShareUpdateExclusiveLock rather than no lock.  That would protect the
> syscache lookup, and it would also make the find_all_inheritors call
> a lot more meaningful.
>
> If we're doing a VACUUM, the ShareUpdateExclusiveLock would be dropped
> as soon as we close the caller's transaction, and then we'd acquire
> the same or stronger lock inside vacuum_rel().  So that seems fine.
> If we're doing an ANALYZE, then the lock would continue to be held
> and analyze_rel would merely be acquiring it an extra time, so we'd
> actually be removing a race-condition failure scenario for ANALYZE.
> This would mean a few more cycles in lock management, but since this
> only applies to a manual VACUUM or ANALYZE that specifies a table
> name, I'm not too concerned about that.
>
> Thoughts?

This thread now has two open items, both of them pertaining to VACUUM
error messages involving partitioning.  The pair is probably best
treated as a single open item.

[Action required within three days.  This is a generic notification.]

The above-described topic is currently a PostgreSQL 10 open item.
Robert, since you committed the patch believed to have created it, you
own this open item.  If some other commit is more relevant or if this
does not belong as a v10 open item, please let us know.  Otherwise,
please observe the policy on open item ownership[1] and send a status
update within three calendar days of this message.  Include a date for
your subsequent status update.  Testers may discover new open items at
any time, and I want to plan to get them all fixed well in advance of
shipping v10.  Consequently, I will appreciate your efforts toward
speedy resolution.  Thanks.

[1] https://www.postgresql.org/message-id/20170404140717.GA2675809%40tornado.leadboat.com
On Thu, Sep 28, 2017 at 1:31 AM, Noah Misch <noah@leadboat.com> wrote:
> This thread now has two open items, both of them pertaining to VACUUM
> error messages involving partitioning.  The pair is probably best
> treated as a single open item.

If I understand correctly, problem #1 is that get_rel_oids() can emit a
user-visible cache lookup failure message.  There is a proposed patch
by Michael Paquier which appears to implement the design suggested by
Tom.  I think that the normal procedure would be for Tom to commit that
change if he's happy with it.

I don't think I understand problem #2.  I think the concern is about
reporting the proper relation name when VACUUM cascades from a
partitioned table to its children and then some kind of concurrent DDL
happens, but I don't see a clear explanation on the thread as to what
exactly the failure scenario is, and I didn't see a problem in some
simple tests I just ran.  Furthermore, it sounds like this might get
fixed as part of committing the patch to allow VACUUM to mention
multiple tables, which Tom has indicated he will handle.

> [Action required within three days.  This is a generic notification.]

If someone could specify a particular action they'd like me to take,
I'll consider taking it.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
On 9/29/17, 11:18 AM, "Robert Haas" <robertmhaas@gmail.com> wrote:
> I don't think I understand problem #2.  I think the concern is about
> reporting the proper relation name when VACUUM cascades from a
> partitioned table to its children and then some kind of concurrent DDL
> happens, but I don't see a clear explanation on the thread as to what
> exactly the failure scenario is, and I didn't see a problem in some
> simple tests I just ran.  Furthermore, it sounds like this might get
> fixed as part of committing the patch to allow VACUUM to mention
> multiple tables, which Tom has indicated he will handle.

Yes.  It looks like v10 is safe, and the vacuum-multiple-relations
patch should help prevent any future logging issues caused by this.
Discussion here:
http://postgr.es/m/CAB7nPqRX1465FP%2Bameysxxt63tCQDDW6KvaTPMfkSxaT1TFGfw%40mail.gmail.com

Nathan
Robert Haas <robertmhaas@gmail.com> writes:
> If I understand correctly, problem #1 is that get_rel_oids() can emit
> a user-visible cache lookup failure message.  There is a proposed
> patch by Michael Paquier which appears to implement the design
> suggested by Tom.  I think that the normal procedure would be for Tom
> to commit that change if he's happy with it.

Yes, I'm happy to take responsibility for this.

> I don't think I understand problem #2.  I think the concern is about
> reporting the proper relation name when VACUUM cascades from a
> partitioned table to its children and then some kind of concurrent DDL
> happens, but I don't see a clear explanation on the thread as to what
> exactly the failure scenario is, and I didn't see a problem in some
> simple tests I just ran.  Furthermore, it sounds like this might get
> fixed as part of committing the patch to allow VACUUM to mention
> multiple tables, which Tom has indicated he will handle.

I think the conclusion was that this wouldn't actually happen in v10,
but I will take a closer look and do something about it if it is
possible.

			regards, tom lane
On Fri, Sep 29, 2017 at 12:43 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Robert Haas <robertmhaas@gmail.com> writes:
>> If I understand correctly, problem #1 is that get_rel_oids() can emit
>> a user-visible cache lookup failure message.  There is a proposed
>> patch by Michael Paquier which appears to implement the design
>> suggested by Tom.  I think that the normal procedure would be for Tom
>> to commit that change if he's happy with it.
>
> Yes, I'm happy to take responsibility for this.

Great, thanks.

>> I don't think I understand problem #2.  I think the concern is about
>> reporting the proper relation name when VACUUM cascades from a
>> partitioned table to its children and then some kind of concurrent DDL
>> happens, but I don't see a clear explanation on the thread as to what
>> exactly the failure scenario is, and I didn't see a problem in some
>> simple tests I just ran.  Furthermore, it sounds like this might get
>> fixed as part of committing the patch to allow VACUUM to mention
>> multiple tables, which Tom has indicated he will handle.
>
> I think the conclusion was that this wouldn't actually happen in v10,
> but I will take a closer look and do something about it if it is
> possible.

Even better, thanks.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
Robert Haas <robertmhaas@gmail.com> writes:
> On Fri, Sep 29, 2017 at 12:43 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> Robert Haas <robertmhaas@gmail.com> writes:
>>> If I understand correctly, problem #1 is that get_rel_oids() can emit
>>> a user-visible cache lookup failure message.  There is a proposed
>>> patch by Michael Paquier which appears to implement the design
>>> suggested by Tom.  I think that the normal procedure would be for Tom
>>> to commit that change if he's happy with it.

>> Yes, I'm happy to take responsibility for this.

> Great, thanks.

After thinking more carefully about the idea I proposed upthread (viz,
take and keep ShareUpdateExclusiveLock), I realized that that will be
pretty horrid once the proposed patch to allow VACUUM to name multiple
target tables goes in.  We'd be trying to take a pretty strong lock on
multiple tables at once during get_rel_oids(), creating substantial
risk of deadlock.  Up to now, VACUUM has always been careful to not
take more than one such lock at once, and I think we need to preserve
that property.

The simplest solution therefore seems to be to just take AccessShareLock
and then release it as soon as we're done adding the rel and its
partitions to the list of target rels.  There's little point in taking
a stronger lock here if we are going to end the transaction and drop it
later, and it's a nicely low-risk answer for v10 in any case.

>>> I don't think I understand problem #2.  I think the concern is about
>>> reporting the proper relation name when VACUUM cascades from a
>>> partitioned table to its children and then some kind of concurrent DDL
>>> happens, but I don't see a clear explanation on the thread as to what
>>> exactly the failure scenario is, and I didn't see a problem in some
>>> simple tests I just ran.  Furthermore, it sounds like this might get
>>> fixed as part of committing the patch to allow VACUUM to mention
>>> multiple tables, which Tom has indicated he will handle.

>> I think the conclusion was that this wouldn't actually happen in v10,
>> but I will take a closer look and do something about it if it is
>> possible.

> Even better, thanks.

After looking closer, I confirmed that there's no live bug today,
though it's a closer shave than one could wish.  The RangeVar passed to
vacuum_rel and/or analyze_rel can be wrong, or even NULL, but it is
only used when autovacuum.c requests failure logging, and for that
particular use-case the RangeVar will always be correct.

I felt though that I didn't want to just leave it like that,
particularly in view of the existence of multiple comments claiming
things that were entirely false.  So I adjusted those two functions to
explicitly allow for NULL RangeVars, and tried to improve the comments.

Nathan Bossart wants to see logging of relation-open failures in more
cases, and to get that we're going to have to work harder to track
valid RangeVars for target relations.  But that is not v10 material.

			regards, tom lane
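In code form, the approach described above boils down to roughly this
in get_rel_oids() (a sketch with assumed variable names):

    /* Take a transient AccessShareLock, just to make the lookup safe. */
    relid = RangeVarGetRelid(vacrel, AccessShareLock, false);

    /*
     * Collect partitions while the parent is locked.  No lock is kept
     * on the children here; vacuum_rel()/analyze_rel() will lock each
     * one when its turn comes.
     */
    if (include_parts)
        oid_list = list_concat(oid_list,
                               find_all_inheritors(relid, NoLock, NULL));
    else
        oid_list = lappend_oid(oid_list, relid);

    /* Done building the list; avoid holding locks on many tables at once. */
    UnlockRelationOid(relid, AccessShareLock);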