Re: old synchronized scan patch - Mailing list pgsql-hackers

From Heikki Linnakangas
Subject Re: old synchronized scan patch
Msg-id 45783DFE.7080503@enterprisedb.com
In response to Re: old synchronized scan patch  ("Luke Lonergan" <llonergan@greenplum.com>)
List pgsql-hackers
Matthew O'Connor wrote:
> 
> Could we have a counter in shared memory that keeps a count of the 
> number of seq scanners currently working on a table?  While count = 1, we 
> report all blocks regardless of whether they come from disk or from buffer.  
> Once count > 1 we only report buffer reads.  Would this have locking problems?

I don't know, but it would require some extra care to make sure that the 
counter is decremented when, for example, a transaction is aborted or a 
backend is killed.
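
To make the cleanup concern concrete, here is a minimal sketch in Python rather than backend code. The class `ScanCounter` and its methods are hypothetical names invented for illustration, and a `threading.Lock` stands in for whatever lock would protect the shared-memory counter. The `try`/`finally` models the abort case only; a backend that is killed outright never runs its own cleanup, so that case would need handling elsewhere and is not covered here.

```python
import threading
from contextlib import contextmanager


class ScanCounter:
    """Hypothetical stand-in for a per-table counter kept in shared memory."""

    def __init__(self):
        self._lock = threading.Lock()  # assumption: one lock guards the counter
        self._count = 0

    def active(self):
        """Number of scans currently registered on the table."""
        with self._lock:
            return self._count

    @contextmanager
    def scanning(self):
        """Register a scan; always unregister it, even if the scan aborts."""
        with self._lock:
            self._count += 1
        try:
            yield self
        finally:
            # Runs on normal completion and when an exception (abort) unwinds.
            with self._lock:
                self._count -= 1

    def should_report(self, from_disk):
        """Proposed rule: a lone scanner reports every block; once there is
        more than one scanner, only buffer reads are reported."""
        with self._lock:
            alone = self._count == 1
        return alone or not from_disk
```

A scan would wrap its main loop in `with counter.scanning():` and consult `should_report()` for each block it reads.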

> BTW, it seems to me that this is all based on the assumption that 
> followers will have no problem keeping up with the pack leader.  Suppose 
> my process does a lot of other processing and can't keep up with the 
> pack despite the fact that it's getting all its data from the buffer. 
> Now we effectively have two different seq scans going on.  Does my 
> process need to recognize that it's not keeping up and not report its 
> blocks?

That's what I was wondering about all these schemes as well. Instead of 
doing a strictly sequential scan, each backend could keep a bitmap of the 
pages it has processed during the scan, and read the pages in whatever 
order they become available in cache. If a backend misses a page in 
the synchronized scan, for example, it could issue a random I/O after 
reading all the other pages to fetch it separately, instead of starting 
another seq scan at a different location and "derailing the train". I 
don't know what the exact algorithm for deciding when and how to fetch 
each page would be, but the bitmaps would be in backend-private memory. 
And presumably it could be used with bitmap heap scans as well.
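
As a rough illustration of the bitmap idea, the Python sketch below uses the simplest possible policy: take every page that shows up in cache first, then fetch whatever was missed at the end. That policy is an assumption made for the example, since the actual decision algorithm is left open above. The names `scan_with_bitmap`, `cached_batches` and `read_page` are hypothetical; `cached_batches` stands in for whatever mechanism would tell a backend which pages are currently in cache.

```python
def scan_with_bitmap(n_pages, cached_batches, read_page):
    """Visit every page of a table exactly once, preferring cached pages.

    n_pages        -- number of pages in the table
    cached_batches -- iterable of iterables of page numbers reported as
                      being in cache (hypothetical feed)
    read_page      -- callback read_page(page, from_cache) (hypothetical)

    Returns the order in which pages were processed.
    """
    # One bit per page, private to this scan (i.e. backend-private memory).
    done = bytearray((n_pages + 7) // 8)

    def seen(page):
        return done[page >> 3] & (1 << (page & 7))

    def mark(page):
        done[page >> 3] |= 1 << (page & 7)

    order = []

    # Phase 1: process pages in whatever order they appear in cache,
    # skipping any page this scan has already handled.
    for batch in cached_batches:
        for page in batch:
            if 0 <= page < n_pages and not seen(page):
                read_page(page, True)
                mark(page)
                order.append(page)

    # Phase 2: fetch each missed page separately (random I/O) instead of
    # starting a second sequential scan at a different location.
    for page in range(n_pages):
        if not seen(page):
            read_page(page, False)
            mark(page)
            order.append(page)

    return order
```

For a six-page table where the cache offers pages 2 and 3, then 3 and 5, the scan processes 2, 3, 5 from cache and then fetches 0, 1, 4 individually.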

-- 
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com

