Re: ice-broker scan thread - Mailing list pgsql-hackers
From | Pollard, Mike |
---|---|
Subject | Re: ice-broker scan thread |
Date | |
Msg-id | 6418CC03D0FB1943A464E1FEFB3ED46B01B220E7@im01.cincom.com |
In response to | ice-broker scan thread (Qingqing Zhou <zhouqq@cs.toronto.edu>) |
Responses |
Re: ice-broker scan thread
Re: ice-broker scan thread Re: ice-broker scan thread Re: ice-broker scan thread |
List | pgsql-hackers |
First, we need a new term for a thread of execution, one that could be a thread or could be a process, I don't care. When discussing anything that is to run in parallel, the first thing that pops out of someone's mouth is "Don't you mean (thread/process)?" But that's an implementation detail and should not be considered during a planning phase, unless it is fundamental to the problem. Hence the term TOE, to mean "I don't really care if it is in its own address space, or the same address space." However, I understand that this is not in common usage, so in the following discussion I use the term thread, as it is more correct than process. I am just not defining whether that thread is the only thread running in its process or not.

I've implemented this on another database product, using buf reading threads to pull the data all the way into the database cache. In testing on Unix production systems (4 CPU machines, large RAID devices, 100GB+ databases), table scans performed 5 to 7 times faster; on MVS, table scans are up to 10 times faster. But I never had much luck getting the performance to change on Windows. Partially, I think, it's because the machine I was using was IDE, not SCSI, so I was already greatly bottlenecked. Maybe SATA would be better? I haven't tested there, either.

Anyway, what I did was the following. When doing a sequential scan, we were starting at the beginning of the table and scanning forward. If I threw up some threads to read ahead, then my user thread and my read ahead threads would thrash on trying to lock the buffer slots. So I had the read ahead threads start at some distance into the table and work toward the beginning. The user thread would do its own I/O until it passed the read ahead threads.
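As an illustration of the scheme just described, here is a minimal Python sketch. It is not code from the database product mentioned above. The names `plan_backward_readahead` and `simulate_scan` are hypothetical, and the simulation assumes that the user thread and a single read ahead thread each advance exactly one block per step, which real I/O will not do.

```python
def plan_backward_readahead(start_block, distance, table_blocks):
    """Return the blocks a read ahead thread would fetch, highest first.

    The thread starts `distance` blocks ahead of the user thread and
    works back toward it (hypothetical helper, for illustration only).
    """
    if distance <= 0:
        raise ValueError("distance must be positive")
    top = min(start_block + distance, table_blocks) - 1  # clamp to table end
    return list(range(top, start_block - 1, -1))


def simulate_scan(table_blocks, distance):
    """Step both threads in lockstep and record who read each block.

    Assumption: one block per thread per step. Returns two lists:
    blocks the user thread read itself, and blocks it found cached.
    """
    prefetch = plan_backward_readahead(0, distance, table_blocks)
    next_prefetch = 0
    cached = set()
    user_reads, cache_hits = [], []
    for block in range(table_blocks):
        if block in cached:
            cache_hits.append(block)
        else:
            user_reads.append(block)  # user thread does its own I/O
            cached.add(block)
        # The read ahead thread keeps going only while it is still ahead
        # of the user thread; once passed, it has nothing left to do.
        if next_prefetch < len(prefetch) and prefetch[next_prefetch] > block:
            cached.add(prefetch[next_prefetch])
            next_prefetch += 1
        else:
            next_prefetch = len(prefetch)
    return user_reads, cache_hits
```

With a 10-block table and a read ahead distance of 6, `simulate_scan(10, 6)` has the user thread read blocks 0 to 2 itself, find blocks 3 to 5 already cached, and read on its own again from block 6. The two threads contend for the same buffer slots only around the single point where they cross.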
I also broke the read ahead section into multiple contiguous sections, and had different threads read each section, so the user thread would only have a problem with the first section; by the time it was finished with that, the other sections would be read in. When the user thread got to about 80% of the nodes that got read ahead, it would schedule another section to be read.

+----------------------------------------------------------------+
|                             table                              |
+----------------------------------------------------------------+
 (user->)       (<-readahead)   (<-readahead)   (<-readahead)

So above, the user thread is starting low in the table and working high; the readahead threads are starting higher (but not at the end of the table), and working low.

Like I said, this worked very well for me.

Mike Pollard
SUPRA Server SQL Engineering and Support
Cincom Systems, Inc.
--------------------------------
Better to remain silent and be thought a fool than to speak out and remove all doubt.
                Abraham Lincoln

-----Original Message-----
From: pgsql-hackers-owner@postgresql.org [mailto:pgsql-hackers-owner@postgresql.org] On Behalf Of Qingqing Zhou
Sent: Tuesday, November 29, 2005 12:56 AM
To: pgsql-hackers@postgresql.org
Subject: Re: [HACKERS] ice-broker scan thread

"David Boreham" <david_list@boreham.org> wrote
>
> I don't think your NT overlapped I/O code is quite right. At least
> I think it will issue reads at a high rate without waiting for any of them
> to complete. Beyond some point that has to give the kernel gut-rot.
>

[also with reply to Gavin] look up dictionary for "gut-rot", got it ... Uh, this behavior is intended - I try to push enough requests shortly to kernel so that it understands that I am doing sequential scan, so it would pull the data from disk to file system cache more efficiently. Some file systems may have "free-behind" mechanism, but our main thread (who really process the query) should be fast enough before the data vanished.
>
> You could re-write your program to have a single thread but use aio.
> In that case it should show the same read ahead benefit that you see
> with the thread.
>

I guess this is also Gavin's point - I understand that will be two different methodologies to handle "read-ahead". If no other thread/process involved, then the main thread will be responsible to grab a free buffer page from bufferpool and ask the kernel to put the data there by sync IO (current PostgreSQL does) or async IOs. And that's what I want to avoid. I'd like to use a dedicated thread/process to "break the ice" only, i.e., pull data from disk to file system cache, so that the main thread will only issue *logical* read.

Regards,
Qingqing

---------------------------(end of broadcast)---------------------------
TIP 3: Have you checked our extensive FAQ?

               http://www.postgresql.org/docs/faq
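The "break the ice" idea in the quoted message can be sketched roughly as follows, in Python rather than in PostgreSQL's C and without any PostgreSQL internals. The function names `ice_breaker` and `scan_with_ice_breaker`, the 64 kB chunk size, and the 8192-byte page size are all assumptions made for illustration. The warming thread reads the file and throws the bytes away; its only job is to get the kernel to pull the data into the file system cache ahead of the main thread.

```python
import threading

CHUNK = 64 * 1024  # read size for the warming thread; an arbitrary choice


def ice_breaker(path, chunk=CHUNK):
    """Read the file front to back and discard the bytes.

    Nothing is kept: the reads exist only to warm the kernel's
    file system cache for whoever scans the file next.
    """
    with open(path, "rb", buffering=0) as f:
        while f.read(chunk):
            pass


def scan_with_ice_breaker(path, page_size=8192):
    """Start the warming thread, then scan the file page by page.

    Returns (pages_read, bytes_read) as seen by the main thread.
    """
    warmer = threading.Thread(target=ice_breaker, args=(path,), daemon=True)
    warmer.start()
    pages = 0
    total = 0
    with open(path, "rb") as f:
        while True:
            page = f.read(page_size)  # the main thread's own read
            if not page:
                break
            pages += 1
            total += len(page)
    warmer.join()
    return pages, total
```

The main thread never shares a buffer or a lock with the warming thread; any benefit comes only from the operating system's cache, so whether it helps depends on the platform and on how much memory the kernel is willing to devote to caching.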