Re: Four issues why "old elephants" lack performance: Explanation sought - Mailing list pgsql-general

From	Andy Colson
Subject	Re: Four issues why "old elephants" lack performance: Explanation sought
Date	February 26, 2012 10:38:12
Msg-id	4F4A43C2.9050206@squeakycode.net
In response to	Four issues why "old elephants" lack performance: Explanation sought (Stefan Keller <sfkeller@gmail.com>)
Responses	Re: Four issues why "old elephants" lack performance: Explanation sought
List	pgsql-general

On 02/25/2012 06:54 PM, Stefan Keller wrote:
> Hi,
>
> Recently Mike Stonebraker identified four areas where "old elephants"
> lack performance [1]:
>
> 1. Buffering/paging
> 2. Locking/Multithreading
> 3. WAL logging
> 4. Latches (aka memory locks for concurrent access of btree structures
> in buffer pool?).
>
> He claims having solved these issues while retaining SQL and ACID.
> But the only issue I understood is #1 by loading all tuples in-memory.
> =>  Are there any ideas on how to tell Postgres to aggressively load
> all data into memory (issue #1)?
> All remaining issues make me wonder.
> I actually doubt that there are alternatives even theoretically.
> =>  Can anyone help explaining me issues 2,3 and 4, their solutions,
> and why Postgres would be unable to resolve them?
>

> 1. Buffering/paging

PG, and your operating system, already do this for reads.  They also keep things that are hit harder and let go of things
that are not hit as much.  On the writing side, you can configure PG anywhere from "write it! write it NOW!" to "running
with scissors", depending on how safe you want to feel.
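
To make "keep what is hit harder, let go of what is not" concrete, here is a toy clock-sweep cache in Python.  It is only
a sketch of the general idea: the class name, the capacity and the usage cap of 5 are assumptions made for illustration,
and this is not PG's actual buffer manager code.

```python
class ClockSweepCache:
    """Toy buffer pool: pages that are hit often survive, cold pages get evicted."""

    MAX_USAGE = 5  # assumed cap, so a once-hot page can still age out

    def __init__(self, capacity):
        self.capacity = capacity
        self.slots = []   # each slot is [page_id, usage_count]
        self.index = {}   # page_id -> position in self.slots
        self.hand = 0     # the clock hand that sweeps over the slots

    def access(self, page_id):
        """Return True on a cache hit, False if the page had to be loaded."""
        if page_id in self.index:
            slot = self.slots[self.index[page_id]]
            slot[1] = min(slot[1] + 1, self.MAX_USAGE)
            return True
        if len(self.slots) < self.capacity:
            # Still room: just take the next free slot.
            self.index[page_id] = len(self.slots)
            self.slots.append([page_id, 1])
            return False
        # Cache is full: sweep, decrementing usage counts until one hits zero.
        while True:
            slot = self.slots[self.hand]
            if slot[1] == 0:
                del self.index[slot[0]]
                self.slots[self.hand] = [page_id, 1]
                self.index[page_id] = self.hand
                self.hand = (self.hand + 1) % self.capacity
                return False
            slot[1] -= 1
            self.hand = (self.hand + 1) % self.capacity
```

With a capacity of two, a page touched three times outlives a page touched once when a third page arrives, because the
sweep wears the cold page down to zero first.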

> 2. Locking/Multithreading

PG does have some internal structures that it needs to lock (and any time you read "lock", think single-user access, or
one at a time, or slow).  Any time you hear about lock contention, it's multiple processes waiting in line for a lock.
If you only had one client, you really would not need locks.  That is where multithreading comes in, although in PG
we use multi-process instead of multi-thread; for this discussion it's the same thing.  Two (or more) people need to
lock something so they can really screw with it.  PG does not need as many locks as other DBs, however.  It uses an MVCC
architecture, so under normal operations (insert, update, select, delete) people don't block each other (i.e. readers
don't block writers and vice versa).
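
A rough way to picture why readers and writers stay out of each other's way under MVCC: writers add new versions of a
row, and each reader only looks at versions that existed when it took its snapshot.  The Python below is a minimal
sketch under that assumption; VersionedStore and its methods are made-up names, and real PG tracks visibility per tuple
with transaction ids in a far more involved way.

```python
class VersionedStore:
    """Toy multi-version store: writers append versions, readers use snapshots."""

    def __init__(self):
        self.versions = {}   # key -> list of (commit_txid, value), oldest first
        self.last_txid = 0

    def snapshot(self):
        """A reader remembers the latest commit it is allowed to see."""
        return self.last_txid

    def write(self, key, value):
        """A writer adds a new version instead of overwriting the old one."""
        self.last_txid += 1
        self.versions.setdefault(key, []).append((self.last_txid, value))
        return self.last_txid

    def read(self, key, snapshot):
        """Return the newest version committed at or before the snapshot."""
        visible = [value for txid, value in self.versions.get(key, [])
                   if txid <= snapshot]
        return visible[-1] if visible else None


store = VersionedStore()
store.write("row", "old")
reader_snapshot = store.snapshot()    # reader starts here
store.write("row", "new")             # writer is not held up by the reader
old_view = store.read("row", reader_snapshot)
new_view = store.read("row", store.snapshot())
```

The reader that started before the second write still sees "old", a reader that starts afterwards sees "new", and
nobody waited on a lock to get there.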

I don't see locking going away, but there are not many loads that are lock bound.  Most database loads are IO bound,
and then you'd probably be CPU bound before you are lock bound (although it can be hard to tell if it's a spin lock
that's making you CPU bound).  I'm sure there are loads that hit lock contention, but there are probably ways to
mitigate it.  Say you have a process that alters the table and adds a new column every two seconds, then updates a
single row to add data to the new column just added.  I can see that being lock bound.  And a really stupid
implementation.

> 3. WAL logging

PG writes a transaction twice: once to WAL and once to the DB.  WAL is a simple and quick write, and (replication and
point-in-time recovery aside) is only ever read back if your computer crashes and PG has to replay transactions to get
the db into a good/known state.  It's a safety measure that doesn't really take much time, and I don't think I've heard
of anyone being WAL bound.  Although it does increase IO ops, it's not the biggest usage of IO.  This one falls under
"let's be safe", which is something NoSQL did away with.  It's not something I want to give up, personally.  I like
using a net.
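
The "write it twice, replay after a crash" idea can be sketched in a few lines of Python.  This is a toy under stated
assumptions: TinyWalStore, the JSON-lines log format and the file name are invented for illustration, and a real WAL
also has to cope with torn records, checkpoints and log recycling, none of which is shown here.

```python
import json
import os
import tempfile


class TinyWalStore:
    """Toy key/value store that logs every change before applying it."""

    def __init__(self, wal_path):
        self.wal_path = wal_path
        self.data = {}      # stands in for "the DB"
        self._replay()      # crash recovery: rebuild state from the log

    def _replay(self):
        if not os.path.exists(self.wal_path):
            return
        with open(self.wal_path) as wal:
            for line in wal:
                record = json.loads(line)
                self.data[record["key"]] = record["value"]

    def put(self, key, value):
        # First write: append to the log and force it to disk.
        with open(self.wal_path, "a") as wal:
            wal.write(json.dumps({"key": key, "value": value}) + "\n")
            wal.flush()
            os.fsync(wal.fileno())
        # Second write: apply the change to the main data.
        self.data[key] = value


def demo():
    with tempfile.TemporaryDirectory() as tmp:
        wal_path = os.path.join(tmp, "toy.wal")
        store = TinyWalStore(wal_path)
        store.put("balance", 100)
        store.put("balance", 75)
        del store                            # pretend the process crashed
        recovered = TinyWalStore(wal_path)   # a fresh start replays the log
        return recovered.data
```

The append is sequential and small, which is why the extra write is cheap compared with updating the data pages
themselves.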

> 4. Latches

I can only guess at this one.  It's similar to locks, I think.  Data structures come in different types.  In the old days
we only had single-user access to data structures; then when we wanted two users to access one we just locked it to
serialize access (one-at-a-time mode), but that does not scale well at all, so we invented two new types: lock free and
wait free.

An index is stored as a btree.  To insert a new record into the index you have to reorganize it (rotate it, sort it,
add/delete nodes, etc), and while one client is doing that it can make it hard for another to try to search it.  Lock
free (and wait free) let multiple people work on a btree at the same time with much less contention.  Wikipedia does a
better job of explaining them than I could:

http://en.wikipedia.org/wiki/Non-blocking_algorithm
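
For a feel of what lock free looks like in practice, here is the usual compare-and-swap retry loop, sketched in Python.
Big caveat: Python does not expose a CAS instruction, so the AtomicCell class below fakes one with a tiny internal lock
that stands in for what the CPU would do atomically.  All names here are made up for illustration, and this says
nothing about how PG's internals are built.

```python
import threading


class AtomicCell:
    """Stand-in for a machine word that supports compare-and-swap.

    The internal lock only simulates the atomicity real hardware provides;
    it is an assumption of this sketch, not part of the technique.
    """

    def __init__(self, value):
        self._value = value
        self._guard = threading.Lock()

    def load(self):
        return self._value

    def compare_and_swap(self, expected, new):
        """Set the value to `new` only if it still equals `expected`."""
        with self._guard:
            if self._value == expected:
                self._value = new
                return True
            return False


def lock_free_increment(cell):
    """Optimistic update: read, compute, publish, and retry if someone else won."""
    while True:
        seen = cell.load()
        if cell.compare_and_swap(seen, seen + 1):
            return seen + 1


def run_demo(threads=4, increments=1000):
    cell = AtomicCell(0)

    def worker():
        for _ in range(increments):
            lock_free_increment(cell)

    pool = [threading.Thread(target=worker) for _ in range(threads)]
    for t in pool:
        t.start()
    for t in pool:
        t.join()
    return cell.load()
```

Nobody holds the value hostage while doing its work: a client that loses the race simply rereads and tries again, so
one slow client cannot stall the others.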

I have no idea if PG uses single-user locks or some kind of lock-free structure for its internals.  I can see different
parts of the internals needing different levels.

Maybe I'm old fashioned, but I don't see how you'll get rid of these.  You have to insert a record.  You have to have
12 people hitting the db at the same time.  You have to organize that read/write access somehow so they don't blow each
other up.

-Andy
