Thread: PostgreSQL advocacy
If anybody puts together a "just the facts" document after Oracle's attack on PostgreSQL in Russia, please make sure it's drawn to the attention of this mailing list for the benefit of those who aren't in -advocacy.

I was discussing this sort of thing elsewhere in the context of MS's apparent challenge to Oracle and IBM, and the dominant feeling appeared to be that actual use of things like Oracle RAC was vanishingly uncommon. Which surprised me, and which I'm treating with caution since the fact that facilities aren't used (in a certain population of developers etc.) can in no way be interpreted as meaning that the technology is unavailable or unreliable.

--
Mark Morgan Lloyd
markMLl .AT. telemetry.co .DOT. uk

[Opinions above are the author's, not those of his employers or colleagues]

Mark Morgan Lloyd wrote on 21.03.2016 at 14:44:
> I was discussing this sort of thing elsewhere in the context of MS's
> apparent challenge to Oracle and IBM, and the dominant feeling
> appeared to be that actual use of things like Oracle RAC was
> vanishingly uncommon. Which surprised me, and which I'm treating with
> caution since the fact that facilities aren't used (in a certain
> population of developers etc.) can in no way be interpreted as
> meaning that the technology is unavailable or unreliable.

RAC is usually used for high availability, not for (horizontal) scaling. All nodes in a RAC cluster share the same I/O system, so I/O is still the bottleneck and you can't use RAC to scale a system that is I/O bound.

Back in the days when RAC was introduced, multi-core, multi-CPU servers weren't that common (and had way fewer CPUs than high-end servers today), and for systems like that RAC _can_ indeed be used to scale the system.

And the cache synchronization across the nodes can quickly become a *serious* bottleneck if the application isn't really designed for it. I have seen misbehaving applications that would cause Oracle to spend over 30% of its processing time just sending blocks back and forth between the nodes.

So - at least as far as I can tell - it's usually only used where high availability is really important, e.g. where zero downtime is required. If you can live with a short downtime, a hot standby is much cheaper and probably not that much slower.

See e.g. here: http://www.sdmc.nl/YouProbablyDontNeedRACUSVersion.pdf
and here: http://nyoug.org/Presentations/2006/September_NYC_Metro_Meeting/200609Zito_You%20Probably%20DO%20Need%20RAC.pdf

Thomas

On Mon, Mar 21, 2016 at 7:44 AM, Mark Morgan Lloyd <markMLl.pgsql-general@telemetry.co.uk> wrote:
> If anybody puts together a "just the facts" document after Oracle's attack
> on PostgreSQL in Russia, please make sure it's drawn to the attention of
> this mailing list for the benefit of those who aren't in -advocacy.
>
> I was discussing this sort of thing elsewhere in the context of MS's
> apparent challenge to Oracle and IBM, and the dominant feeling appeared to
> be that actual use of things like Oracle RAC was vanishingly uncommon. Which
> surprised me, and which I'm treating with caution since the fact that
> facilities aren't used (in a certain population of developers etc.) can in
> no way be interpreted as meaning that the technology is unavailable or
> unreliable.

I've submitted three different bug reports and had a patch within 48 hours each time. The responsiveness of this list, and of the folks who code PostgreSQL, is far above any level of support I've ever gotten from Oracle. I once asked Oracle to please package the newest connection libs into an RPM for RHEL5 and their response was "do it yourself."

Yeah, I know which database has REAL, USEFUL support for a DBA and it isn't Oracle.

On 03/21/2016 10:57 AM, Thomas Kellerer wrote:
> So - at least as far as I can tell - it's usually only used where high-availability is really important, e.g. where zero-downtime is required.
> If you can live with a short downtime, a hot standby is much cheaper and probably not that much slower.

Even the above statement can be challenged, given the rising popularity of NoSQL databases which are all based on eventual consistency (aka async replication).

A PG with BDR and an application designed to read/write only one node via connection mapping can match the high availability requirement of RAC.

BTW, disk is a single point of failure in RAC.

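As a rough illustration of the "connection mapping" idea above, here is a minimal Python sketch. It assumes a hypothetical `ConnectionMap` helper and made-up node names (`pg-a`, `pg-b`); neither comes from BDR or from this thread, and conflict handling and real health checks are deliberately left out.

```python
from dataclasses import dataclass, field
from typing import List, Set


@dataclass
class ConnectionMap:
    """Pins each application key to one preferred node, with ordered fallbacks.

    Hypothetical helper for illustration; not part of BDR or PostgreSQL.
    """
    nodes: List[str]                               # node names in a fixed order
    down: Set[str] = field(default_factory=set)    # nodes currently marked unavailable

    def node_for(self, key: str) -> str:
        # Derive a stable starting index so a given key always prefers
        # the same node while that node is up.
        start = sum(key.encode("utf-8")) % len(self.nodes)
        # Walk the ring from the preferred node and take the first live one.
        for offset in range(len(self.nodes)):
            candidate = self.nodes[(start + offset) % len(self.nodes)]
            if candidate not in self.down:
                return candidate
        raise RuntimeError("no node available")


if __name__ == "__main__":
    cmap = ConnectionMap(["pg-a", "pg-b"])
    preferred = cmap.node_for("customer-42")
    cmap.down.add(preferred)                       # simulate a node failure
    print(preferred, "->", cmap.node_for("customer-42"))
```

All reads and writes for one key go to a single node, so the application never depends on two nodes agreeing at the same instant; on failure the key simply moves to the next live node.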
On 3/21/16, 9:10 AM, "pgsql-general-owner@postgresql.org on behalf of Rakesh Kumar" <pgsql-general-owner@postgresql.org on behalf of rakeshkumar464a3@gmail.com> wrote:
>On 03/21/2016 10:57 AM, Thomas Kellerer wrote:
>
>> So - at least as far as I can tell - it's usually only used where high-availability is really important, e.g. where zero-downtime is required.
>> If you can live with a short downtime, a hot standby is much cheaper and probably not that much slower.
>
>Even the above statement can be challenged, given the rising popularity
>of nosql databases which are all based on
>eventual consistency (aka async replication).
>
>A PG with BDR and an application designed to read/write only
>one node via connection mapping can match the high availability
>requirement of RAC.
>
>BTW disk is a single point of failure in RAC.

Disk is only a single point of failure in RAC if you configure non-redundant storage. In general, Oracle recommends triple mirroring to protect against disk failures, as they have had many experiences over the years where customers with mirrored disks would see consecutive disk failures within short periods of time.

And RAC is widely used by Oracle’s larger customers, not only for HA, but also in some cases for scale-out. Having said that, it’s very true that any application running on Oracle RAC must be configured to avoid hot block contention across RAC nodes, so it’s not a completely transparent solution for scale-out.

-KJ (original product manager for Oracle Parallel Server, the distant ancestor of RAC)

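The triple-mirroring recommendation can be put into rough numbers. The Python sketch below assumes independent disk failures at a constant rate, a 3% annual failure rate and a 24-hour rebuild window; all three figures are illustrative assumptions, not Oracle data. The consecutive failures described above suggest that real failures are correlated, so results from this model are best read as optimistic.

```python
import math

HOURS_PER_YEAR = 365 * 24


def p_fail_within(hours: float, annual_failure_rate: float) -> float:
    """Probability that one disk fails within `hours`.

    Assumes a constant failure rate (exponential lifetime model).
    """
    rate_per_hour = -math.log(1.0 - annual_failure_rate) / HOURS_PER_YEAR
    return 1.0 - math.exp(-rate_per_hour * hours)


def p_lose_all_remaining(copies: int, rebuild_hours: float,
                         annual_failure_rate: float) -> float:
    """Probability of data loss after the first disk of a mirror set has failed.

    Every one of the remaining `copies - 1` disks must also fail before the
    rebuild finishes; failures are assumed independent.
    """
    return p_fail_within(rebuild_hours, annual_failure_rate) ** (copies - 1)


if __name__ == "__main__":
    # Illustrative inputs only: 3% annual failure rate, 24 h rebuild window.
    for copies in (2, 3):
        risk = p_lose_all_remaining(copies, 24, 0.03)
        print(f"{copies}-way mirror: {risk:.3e}")
```

Under these assumptions the third copy multiplies the residual risk by the (small) single-disk failure probability once more; correlated failures, such as disks from one manufacturing batch, would shrink that benefit.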
On Mon, Mar 21, 2016 at 04:46:51PM +0000, Jernigan, Kevin wrote:
> Disk is only a single point of failure in RAC if you configure
> non-redundant storage. In general, Oracle recommends triple mirroring
> to protect against disk failures, as they have had many experiences
> over the years where customers with mirrored disks would see
> consecutive disk failures within short periods of time.
>
> And RAC is widely used by Oracle’s larger customers, not only
> for HA, but also in some cases for scale-out. Having said that,
> it’s very true that any application running on Oracle RAC must be
> configured to avoid hot block contention across RAC nodes, so it’s
> not a completely transparent solution for scale out.

I get asked about Oracle RAC often. My usual answer is that Oracle RAC gives you 50% of high reliability (storage is shared, mirroring helps) and 50% of scaling (CPU/memory is scaled, storage is not). The requirement to partition applications to specific nodes to avoid cache consistency overhead is another downside. (Slide 24 of my scaling presentation shows Oracle RAC, http://momjian.us/main/writings/pgsql/scaling.pdf .)

I said the community is unlikely to go the Oracle RAC direction because it doesn't fully solve a single problem, and it is overly complex. The community prefers fully-solved problems and simpler solutions.

For me, streaming replication fully solves the high reliability problem and sharding fully solves the scaling problem. Of course, if you need both, you have to deploy both, which gives you 100% of two solutions, rather than Oracle RAC which gives you 50% of each.

However, I do think database upgrades are easier with Oracle RAC, and I think it is much easier to add/remove nodes than with sharding. For me, this chart summarizes it:

                  HA     Scaling  Upgrade  Add/Remove
  Oracle RAC      50%    50%      easy     easy
  Streaming Rep.  100%   25%*     hard     easy
  Sharding        0%     100%     hard     hard

  * Allows read scaling

--
  Bruce Momjian  <bruce@momjian.us>        http://momjian.us
  EnterpriseDB                             http://enterprisedb.com

+ As you are, so once was I. As I am, so you will be. +
+ Roman grave inscription                             +

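To make the "hard" Add/Remove entry for sharding in the chart concrete, here is a small Python sketch of naive hash-modulo shard routing. The function names and the choice of SHA-256 with modulo placement are assumptions for illustration only; real sharding layers often use consistent hashing or lookup tables precisely to reduce the data movement measured here.

```python
import hashlib
from typing import Iterable


def shard_for(key: str, shard_count: int) -> int:
    """Map a key to a shard number using a stable hash and modulo placement."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    # Use the first 8 bytes as an unsigned integer; stable across runs and machines.
    return int.from_bytes(digest[:8], "big") % shard_count


def fraction_moved(keys: Iterable[str], old_count: int, new_count: int) -> float:
    """Fraction of keys whose shard changes when the shard count changes."""
    keys = list(keys)
    moved = sum(
        1 for k in keys
        if shard_for(k, old_count) != shard_for(k, new_count)
    )
    return moved / len(keys)


if __name__ == "__main__":
    sample = [f"user-{i}" for i in range(10_000)]
    # Adding a fifth shard to a four-shard cluster.
    print(f"keys relocated: {fraction_moved(sample, 4, 5):.1%}")
```

With modulo placement, going from 4 to 5 shards relocates roughly four keys in five, because a key stays put only when its hash leaves the same remainder for both counts. Adding a streaming-replication standby, by contrast, moves no existing data between nodes.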
Bruce Momjian wrote on 22.03.2016 at 16:07:
> For me, streaming replication fully solves the high reliability problem
> and sharding fully solves the scaling problem. Of course, if you need
> both, you have to deploy both, which gives you 100% of two solutions,
> rather than Oracle RAC which gives you 50% of each.
>
> However, I do think database upgrades are easier with Oracle RAC, and I
> think it is much easier to add/remove nodes than with sharding. For me,
> this chart summarizes it:
>
>                   HA     Scaling  Upgrade  Add/Remove
>   Oracle RAC      50%    50%      easy     easy
>   Streaming Rep.  100%   25%*     hard     easy
>   Sharding        0%     100%     hard     hard
>
>   * Allows read scaling

To be fair: you don't need RAC in Oracle to get streaming replication. You can use a hot standby in Oracle the same way you do in Postgres.

And if you use a "cold standby" (where only the archive logs are applied, but the instance is not started) you don't even have to pay for the second license.

> However, I do think database upgrades are easier with Oracle RAC

I think you can do a rolling upgrade with a standby, but I'm not entirely sure.

Thomas

On Tue, Mar 22, 2016 at 9:15 AM, Thomas Kellerer <spam_eater@gmx.net> wrote:
> Bruce Momjian wrote on 22.03.2016 at 16:07:
>>
>> However, I do think database upgrades are easier with Oracle RAC
>
> I think you can do a rolling upgrade with a standby, but I'm not entirely sure.

I find Slony good for upgrading versions with minimal downtime, including major version changes. Its very nature allows you to migrate pieces and parts for testing etc., in ways that any kind of byte streaming just can't do.

On Tue, Mar 22, 2016 at 10:16:22AM -0600, Scott Marlowe wrote:
> On Tue, Mar 22, 2016 at 9:15 AM, Thomas Kellerer <spam_eater@gmx.net> wrote:
> > Bruce Momjian wrote on 22.03.2016 at 16:07:
> >>
> >> However, I do think database upgrades are easier with Oracle RAC
> >
> > I think you can do a rolling upgrade with a standby, but I'm not entirely sure.
>
> I find Slony good for upgrading versions with minimal downtime,
> including major version changes. Its very nature allows you to
> migrate pieces and parts for testing etc., in ways that any kind of
> byte streaming just can't do.

Yes, and I assume logical replication will allow similar easy upgrades.

--
  Bruce Momjian  <bruce@momjian.us>        http://momjian.us
  EnterpriseDB                             http://enterprisedb.com

Jernigan, Kevin wrote:
> Disk is only a single point of failure in RAC if you configure non-redundant storage.
> In general, Oracle recommends triple mirroring to protect against disk failures,
> as they have had many experiences over the years where customers with mirrored disks
> would see consecutive disk failures within short periods of time.

The single point of failure in Oracle RAC is the ASM file system.

Yours,
Laurenz Albe

On 3/24/16, 3:09 PM, "Albe Laurenz" <laurenz.albe@wien.gv.at> wrote:
>Jernigan, Kevin wrote:
>> Disk is only a single point of failure in RAC if you configure non-redundant storage.
>> In general, Oracle recommends triple mirroring to protect against disk failures,
>> as they have had many experiences over the years where customers with mirrored disks
>> would see consecutive disk failures within short periods of time.
>
>The single point of failure in Oracle RAC is the ASM file system.
>
>Yours,
>Laurenz Albe

Only if you misconfigure ASM for RAC: with RAC, an ASM instance will run on every RAC node, and if the ASM instance fails on any one node, the RAC instance on that node will go down, but the RAC instances on the other nodes will continue to run - so the database will remain accessible, though with fewer processors available.

If you configure ASM to implement at least dual mirroring for storage - and I’m pretty sure Oracle intentionally makes it hard to configure ASM without mirroring - then ASM will continue to run through any single disk failure.

-KJ

On 3/22/16, 8:07 AM, "Bruce Momjian" <bruce@momjian.us> wrote:
>On Mon, Mar 21, 2016 at 04:46:51PM +0000, Jernigan, Kevin wrote:
>> Disk is only a single point of failure in RAC if you configure
>> non-redundant storage. In general, Oracle recommends triple mirroring
>> to protect against disk failures, as they have had many experiences
>> over the years where customers with mirrored disks would see
>> consecutive disk failures within short periods of time.
>>
>> And RAC is widely used by Oracle’s larger customers, not only
>> for HA, but also in some cases for scale-out. Having said that,
>> it’s very true that any application running on Oracle RAC must be
>> configured to avoid hot block contention across RAC nodes, so it’s
>> not a completely transparent solution for scale out.
>
>I get asked about Oracle RAC often. My usual answer is that Oracle RAC
>gives you 50% of high reliability (storage is shared, mirroring helps)
>and 50% of scaling (CPU/memory is scaled, storage is not). The
>requirement to partition applications to specific nodes to avoid cache
>consistency overhead is another downside. (Slide 24 of my scaling
>presentation shows Oracle RAC,
>http://momjian.us/main/writings/pgsql/scaling.pdf .)
>
>I said the community is unlikely to go the Oracle RAC direction because
>it doesn't fully solve a single problem, and it is overly complex. The
>community prefers fully-solved problems and simpler solutions.
>
>For me, streaming replication fully solves the high reliability problem
>and sharding fully solves the scaling problem. Of course, if you need
>both, you have to deploy both, which gives you 100% of two solutions,
>rather than Oracle RAC which gives you 50% of each.
>
>However, I do think database upgrades are easier with Oracle RAC, and I
>think it is much easier to add/remove nodes than with sharding. For me,
>this chart summarizes it:
>
>                   HA     Scaling  Upgrade  Add/Remove
>   Oracle RAC      50%    50%      easy     easy
>   Streaming Rep.  100%   25%*     hard     easy
>   Sharding        0%     100%     hard     hard
>
>   * Allows read scaling

Implementing RAC-equivalent functionality is extremely hard, as evidenced by the lack of any directly comparable capability from any other relational db engine, until the release of IBM DB2 Shareplex a few years ago. And given the improvement of PostgreSQL and other open source solutions over the past 20 years, it’s not clear that it makes sense to go through the initial design and implementation work and then the ongoing maintenance overhead - most of what RAC provides can be achieved through other existing capabilities.

While I’m not sure that the percentage breakdowns in your chart are totally accurate, I agree with the general assessment, except for the highest-end applications which have zero-downtime requirements which can’t be met with streaming replication: the overhead of synchronous replication limits scalability, and failing over from a primary to a failover target takes significantly longer than with RAC, where failover time can be literally zero if configured correctly.

The higher-level point that I think is important is that while I may be able to win technical arguments that RAC is better for certain high-end extreme workloads - and maybe I can’t even win those arguments ;-) - the real issue is that there aren’t very many of those workloads, and the PostgreSQL community shouldn’t care: the vast majority of Oracle (and SQL Server etc.) workloads don’t need all the fancy high-end RAC capabilities, or many of the other high-end commercial database capabilities. And those workloads can relatively easily be migrated to PostgreSQL, with minor disruption / change to schemas, data, triggers, constraints, procedural SQL…

-KJ

Jernigan, Kevin wrote:
> On 3/22/16, 8:07 AM, "Bruce Momjian" <bruce@momjian.us> wrote:
>>
>>                   HA     Scaling  Upgrade  Add/Remove
>>   Oracle RAC      50%    50%      easy     easy
>>   Streaming Rep.  100%   25%*     hard     easy
>>   Sharding        0%     100%     hard     hard
>>
>>   * Allows read scaling
>
> Implementing RAC-equivalent functionality is extremely hard, as evidenced by the lack of any directly comparable capability from any other relational db engine, until the release of IBM DB2 Shareplex a few years ago. And given the improvement of PostgreSQL and other open source solutions over the past 20 years, it’s not clear that it makes sense to go through the initial design and implementation work and then the ongoing maintenance overhead - most of what RAC provides can be achieved through other existing capabilities.

Hearing what IBM's strong points are is always useful, since the various flavours of DB2 obviously have facilities to which other databases should aspire. As with Oracle, DB2's strong points aren't really well-publicised, and things are further complicated by the variant terminology which IBM has evolved over the half century they've been building mainframes.

> While I’m not sure that the percentage breakdowns in your chart are totally accurate, I agree with the general assessment, except for the highest-end applications which have zero-downtime requirements which can’t be met with streaming replication: the overhead of synchronous replication limits scalability, and the failover time for moving from primary to a failover target is significantly slower than RAC - which can be literally zero if configured correctly.
>
> The higher-level point that I think is important is that while I may be able to win technical arguments that RAC is better for certain high-end extreme workloads - and maybe I can’t even win those arguments ;-) - the real issue is that there aren’t very many of those workloads, and the PostgreSQL community shouldn’t care: the vast majority of Oracle (and SQL Server etc) workloads don’t need all the fancy high-end RAC capabilities, or many of the other high-end commercial database capabilities. And those workloads can relatively easily be migrated to PostgreSQL, with minor disruption / change to schemas, data, triggers, constraints, procedural SQL…

What I've seen so far suggests that if MS is positioning SQL Server to challenge Oracle, it's basically looking for low-hanging fruit: in particular supplementary databases which corporates have put onto Oracle out of habit but which quite simply don't need some of the higher-end facilities for which Oracle is harvesting revenue.

Just because a corporate has a hundred sites cooperating for inventory management doesn't mean that the canteen menus have to be stored on Oracle RAC :-)

--
Mark Morgan Lloyd
markMLl .AT. telemetry.co .DOT. uk

[Opinions above are the author's, not those of his employers or colleagues]

On 3/25/16, 4:37 AM, "pgsql-general-owner@postgresql.org on behalf of Mark Morgan Lloyd" <pgsql-general-owner@postgresql.org on behalf of markMLl.pgsql-general@telemetry.co.uk> wrote:
>Jernigan, Kevin wrote:
>> On 3/22/16, 8:07 AM, "Bruce Momjian" <bruce@momjian.us> wrote:
>>>
>>>                   HA     Scaling  Upgrade  Add/Remove
>>>   Oracle RAC      50%    50%      easy     easy
>>>   Streaming Rep.  100%   25%*     hard     easy
>>>   Sharding        0%     100%     hard     hard
>>>
>>>   * Allows read scaling
>>
>> Implementing RAC-equivalent functionality is extremely hard, as evidenced by the lack of any directly comparable capability from any other relational db engine, until the release of IBM DB2 Shareplex a few years ago. And given the improvement of PostgreSQL and other open source solutions over the past 20 years, it’s not clear that it makes sense to go through the initial design and implementation work and then the ongoing maintenance overhead - most of what RAC provides can be achieved through other existing capabilities.
>
>Hearing what IBM's strong points are is always useful, since the various
>flavours of DB2 obviously have facilities to which other databases
>should aspire. As with Oracle, DB2's strong points aren't really
>well-publicised, and things are further complicated by the variant
>terminology which IBM has evolved over the half century they've been
>building mainframes.
>
>> While I’m not sure that the percentage breakdowns in your chart are totally accurate, I agree with the general assessment, except for the highest-end applications which have zero-downtime requirements which can’t be met with streaming replication: the overhead of synchronous replication limits scalability, and the failover time for moving from primary to a failover target is significantly slower than RAC - which can be literally zero if configured correctly.
>>
>> The higher-level point that I think is important is that while I may be able to win technical arguments that RAC is better for certain high-end extreme workloads - and maybe I can’t even win those arguments ;-) - the real issue is that there aren’t very many of those workloads, and the PostgreSQL community shouldn’t care: the vast majority of Oracle (and SQL Server etc) workloads don’t need all the fancy high-end RAC capabilities, or many of the other high-end commercial database capabilities. And those workloads can relatively easily be migrated to PostgreSQL, with minor disruption / change to schemas, data, triggers, constraints, procedural SQL…
>
>What I've seen so far suggests that if MS is positioning SQL Server to
>challenge Oracle, it's basically looking for low-hanging fruit: in
>particular supplementary databases which corporates have put onto Oracle
>out of habit but which quite simply don't need some of the higher-end
>facilities for which Oracle is harvesting revenue.
>
>Just because a corporate has a hundred sites cooperating for inventory
>management doesn't mean that the canteen menus have to be stored on
>Oracle RAC :-)

Right, but often the customer has paid for a site license, in which case the IT department will just keep spinning up more Oracle (or SQL Server or DB2) databases when requests come in - even if it’s overkill for the proposed use case / workload, it’s less work if IT only has one database technology to support.

For all kinds of often cloud-y reasons, there have been recent stories in the press of many enterprise customers not renewing their site licenses, in favor of cherry-picking their biggest / hardest workloads for the commercial databases, and then moving the rest to open source, often though not always to PostgreSQL, and often in the cloud.

On Fri, Mar 25, 2016 at 4:15 PM, Jernigan, Kevin <kmj@amazon.com> wrote:
> On 3/25/16, 4:37 AM, "pgsql-general-owner@postgresql.org on behalf of Mark Morgan Lloyd" <pgsql-general-owner@postgresql.org on behalf of markMLl.pgsql-general@telemetry.co.uk> wrote:

>> Just because a corporate has a hundred sites cooperating for inventory
>> management doesn't mean that the canteen menus have to be stored on
>> Oracle RAC :-)
>
> Right, but often the customer has paid for a site license, in
> which case the IT department will just keep spinning up more Oracle
> (or SQL Server or DB2) databases when requests come in - even if
> it’s overkill for the proposed use case / workload, it’s less work
> if IT only has one database technology to support.

I worked with one company that was running just about everything on RAC, and thought that would be a barrier to moving from Oracle. When we talked about how and why they were using RAC, it turned out they were basically just using it for elastic resource allocation -- they were always bringing up new applications and never really knew which ones would grow to need lots of resources and which would fail to catch on and would wither away. With that perspective on it, they realized that VMs and containers handled that need better than RAC, and an apparently large obstacle to moving away from Oracle just fell away.

There is always some inertia involved in a change like that, even with a move to a technology that serves the company better in the long term; but if they can start down that road they are likely to find the desire to eliminate different ways to do the same thing a reason to move away from RAC or similar "lock in" technologies.

--
Kevin Grittner
EDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

Jernigan, Kevin wrote:
> On 3/25/16, 4:37 AM, "pgsql-general-owner@postgresql.org on behalf of Mark Morgan Lloyd" <pgsql-general-owner@postgresql.org on behalf of markMLl.pgsql-general@telemetry.co.uk> wrote:

>> Just because a corporate has a hundred sites cooperating for inventory
>> management doesn't mean that the canteen menus have to be stored on
>> Oracle RAC :-)
>>
> Right, but often the customer has paid for a site license, in which case the IT department will just keep spinning up more Oracle (or SQL Server or DB2) databases when requests come in - even if it’s overkill for the proposed use case / workload, it’s less work if IT only has one database technology to support.

OTOH, if the license takes the number of CPUs/cores into account then adding even unsophisticated unrelated databases will, eventually, cost.

--
Mark Morgan Lloyd
markMLl .AT. telemetry.co .DOT. uk

[Opinions above are the author's, not those of his employers or colleagues]

Jernigan, Kevin wrote:
>On 3/24/16, 3:09 PM, "Albe Laurenz" <laurenz.albe@wien.gv.at> wrote:
>>> Disk is only a single point of failure in RAC if you configure non-redundant storage.
>>> In general, Oracle recommends triple mirroring to protect against disk failures,
>>> as they have had many experiences over the years where customers with mirrored disks
>>> would see consecutive disk failures within short periods of time.
>>
>>The single point of failure in Oracle RAC is the ASM file system.
>
> Only if you misconfigure ASM for RAC: with RAC, an ASM instance will run on every RAC node,
> and if the ASM instance fails on any one node, the RAC instance on that node will go down,
> but the RAC instances on the other nodes will continue to run - so the database will remain
> accessible, though with fewer processors available.
>
> If you configure ASM to implement at least dual mirroring for storage - and I’m pretty sure
> Oracle intentionally makes it hard to configure ASM without mirroring - then ASM will continue
> to run through any single disk failure.

I think you missed my point.

I am not talking about disk failure, but about some failure (possibly a software bug or a combination of hardware problem and software weakness) that causes the on-disk data to be corrupted. File system corruption.

Mirroring will only mirror such a corruption, and multiple ASM instances that all access the same corrupted data won't help either.

Of course Oracle says that ASM is so simple and bullet-proof that this cannot happen, but claiming that something cannot fail is not good enough.

RAC is a shared storage system, and that shared storage is a single point of failure.

Yours,
Laurenz Albe