Thread: Postgres and Oracle differences and questions
I have recently been working on several projects that use MSAccess as a front end to both a Postgres and Oracle DB. Unfortunately, I have noticed some differences and limitations in Oracle that seem to make Postgres look like a better choice for the application. My problem is with the block sizing differences in PG and Oracle. It seems that when mapping memo types from Access to Postgres you could create a text type in Postgres that emulates the Access memo type, and you could define a table with any number of these types of fields. In Oracle you can only have one memo field mapped to a table with a max size equal I believe to the block size, and if you use the max block size then you cannot define any other fields in the same table. I am curious how Postgres handles text types, is a var char or does it allocate the full 8K for the text type. In Oracle, the size of the table definition has to be within the block boundary, is the same restriction true in Postgres. D. Johnson Spectra Data Systems Inc
D. Johnson wrote: > I have recently been working on several projects that use MSAccess as a > front end to both a Postgres and Oracle DB. Unfortunately, I have > noticed some differences and limitations in Oracle that seem to make > Postgres look like a better choice for the application. My problem is > with the block sizing differences in PG and Oracle. It seems that when > mapping memo types from Access to Postgres you could create a text type > in Postgres that emulates the Access memo type, and you could define a > table with any number of these types of fields. In Oracle you can only > have one memo field mapped to a table with a max size equal I believe to > the block size, and if you use the max block size then you cannot define > any other fields in the same table. > > I am curious how Postgres handles text types, is a var char or does it > allocate the full 8K for the text type. In Oracle, the size of the table > definition has to be within the block boundary, is the same restriction > true in Postgres. The size limitations (8K by default, 32K max) are gone with 7.1. Well, you shouldn't really use 100+ MB sized rows, because the resulting INSERT already needs to travel from the frontend, through the parser down into theexecutor. And on SELECT the client needs to buffer all the data at once in memory. But if you really need to do it, swap space is cheap... Jan -- #======================================================================# # It's easier to get forgiveness for being wrong than for being right. # # Let's break this rule - forgive me. # #================================================== JanWieck@Yahoo.com # _________________________________________________________ Do You Yahoo!? Get your free @yahoo.com address at http://mail.yahoo.com
Jan Wieck wrote: > D. Johnson wrote: > >> I have recently been working on several projects that use MSAccess as a >> front end to both a Postgres and Oracle DB. Unfortunately, I have >> noticed some differences and limitations in Oracle that seem to make >> Postgres look like a better choice for the application. My problem is >> with the block sizing differences in PG and Oracle. It seems that when >> mapping memo types from Access to Postgres you could create a text type >> in Postgres that emulates the Access memo type, and you could define a >> table with any number of these types of fields. In Oracle you can only >> have one memo field mapped to a table with a max size equal I believe to >> the block size, and if you use the max block size then you cannot define >> any other fields in the same table. >> >> I am curious how Postgres handles text types, is a var char or does it >> allocate the full 8K for the text type. In Oracle, the size of the table >> definition has to be within the block boundary, is the same restriction >> true in Postgres. > > > The size limitations (8K by default, 32K max) are gone with > 7.1. Well, you shouldn't really use 100+ MB sized rows, > because the resulting INSERT already needs to travel from the > frontend, through the parser down into the executor. And on > SELECT the client needs to buffer all the data at once in > memory. Well, you could roll your own client that streams parts of a query from file. Or you can INSERT the results of a user-defined function that reads from a file/socket/URL or just makes up the field contents ;) > But if you really need to do it, swap space is > cheap... What are the performance implications - is it 1.01, 10 or 1000 times slower than accessing the same file from fs for files in the range of typical Office documents (0.2-20M) ? Will updating one non-toasted field in a tuple copy the toasted one as well, or is only the reference copied ? ------------------ Hannu
Hannu Krosing <hannu@tm.ee> writes: > What are the performance implications - is it 1.01, 10 or 1000 times > slower than accessing the same file from fs for files in the range of > typical Office documents (0.2-20M) ? I think that's a meaningless question unless you specify a particular usage pattern --- but in any case, I doubt anyone has tried to characterize it that way. Feel free to run some experiments and tell us about your results... > Will updating one non-toasted field in a tuple copy the toasted one > as well, or is only the reference copied ? It copies the reference, see toast_insert_or_update in src/backend/access/heap/tuptoaster.c. Speaking of which, though, it looks like an update or insert will forcibly uncompress (and later recompress) a compressed-in-line datum, which seems like a waste of cycles to me. Jan, shouldn't the test for VARATT_IS_EXTENDED at line 357 instead read VARATT_IS_EXTERNAL? regards, tom lane
Tom Lane wrote: > > Speaking of which, though, it looks like an update or insert will > forcibly uncompress (and later recompress) a compressed-in-line datum, > which seems like a waste of cycles to me. Jan, shouldn't the test for > VARATT_IS_EXTENDED at line 357 instead read VARATT_IS_EXTERNAL? Not without some more logic added. We don't have any admin commands yet that can modify the toasters strategy on the attribute level, but the config attributes in the tuple descriptor are already there. So you can tell the toaster per attribute if it should try to compress or not, if it should try to keep the attribute in the main tuple harder and the like. You have to modify pg_attribute yourself for now, where we might want to have some ALTER TABLE, don't we? IIRC the above should only be invoked if you do something like INSERT ... SELECT, where the already toasted valueis coming from another tuple than the one you're actually creating/updating. Jan -- #======================================================================# # It's easier to get forgiveness for being wrong than for being right. # # Let's break this rule - forgive me. # #================================================== JanWieck@Yahoo.com # _________________________________________________________ Do You Yahoo!? Get your free @yahoo.com address at http://mail.yahoo.com
Jan Wieck <janwieck@Yahoo.com> writes: > Tom Lane wrote: >> Speaking of which, though, it looks like an update or insert will >> forcibly uncompress (and later recompress) a compressed-in-line datum, >> which seems like a waste of cycles to me. Jan, shouldn't the test for >> VARATT_IS_EXTENDED at line 357 instead read VARATT_IS_EXTERNAL? > Not without some more logic added. > We don't have any admin commands yet that can modify the > toasters strategy on the attribute level, but the config > attributes in the tuple descriptor are already there. So you > can tell the toaster per attribute if it should try to > compress or not, if it should try to keep the attribute in > the main tuple harder and the like. You have to modify > pg_attribute yourself for now, where we might want to have > some ALTER TABLE, don't we? What's your point here? I wouldn't think that changing the strategy for a column to "plain" should mean that already-stored values get uncompressed when they're not being modified. Someone who did expect that would probably want the ALTER TABLE command to go through and redo the representation of each row immediately, anyway. ISTM what's really at stake is simply how fast does a strategy change propagate to the individual rows of a table. Given that the strategy values are mostly just hints anyway, it's not clear to me why you insist that changing "x" to "p" must cause decompression at the first touch of a row containing a value, and not either earlier (immediately upon strategy-altering command) or later (when the value in question is actually replaced with something different). > IIRC the above should only be invoked if you do something > like INSERT ... SELECT, where the already toasted value is > coming from another tuple than the one you're actually > creating/updating. No, the problem comes up in a plain UPDATE, if it's altering other fields in the same row. Look again at the code: the comment claims that the UPDATE case has been taken care of above, but that is true only for an externally stored value. So a compressed-in-line field that has been copied from the old tuple will be uncompressed and (presumably) recompressed by the current logic. I say that's silly; we should not pay a performance penalty that large just to have a very subtly different speed of response to strategy-altering commands that don't exist yet. regards, tom lane
Tom Lane wrote: > Jan Wieck <janwieck@Yahoo.com> writes: > > Tom Lane wrote: > >> Speaking of which, though, it looks like an update or insert will > >> forcibly uncompress (and later recompress) a compressed-in-line datum, > >> which seems like a waste of cycles to me. Jan, shouldn't the test for > >> VARATT_IS_EXTENDED at line 357 instead read VARATT_IS_EXTERNAL? > > > Not without some more logic added. > > We don't have any admin commands yet that can modify the > > toasters strategy on the attribute level, but the config > > attributes in the tuple descriptor are already there. So you > > can tell the toaster per attribute if it should try to > > compress or not, if it should try to keep the attribute in > > the main tuple harder and the like. You have to modify > > pg_attribute yourself for now, where we might want to have > > some ALTER TABLE, don't we? > > What's your point here? I wouldn't think that changing the strategy > for a column to "plain" should mean that already-stored values get > uncompressed when they're not being modified. Someone who did expect > that would probably want the ALTER TABLE command to go through and > redo the representation of each row immediately, anyway. > > ISTM what's really at stake is simply how fast does a strategy change > propagate to the individual rows of a table. Given that the strategy > values are mostly just hints anyway, it's not clear to me why you > insist that changing "x" to "p" must cause decompression at the first > touch of a row containing a value, and not either earlier (immediately > upon strategy-altering command) or later (when the value in question > is actually replaced with something different). > > > IIRC the above should only be invoked if you do something > > like INSERT ... SELECT, where the already toasted value is > > coming from another tuple than the one you're actually > > creating/updating. > > No, the problem comes up in a plain UPDATE, if it's altering other > fields in the same row. Look again at the code: the comment claims > that the UPDATE case has been taken care of above, but that is true > only for an externally stored value. So a compressed-in-line field > that has been copied from the old tuple will be uncompressed and > (presumably) recompressed by the current logic. I say that's silly; > we should not pay a performance penalty that large just to have a very > subtly different speed of response to strategy-altering commands that > don't exist yet. It does? Uh - you're right. I wouldn't want to change it now, but ASAP in the 7.2 cycle. Bruce, please add * Don't decompress untouched toast values on UPDATE to TODO. Jan -- #======================================================================# # It's easier to get forgiveness for being wrong than for being right. # # Let's break this rule - forgive me. # #================================================== JanWieck@Yahoo.com # _________________________________________________________ Do You Yahoo!? Get your free @yahoo.com address at http://mail.yahoo.com
Added to TODO. > It does? > > Uh - you're right. I wouldn't want to change it now, but ASAP > in the 7.2 cycle. Bruce, please add > > * Don't decompress untouched toast values on UPDATE > > to TODO. > > > Jan > > -- > > #======================================================================# > # It's easier to get forgiveness for being wrong than for being right. # > # Let's break this rule - forgive me. # > #================================================== JanWieck@Yahoo.com # > > > > _________________________________________________________ > Do You Yahoo!? > Get your free @yahoo.com address at http://mail.yahoo.com > > -- Bruce Momjian | http://candle.pha.pa.us pgman@candle.pha.pa.us | (610) 853-3000+ If your life is a hard drive, | 830 Blythe Avenue + Christ can be your backup. | Drexel Hill, Pennsylvania19026
Jan Wieck <janwieck@yahoo.com> writes: > Uh - you're right. I wouldn't want to change it now, but ASAP > in the 7.2 cycle. That seems overly conservative. It's a one-line, easily tested change... regards, tom lane