Re: Performance problem with query - Mailing list pgsql-general
From | Christian Rengstl |
---|---|
Subject | Re: Performance problem with query |
Date | |
Msg-id | 44BE3BB1.0AD0.0080.0@klinik.uni-regensburg.de Whole thread Raw |
In response to | Re: Performance problem with query (Q <qdolan@gmail.com>) |
Responses |
Re: Performance problem with query
|
List | pgsql-general |
So here's the master table including the rules: entry_no int8 NOT NULL DEFAULT nextval('public.master_seq'::regclass), pid varchar(15) NOT NULL, val_1 varchar(1), val_2 varchar(1), chr int2 NOT NULL, aendat timestamp DEFAULT now(), aennam varchar(8), CONSTRAINT "PK_ENTRY" PRIMARY KEY (entry_no), CONSTRAINT "UNIQUE_MASTER" UNIQUE (pid, entry_no) CREATE OR REPLACE RULE "INSERT_INTO_1" AS ON INSERT TO public.master WHERE new.chr = 1 DO INSTEAD INSERT INTO public.table_1 (entry_no, pid, val_1, val_2, chr, aendat, aennam) VALUES (new.entry_no, new.pid, new.val_1, new.val_2, new.chr, new.aendat, new.aennam); Like this i have around 20 rules so far, but there might be more later on. The children tables are so far exactly as themaster table. Q <qdolan@gmail.com> wrote on 07/19/06 1:52 pm: > On 19/07/2006, at 8:49 PM, Christian Rengstl wrote: > >> Obviously it had something to do with the rule, because now >> everything finished within 20 minutes. the problem is just that i >> don't really want to give up the inheritance design. is there a way >> to maintain the inheritance that doesn't cause this huge >> performance problem? > > That is hard to say unless you post the rule and table schema you are > currently using. > >> Q <qdolan@gmail.com> wrote on 07/19/06 11:54 am: >>> On 19/07/2006, at 6:32 PM, Christian Rengstl wrote: >>> >>>> The analyze is from the exact query and i dropped the indexes >>>> before the insert as well without imrpvement. The target table is >>>> as well completely empty and the insert is supposed to write, in >>>> this case, more or less 8 million lines in the table. There is a >>>> rule though, because i have inheritance table structure with one >>>> master table and around 20 child tables. >>> >>> I would say the problem is in the rule. Try doing the insert into a >>> duplicate table with no rules or inheritance and see how long it >>> takes. >>> >>> Perhaps you should provide the actual schema of tables and rules that >>> are involved in the query in question. >>> >>>> Q <qdolan@gmail.com> wrote on 07/19/06 4:37 am: >>>>> >>>>> On 19/07/2006, at 4:24 AM, Christian Rengstl wrote: >>>>> >>>>>> now finally after a long time i have the query plan for the whole >>>>>> filled table. I hope somebody can explain me why it takes so much >>>>>> longer... >>>>> >>>>> >>>>> These explain analyze results don't appear to be from the >>>>> queries you >>>>> posted previously. For these results to mean anything you need to >>>>> include the EXACT queries you used to generate them. >>>>> >>>>>> Result (cost=0.06..254643.06 rows=8044000 width=39) (actual >>>>>> time=0.056..655772 >>>>>> .273 rows=8044000 loops=1) >>>>>> One-Time Filter: ((split_part(($1)::text, '_'::text, >>>>>> 2))::smallint = 1) >>>>>> InitPlan >>>>>> -> Limit (cost=0.00..0.02 rows=1 width=10) (actual >>>>>> time=0.003..0.003 rows >>>>>> =1 loops=1) >>>>>> -> Seq Scan on temp_table (cost=0.00..194313.00 >>>>>> rows=8044000 width= >>>>>> 10) (actual time=0.002..0.002 rows=1 loops=1) >>>>>> -> Limit (cost=0.00..0.02 rows=1 width=10) (actual >>>>>> time=0.006..0.007 rows >>>>>> =1 loops=1) >>>>>> -> Seq Scan on temp_table (cost=0.00..194313.00 >>>>>> rows=8044000 width= >>>>>> 10) (actual time=0.004..0.004 rows=1 loops=1) >>>>>> -> Seq Scan on temp_table (cost=0.00..194313.00 rows=8044000 >>>>>> width=39) (act >>>>>> ual time=0.002..191672.344 rows=8044000 loops=1) >>>>>> Total runtime: 62259544.896 ms >>>>> >>>>> >>>>> This is the query you want to be interested in, the others took no >>>>> time at all. >>>>> >>>>> As a guess I would say the query is an INSERT INTO ... FROM >>>>> SELECT ... WHERE (split_part(???, '_', 2))::smallint = 1 statement. >>>>> The majority of the time appears to be taken up on the actual >>>>> INSERT >>>>> and not the SELECT part. >>>>> >>>>> How many rows are in the target table and what indexes does it >>>>> have? >>>>> Does it have any triggers, check constraints, or rules applied >>>>> to it? >>>>> All these things can make the insert take longer as the number of >>>>> rows you have already in the table increases. >>>>> >>>>> More than likely you have a target table with a LOT of rows and a >>>>> bunch of indexes on it and your disks are being thrashed because >>>>> the >>>>> indexes are not able to stay cached in RAM. At this point you >>>>> should >>>>> ensure your machine is not swapping do disk, and at the very least >>>>> you should go through one of the many tuning guidelines >>>>> available and >>>>> ensure you have allocated the appropriate amount of memory to >>>>> postgresql for your needs. >>>>> >>>>> You may also want to consider dropping the indexes before you do >>>>> the >>>>> INSERT and recreate them afterwards. >>>>> >>>>> >>>>> >>>>>>>>> "Christian Rengstl" <Christian.Rengstl@klinik.uni- >>>>>>>>> regensburg.de> >>>>>>>>> 13.07.06 8.37 Uhr >>> >>>>>> Good morning list, >>>>>> >>>>>> the following query takes about 15 to 20 minutes for around 2 >>>>>> million lines in the file myfile.txt, but with 8 million lines it >>>>>> takes around 5 hours and i just don't understand why there is such >>>>>> a huge discrepancy in performance. >>>>>> >>>>>> COPY public.temp_table FROM 'myfile.txt' WITH DELIMITER '\t'; >>>>>> >>>>>> INSERT INTO public.master(pid,smid, val1, val2, chr) >>>>>> SELECT pid, smid, val1, val12, CAST(split_part((SELECT >>>>>> chr from public.temp_table LIMIT 1), '_', 2) as int2) >>>>>> FROM public.temp_table; >>>>>> >>>>>> INSERT INTO public.values(smid, pos, chr) >>>>>> SELECT DISTINCT smid, pos, CAST(split_part((SELECT chr >>>>>> from >>>>>> public.temp_table LIMIT 1), '_', 2) as int2) >>>>>> FROM public.temp_table; >>>>>> >>>>>> I came up with this query, because i wanted to use the COPY >>>>>> command >>>>>> to load huge files into the db, but i don't want to load all the >>>>>> columns contained in the file in only one table but copy some of >>>>>> them into one table and some in a second table. As i found out >>>>>> with >>>>>> logging, the data is loaded into temp_table within 15 minutes, but >>>>>> to transfer it from the temp_table toagain only something like 10 >>>>>> minutes. Can it be that the cast takes up so much more time than >>>>>> when reading and transferring 2 million lines? >>>>> >>>>> >>>>> -- >>>>> Seeya...Q >>>>> >>>>> -=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=- >>>>> >>>>> _____ / Quinton Dolan - >>>>> qdolan@gmail.com >>>>> __ __/ / / __/ / / >>>>> / __ / _/ / / Gold Coast, QLD, Australia >>>>> __/ __/ __/ ____/ / - / Ph: +61 419 729 806 >>>>> _______ / >>>>> _\ >>>>> >>>>> >>>>> >>>>> >>>>> ---------------------------(end of >>>>> broadcast)--------------------------- >>>>> TIP 2: Don't 'kill -9' the postmaster >>>> >>>> >>>> -- >>>> Christian Rengstl M.A. >>>> Klinik und Poliklinik für Innere Medizin II >>>> Kardiologie - Forschung >>>> Universitätsklinikum Regensburg >>>> B3 1.388 >>>> Franz-Josef-Strauss-Allee 11 >>>> 93053 Regensburg >>>> Tel.: +49-941-944-7230 >>> >>> >>> -- >>> Seeya...Q >>> >>> -=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=- >>> >>> _____ / Quinton Dolan - qdolan@gmail.com >>> __ __/ / / __/ / / >>> / __ / _/ / / Gold Coast, QLD, Australia >>> __/ __/ __/ ____/ / - / Ph: +61 419 729 806 >>> _______ / >>> _\ >>> >>> >> >> >> -- >> Christian Rengstl M.A. >> Klinik und Poliklinik für Innere Medizin II >> Kardiologie - Forschung >> Universitätsklinikum Regensburg >> B3 1.388 >> Franz-Josef-Strauss-Allee 11 >> 93053 Regensburg >> Tel.: +49-941-944-7230 > > > -- > Seeya...Q > > -=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=- > > _____ / Quinton Dolan - qdolan@gmail.com > __ __/ / / __/ / / > / __ / _/ / / Gold Coast, QLD, Australia > __/ __/ __/ ____/ / - / Ph: +61 419 729 806 > _______ / > _\ > > -- Christian Rengstl M.A. Klinik und Poliklinik für Innere Medizin II Kardiologie - Forschung Universitätsklinikum Regensburg B3 1.388 Franz-Josef-Strauss-Allee 11 93053 Regensburg Tel.: +49-941-944-7230
pgsql-general by date: