Re: Subject: bool / vacuum full bug followup part 2 - Mailing list pgsql-general
From | Scott Marlowe |
---|---|
Subject | Re: Subject: bool / vacuum full bug followup part 2 |
Date | |
Msg-id | Pine.LNX.4.33.0205031702200.1471-100000@css120.ihs.com Whole thread Raw |
In response to | Re: Subject: bool / vacuum full bug followup part 2 (Tom Lane <tgl@sss.pgh.pa.us>) |
Responses |
Re: Subject: bool / vacuum full bug followup part 2
|
List | pgsql-general |
On Fri, 3 May 2002, Tom Lane wrote: > Scott Marlowe <scott.marlowe@ihs.com> writes: > > Well, my keys aren't changing and the index is growing like they are. > > Could we see the exact details of your test case? Sure. I think I posted most of it here already... Here's my table: scott.marlowe=# \d test Table "test" Column | Type | Modifiers --------+---------+----------- buf | text | yn | boolean | id | integer | Indexes: test_id_dx, test_yn_dx Here's the indexes: scott.marlowe=# \di test_id_dx List of relations Name | Type | Owner ------------+-------+--------------- test_id_dx | index | scott.marlowe scott.marlowe=# \di test_yn_dx List of relations Name | Type | Owner ------------+-------+--------------- test_yn_dx | index | scott.marlowe (1 row) Here's the php script that creates the test data, it makes $count number of rows and sets the bool to true or false randomly: set_time_limit(3600); $conn = pg_connect("dbname=scott.marlowe host=css120.ihs.com"); $count = 100000; pg_exec($conn,"begin"); $flag[0]="false"; $flag[1]="true"; for ($i=0;$i<$count;$i++){ if ($i%1000==0) { pg_exec($conn,"end"); pg_exec($conn,"begin"); } $letter = chr(rand(65,91)); $tf = rand(0,1); $query = "insert into test (buf,yn) values ('"; $query.= $letter."',".$flag[$tf].")"; pg_exec($conn,$query); } pg_exec($conn,"end"); Here's the files in my database directory, and their size by du in kbytes after vacuum full; vacuum; reindex index test_id_dx; reindex index test_yn_dx; [postgres@css120 16556]$ oid2name -d scott.marlowe All tables from database "scott.marlowe": --------------------------------- 126572 = accounts 126574 = accounts_pkey 126566 = branches 126568 = branches_pkey 126575 = history 126569 = tellers 126571 = tellers_pkey 16557 = test 1126687 = test_id_dx 1126688 = test_yn_dx [postgres@css120 16556]$ du -s 16557 1126687 1126688 11448 16557 (test) 1772 1126687 (test_id_dx) 1772 1126688 (test_yn_dx) WHAT I DID: scott.marlowe=# update test set yn=true where yn=true; UPDATE 50080 [postgres@css120 16556]$ du -s 16557 1126687 1126688 17176 16557 3516 1126687 2924 1126688 scott.marlowe=# vacuum; VACUUM [postgres@css120 16556]$ du -s 16557 1126687 1126688 17176 16557 3516 1126687 2924 1126688 scott.marlowe=# vacuum full; VACUUM [postgres@css120 16556]$ du -s 16557 1126687 1126688 11448 16557 3516 1126687 4052 1126688 <-- Notice that the index here just GREW scott.marlowe=# update test set yn=true where yn=true; scott.marlowe=# update test set yn=true where yn=true; scott.marlowe=# update test set yn=true where yn=true; scott.marlowe=# update test set yn=true where yn=true; scott.marlowe=# update test set yn=true where yn=true; scott.marlowe=# update test set yn=true where yn=true; scott.marlowe=# update test set yn=true where yn=true; scott.marlowe=# update test set yn=true where yn=true; scott.marlowe=# update test set yn=true where yn=true; scott.marlowe=# update test set yn=true where yn=true; [postgres@css120 16556]$ du -s 16557 1126687 1126688 68744 16557 13980 1126687 15660 1126688 scott.marlowe=# vacuum; VACUUM [postgres@css120 16556]$ du -s 16557 1126687 1126688 68736 16557 13964 1126687 15652 1126688 scott.marlowe=# vacuum full; VACUUM [postgres@css120 16556]$ du -s 16557 1126687 1126688 11448 16557 13964 1126687 16808 1126688 *************** So, now thinking the problem might be just vacuum full, I try plain old vacuums *************** scott.marlowe=# update test set yn=true where yn=true; scott.marlowe=# update test set yn=true where yn=true; scott.marlowe=# vacuum; VACUUM [postgres@css120 16556]$ du -s 16557 1126687 1126688 22908 16557 13964 1126687 20088 1126688 scott.marlowe=# update test set yn=true where yn=true; scott.marlowe=# update test set yn=true where yn=true; scott.marlowe=# vacuum; VACUUM [postgres@css120 16556]$ du -s 16557 1126687 1126688 22908 16557 13964 1126687 22380 1126688 ***************************** Nope, the index on the bools just keeps growing and growing. Given a few million updates and it will be bigger than the data it is supposed to index. scott.marlowe=# reindex index test_yn_dx; REINDEX scott.marlowe=# reindex index test_id_dx; REINDEX oid2name -d scott.marlowe |grep dx 1126690 = test_id_dx 1126689 = test_yn_dx [postgres@css120 16556]$ du -s 16557 1126690 1126689 22908 16557 1772 1126690 1772 1126689 and now they're small again. It would at least be nice if reindex was operational in a transaction so it would be safe to use on a live database, since it appears to not be intended for this purpose really, but for fixing broken indexes. Til then I'll write a script that asks pg_indexes that drops the index and recreates it in a transaction to keep my data store svelt and clean. Thanks for the attention to this. Scott.
pgsql-general by date: