Re: Query on indexed table too slow - Mailing list pgsql-novice
From | Fábio Moreira |
---|---|
Subject | Re: Query on indexed table too slow |
Date | |
Msg-id | CANQddpOTBZz+HJm0ttugYPAMKmyro6OA+YWNe0XrhYkvydN9Sg@mail.gmail.com Whole thread Raw |
In response to | Query on indexed table too slow (Shmagi Kavtaradze <kavtaradze.s@gmail.com>) |
Responses |
Re: Query on indexed table too slow
|
List | pgsql-novice |
Hi Shmagi,
Your table is tiny -- tiny enough that the actual working set (the rows of chunks2, without the "doc" field) easily fits in memory; you're unlikely to get a large improvement here, specially considering how, by the looks of the query planner, you have relatively few distinct chunkid (is that so? can you post SELECT COUNT(DISTINCT chunkid) FROM chunks2?).Length(Replace(Cast(i::BIT(10) AS TEXT), '0', '')) AS popcount
On Tue, Mar 8, 2016 at 2:12 AM, Shmagi Kavtaradze <kavtaradze.s@gmail.com> wrote:
I have a table with 46230 rows(1 doc, 4623 sentences and 10 chunks for each sentence):create table chunks(doc varchar,sentenceid int,chunkid int,chunk bit(10));With the query I want to compare sentence chunks with other sentence chunks that have same chunkid:SELECTa.sentenceid,b.sentenceid, a.chunkid,Length(Replace(Cast(a.chunk & b.chunk AS TEXT), '0', ''))::float / Length(a.chunk)::floatFROM chunks2 aINNER JOIN chunks2 bON a.sentenceid < b.sentenceid and a.chunkid = b.chunkid;I ran explain analyze on unindexed table, composite index and both indexed separately,but time is the same for all:Indexed on (sentenceid, chunkid):Hash Join (cost=1335.17..4549476.28 rows=71249559 width=26) (actual time=144.376..1156178.110 rows=106837530 loops=1)Hash Cond: (a.chunkid = b.chunkid)Join Filter: (a.sentenceid < b.sentenceid)Rows Removed by Join Filter: 106883760-> Seq Scan on chunks2 a (cost=0.00..757.30 rows=46230 width=15) (actual time=0.039..77.275 rows=46230 loops=1)-> Hash (cost=757.30..757.30 rows=46230 width=15) (actual time=142.954..142.954 rows=46230 loops=1)Buckets: 65536 Batches: 1 Memory Usage: 2680kB-> Seq Scan on chunks2 b (cost=0.00..757.30 rows=46230 width=15) (actual time=0.031..64.340 rows=46230 loops=1)Planning time: 1.209 msExecution time: 1212779.012 msI know they have the same operations and no index was used. Where is my mistake and how to speed up query with index? Or how to use indexes efficiently in my case?
--
Fábio Dias Moreira
pgsql-novice by date: