Thread: Searching http://www.postgresql.org ...
Well, finally have a good portion of the web site now searchable...the
following "sections" of other web site is available:

Server http://www.postgresql.org/docs
Server http://www.postgresql.org/mhonarc/pgsql-admin
Server http://www.postgresql.org/mhonarc/pgsql-announce
Server http://www.postgresql.org/mhonarc/pgsql-bugs
Server http://www.postgresql.org/mhonarc/pgsql-docs
Server http://www.postgresql.org/mhonarc/pgsql-general
Server http://www.postgresql.org/mhonarc/pgsql-hackers
Server http://www.postgresql.org/mhonarc/pgsql-mirrors
Server http://www.postgresql.org/mhonarc/pgsql-novice
Server http://www.postgresql.org/mhonarc/pgsql-sql

On Tues/Weds of this week, we're having a 9.1gb drive installed that will
be dedicated to UdmSearch, so the rest of the site will be indexed at that
time...

For now, if you go to:

    http://www.postgresql.org/search.cgi

You can make use of the search engine...

Just to give you an idea of the size of the tables that are currently
loaded:

dict: 6856951 tuples in ~373Meg
url:    46416 tuples in ~ 23Meg

Marc G. Fournier ICQ#7615664 IRC Nick: Scrappy
Systems Administrator @ hub.org
primary: scrappy@hub.org secondary: scrappy@{freebsd|postgresql}.org
The Hermit Hacker wrote:
>
> Well, finally have a good portion of the web site now searchable...the
> following "sections" of other web site is available:

Having implemented a full-text index myself a few years ago, I tried some
more taxing queries (those that return most of the pages, and that could
(should) be banned by stop words or some smarter techniques (like
partitioning the search space by date or some other attribute first)).

the results

of          31220 matches  <10 sec
at          16002 matches  <5 sec
PostgreSQL  24807 matches  <10 sec
I           infinite, (maybe computer crash ?, www.postgresql.org also
            unreachable)

please don't sue me ;)

----------
Hannu
Hannu Krosing wrote:
>
> The Hermit Hacker wrote:
> >
> > Well, finally have a good portion of the web site now searchable...the
> > following "sections" of other web site is available:
>
> Having implemented a full-text index myself a few years ago, i tried some
> more taxing queries (those that return most of the pages, and that could
> (should) be banned by stop words or some smarter techniques (like
> partitioning the search space by date or some other attribute first)
>
> the results
>
> of 31220 matches <10 sec
> at 16002 matches <5 sec
> PostgreSQL 24807 matches <10 sec
> I infinite, (maybe computer crash ?, www.postgresql.org also
> unreachable)

Seems it survived (result after ~5 min) 35827 matches.

During the search www.postgresql.org was unreachable at least from my
computer.

I will stop testing for now.

----------------
Hannu
On Mon, 10 Jan 2000, Hannu Krosing wrote:

> The Hermit Hacker wrote:
> >
> > Well, finally have a good portion of the web site now searchable...the
> > following "sections" of other web site is available:
>
> Having implemented a full-text index myself a few years ago, i tried some
> more taxing queries (those that return most of the pages, and that could
> (should) be banned by stop words or some smarter techniques (like
> partitioning the search space by date or some other attribute first)
>
> the results
>
> of 31220 matches <10 sec
> at 16002 matches <5 sec
> PostgreSQL 24807 matches <10 sec
> I infinite, (maybe computer crash ?, www.postgresql.org also
> unreachable)
>
> please dont sue me ;)

You didn't crash anything, no worry...:)

As for stopwords, they do provide a mechanism for doing this, and, ummm, I
forgot to load it :) Loaded now, so it should reduce as it gets
recycled/expired...I don't want to start this all from scratch again :(

Marc G. Fournier ICQ#7615664 IRC Nick: Scrappy
Systems Administrator @ hub.org
primary: scrappy@hub.org secondary: scrappy@{freebsd|postgresql}.org
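[Editorial sketch, not part of the original messages: the stop-word idea
discussed above could look roughly like the following. This is only an
illustration under stated assumptions. The word list and the function name
filter_query are hypothetical, and this is not UdmSearch's actual mechanism
or configuration, which the thread does not show.]

```python
from typing import List

# Hypothetical stop-word list, chosen to cover the very common words
# tried in the thread ("of", "at", "I"). The list UdmSearch actually
# loads is not shown in the thread.
STOP_WORDS = frozenset({"a", "an", "and", "at", "i", "in", "of", "the", "to"})


def filter_query(query: str) -> List[str]:
    """Return the lowercased query terms that are not stop words.

    An empty result means the query consisted only of stop words, so the
    caller can reject it instead of scanning most of the index.
    """
    return [term for term in query.lower().split() if term not in STOP_WORDS]


if __name__ == "__main__":
    for q in ("of", "at", "I", "PostgreSQL", "index of PostgreSQL at hub"):
        terms = filter_query(q)
        if terms:
            print(f"{q!r}: search for {terms}")
        else:
            print(f"{q!r}: rejected, only stop words")
```

[With a filter like this in front of the index, queries such as "of" or "I"
would be turned away before touching the dict table, while "PostgreSQL"
would still be searched.]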
Hannu Krosing <hannu@tm.ee> writes:
>> the results
>> of 31220 matches <10 sec
>> at 16002 matches <5 sec
>> PostgreSQL 24807 matches <10 sec
>> I infinite, (maybe computer crash ?, www.postgresql.org also
>> unreachable)

> Seems it survived (result after ~5 min) 35827 matches.
> During the search www.postgresql.org was unreachable at least from my
> computer.

Sounds to me like you were seeing a transient network outage. The
35k-row query probably didn't take *that* much longer than the 31k-row
query --- but maybe www.postgresql.org's packets couldn't get to you
for awhile.

regards, tom lane