Re: Hardware requirements for a PostGIS server - Mailing list pgsql-general
From | Mathieu Basille |
---|---|
Subject | Re: Hardware requirements for a PostGIS server |
Date | |
Msg-id | 54DBE7F2.4010202@ase-research.org Whole thread Raw |
In response to | Hardware requirements for a PostGIS server (Mathieu Basille <basille.web@ase-research.org>) |
Responses |
Re: Hardware requirements for a PostGIS server
Re: Hardware requirements for a PostGIS server Re: Hardware requirements for a PostGIS server |
List | pgsql-general |
Thanks to everyone who contributed to this thread, either on the PostGIS [1] or the PostgreSQL [2] mailing lists. I will try to summarize everything in this message, which I will actually post on both lists to give an update to everyone. I hope it can be useful for other people interested. Please feel free to add more advice and other experiences, this is always useful! Performance =========== * CPU Good CPU required for faster processing. Number of cores helps in parallel processing, but number of users != number of queries (Example: with no more than 4 concurrent users, it should be fine with a single quad-core CPU). * Memory Examples go from 8 to >32 GB RAM. * Disks Lots of I/0 with geoprocessing requires fast disks: best with SSD, otherwise 10k/15k RPM. An alternative would be to store indexes on faster disks and data on slower disks (need to tune PostgreSQL). Better to have direct-attached storage (DAS), i.e. on the server directly (direct transfer between RAM and disks); external storage requires good network (additional RAM increases performance). * Massive multi-user environment (lot of simultaneous connections): pgpool [3] (Linux/UNIX only). pgpool can be added later on, no need to worry about it as a start. Platform ======== Linux is the platform of choice: * Easier administration (install/configuration/upgrade), which is also true for addons/dependencies (starting with PostGIS, but also GEOS, GDAL, PL/R); * Better performance [4]; * More tuning options (limited with MS systems); There is still the possibility of a virtualbox on a MS server. Other considerations ==================== * Backup: integrate a script that runs daily pg_dump daily to export and upload DB to storage box (which is part of the backup system) * Integration with R: a dedicated R server brings more flexibility / extensions (e.g. Shiny) / performance (more cores and memory available for PostGIS) except if data transfer is the bottleneck. Use Pl/R for small functions (also if it fits naturally into PostgreSQL workflow) / otherwise in R with PostgreSQL connector. Example setups ============== * Dell Precision 2×6 cores, 20 GB RAM, SSD for indexes, 7200 HDD for big tables [Rémi Cura]: Various usages, from visualization (few users) to complex queries with a lot of reading/writing (several users). * Bare metal machine with ESXI; PostgreSQL machine with 8Gb RAM; 2 quad-core processors; PostgreSQL tuned for fast reads, with large cache; pgPool; disks: 2 7.200RPM disks - with RAID 1 [George Silva]: 12 concurrent QGIS users, editing around 50.000 km² of land use coverage in 1:5000 scale with lot of detail (in two separate DB). More editing than processing; some heavy queries (e.g. "complete feature" tool from QGIS) can take some time. * Nominatim (OpenStreetMap data) [5]: > 1 GB RAM necessary, >32 GB recommended; 700 GB HDD; SSD recommended; example machine: 12-core with 32GB RAM and standard SATA disks, I/O limiting factor. Thanks again for the good feedback! This gives me very useful information to get started (I think this is still going to be a long process). Mathieu Basille. [1] http://lists.osgeo.org/pipermail/postgis-users/2015-February/040120.html [2] http://www.postgresql.org/message-id/54DAA7D9.8020908@ase-research.org [3] http://www.pgpool.net/ [4] https://stackoverflow.com/questions/8368924/postgresql-performance-on-windows-using-latest-versions [5] http://wiki.openstreetmap.org/wiki/Nominatim/Installation Le 10/02/2015 19:52, Mathieu Basille a écrit : > Dear PostgreSQL users, > > I am posting here a question that I initially asked on the PostGIS list > [1], where I was advised to try here too (I will keep both lists updated > about the developments on this issue). > > I am currently planning to set up a PostgreSQL + PostGIS instance for my > lab. Turns out I believe this would be useful for the whole center, so that > I'm now considering setting up the server for everyone—if interest is > shared of course. At the moment, I am however struggling with what would be > required in terms of hardware, and of course, the cost will depend on > that—at the end of the day, it's really a matter of money well spent. I > have then a series of questions/remarks, and I would welcome any feedback > from people with existing experience on setting up a multi-user PostGIS > server. I'm insisting on the PostGIS aspect, since the most heavy requests > will be GIS requests (intersections, spatial queries, etc.). However, > people with similar PostgreSQL setup may have very relevant comments about > their own configuration. > > * My own experience about servers is rather limited: I used PostGIS quite a > bit, but only on a desktop, with only 2 users. The desktop was quite good > (quad-core Xeon, 12 Go RAM, 500 GB hd), running Debian, and we never had > any performance issue (although some queries were rather long, but still > acceptable). > > * The use case I'm envisioning would be (at least in the foreseeable future): > - About 10 faculty users (which means potentially a little bit more > students using it); I would have hard time considering more than 4 > concurrent users; > - Data would primarily involve a lot (hundreds/thousands) of high > resolution (spatial and temporal) raster and vector maps, possibly over > large areas (Florida / USA / continental), as well as potentially millions > of GPS records (animals individually monitored); > - Queries will primarily involve retrieving points/maps over given > areas/time, as well as intersecting points over environmental layers [from > what I understand, a lot of I/O, with many intermediary tables involved]; > other use cases will involve working with steps, i.e. the straight line > segment connecting two successive locations, and intersecting them with > environmental layers; > > * I couldn't find comprehensive or detailed guidelines on-line about > hardware, but from what I could see, it seems that memory wouldn't be the > main issue, but the number of cores would be (one core per database > connection if I'm not mistaken). At the same time, we want to make sure > that the experience is smooth for everyone... I was advised on the PostGIS > list to give a look at pgpool (however, UNIX only). > > * Is there a difference in terms of possibilities, performance and > usability between a Linux-based and a MS-based server (from the user > perspective)? My center is unfortunately MS-centered, and existing > equipment runs with MS systems... It would thus be easier for them to set > up a MS-based server. Does it change anything for the user? (I insist on > the user perspective, since I and others will not admin the system, but > only use it) > > * Does anyone have worked with a server running the DB engine, while the DB > itself was stored on another box/server? That would likely be the case here > since we already have a dedicated box for file storage. Along these lines, > does the system of the file storage box matter (Linux vs. MS)? > > * We may also use the server as a workstation to streamline PostGIS > processing with further R analyses/modeling (or even use R from within the > database using PL/R). Again, does anyone have experience doing it? Is a > single workstation the recommended way to work with such workflow? Or would > it be better (but more costly) to have one server dedicated to PostGIS and > another one, with different specs, dedicated to analyses (R)? > > I realize my questions and comments may be a confusing, likely because of > the lack of experience about these issues on my side. I really welcome any > feedback of people working with PostgreSQL servers (+ PostGIS ideally!) in > a small unit, or any similar setting that could be informative! > > In advance, thank you very much! > > Sincerely, > Mathieu Basille. > > > [1] Start of the thread here: > http://lists.osgeo.org/pipermail/postgis-users/2015-February/040120.html > -- ~$ whoami Mathieu Basille http://ase-research.org/basille ~$ locate --details University of Florida \\ Fort Lauderdale Research and Education Center (+1) 954-577-6314 ~$ fortune « Le tout est de tout dire, et je manque de mots Et je manque de temps, et je manque d'audace. » -- Paul Éluard
pgsql-general by date: