Thread: horo(r)logy test fail on solaris (again and solved)
I tried the regression tests with the Postgres beta and the horology test failed. See the attached log. It appeared a few months ago, see
http://archives.postgresql.org/pgsql-ports/2006-06/msg00004.php

I used Sun Studio 11 with the -fast flag on the SPARC platform. I played a little with the cc flags, and the following flags work fine for me:

export CFLAGS="-fast"
export LDFLAGS="-lm -fast"

The -fast switch for the linker is very important too, because it links the "fast" library. Could anybody confirm that it works on their machine?

But the question is whether the "-fast" flag is good for postgres. The -fast flag sets "brutal" floating point optimization and some operations may have less precision. Is it possible to verify that floating point operations work well? I read the postgres documentation about floating point datatypes, which says the implementation is platform specific. Developers must take care of these discrepancies, but is there any other part of the postgres code where the "-fast" switch could cause a computing defect, that is, where the result must be platform independent?

The cc flags are described at http://docs.sun.com/source/819-3688/cc_ops.app.html.

Zdenek

*** ./expected/horology.out Tue Jul 25 05:51:22 2006
--- ./results/horology.out Tue Sep 26 14:19:10 2006
***************
*** 2466,2472 ****
  SELECT '' AS ten, f1 AS interval, reltime(f1) AS reltime
     FROM INTERVAL_TBL;
   ten |           interval            |            reltime
! -----+-------------------------------+-------------------------------
       | @ 1 min                       | @ 1 min
       | @ 5 hours                     | @ 5 hours
       | @ 10 days                     | @ 10 days
--- 2466,2472 ----
  SELECT '' AS ten, f1 AS interval, reltime(f1) AS reltime
     FROM INTERVAL_TBL;
   ten |           interval            |             reltime
! -----+-------------------------------+----------------------------------
       | @ 1 min                       | @ 1 min
       | @ 5 hours                     | @ 5 hours
       | @ 10 days                     | @ 10 days
***************
*** 2474,2480 ****
       | @ 3 mons                      | @ 3 mons
       | @ 14 secs ago                 | @ 14 secs ago
       | @ 1 day 2 hours 3 mins 4 secs | @ 1 day 2 hours 3 mins 4 secs
!      | @ 6 years                     | @ 6 years
       | @ 5 mons                      | @ 5 mons
       | @ 5 mons 12 hours             | @ 5 mons 12 hours
  (10 rows)
--- 2474,2480 ----
       | @ 3 mons                      | @ 3 mons
       | @ 14 secs ago                 | @ 14 secs ago
       | @ 1 day 2 hours 3 mins 4 secs | @ 1 day 2 hours 3 mins 4 secs
!      | @ 6 years                     | @ 5 years 12 mons 5 days 6 hours
       | @ 5 mons                      | @ 5 mons
       | @ 5 mons 12 hours             | @ 5 mons 12 hours
  (10 rows)

======================================================================

parallel group (13 tests): text varchar name char boolean oid int8 int4 int2 float4 float8 bit numeric
     boolean ... ok
     char ... ok
     name ... ok
     varchar ... ok
     text ... ok
     int2 ... ok
     int4 ... ok
     int8 ... ok
     oid ... ok
     float4 ... ok
     float8 ... ok
     bit ... ok
     numeric ... ok
test strings ... ok
test numerology ... ok
parallel group (20 tests): lseg point box comments abstime reltime timetz circle time polygon tinterval inet path interval timestamp date timestamptz type_sanity oidjoins opr_sanity
     point ... ok
     lseg ... ok
     box ... ok
     path ... ok
     polygon ... ok
     circle ... ok
     date ... ok
     time ... ok
     timetz ... ok
     timestamp ... ok
     timestamptz ... ok
     interval ... ok
     abstime ... ok
     reltime ... ok
     tinterval ... ok
     inet ... ok
     comments ... ok
     oidjoins ... ok
     type_sanity ... ok
     opr_sanity ... ok
test geometry ... ok
test horology ... FAILED
test insert ... ok
test create_function_1 ... ok
test create_type ... ok
test create_table ... ok
test create_function_2 ... ok
parallel group (2 tests): copyselect copy
     copy ... ok
     copyselect ... ok
parallel group (8 tests): create_aggregate constraints create_operator drop_if_exists triggers vacuum create_misc inherit
     constraints ... ok
     triggers ... ok
     create_misc ... ok
     create_aggregate ... ok
     create_operator ... ok
     inherit ... ok
     vacuum ... ok
     drop_if_exists ... ok
parallel group (2 tests): create_view create_index
     create_index ... ok
     create_view ... ok
test sanity_check ... ok
test errors ... ok
test select ... ok
parallel group (20 tests): select_implicit select_distinct_on select_distinct select_into case update random namespace delete select_having btree_index union hash_index aggregates transactions join arrays portals subselect prepared_xacts
     select_into ... ok
     select_distinct ... ok
     select_distinct_on ... ok
     select_implicit ... ok
     select_having ... ok
     subselect ... ok
     union ... ok
     case ... ok
     join ... ok
     aggregates ... ok
     transactions ... ok
     random ... ok
     portals ... ok
     arrays ... ok
     btree_index ... ok
     hash_index ... ok
     update ... ok
     namespace ... ok
     prepared_xacts ... ok
     delete ... ok
test privileges ... ok
test misc ... ok
parallel group (7 tests): select_views portals_p2 guc cluster dependency rules foreign_key
     select_views ... ok
     portals_p2 ... ok
     rules ... ok
     foreign_key ... ok
     cluster ... ok
     dependency ... ok
     guc ... ok
parallel group (15 tests): limit rangefuncs temp copy2 polymorphism prepare conversion without_oid returning sequence truncate domain alter_table plpgsql rowtypes
     limit ... ok
     plpgsql ... ok
     copy2 ... ok
     temp ... ok
     domain ... ok
     rangefuncs ... ok
     prepare ... ok
     without_oid ... ok
     conversion ... ok
     truncate ... ok
     alter_table ... ok
     sequence ... ok
     polymorphism ... ok
     rowtypes ... ok
     returning ... ok
test stats ... ok
test tablespace ... ok
Zdenek Kotala <Zdenek.Kotala@Sun.COM> writes:
> But the question is whether the "-fast" flag is good for postgres. The -fast
> flag sets "brutal" floating point optimization and some operations may
> have less precision. Is it possible to verify that floating point
> operations work well?

That's a pretty good way to guarantee that you'll break the datetime code.

It might be acceptable if you use --enable-integer-datetimes.

regards, tom lane
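For context, a build configured with --enable-integer-datetimes keeps time values as 64-bit integer microseconds instead of double-precision seconds, so its datetime arithmetic does not depend on floating point rounding. The Python sketch below only illustrates that difference. The helper names are hypothetical and it does not reproduce any PostgreSQL code.

```python
# Illustrative only: float-second versus integer-microsecond arithmetic.
# The function names are hypothetical, not PostgreSQL identifiers.


def add_float_seconds(a: float, b: float) -> float:
    """Add two durations held as floating point seconds."""
    return a + b


def add_int_usecs(a: int, b: int) -> int:
    """Add two durations held as integer microseconds."""
    return a + b


# 0.1 s + 0.2 s as doubles does not land exactly on 0.3 s ...
float_sum = add_float_seconds(0.1, 0.2)
# ... while the same durations in integer microseconds add exactly.
int_sum = add_int_usecs(100_000, 200_000)

print(float_sum == 0.3)    # False
print(int_sum == 300_000)  # True
```

The integer form gives the same answer whatever floating point optimizations the compiler applies, which is presumably why it is suggested here as the safer configuration.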
Zdenek Kotala wrote:
> I tried the regression tests with the Postgres beta and the horology test
> failed. See the attached log. It appeared a few months ago, see
> http://archives.postgresql.org/pgsql-ports/2006-06/msg00004.php
> I used Sun Studio 11 with the -fast flag on the SPARC platform.

Are you looking for ways to contort Solaris to make PostgreSQL fail?
That doesn't prove much about PostgreSQL, but rather about Solaris.

--
Bruce Momjian bruce@momjian.us
EnterpriseDB http://www.enterprisedb.com

+ If your life is a hard drive, Christ can be your backup. +
Tom Lane wrote:
> Zdenek Kotala <Zdenek.Kotala@Sun.COM> writes:
>> But the question is whether the "-fast" flag is good for postgres. The -fast
>> flag sets "brutal" floating point optimization and some operations may
>> have less precision. Is it possible to verify that floating point
>> operations work well?
>
> That's a pretty good way to guarantee that you'll break the datetime
> code.

! | @ 6 years | @ 5 years 12 mons 5 days 6 hours

Doesn't this look odd regardless of what bad results come back from the FP library?

cheers

andrew
Tom Lane wrote:
> Zdenek Kotala <Zdenek.Kotala@Sun.COM> writes:
>> But the question is whether the "-fast" flag is good for postgres. The -fast
>> flag sets "brutal" floating point optimization and some operations may
>> have less precision. Is it possible to verify that floating point
>> operations work well?
>
> That's a pretty good way to guarantee that you'll break the datetime
> code.
>
> It might be acceptable if you use --enable-integer-datetimes.

I suggest removing the mention of the -fast flag from FAQ.Solaris, or adding a warning about its use.

Josh, do you have any suggestions for cc flags?

regards, Zdenek
Bruce Momjian wrote:
> Zdenek Kotala wrote:
>> I tried the regression tests with the Postgres beta and the horology test
>> failed. See the attached log. It appeared a few months ago, see
>> http://archives.postgresql.org/pgsql-ports/2006-06/msg00004.php
>> I used Sun Studio 11 with the -fast flag on the SPARC platform.
>
> Are you looking for ways to contort Solaris to make PostgreSQL fail?
> That doesn't prove much about PostgreSQL, but rather about Solaris.

It is not about Solaris. It is about the recommended setting for Sun Studio in FAQ.Solaris.

regards Zdenek
Andrew Dunstan <andrew@dunslane.net> writes:
> ! | @ 6 years | @ 5 years 12 mons 5 days 6 hours

> Doesn't this look odd regardless of what bad results come back from the
> FP library?

It looks exactly like the sort of platform-dependent rounding issue that Bruce and Michael Glaesemann spent a lot of time on recently. It might be interesting to see if CVS HEAD works any better under these conditions ... but if it doesn't, that doesn't mean I'll be interested in fixing it. Getting the float datetime code to work is hard enough without having a compiler that thinks it can take shortcuts.

regards, tom lane
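The odd-looking value is what a floor-and-remainder decomposition produces when the year quotient comes out a hair below 6. The Python sketch below assumes a 365.25-day year and a 30-day month and simulates the low quotient by hand. The constants and the decompose function are hypothetical illustrations, not the PostgreSQL source, and the thread does not establish that this is the exact mechanism behind the failure.

```python
import math

# Assumed constants for illustration (365.25-day year, 30-day month).
SECS_PER_DAY = 86_400
SECS_PER_YEAR = 36_525 * 864          # 31,557,600 seconds
SECS_PER_MONTH = 30 * SECS_PER_DAY    # 2,592,000 seconds


def decompose(seconds: int, year_quotient: float):
    """Split seconds into (years, months, days, hours).

    year_quotient is the floating point result of seconds / SECS_PER_YEAR;
    it is passed in so a slightly wrong value can be simulated.
    """
    years = math.floor(year_quotient)
    rest = seconds - years * SECS_PER_YEAR
    months, rest = divmod(rest, SECS_PER_MONTH)
    days, rest = divmod(rest, SECS_PER_DAY)
    hours = rest // 3600
    return years, months, days, hours


six_years = 6 * SECS_PER_YEAR

# Correctly rounded division gives exactly 6.0.
exact_quotient = six_years / SECS_PER_YEAR
# Hypothetical quotient one unit in the last place below 6.0.
low_quotient = exact_quotient - 2.0 ** -50

print(decompose(six_years, exact_quotient))  # (6, 0, 0, 0)
print(decompose(six_years, low_quotient))    # (5, 12, 5, 6)
```

With the low quotient, floor() yields 5 years, and the remaining full year of seconds is then reported as 12 thirty-day months plus 5 days 6 hours, matching the digits in the failing row.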
Zdenek,

>> Zdenek Kotala <Zdenek.Kotala@Sun.COM> writes:
>>> But the question is whether the "-fast" flag is good for postgres. The
>>> -fast flag sets "brutal" floating point optimization and some
>>> operations may have less precision. Is it possible to verify that
>>> floating point operations work well?
>>
>> That's a pretty good way to guarantee that you'll break the datetime
>> code.
>>
>> It might be acceptable if you use --enable-integer-datetimes.
>
> I suggest removing the mention of the -fast flag from FAQ.Solaris, or
> adding a warning about its use.
>
> Josh, do you have any suggestions for cc flags?

Using Sun Studio? I'm hardly the expert. Maybe Jignesh?

--Josh Berkus
Tom,

On 9/26/06 9:15 AM, "Tom Lane" <tgl@sss.pgh.pa.us> wrote:
> Andrew Dunstan <andrew@dunslane.net> writes:
>> ! | @ 6 years | @ 5 years 12 mons 5 days 6 hours
>
>> Doesn't this look odd regardless of what bad results come back from the
>> FP library?
>
> It looks exactly like the sort of platform-dependent rounding issue that
> Bruce and Michael Glaesemann spent a lot of time on recently. It might
> be interesting to see if CVS HEAD works any better under these
> conditions ... but if it doesn't, that doesn't mean I'll be interested
> in fixing it. Getting the float datetime code to work is hard enough
> without having a compiler that thinks it can take shortcuts.

How about fixing the compilation so that the routines in adt that are sensitive to FP optimizations are isolated from aggressive optimization?

- Luke
Zdenek,

Hmmm ... we're not using the -fast option for the standard PostgreSQL packages. Where did you start using it?

--
--Josh

Josh Berkus
PostgreSQL @ Sun
San Francisco
Josh Berkus wrote:
> Zdenek,
>
> Hmmm ... we're not using the -fast option for the standard PostgreSQL
> packages. Where did you start using it?

Yes, I know. The -fast option generates architecture-dependent code, so it cannot be used in common packages. I came across this option while analyzing BUG #2651. I ran the regression tests and they failed. I found that the same problem was described by Match Grun a few months ago, and the -fast option is mentioned in FAQ.Solaris for performance tuning. That is all.

regards Zdenek
Andrew Dunstan wrote:
> Tom Lane wrote:
>> Zdenek Kotala <Zdenek.Kotala@Sun.COM> writes:
>>> But the question is whether the "-fast" flag is good for postgres. The
>>> -fast flag sets "brutal" floating point optimization and some
>>> operations may have less precision. Is it possible to verify that
>>> floating point operations work well?
>>
>> That's a pretty good way to guarantee that you'll break the datetime
>> code.
>
> ! | @ 6 years | @ 5 years 12 mons 5 days 6 hours
>
> Doesn't this look odd regardless of what bad results come back from the
> FP library?

The problem arose because the -fast option was set only for the compiler and not for the linker, so the linker picked up the wrong version of the libraries. If -fast is set for both, the horology test is OK, but my question was whether the floating point optimization could cause problems.

regards, Zdenek
Zdenek Kotala <Zdenek.Kotala@Sun.COM> writes:
> The problem arose because the -fast option was set only for the
> compiler and not for the linker, so the linker picked up the wrong
> version of the libraries. If -fast is set for both, the horology test
> is OK, but my question was whether the floating point optimization
> could cause problems.

So FAQ_Solaris needs to tell people to put -fast in both CFLAGS and LDFLAGS?

regards, tom lane
Tom Lane wrote:
> Zdenek Kotala <Zdenek.Kotala@Sun.COM> writes:
>> The problem arose because the -fast option was set only for the
>> compiler and not for the linker, so the linker picked up the wrong
>> version of the libraries. If -fast is set for both, the horology test
>> is OK, but my question was whether the floating point optimization
>> could cause problems.
>
> So FAQ_Solaris needs to tell people to put -fast in both CFLAGS and
> LDFLAGS?

Exactly, but I want to be sure that the floating point optimization is safe and should be applied to postgres, because -fast breaks the IEEE 754 standard. If it is OK, I will adjust FAQ_Solaris.

Zdenek
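To make the IEEE 754 concern concrete: the standard requires a division to be correctly rounded, whereas an optimizer that is allowed to rewrite x / y as x * (1 / y) rounds twice. Whether -fast applies that particular rewrite to any PostgreSQL code path is not established in this thread. The Python sketch below, with hypothetical function names, only shows that the two forms can differ in the last bit, which is enough to change a floor() result.

```python
import math


def divide(x: float, y: float) -> float:
    """Single, correctly rounded IEEE 754 division."""
    return x / y


def divide_via_reciprocal(x: float, y: float) -> float:
    """Division rewritten as multiplication by a rounded reciprocal."""
    reciprocal = 1.0 / y
    return x * reciprocal


exact = divide(49.0, 49.0)
approx = divide_via_reciprocal(49.0, 49.0)

print(exact == 1.0)        # True
print(approx == 1.0)       # False: the product falls just below 1.0
print(math.floor(exact))   # 1
print(math.floor(approx))  # 0
```

A difference this small is harmless in most numeric code, but it matters wherever a result is truncated or compared for equality, as in splitting a time span into whole units.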
On Wed, Sep 27, 2006 at 04:09:18PM +0200, Zdenek Kotala wrote:
> Tom Lane wrote:
> > Zdenek Kotala <Zdenek.Kotala@Sun.COM> writes:
> > > The problem arose because the -fast option was set only for the
> > > compiler and not for the linker, so the linker picked up the wrong
> > > version of the libraries. If -fast is set for both, the horology test
> > > is OK, but my question was whether the floating point optimization
> > > could cause problems.
> >
> > So FAQ_Solaris needs to tell people to put -fast in both CFLAGS and
> > LDFLAGS?
>
> Exactly, but I want to be sure that the floating point optimization is
> safe and should be applied to postgres, because -fast breaks the
> IEEE 754 standard. If it is OK, I will adjust FAQ_Solaris.
>
> Zdenek

Unless the packager understands the floating point usage of every piece and module included, and the effect that the -fast option will have on them, please do not recommend it for anything but extremely well tested, dedicated use-cases. When it causes problems, it can be terrible if the problems are not detected immediately. Massive data corruption could occur.

Given these caveats, in a well tested use-case the -fast option can squeeze a bit more from the CPU and could be used. I have had to debug the fallout from the -fast option in other software in the past. Let's just say, backups are a good thing.

I would vote not to recommend it without very strong cautions similar to what Sun includes in the compiler manual pages.

Ken
Thanks for the analysis. I have removed mention of the -fast option from the Solaris FAQ.

---------------------------------------------------------------------------

Kenneth Marshall wrote:
> On Wed, Sep 27, 2006 at 04:09:18PM +0200, Zdenek Kotala wrote:
> > Tom Lane wrote:
> > > Zdenek Kotala <Zdenek.Kotala@Sun.COM> writes:
> > > > The problem arose because the -fast option was set only for the
> > > > compiler and not for the linker, so the linker picked up the wrong
> > > > version of the libraries. If -fast is set for both, the horology
> > > > test is OK, but my question was whether the floating point
> > > > optimization could cause problems.
> > >
> > > So FAQ_Solaris needs to tell people to put -fast in both CFLAGS and
> > > LDFLAGS?
> >
> > Exactly, but I want to be sure that the floating point optimization is
> > safe and should be applied to postgres, because -fast breaks the
> > IEEE 754 standard. If it is OK, I will adjust FAQ_Solaris.
> >
> > Zdenek
>
> Unless the packager understands the floating point usage of every
> piece and module included, and the effect that the -fast option will
> have on them, please do not recommend it for anything but extremely
> well tested, dedicated use-cases. When it causes problems, it can
> be terrible if the problems are not detected immediately. Massive
> data corruption could occur.
>
> Given these caveats, in a well tested use-case the -fast option can
> squeeze a bit more from the CPU and could be used. I have had to
> debug the fallout from the -fast option in other software in the
> past. Let's just say, backups are a good thing.
>
> I would vote not to recommend it without very strong cautions similar
> to what Sun includes in the compiler manual pages.
>
> Ken

--
Bruce Momjian bruce@momjian.us
EnterpriseDB http://www.enterprisedb.com