From:   CSBVAX::MRGATE!mcvax!corto.inria.fr!shapiro@uunet.uu.net@SMTP 27-APR-1988 12:18
To:     ARISIA::EVERHART
Subj:   D(h)rystone benchmarks on C++, G++

Received: from uunet.UU.NET by prep.ai.mit.edu; Wed, 27 Apr 88 10:33:10 EST
Received: from mcvax.UUCP by uunet.UU.NET (5.54/1.14) with UUCP
        id AA05092; Wed, 27 Apr 88 11:51:29 EDT
Received: by mcvax.cwi.nl; Wed, 27 Apr 88 16:19:27 +0200 (MET)
Received: from blueberry.inria.fr (blueberry.uucp) by inria.inria.fr; Tue, 26 Apr 88 18:30:04 +0200 (MET)
Received: by blueberry.inria.fr (3.2/client.23-dec-86)
        id AA28425; Tue, 26 Apr 88 18:29:27 +0200
Date: Tue, 26 Apr 88 18:29:27 +0200
Message-Id: <8804261629.AA28425@blueberry.inria.fr>
Organization: INRIA, BP 105, 78153 Le Chesnay Cedex, France
        telephone +33(1)39-63-55-11, telex 697033 F, telecopy +33(1)39-63-53-30
From: Philippe Gautron
Sender: mcvax!corto.inria.fr!shapiro@uunet.uu.net
To: info-g++@prep.ai.mit.edu
Subject: D(h)rystone benchmarks on C++, G++

Dhrystone is a benchmark program which measures processor+compiler
efficiency in executing a 'typical' program.

My purpose is NOT to judge whether this program is good, good enough,
or bad for such measurements, but to compare the following compilers
at the same site (and under the same conditions):

- pcc, as reference
- gcc, the GNU C compiler
- C++ (ATT version 1.1), cfront translator and pcc: C++.pcc
- C++ (ATT version 1.1), cfront translator and gcc: C++.gcc
  [cfront compiled by C++.gcc itself]
- G++, the GNU C++ compiler (version 1.18)

Machine: SUN 3/260, STANDALONE, 8M RAM
All compilations with -O.
All compiles include the standard Sun libraries, not gnulib.
Two tests: with and without register declarations.

dry.c: (version C/1.1, 12/01/84)
 *      Date:           PROGRAM updated 01/06/86, RESULTS updated 03/31/86
 *      Compile:        cc -O dry.c -o drynr                    : No registers
 *                      cc -O -DREG=register dry.c -o dryr      : Registers

1) First, a bug in dry.c: procedure Proc3

struct Record
{
        struct Record           *PtrComp;
        ...
};
typedef struct Record   RecordType;
typedef RecordType *    RecordPtr;

Proc1(PtrParIn)
REG RecordPtr   PtrParIn;
{
#define NextRecord      (*(PtrParIn->PtrComp))
        ...
        Proc3(NextRecord.PtrComp);      <== call with struct Record*
        ...
}

Proc3(PtrParOut)
RecordPtr       *PtrParOut;             <== expects struct Record**

This bug does not abort the execution. I have translated dry.c into C++
syntax, and the first compilation aborts on the Proc3 declaration.

2) Second, the results (average):

Uptime 9:36am up 2 mins, 0 user, load average: 0.08, 0.00, 0.00

- pcc, drynr
        Dhrystone(1.1) time for 500000 passes = 84
        This machine benchmarks at 5919 dhrystones/second
- pcc, dryr
        Dhrystone(1.1) time for 500000 passes = 78
        This machine benchmarks at 6378 dhrystones/second

- gcc, drynr
        Dhrystone(1.1) time for 500000 passes = 73
        This machine benchmarks at 6815 dhrystones/second
- gcc, dryr
        Dhrystone(1.1) time for 500000 passes = 73
        This machine benchmarks at 6811 dhrystones/second

- C++ 1.1 with pcc, drynr
        Dhrystone(1.1 (C++ syntax)) time for 500000 passes = 85
        This machine benchmarks at 5850 dhrystones/second
- C++ 1.1 with pcc, drynr
        Dhrystone(1.1 (C++ syntax)) time for 500000 passes = 86
        This machine benchmarks at 5759 dhrystones/second

- C++ 1.1 with gcc, drynr
        Dhrystone(1.1 (C++ syntax)) time for 500000 passes = 73
        This machine benchmarks at 6761 dhrystones/second
- C++ 1.1 with gcc, dryr
        Dhrystone(1.1 (C++ syntax)) time for 500000 passes = 73
        This machine benchmarks at 6765 dhrystones/second

- G++ 1.18, drynr
        Dhrystone(1.1 (C++ syntax)) time for 500000 passes = 71
        This machine benchmarks at 7038 dhrystones/second
- G++ 1.18, dryr
        Dhrystone(1.1 (C++ syntax)) time for 500000 passes = 71
        This machine benchmarks at 7040 dhrystones/second

3) Conclusion: register declarations are worthwhile with pcc, less so
with gcc, but pcc is good for loops.

        G++ > gcc > C++.gcc > pcc > C++.pcc

There is an interesting gap between C++.pcc and C++.gcc.

If you have any comments...

P. Gautron

------ Here is my C++ source ------

/*      EVERYBODY:      Please read "APOLOGY" below. -rick 01/06/85
 *
 *      "DHRYSTONE" Benchmark Program
 *
 *      Version:        C/1.1, 12/01/84
 *
 *      Date:           PROGRAM updated 01/06/86, RESULTS updated 03/31/86
 *
 *      Author:         Reinhold P. Weicker,  CACM Vol 27, No 10, 10/84 pg. 1013
 *                      Translated from ADA by Rick Richardson
 *                      Every method to preserve ADA-likeness has been used,
 *                      at the expense of C-ness.
 *
 *      Compile:        cc -O dry.c -o drynr                    : No registers
 *                      cc -O -DREG=register dry.c -o dryr      : Registers
 *
 *      Defines:        Defines are provided for old C compilers
 *                      which don't have enums, and can't assign structures.
 *                      The time(2) function is library dependent; most
 *                      return the time in seconds, but beware of some, like
 *                      Aztec C, which return other units.
 *                      The LOOPS define is initially set for 50000 loops.
 *                      If you have a machine with large integers and is
 *                      very fast, please change this number to 500000 to
 *                      get better accuracy.  Please select the way to
 *                      measure the execution time using the TIME define.
 *                      For single user machines, time(2) is adequate. For
 *                      multi-user machines where you cannot get single-user
 *                      access, use the times(2) function.  If you have
 *                      neither, use a stopwatch in the dead of night.
 *                      Use a "printf" at the point marked "start timer"
 *                      to begin your timings. DO NOT use the UNIX "time(1)"
 *                      command, as this will measure the total time to
 *                      run this program, which will (erroneously) include
 *                      the time to malloc(3) storage and to compute the
 *                      time it takes to do nothing.
 *
 *      Run:            drynr; dryr
 *
 *      Results:        If you get any new machine/OS results, please send to:
 *
 *                      ihnp4!castor!pcrat!rick
 *
 *                      and thanks to all that do.
 *
 *      Note:           I order the list in increasing performance of the
 *                      "with registers" benchmark.  If the compiler doesn't
 *                      provide register variables, then the benchmark
 *                      is the same for both REG and NOREG.
 *
 *      PLEASE:         Send complete information about the machine type,
 *                      clock speed, OS and C manufacturer/version.
 *                      If the machine is modified, tell me what was done.
 *                      On UNIX, execute uname -a and cc -V to get this info.
 *
 *      80x8x NOTE:     80x8x benchers: please try to do all memory models
 *                      for a particular compiler.
 *
 *      APOLOGY (1/30/86):
 *              Well, I goofed things up!  As pointed out by Haakon Bugge,
 *              the line of code marked "GOOF" below was missing from the
 *              Dhrystone distribution for the last several months.  It
 *              *WAS* in a backup copy I made last winter, so no doubt it
 *              was victimized by sleepy fingers operating vi!
 *
 *              The effect of the line missing is that the reported benchmarks
 *              are 15% too fast (at least on a 80286).  Now, this creates
 *              a dilemma - do I throw out ALL the data so far collected
 *              and use only results from this (corrected) version, or
 *              do I just keep collecting data for the old version?
 *
 *              Since the data collected so far *is* valid as long as it
 *              is compared with like data, I have decided to keep
 *              TWO lists- one for the old benchmark, and one for the
 *              new.  This also gives me an opportunity to correct one
 *              other error I made in the instructions for this benchmark.
 *              My experience with C compilers has been mostly with
 *              UNIX 'pcc' derived compilers, where the 'optimizer' simply
 *              fixes sloppy code generation (peephole optimization).
 *              But today, there exist C compiler optimizers that will actually
 *              perform optimization in the Computer Science sense of the word,
 *              by removing, for example, assignments to a variable whose
 *              value is never used.  Dhrystone, unfortunately, provides
 *              lots of opportunities for this sort of optimization.
 *
 *              I request that benchmarkers re-run this new, corrected
 *              version of Dhrystone, turning off or bypassing optimizers
 *              which perform more than peephole optimization.  Please
 *              indicate the version of Dhrystone used when reporting the
 *              results to me.
 *
 *
 *      The following program contains statements of a high-level programming
 *      language (C) in a distribution considered representative:
 *
 *      assignments                     53%
 *      control statements              32%
 *      procedure, function calls       15%
 *
 *      100 statements are dynamically executed.  The program is balanced with
 *      respect to the three aspects:
 *              - statement type
 *              - operand type (for simple data types)
 *              - operand access
 *                      operand global, local, parameter, or constant.
 *
 *      The combination of these three aspects is balanced only approximately.
 *
 *      The program does not compute anything meaningful, but it is
 *      syntactically and semantically correct.
 *
 */

/* Accuracy of timings and human fatigue controlled by next two lines */
//const LOOPS = 50000;          /* Use this for slow or 16 bit machines */
const LOOPS = 500000;           /* Use this for faster machines */

/* Compiler dependent options */
#undef  NOENUM                  /* Define if compiler has no enum's */
#undef  NOSTRUCTASSIGN          /* Define if compiler can't assign structures */

/* define only one of the next two defines */
#define TIMES                   /* Use times(2) time function */
/*#define TIME                  /* Use time(2) time function */

/* define the granularity of your times(2) function (when used) */
const HZ = 60;                  /* times(2) returns 1/60 second (most) */
//const HZ = 100;               /* times(2) returns 1/100 second (WECo) */

/* for compatibility with goofed up version */
/*#undef GOOF                   /* Define if you want the goofed up version */

#ifdef GOOF
char    Version[] = "1.0";
#else
char    Version[] = "1.1 (C++ syntax)";
#endif

#ifdef  NOSTRUCTASSIGN
#define structassign(d, s)      memcpy(&(d), &(s), sizeof(d))
#else
#define structassign(d, s)      d = s
#endif

#ifdef  NOENUM
const Ident1 = 1;
const Ident2 = 2;
const Ident3 = 3;
const Ident4 = 4;
const Ident5 = 5;
#else
enum Enumeration {Ident1, Ident2, Ident3, Ident4, Ident5};
#endif

typedef int     OneToThirty;
typedef int     OneToFifty;
typedef char    CapitalLetter;
typedef char    String30[31];
typedef int     Array1Dim[51];
typedef int     Array2Dim[51][51];

struct Record
{
        Record
                                *PtrComp;
        Enumeration             Discr;
        Enumeration             EnumComp;
        OneToFifty              IntComp;
        String30                StringComp;
};

typedef int     boolean;

const NULL = 0;
const TRUE = 1;
const FALSE = 0;

#ifndef REG
#define REG
#endif

extern Enumeration      Func1( CapitalLetter, CapitalLetter);
extern boolean          Func2( String30, String30 );    // C++ declarations
extern boolean          Func3( Enumeration );

extern void     Proc0(),
                Proc1( Record* ),
                Proc2( OneToFifty* ),
                Proc3( Record** ),
                Proc4(),
                Proc5(),
                Proc6( Enumeration, Enumeration* ),
                Proc7( OneToFifty, OneToFifty, OneToFifty* ),
                Proc8( Array1Dim, Array2Dim, OneToFifty, OneToFifty );

extern int      exit( int ),
                printf( const char*, ... );

/* strcpy writes through its first argument; strcmp returns int */
extern char*    strcpy( char*, const char* );
extern int      strcmp( const char*, const char* );

#ifdef TIMES
#include <sys/types.h>
#include <sys/times.h>
#endif

main()
{
        Proc0();
        exit(0);
}

/*
 * Package 1
 */
int             IntGlob;
boolean         BoolGlob;
char            Char1Glob;
char            Char2Glob;
Array1Dim       Array1Glob;
Array2Dim       Array2Glob;
Record*         PtrGlb;
Record*         PtrGlbNext;

void Proc0()
{
        OneToFifty              IntLoc1;
        REG OneToFifty          IntLoc2;
        OneToFifty              IntLoc3;
        REG char                CharLoc;
        REG char                CharIndex;
        Enumeration             EnumLoc;
        String30                String1Loc;
        String30                String2Loc;
//      extern char             *malloc();

        register unsigned int   i;
#ifdef TIME
        long                    time( long* );
        long                    starttime;
        long                    benchtime;
        long                    nulltime;

        starttime = time(0);
        for (i = 0; i < LOOPS; ++i);
        nulltime = time(0) - starttime; /* Computes o'head of loop */
#endif
#ifdef TIMES
        time_t                  starttime;
        time_t                  benchtime;
        time_t                  nulltime;
        struct tms              tms;

        times(&tms); starttime = tms.tms_utime;
        for (i = 0; i < LOOPS; ++i);
        times(&tms);
        nulltime = tms.tms_utime - starttime; /* Computes overhead of looping */
#endif

        PtrGlbNext = new Record;
        PtrGlb = new Record;
        PtrGlb->PtrComp = PtrGlbNext;
        PtrGlb->Discr = Ident1;
        PtrGlb->EnumComp = Ident3;
        PtrGlb->IntComp = 40;
        strcpy(PtrGlb->StringComp, "DHRYSTONE PROGRAM, SOME STRING");
#ifndef GOOF
        strcpy(String1Loc, "DHRYSTONE PROGRAM, 1'ST STRING");   /*GOOF*/
#endif
        Array2Glob[8][7] = 10;  /* Was missing in published program */

/*****************
-- Start Timer --
*****************/
#ifdef TIME
        starttime = time(0);
#endif
#ifdef TIMES
        times(&tms); starttime = tms.tms_utime;
#endif
        for (i = 0; i < LOOPS; ++i)
        {
                Proc5();
                Proc4();
                IntLoc1 = 2;
                IntLoc2 = 3;
                strcpy(String2Loc, "DHRYSTONE PROGRAM, 2'ND STRING");
                EnumLoc = Ident2;
                BoolGlob = ! Func2(String1Loc, String2Loc);
                while (IntLoc1 < IntLoc2)
                {
                        IntLoc3 = 5 * IntLoc1 - IntLoc2;
                        Proc7(IntLoc1, IntLoc2, &IntLoc3);
                        ++IntLoc1;
                }
                Proc8(Array1Glob, Array2Glob, IntLoc1, IntLoc3);
                Proc1(PtrGlb);
                for (CharIndex = 'A'; CharIndex <= Char2Glob; ++CharIndex)
                        if (EnumLoc == Func1(CharIndex, 'C'))
                                Proc6(Ident1, &EnumLoc);
                IntLoc3 = IntLoc2 * IntLoc1;
                IntLoc2 = IntLoc3 / IntLoc1;
                IntLoc2 = 7 * (IntLoc3 - IntLoc2) - IntLoc1;
                Proc2(&IntLoc1);
        }

/*****************
-- Stop Timer --
*****************/

#ifdef TIME
        benchtime = time(0) - starttime - nulltime;
        printf("Dhrystone(%s) time for %ld passes = %ld\n",
                Version,
                (long) LOOPS, benchtime);
        printf("This machine benchmarks at %ld dhrystones/second\n",
                ((long) LOOPS) / benchtime);
#endif
#ifdef TIMES
        times(&tms);
        benchtime = tms.tms_utime - starttime - nulltime;
        printf("Dhrystone(%s) time for %ld passes = %ld\n",
                Version,
                (long) LOOPS, benchtime/HZ);
        printf("This machine benchmarks at %ld dhrystones/second\n",
                ((long) LOOPS) * HZ / benchtime);
#endif
}

void Proc1 (REG Record* PtrParIn)
{
#define NextRecord      (*(PtrParIn->PtrComp))

        structassign(NextRecord, *PtrGlb);
        PtrParIn->IntComp = 5;
        NextRecord.IntComp = PtrParIn->IntComp;
        NextRecord.PtrComp = PtrParIn->PtrComp;
        Proc3((Record**)NextRecord.PtrComp);
        if (NextRecord.Discr == Ident1)
        {
                NextRecord.IntComp = 6;
                Proc6(PtrParIn->EnumComp, &NextRecord.EnumComp);
                NextRecord.PtrComp = PtrGlb->PtrComp;
                Proc7(NextRecord.IntComp, 10, &NextRecord.IntComp);
        }
        else
                structassign(*PtrParIn, NextRecord);

#undef  NextRecord
}

void Proc2 (OneToFifty* IntParIO)
{
        REG OneToFifty          IntLoc;
        REG Enumeration         EnumLoc;

        IntLoc = *IntParIO + 10;
        for(;;)
        {
                if (Char1Glob == 'A')
                {
                        --IntLoc;
                        *IntParIO = IntLoc - IntGlob;
                        EnumLoc = Ident1;
                }
                if (EnumLoc == Ident1)
                        break;
        }
}

void Proc3 (Record** PtrParOut)
{
        if (PtrGlb != 0)
                *PtrParOut = PtrGlb->PtrComp;
        else
                IntGlob = 100;
        Proc7(10, IntGlob, &PtrGlb->IntComp);
}

void Proc4()
{
        REG boolean     BoolLoc;

        BoolLoc = Char1Glob == 'A';
        BoolLoc |= BoolGlob;
        Char2Glob = 'B';
}

void Proc5()
{
        Char1Glob = 'A';
        BoolGlob = FALSE;
}

// extern boolean Func3();

void Proc6( REG Enumeration EnumParIn, REG Enumeration* EnumParOut )
{
        *EnumParOut = EnumParIn;
        if (! Func3(EnumParIn) )
                *EnumParOut = Ident4;
        switch (EnumParIn)
        {
        case Ident1:    *EnumParOut = Ident1; break;
        case Ident2:    if (IntGlob > 100) *EnumParOut = Ident1;
                        else *EnumParOut = Ident4;
                        break;
        case Ident3:    *EnumParOut = Ident2; break;
        case Ident4:    break;
        case Ident5:    *EnumParOut = Ident3;
        }
}

void Proc7 (OneToFifty IntParI1, OneToFifty IntParI2, OneToFifty* IntParOut )
{
        REG OneToFifty  IntLoc;

        IntLoc = IntParI1 + 2;
        *IntParOut = IntParI2 + IntLoc;
}

void Proc8 (Array1Dim Array1Par, Array2Dim Array2Par,
            OneToFifty IntParI1, OneToFifty IntParI2)
{
        REG OneToFifty  IntLoc;
        REG OneToFifty  IntIndex;

        IntLoc = IntParI1 + 5;
        Array1Par[IntLoc] = IntParI2;
        Array1Par[IntLoc+1] = Array1Par[IntLoc];
        Array1Par[IntLoc+30] = IntLoc;
        for (IntIndex = IntLoc; IntIndex <= (IntLoc+1); ++IntIndex)
                Array2Par[IntLoc][IntIndex] = IntLoc;
        ++Array2Par[IntLoc][IntLoc-1];
        Array2Par[IntLoc+20][IntLoc] = Array1Par[IntLoc];
        IntGlob = 5;
}

Enumeration Func1 (CapitalLetter CharPar1, CapitalLetter CharPar2)
{
        REG CapitalLetter       CharLoc1;
        REG CapitalLetter       CharLoc2;

        CharLoc1 = CharPar1;
        CharLoc2 = CharLoc1;
        if (CharLoc2 != CharPar2)
                return (Ident1);
        else
                return (Ident2);
}

boolean Func2 (String30 StrParI1, String30 StrParI2)
{
        REG OneToThirty         IntLoc;
        REG CapitalLetter       CharLoc;

        IntLoc = 1;
        while (IntLoc <= 1)
                if (Func1(StrParI1[IntLoc], StrParI2[IntLoc+1]) == Ident1)
                {
                        CharLoc = 'A';
                        ++IntLoc;
                }
        if (CharLoc >= 'W' && CharLoc <= 'Z')
                IntLoc = 7;
        if (CharLoc == 'X')
                return(TRUE);
        else
        {
                if (strcmp(StrParI1,
                           StrParI2) > 0)
                {
                        IntLoc += 7;
                        return (TRUE);
                }
                else
                        return (FALSE);
        }
}

boolean Func3( REG Enumeration EnumParIn )
{
        REG Enumeration EnumLoc;

        EnumLoc = EnumParIn;
        if (EnumLoc == Ident3) return (TRUE);
        return (FALSE);
}

#ifdef  NOSTRUCTASSIGN
memcpy(d, s, l)
register char   *d;
register char   *s;
register int    l;
{
        while (l--) *d++ = *s++;
}
#endif
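
------ Note on the Proc3 call ------

On the Proc3 mismatch described in point 1: the translated Proc1 keeps the
original call by casting, Proc3((Record**)NextRecord.PtrComp), so Proc3
stores into the first word of the record that NextRecord.PtrComp points to,
not into the PtrComp field of NextRecord itself. The sketch below illustrates
that difference under some assumptions: the cut-down Record, the helper
store_through, and the two checking functions are hypothetical names used
only for illustration and are not part of dry.c. The address-of variant
shows what a type-correct call might look like; it is not what the benchmark
does, and using it would probably change what is being measured.

```cpp
#include <cassert>

// Cut-down stand-in for the benchmark's Record (hypothetical, illustration only).
struct Record {
    Record* PtrComp;   // first member, as in dry.c
    int     IntComp;
};

// Same shape as Proc3's store: writes a Record* through a Record**.
static void store_through(Record** out, Record* value) {
    *out = value;
}

// Variant 1: the cast used in the translated Proc1.
// The Record* is reinterpreted as Record**, so the store lands in the
// first member of the pointed-to record (b), and a is left untouched.
bool cast_call_writes_into_pointee() {
    Record c = {0, 3};
    Record b = {0, 2};
    Record a = {&b, 1};
    store_through((Record**)a.PtrComp, &c);
    return a.PtrComp == &b && b.PtrComp == &c;
}

// Variant 2: a hypothetical type-correct call that passes the address
// of the field itself, so the store lands in a.PtrComp and b is untouched.
bool address_of_call_writes_into_caller_field() {
    Record c = {0, 3};
    Record b = {0, 2};
    Record a = {&b, 1};
    store_through(&a.PtrComp, &c);
    return a.PtrComp == &c && b.PtrComp == 0;
}
```

Both variants compile, but they update different records. This is why the
old C compilers ran the original silently, while a C++ compiler that checks
argument types rejects the call until a cast or an address-of is written.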