Path: cf-cm!uknet!root44!hrc63!mrcu!paj
From: paj@uk.co.gec-mrc (Paul Johnson)
Newsgroups: comp.lang.eiffel
Subject: Review of SIG Eiffel/S
Message-ID: <5024@snap>
Date: 17 Aug 94 08:31:16 GMT
Organization: GEC-Marconi Research Centre, Great Baddow, UK
Lines: 341
X-Newsreader: TIN [version 1.2 PL2]

This was to have been published elsewhere.  That now seems unlikely to
happen in the near future.  I've posted it here instead to complement
the evaluation of ISE Eiffel.

If anyone wants a copy of the benchmark programs then please send me
email.


Eiffel/S 1.3
============

(( # means a pound sign, <c> means a code font (e.g. courier), <i>
means italic, <t> means normal text.))

((Introduction))

Eiffel/S from Sig Computer GmbH in Germany was the first commercially
available Eiffel 3 compiler.  Now Sig have released Eiffel/S 1.3.

((Introduction ends))

((Main Text))

This review was carried out using a Sparcstation 1 running Unix.
Eiffel/S will also run on PCs under MS-DOS and Windows.  The compiler
and basic libraries only take 2.5 Mbytes of disk, but any reasonable
project will need 10-20 Mb per developer for the generated code and
object files.

The Compiler
------------

Like all other Eiffel compilers, Eiffel/S generates C code.  The
compilation driver calls a C compiler and linker automatically, and
informs the user of the progress of the C compilation.  PC users can
use a range of C compilers, including the free GNU <c>gcc<t>.

Each class in the system (including library classes) is compiled into
a C file in the project directory, and each of these is compiled into
an object file.  These files are then linked to produce the
executable.  Hence each developer needs enough disk space to compile
the entire project.  Dynamically linked libraries are not possible in
Eiffel because any feature call within a library routine may be
compiled as a dynamic "virtual" call, a static "non virtual" call, or
an inline expansion, depending on whether or not the feature has been
redefined.

The compiler can run in three modes.  By default it produces
non-optimised code with all the assertion checks.  With the "-O" flag
it generates optimised code with no assertion checks.  With the "-F"
flag it also generates inline expansions under certain conditions.

I tested the compiler by using it to debug an experimental data
structure library containing about 133 Kbytes of code in 61 classes
(excluding a block comment and index clause at the start of each
class).  Compiling the test harness resulted in an executable of 844
Kbytes, or 229 Kbytes after optimisation.  Note that these executables
included parts of the Eiffel/S run-time library.

The quality of the Eiffel/S 1.3 compiler is much higher than 1.0.  It
never crashed, although on one occasion it generated some incorrect
code.  It turned out I was (wrongly) trying to use the feature
"is_equal" in a class which did not export it.  Instead of generating
an error message the compiler produced a call to an undefined C
function.  This caused the linker to fail.

Debugging
---------

Eiffel/S has a system for turning assertion checking and tracing on
and off at run-time, via its "Run-time Control Language".  When an
assertion is violated, the run-time system puts a stack trace in a
file for later perusal.  This usually allows the programmer to locate
problems.  However there is no debugger.  This is not the disaster it
would be in another language, because the Eiffel assertion system makes
debugging much easier than in conventional languages.  However it is
still a major failing.

The generated C code is very obscure.  The generated C function names
contain large decimal numbers which give no clue as to their origin.
I'd like to see a filter to convert these codes into Eiffel feature
names.  This would allow standard Unix tools, such as the "gprof"
profiling system, to be used on Eiffel programs.
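A filter of this kind is straightforward to sketch.  The map-file format
below (one "C_name EIFFEL_NAME" pair per line) is purely hypothetical;
Eiffel/S's actual name-mangling scheme is not documented in this review:

```python
import re

def load_map(path):
    # Hypothetical map file: each line holds a mangled C name and the
    # readable Eiffel feature name it came from, separated by whitespace.
    table = {}
    with open(path) as f:
        for line in f:
            parts = line.split()
            if len(parts) == 2:
                table[parts[0]] = parts[1]
    return table

def demangle(text, table):
    # Replace every known mangled identifier; leave everything else alone.
    pattern = re.compile(r"\b[A-Za-z_]\w*\b")
    return pattern.sub(lambda m: table.get(m.group(0), m.group(0)), text)

# Piping gprof output through such a filter would restore Eiffel names:
table = {"F_27_113": "HASH_TABLE.search"}
print(demangle("  0.40   12   F_27_113", table))
```

Run over a gprof listing, this would leave the timing columns intact and
rewrite only the identifiers it recognises.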

Documentation Generation
------------------------

One of the central ideas behind Eiffel is the generation of detailed
reference documents from the source code itself.  Thus the reference
manual for a library can easily be kept up to date.  Eiffel/S includes
the tool <c>edoc<t> for this.  Unfortunately I could not persuade
<c>edoc<t> to produce a reference manual for my library.  It kept
stopping and generating incomprehensible error messages.

Libraries
---------

Eiffel/S comes with a class library.  In version 1.0 the class names
had to be the same as their file names.  This limited class names to 8
characters so that the files could be stored under MS-DOS.  This
annoying restriction has now been removed.  Class names can be any
length, and are mapped on to files by a clause in the configuration
file.

In general the data structure cluster is well designed and well
implemented.  The interfaces have been deliberately kept simple.  All
the container objects can be traversed by "iterators", and each
container can have arbitrarily many of these iterators traversing it.
This avoids the problem where a container object has a single iterator
associated with it, and two client routines need to traverse its
contents.  In the single iterator case the client routines must
co-operate in sharing the iterator.  With the Sig library the two
routines simply have two separate iterators.

One useful feature of iterators is that a routine which processes a
sequence of objects can take an iterator as an argument.  The iterator
can come from any kind of container, including the "graph" cluster.
This makes routines independent of the implementation of their
arguments.
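The same decoupling can be shown with Python's iterator protocol (the
Sig library's actual class and feature names may differ; this only
illustrates the idea):

```python
# Each call to iter() yields an independent cursor, so two routines can
# traverse one container at the same time, and a routine written against
# the iterator works for any kind of container.
def sum_elements(it):
    # Accepts any iterator: over a list, a set, a graph traversal...
    total = 0
    for x in it:
        total += x
    return total

data = [1, 2, 3, 4]
a, b = iter(data), iter(data)   # two independent iterators
next(a)                          # advancing one...
assert next(b) == 1              # ...does not move the other

assert sum_elements(iter(data)) == 10
```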

Each collection class has a creation procedure which takes a boolean
argument called <c>unique<t>.  This indicates whether a single object
may be put into the collection more than once.  If this is set to true
then it effectively makes each container a set.  I am puzzled by this
feature, since I would have thought it more efficient and
straightforward to have sets as a separate hierarchy of classes.
These could then implement other set operations such as `union' and
`intersection'.
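The semantics of the flag can be sketched generically (this is not the
Sig library's actual interface, only an illustration of the behaviour
described above):

```python
# A container created with unique=True silently refuses duplicates,
# which effectively turns it into a set.
class Collection:
    def __init__(self, unique=False):
        self.unique = unique
        self.items = []

    def put(self, x):
        # With unique=True, a second put of an equal object is a no-op.
        if self.unique and x in self.items:
            return
        self.items.append(x)

bag = Collection()
bag.put(1); bag.put(1)
assert len(bag.items) == 2       # ordinary container: duplicates kept

s = Collection(unique=True)
s.put(1); s.put(1)
assert len(s.items) == 1         # "unique" container: behaves as a set
```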

More complicated structures of objects are managed by the "graph"
cluster.  This manipulates graphs of "vertex" objects, each of which
is linked by "edge" objects to other vertices.  A graph can be
traversed by iterators in either depth-first or breadth-first order,
and the list of edges between vertices can also be traversed.  The
manual includes a small demonstration program which computes the
shortest and longest path between two nodes in a network.

The "matcher" cluster provides facilities for searching strings.
Keyword and regular expression matchers are included.

Files
-----

Eiffel/S provides an interesting system for persistent storage.  The
"FILE" class has been designed by analogy to the "ARRAY" class.  Once
a file has been opened, objects can be stored and retrieved by
numerical index.  This store operation is "deep" in that if a stored
object contains references to other objects then those objects will
also be stored.  When an object is retrieved from a file, the result
is therefore a deep clone of the original.  This has the disadvantage
that if two objects are stored in a file and then retrieved, any third
object to which they both refer will have been duplicated.  The
solution is to store both objects in an array or list, and then store
that.
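The same pitfall can be demonstrated with Python's pickle module, which
also performs "deep" serialisation: storing two objects separately
duplicates anything they share, while storing them inside one list
preserves the shared reference.

```python
import pickle

shared = {"name": "config"}
a = {"ref": shared}
b = {"ref": shared}

# Stored and retrieved separately (two independent "deep" stores):
a2 = pickle.loads(pickle.dumps(a))
b2 = pickle.loads(pickle.dumps(b))
assert a2["ref"] is not b2["ref"]    # the shared object was duplicated

# Stored together, as the review suggests:
a3, b3 = pickle.loads(pickle.dumps([a, b]))
assert a3["ref"] is b3["ref"]        # sharing preserved
```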

These persistent object files are designed for the random access of
objects.  The class TEXTFILE provides conventional Unix or MS-DOS
files which consist of streams of characters.  Other classes manage
"clusters" (i.e directories) of files.

Garbage Collection
------------------

Sig have provided a simple but fast "mark-sweep" garbage collector.
Whenever the Eiffel program has allocated a certain amount of space
the collector is run automatically.  This has the advantage of being
faster than competing "incremental" algorithms, but has the
disadvantage that the Eiffel program is stopped for a significant
amount of time during the collection.
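A minimal sketch of the two phases (trace from the roots, then reclaim
everything unmarked) shows why the program must pause for the whole
collection.  Real collectors work on raw memory; this only illustrates
the algorithm:

```python
class Obj:
    def __init__(self):
        self.refs = []       # outgoing references to other objects
        self.marked = False

def mark(roots):
    # Phase 1: mark everything reachable from the roots.
    stack = list(roots)
    while stack:
        o = stack.pop()
        if not o.marked:
            o.marked = True
            stack.extend(o.refs)

def sweep(heap):
    # Phase 2: keep marked objects, reclaim the rest.  Both phases must
    # finish before the program resumes, hence the pause.
    live = [o for o in heap if o.marked]
    for o in live:
        o.marked = False     # reset for the next cycle
    return live

a, b, c = Obj(), Obj(), Obj()
a.refs = [b]                 # c is unreachable from the root a
mark([a])
assert sweep([a, b, c]) == [a, b]
```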

The "Shuffle" benchmark was written to exercise the garbage collector.
For each test the main loop was executed 1,000 times.  On average 200
of these iterations (one in five) caused a new object to be created.  The
garbage collector was triggered about once every eight tests.  This
suggests that the collector had to mark a hundred or so live objects
and sweep up about 1,600 dead objects.  This took about 0.4 of a
second, most of which was probably spent on the dead objects.  With
the optimised version this was very obvious.  The program was taking
about 230 milliseconds for each test, but once every eight tests it
would pause for an extra 400 milliseconds or so.  These results have
been shown in the table for Shuffle.  The "Average" column shows the
overall average time per test.  The "non-GC" and "GC" columns show the
average times for tests without and with a collection respectively.  From
these figures the GC overhead (as a percentage of CPU time) and the
time taken per GC cycle have been computed.  Note that the time to
perform GC is roughly constant irrespective of the compiler
optimisation.
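The "GC time" column can be reproduced directly from the measured ones:
the per-cycle pause is the average time for a test that triggered a
collection, minus the average for one that did not.

```python
# Rows from the Shuffle table: (Average, Non-GC, GC) in milliseconds.
rows = {
    "All assertions": (1585.9, 1521.0, 1928.5),
    "No assertions":  (819.3,   755.5, 1182.6),
    "Optimised":      (283.3,   230.5,  666.0),
}

for name, (avg, non_gc, gc) in rows.items():
    gc_time = gc - non_gc    # pause attributable to the collector
    print(f"{name}: {gc_time:.1f} ms per GC cycle")
```

The three results (407.5, 427.1 and 435.5 ms) match the table's last
column, and confirm the observation that the pause is roughly constant
regardless of optimisation level.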

For interactive applications this stop can be annoying.  Users will
not be happy with a system which occasionally freezes for no readily
apparent reason.  Statistically most of these events will occur during
long calculations where the user expects to wait, but occasionally
they will happen when the user is expecting a quick response.

For anything even vaguely real-time (e.g. data logging, process
control, robotics, video games) this is a disaster.  Eiffel/S is not
suitable for these application areas.

The collector can be controlled by setting the trigger threshold or by
manually initiating collections at certain times (e.g. after a large
computation has just finished).  This provides a partial solution to
these problems.

Benchmark Results
-----------------

These benchmarks were taken on a Sun Sparcstation 1 with local swap
space and network disks.  CPU usage during the Eiffel compilation was
low (20 - 30%), suggesting that a faster file system would give a
significant increase in speed.  The figures were obtained from the
Unix "time" command.

The "Application" compilation yielded some interesting results.  When the
source files were merely touched, Eiffel/S skipped the C compilation.
Apparently it is smart enough to spot if nothing has actually
changed.  The figures below were obtained by changing <c> x+y <t>
into <c> y+x <t> in each file.  This did not affect the operation of
the code, but was enough to fool Eiffel/S into repeating the C
compilation.
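One plausible mechanism for this (how Eiffel/S actually decides is not
documented here) is to compare a digest of the newly generated C file
against the previous build's, rather than trusting file timestamps:

```python
import hashlib

def changed(new_source: str, old_digest: str) -> bool:
    # Recompile only if the content digest differs; a mere touch leaves
    # the digest, and hence the decision, unchanged.
    return hashlib.sha256(new_source.encode()).hexdigest() != old_digest

old = hashlib.sha256(b"int f(void){return x+y;}").hexdigest()
assert not changed("int f(void){return x+y;}", old)  # touched: skip
assert changed("int f(void){return y+x;}", old)      # edited: recompile
```

This explains why swapping <c> x+y <t> for <c> y+x <t> was needed to
force the C compilation: the change is semantically neutral but alters
the file's content.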



"Multiply" Compilation

                                  Real (m:s)  CPU (secs)  Utilisation (%)
Full compilation (Eiffel):        2:30         44.7       29
Full compilation (C):             7:29        302.8       67
Total:                            9:59        347.5       58

Dummy compilation (Eiffel):       0:48          9.1       28
Dummy compilation (C):               -            -        -
Total:                            0:48          9.1       28

Application compilation (Eiffel): 0:47         13.1       28
Application compilation (C):      0:28         17.3       62
Total:                            1:15         30.4       41


"Shuffle" Compilation


Full compilation (Eiffel):        4:29         53.4       19
Full compilation (C):             6:56        363.8       92
Total:                           11:25        417.2       60

Dummy compilation (Eiffel):       0:41         10.3       25
Dummy compilation (C):               -            -        -
Total:                            0:41         10.3       25

Application compilation (Eiffel): 0:50         14.3       28
Application compilation (C):      0:30         13.6       45
Total:                            1:20         27.9       34


"Multiply" Run
                     mSecs

All assertions:      5,632
No assertions:       3,558
Optimised:             213



"Shuffle" Run
                      Average     Non-GC        GC  GC overhead  GC time

All assertions:       1,585.9    1,521.0   1,928.5         4.3%    407.5
No assertions:          819.3      755.5   1,182.6         7.3%    427.1
Optimised:              283.3      230.5     666.0        22.9%    435.5


Conclusion
----------

Eiffel/S is the cheapest Eiffel compiler, but you don't get very much
for your money.  Sig <i> really <t> need to produce a debugger if they
are to compete effectively.  The compiler itself is reliable, but the
documentation generator "edoc" could do with some more testing.

The efficiency of the mark-sweep garbage collector is an advantage in
batch applications, but for interactive applications it is a nuisance.
For anything real-time it is useless.

The "educational" version (which compiles up to 75 classes) is very
small.  By the time it has compiled the kernel classes and some data
structures, the user has space left for very few application classes.
However it is very good value for money as an introduction to the
language.

The data structure libraries are well worth looking at.  Buyers of
other compilers should consider using these libraries.

((Begin Box))

The Benchmarks
--------------

We used two benchmark programs for this review.

1: Multiply.  This declared a class "MATRIX" and tested the time taken
   to fill two 20x20 matrices with floating point values and then
   multiply them together.  The MATRIX class was written using an
   "array of arrays" structure in order to exercise the compiler's
   ability to optimise array dereferencing.  A simple translation would
   require 4 function calls for each reference.  An optimised
   translation should be able to use inlining to reduce that to one or
   none.

2: Shuffle.  This tested the garbage collection routines.  It declared
   a class "TWO_WAY" with attributes "left" and "right", each of which
   referenced another instance of TWO_WAY.  The test loop set up an
   array of TWO_WAY objects and then set the references at random to
   produce a tangled network of objects and references.  The garbage
   collector would have to traverse this network in order to determine
   which objects had dropped off the web.
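Rough Python sketches of the two benchmarks, reconstructed from the
descriptions above (the original Eiffel sources are available from the
author by email; these are not them):

```python
import random

# 1: Multiply -- fill two 20x20 matrices and multiply them, using an
#    array-of-arrays representation as described.
def multiply(a, b, n=20):
    c = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            s = 0.0
            for k in range(n):
                s += a[i][k] * b[k][j]   # naive: four dereferences here
            c[i][j] = s
    return c

n = 20
a = [[float(i + j) for j in range(n)] for i in range(n)]
b = [[float(i * j) for j in range(n)] for i in range(n)]
product = multiply(a, b)

# 2: Shuffle -- a tangled network of TWO_WAY-like objects whose "left"
#    and "right" references are rewired at random, stranding garbage.
class TwoWay:
    def __init__(self):
        self.left = None
        self.right = None

random.seed(0)
nodes = [TwoWay() for _ in range(100)]
for _ in range(1000):
    node = random.choice(nodes)
    node.left = random.choice(nodes)    # old referent may become garbage
    node.right = random.choice(nodes)
```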

Unbiased benchmarks are difficult things to write, especially for
garbage collectors.  For example, most programs have many short-lived
objects and a few long-lived objects.  "Generational" garbage
collectors exploit this by tagging older objects and then checking
them less frequently.  A benchmark where all objects have a similar
lifespan can penalise a generational collector because its tagging
strategy does not work in the artificial conditions of a benchmark.

For this reason the figures given here should be taken with a large
pinch of salt.  They give some idea of performance, but a real
application might give very different figures, and even a different
ordering.  There is no "best" algorithm.  Also, the Shuffle benchmark
produced a great deal of garbage while doing very little "work".
Figures for the overhead of garbage collection will therefore be much
higher than for typical application software.

((End Box))




-- 
Paul Johnson (paj@gec-mrc.co.uk).           | Tel: +44 245 473331 ext 3245
--------------------------------------------+----------------------------------
You are lost in a twisty maze of little     | GEC-Marconi Research is not
standards, all different.                   | responsible for my opinions