A couple of months back, I posted a program named fsx, which purported to
benchmark performance of various aspects of Unix filesystems.  I got a *lot*
of mail from people, which fell into the following equivalence classes:

 1. Thank you
 2. Here are the numbers for our new whizbang mark X
 3. Not bad, but you forgot Y
 4. The whole idea is totally bankrupt because of Z
 5. There's an obvious bug in your program (this was true; there was an
    obvious bug, thankfully of the blow-you-up rather than the
    distort-the-results variety.  It's amazing how many systems it worked
    on anyway).

Eventually, I was convinced that it was worthwhile to go back and work on
improving it.  Herewith the results:

1. CHANGES

The name's changed.  Someone at DEC not only stole the name fsx from me, but
did it some years before I invented it.  The nerve.  It's now called Bonnie,
because it plays with bottlenecks.

The obvious bug is fixed.

It no longer uses BUFSIZ.  I thought using BUFSIZ was sort of defensible,
since it's not unreasonable for an application to use it and expect it to be
a pretty good size for filesystem I/O, but I succumbed to the shouting-down
on this one.

There are a few more tweaks aimed at defeating compiler optimizers.

In an attempt to further improve its cache-busting effectiveness, the seek
testing now takes a multiprocess approach.

The output format is better.

2. DISCUSSION AND PREACHING

From the feedback to the last posting:

From: mo@messy.UUCP (Michael O'Dell)
>Note: the combination of disk and controllers used for these tests
>can make a LOT of difference in the performance.
>Your mileage can vary an astonishing amount.
Right, that's why it exists.

From: haas@frith.uucp (Paul R. Haas)
>Remember, this sort of benchmark result is worthless if you don't report
>the model, operating system version, disk controller, disk drive
>type, and configuration information (asynch, or synch SCSI, block size,
>etc...).
Amen, amen, amen.

From: mash@mips.COM (John Mashey)
>If there is enough CPU performance to seek the disk
>at full speed, it is irrelevant how much faster it is: the benchmark
>doesn't get faster.  Put another way, on single-threaded kinds of benchmarks,
>the only thing that counts is whatever is the bottleneck.
Right on.  Bonnie still shows CPU-limited I/O on a few classes of bottleneck,
even on fast systems.  I suspect this will be less and less the case as time
goes by, but it's interesting.

>3) However, note that the random benchmark doesn't read the same amount of
>data on each machine, as it uses BUFSIZ from stdio.h, and that
OK, OK, OK.

> a) Even trying hard, (and Tim was) it is hard to do I/O benchmarks without
> weird quirks.  In particular, you are constantly fighting to outsmart the
> UNIX buffer cache (or whatever it uses).
Right.  And that's why it would be foolish and wrong to try to distill
Bonnie's results into a single number & call it an fs-stone or something.  All
you can do is try to establish deterministic limits for certain capabilities
in certain environments.  Which is much better than operating in a vacuum.  I
claim that Bonnie does a good enough job of this to be useful.

From: lm@snafu.Sun.COM (Larry McVoy)
>Another comment on Tim's test and I/O tests in general  (check out Hennessy
>& Patterson's chapter on I/O - they say this better than I do):  the
>performance of I/O is going to be limited by the slowest part of the path,
>be it memory, processor, software, I/O bus, or I/O device.
>
>That said, it's my belief that the IBM numbers are apples to oranges.  The
>drives that IBM puts in the 6000 are *FAST*.
Yes and No.  The first comment is correct.  The second is questionable; IBM is
selling boxes with fairly similar functionality in a fairly similar price
range to everybody else.  Bonnie's results suggest that this machine achieves
remarkable I/O performance.  Where are the apples & oranges?

From: cdinges@estevax (Hr Dinges Clemens )
>Proposal:
>'fsx' should be expanded for measuring the reorganization features
>of the filesystem, espec. the effects of a certain number of predefined file 
>creation/write/deletion operations that simulate the use of the filesystem.
>A known effect (SysV filesystems) is the disordering of the freelist 
>entries which might greatly affect the performance of sequential filesystem 
>operations (see below).
>In other words, benchmarking with 'fsx' should always be done like that:
>1. Generate a fresh filesystem; 2. simulate the use of the filesystem; 
>3. run the benchmark.
>To illustrate the effects of freelist disordering we ran fsx with and 
>without a (very) simple freelist scrambler on the XENIX V/386 2.3.2
>filesystem.
Good idea.  I would point out that Bonnie might also provide a metric for the
effects of many other tunable and hard-wired filesystem parameters.

From: kcollins@convex1.convex.com (Kirby L. Collins)
>What we've found is that on large systems UNIX filesystem performance is
>usually CPU bound.  This has several interesting ramifications.
You bet.  It's sort of disappointing that this is still the case given the
iron we're using now.

3. SUMMARIES AND PROPOSALS

Here are some Bonnie results.  Unfortunately, it's end of term here, and I
had great trouble finding machines with hundreds of Mb free, and even more
important, with unloaded CPUs.  Therefore, I deliberately omit all details of
disk type, OS version and so on, along with discussion.  These are mostly
there to provide a sample of Bonnie output and perhaps provoke thought.  

I don't have time to run this any more.  That is a job for the vendors and for
SPEC and so on.  However, I will cheerfully act as a clearinghouse for Bonnie
results, bugfixes, improvements and so on, and if lots of results come in,
will post them occasionally.

              -------Sequential Output-------- ---Sequential Input-- --Random--
              -Per Char- --Block--- -Rewrite-- -Per Char- --Block--- --Seeks---
Machine    MB K/sec %CPU K/sec %CPU K/sec %CPU K/sec %CPU K/sec %CPU  /sec %CPU
Sun4/260   78   285 90.9   650 26.4   305 20.0   282 95.5   803 28.3  38.5 15.2
Mips M2k  180   387 44.0   442  6.1   360  5.4   416 48.1  1701 14.8  26.3  3.1
Sequent    95   140 97.8  1017 66.2   251 17.7   122 95.7   713 34.9  31.0 15.9
4M i386    25   130 90.5   199 26.1   179 46.1   121 93.5   400 59.7  10.5 23.8
NeXT      125   241 94.7   347 39.3   253 31.7   246 94.8   772 49.6  27.7 20.8
VAX 8650  200   208 89.0   232  8.3   143  7.4   197 65.0   373  8.6  17.3  4.6
