- Rick Bunt
- Derek Eager
- Darryl Willick, research assistant (now at Cal Tech)
- Kerhong Chen, M.Sc. student (now at Nortel)
- Kevin Froese, undergraduate student (now at Open Text)
The client-server environment has supplanted the traditional mainframe
environment in many organizations. While this new paradigm
presents many exciting computational opportunities, it also gives rise
to new challenges concerning system design and performance. As both
workstation clients and servers become increasingly well-resourced,
and both users and their applications become increasingly demanding,
a number of system design decisions need to be rethought if service
expectations are to be met. Our research in this area
has addressed a broad range of performance and design issues in
client-server clusters.
The overall theme of
our investigation of disk block caching in distributed client-server
file systems was that
caching at clients dramatically alters the workload presented to the file
server, and thus traditional approaches to cache management may
perform poorly at servers. The investigation was carried out in
a number of stages, using as simulation input detailed traces of file
reference activity obtained through specially designed probes that we
inserted into a Unix (HP-UX) kernel.
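As a rough illustration of this trace-driven approach (a sketch only; the
actual simulator and trace format are not described here), a simulation
loop of this kind can be expressed in a few lines of Python. The Ref
record layout and the access(key) interface are assumptions for
illustration.

    from collections import namedtuple

    # One record of file reference activity, roughly the information a
    # kernel probe would capture. The actual HP-UX trace format is not
    # described here; this layout is assumed for illustration.
    Ref = namedtuple("Ref", ["client", "file", "block", "op"])  # op: "read"/"write"

    def replay(trace, cache):
        """Drive a cache model with a trace, returning (hits, misses).

        `cache` is any object with an access(key) -> bool method; the
        replacement policies sketched below plug in here.
        """
        hits = misses = 0
        for ref in trace:
            if cache.access((ref.file, ref.block)):
                hits += 1
            else:
                misses += 1
        return hits, misses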
We published three papers from this project, all of which are available on the
DISCUS
FTP server.
The first stage of the investigation,
described in a paper entitled
Disk Cache Replacement Policies for Network Fileservers,
dealt with the caching of read requests.
Because the request
stream presented to the server cache is a stream of misses from
the client cache, the temporal locality on which traditional cache
management approaches depend is filtered out by the client cache.
Traditional locality-based cache management strategies (such as LRU),
while well suited to client caches, are therefore not suitable at
server caches, where frequency-based approaches (such as LFU) may be
better choices.
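To make the contrast concrete, here is a minimal Python sketch of the
two policy families behind the access(key) interface assumed above.
These are textbook LRU and LFU, not the exact implementations evaluated
in the paper; the two-level helper shows how a client cache strips the
locality out of the stream the server sees.

    from collections import OrderedDict

    class LRUCache:
        """Locality-based policy: evict the least recently used block."""
        def __init__(self, capacity):
            self.capacity = capacity
            self.blocks = OrderedDict()

        def access(self, key):
            hit = key in self.blocks
            if hit:
                self.blocks.move_to_end(key)          # refresh recency
            else:
                if len(self.blocks) >= self.capacity:
                    self.blocks.popitem(last=False)   # evict the LRU block
                self.blocks[key] = True
            return hit

    class LFUCache:
        """Frequency-based policy: evict the least frequently used block."""
        def __init__(self, capacity):
            self.capacity = capacity
            self.counts = {}                          # key -> reference count

        def access(self, key):
            hit = key in self.counts
            if not hit and len(self.counts) >= self.capacity:
                victim = min(self.counts, key=self.counts.get)
                del self.counts[victim]               # evict the LFU block
            self.counts[key] = self.counts.get(key, 0) + 1
            return hit

    def two_level_access(client, server, key):
        """Only client misses reach the server, filtering out the
        temporal locality that LRU depends on."""
        if client.access(key):
            return "client hit"
        return "server hit" if server.access(key) else "disk"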
A second paper, entitled
Write Caching in Distributed File Systems,
extended this work to cover write requests through the use of special write caches.
Concerns about data integrity have traditionally led to the use of very
conservative ``write-through'' approaches that immediately propagate
the results of updates to cached blocks to the file server.
This means that the benefits of caching are often
not available to write requests. The increasing availability of
non-volatile memory technologies now provides the opportunity to delay
the write-back of changed blocks without fear of compromising the
integrity of the stored data. Delayed write-back improves performance
in two ways: first, the effects
of a series of operations on the same block can be reflected in a
single write-back; second, optimization techniques can be applied
to amortize the cost of
disk access over multiple write-backs.
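The first of these benefits, often called write absorption, can be
sketched as follows. This is not the write-cache design evaluated in
the paper; the write_back callback and the buffer layout are
assumptions for illustration.

    class WriteCache:
        """Delayed write-back buffer (e.g., held in non-volatile memory).
        Repeated writes to a cached block overwrite it in place, so a
        whole series of updates costs a single eventual write-back."""
        def __init__(self):
            self.dirty = {}                 # (file, block) -> latest contents
            self.writes_absorbed = 0

        def write(self, key, data):
            if key in self.dirty:
                self.writes_absorbed += 1   # absorbed: no extra disk I/O
            self.dirty[key] = data

        def flush(self, write_back):
            """Propagate each dirty block to disk exactly once."""
            for key, data in self.dirty.items():
                write_back(key, data)
            self.dirty.clear()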
Again, our results showed that approaches that perform well at client
caches may not be good choices at server caches.
At client caches, locality-based approaches once again perform well,
while at the server it is
more important to consider the cost of disk access, which favours a
purging-based approach.
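A purging-style policy can be sketched in the same spirit. The paper's
exact policy is not reproduced here; the threshold parameter, the
(address, data) layout, and the write_batch callback are assumptions,
and the point is simply that flushing many dirty blocks together,
ordered by disk address, amortizes the cost of disk access.

    def purge(dirty, write_batch, threshold):
        """Purging-based write-back: once the write cache fills past
        `threshold`, flush a batch of dirty blocks together, sorted by
        disk address so one sweep of the arm services many write-backs."""
        if len(dirty) < threshold:
            return
        batch = sorted(dirty.values(), key=lambda item: item[0])  # (addr, data)
        write_batch(batch)
        dirty.clear()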
A third paper, entitled
The Effect of Client Caching on File Server Workloads,
addressed several remaining questions, including the issue of
scalability. In a real client-server
system, two effects contribute to the disruption of the stream of
requests presented to the server cache: the ``filtering'' effect
due to the presence of client caches, and an ``interleaving'' effect
due to the presence of multiple clients.
Both factors were shown to affect the server workload significantly, with
filtering the dominant effect.
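These two effects can be separated cleanly in a simulation: filtering
replaces each client's raw reference stream with its miss stream, while
interleaving merges the per-client streams into the single stream the
server sees. A minimal sketch of the merge step, assuming each stream
yields (timestamp, request) pairs in timestamp order:

    import heapq

    def server_request_stream(client_miss_streams):
        """Merge per-client miss streams by timestamp into the single
        request stream presented to the server cache. The filtering
        effect is already embodied in the inputs (misses only); the
        merge models the interleaving effect of multiple clients."""
        return heapq.merge(*client_miss_streams, key=lambda event: event[0])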