Author: Francis A. Gardner
Source: GZipped PostScript (201kb); Adobe PDF (1104kb)
This thesis addresses automated performance management in a distributed system, focusing on the detection of resource contention.
This thesis describes methods for detecting symptoms of performance problems, drawing on ideas from statistical process control and time series modelling. It also presents a framework for detecting performance problems in a distributed system. The framework allows different problem-detection methods to be explored, and builds on another project, Control Room, which displays information about resource use in a distributed system. The framework consists of two parts: a management language, used to describe the system under management and the symptoms of problems associated with it, and an alarm manager, which detects and reports these problems. An implementation of the framework is also described.
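As a rough illustration of the statistical process control idea mentioned above, the sketch below flags resource-usage readings that fall outside control limits derived from a baseline period. It is not taken from the thesis: the function names, the three-standard-deviation threshold, and the CPU utilisation figures are all assumptions chosen for the example, and the thesis's own management language and alarm manager may work quite differently.

```python
from statistics import mean, stdev

def control_limits(baseline, k=3.0):
    """Return (lower, upper) limits: baseline mean +/- k sample standard deviations.

    Hypothetical helper; k=3.0 is the conventional Shewhart choice, assumed here.
    """
    centre = mean(baseline)
    spread = stdev(baseline)
    return centre - k * spread, centre + k * spread

def out_of_control(samples, limits):
    """Return the indices of samples that fall outside the control limits."""
    lower, upper = limits
    return [i for i, x in enumerate(samples) if x < lower or x > upper]

# Made-up CPU utilisation readings (percent) from a period assumed to be normal.
baseline = [40, 42, 38, 41, 39, 43, 40, 41]
limits = control_limits(baseline)

# New readings to check; the fourth one (index 3) is far above the baseline range.
alarms = out_of_control([41, 39, 44, 95, 40], limits)
print(alarms)
```

In this sketch a symptom is simply a reading outside the limits. A time series model could play the same role by comparing each reading against a forecast instead of a fixed baseline mean.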