PhD Thesis

"Application Level RunTime Load Management:
A Bayesian Approach"

Luís Paulo Peixoto dos Santos

June, 2001

Up Teaching Publications Presentations PhD Thesis Hobbies

horizontal rule

Complete Thesis: (PDF, 1899 KBytes)

horizontal rule

Abstract

"Affordable parallel computing on distributed shared systems requires novel approaches to manage the runtime load distribution, since current algorithms fall below expectations. The efficient execution of irregular parallel applications, on dynamically shared computing clusters, has an unpredictable dynamic behaviour, due both to the application requirements and to the available system's resources. This thesis addresses the explicit inclusion of the uncertainty an application level scheduling agent has about the environment, on its internal model of the world and on its decision making mechanism. 

Bayesian decision networks are introduced and a generic framework is proposed for application level scheduling, where a probabilistic inference algorithm helps the scheduler to efficiently make decisions with improved predictions, based on available incomplete and aged measured data. An application level performance model and associated metrics (performance, environment and overheads) are proposed to obtain application and system behaviour estimates, to include in the scheduling agent's model and to help the evaluation. 

To verify that this novel approach improves the overall application execution time and the scheduling efficiency, a parallel ray tracer was developed as a message passing irregular data parallel application, and an execution model prototype was built to run on a seven time--shared nodes computing cluster, with dynamically variable synthetic workloads. To assess the effectiveness of the
load management, the stochastic scheduler was evaluated rendering several complex scenes, and compared with three reference
scheduling strategies: a uniform work distribution, a demand driven work allocation and a sensor based deterministic scheduling strategy. 

The evaluation results show considerable performance improvements over blind strategies, and stress the decision network based scheduler improvements over the sensor based deterministic approach of identical complexity."

horizontal rule