A FLEXIBLE APPROACH FOR ORCHESTRATING ADAPTIVE SCIENTIFIC WORKFLOWS FOR SCALABLE COMPUTING

Date

2022

Abstract

Modern scientific workflows are becoming more complex as they incorporate non-traditional computation methods and as advances in technology enable on-the-fly analysis. These workflows exhibit unpredictable runtime behaviors and have dynamic requirements. For example, such workflows must maintain overall performance and throughput while dealing with undesired events, adapting to failures, and supporting data-driven adaptive analysis. A fixed, predetermined resource assignment, common on HPC machines, is inefficient for overall performance, throughput, and data-driven adaptive analysis. While solutions exist to enable elastic resource management, there is no support for managing workflows at runtime to determine when the resource assignment and/or the runtime state of tasks (i.e., stopping, starting, changing the task parameters for adapting analysis, or changing how data is sent/received by the workflow tasks) needs to be revised, and to perform the feasible changes at runtime accordingly.

This dissertation provides a flexible and portable model, DYFLOW, with strategies to automate the management of scalable and adaptive workflows. The model gathers runtime statistics, tracks the occurrence of important events, and finalizes a plan of action to execute in response to the events that occurred, by mediating between suggested actions with respect to the running state of the workflow tasks and resource availability. Further, the model supports a wide range of constructs and tunable parameters that allow users to express events of interest, select prospective responses, and set preferences that define the service expectation, e.g., throughput, performance, resilience to failures, or quality of results. To showcase that the DYFLOW model supports the adaptive functionality desired for emerging workflows, several examples of problematic behavior are demonstrated in which DYFLOW accommodates the specific requirements and automates the runtime management process for scientists while delivering the desired quality of service.
