Using POMDP as Modeling Framework for Network Fault Management
Abstract
For high-speed networks, it is important that fault management be proactive, i.e., that it detect, diagnose, and mitigate problems before they result in severe degradation of network performance. Proactive fault management depends on monitoring the network to obtain the data on which to base manager decisions. However, monitoring introduces additional overhead that may itself degrade network performance, especially when the network is in a stressed state. Thus, a tradeoff must be made between the amount of data collected and transferred on the one hand, and the speed and accuracy of fault detection and diagnosis on the other. Such a tradeoff can be naturally formulated as a partially observable Markov decision process (POMDP).
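To make the monitoring tradeoff concrete, the following is a minimal sketch of a POMDP belief update, assuming a hypothetical two-state network (normal or faulty) and two hypothetical monitoring levels. The state names, the action names `light` and `heavy`, and every probability and cost are illustrative assumptions, not values from this work.

```python
# Hypothetical two-state model: state 0 = normal, state 1 = faulty.
# All numbers are illustrative assumptions, not values from the source.

# T[s][s2]: probability of moving from state s to state s2 in one step.
T = [[0.95, 0.05],
     [0.00, 1.00]]

# O[action][s2][obs]: probability of observing obs (0 = no alarm, 1 = alarm)
# in state s2. Heavier monitoring is assumed to be more accurate.
O = {
    "light": [[0.80, 0.20], [0.40, 0.60]],
    "heavy": [[0.98, 0.02], [0.05, 0.95]],
}

# Assumed per-step monitoring overhead for each action.
COST = {"light": 0.1, "heavy": 1.0}


def belief_update(belief, action, obs):
    """Bayes filter: predict with T, then correct with the observation model."""
    n = len(belief)
    predicted = [sum(belief[s] * T[s][s2] for s in range(n)) for s2 in range(n)]
    unnorm = [predicted[s2] * O[action][s2][obs] for s2 in range(n)]
    total = sum(unnorm)
    if total == 0.0:
        raise ValueError("observation has zero probability under this belief")
    return [u / total for u in unnorm]


if __name__ == "__main__":
    prior = [0.9, 0.1]
    for action in ("light", "heavy"):
        posterior = belief_update(prior, action, obs=1)
        print(action, "cost:", COST[action], "P(faulty | alarm):", posterior[1])
```

Under these assumed numbers, an alarm seen with heavy monitoring shifts the belief toward "faulty" more than the same alarm seen with light monitoring, but at a higher overhead. That is the tradeoff the POMDP formulation is meant to capture.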
Since the exact solution of POMDPs for a realistic number of states is computationally prohibitive, we develop a fast reinforcement-learning-based algorithm that learns the decision rule in an approximate network simulator, so that the rule can be deployed quickly to the real network. Simulation results are given for diagnosing a switch fault in an ATM network. This approach can be applied to centralized fault management or used to construct intelligent agents for distributed fault management.
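The idea of learning a decision rule in a simulator can be illustrated with generic tabular Q-learning over a discretized belief. This is a stand-in technique, not the algorithm developed in this work, and the toy simulator, the `repair` action, and all probabilities and costs are hypothetical assumptions chosen only to keep the sketch runnable.

```python
import random

# Hypothetical toy simulator: hidden state 0 = normal, 1 = faulty.
# Every constant below is an illustrative assumption.
P_FAULT = 0.05                                   # chance a normal switch fails per step
P_ALARM = {"light": (0.20, 0.60),                # (P(alarm | normal), P(alarm | faulty))
           "heavy": (0.02, 0.95)}
COST = {"light": 0.1, "heavy": 1.0, "repair": 3.0}
FAULT_COST = 2.0                                 # penalty per step spent in the faulty state
ACTIONS = ("light", "heavy", "repair")
BINS = 10                                        # discretization of P(faulty)


def bin_of(belief):
    """Map a fault belief in [0, 1] to a discrete bin index."""
    return min(int(belief * BINS), BINS - 1)


def step(state, belief, action, rng):
    """Advance the toy simulator one step; return (state, belief, reward)."""
    if action == "repair":
        state, belief = 0, 0.0                   # repair is assumed to always succeed
    if state == 0 and rng.random() < P_FAULT:    # hidden transition; faults persist
        state = 1
    belief = belief + (1.0 - belief) * P_FAULT   # prediction step of the belief filter
    if action != "repair":                       # monitoring yields an alarm observation
        p_ok, p_bad = P_ALARM[action]
        alarm = rng.random() < (p_bad if state == 1 else p_ok)
        like_bad = p_bad if alarm else 1.0 - p_bad
        like_ok = p_ok if alarm else 1.0 - p_ok
        num = belief * like_bad
        belief = num / (num + (1.0 - belief) * like_ok)
    reward = -COST[action] - (FAULT_COST if state == 1 else 0.0)
    return state, belief, reward


def train(episodes=2000, horizon=50, alpha=0.1, gamma=0.95, eps=0.1, seed=0):
    """Tabular Q-learning with epsilon-greedy exploration over belief bins."""
    rng = random.Random(seed)
    q = [[0.0] * len(ACTIONS) for _ in range(BINS)]
    for _episode in range(episodes):
        state, belief = 0, 0.0
        for _t in range(horizon):
            s = bin_of(belief)
            if rng.random() < eps:
                a = rng.randrange(len(ACTIONS))
            else:
                a = max(range(len(ACTIONS)), key=lambda i: q[s][i])
            state, belief, r = step(state, belief, ACTIONS[a], rng)
            s2 = bin_of(belief)
            q[s][a] += alpha * (r + gamma * max(q[s2]) - q[s][a])
    return q


def greedy_rule(q):
    """Extract the learned decision rule: one action per belief bin."""
    return [ACTIONS[max(range(len(ACTIONS)), key=lambda i: row[i])] for row in q]


if __name__ == "__main__":
    print(greedy_rule(train()))
```

The table returned by `train` is small enough to ship as a lookup rule, which is one way a policy learned offline could be deployed cheaply. Belief bins that the simulator never visits keep their initial values, so the rule for those bins is arbitrary.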