APPLYING POLICY GRADIENT METHODS TO OPEN-ENDED DOMAINS

dc.contributor.advisor: Dickerson, John P
dc.contributor.author: Sullivan, Ryan
dc.contributor.department: Computer Science
dc.contributor.publisher: Digital Repository at the University of Maryland
dc.contributor.publisher: University of Maryland (College Park, Md.)
dc.date.accessioned: 2025-08-08T12:21:22Z
dc.date.issued: 2025
dc.description.abstract: Deep reinforcement learning (RL) has been successfully used to train agents in complex video game environments including Starcraft 2, Dota 2, Minecraft, and Gran Turismo. Each of these projects utilized curriculum learning to train agents more efficiently. However, systematic investigations of curriculum learning are limited, and it is rarely studied outside of toy research environments. Modern RL methods still struggle in stochastic, sparse-reward environments with long planning horizons. This thesis studies these challenges from multiple perspectives to develop a stronger empirical understanding of curriculum learning in complex environments. By introducing novel visualization techniques for reward surfaces and empirically investigating key implementation details, it explores why policy gradient methods alone are insufficient for sparse-reward tasks. These findings motivate the use of curriculum learning to decompose problems into learnable subtasks and to prioritize learnable objectives. Building on these insights, this dissertation presents a general-purpose library for curriculum learning and uses it to evaluate popular automatic curriculum learning algorithms on challenging RL environments. Curricula have historically been effective for training reinforcement learning agents, and a fundamental understanding of automatic curriculum learning is an essential step toward developing generally capable agents in open-ended environments.
dc.identifier: https://doi.org/10.13016/scud-hx7z
dc.identifier.uri: http://hdl.handle.net/1903/34302
dc.language.iso: en
dc.subject.pqcontrolled: Artificial intelligence
dc.subject.pquncontrolled: Curriculum Learning
dc.subject.pquncontrolled: Open-Endedness
dc.subject.pquncontrolled: Reinforcement Learning
dc.title: APPLYING POLICY GRADIENT METHODS TO OPEN-ENDED DOMAINS
dc.type: Dissertation

Files

Original bundle

Name: Sullivan_umd_0117E_25171.pdf
Size: 52.52 MB
Format: Adobe Portable Document Format