APPLYING POLICY GRADIENT METHODS TO OPEN-ENDED DOMAINS

dc.contributor.advisor: Dickerson, John P
dc.contributor.author: Sullivan, Ryan
dc.contributor.department: Computer Science
dc.contributor.publisher: Digital Repository at the University of Maryland
dc.contributor.publisher: University of Maryland (College Park, Md.)
dc.date.accessioned: 2025-08-08T12:21:22Z
dc.date.issued: 2025
dc.description.abstract: Deep reinforcement learning (RL) has been successfully used to train agents in complex video game environments including Starcraft 2, Dota 2, Minecraft, and Gran Turismo. Each of these projects utilized curriculum learning to train agents more efficiently. However, systematic investigations of curriculum learning are limited, and it is rarely studied outside of toy research environments. Modern RL methods still struggle in stochastic, sparse-reward environments with long planning horizons. This thesis studies these challenges from multiple perspectives to develop a stronger empirical understanding of curriculum learning in complex environments. By introducing novel visualization techniques for reward surfaces and empirically investigating key implementation details, it explores why policy gradient methods alone are insufficient for sparse-reward tasks. These findings motivate the use of curriculum learning to decompose problems into learnable subtasks and to prioritize learnable objectives. Building on these insights, this dissertation presents a general-purpose library for curriculum learning and uses it to evaluate popular automatic curriculum learning algorithms on challenging RL environments. Curricula have historically been effective for training reinforcement learning agents, and a fundamental understanding of automatic curriculum learning is an essential step toward developing generally capable agents in open-ended environments.
dc.identifier: https://doi.org/10.13016/scud-hx7z
dc.identifier.uri: http://hdl.handle.net/1903/34302
dc.language.iso: en
dc.subject.pqcontrolled: Artificial intelligence
dc.subject.pquncontrolled: Curriculum Learning
dc.subject.pquncontrolled: Open-Endedness
dc.subject.pquncontrolled: Reinforcement Learning
dc.title: APPLYING POLICY GRADIENT METHODS TO OPEN-ENDED DOMAINS
dc.type: Dissertation

Files

Original bundle

Name: Sullivan_umd_0117E_25171.pdf
Size: 52.52 MB
Format: Adobe Portable Document Format