MANIPULATION ACTION UNDERSTANDING FOR OBSERVATION AND EXECUTION

dc.contributor.advisor: Aloimonos, Yiannis
dc.contributor.advisor: Fermuller, Cornelia
dc.contributor.author: Yang, Yezhou
dc.contributor.department: Computer Science
dc.contributor.publisher: Digital Repository at the University of Maryland
dc.contributor.publisher: University of Maryland (College Park, Md.)
dc.date.accessioned: 2016-02-06T06:37:59Z
dc.date.available: 2016-02-06T06:37:59Z
dc.date.issued: 2015
dc.description.abstract: Modern intelligent agents will need to learn the actions that humans perform. They will need to recognize these actions when they see them, and they will need to perform these actions themselves. We propose a cognitive system that interprets human manipulation actions from perceptual information (image and depth data) and consists of perceptual modules and reasoning modules that interact with each other. The contributions of this work address two core problems at the heart of action understanding: (a) the grounding of relevant information about actions in perception (the perception-action integration problem), and (b) the organization of perceptual and high-level symbolic information for interpreting the actions (the sequencing problem). At the high level, actions are represented with the Manipulation Action Context-Free Grammar (MACFG), a syntactic grammar with associated parsing algorithms, which organizes actions as a sequence of sub-events. Each sub-event is described by the hand (including grasp type), the movements (actions), and the objects and tools involved; the relevant information about these quantities is obtained from biologically inspired perception modules. These modules track the hands and objects, recognize grasp types and actions, segment the action stream, and detect action consequences. Furthermore, a probabilistic semantic parsing framework based on CCG (Combinatory Categorial Grammar) theory is adopted to model the semantic meaning of human manipulation actions. Additionally, the lesson from findings on mirror neurons is that the two processes, interpreting visually observed actions and generating actions, should share the same underlying cognitive process. Recent studies have shown that grammatical structures underlie the representation of manipulation actions and are used both to understand and to execute these actions. By analogy, understanding manipulation actions is like understanding language, while executing them is like generating language. Experiments on two tasks, (1) a robot observing people performing manipulation actions and (2) a robot then executing manipulation actions accordingly, are presented to validate the formalism. The technical parts of this thesis are devoted to the experimental setting of task (1), while task (2) is presented as a live demonstration.
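To make the grammar-based representation concrete, the following is a minimal sketch of how a toy manipulation-action grammar can be parsed with NLTK's chart parser. The production rules, nonterminal names (AP, H, A, O), and vocabulary are illustrative assumptions for this sketch, not the dissertation's actual MACFG rules.

    import nltk

    # Toy context-free grammar in the spirit of a manipulation action
    # grammar: an action phrase (AP) decomposes into a hand (H), an
    # action (A), and an object (O). Illustrative only, not the MACFG.
    toy_grammar = nltk.CFG.fromstring("""
        AP -> H A O
        H -> 'lefthand' | 'righthand'
        A -> 'grasp' | 'cut' | 'pour'
        O -> 'knife' | 'cucumber' | 'cup'
    """)

    parser = nltk.ChartParser(toy_grammar)

    # Parse one observed sub-event, i.e. the kind of symbol sequence a
    # perception module might emit, and print the resulting parse tree.
    for tree in parser.parse(['righthand', 'cut', 'cucumber']):
        tree.pretty_print()

In a formalism of this kind, such a parse tree would act as the symbolic interface between perception (which supplies the terminal symbols) and reasoning; read generatively, the same grammar can drive execution, mirroring the understanding/generation analogy in the abstract.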
dc.identifier: https://doi.org/10.13016/M2VH90
dc.identifier.uri: http://hdl.handle.net/1903/17262
dc.language.iso: en
dc.subject.pqcontrolled: Computer science
dc.subject.pqcontrolled: Robotics
dc.subject.pqcontrolled: Information technology
dc.subject.pquncontrolled: Action Consequence
dc.subject.pquncontrolled: Action Grammar
dc.subject.pquncontrolled: Action Recognition
dc.subject.pquncontrolled: Grasping Type Recognition
dc.subject.pquncontrolled: Manipulation Actions
dc.subject.pquncontrolled: Procedural Learning
dc.title: MANIPULATION ACTION UNDERSTANDING FOR OBSERVATION AND EXECUTION
dc.type: Dissertation

Files

Original bundle

Name: Yang_umd_0117E_16672.pdf
Size: 34.19 MB
Format: Adobe Portable Document Format