MANIPULATION ACTION UNDERSTANDING FOR OBSERVATION AND EXECUTION

dc.contributor.advisor: Aloimonos, Yiannis
dc.contributor.advisor: Fermuller, Cornelia
dc.contributor.author: Yang, Yezhou
dc.contributor.department: Computer Science
dc.contributor.publisher: Digital Repository at the University of Maryland
dc.contributor.publisher: University of Maryland (College Park, Md.)
dc.date.accessioned: 2016-02-06T06:37:59Z
dc.date.available: 2016-02-06T06:37:59Z
dc.date.issued: 2015
dc.description.abstract: Modern intelligent agents will need to learn the actions that humans perform. They will need to recognize these actions when they see them, and they will need to perform these actions themselves. We propose a cognitive system that interprets human manipulation actions from perceptual information (image and depth data) and consists of perceptual modules and reasoning modules that interact with each other. The contributions of this work address two core problems at the heart of action understanding: (a) the grounding of relevant information about actions in perception (the perception-action integration problem), and (b) the organization of perceptual and high-level symbolic information for interpreting the actions (the sequencing problem). At the high level, actions are represented with the Manipulation Action Context-Free Grammar (MACFG), a syntactic grammar with associated parsing algorithms, which organizes actions as a sequence of sub-events. Each sub-event is described by the hand (including grasp type), the movements (actions), and the objects and tools involved; the relevant information about these quantities is obtained from biologically inspired perception modules. These modules track the hands and objects, recognize grasp types and actions, segment the action stream, and detect action consequences. Furthermore, a probabilistic semantic parsing framework based on CCG (Combinatory Categorial Grammar) theory is adopted to model the semantic meaning of human manipulation actions. Additionally, the lesson from findings on mirror neurons is that the two processes, interpreting visually observed actions and generating actions, should share the same underlying cognitive process. Recent studies have shown that grammatical structures underlie the representation of manipulation actions and are used both to understand and to execute these actions. By analogy, understanding manipulation actions is like understanding language, while executing them is like generating language. Experiments on two tasks, (1) a robot observing people performing manipulation actions and (2) a robot then executing manipulation actions accordingly, are presented to validate the formalism. The technical parts of this thesis are devoted to the experimental setting of task (1), while task (2) is presented as a live demonstration.
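To make the grammar-based representation concrete, the following is a minimal sketch of how a toy manipulation-action grammar can be parsed with NLTK's chart parser. The production rules, nonterminal names (AP, H, A, O), and vocabulary are illustrative assumptions for this sketch, not the dissertation's actual MACFG rules.

    import nltk

    # Toy context-free grammar in the spirit of a manipulation action
    # grammar: an action phrase (AP) decomposes into a hand (H), an
    # action (A), and an object (O). Illustrative only, not the MACFG.
    toy_grammar = nltk.CFG.fromstring("""
        AP -> H A O
        H -> 'lefthand' | 'righthand'
        A -> 'grasp' | 'cut' | 'pour'
        O -> 'knife' | 'cucumber' | 'cup'
    """)

    parser = nltk.ChartParser(toy_grammar)

    # Parse one observed sub-event, i.e. the kind of symbol sequence a
    # perception module might emit, and print the resulting parse tree.
    for tree in parser.parse(['righthand', 'cut', 'cucumber']):
        tree.pretty_print()

In a formalism of this kind, such a parse tree would act as the symbolic interface between perception (which supplies the terminal symbols) and reasoning; read generatively, the same grammar can drive execution, mirroring the understanding/generation analogy in the abstract.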
dc.identifier: https://doi.org/10.13016/M2VH90
dc.identifier.uri: http://hdl.handle.net/1903/17262
dc.language.iso: en
dc.subject.pqcontrolled: Computer science
dc.subject.pqcontrolled: Robotics
dc.subject.pqcontrolled: Information technology
dc.subject.pquncontrolled: Action Consequence
dc.subject.pquncontrolled: Action Grammar
dc.subject.pquncontrolled: Action Recognition
dc.subject.pquncontrolled: Grasping Type Recognition
dc.subject.pquncontrolled: Manipulation Actions
dc.subject.pquncontrolled: Procedural Learning
dc.title: MANIPULATION ACTION UNDERSTANDING FOR OBSERVATION AND EXECUTION
dc.type: Dissertation

Files

Original bundle

Name: Yang_umd_0117E_16672.pdf
Size: 34.19 MB
Format: Adobe Portable Document Format