From Form to Function: Detecting the Affordance of Tool Parts using Geometric Features and Material Cues

Thumbnail Image
Publication or External Link
Myers, Austin Oliver
Aloimonos, Yiannis
Fermuller, Cornelia
With recent advances in robotics, general purpose robots like Baxter are quickly becoming a reality. As robots begin to collaborate with humans in everyday workspaces, they will need to understand the functions of objects and their parts. To cut an apple or hammer a nail, robots need to not just know a tool’s name, but they must find its parts and identify their potential functions, or affordances. As Gibson remarked, “If you know what can be done with a[n] object, what it can be used for, you can call it whatever you please.” We hypothesize that the geometry of a part is closely related to its affordance, since its geometric properties govern the possible physical interactions with the environment. In the first part of this thesis, we investigate how the affordances of tool parts can be predicted using geometric features from RGB-D sensors like Kinect. We develop several approaches to learn affordance from geometric features: using superpixel based hierarchical sparse coding, structured random forests, and convolutional neural networks. To evaluate the proposed methods, we construct a large RGB-D dataset where parts are labeled with multiple affordances. Experiments over sequences containing clutter, occlusions, and viewpoint changes show that the approaches provide precise predictions that can be used in robotics applications. In addition to geometry, the material properties of a part also determine its potential functions. In the second part of this thesis, we investigate how material cues can be integrated into a deep learning framework for affordance prediction. We propose a modular approach for combining high-level material information, or other mid-level cues, in order to improve affordance predictions. We present experiments which demonstrate the efficacy of our approach on an expanded RGB-D dataset, which includes data from non-tool objects and multiple depth sensors. The work presented in this thesis lays a foundation for the development of robots which can predict the potential functions of tool parts, and provides a basis for higher level reasoning about affordance.