From Form to Function: Detecting the Affordance of Tool Parts using Geometric Features and Material Cues


With recent advances in robotics, general-purpose robots like Baxter are quickly becoming a reality. As robots begin to collaborate with humans in everyday workspaces, they will need to understand the functions of objects and their parts. To cut an apple or hammer a nail, a robot must not only know a tool’s name but also find its parts and identify their potential functions, or affordances. As Gibson remarked, “If you know what can be done with a[n] object, what it can be used for, you can call it whatever you please.”

We hypothesize that the geometry of a part is closely related to its affordance, since its geometric properties govern the possible physical interactions with the environment. In the first part of this thesis, we investigate how the affordances of tool parts can be predicted using geometric features from RGB-D sensors like Kinect. We develop several approaches to learning affordances from geometric features: superpixel-based hierarchical sparse coding, structured random forests, and convolutional neural networks. To evaluate the proposed methods, we construct a large RGB-D dataset in which parts are labeled with multiple affordances. Experiments over sequences containing clutter, occlusions, and viewpoint changes show that the approaches provide precise predictions that can be used in robotics applications.
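
As a rough illustration of what a geometric feature from a depth sensor can look like, the sketch below estimates a per-pixel surface normal from a depth map using central differences. The abstract does not say which features the thesis uses, so this is only one common choice and not the author's method. The function name `surface_normals`, the list-of-rows input format, and the orthographic treatment of the depth map (camera intrinsics are ignored) are all assumptions made for this example.

```python
import math


def surface_normals(depth):
    """Estimate a unit surface normal for each interior pixel of a depth map.

    depth: non-empty list of equal-length rows of depth values
           (hypothetical input format, not taken from the thesis).
    The depth map is treated as a height field z(u, v); camera intrinsics
    are ignored, which is a simplifying assumption.
    Border pixels have no central difference and are left as None.
    """
    height, width = len(depth), len(depth[0])
    normals = [[None] * width for _ in range(height)]
    for v in range(1, height - 1):
        for u in range(1, width - 1):
            # Central differences approximate the depth gradient.
            dzdx = (depth[v][u + 1] - depth[v][u - 1]) / 2.0
            dzdy = (depth[v + 1][u] - depth[v - 1][u]) / 2.0
            # The normal of the surface z(u, v) is proportional to
            # (-dz/du, -dz/dv, 1); normalise it to unit length.
            nx, ny, nz = -dzdx, -dzdy, 1.0
            length = math.sqrt(nx * nx + ny * ny + nz * nz)
            normals[v][u] = (nx / length, ny / length, nz / length)
    return normals


# A flat surface facing the sensor and a surface tilted along the u axis.
flat = [[2.0] * 3 for _ in range(3)]
tilted = [[float(u) for u in range(3)] for _ in range(3)]
flat_normal = surface_normals(flat)[1][1]
tilted_normal = surface_normals(tilted)[1][1]
```

On the flat map the centre normal points straight along the viewing axis, while on the tilted map it leans against the direction of increasing depth. A per-pixel map of such normals could then be fed, alongside raw depth, to any of the learners named above.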

In addition to geometry, the material properties of a part also determine its potential functions. In the second part of this thesis, we investigate how material cues can be integrated into a deep learning framework for affordance prediction. We propose a modular approach for combining high-level material information, or other mid-level cues, in order to improve affordance predictions. We present experiments that demonstrate the efficacy of our approach on an expanded RGB-D dataset, which includes data from non-tool objects and multiple depth sensors. The work presented in this thesis lays a foundation for the development of robots that can predict the potential functions of tool parts, and provides a basis for higher-level reasoning about affordance.
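
One simple way to picture a modular combination of cues is to treat each cue as an extra set of per-pixel channels and join them before prediction. The sketch below does only that. It is not the architecture proposed in the thesis, which the abstract does not describe; the function name `stack_cues` and the channel layouts are hypothetical.

```python
def stack_cues(geometric, material):
    """Concatenate two per-pixel feature maps along the channel axis.

    geometric: H x W grid of tuples, e.g. (depth, nx, ny, nz)
               (hypothetical channel layout).
    material:  H x W grid of tuples of material-class scores
               (hypothetical channel layout).
    Returns an H x W grid whose pixels hold the joined channel tuples.
    """
    if len(geometric) != len(material):
        raise ValueError("feature maps must have the same height")
    fused = []
    for g_row, m_row in zip(geometric, material):
        if len(g_row) != len(m_row):
            raise ValueError("feature maps must have the same width")
        # Each pixel keeps its geometric channels and gains the material ones.
        fused.append([tuple(g) + tuple(m) for g, m in zip(g_row, m_row)])
    return fused


# One pixel with four geometric channels and two material scores.
geometry_map = [[(1.0, 0.0, 0.0, 1.0)]]
material_map = [[(0.9, 0.1)]]
fused_map = stack_cues(geometry_map, material_map)
```

Because the material channels are simply appended, another mid-level cue could be swapped in or added without changing the geometric part of the input, which is the sense of "modular" assumed in this example.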