Action compositionality with focus on neurodevelopmental disorders

A central question in motor neuroscience is how the Central Nervous System (CNS) handles flexibility at the effector level, that is, how the brain solves what Nikolai Bernstein called the “degrees of freedom problem”: the task of controlling a much larger number of degrees of freedom (dofs) than is often needed to produce behavior. Flexibility is a blessing and a curse: while it enables the same body to engage in a virtually infinite number of behaviors, the CNS is left with the job of figuring out the right subset of dofs to use and how to control and coordinate them. Similarly, at the level of perception, the CNS seeks to obtain information about an action and the actors involved from the perceived motion of other people’s dofs.

This problem is believed to be solved with a particular dimensionality reduction strategy, in which action production consists of tuning only a few parameters that control and coordinate a small number of motor primitives, and action perception takes place by applying grouping processes that solve the inverse problem, that is, identifying the motor primitives and the corresponding tuning parameters used by an actor. These parameters can encode not only information on the action per se, but also higher-order cognitive cues like body language or emotion. This compositional view of action representation has an obvious parallel with language: we can think of primitives as words and cognitive systems (motor, perceptual) as different languages.

Little is known, however, about how words/primitives would be formed from low-level signals measured at each dof. Here we introduce the SB-ST method, a bottom-up approach to finding full-body postural primitives in two steps: first, we find a set of key postures, that is, vectors corresponding to key relationships among dofs (such as joint rotations), which we call the spatial basis (SB); second, we impose a parametric model on the spatio-temporal (ST) profiles of each SB vector. We showcase the method by applying SB vectors and ST parameters to study vertical jumps, recorded with motion capture, of young adults (YAD), typically developing (TD) children, and children with Developmental Coordination Disorder (DCD).

We also cover a number of other topics related to compositionality. We introduce a top-down system of tool-use primitives based on kinematic events between body parts and objects. The kinematic basis of these events is inspired by the hand-to-object velocity signature reported by movement psychologists in the 1980s. We discuss the need for custom-made movement measurement strategies to study action primitives in some target populations, for example infants. Having the right tools to record infant movement would help, for example, research on Autism Spectrum Disorder (ASD), where early sensorimotor abnormalities have been shown to be linked to future diagnoses of ASD and to the development of the typical social traits ASD is mostly known for. We continue the discussion of infant movement measurement by presenting an alternative way of processing movement data, which uses textual descriptions as replacements for the actual movement signals observed in infant behavioral trials. We exploit the fact that these clinical descriptions are freely available as a byproduct of the diagnosis process itself. A typical/atypical classification experiment shows that, at the level of sentences, text features traditionally used in Natural Language Processing, such as term frequencies and TF-IDF computed from unigrams and bigrams, can potentially be helpful.
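
As a rough illustration of the sentence-level text features mentioned above, the following minimal sketch computes TF-IDF weights over unigrams and bigrams in plain Python. It is not the pipeline used in this work: the function names `ngrams` and `tfidf`, the whitespace tokenizer, the smoothed IDF variant, and the two toy sentences are all assumptions made for the example, and the classifier itself is left out.

```python
import math
from collections import Counter


def ngrams(sentence, n_max=2):
    """Return the lowercase unigrams and bigrams of a sentence (hypothetical helper)."""
    tokens = sentence.lower().split()
    grams = []
    for n in range(1, n_max + 1):
        grams += [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]
    return grams


def tfidf(sentences):
    """Return one {term: weight} dict per sentence, using raw counts and a smoothed IDF."""
    counts = [Counter(ngrams(s)) for s in sentences]
    n_docs = len(sentences)
    # Document frequency: number of sentences in which each term occurs.
    doc_freq = Counter(term for c in counts for term in c)
    # Smoothed IDF (an assumed, commonly used variant).
    idf = {t: math.log((1 + n_docs) / (1 + df)) + 1.0 for t, df in doc_freq.items()}
    return [{t: tf * idf[t] for t, tf in c.items()} for c in counts]


# Toy sentences invented for illustration only; they are not clinical data.
sentences = ["infant reaches for toy", "infant ignores toy"]
weights = tfidf(sentences)
```

Terms that occur in every sentence (here "infant" and "toy") receive the lowest IDF, so terms and bigrams specific to one description carry more weight in the resulting feature vectors.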

Finally, we sketch a conceptual, compositional model of action generation based on exploratory results on the jump data, according to which the presence of a disorder would be related not to differences in the key postures themselves, but to how they are controlled throughout execution. We then discuss how action and actor information is represented by analyzing a second dataset of arm-only data (bi-manual coordination and object manipulation) that covers more target populations than the jump dataset: TD and DCD children, YAD, and seniors with and without Parkinson’s Disease (PD). Multiple group analyses on dofs, coupled with the variance explained by the SB representations, suggest that the cost of representing a task as performed by an actor may be equivalent to the cost of representing the task alone.

In addition, group-discriminating information appears to be more compressed than task-only discriminating information, and because this compression happens in the top spatial bases, we conjecture that groups may be recognized faster than tasks.
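
The notion of information being concentrated in the top spatial bases can be pictured with a small sketch. The abstract does not specify how SB vectors are computed, so the code below assumes, purely for illustration, a PCA-style decomposition via the singular value decomposition of mean-centered joint-angle data. The function name `spatial_basis`, the synthetic data, and the choice of two bases are all hypothetical, and the parametric ST model is not attempted here.

```python
import numpy as np


def spatial_basis(joint_angles, n_bases=2):
    """PCA-style sketch: joint_angles has shape (n_frames, n_dofs).

    Returns the top basis vectors, their time profiles, and the fraction of
    variance each one explains. This is an assumed stand-in, not SB-ST itself.
    """
    centered = joint_angles - joint_angles.mean(axis=0)
    _, s, vt = np.linalg.svd(centered, full_matrices=False)
    explained = s**2 / np.sum(s**2)
    sb = vt[:n_bases]              # (n_bases, n_dofs): candidate key postures
    profiles = centered @ sb.T     # (n_frames, n_bases): activation over time
    return sb, profiles, explained[:n_bases]


# Synthetic movement: 10 dofs driven by 2 latent time courses plus small noise.
rng = np.random.default_rng(0)
t = np.linspace(0.0, 1.0, 200)
latent = np.column_stack([np.sin(2 * np.pi * t), t**2])
mixing = rng.normal(size=(2, 10))
data = latent @ mixing + 0.01 * rng.normal(size=(200, 10))

sb, profiles, explained = spatial_basis(data, n_bases=2)
```

In this toy setting almost all variance falls on the first two bases, which is the sense in which a few leading vectors can compress the information carried by many dofs.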