Parkinson’s disease analysis
2024-06-06
Preface
This notebook provides the code for the analysis described in the manuscript “Machine learning analysis of wearable sensor data from mobility testing distinguishes Parkinson’s disease from other forms of parkinsonism”. The work uses machine learning and movement data to distinguish idiopathic Parkinson’s disease (PD) from non-PD parkinsonism. Wearable sensor data were collected from a cohort of 260 individuals diagnosed with PD and 18 participants diagnosed with other forms of parkinsonism. Each participant performed five motor tasks: a 32-foot walk that involved walking back and forth four times with 180-degree turns between segments, standing with eyes open, standing with eyes closed, two trials of the Timed Up & Go test (TUG), and two trials of the cognitive TUG (cogTUG). In addition to sensor-derived features, non-sensor features, including demographics and clinical evaluation scores, were added to the feature set. Before constructing the classifiers, we applied forward feature selection to reduce the number of features and thereby the risk of overfitting. We then randomly divided the data into three groups using stratified sampling; two groups were used for training and the remaining group for testing, with each group serving as the test set once. This three-fold cross-validation was repeated five times with different seeds, and the final predicted class for each participant was determined by majority vote across the five repeats. For each training set, we randomly under-sampled the PD group to match the size of the non-PD parkinsonism group and then constructed a random forest model.
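The evaluation scheme described above (stratified three-fold cross-validation, random under-sampling of the majority class within each training set, five repeats with different seeds, and a per-participant majority vote) can be illustrated with a minimal Python sketch. This is not the notebook's code. All function names are hypothetical, the data are synthetic (only the group sizes of 260 and 18 are taken from the text), and a simple one-feature nearest-centroid rule stands in for the random forest so that the example needs only the standard library. Forward feature selection is omitted.

```python
import random
from collections import Counter, defaultdict


def stratified_folds(labels, n_folds, rng):
    """Assign sample indices to folds so each fold has similar class counts."""
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)
    folds = [[] for _ in range(n_folds)]
    for indices in by_class.values():
        rng.shuffle(indices)
        for position, idx in enumerate(indices):
            folds[position % n_folds].append(idx)
    return folds


def undersample(train_idx, labels, rng):
    """Randomly shrink every class to the size of the smallest class."""
    by_class = defaultdict(list)
    for idx in train_idx:
        by_class[labels[idx]].append(idx)
    smallest = min(len(indices) for indices in by_class.values())
    balanced = []
    for indices in by_class.values():
        balanced.extend(rng.sample(indices, smallest))
    return balanced


def fit_centroids(train_idx, features, labels):
    """Stand-in classifier (NOT a random forest): per-class mean of one feature."""
    sums, counts = defaultdict(float), Counter()
    for idx in train_idx:
        sums[labels[idx]] += features[idx]
        counts[labels[idx]] += 1
    return {label: sums[label] / counts[label] for label in counts}


def predict(centroids, value):
    """Return the class whose centroid is closest to the feature value."""
    return min(centroids, key=lambda label: abs(value - centroids[label]))


def repeated_cv_majority_vote(features, labels, n_folds=3, seeds=(0, 1, 2, 3, 4)):
    """Collect one out-of-fold prediction per sample per seed, then vote."""
    votes = [[] for _ in labels]
    for seed in seeds:
        rng = random.Random(seed)
        folds = stratified_folds(labels, n_folds, rng)
        for test_fold in range(n_folds):
            train_idx = [i for f, fold in enumerate(folds)
                         if f != test_fold for i in fold]
            # Balance classes in the training data only; the test fold is untouched.
            balanced = undersample(train_idx, labels, rng)
            model = fit_centroids(balanced, features, labels)
            for idx in folds[test_fold]:
                votes[idx].append(predict(model, features[idx]))
    # An odd number of repeats avoids ties between two classes.
    return [Counter(v).most_common(1)[0][0] for v in votes]


# Synthetic data for illustration only: group sizes follow the text,
# the single feature and its distributions are assumptions.
data_rng = random.Random(42)
labels = ["PD"] * 260 + ["nonPD"] * 18
features = [data_rng.gauss(0.0 if y == "PD" else 3.0, 1.0) for y in labels]

final = repeated_cv_majority_vote(features, labels)
print(Counter(final))
```

In this sketch the under-sampling is applied only to the training folds, so every participant in a test fold still receives a prediction, and each participant collects exactly one vote per repeat.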