Supplementary materials for characterization of high-yield mobility features to identify Parkinson’s disease with a wearable sensor

Files

README (2.1 KB) (RESTRICTED ACCESS)
rdata.zip (92.93 MB) (RESTRICTED ACCESS)
files.zip (1.52 MB) (RESTRICTED ACCESS)
sensor_data.zip (319.36 MB) (RESTRICTED ACCESS)
code_notebook.zip (16.18 MB) (RESTRICTED ACCESS)

Date

2023

Abstract

Quantitative mobility analysis using wearable sensors has the potential to identify, characterize, and manage patients with movement disorders, including Parkinson’s disease (PD). Nonetheless, such sensors are not yet part of routine clinical examinations, in large part because it is still unclear which mobility tasks, and which sensor-derived features per task, should be analyzed to maximize the yield of this type of mobility analysis. To address this knowledge gap, data from 262 participants with PD and 50 controls performing a series of motor tasks with a single wearable sensor on the lower back were analyzed using ensembles of heterogeneous machine learning models incorporating a wide range of classifiers and trained on a large set of features calculated from triaxial accelerometer and triaxial gyroscope signals. Our data show that sensor data analyzed with an ensemble of models effectively differentiate between participants with PD and controls. Furthermore, feature importance analysis revealed that a small number of more complex mobility tasks contribute the most informative features for accurate predictions, suggesting potential simplifications in wearable sensor-based mobility testing without sacrificing predictive performance.

Notes

  1. sensor_data: folder with sensor readings derived from a 32-foot walk, standing with eyes open, standing with eyes closed, two trials of TUG (Timed Up and Go), and two trials of cogTUG. It also contains the calculated kinesiological variables, demographics, and clinical evaluation data in separate files.
  2. code_notebook: notebook with the code used to generate the super learner models. Its sections correspond to the components of the proposed machine-learning pipeline. To view the notebook, open the file index.html in a web browser or open the file notebook.pdf.
  3. rdata: folder with intermediate R objects:
     sensor_features.RData: a list of the feature tables, one per task.
     sensor_features_all_tasks.RData: one table of features across all tasks for all subjects not missing cogTUG data. The means of repeated tasks and the demographic variables are also included.
     PD_control_seg.RData: a data frame with rows corresponding to PD participants and controls and columns corresponding to the features selected by the feature reduction technique.
     HY_control_early.RData: a data frame with rows corresponding to mild PD participants and controls and columns corresponding to the features selected by the feature reduction technique.
     HY_control_mild.RData: a data frame with rows corresponding to moderate PD participants and controls and columns corresponding to the features selected by the feature reduction technique.
     HY_control_severe.RData: a data frame with rows corresponding to severe PD participants and controls and columns corresponding to the features selected by the feature reduction technique.
     var_reduct_PD_control_splits.RData: the training and test splits for the five-repeat, five-fold cross-validation framework used to build a classifier distinguishing PD patients and controls.
     var_reduct_HY_early_HC_splits.RData: the training and test splits for the five-repeat, five-fold cross-validation framework used to build a classifier distinguishing mild PD participants and controls.
     var_reduct_HY_mild_HC_splits.RData: the training and test splits for the five-repeat, five-fold cross-validation framework used to build a classifier distinguishing moderate PD participants and controls.
     var_reduct_HY_severe_HC_splits.RData: the training and test splits for the five-repeat, five-fold cross-validation framework used to build a classifier distinguishing severe PD participants and controls.
  4. files: one folder per classifier. Each folder contains five sl_predictions.csv files with the predictions of the super learner models for the five repeats of the outer loop, train_test_files with the train and test split files, top_imp_scores.csv with the permutation-based importance scores, and top_shap_values.csv with the SHAP values of each feature. Each classifier folder also has five files (GLM_params.csv, GBM_params.csv, DRF_params, XGBoost_params.csv, DeepLearning_params.csv) with the hyperparameters of the base models used to build the super learners.
  5. models: for each classifier, the 25 super learner models built inside the nested loop framework (will be uploaded).
  6. README: file with detailed instructions on how to set up and run the code, as well as any dependencies or requirements.
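
The five-repeat, five-fold framework mentioned in the notes above can be illustrated with a minimal Python sketch. It is not the code from code_notebook: the function name repeated_stratified_kfold, the seed, and the choice to stratify folds by class are assumptions made for illustration, and the inner loop of the nested framework is not shown. Only the group sizes (262 PD, 50 controls) and the 5 x 5 layout come from the record itself.

```python
import random
from collections import defaultdict


def repeated_stratified_kfold(labels, n_splits=5, n_repeats=5, seed=0):
    """Yield (repeat, fold, train_idx, test_idx) for repeated k-fold CV.

    Hypothetical helper: stratification by class label is an assumption,
    not something the dataset notes specify.
    """
    rng = random.Random(seed)

    # Group sample indices by class so each fold keeps the class ratio.
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    for repeat in range(n_repeats):
        folds = [[] for _ in range(n_splits)]
        for label in sorted(by_class):
            members = by_class[label][:]
            rng.shuffle(members)  # new random assignment on every repeat
            for pos, idx in enumerate(members):
                folds[pos % n_splits].append(idx)  # deal out round-robin

        for fold, test_idx in enumerate(folds):
            test_set = set(test_idx)
            train_idx = [i for i in range(len(labels)) if i not in test_set]
            yield repeat, fold, train_idx, sorted(test_idx)


# Group sizes taken from the abstract: 262 PD participants, 50 controls.
labels = ["PD"] * 262 + ["control"] * 50
splits = list(repeated_stratified_kfold(labels))
print(len(splits))  # 5 repeats x 5 folds = 25 train/test splits
```

Under these assumptions, the 25 splits line up with the 25 super learner models listed per classifier, one model per outer train/test split.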

Rights

CC0 1.0 Universal
http://creativecommons.org/publicdomain/zero/1.0/