Computer Science Research Works

Permanent URI for this collection: http://hdl.handle.net/1903/1593

Now showing 1 - 10 of 22
  • Item
    Supplementary material for Applying Wearable Sensors and Machine Learning to the Diagnostic Challenge of Distinguishing Parkinson's Disease from Other Forms of Parkinsonism
    (2025) Khalil, Rana M.; Shulman, Lisa M.; Gruber-Baldini, Ann L.; Reich, Stephen G.; Savitt, Joseph M.; Hausdorff, Jeffrey M.; von Coelln, Rainer; Cummings, Michael P.
    Parkinson's Disease (PD) and other forms of parkinsonism share motor symptoms, including tremor, bradykinesia, and rigidity. This overlap in the clinical presentation creates a diagnostic challenge, underscoring the need for objective differentiation. However, applying machine learning (ML) to clinical datasets faces challenges such as imbalanced class distributions, small sample sizes for non-PD parkinsonism, and heterogeneity within the non-PD group. This study analyzed wearable sensor data from 260 PD participants and 18 individuals with etiologically diverse forms of non-PD parkinsonism during clinical mobility tasks, using a single sensor placed on the lower back. We evaluated the performance of ML models in distinguishing these two groups and identified the most informative mobility tasks for classification. Additionally, we examined clinical characteristics of misclassified participants and presented case studies of common challenges in clinical practice, including diagnostic uncertainty at the initial visit and changes in diagnosis over time. We also suggested potential steps to address the dataset challenges that limited the models' performance. We demonstrate that ML-based analysis is a promising approach for distinguishing idiopathic PD from non-PD parkinsonism, though its accuracy remains below that of expert clinicians. Using the Timed Up and Go test as a single mobility task outperformed the use of all tasks combined, achieving a balanced accuracy of 78.2%. We also identified differences in some clinical scores between participants correctly and falsely classified by our models. These findings demonstrate the feasibility of using ML and wearable sensors for differentiating PD from other parkinsonian disorders, addressing key challenges in diagnosis, and streamlining diagnostic workflows.
  • Item
    Supplementary material for machine learning and statistical analyses of sensor data reveal variability between repeated trials in Parkinson’s disease mobility assessments
    (2024) Khalil, Rana M.; Shulman, Lisa M.; Gruber-Baldini, Ann L.; Shakya, Sunita; Hausdorff, Jeffrey M.; von Coelln, Rainer; Cummings, Michael P.
    Mobility tasks like the Timed Up and Go test (TUG), cognitive TUG (cogTUG), and walking with turns provide insight into motor control, balance, and cognitive functions affected by Parkinson’s disease (PD). We assess the test-retest reliability of these tasks in 262 PD participants and 50 controls by evaluating machine learning models based on wearable sensor-derived measures and statistical metrics. This evaluation examines total duration, subtask duration, and other quantitative measures across two trials. We show that the diagnostic accuracy for distinguishing PD from controls decreases by a mean of 1.8% between the first and the second trial, suggesting that task repetition may not be necessary for accurate diagnosis. Although the total duration remains relatively consistent between trials (intraclass correlation coefficient (ICC) = 0.62 to 0.95), greater variability is seen in subtask duration and sensor-derived measures, reflected in machine learning performance and statistical differences. Our findings also show that this variability differs not only between controls and PD participants but also among groups with varying levels of PD severity, indicating the need to consider population characteristics. Relying solely on total task duration and conventional statistical metrics to gauge the reliability of mobility tasks may fail to reveal nuanced variations in movement.
  • Item
    NeuWS: Neural wavefront shaping for guidestar-free imaging through static and dynamic scattering media
    (AAAS, 2023-06-28) Feng, Brandon Y.; Guo, Haiyun; Xie, Mingyang; Boominathan, Vivek; Sharma, Manoj K.; Veeraraghavan, Ashok; Metzler, Christopher A.
    Diffraction-limited optical imaging through scattering media has the potential to transform many applications such as airborne and space-based imaging (through the atmosphere), bioimaging (through skin and human tissue), and fiber-based imaging (through fiber bundles). Existing wavefront shaping methods can image through scattering media and other obscurants by optically correcting wavefront aberrations using high-resolution spatial light modulators, but these methods generally require (i) guidestars, (ii) controlled illumination, (iii) point scanning, and/or (iv) static scenes and aberrations. We propose neural wavefront shaping (NeuWS), a scanning-free wavefront shaping technique that integrates maximum likelihood estimation, measurement modulation, and neural signal representations to reconstruct diffraction-limited images through strong static and dynamic scattering media without guidestars, sparse targets, controlled illumination, or specialized image sensors. We experimentally demonstrate guidestar-free, wide field-of-view, high-resolution, diffraction-limited imaging of extended, nonsparse, and static/dynamic scenes captured through static/dynamic aberrations.
  • Item
    Supplementary material for Machine learning analysis of wearable sensor data from mobility testing distinguishes Parkinson's disease from other forms of parkinsonism
    (2024-03-13) Khalil, Rana M.; Shulman, Lisa M.; Gruber-Baldini, Ann L.; Hausdorff, Jeffrey M.; von Coelln, Rainer; Cummings, Michael P.
    Parkinson's Disease (PD) and other forms of parkinsonism share characteristic motor symptoms, including tremor, bradykinesia, and rigidity. This overlap in the clinical presentation creates a diagnostic challenge, underscoring the need for objective differentiation tools. In this study, we analyzed wearable sensor data collected during mobility testing from 260 PD participants and 18 participants with etiologically diverse forms of parkinsonism. Our findings illustrate that machine learning-based analysis of data from a single wearable sensor can effectively distinguish idiopathic PD from non-PD parkinsonism with a balanced accuracy of 83.5%, comparable to expert diagnosis. Moreover, we found that diagnostic performance can be improved through severity-based partitioning of participants, achieving a balanced accuracy of 95.9%, 91.2% and 100% for mild, moderate and severe cases, respectively. Beyond its diagnostic implications, our results suggest the possibility of streamlining the testing protocol by using the Timed Up and Go test as a single mobility task. Furthermore, we present a detailed analysis of several case studies of challenging scenarios commonly encountered in clinical practice, including diagnostic uncertainty at the initial visit and changes in clinical diagnosis at a subsequent visit. Together, these findings demonstrate the potential of applying machine learning to sensor-based measures of mobility to distinguish between PD and other forms of parkinsonism.
  • Item
    Supplementary material for machine learning analysis of data from a simplified mobility testing procedure with a single sensor and single task accurately differentiates Parkinson's disease from controls
    (2023) Khalil, Rana M.; Shulman, Lisa M.; Gruber-Baldini, Ann L.; Shakya, Sunita; von Coelln, Rainer; Cummings, Michael P.; Fenderson, Rebecca; van Hoven, Maxwell; Hausdorff, Jeffrey M.
    Quantitative mobility analysis using wearable sensors, while promising as a diagnostic tool for Parkinson's disease (PD), is not commonly applied in clinical settings. Major obstacles include uncertainty regarding the best protocol for instrumented mobility testing and subsequent data processing, as well as the added workload and complexity of this multi-step process. To simplify sensor-based mobility testing in diagnosing PD, we analyzed data from 262 PD participants and 50 controls performing several motor tasks wearing a sensor on the lower back containing a triaxial accelerometer and a triaxial gyroscope. Using ensembles of heterogeneous machine learning models incorporating a range of classifiers trained on a large set of sensor features, we show that our models effectively differentiate between participants with PD and controls, both for mixed-stage PD (92.6% accuracy) and a group selected for mild PD only (89.4% accuracy). Omitting algorithmic segmentation of complex mobility tasks decreased the diagnostic accuracy of our models, as did the inclusion of kinesiological features. Feature importance analysis revealed Timed Up & Go (TUG) tasks to contribute the highest-yield predictive features, with only a minor decrease in accuracy for models based on cognitive TUG as a single mobility task. Our machine learning approach facilitates major simplification of instrumented mobility testing without compromising predictive performance.
  • Item
    A Hybrid Tensor-Expert-Data Parallelism Approach to Optimize Mixture-of-Experts Training
    (Association for Computing Machinery (ACM), 2023-06-21) Singh, Siddarth; Ruwase, Olatunji; Awan, Ammar Ahmad; Rajbhandari, Samyam; He, Yuxiong; Bhatele, Abhinav
    Mixture-of-Experts (MoE) is a neural network architecture that adds sparsely activated expert blocks to a base model, increasing the number of parameters without impacting computational costs. However, current distributed deep learning frameworks are limited in their ability to train high-quality MoE models with large base models. In this work, we present DeepSpeed-TED, a novel, three-dimensional, hybrid parallel algorithm that combines data, tensor, and expert parallelism to enable the training of MoE models with 4–8× larger base models than the current state-of-the-art. We also describe memory optimizations in the optimizer step, and communication optimizations that eliminate unnecessary data movement. We implement our approach in DeepSpeed and achieve speedups of 26% over a baseline (i.e., without our communication optimizations) when training a 40 billion parameter MoE model (6.7 billion base model with 16 experts) on 128 V100 GPUs.
  • Item
    “Is this my president speaking?” Tamper-proofing Speech in Live Recordings
    (Association for Computing Machinery (ACM), 2023-06-18) Shahid, Irtaza; Roy, Nirupam
    Malicious editing of audiovisual content has emerged as a popular tool for targeted defamation, spreading disinformation, and triggering political unrest. Public speeches and statements of political leaders, public figures, or celebrities are particularly targeted due to their effectiveness in influencing the masses. Ubiquitous audiovisual recording of live speeches with smart devices and unrestricted content sharing and redistributing on social media make it difficult to address this threat using existing authentication techniques. Given that public recordings of live events lack source control over the media, standard solutions falter. This paper presents TalkLock, a speech integrity verification system that can enable live speakers to protect their speeches from malicious alterations even when the speech is recorded by any member of the audience. The core idea is to generate meta-information from the speech signal in real-time and disseminate it through a secure QR code-based screen-camera communication. The QR code, when recorded along with the speech, embeds the meta-information in the content, and it can be used later for independent verification in stand-alone applications or online platforms. A user study with live speech and real-world experiments with different types of voices, languages, environments, and distances show that TalkLock can detect fake content with 94.4% accuracy.
  • Item
    Detock: High Performance Multi-region Transactions at Scale
    (Association for Computing Machinery (ACM), 2023-06) Nguyen, Cuong D.T.; Miller, Johann K.; Abadi, Daniel J.
    Many globally distributed data stores need to replicate data across large geographic distances. Since synchronously replicating data across such distances is slow, those systems with high consistency requirements often geo-partition data and direct all linearizable requests to the primary region of the accessed data. This significantly improves performance for workloads where most transactions access data close to where they originate from. However, supporting serializable multi-geo-partition transactions is a challenge, and they often degrade the performance of the whole system. This becomes even more challenging when they conflict with single-partition requests, where optimistic protocols lead to high numbers of aborts, and pessimistic protocols lead to high numbers of distributed deadlocks. In this paper, we describe the design of concurrency control and deadlock resolution protocols, built within a practical, complete implementation of a geographically replicated database system called Detock, that enables processing strictly-serializable multi-region transactions with near-zero performance degradation at extremely high conflict and an order of magnitude higher throughput relative to state-of-the-art geo-replication approaches, while improving latency by up to a factor of 5.
  • Item
    Automating NISQ Application Design with Meta Quantum Circuits with Constraints (MQCC)
    (Association for Computing Machinery (ACM), 2023-04) Deng, Haowei; Peng, Yuxiang; Hicks, Michael; Wu, Xiaodi
    Near-term intermediate scale quantum (NISQ) computers are likely to have very restricted hardware resources, where precisely controllable qubits are expensive, error-prone, and scarce. Programmers of such computers must therefore balance trade-offs among a large number of (potentially heterogeneous) factors specific to the targeted application and quantum hardware. To assist them, we propose Meta Quantum Circuits with Constraints (MQCC), a meta-programming framework for quantum programs. Programmers express their application as a succinct collection of normal quantum circuits stitched together by a set of (manually or automatically) added meta-level choice variables, whose values are constrained according to a programmable set of quantitative optimization criteria. MQCC’s compiler generates the appropriate constraints and solves them via an SMT solver, producing an optimized, runnable program. To demonstrate its generality, we showcase a few of MQCC’s applications, including the automatic generation of efficient error syndrome extraction schemes for fault-tolerant quantum error correction with heterogeneous qubits and an approach to writing approximate quantum Fourier transformation and quantum phase estimation that smoothly trades off accuracy and resource use. We also illustrate that MQCC can easily encode prior one-off NISQ application designs, namely multi-programming (MP) and crosstalk mitigation (CM), as well as a combination of their optimization goals (i.e., a combined MP-CM).
  • Item
    Absynthe: Abstract Interpretation-Guided Synthesis
    (Association for Computing Machinery (ACM), 2023-06) Guria, Sankha Narayan; Foster, Jeffrey S.; Van Horn, David
    Synthesis tools have seen significant success in recent times. However, past approaches often require a complete and accurate embedding of the source language in the logic of the underlying solver, an approach difficult for industrial-grade languages. Other approaches couple the semantics of the source language with purpose-built synthesizers, necessarily tying the synthesis engine to a particular language model. In this paper, we propose Absynthe, an alternative approach based on user-defined abstract semantics that aims to be both lightweight and language agnostic, yet effective in guiding the search for programs. A synthesis goal in Absynthe is specified as an abstract specification in a lightweight user-defined abstract domain and concrete test cases. The synthesis engine is parameterized by the abstract semantics and independent of the source language. Absynthe validates candidate programs against test cases using the actual concrete language implementation to ensure correctness. We formalize the synthesis rules for Absynthe and describe how the key ideas are scaled up in our implementation in Ruby. We evaluated Absynthe on the SyGuS strings benchmark and found it competitive with other enumerative search solvers. Moreover, Absynthe’s ability to combine abstract domains allows the user to move along a cost spectrum, i.e., expressive domains prune more programs but require more time. Finally, to verify that Absynthe can act as a general-purpose synthesis tool, we use Absynthe to synthesize Pandas data frame manipulating programs in Python using simple abstractions like types and column labels of a data frame. Absynthe reaches parity with AutoPandas, a deep learning based tool for the same benchmark suite. In summary, our results demonstrate that Absynthe is a promising step forward towards a general-purpose approach to synthesis that may broaden the applicability of synthesis to more full-featured languages.