Electrical & Computer Engineering

Permanent URI for this community: http://hdl.handle.net/1903/2234

Search Results

Now showing 1 - 5 of 5
  • Item
    Can Querying for Bias Leak Protected Attributes? Achieving Privacy With Smooth Sensitivity
    (Association for Computing Machinery (ACM), 2023-06-12) Hamman, Faisal; Chen, Jiahao; Dutta, Sanghamitra
    Existing regulations often prohibit model developers from accessing protected attributes (gender, race, etc.) during training. This leads to scenarios where fairness assessments might need to be done on populations without knowing their memberships in protected groups. In such scenarios, institutions often adopt a separation between the model developers (who train their models with no access to the protected attributes) and a compliance team (who may have access to the entire dataset solely for auditing purposes). However, the model developers might be allowed to test their models for disparity by querying the compliance team for group fairness metrics. In this paper, we first demonstrate that simply querying for fairness metrics, such as statistical parity and equalized odds, can leak the protected attributes of individuals to the model developers. We demonstrate that there always exist strategies by which the model developers can identify the protected attribute of a targeted individual in the test dataset from just a single query. Furthermore, we show that one can reconstruct the protected attributes of all the individuals from O(N_k log(n/N_k)) queries when N_k ≪ n, using techniques from compressed sensing (n is the size of the test dataset and N_k is the size of the smallest group therein). Our results raise an interesting debate in algorithmic fairness: Should querying for fairness metrics be viewed as a neutral-valued solution to ensure compliance with regulations? Or does it constitute a violation of regulations and privacy if the number of queries answered is enough for the model developers to identify the protected attributes of specific individuals? To address this potential violation of regulations and privacy, we also propose Attribute-Conceal, a novel technique that achieves differential privacy by calibrating noise to the smooth sensitivity of our bias query function, outperforming naive techniques such as the Laplace mechanism. We also include experimental results on the Adult dataset and synthetic datasets (over a broad range of parameters).
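    To make the privacy mechanism concrete, here is a minimal sketch of noise calibrated to a smooth upper bound on the local sensitivity of a statistical parity query. This is an illustrative reading of the approach, not the paper's Attribute-Conceal implementation: the sensitivity bound is a crude worst case, and the beta and noise-scale constants are placeholders for the exact calibration in the smooth sensitivity framework of Nissim et al.

        import numpy as np

        def statistical_parity_gap(y_pred, groups):
            """Absolute difference in positive-prediction rates between two groups."""
            return abs(y_pred[groups == 0].mean() - y_pred[groups == 1].mean())

        def local_sensitivity_at_distance(n0, n1, k):
            # Modifying up to k records can shrink either group, so bound the
            # change in the gap by the worst-case group sizes (crude upper bound).
            return min(1.0, 1.0 / max(n0 - k, 1) + 1.0 / max(n1 - k, 1))

        def smooth_sensitivity(n0, n1, beta):
            # beta-smooth upper bound: max over distances k of exp(-beta*k) * LS_k.
            return max(np.exp(-beta * k) * local_sensitivity_at_distance(n0, n1, k)
                       for k in range(n0 + n1 + 1))

        def private_parity_query(y_pred, groups, epsilon, rng=None):
            if rng is None:
                rng = np.random.default_rng()
            gap = statistical_parity_gap(y_pred, groups)
            n0, n1 = int((groups == 0).sum()), int((groups == 1).sum())
            s = smooth_sensitivity(n0, n1, beta=epsilon / 6.0)
            # Heavy-tailed (Cauchy) noise scaled to the smooth sensitivity; the
            # epsilon/6 and 6/epsilon constants are illustrative placeholders.
            return gap + (6.0 * s / epsilon) * rng.standard_cauchy()

    Because the smooth bound tracks the actual dataset (through the group sizes) rather than the global worst case, the added noise can be much smaller than that of a naive Laplace mechanism calibrated to global sensitivity, which is the comparison the abstract draws.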
  • Item
    Efficient Machine Learning Techniques for Neural Decoding Systems
    (2022) Wu, Xiaomin; Bhattacharyya, Shuvra S.; Chen, Rong; Electrical Engineering; Digital Repository at the University of Maryland; University of Maryland (College Park, Md.)
    In this thesis, we explore efficient machine learning techniques for calcium-imaging-based neural decoding in two directions: first, techniques for pruning neural network models to reduce computational complexity and memory cost while retaining high accuracy; second, new techniques for converting graph-based input into low-dimensional vector form, which can be processed more efficiently by conventional neural network models.

    Neural decoding is an important step in connecting brain activity to behavior, e.g., to predict movement based on acquired neural signals. Important application areas for neural decoding include brain-machine interfaces and neuromodulation. For application areas such as these, both real-time processing of neural signals and high-quality information extraction from the signals are important. Calcium imaging is a modality of increasing interest for studying brain activity. Miniature calcium imaging is a neuroimaging modality that can observe cells in behaving animals with high spatial and temporal resolution, and with the capability to provide chronic imaging. Compared to alternative modalities, calcium imaging has the potential to enable improved neural decoding accuracy. However, processing calcium images in real time is a challenging task, as it involves multiple time-consuming stages: neuron detection, motion correction, and signal extraction. Traditional neural decoding methods, such as those based on Wiener and Kalman filters, are fast; however, they are outperformed in terms of accuracy by recently developed deep neural network (DNN) models. While DNNs provide improved accuracy, they involve high computational complexity, which exacerbates the challenge of real-time processing. Addressing the challenges of high-accuracy, real-time, DNN-based neural decoding is the central objective of this research.

    As a first step in addressing these challenges, we have developed the NeuroGRS system. NeuroGRS is designed to explore design spaces for compact DNN models and optimize the computational complexity of the models subject to accuracy constraints. GRS, which stands for Greedy inter-layer order with Random Selection of intra-layer units, is an algorithm that we have developed for deriving compact DNN structures. We have demonstrated the effectiveness of GRS in transforming DNN models into more compact forms that significantly reduce processing and storage complexity while retaining high accuracy.

    While NeuroGRS provides useful new capabilities for deriving compact DNN models subject to accuracy constraints, the approach has a significant limitation in the context of neural decoding: its lack of scalability to large DNNs. Large DNNs arise naturally in neural decoding applications when the brain model under investigation involves a large number of neurons. As the size of the input DNN increases, NeuroGRS becomes prohibitively expensive in terms of computational time. To address this limitation, we have performed a detailed experimental analysis of how pruned solutions evolve as GRS operates, and we have used insights from this analysis to develop a new DNN pruning algorithm called Jump GRS (JGRS). JGRS maintains similar levels of model quality, in terms of predictive accuracy, as GRS, while operating much more efficiently and handling much larger DNNs within reasonable amounts of time and with reasonable computational resources. Jump GRS incorporates a mechanism that bypasses ("jumps over") validation and retraining during carefully selected iterations of the pruning process. We demonstrate the advantages and improved scalability of JGRS compared to GRS through extensive experiments in the context of DNNs for neural decoding.

    We have also developed methods for raising the level of abstraction in the signal representation used for calcium imaging analysis. As a central part of this work, we invented the WGEVIA (Weighted Graph Embedding with Vertex Identity Awareness) algorithm, which enables DNN-based processing of neuron activity that is represented in the form of microcircuits. In contrast to traditional representations of neural signals, which involve spiking signals, a microcircuit representation is a graphical representation. Each vertex in a microcircuit corresponds to a neuron, and each edge carries a weight that captures information about firing relationships between the neurons associated with the vertices incident to the edge. Our experiments demonstrate that WGEVIA is effective at extracting information from microcircuits. Moreover, raising the level of abstraction to microcircuit analysis has the potential to enable more powerful signal extraction under limited processing time and resources.
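    To illustrate the jump mechanism described above, here is a minimal sketch of a Jump-GRS-style pruning loop. The layers structure, accuracy_fn, and retrain_fn are hypothetical stand-ins for the thesis's actual interfaces; the point is the control flow, in which validation and retraining run only every few pruning steps and a checkpoint allows rolling back a jumped block that hurts accuracy.

        import copy
        import random

        def jump_grs(layers, accuracy_fn, retrain_fn, min_acc, jump_interval=3):
            """Greedy pruning with jumped (skipped) validation/retraining steps.

            layers      -- dict: layer name -> list of unit ids still kept
            accuracy_fn -- callable(layers) -> validation accuracy (hypothetical)
            retrain_fn  -- callable(layers) -> None, fine-tunes kept weights
            """
            checkpoint = copy.deepcopy(layers)  # last state known to meet min_acc
            step, progress = 0, True
            while progress:
                progress = False
                # Greedy inter-layer order: visit wider layers first.
                for name in sorted(layers, key=lambda n: -len(layers[n])):
                    if len(layers[name]) <= 1:
                        continue
                    # Random selection of an intra-layer unit to prune.
                    layers[name].remove(random.choice(layers[name]))
                    step += 1
                    if step % jump_interval != 0:
                        continue  # "jump over" validation and retraining
                    retrain_fn(layers)
                    if accuracy_fn(layers) >= min_acc:
                        checkpoint = copy.deepcopy(layers)  # accept jumped block
                        progress = True
                    else:
                        layers = copy.deepcopy(checkpoint)  # roll back the block
            return checkpoint

    Skipping most validation and retraining calls is where the speedup over plain GRS would come from; rolling back a whole jumped block trades a little pruning granularity for that efficiency.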
  • Item
    Radio Analytics for Human Computer Interaction
    (2021) Regani, Sai Deepika; Liu, K.J. Ray; Electrical Engineering; Digital Repository at the University of Maryland; University of Maryland (College Park, Md.)
    WiFi, as we know it, is no longer a mere means of communication. Recent advances in research and industry have unleashed the sensing potential of wireless signals. With the constantly expanding availability of the radio frequency spectrum for WiFi, we now envision a future where wireless communication and sensing systems co-exist and continue to facilitate human lives. Radio signals are currently being used to "sense" or monitor various human activities and vital signs. As Human-Computer Interaction (HCI) continues to form a considerable part of daily activities, it is interesting to investigate the potential of wireless sensing in designing practical HCI applications. This dissertation aims to study and design three different HCI applications by leveraging radio signals, namely, (i) in-car driver authentication, (ii) device-free gesture recognition through the wall, and (iii) handwriting tracking.

    In the first part of this dissertation, we introduce the idea of in-car driver authentication using wireless sensing and develop a system that can recognize drivers automatically. The proposed system recognizes humans by identifying the unique radio biometric information embedded in the wireless channel state information (CSI) through multipath propagation. However, since environmental information is also captured in the CSI, radio biometric recognition performance may be degraded by the changing physical environment. To this end, we address the problem of "in-car changing environments," where existing wireless sensing-based human identification systems fail. We build a long-term driver radio biometric database consisting of the radio biometrics of multiple people collected over two months. Machine learning (ML) models built using this database make the proposed system adaptive to new in-car environments. The performance of the in-car driver authentication system is shown to improve when multi-antenna and frequency diversities are exploited. Long-term experiments demonstrate the feasibility and accuracy of the proposed system: the accuracy achieved in the two-driver scenario is up to 99.13% in the best case, compared to the 87.7% achieved by previous work.

    In the second part, we propose GWrite, a device-free gesture recognition system that can work in a through-the-wall scenario. The sequence of physical perturbations induced by hand movement influences the multipath propagation and is reflected in the CSI time series corresponding to the gesture. Leveraging the statistical properties of EM wave propagation, we derive a relationship between the similarity of CSIs within the time series and the relative distance moved by the hand. Feature extraction modules are built on this relation to extract features characteristic of the gesture shapes. We built a prototype of GWrite on commercial WiFi devices and achieved a classification accuracy of 90.1% on a set of 15 gesture shapes drawn from the uppercase English alphabet. We demonstrate that a broader set of gestures can be defined and classified using GWrite, as opposed to existing systems that operate over limited gesture sets.

    In the final part of this dissertation, we present mmWrite, the first high-precision passive handwriting tracking system using a single commodity millimeter wave (mmWave) radio. Leveraging the short wavelength and large bandwidth of 60 GHz signals and the radar-like capabilities enabled by the large phased array, mmWrite transforms any flat region into an interactive writing surface that supports handwriting tracking at millimeter accuracy. mmWrite employs an end-to-end signal processing pipeline to enhance the range and spatial resolution limited by the hardware, boost the coverage, and suppress interference from backgrounds and irrelevant objects. Experiments using a commodity 60 GHz device show that mmWrite can track a finger or pen with a median error of 2.8 mm close to the device and can thus reproduce handwritten characters as small as 1 cm × 1 cm, with coverage of up to 8 m^2. With minimal infrastructure needed, mmWrite promises ubiquitous handwriting tracking for new applications in HCI.
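    As a toy illustration of the similarity-to-distance relationship mentioned for GWrite, the sketch below inverts a rich-scattering statistical model in which the correlation of two CSI snapshots decays with displacement d roughly as a zeroth-order Bessel function J0(2*pi*d/lambda). Both the model and the carrier wavelength here are assumptions for illustration; the dissertation's actual derivation and feature extraction modules are more involved.

        import numpy as np
        from scipy.optimize import brentq
        from scipy.special import j0

        WAVELENGTH = 0.06  # meters; roughly a 5 GHz WiFi carrier (assumed)

        def csi_similarity(h1, h2):
            """Normalized correlation between two complex CSI snapshots."""
            return np.abs(np.vdot(h1, h2)) / (np.linalg.norm(h1) * np.linalg.norm(h2))

        def distance_from_similarity(rho, wavelength=WAVELENGTH):
            """Invert rho ~= J0(2*pi*d/lambda) on its first monotonic branch."""
            rho = float(np.clip(rho, 1e-6, 1.0 - 1e-6))
            # J0 first crosses zero near x = 2.4048; stay on the monotone branch.
            x = brentq(lambda t: j0(t) - rho, 0.0, 2.4048)
            return x * wavelength / (2.0 * np.pi)

        # e.g., a correlation of 0.8 maps to a displacement of roughly 9 mm
        # under these assumed parameters.
        print(distance_from_similarity(0.8))

    Accumulating such per-step displacement estimates along the CSI time series is one plausible way to characterize the shape of a gesture.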
  • Item
    AUTOMATIC FEATURE ENGINEERING FOR DISCOVERING AND EXPLAINING MALICIOUS BEHAVIORS
    (2019) Zhu, Ziyun; Dumitras, Tudor; Electrical Engineering; Digital Repository at the University of Maryland; University of Maryland (College Park, Md.)
    A key task of cybersecurity is to discover and explain the malicious behaviors of malware. Understanding malicious behaviors helps us develop good features and apply machine learning techniques to detect various attacks. The effectiveness of machine learning techniques primarily depends on the manual feature engineering process, which is based on human knowledge and intuition. However, given the adversaries' efforts to evade detection and the growing volume of publications on malicious behaviors, the feature engineering process likely draws from only a fraction of the relevant knowledge. Therefore, it is necessary and important to design an automated system to engineer features for discovering malicious behaviors and detecting attacks.

    First, we describe a knowledge-based feature engineering technique for malware detection. It mines documents written in natural language (e.g., scientific literature), and represents and queries the knowledge about malware in a way that mirrors the human feature engineering process. We implement the idea in a system called FeatureSmith, which generates a feature set for detecting Android malware. We train a classifier using these features on a large dataset of benign and malicious apps. This classifier achieves performance comparable to a state-of-the-art Android malware detector that relies on manually engineered features. In addition, FeatureSmith is able to suggest informative features that are absent from the manually engineered set and to link the generated features to abstract concepts that describe malware behaviors.

    Second, we propose a data-driven feature engineering technique called ReasonSmith, which explains machine learning models by ranking features based on their global importance. Instead of interpreting how neural networks make decisions for one specific sample, ReasonSmith captures the general importance of features in terms of the whole dataset. In addition, ReasonSmith allows us to efficiently identify data biases and artifacts by comparing feature rankings over time. We further summarize common data biases and artifacts for malware detection problems at the level of API calls.

    Third, we study malware detection from a global view and explore the automatic feature engineering problem in analyzing campaigns that include a series of actions. We implement a system, ChainSmith, to bridge large-scale field measurements and manual campaign reports by extracting and categorizing IOCs (indicators of compromise) from security blogs. The semantic roles of IOCs allow us to link qualitative data (e.g., security blogs) to quantitative measurements, which brings new insights into malware campaigns. In particular, we study the effectiveness of different persuasion techniques used to entice users to download payloads. We find that campaigns usually start with social engineering, and that the "missing codec" ruse is a common persuasion technique that generates the most suspicious downloads each day.
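    A minimal sketch of the global ranking idea behind ReasonSmith: average the magnitude of per-sample attributions over the whole dataset and sort features by that mean. The attribution interface and the feature names below are hypothetical stand-ins; in real use, the attribution would come from a neural-network explainer over API-call features.

        import numpy as np

        def global_feature_importance(attribution_fn, X, feature_names):
            """Rank features by mean absolute attribution across a dataset.

            attribution_fn -- callable(x) -> per-feature attribution vector for
                              one sample (hypothetical explainer interface)
            """
            scores = np.mean([np.abs(attribution_fn(x)) for x in X], axis=0)
            order = np.argsort(-scores)
            return [(feature_names[i], float(scores[i])) for i in order]

        # Toy usage with a linear model, where gradient-times-input reduces to
        # weight * feature; the API-call feature names are made up.
        rng = np.random.default_rng(0)
        w = np.array([2.0, -0.5, 0.0, 1.2])
        X = rng.normal(size=(100, 4))
        names = ["sendSMS", "readContacts", "vibrate", "loadDexClass"]
        print(global_feature_importance(lambda x: w * x, X, names)[:2])

    Re-running such a ranking on dataset snapshots collected at different times, and diffing the results, is one way to surface the data biases and artifacts mentioned above.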
  • Item
    FEATURE LEARNING AND ACTIVE LEARNING FOR IMAGE QUALITY ASSESSMENT
    (2014) Ye, Peng; Chellappa, Rama; Doermann, David; Electrical Engineering; Digital Repository at the University of Maryland; University of Maryland (College Park, Md.)
    With the increasing popularity of mobile imaging devices, digital images have become an important vehicle for representing and communicating information. Unfortunately, digital images may be degraded at various stages of their life cycle. These degradations may lead to the loss of visual information, resulting in an unsatisfactory experience for human viewers and difficulties for image processing and analysis at subsequent stages. The problem of visual information quality assessment plays an important role in numerous image/video processing and computer vision applications, including image compression, image transmission, and image retrieval. There are two divisions of Image Quality Assessment (IQA) research: Objective IQA and Subjective IQA. For objective IQA, the goal is to develop a computational model that can accurately and automatically predict the quality of a distorted image with respect to human perception or other measures of interest. For subjective IQA, the goal is to design experiments for acquiring human subjects' opinions on image quality. Subjective IQA is often used to construct image quality datasets and provide the ground truth for building and evaluating objective quality measures. In this thesis, we address both aspects of the IQA problem.

    For objective IQA, our work focuses on the most challenging category of objective IQA tasks: general-purpose No-Reference IQA (NR-IQA), where the goal is to evaluate the quality of digital images without access to reference images and without prior knowledge of the types of distortions. First, we introduce a feature learning framework for NR-IQA. Our method learns discriminative visual features in the spatial domain instead of using hand-crafted features. It can therefore significantly reduce feature computation time compared to previous state-of-the-art approaches while achieving state-of-the-art prediction accuracy. Second, we present an effective method for extending existing NR-IQA models to "opinion-free" (OF) models, which do not require human opinion scores for training. In particular, we accomplish this by using Full-Reference (FR) IQA measures to train NR-IQA models. Unsupervised rank aggregation is applied to combine different FR measures to generate a synthetic score, which serves as a better "gold standard." Our method significantly outperforms previous OF NR-IQA methods and is comparable to state-of-the-art NR-IQA methods trained on human opinion scores.

    Unlike objective IQA, subjective IQA tests ask humans to evaluate image quality and are generally considered the most reliable way to evaluate the visual quality of digital images as perceived by the end user. We present a hybrid subjective test that combines Absolute Categorical Rating (ACR) tests and Paired Comparison (PC) tests via a unified probabilistic model and an active sampling method. Our method actively constructs a set of queries consisting of ACR and PC tests based on the expected information gain provided by each test, and can effectively reduce the number of tests required to achieve a target accuracy. Our method can be used in conventional laboratory studies as well as in crowdsourcing experiments. Experimental results show that our method outperforms state-of-the-art subjective IQA tests in a crowdsourced setting.
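    As a simple stand-in for the unsupervised rank aggregation step, the sketch below combines several FR-IQA scores into a synthetic score via Borda-style mean ranks. The thesis applies a more principled unsupervised aggregation, so both the method and the example measures here are illustrative assumptions.

        import numpy as np
        from scipy.stats import rankdata

        def aggregate_fr_scores(score_matrix, higher_is_better):
            """Combine FR-IQA measures into one synthetic quality score.

            score_matrix     -- shape (n_measures, n_images): raw scores from
                                each FR measure, one row per measure
            higher_is_better -- bool per measure, so all ranks point the same way
            """
            ranks = np.zeros(score_matrix.shape, dtype=float)
            for m in range(score_matrix.shape[0]):
                s = score_matrix[m] if higher_is_better[m] else -score_matrix[m]
                ranks[m] = rankdata(s)  # 1 = lowest quality under this measure
            return ranks.mean(axis=0)   # mean rank as the synthetic "gold standard"

        # Toy usage: three measures scoring four distorted images.
        scores = np.array([[30.1, 28.5, 33.0, 25.2],   # PSNR-like, higher better
                           [0.91, 0.88, 0.95, 0.80],   # SSIM-like, higher better
                           [0.20, 0.25, 0.12, 0.40]])  # distortion, lower better
        print(aggregate_fr_scores(scores, [True, True, False]))

    The synthetic score can then supervise an NR-IQA model in place of human opinion scores, which is the opinion-free training idea described above.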