A. James Clark School of Engineering
Permanent URI for this community: http://hdl.handle.net/1903/1654
The collections in this community comprise faculty research works, as well as graduate theses and dissertations.
133 results
Search Results
Item Automated Management of Network Slices with Service Guarantees (2024)
Nikolaidis, Panagiotis; Baras, John; Electrical Engineering; Digital Repository at the University of Maryland; University of Maryland (College Park, Md.)
Future mobile networks are expected to support a diverse set of applications including high-throughput video streaming, delay-sensitive augmented reality applications, and critical control traffic for autonomous driving. Unfortunately, existing networks do not have the required management mechanisms to handle this complex mix of traffic efficiently. At the same time, however, there is a significant effort from both industry and academia to make networks more open and programmable, leading to the emergence of software-defined networking, network function virtualization, and packet-forwarding programming languages. Moreover, several organisations such as the Open Networking Foundation were founded to facilitate innovation and lower the entry barriers in the mobile networking industry. In this setting, the concept of network slicing emerged, which involves partitioning the mobile network into virtual networks that are tailored for specific applications. Each network slice needs to provide premium service to its users as specified in a service level agreement between the mobile network operator and the customer. The deployment of network slices has been largely realized thanks to network function virtualization. However, little progress has been made on mechanisms to efficiently share the network resources among them. In this dissertation, we develop such mechanisms for the licensed spectrum at the base station, a scarce resource that operators obtain through competitive auctions. We propose a system architecture composed of two new network functions: the bandwidth demand estimator and the network slice multiplexer.
The bandwidth demand estimator monitors the traffic of the network slice and outputs the amount of bandwidth currently needed to deliver the desired quality of service. The network slice multiplexer decides which bandwidth demands to accept when the available bandwidth does not suffice for all the network slices. A key feature of this architecture is the separation of the demand estimation task from the contention resolution task. This separation makes the architecture scalable for a large number of network slices. It also allows the mobile network operator to charge each customer fairly based on their bandwidth demands. In contrast, the most common approach in the literature is to learn online how to split the available resources among the slices to maximize a total network utility. However, this approach is neither scalable nor suitable for service level agreements. The dissertation contributes several algorithms to realize the proposed architecture and provisioning methods to guarantee the fulfillment of the service level agreements. To satisfy packet delay requirements, we develop a bandwidth demand estimator based on queueing theory and online learning. To share resources efficiently even in the presence of traffic anomalies, we develop a network slice multiplexer based on the Max-Weight algorithm and hypothesis testing. We implement and test the proposed algorithms on network simulators and 5G testbeds to showcase their efficiency in realistic settings.
Overall, we present a scalable architecture that is robust to traffic anomalies and reduces the bandwidth needed to serve multiple network slices.

Item Representation Learning for Reinforcement Learning: Modeling Non-Gaussian Transition Probabilities with a Wasserstein Critic (2024)
Tse, Ryan; Zhang, Kaiqing; Electrical Engineering; Digital Repository at the University of Maryland; University of Maryland (College Park, Md.)
Reinforcement learning algorithms depend on effective state representations when solving complex, high-dimensional environments. Recent methods learn state representations using auxiliary objectives that aim to capture relationships between states that are behaviorally similar, meaning states that lead to similar future outcomes under optimal policies. These methods learn explicit probabilistic state transition models and compute distributional distances between state transition probabilities as part of their measure of behavioral similarity. This thesis presents a novel extension to several of these methods that directly learns the 1-Wasserstein distance between state transition distributions by exploiting the Kantorovich-Rubinstein duality. This method eliminates parametric assumptions about the state transition probabilities while providing a smoother estimator of distributional distances. Empirical evaluation demonstrates improved sample efficiency over some of the original methods and a modest increase in computational cost per sample.
The results establish that relaxing theoretical assumptions about state transition modeling leads to more flexible and robust representation learning while maintaining strong performance characteristics.

Item TOWARDS EFFICIENT OCEANIC ROBOT LEARNING WITH SIMULATION (2024)
LIN, Xiaomin; Aloimonos, Yiannis; Electrical Engineering; Digital Repository at the University of Maryland; University of Maryland (College Park, Md.)
In this dissertation, I explore the intersection of machine learning, perception, and simulation-based techniques to enhance the efficiency of underwater robotics, with a focus on oceanic tasks. My research begins with marine object detection using aerial imagery. From there, I address oyster detection using Oysternet, which leverages simulated data and Generative Adversarial Networks for sim-to-real transfer, significantly improving detection accuracy. Next, I present an oyster detection system that integrates diffusion-enhanced synthetic data with the Aqua2 biomimetic hexapedal robot, enabling real-time, on-edge detection in underwater environments. With detection models deployed locally, this system facilitates autonomous exploration. To enhance this capability, I introduce an underwater navigation framework that employs imitation learning, enabling the robot to efficiently navigate over objects of interest, such as rock and oyster reefs, without relying on localization. This approach improves information gathering while ensuring obstacle avoidance. Given that oyster habitats are often in shallow waters, I incorporate a deep learning model for real/virtual image segmentation, allowing the robot to differentiate between actual objects and water surface reflections, ensuring safe navigation. I expand on broader applications of these techniques, including olive detection for yield estimation and industrial object counting for warehouse management, using simulated imagery.
In the final chapters, I address unresolved challenges, such as RGB/sonar data integration, and propose directions for future research to further enhance underwater robotic learning through digital simulation. Through these studies, I demonstrate how machine learning models and digital simulations can be used synergistically to address key challenges in underwater robotic tasks. Ultimately, this work advances the capabilities of autonomous systems to monitor and preserve marine ecosystems through efficient and robust digital simulation-based learning.

Item FROM PARTS TO WHOLE IN ACTION AND OBJECT UNDERSTANDING (2024)
Devaraj, Chinmaya; Aloimonos, Yiannis; Electrical Engineering; Digital Repository at the University of Maryland; University of Maryland (College Park, Md.)
The traditional paradigm of supervised learning in action or object recognition often relies on a top-down approach, ignoring explicit modeling of what activities or objects consist of. Recent approaches in generative AI research have shown us the ability to generate images and videos using text, indirectly indicating that we have control over the constituents of images and videos. In this dissertation, we explore ways to use the constituents of actions to develop methods to improve understanding of action. We devise different approaches to utilize the parts of actions, namely object motion, object state changes, and motion descriptions obtained by LLMs, in various tasks such as next active object segmentation, zero-shot action recognition, and video-text retrieval. We show promising benefits in action anticipation, zero-shot action recognition, and text-video retrieval tasks, demonstrating the practical applications of our methods. In the first part of the dissertation, we explore the idea of using the constituents of actions in GCNs for zero-shot human-object action recognition. The main idea is that semantically similar actions (of similar constituents) are closer in feature space.
Thus, in our graph, we encode the edges connecting those actions with higher similarity. We introduce a method to visually ground the external knowledge graph using the concept of shared similarity between similar actions. We evaluate the method on the EPIC Kitchens dataset and the Charades dataset, showing impressive results over baseline methods. We further show that visually grounding the knowledge graph enhances the performance of GCNs when an adversarial attack corrupts the input graph. In the second part of the thesis, we extend our ideas on human-object interactions in first-person videos. Human actions involving hand manipulations are structured according to the making and breaking of hand-object contact, and human visual understanding of action relies on anticipation of contact, as demonstrated by pioneering work in cognitive science. Taking inspiration from this, we introduce representations and models centered on contact, which we then use in action prediction and anticipation. We train the Anticipation Module, a module producing Contact Anticipation Maps and Next Active Object Segmentations - novel low-level representations providing temporal and spatial characteristics of anticipated near future action. On top of the Anticipation Module, we apply Egocentric Object Manipulation Graphs (Ego-OMG), a framework for action anticipation and prediction. Using the Anticipation Module to aid Ego-OMG produces state-of-the-art results, achieving first and second places on the unseen and seen test sets of the EPIC Kitchens Action Anticipation Challenge and achieving state-of-the-art results on action anticipation and action prediction over EPIC Kitchens. In the same line of thinking of constituents of action, we next focus on investigating how motion understanding can be modeled in current video-text models. We introduce motion descriptions generated by GPT4 on three action datasets that capture fine-grained motion descriptions of activities.
We evaluated several video-text models on the task of retrieval of motion descriptions and found that they lag behind human expert performance. We introduce a method of improving motion understanding in video-text models by utilizing motion descriptions. This method is demonstrated on two action datasets for the motion description retrieval task. The results draw attention to the need for quality captions involving fine-grained motion information in existing datasets and demonstrate the effectiveness of the proposed pipeline in understanding fine-grained motion during video-text retrieval.

Item Efficient learning-based sound propagation for virtual and real-world audio processing applications (2024)
Ratnarajah, Anton Jeran; Manocha, Dinesh; Electrical Engineering; Digital Repository at the University of Maryland; University of Maryland (College Park, Md.)
Sound propagation is the process by which sound energy travels through a medium, such as air, to the surrounding environment as sound waves. The room impulse response (RIR) describes this process and is influenced by the positions of the source and listener, the room's geometry, and its materials. Physics-based acoustic simulators have been used for decades to compute accurate RIRs for specific acoustic environments. However, we have encountered limitations with existing acoustic simulators. For example, they require a 3D representation and detailed material knowledge of the environment. To address these limitations, we propose three novel solutions. First, we introduce a learning-based RIR generator that is two orders of magnitude faster than an interactive ray-tracing simulator. Our approach can be trained to input both statistical and traditional parameters directly, and it can generate both monaural and binaural RIRs for both reconstructed and synthetic 3D scenes.
Our generated RIRs outperform interactive ray-tracing simulators in speech-processing applications, including Automatic Speech Recognition (ASR), Speech Enhancement, and Speech Separation, by 2.5%, 12%, and 48%, respectively. Secondly, we propose estimating RIRs from reverberant speech signals and visual cues in the absence of a 3D representation of the environment. By estimating RIRs from reverberant speech, we can augment training data to match test data, improving the word error rate of the ASR system. Our estimated RIRs achieve a 6.9% improvement over previous learning-based RIR estimators in real-world far-field ASR tasks. We demonstrate that our audio-visual RIR estimator aids tasks like visual acoustic matching, novel-view acoustic synthesis, and voice dubbing, validated through perceptual evaluation. Finally, we introduce IR-GAN to augment accurate RIRs using real RIRs. IR-GAN parametrically controls acoustic parameters learned from real RIRs to generate new RIRs that imitate different acoustic environments, outperforming ray-tracing simulators on the Kaldi far-field ASR benchmark by 8.95%.

Item ML-ENABLED SOLAR PV ELECTRICITY GENERATION PROJECTION FOR A LARGE ACADEMIC CAMPUS TO REDUCE ONSITE CO2 EMISSIONS (2024)
Zargarzadeh, Sahar; Babadi, Behtash; Ohadi, Michael; Electrical Engineering; Digital Repository at the University of Maryland; University of Maryland (College Park, Md.)
Mitigating CO2 emissions is crucial in reducing climate change, as these emissions contribute to global warming and its adverse impacts on ecosystems. According to statistics, photovoltaic electricity is 15 times less carbon-intensive than natural gas and 30 times less than coal, making Solar Photovoltaic an attractive option among various methods of reducing electricity demand. This study aims to apply Machine Learning to predict the future impact of solar PV-generated electricity on reducing CO2 emissions.
The primary utility data source is the University of Maryland's campus. With over half of the campus's energy consumption derived from electricity, reducing electricity consumption to mitigate carbon emissions is paramount. A total of 153 buildings on the campus were investigated, spanning the years 2015-2022. This study was conducted in four key phases. In the first phase, an open source tool, PVWatts, was used to gather data to predict PV-generated energy. This served as the foundation for phase II, where a novel tree-based ensemble learning model was developed to predict monthly PV-generated electricity over any period of time, leveraging machine learning to capture complex patterns in energy data for more accurate forecasts. The SHAP (SHapley Additive exPlanations) technique was incorporated into the proposed framework to enhance model explainability. Phase III involved calculating historical CO2 emissions based on past energy consumption data, providing a baseline for comparison. A meta-learning algorithm was implemented in phase IV to project future CO2 emissions post-solar PV installation. This comparison facilitated the evaluation of different machine learning techniques for projecting emissions and assessing the university’s progress toward Maryland’s sustainability objectives.
The ML-based tool developed in this study demonstrated that solar PV implementation could potentially reduce the campus’s footprint by approximately 18% for the studied clusters of buildings, with an uncertainty level of about 1.7%, contributing to sustainability objectives and the promotion of cleaner energy use.

Item Understanding and Improving Reliability of Predictive and Generative Deep Learning Models (2024)
Kattakinda, Priyatham; Feizi, Soheil; Electrical Engineering; Digital Repository at the University of Maryland; University of Maryland (College Park, Md.)
Deep learning models are prone to acquiring spurious correlations and biases during training and adversarial attacks during inference. In the context of predictive models, this results in inaccurate predictions relying on spurious features. Our research delves into this phenomenon specifically concerning objects placed in uncommon settings, where they are not conventionally found in the real world (e.g., a plane on water or a television in a cave). We introduce the "FOCUS: Familiar Objects in Common and Uncommon Settings" dataset which aims to stress-test the generalization capabilities of deep image classifiers. By leveraging the power of modern search engines, we deliberately gather data containing objects in common and uncommon settings in a wide range of locations, weather conditions, and times of day. Our comprehensive analysis of popular image classifiers on the FOCUS dataset reveals a noticeable decline in performance when classifying images in atypical scenarios. FOCUS consists only of natural images, which are extremely challenging to collect because, by definition, it is rare to find objects in unusual settings. To address this challenge, we introduce an alternative dataset named Diffusion Dreamed Distribution Shifts (D3S).
D3S comprises synthetic images generated through Stable Diffusion, utilizing text prompts and image guides derived from placing a sample foreground image onto a background template image. This scalable approach allows us to create 120,000 images featuring objects from all 1000 ImageNet classes set against 10 diverse backgrounds. Due to the incredible photorealism of the diffusion model, our images are much closer to natural images than previous synthetic datasets. To alleviate this problem, we propose two methods of learning richer and more robust image representations. In the first approach, we harness the foreground and background labels within D3S to learn a foreground (background) representation resistant to changes in background (foreground). This is achieved by penalizing the mutual information between the foreground (background) features and the background (foreground) labels. We demonstrate the efficacy of these representations by training classifiers on a task with strong spurious correlations. Thus far, our focus has centered on predictive models, scrutinizing the robustness of the learned object representations, particularly when the contextual surroundings are unconventional. In the second approach, we propose to use embeddings of objects and their relationships, extracted using off-the-shelf image segmentation models and text encoders respectively, as input tokens to a transformer. This leads to remarkably richer features that improve performance on downstream tasks such as image retrieval. Large language models are also prone to failures during inference. Given the widespread use of LLMs, understanding the propensity of these models to fail given adversarial inputs is crucial. To that end, we propose a series of fast adversarial attacks called BEAST that uses beam search to add adversarial tokens to a given input prompt. These attacks induce hallucination, jailbreak the models, and facilitate unintended membership inference from model outputs.
Our attacks are fast and are executable in relatively compute-constrained environments.

Item Studies in Differential Privacy and Federated Learning (2024)
Zawacki, Christopher Cameron; Abed, Eyad H; Electrical Engineering; Digital Repository at the University of Maryland; University of Maryland (College Park, Md.)
In the late 20th century, Machine Learning underwent a paradigm shift from model-driven to data-driven design. Rather than field-specific models, advances in sensors, data storage, and computing power enabled the collection of increasing amounts of data. The abundance of new data allowed researchers to fit flexible models directly to observed data. The influx of information made possible numerous advances, including the development of novel medicines, increases in efficiency of markets, and the proliferation of vast sensor networks. However, not all data should be freely accessible. Sensitive medical records, personal finances, and private IDs are all currently stored on digital devices across the world with the expectation that they remain private. However, at the same time, such data is frequently instrumental in the development of predictive models. Since the beginning of the 21st century, researchers have recognized that traditional methods of anonymizing data are inadequate for protecting client identities. This dissertation's primary focus is the advancement of two fields of data privacy: Differential Privacy and Federated Learning. Differential Privacy is one of the most successful modern privacy methods. By injecting carefully structured noise into a dataset, Differential Privacy obscures individual contributions while allowing researchers to extract meaningful information from the aggregate. Within this methodology, the Gaussian mechanism is one of the most common privacy mechanisms due to its favorable properties such as the ability of each client to apply noise locally before transmission to a server.
However, the use of this mechanism yields only an approximate form of Differential Privacy. This dissertation introduces the first in-depth analysis of the Symmetric alpha-Stable (SaS) privacy mechanism, demonstrating its ability to achieve pure-Differential Privacy while retaining local applicability. Based on these findings, the dissertation advocates for using the SaS privacy mechanism in protecting the privacy of client data. Federated Learning is a sub-field of Machine Learning, which trains Machine Learning models across a collection (federation) of client devices. This approach aims to protect client privacy by limiting the type of information that clients transmit to the server. However, this distributed environment poses challenges such as non-uniform data distributions and inconsistent client update rates, which reduce the accuracy of trained models. To overcome these challenges, we introduce Federated Inference, a novel algorithm that we show is consistent in federated environments. That is, even when the data is unevenly distributed and the clients' responses to the server are staggered in time (asynchronous), the algorithm is able to converge to the global optimum. We also present a novel result in system identification in which we extend a method known as Dynamic Mode Decomposition to accommodate input delayed systems. This advancement enhances the accuracy of identifying and controlling systems relevant to privacy-sensitive applications such as smart grids and autonomous vehicles. Privacy is increasingly pertinent, especially as investments in computer infrastructure constantly grow in order to cater to larger client bases. Privacy failures impact an ever-growing number of individuals.
This dissertation reports on our efforts to advance the toolkit of data privacy tools through novel methods and analysis while navigating the challenges of the field.

Item SYMMETRIC-KEY CRYPTOGRAPHY AND QUERY COMPLEXITY IN THE QUANTUM WORLD (2024)
Bai, Chen; Katz, Jonathan; Alagic, Gorjan; Electrical Engineering; Digital Repository at the University of Maryland; University of Maryland (College Park, Md.)
Quantum computers are likely to have a significant impact on cryptography. Many commonly used cryptosystems will be completely broken once large quantum computers are available. Since quantum computers can solve the factoring problem in polynomial time, the security of RSA would not hold against quantum computers. For symmetric-key cryptosystems, the primary quantum attack is key recovery via Grover search, which provides a quadratic speedup. One way to address this is to double the key length. However, recent results have shown that doubling the key length may not be sufficient in all cases. Therefore, it is crucial to understand the security of various symmetric-key constructions against quantum attackers. In this thesis, we give the first proof of post-quantum security for certain symmetric primitives. We begin with a fundamental block cipher, the Even-Mansour cipher, and the tweakable Even-Mansour construction. Our research shows that both are secure in a realistic quantum attack model. For example, we prove that 2^{n/3} quantum queries are necessary to break the Even-Mansour cipher. We also consider the practical applications that our work implies. Using our framework, we derive post-quantum security proofs for three concrete symmetric-key schemes: Elephant (an Authenticated Encryption (AE) finalist of NIST’s lightweight cryptography standardization effort), Chaskey (an ISO-standardized Message Authentication Code), and Minalpher (an AE second-round candidate of the CAESAR competition).
In addition, we consider the two-sided permutation inversion problem in the quantum query model. In this problem, given an image y and quantum oracle access to a permutation P (and its inverse oracle), the goal is to find its pre-image x such that P(x)=y. We prove an optimal lower bound \Omega(\sqrt{2^n}) for this problem against an adaptive quantum adversary. Moreover, we apply our lower bound above to show that a natural encryption scheme constructed from random permutations is secure against quantum attacks.

Item Advances in Concrete Cryptanalysis of Lattice Problems and Interactive Signature Schemes (2024)
Kippen, Hunter Michael; Dachman-Soled, Dana; Electrical Engineering; Digital Repository at the University of Maryland; University of Maryland (College Park, Md.)
Advanced cryptography that goes beyond what is currently deployed to service our basic internet infrastructure is continuing to see widespread adoption. The enhanced functionality achieved by these schemes frequently yields an increase in complexity. Solely considering the asymptotic security of the underlying computational assumptions is often insufficient to realize practical and secure instantiations. In these cases, determining the risk of any particular deployment involves analyzing the concrete security (the exact length of time it would take to break the encryption) as well as quantifying how concrete security can degrade over time due to any exploitable information leakage. In this dissertation, we examine two such cryptographic primitives where assessing concrete security is paramount. First, we consider the cryptanalysis of lattice problems (used as the basis for current standard quantum-resistant cryptosystems). We develop a novel side-channel attack on the FrodoKEM key encapsulation mechanism as submitted to the NIST Post Quantum Cryptography (PQC) standardization process.
Our attack involves poisoning the FrodoKEM Key Generation (KeyGen) process using a security exploit in DRAM known as “Rowhammer”. Additionally, we revisit the security of the lattice problem known as Learning with Errors (LWE) in the presence of information leakage. We further enhance the robustness of prior methodology by viewing side information from a geometric perspective. Our approach provides the rigorous promise that, as hints are integrated, the correct solution is a (unique) lattice point contained in an ellipsoidal search space. Second, we study the concrete security of interactive signature schemes (used as part of many Privacy Enhancing Technologies). To this end, we complete a new analysis of the performance of Wagner’s k-list algorithm [CRYPTO ‘02], which has found significant utility in computing forgeries on several interactive signature schemes that implicitly rely on the hardness of the ROS problem formulated by Schnorr [ICICS ‘01].