Theses and Dissertations from UMD
Permanent URI for this community: http://hdl.handle.net/1903/2
New submissions to the thesis/dissertation collections are added automatically as they are received from the Graduate School. Currently, the Graduate School deposits all theses and dissertations from a given semester after the official graduation date. This means that there may be up to a 4-month delay in the appearance of a given thesis/dissertation in DRUM.
More information is available at Theses and Dissertations at University of Maryland Libraries.
Search Results
Private Information Retrieval and Security in Networks (2018)
Banawan, Karim; Ulukus, Sennur; Electrical Engineering; Digital Repository at the University of Maryland; University of Maryland (College Park, Md.)
This dissertation focuses on privacy and security issues in networks from an information-theoretic perspective. Protecting privacy requires protecting the identity of the desired message from the data source. This is highly desirable in next-generation networks, where data-mining techniques are present everywhere. Ensuring security requires that the data content is not interpretable by non-authorized nodes. This is critical in wireless networks, which are inherently open.
We first focus on the privacy issue by investigating the private information retrieval (PIR) problem. PIR is a canonical problem to study the privacy of the downloaded content from public databases. In PIR, a user wishes to retrieve a file from distributed databases, in such a way that no database can know the identity of the user's desired file. PIR schemes need to be designed to be more efficient than the trivial scheme of downloading all the files stored in the databases. Fundamentally, PIR lies at the intersection of computer science, information theory, coding theory, and signal processing. The classical PIR formulation makes the following assumptions: The content is exactly replicated across the databases; the user wishes to retrieve a single file privately; the databases do not collude; the databases answer the user queries truthfully; the database answers go through noiseless orthogonal channels; there are no external security threats; the answer strings have unconstrained symmetric lengths. These assumptions are too idealistic to be practical in modern systems.
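To make the classical setting concrete, the following is a minimal sketch of the well-known two-database XOR scheme under the classical assumptions (replicated content, a single desired file, non-colluding and truthful databases). It is an illustration only, not the capacity-achieving scheme of this dissertation; all function names are hypothetical, and the files are assumed to have equal length.

```python
import secrets


def xor_bytes(a: bytes, b: bytes) -> bytes:
    """Bytewise XOR of two equal-length byte strings."""
    return bytes(x ^ y for x, y in zip(a, b))


def make_queries(num_files: int, wanted: int) -> tuple[list[int], list[int]]:
    """Build one 0/1 selection vector per database.

    The first vector is uniformly random; the second differs from it only
    in the position of the wanted file. Each vector on its own is uniformly
    distributed, so a single database learns nothing about `wanted`.
    """
    q1 = [secrets.randbelow(2) for _ in range(num_files)]
    q2 = q1.copy()
    q2[wanted] ^= 1
    return q1, q2


def answer(files: list[bytes], query: list[int]) -> bytes:
    """Database side: return the XOR of all files selected by the query."""
    acc = bytes(len(files[0]))
    for f, bit in zip(files, query):
        if bit:
            acc = xor_bytes(acc, f)
    return acc


def retrieve(files: list[bytes], wanted: int) -> bytes:
    """User side: query two replicated databases and combine the answers."""
    q1, q2 = make_queries(len(files), wanted)
    a1 = answer(files, q1)  # answer from database 1
    a2 = answer(files, q2)  # answer from database 2
    # Every file selected by both queries cancels; only the wanted file remains.
    return xor_bytes(a1, a2)
```

The user downloads two file-lengths to obtain one file, a rate of 1/2, whereas the trivial scheme of downloading all K files has rate 1/K.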
In this thesis, we introduce extended versions of the classical PIR problem to be relevant to modern applications, namely: PIR from coded databases, multi-message PIR, PIR from colluding and Byzantine databases, PIR under asymmetric traffic constraints, noisy PIR, and PIR from wiretap channel II. We characterize the fundamental limits of such problems from an information-theoretic perspective. This involves two parts: first, we devise a practical scheme that retrieves the desired file(s) correctly and privately; second, we mathematically prove that no other retrieval scheme can achieve any higher rate than the proposed scheme. The optimal retrieval rate is called the PIR capacity, reminiscent of the capacity of communication channels.
First, we consider PIR from MDS-coded databases. Due to node failures and erasures that arise naturally in any storage system, redundancy should be introduced. However, replicating the content across the databases incurs a high storage cost. This motivates coding the content of the databases instead of merely replicating it. We investigate the PIR problem from MDS-coded databases. We determine the optimal retrieval scheme for this problem, and characterize the exact PIR capacity. The result implies a fundamental tradeoff between the retrieval cost and the storage cost.
Second, we consider multi-message PIR. In this problem, the user is interested in retrieving multiple files from the databases without revealing the identities of these messages. We show that multiple messages can be retrieved more efficiently than retrieving them one by one in sequence. When the user wishes to retrieve at least half of the files stored in the databases, we characterize the exact capacity of the problem by proposing a novel scheme that downloads MDS-coded mixtures of all messages. For all other cases, we develop a near-optimal scheme which is optimal if the ratio between the total number of files and the number of desired files is an integer.
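The tradeoff between retrieval cost and storage cost for MDS-coded databases can be illustrated numerically. The sketch below assumes the capacity expression as it is usually stated in the PIR literature, C = 1 / (1 + R + ... + R^(M-1)), where R = K/N is the rate of the (N, K) MDS storage code and M is the number of files. The abstract itself does not state this formula, and the function and parameter names are illustrative.

```python
from fractions import Fraction


def coded_pir_capacity(n_databases: int, code_dim: int, num_files: int) -> Fraction:
    """Assumed capacity of PIR from (N, K) MDS-coded databases with M files.

    code_dim = 1 corresponds to plain replication (a repetition code).
    """
    rate = Fraction(code_dim, n_databases)
    return 1 / sum(rate**i for i in range(num_files))


def storage_overhead(n_databases: int, code_dim: int) -> Fraction:
    """Total stored symbols per data symbol, i.e. the inverse code rate."""
    return Fraction(n_databases, code_dim)


# Four databases, two files: replication versus a (4, 2) MDS code.
replicated = coded_pir_capacity(4, 1, 2)   # 4/5, at 4x storage
mds_coded = coded_pir_capacity(4, 2, 2)    # 2/3, at 2x storage
```

Under this assumed expression, halving the storage overhead from 4x to 2x lowers the retrieval rate from 4/5 to 2/3.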
Third, we consider PIR from colluding and Byzantine databases. In this problem, a subset of the databases, called Byzantine databases, can return arbitrarily corrupted answers. In addition, a subset of the databases can collude by exchanging user queries. The errors introduced by the Byzantine databases can be unintentional (if databases store an outdated message set), or even worse, can be intentional (as in the case of maliciously controlled databases). We propose a Byzantine and collusion resilient retrieval scheme, and determine the exact PIR capacity for this problem. The capacity expression reveals that the effect of the Byzantine databases is equivalent to removing twice the number of Byzantine databases from the system.
Fourth, we consider PIR under asymmetric traffic constraints. A common property of the schemes constructed for the existing PIR settings is that they exhibit a symmetric structure across the databases. In practice, this may be infeasible, for instance when the links from the databases have different capacities. To that end, we develop a novel upper bound for the PIR capacity that incorporates the traffic asymmetry. We propose explicit achievability schemes for specific traffic ratios. For any other traffic ratio, we employ time-sharing. Our results show that asymmetry fundamentally hurts the retrieval rate.
Fifth, we consider noisy PIR, where the returned answers reach the user via noisy channel(s). This is motivated by practical applications such as random packet dropping, random packet corruption, and PIR over wireless networks. We consider two variations of the problem, namely: noisy PIR with orthogonal links, and PIR from a multiple access channel. For noisy PIR with orthogonal links, we show that the channel coding and the retrieval scheme are almost separable in the sense that the noisy channel affects only the traffic ratio.
For the PIR problem from a multiple access channel, the output of the channel is a mixture of all the answers returned by the databases. In this case, we show explicit examples where the channel coding and the retrieval scheme are inseparable, and privacy may be achieved for free.
Sixth, we consider PIR from wiretap channel II. In this problem, there is an external eavesdropper who wishes to learn the contents of the databases by observing portions of the traffic exchanged between the user and the databases during the PIR process. The databases must encrypt their responses such that the eavesdropper learns nothing from its observation. We design a retrieval code that satisfies the combined privacy and security constraints. We show the necessity of using asymmetric retrieval schemes which build on our work on PIR under asymmetric traffic constraints.
Next, we focus on the security problem in multi-user networks by physical layer techniques. Physical layer security enables secure transmission of information without a need for encryption keys. Hence, it mitigates the problems associated with exchanging encryption keys across open wireless networks. Existing work in physical layer security makes the following assumptions: All nodes are altruistic and follow a prescribed transmission policy to maximize the secure rate of the entire system; the channel inputs to Gaussian channels are constrained by a total transmitter-side power constraint; and in secure degrees of freedom studies for interference channels, users have a single antenna each. We address these issues by investigating the MIMO interference channel with confidential messages, security in networks with user misbehavior, and MIMO wiretap channel under receiver-side power constraints. We characterize the optimal secure transmission strategies in terms of the secrecy capacity and its high-SNR approximation, the secure degrees of freedom (s.d.o.f.).
First, we determine the exact s.d.o.f. region of the two-user MIMO interference channel with confidential messages (ICCM). To that end, we propose a novel achievable scheme for the 2x2 ICCM system, which is a building block for any other antenna configuration. We show that the s.d.o.f. region starts as a square region, then it takes the shape of an irregular polytope until it returns to a square region when the number of transmit antennas is at least twice the number of receiving antennas.
Second, we investigate the security problem in the presence of user misbehavior. We consider the following multi-user scenarios: the multiple access wiretap channel with deviating users who do not follow agreed-upon optimum protocols, where we quantify the effect of user deviations and propose counter-strategies for the honest users; the broadcast channel with confidential messages in the presence of combating helpers, where we show that the malicious intentions of the helpers are neutralized and the full s.d.o.f. is retained; and the interference channel with confidential messages when the users are selfish and have conflicting interests, where we show that selfishness precludes secure communication and no s.d.o.f. is achieved.
Third, we consider the MIMO wiretap channel with a receiver-side minimum power constraint in addition to the usual transmitter-side power constraint. This problem is motivated by energy harvesting communications with wireless energy transfer, where an added goal is to deliver a minimum amount of energy to a receiver in addition to delivering secure data to another receiver. We prove that the problem is equivalent to solving a secrecy capacity problem with a double-sided correlation matrix constraint on the channel input. We extend the channel enhancement technique to our setting. We propose two optimum schemes that achieve the optimum rate: Gaussian signaling with a fixed mean and Gaussian signaling with Gaussian artificial noise.
We extend our techniques to other related multi-user settings.
Coding Schemes for Distributed Storage Systems (2017)
Ye, Min; Barg, Alexander; Electrical Engineering; Digital Repository at the University of Maryland; University of Maryland (College Park, Md.)
This thesis is devoted to problems in error-correcting codes motivated by data integrity problems arising in large-scale distributed storage systems. We study properties and constructions of Maximum Distance Separable (MDS) codes, which are widely used in storage applications since they provide the maximum failure tolerance for a given amount of storage overhead. Among the parameters of the code that are important for storage applications are: the amount of data transferred in the system during node repair (the repair bandwidth), which characterizes the network usage, and the volume of accessed data, which corresponds to the number of disk I/O operations. Therefore, recent research on MDS codes for distributed storage has focused on codes that can minimize these two quantities. A lower bound on the repair bandwidth of a code, called the cut-set bound, was proved by Dimakis et al. in 2010, and codes that attain this bound are said to have the optimal repair property. Explicit optimal-repair low-rate (rate $\le 1/2$) MDS codes were constructed by Rashmi et al. in 2011. At the same time, large-scale distributed systems such as the Google File System and Hadoop Distributed File System employ high-rate (rate $> 1/2$) MDS codes due to the need to reduce storage overhead. Until recently, except for some particular cases, no general explicit constructions of high-rate optimal-repair MDS codes were known. In this thesis, we present the first explicit constructions of optimal-repair MDS codes, thereby providing a solution to the general construction problem of such codes for the high-rate regime.
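As a worked illustration of the cut-set bound mentioned above, the block below states it for the repair of a single failed node in the form commonly given in the regenerating-codes literature. The notation is an assumption, not taken from the abstract: an $(n, k)$ MDS code, node size $\ell$, and $d$ helper nodes. The (14, 10) parameters are chosen only as an example of a high-rate code.

```latex
% Assumed notation: (n, k) MDS code, node size \ell, d helper nodes, k \le d \le n - 1.
\[
  \text{repair bandwidth} \;\ge\; \frac{d\,\ell}{d - k + 1}
\]
% Example: n = 14, k = 10, d = 13 helper nodes.
\[
  \frac{13\,\ell}{13 - 10 + 1} = 3.25\,\ell
  \quad\text{versus}\quad
  k\,\ell = 10\,\ell \ \text{for naive repair (downloading $k$ full nodes).}
\]
```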
More specifically, we construct explicit MDS codes that can repair any number of failed nodes from any number of helper nodes with the smallest possible amount of downloaded/accessed data. For the particular case of repairing a single node failure, we further present an explicit family of MDS codes that minimize the amount of accessed data during the repair. This family of codes has an additional favorable property that the node size (the amount of information stored in the node) is also the smallest possible. Reducing the node size directly translates into reducing the complexity of storage systems. While most studies on MDS codes with optimal repair bandwidth focus on array codes, the repair problem of widely used scalar codes such as Reed-Solomon codes has also recently attracted the attention of researchers. It has been an open problem whether scalar linear MDS codes can achieve the cut-set bound. In this thesis, we answer this question in the affirmative by giving explicit constructions of Reed-Solomon codes that can be repaired at the cut-set bound. We also prove a lower bound on the node size of optimally repairable scalar MDS codes, showing that the node size of our RS codes is close to the best possible for scalar linear codes. Finally, we extend the concept of repair bandwidth from erasure correction to error correction, which forms a new problem in coding theory. We prove a bound on the amount of downloaded information for this problem and present explicit code families that attain this bound for a wide range of parameters.
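The MDS property that underlies both abstracts (any k of the n stored symbols suffice to recover the data) can be sketched with a toy Reed-Solomon code. This is a minimal illustration, assuming a small prime field GF(257) and hypothetical function names; it is not the construction of the thesis, and it performs naive recovery from k full symbols rather than bandwidth-optimal repair.

```python
from itertools import combinations

P = 257  # small prime field size, chosen only for illustration (requires n <= P)


def interpolate_at(points: list[tuple[int, int]], x0: int) -> int:
    """Evaluate at x0 the unique polynomial of degree < len(points) through points (mod P)."""
    total = 0
    for i, (xi, yi) in enumerate(points):
        num, den = 1, 1
        for j, (xj, _) in enumerate(points):
            if i != j:
                num = num * (x0 - xj) % P
                den = den * (xi - xj) % P
        # pow(den, -1, P) is the modular inverse (Python 3.8+).
        total = (total + yi * num * pow(den, -1, P)) % P
    return total


def encode(data: list[int], n: int) -> list[int]:
    """Systematic (n, k) Reed-Solomon encoding: symbol x is the polynomial value at x."""
    points = list(enumerate(data))  # data symbols are the values at x = 0..k-1
    return [interpolate_at(points, x) for x in range(n)]


def decode(surviving: dict[int, int], k: int) -> list[int]:
    """Recover the k data symbols from any k surviving (position, symbol) pairs."""
    points = list(surviving.items())[:k]
    return [interpolate_at(points, x) for x in range(k)]


# A (5, 3) code tolerates the loss of any two nodes.
data = [10, 200, 33]
codeword = encode(data, 5)
recovered = [
    decode({pos: codeword[pos] for pos in subset}, 3)
    for subset in combinations(range(5), 3)
]
```

Every entry of `recovered` equals `data`, which is the maximum failure tolerance possible for a storage overhead of 5/3.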