Theses and Dissertations from UMD
Permanent URI for this community: http://hdl.handle.net/1903/2
New submissions to the thesis/dissertation collections are added automatically as they are received from the Graduate School. Currently, the Graduate School deposits all theses and dissertations from a given semester after the official graduation date. This means that there may be up to a 4-month delay in the appearance of a given thesis/dissertation in DRUM.
More information is available at Theses and Dissertations at University of Maryland Libraries.
2 results
Search Results
Codes with efficient erasure correction (2020)
Chen, Zitan; Barg, Alexander; Electrical Engineering; Digital Repository at the University of Maryland; University of Maryland (College Park, Md.)
Distributed storage systems are becoming increasingly ubiquitous in the emerging era of Internet of Things. Major internet technology companies employ large-scale distributed storage systems to accommodate the massive amounts of data generated and requested by global users. The need for reliable and efficient storage of immense amounts of data calls for new applications and development of classical error-correcting codes. This dissertation is devoted to a study of codes with efficient erasure correction for distributed storage systems. The efficiency of erasure correction is often assessed by two performance metrics, bandwidth and locality. In this dissertation we address several problems for each of these two metrics. We construct families of codes with optimal communication complexity for erasure correction ("repair bandwidth") for a heterogeneous storage model, and derive several results for the problem of optimal repair of Reed-Solomon codes. We also construct families of cyclic and convolutional codes with locality, extending the range of parameters for which such families were previously known.

Coding Schemes for Distributed Storage Systems (2017)
Ye, Min; Barg, Alexander; Electrical Engineering; Digital Repository at the University of Maryland; University of Maryland (College Park, Md.)
This thesis is devoted to problems in error-correcting codes motivated by data integrity problems arising in large-scale distributed storage systems. We study properties and constructions of Maximum Distance Separable (MDS) codes, which are widely used in storage applications since they provide the maximum failure tolerance for a given amount of storage overhead.
Two parameters of the code are especially important for storage applications: the amount of data transferred in the system during node repair (the repair bandwidth), which characterizes the network usage, and the volume of accessed data, which corresponds to the number of disk I/O operations. Therefore, recent research on MDS codes for distributed storage has focused on codes that can minimize these two quantities. A lower bound on the repair bandwidth of a code, called the cut-set bound, was proved by Dimakis et al. in 2010, and codes that attain this bound are said to have the optimal repair property. Explicit optimal-repair low-rate (rate $\le 1/2$) MDS codes were constructed by Rashmi et al. in 2011. At the same time, large-scale distributed systems such as the Google File System and Hadoop Distributed File System employ high-rate (rate $> 1/2$) MDS codes because they need to reduce storage overhead. Until recently, except for some particular cases, no general explicit constructions of high-rate optimal-repair MDS codes were known. In this thesis, we present the first explicit constructions of optimal-repair MDS codes, thereby providing a solution to the general construction problem of such codes for the high-rate regime. More specifically, we construct explicit MDS codes that can repair any number of failed nodes from any number of helper nodes with the smallest possible amount of downloaded/accessed data. For the particular case of repairing a single node failure, we further present an explicit family of MDS codes that minimize the amount of accessed data during the repair. This family of codes has the additional favorable property that the node size (the amount of information stored in the node) is also the smallest possible. Reducing the node size directly translates into reducing the complexity of storage systems.
While most studies on MDS codes with optimal repair bandwidth focus on array codes, the repair problem of widely used scalar codes such as Reed-Solomon codes has also recently attracted the attention of researchers. It has been an open problem whether scalar linear MDS codes can achieve the cut-set bound. In this thesis, we answer this question in the affirmative by giving explicit constructions of Reed-Solomon codes that can be repaired at the cut-set bound. We also prove a lower bound on the node size of optimally repairable scalar MDS codes, showing that the node size of our RS codes is close to the best possible for scalar linear codes. Finally, we extend the concept of repair bandwidth from erasure correction to error correction, which forms a new problem in coding theory. We prove a bound on the amount of downloaded information for this problem and present explicit code families that attain this bound for a wide range of parameters.
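
As a rough illustration of the cut-set bound mentioned in the abstract above: for an $(n,k)$ MDS code in which each node stores $l$ symbols, repairing a single failed node from $d$ helper nodes ($k \le d \le n-1$) requires downloading at least $d\,l/(d-k+1)$ symbols in total. The short Python sketch below evaluates this standard single-node form of the bound and compares it with naive repair, which downloads $k$ full nodes. The function names and the $(14, 10)$ parameters are hypothetical choices made for this example and do not come from the thesis.

```python
from fractions import Fraction


def cut_set_bound(n: int, k: int, d: int, node_size: int = 1) -> Fraction:
    """Lower bound on the total download needed to repair one failed node
    of an (n, k) MDS code from d helper nodes (single-node cut-set bound).

    node_size is the number of symbols stored per node (l in the text).
    """
    if not (k <= d <= n - 1):
        raise ValueError("the number of helpers d must satisfy k <= d <= n - 1")
    return Fraction(d * node_size, d - k + 1)


def naive_repair_download(k: int, node_size: int = 1) -> int:
    """Download of naive repair: fetch k full nodes and decode the codeword."""
    return k * node_size


if __name__ == "__main__":
    # Hypothetical high-rate example: rate k/n = 10/14 > 1/2.
    n, k = 14, 10
    d = n - 1  # contact every surviving node
    bound = cut_set_bound(n, k, d)
    naive = naive_repair_download(k)
    print(f"cut-set bound: {bound} node sizes, naive repair: {naive} node sizes")
```

With these parameters the bound is 13/4 of a node size, compared with 10 node sizes for naive repair. Setting $d = k$ makes the bound equal to the naive cost, which is why optimal repair relies on contacting more than $k$ helpers and downloading only a fraction of each helper's contents.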