Pig Squeal: Bridging Batch and Stream Processing Using Incremental Updates
dc.contributor.advisor | Agrawala, Ashok | en_US |
dc.contributor.author | Lampton, James Holmes | en_US |
dc.contributor.department | Computer Science | en_US |
dc.contributor.publisher | Digital Repository at the University of Maryland | en_US |
dc.contributor.publisher | University of Maryland (College Park, Md.) | en_US |
dc.date.accessioned | 2015-06-25T05:49:24Z | |
dc.date.available | 2015-06-25T05:49:24Z | |
dc.date.issued | 2015 | en_US |
dc.description.abstract | As developers shift from batch MapReduce to stream processing for better latency, they are faced with the dilemma of changing tools and maintaining multiple code bases. In this work we present a method for converting arbitrary chains of MapReduce jobs into pipelined, incremental processes to be executed in a stream processing framework. Pig Squeal is an enhancement of the Pig execution framework that runs lightly modified user scripts on Storm. The contributions of this work include: an analysis that tracks how information flows through MapReduce computations, along with the influence of adding and deleting data from the input; a structure to generically handle these changes, along with a description of the criteria to re-enable efficiencies using combiners; case studies for running word count and the more complex NationMind algorithms within Squeal; and a performance model that examines the execution times of MapReduce algorithms after conversion. A general solution to the conversion of analytics from batch to streaming impacts developers with expertise in batch systems by providing a means to use their expertise in a new environment. Imagine a medical researcher who develops a model for predicting emergency situations in a hospital on historical data (in a batch system). They could apply these techniques to quickly deploy the resulting detectors on live patient feeds. It also significantly impacts organizations with large investments in batch code by providing a tool for rapid prototyping and significantly lowering the costs of experimenting in these new environments. | en_US |
dc.identifier | https://doi.org/10.13016/M2HC9H | |
dc.identifier.uri | http://hdl.handle.net/1903/16507 | |
dc.language.iso | en | en_US |
dc.subject.pqcontrolled | Computer science | en_US |
dc.subject.pquncontrolled | batch | en_US |
dc.subject.pquncontrolled | delta | en_US |
dc.subject.pquncontrolled | incremental | en_US |
dc.subject.pquncontrolled | performance | en_US |
dc.subject.pquncontrolled | pig | en_US |
dc.subject.pquncontrolled | streaming | en_US |
dc.title | Pig Squeal: Bridging Batch and Stream Processing Using Incremental Updates | en_US |
dc.type | Dissertation | en_US |
Files
Original bundle
- Name: Lampton_umd_0117E_15998.pdf
- Size: 1.38 MB
- Format: Adobe Portable Document Format