Compiler Supported High-level Abstractions for Sparse Disk-Resident Datasets

dc.contributor.author: Ferreira, Renato [en_US]
dc.contributor.author: Agrawal, Gagan [en_US]
dc.contributor.author: Saltz, Joel [en_US]
dc.date.accessioned: 2004-05-31T23:12:00Z
dc.date.available: 2004-05-31T23:12:00Z
dc.date.created: 2001-07 [en_US]
dc.date.issued: 2001-09-05 [en_US]
dc.description.abstract: Processing and analysing large volumes of data plays an increasingly important role in many domains of scientific research. The complexity and irregularity of datasets in many domains make the task of developing such processing applications tedious and error-prone. We propose the use of high-level abstractions for hiding the irregularities in these datasets and enabling rapid development of correct, but not necessarily efficient, data processing applications. We present two execution strategies and a set of compiler analysis techniques for obtaining high performance from applications written using our proposed high-level abstractions. Our execution strategies achieve high locality in disk accesses. Once a disk block is read from the disk, all iterations that read any of the elements from this disk block are performed. To support our execution strategies and improve the performance, we have developed static analysis techniques for: 1) computing the set of iterations that access a particular right-hand-side element, 2) generating a function that can be applied to the meta-data associated with each disk block, for determining if that disk block needs to be read, and 3) performing code hoisting of conditionals. Cross-referenced as UMIACS-TR-2001-50 [en_US]
dc.format.extent: 481911 bytes
dc.format.mimetype: application/postscript
dc.identifier.uri: http://hdl.handle.net/1903/1144
dc.language.iso: en_US
dc.relation.isAvailableAt: Digital Repository at the University of Maryland [en_US]
dc.relation.isAvailableAt: University of Maryland (College Park, Md.) [en_US]
dc.relation.isAvailableAt: Tech Reports in Computer Science and Engineering [en_US]
dc.relation.isAvailableAt: UMIACS Technical Reports [en_US]
dc.relation.ispartofseries: UM Computer Science Department; CS-TR-4270 [en_US]
dc.relation.ispartofseries: UMIACS; UMIACS-TR-2001-50 [en_US]
dc.title: Compiler Supported High-level Abstractions for Sparse Disk-Resident Datasets [en_US]
dc.type: Technical Report [en_US]
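
The abstract mentions two ideas that a short sketch may make more concrete: a function applied to each disk block's meta-data to decide whether the block needs to be read, and an execution order in which every iteration touching a block runs as soon as that block is in memory. The Python below is only an illustration under assumptions that are not taken from the report: the Block class, the one-dimensional lo/hi bounds used as meta-data, and the names block_needed and process are all hypothetical, and the range-sum query stands in for whatever reduction an application would perform.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Block:
    """Hypothetical disk block: bounds act as meta-data, elements as payload."""
    lo: float  # smallest coordinate stored in the block (meta-data)
    hi: float  # largest coordinate stored in the block (meta-data)
    elements: List[Tuple[float, float]]  # (coordinate, value) pairs


def block_needed(block: Block, q_lo: float, q_hi: float) -> bool:
    """Meta-data test: could the block hold any element inside [q_lo, q_hi]?"""
    return block.hi >= q_lo and block.lo <= q_hi


def process(blocks: List[Block], q_lo: float, q_hi: float) -> Tuple[float, int]:
    """Sum the values whose coordinate lies in [q_lo, q_hi].

    Returns the sum and the number of blocks that were actually read.
    """
    total = 0.0
    blocks_read = 0
    for block in blocks:
        # Decide from the meta-data alone whether the block is read at all.
        if not block_needed(block, q_lo, q_hi):
            continue
        blocks_read += 1
        # Perform every iteration that uses this block before moving on,
        # so each block is visited exactly once.
        for coord, value in block.elements:
            if q_lo <= coord <= q_hi:
                total += value
    return total, blocks_read


# Assumed toy dataset: three blocks covering disjoint coordinate ranges.
blocks = [
    Block(0, 9, [(1, 1.0), (8, 2.0)]),
    Block(10, 19, [(12, 4.0), (18, 8.0)]),
    Block(20, 29, [(25, 16.0)]),
]

print(process(blocks, 8, 12))  # (6.0, 2): the third block is never read
```

In this sketch the meta-data check is written by hand; the report describes static analysis that generates such a function from the application code, which this example does not attempt to reproduce.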

Files

Original bundle
Name: CS-TR-4270.ps
Size: 470.62 KB
Format: Postscript Files