Browsing by Author "Saltz, Joel"
Now showing 1 - 20 of 56
Item: Active Proxy-G: Optimizing the Query Execution Process in the Grid (2002-05-22)
Andrade, Henrique; Kurc, Tahsin; Sussman, Alan; Saltz, Joel
The Grid environment facilitates collaborative work and allows many users to query and process data over geographically dispersed data repositories. Over the past several years, there has been growing interest in developing applications that interactively analyze datasets, potentially in a collaborative setting. We describe an Active Proxy-G service that is able to cache query results, use those results to answer new incoming queries, generate subqueries for the parts of a query that cannot be produced from the cache, and submit the subqueries for final processing at application servers that store the raw datasets. We present an experimental evaluation to illustrate the effects of various design tradeoffs. We also show the benefits that two real applications gain from using the middleware. (Also UMIACS-TR-2002-41)

Item: Adaptive Runtime Support for Direct Simulation Monte Carlo Methods on Distributed Memory Architectures (1998-10-15)
Moon, Bongki; Saltz, Joel
In highly adaptive irregular problems such as many Particle-In-Cell (PIC) codes and Direct Simulation Monte Carlo (DSMC) codes, data access patterns may vary from time step to time step. This fluctuation may hinder efficient utilization of distributed memory parallel computers because of the resulting overhead for data redistribution and dynamic load balancing. To efficiently parallelize such adaptive irregular problems on distributed memory parallel computers, several issues such as effective methods for domain partitioning and fast data transportation must be addressed. This paper presents efficient runtime support methods for such problems. A simple one-dimensional domain partitioning method is implemented and compared with unstructured mesh partitioners such as recursive coordinate bisection and recursive inertial bisection.
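A simple one-dimensional domain partitioning of the kind mentioned above can be sketched as follows (an illustrative guess at the general technique, not the paper's implementation; all names are assumed): particles are ordered along one coordinate and divided into slabs of roughly equal population, which balances load without a full unstructured-mesh partitioner.

```python
def partition_1d(coords, nparts):
    """Assign each particle (given its 1-D coordinate) a partition id so
    that the partitions hold roughly equal numbers of particles."""
    order = sorted(range(len(coords)), key=lambda i: coords[i])
    part = [0] * len(coords)
    for rank, i in enumerate(order):
        # the rank-th particle in sorted order lands in slab rank*nparts//n
        part[i] = rank * nparts // len(coords)
    return part
```

Because the cut depends only on current particle positions, the routine can be rerun cheaply after particles move, which is exactly the situation where a remapping decision policy must weigh redistribution cost against load imbalance.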
A remapping decision policy has been investigated for dynamic load balancing on 3-dimensional DSMC codes. Performance results are presented. (Also cross-referenced as UMIACS-TR-95-27)

Item: Analysis of the Clustering Properties of Hilbert Space-filling Curve (1998-10-15)
Moon, Bongki; Jagadish, H.V.; Faloutsos, Christos; Saltz, Joel
Several schemes for linear mapping of multidimensional space have been proposed for applications such as access methods for spatio-temporal databases and image compression. In all these applications, one of the most desired properties of such linear mappings is clustering, meaning that the locality between objects in the multidimensional space is preserved in the linear space. It is widely believed that the Hilbert space-filling curve achieves the best clustering. In this paper we provide closed-form formulas for the number of clusters required by a given query region of arbitrary shape (e.g., polygons and polyhedra) for the Hilbert space-filling curve. Both the asymptotic solution for the general case and the exact solution for a special case generalize previous work, and they agree with the empirical result that the number of clusters depends on the hyper-surface area of the query region and not on its hyper-volume. We also show that the Hilbert curve achieves better clustering than the z-curve. From a practical point of view, the formulas given in this paper provide a simple measure that can be used to predict disk access behavior and hence total access time. (Also cross-referenced as UMIACS-TR-96-20)

Item: Applying DEF/USE Information of Pointer Statements to Traversal-Pattern-Aware Pointer Analysis (1998-10-15)
Hwang, Yuan-Shin; Saltz, Joel
Pointer analysis is essential for optimizing and parallelizing compilers. It examines pointer assignment statements and estimates pointer-induced aliases among pointer variables or possible shapes of dynamic recursive data structures.
However, previously proposed techniques are not able to gather useful information, or have to give up further optimizations, when the overall recursive data structure appears to be cyclic even though the patterns of traversal are linear. The reason is that these techniques perform pointer analysis without knowledge of the traversal patterns of the dynamic recursive data structures being constructed. This paper proposes an approach, traversal-pattern-aware pointer analysis, which first identifies the structures specified by the traversal patterns of a program within cyclic data structures and then performs analysis on the specified structures. This paper presents an algorithm to perform shape analysis on the structures specified by traversal patterns. The advantage of this approach is that if the specified structures are recognized to be acyclic, parallelization or optimizations can be applied even when the overall data structures might be cyclic. The DEF/USE information of pointer statements is used to relate the identified traversal patterns to the pointer statements that build the recursive data structures. (Also cross-referenced as UMIACS-TR-97-66)

Item: Applying Traversal-Pattern-Sensitive Pointer Analysis to Dependence Analysis (1998-10-15)
Hwang, Yuan-Shin; Saltz, Joel
This paper presents a technique for dependence analysis on programs with pointers or dynamic recursive data structures. It differs from previously proposed approaches in analyzing structure access conflicts between traversal patterns before gathering alias and connection information. Conflict analysis is conducted under the assumption that each unique path leads to a distinct storage location, and hence traversal patterns can be analytically compared to identify possible conflicts. The rationale of this assumption is that if statements are deemed dependent by this approach, they are inherently sequential regardless of the shapes of the data structures they traverse.
Consequently, there is no need to perform alias/connection analysis on the statements that construct such data structures. Furthermore, the traversal-pattern information gathered in the conflict analysis phase can direct the alias/connection analysis algorithm to focus on the statements that are crucial to optimizations or parallelization. Such a traversal-pattern-sensitive pointer analysis algorithm is also presented.

Item: Compile-Time Analysis on Programs with Dynamic Pointer-Linked Data Structures (1998-10-15)
Hwang, Yuan-Shin; Saltz, Joel
This paper studies static analysis on programs that create and traverse dynamic pointer-linked data structures. It introduces a new type of auxiliary structure, called link graphs, to depict the alias information of pointers and the connection relationships of dynamic pointer-linked data structures. Link graphs can be used by compilers to detect side effects, to identify patterns of traversal, and to gather the DEF-USE information of dynamic pointer-linked data structures. The results of this compile-time analysis are essential for parallelization and for optimizing communication and synchronization overheads. Algorithms that perform compile-time analysis of side effects and DEF-USE information using link graphs are proposed.

Item: Compiler and Runtime Support for Programming in Adaptive Parallel Environments (1998-10-15)
Edjlali, Guy; Agrawal, Gagan; Sussman, Alan; Humphries, Jim; Saltz, Joel
For better utilization of computing resources, it is important to consider parallel programming environments in which the number of available processors varies at runtime. In this paper, we discuss runtime support for data parallel programming in such an adaptive environment. Executing programs in an adaptive environment requires redistributing data when the number of processors changes, and also requires determining new loop bounds and communication patterns for the new set of processors.
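The bookkeeping such a runtime must redo can be illustrated with a small sketch (a simplified illustration of the general idea, not the authors' library; names are assumed): each processor owns a contiguous block of a global array, and when the processor count changes both the new block bounds (hence new loop bounds) and the data movement between old and new owners are recomputed.

```python
def block_bounds(n, nprocs):
    """Lower/upper (exclusive) global index owned by each processor
    under a block distribution of n elements."""
    base, extra = divmod(n, nprocs)
    bounds, start = [], 0
    for p in range(nprocs):
        size = base + (1 if p < extra else 0)
        bounds.append((start, start + size))
        start += size
    return bounds

def redistribution_plan(n, old_procs, new_procs):
    """For each (old_proc, new_proc) pair, the global index range that
    must move when going from old_procs to new_procs processors."""
    old = block_bounds(n, old_procs)
    new = block_bounds(n, new_procs)
    plan = []
    for po, (lo_o, hi_o) in enumerate(old):
        for pn, (lo_n, hi_n) in enumerate(new):
            lo, hi = max(lo_o, lo_n), min(hi_o, hi_n)
            if lo < hi:  # overlapping ownership: data must be transferred
                plan.append((po, pn, lo, hi))
    return plan
```

Each tuple in the plan corresponds to one message: the old owner sends the listed global index range to the new owner.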
We have developed a runtime library to provide this support. We discuss how the runtime library can be used by compilers of HPF-like languages to generate code for an adaptive environment. We present performance results for a Navier-Stokes solver and a multigrid template run on a network of workstations and an IBM SP-2. Our experiments show that if the number of processors is not varied frequently, the cost of data redistribution is not significant compared to the time required for the actual computation. Overall, our work establishes the feasibility of compiling HPF for a network of non-dedicated workstations, which are likely to be an important resource for parallel programming in the future. (Also cross-referenced as UMIACS-TR-95-83)

Item: Compiler Supported High-level Abstractions for Sparse Disk-Resident Datasets (2001-09-05)
Ferreira, Renato; Agrawal, Gagan; Saltz, Joel
Processing and analysing large volumes of data plays an increasingly important role in many domains of scientific research. The complexity and irregularity of datasets in many domains make the task of developing such processing applications tedious and error-prone. We propose the use of high-level abstractions for hiding the irregularities in these datasets and enabling rapid development of correct, but not necessarily efficient, data processing applications. We present two execution strategies and a set of compiler analysis techniques for obtaining high performance from applications written using our proposed high-level abstractions. Our execution strategies achieve high locality in disk accesses. Once a disk block is read from the disk, all iterations that read any of the elements from this disk block are performed.
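This block-centric execution order can be sketched as follows (an illustration of the stated strategy under assumed data structures, not the compiler-generated code): rather than looping over iterations and fetching whichever blocks each one touches, the schedule is inverted so that each disk block is read once and every iteration touching it runs then.

```python
def plan_by_block(iter_reads):
    """iter_reads maps an iteration id to the list of disk block ids it
    reads.  Returns (schedule, naive_reads, blockwise_reads): the schedule
    maps each block to the iterations touching it, so each block is read
    from disk only once instead of once per touching iteration."""
    by_block = {}
    for it, blocks in iter_reads.items():
        for b in blocks:
            by_block.setdefault(b, []).append(it)
    naive = sum(len(blocks) for blocks in iter_reads.values())
    return by_block, naive, len(by_block)
```

The gap between `naive_reads` and `blockwise_reads` is the disk-locality win the execution strategy is after.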
To support our execution strategies and improve performance, we have developed static analysis techniques for: 1) computing the set of iterations that access a particular right-hand-side element, 2) generating a function that can be applied to the meta-data associated with each disk block to determine whether that disk block needs to be read, and 3) performing code hoisting of conditionals. (Cross-referenced as UMIACS-TR-2001-50)

Item: Compiler-directed Dynamic Linking for Mobile Programs (1998-10-15)
Acharya, Anurag; Saltz, Joel
In this paper, we present a compiler-directed technique for safe dynamic linking for mobile programs. Our technique guarantees that linking failures can occur only when a program arrives at a new execution site, and that such a failure can be delivered to the program as an error code or an exception. We use interprocedural analysis to identify the set of names that must be linked at the different sites the program executes on. We use a combination of runtime and compile-time techniques to identify the calling context and to link only the names needed in that context. Our technique is able to handle recursive programs as well as separately compiled code that may itself be able to move. We discuss language constructs for controlling the behavior of dynamic linking and the implications of some of these constructs for application structure. (Also cross-referenced as UMIACS-TR-96-81)

Item: A Component-based Implementation of Iso-surface Rendering for Visualizing Large Datasets (2001-05-10)
Beynon, Michael D.; Kurc, Tahsin; Catalyurek, Umit; Sussman, Alan; Saltz, Joel
Isosurface rendering is a technique for extracting and visualizing surfaces within a 3D volume. It is a widely used visualization method in many application areas. In this paper, we describe a component-based implementation of isosurface rendering for visualizing very large datasets in a distributed, heterogeneous environment.
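The extraction step at the heart of isosurface rendering can be hinted at with a toy 2-D analogue (a simplification for illustration only; the paper's implementation is component-based and operates on 3D volumes): find the grid cells whose corner values straddle the isovalue, since those are the only cells the surface, here a contour, can pass through.

```python
def crossing_cells(grid, iso):
    """Cells of a 2-D scalar grid whose corner values straddle the
    isovalue; a marching-squares/cubes style extractor would then
    interpolate geometry only inside these candidate cells."""
    cells = []
    for i in range(len(grid) - 1):
        for j in range(len(grid[0]) - 1):
            corners = (grid[i][j], grid[i + 1][j],
                       grid[i][j + 1], grid[i + 1][j + 1])
            if min(corners) <= iso <= max(corners):
                cells.append((i, j))
    return cells
```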
We use DataCutter, a component framework that supports subsetting and user-defined processing of large multi-dimensional datasets in a distributed environment. We present experimental results on a heterogeneous collection of multiprocessor machines. (Cross-referenced as UMIACS-TR-2001-34)

Item: A Customizable Simulator for Workstation Networks (1998-10-15)
Uysal, Mustafa; Acharya, Anurag; Bennett, Robert; Saltz, Joel
We present a customizable simulator called netsim for high-performance point-to-point workstation networks that is accurate enough to be used for application-level performance analysis, yet easy enough to customize for multiple architectures and software configurations. Customization is accomplished without using any proprietary information, using only publicly available hardware specifications and information that can be readily determined with a suite of test programs. We customized netsim for two platforms: a 16-node IBM SP-2 with a multistage network and a 10-node DEC Alpha Farm with an ATM switch. We show that netsim successfully models these two architectures with a 2-6% error on the SP-2 and a 10% error on the Alpha Farm for most test cases. It achieves this accuracy at the cost of a 7-36 fold simulation slowdown with respect to the SP-2 and a 3-8 fold slowdown with respect to the Alpha Farm. In addition, we show that cross-traffic congestion in today's high-speed point-to-point networks has little, if any, effect on application-level performance, and that modeling end-point congestion is sufficient for a reasonably accurate simulation. (Also cross-referenced as UMIACS-TR-96-68)

Item: Data Parallel Programming in an Adaptive Environment (1998-10-15)
Edjlali, Guy; Agrawal, Gagan; Sussman, Alan; Saltz, Joel
For better utilization of computing resources, it is important to consider parallel programming environments in which the number of available processors varies at runtime.
In this paper, we discuss runtime support for data parallel programming in such an adaptive environment. Executing data parallel programs in an adaptive environment requires redistributing data when the number of processors changes, and also requires determining new loop bounds and communication patterns for the new set of processors. We have developed a runtime library to provide this support. We discuss how the runtime library can be used by compilers to generate code for an adaptive environment. We also present performance results for a multiblock Navier-Stokes solver run on a network of workstations using PVM for message passing. Our experiments show that if the number of processors is not varied frequently, the cost of data redistribution is not significant compared to the time required for the actual computations. (Also cross-referenced as UMIACS-TR-94-109)

Item: DataCutter and A Client Interface for the Storage Resource Broker with DataCutter Services (2000-05-13)
Kurc, Tahsin; Beynon, Michael; Sussman, Alan; Saltz, Joel
The continuing increase in the capabilities of high performance computers and continued decreases in the cost of secondary and tertiary storage systems are making it increasingly feasible to generate and archive very large (e.g., petabyte and larger) datasets. Applications are also increasingly likely to make use of archived data obtained by different types of sensors, including imaging devices deployed on satellites and aircraft, microscopy-related imagery, and radiology-related imagery. Simulation or sensor datasets generated or acquired by one group may need to be accessed over a wide-area network by other groups. Datasets frequently describe data associated with collections of very large structured or unstructured grids, where each grid point is associated with several variables. Applications frequently need to obtain only portions of a dataset. The required data may correspond to a particular region in a multidimensional space.
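Selecting the data falling in such a multidimensional query region can be written down in a few lines (a toy illustration only; this is not the DataCutter or SRB API), with a stride parameter additionally modeling subsampled access:

```python
def in_region(point, region):
    """region is a list of (lo, hi) bounds, one per dimension, inclusive."""
    return all(lo <= x <= hi for x, (lo, hi) in zip(point, region))

def subset(points, region, stride=1):
    """All grid points inside the query region; a stride > 1 keeps only
    every stride-th match, i.e. a subsampled set of locations."""
    hits = [p for p in points if in_region(p, region)]
    return hits[::stride]
```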
The application may need to access all data in a multidimensional region, or it may need only certain variable values at a subsampled set of spatial locations. In addition, in some cases, applications may require data products obtained by aggregating data in one way or another; for instance, a user might require time- or space-averaged data. This document describes the design of a middleware infrastructure, called DataCutter, that enables subsetting and user-defined filtering of multi-dimensional datasets stored in archival storage systems across a wide-area network. We also describe a client API for Storage Resource Broker (SRB) clients, which allows SRB clients to carry out subsetting and filtering of datasets stored through the SRB. This API uses a prototype implementation of the DataCutter indexing and filtering services. (Also cross-referenced as UMIACS-TR-2000-26)

Item: Decision Tree Construction for Data Mining on Cluster of Shared-Memory Multiprocessors (2001-05-10)
Andrade, Henrique; Kurc, Tahsin; Sussman, Alan; Saltz, Joel
Classification of very large datasets is a challenging problem in data mining. It is desirable to have decision-tree classifiers that can handle large datasets, because a large dataset often increases the accuracy of the resulting classification model. Classification tree algorithms can benefit from parallelization because of the large memory and computation requirements of handling large datasets. Clusters of shared-memory multiprocessors (SMPs), in which each shared-memory node has a small number of processors (e.g., 2-8) and is connected to the other nodes via a high-speed interconnect, have become a popular alternative to pure distributed-memory and shared-memory machines. A cluster of SMPs provides a two-tier architecture in which a combination of shared-memory and distributed-memory paradigms can be employed. In this paper we investigate decision tree construction on a cluster of SMPs.
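A hybrid use of the two tiers can be sketched schematically (an illustrative sketch under assumed helper names, not necessarily the paper's algorithm): training records are partitioned across the SMP nodes, and within a node the per-attribute work, such as gathering split statistics, is dispatched dynamically to a small pool of threads rather than bound statically to processors.

```python
from concurrent.futures import ThreadPoolExecutor

def node_partition(records, nnodes):
    """Distributed-memory tier: round-robin the training records across nodes."""
    parts = [[] for _ in range(nnodes)]
    for i, r in enumerate(records):
        parts[i % nnodes].append(r)
    return parts

def split_stats(records, attr):
    """A per-attribute statistic a tree builder might need (here, value counts)."""
    counts = {}
    for r in records:
        counts[r[attr]] = counts.get(r[attr], 0) + 1
    return attr, counts

def node_work(records, attrs, nthreads=2):
    """Shared-memory tier: dynamically schedule one task per attribute
    onto node-local threads and collect the results."""
    with ThreadPoolExecutor(max_workers=nthreads) as pool:
        return dict(pool.map(lambda a: split_stats(records, a), attrs))
```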
We present an algorithm that employs a hybrid approach. The classification training dataset is partitioned across the SMP nodes, so that each SMP node performs tree construction using a subset of the records in the dataset. Within each SMP node, tasks associated with an attribute are dynamically scheduled onto light-weight threads running on that node. We present experimental results on a Linux PC cluster with dual-processor SMP nodes. (Also cross-referenced as UMIACS-TR-2000-78)

Item: Deferred Data-Flow Analysis: Algorithms, Proofs and Applications (1998-11-03)
Sharma, Shamik D.; Acharya, Anurag; Saltz, Joel
Loss of precision due to the conservative nature of compile-time dataflow analysis is a general problem and impacts a wide variety of optimizations. We propose a limited form of runtime dataflow analysis, called deferred dataflow analysis (DDFA), which attempts to sharpen dataflow results by using control-flow information that is available at runtime. The overheads of runtime analysis are minimized by performing the bulk of the analysis at compile time and deferring only a summarized version of the dataflow problem to runtime. Caching and reuse of dataflow results reduce these overheads further. DDFA is an interprocedural framework and can handle arbitrary control structures, including multi-way forks, recursion, separately compiled functions, and higher-order functions. It is primarily targeted towards optimization of heavy-weight operations such as communication calls, where one can expect significant benefits from sharper dataflow analysis. We outline how DDFA can be used to optimize different kinds of heavy-weight operations, such as bulk prefetching on distributed systems and dynamic linking in mobile programs. We prove that DDFA is safe and that it yields better dataflow information than strictly compile-time dataflow analysis.
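The deferral idea can be caricatured in a few lines (an idealized sketch with gen/kill sets, not the paper's algorithms or proofs): compile time produces a summary transfer function per control-flow region, and run time composes the summaries of only the regions actually executed, so paths that were merely possible no longer dilute the result.

```python
def apply_summary(facts, gen, kill):
    """Standard gen/kill transfer function applied to a set of dataflow facts."""
    return (facts - kill) | gen

def runtime_facts(initial, executed_path, summaries):
    """Compose precomputed per-region (gen, kill) summaries along the
    path the program actually took at run time."""
    facts = set(initial)
    for region in executed_path:
        gen, kill = summaries[region]
        facts = apply_summary(facts, gen, kill)
    return facts
```

A purely compile-time analysis would have to meet over all paths; composing summaries along the executed path is what lets the deferred analysis report the sharper fact set.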
(Also cross-referenced as UMIACS-TR-98-46)

Item: Design of a Framework for Data-Intensive Wide-Area Applications (2000-02-23)
Beynon, Michael D.; Kurc, Tahsin; Sussman, Alan; Saltz, Joel
Applications that use collections of very large, distributed datasets have become an increasingly important part of science and engineering. With high performance wide-area networks becoming more pervasive, there is interest in making collective use of distributed computational and data resources. Recent work has converged on the notion of the Grid, which attempts to uniformly present a heterogeneous collection of distributed resources. Current Grid research covers many areas, from low-level infrastructure issues to high-level application concerns. However, providing support for efficient exploration and processing of very large scientific datasets stored in distributed archival storage systems remains a challenging research issue. We have initiated an effort that focuses on developing efficient data-intensive applications in a Grid environment. In this paper, we present a framework, called filter-stream programming, that represents the processing units of a data-intensive application as a set of filters, which are designed to be efficient in their use of memory and scratch space. We describe a prototype infrastructure that supports execution of applications using the proposed framework. We present the implementation of two applications using the filter-stream programming framework, and discuss experimental results demonstrating the effects of heterogeneous resources on application performance. (Also cross-referenced as UMIACS-TR-2000-04)

Item: Efficient Execution of Multi-Query Data Analysis Batches Using Compiler Optimization Strategies (2003-08-01)
Andrade, Henrique; Aryangat, Suresh; Kurc, Tahsin; Saltz, Joel; Sussman, Alan
This work investigates the leverage that can be obtained from compiler optimization techniques for efficient execution of multi-query workloads in data analysis applications. Our approach is to address multi-query optimization at the algorithmic level, by transforming a declarative specification of scientific data analysis queries into a high-level imperative program that can be made more efficient by applying compiler optimization techniques. These techniques -- including loop fusion, common subexpression elimination, and dead code elimination -- are employed to allow data and computation reuse across queries. We describe a preliminary experimental analysis on a real remote sensing application that is used to analyze very large quantities of satellite data. The results show that our techniques achieve sizable reductions in the amount of computation and I/O necessary for executing query batches, as well as in the average execution times of the individual queries in a given batch. (UMIACS-TR-2003-76)

Item: Efficient Execution of Multiple Query Workloads in Data Analysis Applications (2001-05-10)
Andrade, Henrique; Kurc, Tahsin; Sussman, Alan; Saltz, Joel
Applications that analyze, mine, and visualize large datasets are considered an important class of applications in many areas of science, engineering, and business. Queries commonly executed in data analysis applications often involve user-defined processing of data and application-specific data structures. If data analysis is employed in a collaborative environment, the data server should execute multiple such queries simultaneously to minimize the response time to the clients of the data analysis application.
In a multi-client environment, there may be a large number of overlapping regions of interest and common processing requirements among the clients. Thus, better performance can be achieved if commonalities among multiple queries can be exploited. In this paper we present the design of a runtime system for executing multiple query workloads on a shared-memory machine. We describe initial experimental results using an application for browsing digitized microscopy images. (Cross-referenced as UMIACS-TR-2001-35)

Item: Efficient Support for Irregular Applications on Distributed Memory Machines (1998-10-15)
Mukherjee, Shubhendu S.; Sharma, Shamik D.; Hill, Mark D.; Larus, James R.; Rogers, Anne; Saltz, Joel
Irregular computation problems underlie many important scientific applications. Although these problems are computationally expensive, and so would seem appropriate for parallel machines, their irregular and unpredictable run-time behavior makes this type of parallel program difficult to write and adversely affects run-time performance. This paper explores three issues -- partitioning, mutual exclusion, and data transfer -- crucial to the efficient execution of irregular problems on distributed-memory machines. Unlike previous work, we studied the same programs running in three alternative systems on the same hardware base (a Thinking Machines CM-5): the CHAOS irregular application library, Transparent Shared Memory (TSM), and eXtensible Shared Memory (XSM). CHAOS and XSM performed equivalently for all three applications. Both systems were somewhat (13%) to significantly (991%) faster than TSM. (Also cross-referenced as UMIACS-TR-95-46)

Item: An Evaluation of Architectural Alternatives for Rapidly Growing Datasets: Active Disks, Clusters, SMPs (1998-12-08)
Uysal, Mustafa; Acharya, Anurag; Saltz, Joel
Growth and usage trends for several large datasets indicate a need for architectures that scale processing power as the dataset grows.
In this paper, we evaluate three architectural alternatives for rapidly growing and frequently reprocessed datasets: active disks, clusters, and shared-memory multiprocessors (SMPs). The focus of this evaluation is to identify potential bottlenecks in each of the alternative architectures and to determine the performance of these architectures for the applications of interest. We evaluate these architectural alternatives using a detailed simulator and a suite of nine applications. Our results indicate that for most of these applications, Active Disk and cluster configurations achieved significantly better performance than SMP configurations. Active Disk configurations were able to match (and in some cases improve upon) the performance of commodity cluster configurations. (Also cross-referenced as UMIACS-TR-98-68)