# Tech Reports in Computer Science and Engineering

## Permanent URI for this community

The technical reports collections in this community are deposited by the Library of the Computer Science department. If you have questions about these collections, please contact library staff at library@cs.umd.edu.

## Browse

### Browsing Tech Reports in Computer Science and Engineering by Issue Date

Now showing 1–20 of 1220


**Hypothesis Testing with Errors in the Variables** (1995-02-06). David, Nancy; Stewart, G. W.

In this paper we give reason to hope that errors in regression variables are not as harmful as one might expect. Specifically, we will show that although the errors can change the values of the quantities one computes in a regression analysis, under certain conditions they leave the distributions of the quantities approximately unchanged.

**A Two-Stage Iteration for Solving Nearly Uncoupled Markov Chains** (1995-02-06). Stewart, G. W.; Stewart, W. J.; McAllister, D. F.

This paper is concerned with an iteration for determining the steady-state probability vector of a nearly uncoupled Markov chain. The states of these chains can be partitioned into aggregates with low probabilities of transitions between aggregates. The iteration consists of alternating block Gauss-Seidel iterations with Rayleigh-Ritz refinements. Under natural regularity conditions, the composite iteration reduces the error by a factor proportional to the size of the coupling between aggregates, so that the more loosely the chain is coupled, the faster the convergence.

**An Iterative Method for Solving Linear Inequalities** (1995-02-06). Stewart, G. W.

This paper describes and analyzes a method for finding nontrivial solutions of the inequality $Ax \geq 0$, where $A$ is an $m \times n$ matrix of rank $n$. The method is based on the observation that a certain function $f$ has a unique minimum if and only if the inequality *fails to have* a nontrivial solution. Moreover, if there is a solution, an attempt to minimize $f$ will produce a sequence that will diverge in a direction that converges to a solution of the inequality. The technique can also be used to solve inhomogeneous inequalities and hence linear programming problems, although no claims are made about competitiveness with existing methods.

**Invariant Subspaces and Capital Punishment (A Participatory Paper)** (1995-02-06). Stewart, G.
W.

The notion of invariant subspaces is useful in a number of theoretical and practical applications. In this paper we give an elementary treatment of invariant subspaces that stresses their connection with simple eigenvalues and their eigenvectors.

**A Study of File Manipulation by Novices Using Commands vs. Direct Manipulation** (1995-04-13). Margono, Sepeedeh; Shneiderman, Ben

There are three basic interactive styles of control in human interfaces with computers: command, menu, and direct manipulation. In the past few years, these three styles have become the subject of many studies. However, few comparisons have been made between interfaces that use direct manipulation and command styles. This experiment compares file manipulation operations on the Apple Macintosh, which has a direct manipulation interface, with the IBM PC running MS-DOS, which has a command interface. After a brief training period, novices accomplished file manipulation tasks more rapidly, with fewer errors, and with greater satisfaction on the Apple Macintosh. Problems arising with both interfaces are discussed and suggestions for improvements are made. (Also cross-referenced as CAR-TR-264)

**The Relative Importance of Concurrent Writers and Weak Consistency Models** (1998-10-15). Keleher, Pete

This paper presents a detailed comparison of the relative importance of allowing concurrent writers versus the choice of the underlying consistency model. Our comparison is based on single- and multiple-writer versions of a lazy release consistent (LRC) protocol, and a single-writer sequentially consistent protocol, all implemented in the CVM software distributed shared memory system. We find that in our environment, which we believe to be representative of distributed systems today and in the near future, the consistency model has a much higher impact on overall performance than the choice of whether to allow concurrent writers.
The multiple-writer protocol performs an average of 9% better than the single-writer LRC protocol, but 34% better than the single-writer sequentially consistent protocol. Set against this, MW-LRC required an average of 72% memory overhead, compared to 10% overhead for the single-writer protocols.

**Human Face Segmentation and Identification** (1998-10-15). Sirohey, Saad Ahmed

This thesis considers segmentation and identification of human faces from grey-scale images with clutter. The segmentation developed utilizes the elliptical structure of the human head. It uses the information present in the edge map of the image and through some preprocessing separates the head from the background clutter. An ellipse is then fitted to mark the boundary between the head region and the background. The identification procedure finds feature points in the segmented face through a Gabor wavelet decomposition and performs graph matching. The segmentation and identification algorithms were tested on a database of 48 images of 16 persons with encouraging results. (Also cross-referenced as CAR-TR-695)

**Fast Subsequence Matching in Time-Series Databases** (1998-10-15). Faloutsos, Christos; Ranganathan, M.; Manolopoulos, Yannis

We present an efficient indexing method to locate 1-dimensional subsequences within a collection of sequences, such that the subsequences match a given (query) pattern within a specified tolerance. The idea is to map each data sequence into a small set of multidimensional rectangles in feature space. Then, these rectangles can be readily indexed using traditional spatial access methods, like the R*-tree [Beckmann et al., 1990]. In more detail, we use a sliding window over the data sequence and extract its features; the result is a trail in feature space. We propose an efficient and effective algorithm to divide such trails into sub-trails, which are subsequently represented by their Minimum Bounding Rectangles (MBRs).
We also examine queries of varying lengths, and we show how to handle each case efficiently. We implemented our method and carried out experiments on synthetic and real data (stock price movements). We compared the method to sequential scanning, which is the only obvious competitor. The results were excellent: our method accelerated the search time from 3 times up to 100 times. Appeared in ACM SIGMOD 1994, pp. 419-429, where it received the "Best Paper" award. (Also cross-referenced as UMIACS-TR-93-131)

**Analysis of the n-dimensional quadtree decomposition for arbitrary hyper-rectangles** (1998-10-15). Faloutsos, Christos; Jagadish, H. V.; Manolopoulos, Yannis

We give a closed-form expression for the average number of $n$-dimensional quadtree nodes ("pieces" or "blocks") required by an $n$-dimensional hyper-rectangle aligned with the axes. Our formula includes as special cases the formulae of previous efforts for 2-dimensional spaces [Faloutsos, 1992]. It also agrees with theoretical and empirical results that the number of blocks depends on the hyper-surface of the hyper-rectangle and not on its hyper-volume. The practical use of the derived formula is that it allows the estimation of the space requirements of the $n$-dimensional quadtree decomposition. Quadtrees are used extensively in 2-dimensional spaces (geographic information systems and spatial databases in general), as well as in higher-dimensional spaces (as oct-trees for 3-dimensional spaces, e.g. in graphics, robotics, and 3-dimensional medical images [Arya et al., 1994]). Our formula permits the estimation of the space requirements for data hyper-rectangles when stored in an index structure like an ($n$-dimensional) quadtree, as well as the estimation of the search time for query hyper-rectangles. A theoretical contribution of the paper is the observation that the number of blocks is a piece-wise linear function of the sides of the hyper-rectangle.
(Also cross-referenced as UMIACS-TR-94-130)

**Stabbing Orthogonal Objects in 3-Space** (1998-10-15). Mount, David M.; Pu, Fan-Tao

We consider a problem that arises in the design of data structures for answering *visibility range queries*: given a $3$-dimensional scene defined by a set of polygonal patches, we wish to preprocess the scene to answer queries involving the set of patches of the scene that are visible from a given range of points over a given range of viewing directions. These data structures recursively subdivide space into cells until some criterion is satisfied. One of the important problems that arise in the construction of such data structures is that of determining whether a cell represents a nonempty region of space, and more generally computing the size of a cell. In this paper we introduce a measure of the *size* of the subset of lines in 3-space that stab a given set of $n$ polygonal patches, based on the maximum angle and distance between any two lines in the set. Although the best known algorithm for computing this size measure runs in $O(n^2)$ time, we show that if the polygonal patches are orthogonal rectangles, then this measure can be approximated to within a constant factor in $O(n)$ time. (Also cross-referenced as UMIACS-TR-96-71)

**Iterative Methods for Solving Ax = b: GMRES/FOM versus QMR/BiCG** (1998-10-15). Cullum, Jane K.

We study the convergence of GMRES/FOM and QMR/BiCG methods for solving nonsymmetric $Ax = b$. We prove that given the results of a BiCG computation on $Ax = b$, we can obtain a matrix $B$ with the same eigenvalues as $A$ and a vector $c$ such that the residual norms generated by a FOM computation on $Bx = c$ are identical to those generated by the BiCG computation. Using a unitary equivalence for each of these methods, we obtain test problems where we can easily vary certain spectral properties of the matrices.
We use these test problems to study the effects of nonnormality on the convergence of GMRES and QMR, to study the effects of eigenvalue outliers on the convergence of QMR, and to compare the convergence of restarted GMRES and QMR across a family of normal and nonnormal problems. Our GMRES tests on nonnormal test matrices indicate that nonnormality can have unexpected effects upon the residual norm convergence, giving misleading indications of superior convergence when the error norms for GMRES are not significantly different from those for QMR. Our QMR tests indicate that the convergence of the QMR residual and error norms is influenced predominantly by small and large eigenvalue outliers and by the character (real, complex, or nearly real) of the outliers and the other eigenvalues. In our comparison tests QMR outperformed GMRES(10) and GMRES(20) on both the normal and nonnormal test matrices. If you have difficulty viewing the second part of the linked postscript file, open the file http://www.cs.umd.edu/fs/ftp/pub/papers/papers/3587.figures.ps, which contains the second part of the paper. (Also cross-referenced as UMIACS-TR-96-2)

**Qualitative Analysis for Maintenance Process Assessment** (1998-10-15). Briand, Lionel; Kim, Yong-Mi; Melo, Walcelio L.; Seaman, Carolyn B.; Basili, Victor R.

In order to improve software maintenance processes, we first need to be able to characterize and assess them. These tasks must be performed in depth and with objectivity since the problems are complex. One approach is to set up a measurement-based software process improvement program specifically aimed at maintenance. However, establishing a measurement program requires that one understands the problems to be addressed by the measurement program and is able to characterize the maintenance environment and processes in order to collect suitable and cost-effective data. Also, enacting such a program and getting usable data sets takes time.
A short-term substitute is therefore needed. We propose in this paper a characterization process aimed specifically at maintenance and based on a general qualitative analysis methodology. This process is rigorously defined in order to be repeatable and usable by people who are not acquainted with such analysis procedures. A basic feature of our approach is that actual implemented software changes are analyzed in order to understand the flaws in the maintenance process. Guidelines are provided and a case study is shown that demonstrates the usefulness of the approach. (Also cross-referenced as UMIACS-TR-96-7)

**Concurrent Maintenance of Skip Lists** (1998-10-15). Pugh, William

This paper describes a new approach to providing efficient concurrent access to a dynamic search structure. Previous approaches have attempted to solve this problem using search trees (either balanced or unbalanced). We describe methods for performing concurrent access and updates using skip lists. Skip lists are a probabilistic alternative to balanced trees that provide much of the simplicity of unbalanced trees, together with good worst-case expected performance. In this paper, we briefly review skip lists, describe simple methods for concurrent maintenance of sorted linked lists, formally prove the correctness of these methods, and show how they can be extended to provide simple and efficient algorithms for concurrent maintenance of skip lists. (Also cross-referenced as UMIACS-TR-90-80)

**What Size Neural Network Gives Optimal Generalization? Convergence Properties of Backpropagation** (1998-10-15). Lawrence, Steve; Giles, C. Lee; Tsoi, Ah Chung

One of the most important aspects of any machine learning paradigm is how it scales according to problem size and complexity.
Using a task with known optimal training error and a pre-specified maximum number of training updates, we investigate the convergence of the backpropagation algorithm with respect to (a) the complexity of the required function approximation, (b) the size of the network in relation to the size required for an optimal solution, and (c) the degree of noise in the training data. In general, for (a) the solution found is worse when the function to be approximated is more complex, for (b) oversize networks can result in lower training and generalization error, and for (c) the use of committee or ensemble techniques becomes more beneficial as the amount of noise in the training data is increased. For the experiments we performed, we do not obtain the optimal solution in any case. We further support the observation that larger networks can produce better training and generalization error using a face recognition example, where a network with many more parameters than training points generalizes better than smaller networks. (Also cross-referenced as UMIACS-TR-96-22)

**UM Translog: A Planning Domain for the Development and Benchmarking of Planning Systems** (1998-10-15). Andrews, Scott; Kettler, Brian; Erol, Kutluhan; Hendler, James

The last twenty years of AI planning research have produced a wide variety of planning techniques, such as state-space search, hierarchical planning, case-based planning, and reactive planning. These techniques have been implemented in numerous planning systems (e.g., STRIPS, SNLP, UCPOP, NONLIN, SIPE). Initially, a number of simple toy domains were devised to assist in the analysis and evaluation of planning systems and techniques; the most well-known examples are "Blocks World" and "Towers of Hanoi." As planning systems grow in sophistication and capabilities, however, there is a clear need for planning benchmarks of matching complexity to evaluate those new features and capabilities.
UM Translog is a planning domain designed specifically for this purpose. UM Translog was inspired by the CMU Transport Logistics domain developed by Manuela Veloso, but is an order of magnitude larger in size (41 actions versus 6), in number of features, and in types of interactions. It provides a rich set of entities, attributes, actions, and conditions, which can be used to specify rather complex planning problems with a variety of plan interactions. The detailed set of operators yields long plans (~40 steps) with many possible solutions to the same problem, so this domain can also be used to evaluate the solution quality of planning systems. The UM Translog domain has been used with the UMCP, UM Nonlin, and CaPER planning systems thus far. (Also cross-referenced as UMIACS-TR-95-69)

**Communication and Organization in Software Development: An Empirical Study** (1998-10-15). Seaman, Carolyn B.; Basili, Victor R.

The empirical study described in this paper addresses the issue of communication among members of a software development organization. The independent variables are various attributes of organizational structure. The dependent variable is the effort spent on sharing information which is required by the software development process in use. The research questions upon which the study is based ask whether or not these attributes of organizational structure have an effect on the amount of communication effort expended. In addition, a number of blocking variables have been identified; these are used to account for factors other than organizational structure which may have an effect on communication effort. The study uses both quantitative and qualitative methods for data collection and analysis. These methods include participant observation, structured interviews, and graphical data presentation.
The results of this study indicate that several attributes of organizational structure do affect communication effort, but not in a simple, straightforward way. In particular, the distances between communicators in the reporting structure of the organization, as well as in the physical layout of offices, affect how quickly they can share needed information, especially during meetings. These results provide a better understanding of how organizational structure helps or hinders communication in software development. This work was supported in part by IBM's Centre for Advanced Studies and by NASA grant NSG-5123. (Also cross-referenced as UMIACS-TR-96-23)

**Optimization within a Unified Transformation Framework** (1998-10-15). Kelly, Wayne

Programmers typically want to write scientific programs in a high-level language with semantics based on a sequential execution model. To execute efficiently on a parallel machine, however, a program typically needs to contain explicit parallelism and possibly explicit communication and synchronization. So we need compilers to convert programs from the first of these forms to the second. There are two basic choices to be made when parallelizing a program. First, the computations of the program need to be distributed amongst the set of available processors. Second, the computations on each processor need to be ordered. My contribution has been the development of simple mathematical abstractions for representing these choices and of new algorithms for making these choices. I have developed a new framework that achieves good performance by minimizing communication between processors, minimizing the time processors spend waiting for messages from other processors, and ordering data accesses so as to exploit the memory hierarchy. This framework can be used by optimizing compilers, as well as by interactive transformation tools.
The state of the art for vectorizing compilers is already quite good, but much work remains to bring parallelizing compilers up to the same standard. The main contribution of my work can be summarized as improving this situation by replacing existing ad hoc parallelization techniques with a sound underlying foundation on which future work can be built. (Also cross-referenced as UMIACS-TR-96-93)

**Assessing Software Review Meetings: Results of a Comparative Analysis of Two Experimental Studies** (1998-10-15). Porter, Adam A.; Johnson, Philip M.

Software review is a fundamental tool for software quality assurance. Nevertheless, there are significant controversies as to the most efficient and effective review method. One of the most important questions currently being debated is the utility of meetings: although almost all industrial review methods are centered around the inspection meeting, recent findings call their value into question. To gain insight into these issues, the two authors of this paper separately and independently conducted controlled experimental studies. This paper discusses a joint effort to understand the broader implications of these two studies. To do this, we designed and carried out a process of "reconciliation" in which we established a common framework for the comparison of the two experimental studies, re-analyzed the experimental data with respect to this common framework, and compared the results. Through this process we found many striking similarities between the results of the two studies, strengthening their individual conclusions. It also revealed interesting differences between the two experiments, suggesting important avenues for future research. (Also cross-referenced as UMIACS-TR-97-15)

**Sorting on Clusters of SMPs** (1998-10-15). Helman, David R.; JaJa, Joseph

Clusters of symmetric multiprocessors (SMPs) have emerged as the primary candidates for large-scale multiprocessor systems.
In this paper, we introduce an efficient sorting algorithm for clusters of SMPs. This algorithm relies on a novel scheme for stably sorting on a single SMP, coupled with balanced regular communication on the cluster. Our SMP algorithm seems to be asymptotically faster than any of the published algorithms we are aware of. The algorithms were implemented in C using POSIX threads and the SIMPLE library of communication primitives, and run on a cluster of DEC AlphaServer 2100A systems. Our experimental results verify the scalability and efficiency of our proposed solution and illustrate the importance of considering both the memory hierarchy and the overhead of shifting to multiple nodes. (Also cross-referenced as UMIACS-TR-97-69)

**Software Engineering of Distributed Simulation Environments** (1998-10-15). Duff, James; Purtilo, James M.; Capps, Michael; Stotts, David

With the increasing popularity of simulation and virtual environment software, it is necessary to provide software engineering techniques to simulation program designers. In this paper we lay out the requirements that any such techniques will have to meet, then suggest a formalism and an interconnection tool that will allow the interconnection of re-usable simulator components to build distributed simulation software. (Also cross-referenced as UMIACS-TR-95-100)
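Several of the reports above build on the skip-list structure from Pugh's "Concurrent Maintenance of Skip Lists." As background, here is a minimal *sequential* skip list (search and insert only) sketched in Python; it is not the concurrent algorithm from the report, and the `MAX_LEVEL` cap and promotion probability `P` are illustrative choices, not values taken from the report:

```python
import random

class Node:
    def __init__(self, key, level):
        self.key = key
        self.next = [None] * level  # one forward pointer per level

class SkipList:
    """Minimal sequential skip list: search and insert only."""
    MAX_LEVEL = 16  # assumed cap on the number of levels
    P = 0.5         # probability of promoting a node one level up

    def __init__(self):
        self.head = Node(None, self.MAX_LEVEL)  # sentinel header
        self.level = 1  # number of levels currently in use

    def _random_level(self):
        # Flip coins: a node appears on level i+1 with probability P^i.
        lvl = 1
        while random.random() < self.P and lvl < self.MAX_LEVEL:
            lvl += 1
        return lvl

    def search(self, key):
        # Descend from the top level, moving right while keys are smaller.
        node = self.head
        for i in range(self.level - 1, -1, -1):
            while node.next[i] is not None and node.next[i].key < key:
                node = node.next[i]
        node = node.next[0]
        return node is not None and node.key == key

    def insert(self, key):
        # Record, per level, the rightmost node left of the insertion point.
        update = [self.head] * self.MAX_LEVEL
        node = self.head
        for i in range(self.level - 1, -1, -1):
            while node.next[i] is not None and node.next[i].key < key:
                node = node.next[i]
            update[i] = node
        lvl = self._random_level()
        self.level = max(self.level, lvl)
        new = Node(key, lvl)
        for i in range(lvl):  # splice the new node into each of its levels
            new.next[i] = update[i].next[i]
            update[i].next[i] = new
```

The appeal of skip lists noted in the abstract shows up here: insertion touches only local pointers at each level, with no rebalancing, which is what makes fine-grained concurrent maintenance tractable in the report's setting.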