University of Maryland Libraries, Digital Repository at the University of Maryland (DRUM)

    Decision Tree Construction for Data Mining on Cluster of Shared-Memory Multiprocessors

    View/Open
    CS-TR-4203.ps (275.7Kb)
    No. of downloads: 236

    Date
    2001-05-10
    Author
    Andrade, Henrique
    Kurc, Tahsin
    Sussman, Alan
    Saltz, Joel
    Abstract
    Classification of very large datasets is a challenging problem in data mining. It is desirable to have decision-tree classifiers that can handle large datasets, because a large dataset often increases the accuracy of the resulting classification model. Classification tree algorithms can benefit from parallelization because of the large memory and computation requirements of handling large datasets. Clusters of shared-memory multiprocessors (SMPs), in which each shared-memory node has a small number of processors (e.g., 2 to 8 processors) and is connected to the other nodes via a high-speed interconnect, have become a popular alternative to pure distributed-memory and shared-memory machines. A cluster of SMPs provides a two-tier architecture, in which a combination of shared-memory and distributed-memory paradigms can be employed. In this paper we investigate decision tree construction on a cluster of SMPs. We present an algorithm that employs a hybrid approach. The classification training dataset is partitioned across the SMP nodes so that each SMP node performs tree construction using a subset of the records in the dataset. Within each SMP node, tasks associated with an attribute are dynamically scheduled to the lightweight threads running on that node. We present experimental results on a Linux PC cluster with dual-processor SMP nodes. (Also cross-referenced as UMIACS-TR-2000-78)
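    Illustrative sketch (not taken from the report): the two-tier idea in the abstract can be mimicked on a single machine in Python. Records are partitioned across simulated "nodes", and within each node one task per attribute is handed to a thread pool, so idle threads pick up the next pending attribute. Everything specific here is an assumption made for illustration: the function names, the use of categorical attributes, the Gini criterion, and the merging of per-node histograms are not described in the abstract and may differ from the authors' algorithm. Each record is assumed to be a tuple whose last element is the class label.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor


def partition(records, num_nodes):
    """Round-robin split of the training records across simulated SMP nodes."""
    return [records[i::num_nodes] for i in range(num_nodes)]


def attribute_histogram(records, attr):
    """One per-attribute task: count (attribute value, class label) pairs."""
    return Counter((rec[attr], rec[-1]) for rec in records)


def node_histograms(records, num_attrs, threads_per_node=2):
    """Within one node, pool threads take attribute tasks as they become free."""
    with ThreadPoolExecutor(max_workers=threads_per_node) as pool:
        futures = {a: pool.submit(attribute_histogram, records, a)
                   for a in range(num_attrs)}
        return {a: f.result() for a, f in futures.items()}


def merge(per_node):
    """Combine node-local histograms (assumed stand-in for inter-node messages)."""
    merged = {}
    for hists in per_node:
        for attr, hist in hists.items():
            merged.setdefault(attr, Counter()).update(hist)
    return merged


def weighted_gini(hist):
    """Weighted Gini impurity of a multiway split on one categorical attribute."""
    by_value = {}
    for (value, label), n in hist.items():
        by_value.setdefault(value, Counter())[label] += n
    total = sum(hist.values())
    score = 0.0
    for counts in by_value.values():
        size = sum(counts.values())
        impurity = 1.0 - sum((c / size) ** 2 for c in counts.values())
        score += (size / total) * impurity
    return score


def best_split(records, num_attrs, num_nodes=2):
    """Pick the attribute index with the lowest weighted Gini impurity."""
    per_node = [node_histograms(part, num_attrs)
                for part in partition(records, num_nodes)]
    merged = merge(per_node)
    return min(range(num_attrs), key=lambda a: weighted_gini(merged[a]))
```

    Calling `best_split(records, num_attrs)` on a small list of tuples returns the index of the attribute chosen for the root split. A real cluster implementation would replace `merge` with message passing between nodes and would recurse on the resulting partitions to build the full tree.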
    URI
    http://hdl.handle.net/1903/1113
    Collections
    • Technical Reports from UMIACS
    • Technical Reports of the Computer Science Department

    DRUM is brought to you by the University of Maryland Libraries
    University of Maryland, College Park, MD 20742-7011 (301)314-1328.