Hill-Climbing SMT Processor Resource Distribution

dc.contributor.advisorYeung, Donalden_US
dc.contributor.authorChoi, Seungryulen_US
dc.contributor.departmentComputer Scienceen_US
dc.contributor.publisherDigital Repository at the University of Marylanden_US
dc.contributor.publisherUniversity of Maryland (College Park, Md.)en_US
dc.description.abstractThe key to high performance in SMT processors lies in optimizing the shared resources distribution among simultaneously executing threads. Existing resource distribution techniques optimize performance only indirectly. They infer potential performance bottlenecks by observing indicators, like instruction occupancy or cache miss count, and take actions to try to alleviate them. While the corrective actions are designed to improve performance, their actual performance impact is not known since end performance is never monitored. Consequently, opportunities for performance gains are lost whenever the corrective actions do not effectively address the actual performance bottlenecks occurring in the SMT processor pipeline. In this dissertation, we propose a different approach to SMT processor resource distribution that optimizes end performance directly. Our approach observes the impact that resource distribution decisions have on performance at runtime, and feeds this information back to the resource distribution mechanisms to improve future decisions. By successively applying and evaluating different resource distributions, our approach tries to learn the best distribution over time. Because we perform learning on-line, learning time is crucial. We develop a hill-climbing SMT processor resource distribution technique that efficiently learns the best resource distribution by following the performance gradient within the resource distribution space. This dissertation makes three contributions within the context of learning-based SMT processor resource distribution. First, we characterize and quantify the time-varying performance behavior of SMT processors. This analysis provides understanding of the behavior and guides the design of our hill-climbing algorithm. Second, we present a hill-climbing SMT processor resource distribution technique that performs learning on-line. The performance evaluation of our approach shows a 11.4% gain over ICOUNT, 11.5% gain over FLUSH, and 2.8% gain over DCRA across a large set of 63 multiprogrammed workloads. Third, we compare existing resource distribution techniques to an ideal learning-based technique that performs learning off-line to show the potential performance of the existing techniques. This limit study identifies the performance bottleneck of the existing techniques, showing that the performance of ICOUNT, FLUSH, and DCRA is 13.2%, 13.5%, and 6.6%, respectively, lower than the ideal performance. Our hill-climbing based resource distribution, however, handles most of the bottlenecks of the existing techniques properly, achieving 4.1% lower performance than the ideal case.en_US
dc.format.extent2273068 bytes
dc.subject.pqcontrolledComputer Scienceen_US
dc.subject.pqcontrolledEngineering, Electronics and Electricalen_US
dc.subject.pquncontrolledComputer architectureen_US
dc.subject.pquncontrolledMicro processor designen_US
dc.subject.pquncontrolledSimultaneous multi-threadingen_US
dc.subject.pquncontrolledResource distributionen_US
dc.subject.pquncontrolledHill-climbing algorithmen_US
dc.titleHill-Climbing SMT Processor Resource Distributionen_US


Original bundle
Now showing 1 - 1 of 1
Thumbnail Image
2.17 MB
Adobe Portable Document Format