Unified Datacenter Power Management Considering On-Chip and Air Temperature Constraints

Date
2010
Author
Shi, Bing
Srivastava, Ankur
Advisor
Srivastava, Ankur
Abstract
Current approaches to datacenter power management (workload scheduling, CPU speed control, etc.) focus primarily on keeping the air temperature surrounding the servers within the manufacturer-specified constraint. This is problematic because several CPUs may still violate the on-chip thermal constraint, leading to reliability loss. The primary objective of this work is to develop a unified approach to datacenter power optimization (by controlling the CPU speeds) that accounts for the silicon-level temperature of VLSI components such as CPUs, the air temperature that directly impacts the reliability of other devices such as disks, and the performance delivered. Our algorithm follows a two-step approach: it optimally solves a convex approximation that assigns continuous frequency values to all CPUs, then applies a discretization step that legalizes the assigned frequencies. The experimental results indicate that our method guarantees that both the on-chip CPU temperature and the off-chip air temperature remain within their constraints. In contrast, the traditional approach of constraining only the air temperature results in on-chip temperature violations on about 40% of the CPUs, or requires 42% more power to pull the CPU temperatures back within the constraint by increasing the HVAC cooling.
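
The two-step pattern described in the abstract (assign continuous frequencies, then legalize them to supported levels) can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration and is not taken from the report: the thermal model T_chip = t_air + r_th * c_dyn * f^3, the parameter names, and the round-down rule are all hypothetical. The continuous step here is a closed-form per-CPU bound under a fixed air temperature, standing in for the report's convex program, and the air-temperature constraint is not modeled.

```python
from bisect import bisect_right


def continuous_frequencies(t_air, t_chip_max, r_th, c_dyn, f_max):
    """Toy stand-in for step 1: the largest continuous frequency per CPU
    that keeps the on-chip temperature within t_chip_max.

    Assumed (hypothetical) model: T_chip = t_air + r_th * c_dyn * f**3.
    """
    headroom = max(t_chip_max - t_air, 0.0)
    freqs = []
    for r, c in zip(r_th, c_dyn):
        f = (headroom / (r * c)) ** (1.0 / 3.0)
        freqs.append(min(f, f_max))
    return freqs


def legalize(freqs, levels):
    """Step 2: round each continuous frequency down to a supported level.

    Rounding down keeps a constraint satisfied only if temperature does not
    decrease as frequency decreases, which holds in the assumed model.
    A frequency below the lowest level maps to 0.0 (CPU left idle).
    """
    levels = sorted(levels)
    legal = []
    for f in freqs:
        idx = bisect_right(levels, f)  # number of levels <= f
        legal.append(levels[idx - 1] if idx > 0 else 0.0)
    return legal


if __name__ == "__main__":
    levels = [1.0, 1.5, 2.0, 2.5, 3.0]  # hypothetical frequency levels (GHz)
    cont = continuous_frequencies(
        t_air=25.0, t_chip_max=85.0,
        r_th=[0.5, 0.8], c_dyn=[10.0, 10.0], f_max=3.0,
    )
    print(legalize(cont, levels))
```

In this toy model, rounding down can only lower a CPU's temperature, so a continuous assignment that satisfies the on-chip limit still satisfies it after legalization. Whether the report's discretization step works this way is not stated in the abstract.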