Optimizing Power Distribution in Data Centers With Software-Defined Power

Power gets stranded in almost every data center. Stranded power exists whenever the power distributed to a rack exceeds the power that rack actually consumes, even during periods of peak utilization. For a single rack, this mismatch may seem trivial. But multiply a few kilowatts by the many racks that fill a typical data center, and the amount of stranded power can approach 50 percent of the total available. Reclaiming stranded power enables an organization to extend the life of its data center, thereby avoiding the huge capital expenditure that would be required for a major upgrade or the construction of a new facility.

Stranded power is not a problem per se, as it does not pose any risk to any IT equipment already installed. But it can impose a constraint on future expansion, and that in turn can lead to a very serious problem: a decision to upgrade the existing data center or build a new one, even though the power available is sufficient to accommodate expansion well into the future.

The all-too-common underlying cause of stranded power is reliance on the wildly overstated nameplate or datasheet ratings of equipment, especially servers. These ratings must specify the maximum possible power consumption, which rarely if ever actually occurs, even under peak load conditions, because most servers are only lightly populated with components. This is why using nameplate ratings often results in stranding 40 percent to 50 percent of the power being distributed to server racks.
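The arithmetic behind that 40 to 50 percent figure can be sketched in a few lines of Python. The rack values below are hypothetical, chosen only for illustration, and `stranded_power` is an illustrative helper rather than part of any standard or product.

```python
def stranded_power(provisioned_watts, measured_peak_watts):
    """Return (stranded watts, stranded fraction) for a single rack."""
    if provisioned_watts <= 0:
        raise ValueError("provisioned_watts must be positive")
    # Power is stranded only when more is provisioned than the rack ever draws.
    stranded = max(provisioned_watts - measured_peak_watts, 0)
    return stranded, stranded / provisioned_watts


# Hypothetical racks: (power provisioned from nameplate ratings,
# measured peak draw), both in watts.
racks = [(10000, 5500), (10000, 6000), (8000, 4200)]

total_provisioned = sum(provisioned for provisioned, _ in racks)
total_stranded = sum(stranded_power(p, m)[0] for p, m in racks)
stranded_fraction = total_stranded / total_provisioned

print(f"Stranded: {total_stranded} W of {total_provisioned} W "
      f"({stranded_fraction:.0%})")
```

With these assumed numbers, a few kilowatts stranded per rack adds up to roughly 44 percent of the provisioned total.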

Unfortunately, until recently IT managers had few viable alternatives to nameplate ratings. In an effort to promote greater energy efficiency in data centers, the U.S. Environmental Protection Agency created an ENERGY STAR rating system for servers and other IT equipment. But a high ENERGY STAR rating indicates only that a server is more “efficient” than others within the same category, so it does nothing to help optimize power distribution. Even worse, the rating ignores the year of production, which means there is no way to take into account improvements in performance and the many other advances made year over year.

Minimizing stranded power requires knowing the actual peak power being consumed by servers, and that requires a standardized means of testing. Underwriters Laboratories recently fulfilled this need with the UL2640 power measurement standard. Utilizing the PAR⁴ Efficiency Rating system, UL2640 specifies a test methodology that accurately measures actual power consumption under loads ranging from idle to peak, and provides other useful metrics, such as transactions per second per watt (TPS/W). It also adds the year to the rating to account for the many advances that occur over time.

Server virtualization is increasing, and although individual servers rarely achieve utilization rates above 60 percent, it is prudent to assume a reasonably high workload for power distribution purposes. The reason is that “utilization” is an average, and even a momentary peak in workload (when a rack of servers could be operating, albeit briefly, at or near 100 percent utilization) is enough to trip a circuit breaker. Peak consumption is therefore a vitally important consideration, and here too the UL2640 standard helps, by adding real precision to power allocation and equipment placement. While peak power consumption is the primary consideration in the context of server placement, transactions per second per watt should be used as the primary measure of efficiency, as well as the indicator of any potential for consolidation.
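A small Python sketch shows why sizing a circuit on average draw is risky. The power samples and the breaker limit are hypothetical, and the check treats any sample above the limit as a trip, which simplifies how real breakers respond to brief overloads.

```python
from statistics import mean


def fits_breaker(samples_watts, breaker_limit_watts):
    """Return True only if every sample stays at or below the limit."""
    return max(samples_watts) <= breaker_limit_watts


# Hypothetical rack power samples in watts, including one brief spike.
samples = [2400, 2500, 2600, 2500, 3500, 2500]
breaker_limit = 3000  # assumed circuit limit in watts

average_draw = mean(samples)
safe = fits_breaker(samples, breaker_limit)

print(f"Average draw is under the limit: {average_draw < breaker_limit}")
print(f"Peak draw is under the limit: {safe}")
```

The average suggests the rack fits comfortably on the circuit, while the single spike shows that it does not.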

Using the most energy-efficient servers helps ensure that the power being consumed is being put to its most effective use. This is the real power (forgive the pun) of UL2640’s TPS/W rating, which enables IT managers to compare the transactional efficiency of legacy servers with newer ones, and to compare newer models of servers with one another. And this measure of energy efficiency should be considered during every hardware refresh cycle and whenever adding capacity.
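As a rough illustration of how a TPS/W comparison might feed a consolidation estimate, consider the Python sketch below. The server figures are invented, and the ratio is computed simply as throughput divided by power at the same load point. It is not the UL2640 test procedure itself.

```python
import math
from dataclasses import dataclass


@dataclass
class ServerMeasurement:
    """Hypothetical measured figures for one server model at peak load."""
    name: str
    peak_tps: float    # transactions per second
    peak_watts: float  # power drawn while delivering that throughput

    @property
    def tps_per_watt(self) -> float:
        return self.peak_tps / self.peak_watts


def servers_needed(target_tps: float, candidate: ServerMeasurement) -> int:
    """Smallest number of candidate servers that covers target_tps."""
    return math.ceil(target_tps / candidate.peak_tps)


legacy = ServerMeasurement("legacy", peak_tps=2000, peak_watts=400)
newer = ServerMeasurement("newer", peak_tps=6000, peak_watts=300)

legacy_fleet = 30  # assumed number of legacy servers in service
replacement_count = servers_needed(legacy_fleet * legacy.peak_tps, newer)

print(f"{legacy.name}: {legacy.tps_per_watt:.1f} TPS/W")
print(f"{newer.name}: {newer.tps_per_watt:.1f} TPS/W")
print(f"Newer servers needed to replace the legacy fleet: {replacement_count}")
```

In this made-up case the newer model delivers four times the transactions per watt, so the same peak throughput needs a third as many machines.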

In one actual case involving a large data center that had exhausted its available power, the PAR⁴ measurements revealed that each server consumed only 2500 watts under normal load and never more than 3500 watts under peak load, which occurred less than 1 percent of the time. The server’s nameplate power rating of 7000 watts was, therefore, twice the actual maximum for this server configuration. With that knowledge, the data center operator was able to configure racks with nearly double the number of servers originally thought possible.
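The effect on rack density can be worked through with the figures from this case. In the Python sketch below, the 7000-watt nameplate rating and the 3500-watt measured peak come from the case itself. The 28,000-watt rack budget and the 10 percent safety headroom are assumptions added for illustration.

```python
def servers_per_rack(rack_budget_watts, per_server_watts, headroom_percent=0):
    """Number of servers that fit within a rack's power budget.

    headroom_percent reserves part of the budget as a safety margin.
    Integer arithmetic keeps the result exact.
    """
    usable_watts = rack_budget_watts * (100 - headroom_percent) // 100
    return usable_watts // per_server_watts


RACK_BUDGET_WATTS = 28000  # assumed; not stated in the case
NAMEPLATE_WATTS = 7000     # from the case
MEASURED_PEAK_WATTS = 3500 # from the case

by_nameplate = servers_per_rack(RACK_BUDGET_WATTS, NAMEPLATE_WATTS)
by_measured_peak = servers_per_rack(
    RACK_BUDGET_WATTS, MEASURED_PEAK_WATTS, headroom_percent=10
)

print(f"Servers per rack using nameplate ratings: {by_nameplate}")
print(f"Servers per rack using measured peak: {by_measured_peak}")
```

Under these assumptions the rack holds four servers when sized by nameplate and seven when sized by measured peak with headroom, which is in line with the "nearly double" outcome described above.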

Other power-related best practices involve achieving greater efficiencies in cooling and dynamically matching server capacity to actual workloads. These efforts require software-defined power that is seamlessly integrated with virtual environments and data center infrastructure management systems, so that it can continuously monitor capacity, power consumption and environmental conditions, and make adjustments dynamically within the context of the application service levels. And this is just the beginning of the powerful future of software-defined power.
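One step of such a capacity-matching loop might look like the Python sketch below. It is a deliberately simplified illustration, not a description of any particular product. The function names, the 20 percent reserve and the throughput figures are all hypothetical, and a real system would also weigh service levels, cooling and server start-up time.

```python
import math


def servers_to_keep_active(demand_tps, per_server_tps,
                           reserve_fraction=0.2, minimum=1):
    """Servers needed to carry current demand plus a reserve margin."""
    if per_server_tps <= 0:
        raise ValueError("per_server_tps must be positive")
    required = demand_tps * (1.0 + reserve_fraction) / per_server_tps
    return max(minimum, math.ceil(required))


def plan_adjustment(active_now, demand_tps, per_server_tps):
    """Positive result: power servers up. Negative: power servers down."""
    return servers_to_keep_active(demand_tps, per_server_tps) - active_now


# Hypothetical readings: 10 servers active, each able to handle 6000 TPS.
quiet_period = plan_adjustment(active_now=10, demand_tps=12000,
                               per_server_tps=6000)
busy_period = plan_adjustment(active_now=2, demand_tps=28000,
                              per_server_tps=6000)

print(f"Quiet period adjustment: {quiet_period}")
print(f"Busy period adjustment: {busy_period}")
```

Run periodically against live measurements, a rule like this would let idle servers be powered down during quiet periods and brought back before demand returns.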

***

Clemens Pfeiffer is the CTO of Power Assure and is a 25-year veteran of the software industry, where he has held leadership roles in process modeling and automation, software architecture and database design, and data center management and optimization technologies.