Mastering the Heat: Cooling & Power Solutions for a 50kW Rack Density AI Data Center

by Sean Murphy on 2/27/24 11:42 AM

As artificial intelligence (AI) continues to reshape industries and drive innovation, the demand for high-performance computing in data centers has reached unprecedented levels. Managing the cooling and power requirements of a 50kW rack density AI data center presents a unique set of challenges. In this blog post, we will explore effective strategies and cutting-edge solutions to ensure optimal performance and efficiency in such a demanding environment. 


Precision Cooling Systems

The heart of any high-density data center is its cooling system. For a 50kW rack density AI data center, precision cooling is non-negotiable. Invest in advanced cooling solutions such as in-row or overhead cooling units that can precisely target and remove heat generated by high-density servers. These systems offer greater control and efficiency compared to traditional perimeter cooling methods.

Liquid Cooling Technologies

Liquid cooling has emerged as a game-changer for high-density computing environments. Immersion cooling systems or direct-to-chip solutions can effectively dissipate heat generated by AI processors, allowing for higher power densities without compromising on reliability. Explore liquid cooling options to optimize temperature control in your data center.

High-Efficiency Power Distribution

To meet the power demands of a 50kW rack density, efficient power distribution is paramount. Implementing high-voltage power distribution systems and exploring alternative power architectures, such as busway systems, can enhance energy efficiency and reduce power losses. This not only ensures reliability but also contributes to sustainability efforts.
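To see why higher distribution voltage helps, here is a minimal Python sketch of the arithmetic. It assumes a balanced three-phase feed. The 208 V and 415 V figures and the unity power factor are illustrative assumptions, not values from a specific design.

```python
import math

def line_current_amps(load_kw: float, line_voltage: float, power_factor: float = 1.0) -> float:
    """Line current for a balanced three-phase load: I = P / (sqrt(3) * V_LL * PF)."""
    return (load_kw * 1000) / (math.sqrt(3) * line_voltage * power_factor)

def relative_conductor_loss(current_a: float, baseline_current_a: float) -> float:
    """I^2 * R loss relative to a baseline, assuming the same conductor resistance."""
    return (current_a / baseline_current_a) ** 2

RACK_KW = 50  # rack density discussed in this post

low = line_current_amps(RACK_KW, 208)   # assumed lower-voltage feed
high = line_current_amps(RACK_KW, 415)  # assumed higher-voltage feed

print(f"208 V: {low:.1f} A per phase")
print(f"415 V: {high:.1f} A per phase")
print(f"Relative conductor loss at 415 V: {relative_conductor_loss(high, low):.2f}")
```

For the same 50kW rack, roughly doubling the voltage roughly halves the current, and the resistive loss in a given conductor falls to about a quarter. Real designs also have to account for power factor, conductor sizing, and breaker ratings.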

Redundancy and Resilience

A high-density AI data center demands a robust power and cooling infrastructure with built-in redundancy. Incorporate N+1 or 2N redundancy models for both cooling and power systems to mitigate the impact of potential failures. Redundancy not only enhances reliability but also allows for maintenance without disrupting critical operations.
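As a rough illustration of what these redundancy models mean for equipment counts, the Python sketch below sizes a hypothetical row. The 500 kW load (ten 50kW racks) and the 100 kW unit capacity are assumptions chosen for round numbers, not figures from a real project.

```python
import math

def units_required(load_kw: float, unit_capacity_kw: float, model: str) -> int:
    """Number of cooling or power units needed under a given redundancy model."""
    n = math.ceil(load_kw / unit_capacity_kw)  # N: minimum units to carry the load
    if model == "N":
        return n
    if model == "N+1":
        return n + 1      # one spare unit
    if model == "2N":
        return 2 * n      # a fully duplicated system
    raise ValueError(f"Unknown redundancy model: {model}")

ROW_LOAD_KW = 10 * 50     # assumed: ten racks at 50 kW each
UNIT_CAPACITY_KW = 100    # assumed capacity of a single unit

for model in ("N", "N+1", "2N"):
    print(model, units_required(ROW_LOAD_KW, UNIT_CAPACITY_KW, model))
```

Under these assumptions, N+1 adds a single unit to cover one failure or maintenance outage, while 2N doubles the installed equipment.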

Dynamic Thermal Management

Utilize intelligent thermal management systems that adapt to the dynamic workload of AI applications. These systems can adjust cooling resources in real-time, ensuring that the infrastructure is optimized for varying loads. Dynamic thermal management contributes to energy efficiency by only using the necessary resources when and where they are needed.

Energy-Efficient Hardware

Opt for energy-efficient server hardware designed for high-density environments. AI-optimized processors often come with advanced power management features that can significantly reduce energy consumption. Choosing hardware that aligns with your data center's efficiency goals is a key factor in managing power and cooling requirements effectively.

Monitoring and Analytics

Implement comprehensive monitoring and analytics tools to gain insights into the performance of your AI data center. Real-time data on temperature, power consumption, and system health can help identify potential issues before they escalate. Proactive monitoring allows for predictive maintenance and ensures optimal conditions for your high-density racks.
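A simple way to picture proactive monitoring is a threshold check on each rack reading. The Python sketch below is only an illustration: the `RackReading` structure, the 90% power warning level, and the function names are hypothetical, and the 27 °C inlet limit is taken from the upper end of the ASHRAE recommended range as an assumed alarm point.

```python
from dataclasses import dataclass

@dataclass
class RackReading:
    rack_id: str
    inlet_temp_c: float  # server inlet air temperature
    power_kw: float      # measured rack power draw

# Assumed limits for illustration only
MAX_INLET_TEMP_C = 27.0      # upper end of the ASHRAE recommended inlet range
RACK_DESIGN_KW = 50.0        # rack density discussed in this post
POWER_WARN_FRACTION = 0.9    # warn at 90% of design power

def check_reading(reading: RackReading) -> list[str]:
    """Return a list of alert messages for one rack reading (empty if healthy)."""
    alerts = []
    if reading.inlet_temp_c > MAX_INLET_TEMP_C:
        alerts.append(f"{reading.rack_id}: inlet temperature {reading.inlet_temp_c} C is above limit")
    if reading.power_kw > RACK_DESIGN_KW * POWER_WARN_FRACTION:
        alerts.append(f"{reading.rack_id}: power {reading.power_kw} kW is near design capacity")
    return alerts

for reading in (RackReading("A1", 25.0, 40.0), RackReading("A2", 29.0, 47.0)):
    for alert in check_reading(reading):
        print(alert)
```

A production monitoring platform would add trending, alarm delays, and escalation, but the core idea of comparing live readings against agreed limits is the same.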

Successfully cooling and powering a 50kW rack density AI data center requires a holistic and forward-thinking approach. By investing in precision cooling, liquid cooling technologies, high-efficiency power distribution, redundancy, dynamic thermal management, energy-efficient hardware, and robust monitoring tools, you can create a resilient and high-performing infrastructure. Embrace the technological advancements available in the market to not only meet the challenges posed by high-density AI computing, but to excel in this dynamic and transformative era of data center management.

Author's Note:

Not a bad blog post, right? I was tasked with writing a blog post on how to power and cool high-density racks for AI applications. So, I had ChatGPT write my blog post in 15 seconds, saving me a ton of time and allowing me to enjoy watching my kid’s athletic events this weekend. As end users embrace AI technology, it is imperative that we understand how to support the hardware and software that enable these time-saving technologies. Over the past 6 months, about 20% of my time has been spent discussing how to support customers' 35kW to 75kW rack densities.

Another key point is the balance between AI and the end user’s ability to recognize its limitations and areas for improvement. AI taps into the database of information that is the Internet. That is powerful, but it does so (at least currently) in a fashion that makes it appear to be two years behind. For example, this blog post was originally written to reflect a 35kW rack density, so ChatGPT noted 35kW throughout. However, today, I’m regularly working with racks supporting AI that average 50kW, have seen some go up to 75kW, and know that applications can hit upwards of 300kW per rack. So, please note that anywhere the blog says 50kW, human intervention made the necessary edit to AI's outdated "35kW".

Also, just for reference, a 75kW application requires 21 tons of cooling for one IT rack! So, these new high-density technologies require the equivalent of one traditional perimeter CRAC to cool one AI IT Rack. DVL is here to help provide engineering and manufacturing support to design your Cooling, PLC Switchgear, Busway Distribution, Rack Power Distribution, Cloud Monitoring, and other critical infrastructure to support your efficient AI Technology.
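The 21-ton figure follows from a standard unit conversion: one ton of refrigeration is 12,000 BTU/h, or about 3.517 kW of heat removal. The short Python sketch below runs the numbers for the rack densities mentioned in this post. It assumes essentially all IT power ends up as heat and ignores fan and pump overhead; the function name is just for illustration.

```python
# 1 ton of refrigeration = 12,000 BTU/h, approximately 3.517 kW of heat removal
KW_PER_TON = 3.517

def cooling_tons(it_load_kw: float) -> float:
    """Tons of cooling needed to reject it_load_kw of heat (assumes all IT power becomes heat)."""
    return it_load_kw / KW_PER_TON

for rack_kw in (35, 50, 75):
    print(f"{rack_kw} kW rack -> {cooling_tons(rack_kw):.1f} tons of cooling")
```

By the same math, a 50kW rack works out to a little over 14 tons.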


Topics: Data Center, Thermal Management, Data Center efficiency, beyond the product, artificial intelligence

Achieving Excellence in Data Center Operations

by Robert Leake on 1/9/24 11:41 AM

Data centers are the beating hearts of modern businesses. They house critical infrastructure and sensitive data that are vital to every department across an organization. In this fast-paced digital landscape, keeping your data center in top operational shape shouldn’t be just a goal. It is an absolute necessity, because on any given day someone will need to access pivotal data at the click of a mouse.

And, as you know quite well, running a data center pulls you in multiple directions at once. That’s why, to ensure you’re never offline, it’s important to always have a real-time pulse on the areas outlined below. 


Security: Building Fortresses for Data

Imagine a data center as a fortress with a hard outer shell and multiple layers within, each with its own security measures. Strict access management ensures that only those who require entry to each of these levels can actually get in. This goes beyond the front door and is a physical concern throughout the entire data center. Because non-company staff must access the grounds for daily demands or periodic maintenance, minimizing security risk means managing the who, why, and where of every person entering your facility.

Preparation is Key

The COVID-19 pandemic brought many unexpected challenges for those leading data center operations at the time. Companies have long developed various types of disaster recovery plans accounting for a variety of scenarios. However, the pandemic tested those plans. And, when we found ourselves in a situation that hadn’t been experienced in 100 years, many failed the test. Fortunately, lessons learned strengthened disaster recovery going forward. Such lessons include the delicate nature of supply chain management, the importance of procuring inventory when available, and being able to execute “on a dime” during even the most chaotic of times. For these reasons, establishing thorough disaster recovery plans and being able to quickly adapt to unknowns have become indispensable.

Safety: A Cultural Requirement

Prioritizing the well-being of employees working under extreme conditions is crucial and should never be a question. That is why, for very good reasons, safety has become a cultural requirement for all businesses. Main concerns within data center environments include managing worksites where employees from multiple companies are working in tandem, ensuring the safety of workers who are working alone, taking precautions when working with high-voltage power infrastructure, and having efficient response processes in place in case of emergencies. It’s not enough just to have these processes in place; you must also ensure that no one is cutting corners, especially organizational leaders, as values are ingrained from the very top. If you get everyone home safely at the end of the day, you’ve got yourself a strong culture and a safe data center.

Continuous Improvement

Even top-tier organizations have room for improvement, whether driven by the need to optimize efficiency or to find new ways to stay on budget. Repetitive tasks can be improved by identifying process enhancements and design strategies. Challenging the status quo can have significant results when driven by the employees who are closest to the challenges. Buy-in at all levels is needed for improvements and long-term success, and support from leadership helps to ensure this evolution occurs.

Nurturing Future Leaders

As the most experienced data center professionals continue to retire, there is a greater need for fresh faces. But to accomplish this, the industry needs to make sure students at all levels are being properly introduced to the concept of data centers, how they work, and why they must work for society to function. For example, younger generations are the largest consumers and creators of data. Their broadband requirements are ever increasing, yet the workhorse behind this data isn’t even a thought; they may not recognize the connection between data centers and their iCloud folders unless it is demonstrated to them. Furthermore, tomorrow’s professionals stand to benefit from learning more about our industry, as it opens a new door of career potential and even lucrative compensation.

Exposing younger generations to the industry, whether through professional forums and societies or internships, providing guidance on required skills, and mentoring them as they mature, are essential to properly pass the torch. These future leaders will shape the industry's evolution and will more immediately allow you to sleep soundly at night knowing the lights are being properly kept on, and equipment is up and running.

Finding the Right Fit

Attitude and aptitude are definite requirements for an employee to succeed in data center operations. When recruiting for the best possible fit, you’re ultimately going to need someone who can handle the stress of working in such an unpredictable environment. Being resilient during challenging times makes for outstanding professionals in any field. Additionally, communication skills are vital. Being able to identify and resolve problems is great, but being able to turn those problems into learning opportunities for an entire team is invaluable, especially in high-stress moments.

By making these items a priority, and by constantly reevaluating your organization’s needs, you are positioning your organization for great success. One data center operations team that has figured this out quite well is the EdgeCore Data Centers team of operations leaders, led by Therese Kerfoot, SVP Operations. In December, Kerfoot and her team, Harrison Stoll (VP Operations), Matt Silvers (VP Operations Programs), and Sarah Kasper (Sr. Director, Environmental Health & Safety), joined us on the DVL Power Hour, “Data Center Excellence: Operations & Safety,” where the four shared their experiences in these areas and more. To learn about the extremely valuable insights they brought to the table, please check out the On-Demand webinar, or listen to the adapted podcast version available below and on iTunes and Spotify.

WATCH THE WEBINAR LISTEN TO THE PODCAST

Topics: Data Center, Safety, beyond the product, operations

Keeping a Critical Eye on Critical Infrastructure

by Robert Leake on 5/20/20 4:50 PM

Reluctantly, today’s workforce is getting more accustomed to working from home, and data center operators are not immune to this shift in operational flexibility. This, along with the impacts of Critical Infrastructure becoming more tangible, has made the need for Data Center Infrastructure Management (DCIM) systems more apparent. This is not groundbreaking news for those of us in Infrastructure and Operations (I&O) who have always tended to on-prem conditions in a 24x7 environment. While this work has historically had flavors of “managing from afar”, the extent of insight and control has been steadily increasing with the abilities enabled by today’s DCIM solutions.


Critical Infrastructure has typically been a world of hardware; physical equipment that is your last line of defense in keeping your facility operational. When problems arose, a tech would address the alarm through a physical inspection - or you’d call a Hardware Hotshot (like DVL) offering expertise in that particular problem. But organizations today expect much more than simply staying operational from their I&O Teams. Leadership Teams expect:

  • Reduced risk
  • Improved capacity management and forecasting
  • More agile decision making
  • Compliance with federal regulations and corporate responsibility requirements

These expectations are made a reality thanks to the continual advancements in the DCIM landscape. From native solutions developed by manufacturers like Vertiv to the after-market solutions that supply management teams with information on everything from generators to ATS, UPS, and CRAC units, DCIM gives users a greater sense of control through granular detail across the entire critical infrastructure equation. The availability of data not only provides better insight into the overall performance of the data center, but in some instances can actually predict a problem before it arises. This data is improving threat management and response times, and paving the way for positive financial impacts to the business.

Most professionals (especially those in the world of operations) have asked themselves, “How can I do more with less?” Well, the old adage of “you can’t manage what you can’t measure” puts into perspective what is needed in order to get more positive results with fewer dollars, less effort, and fewer assets. Efficiency of a data center revolves around reducing waste and unnecessary overprovisioning of power, cooling, space, and IT resources. DCIM solutions allow you to tap into the data behind how your infrastructure is performing today, and help you understand how you can better manage the impacts of those variables tomorrow, resulting in improved financials.

While DCIM offers many paths to greener pastures, unfortunately, there have been plenty of DCIM projects that had to be cancelled due to damaging mistakes in solution selection and deployment. This challenge is indicative of why only 42% of data center operators are using a solution today, and emphasizes why DCIM should be a collaborative process across the entire organization. When an organization takes a collective look at the available data, connecting one dot to the next, decision making teams are able to recognize more opportunities for improvement and create a shared perspective on where the organization stands on:

  • Required Analytics
  • Must-Have Features
  • Agreed-Upon Objectives
  • Security Policies (i.e. platform resiliency, data integrity)
  • Reporting and Mobility

These are only a few points to consider when looking at DCIM as more than just a technology. Technologies are tools to enhance the management philosophy of how you run the business, and how to maximize not just the equipment but also the people providing tangible results in the form of efficiency and financial improvements. We discussed a few more of these considerations on a recent webinar, Keeping a Critical Eye on Critical Infrastructure, and covered how DCIM has improved dramatically throughout the years with our friends at Critical Labs and Packet Power. The panel discussion ended up being a great overview of the DCIM landscape and the value-added impacts behind today’s data centers.


We invite you to listen to last week’s webinar to learn much more about DCIM and monitoring systems and how they can, in most cases, be easily integrated into your existing infrastructure.

ACCESS THE WEBINAR RECORDING


Topics: monitoring, beyond the product, packet power, critical labs

Understanding the Critical Infrastructure Behind Healthcare Facilities

by Jodi Holland on 4/23/20 3:22 PM

In recent years, the IT world has been seeing a movement to the Edge across most industries, especially in the realms of Finance, Legal, and Healthcare. Now, the Coronavirus pandemic has added a new variable to the Edge equation for Healthcare, as facilities across the country are constructing additions to their hospitals in support of testing and providing care. A recent DVL webinar addressed this rising concern, and many of the products and solutions discussed there (and outlined below) are applicable to Edge environments in any industry.


Unfortunately, there is no playbook for building these temporary facilities. While they do require the same types of critical infrastructure as facilities we’re used to creating, they demand an even greater sense of confidence in their operability. The drivers behind building out temporary facilities are not terribly different from what typically drives the demands of Healthcare IT: Electronic Records Management, Artificial Intelligence, and communications among staff and between patients and staff. The critical infrastructure supporting these applications must take into account an additional set of considerations for these temporary facilities:

  • Footprint Size
  • Power Requirements
  • Environmental Conditions
  • Procurement & Installation
  • Deployment Timeframe
  • Infrastructure Monitoring

If a playbook were to exist on this topic, it would include chapters like:

  • Defining your specific applications
  • Determining what those applications require to operate
  • Translating those requirements into critical infrastructure
  • Building your infrastructure for efficient operations

This is where DVL can be helpful to your project. Our commitment to going “Beyond the Product” means we don’t just sell you equipment and move to the next customer. Rather, we are here to help you connect the dots and present you with solutions to meet your objectives. Each project is its own unique venture as we work with you to define and understand everything from power and cooling requirements driven by your specific IT applications, to what type of rack is best suited for your peace of mind.

No matter your industry, if you weren’t able to join us for the one-hour webinar, we invite you and your colleagues to watch the recording here. Or, you can download our Buyer's Guide for a better understanding of the aforementioned critical infrastructure considerations.


Topics: server room, healthcare, hospitals, beyond the product

DVL's Employee Ownership Culture

by Robert Leake on 3/16/20 11:47 AM

“There is no more profitable investment than investing in yourself.” - Roy T. Bennett

Our customers often mention the dedication of DVL associates as one of the many reasons they continually turn to us for their critical infrastructure needs. From our Sales Engineers’ ability to find unique, cost-effective ways to solve project challenges to each DVL Technician’s commitment to quality on maintenance and emergency calls, the most significant ingredient in the DVL Secret Sauce may be that we are a 100% employee-owned company.

DVL became partially employee owned in 2006, and eventually 100% employee owned in 2012. We are an organization that is driven by employee owners. As a result, our Mission and Vision aren’t just arbitrary concepts, but are brought to life by an entire group of people inspired to achieve a shared success.


In 1974, Congress passed the Employee Retirement Income Security Act (ERISA), which formally established a legal framework for employee stock ownership plans (ESOPs), the vehicle behind many employee-owned companies. Since then, according to the National Center for Employee Ownership, the practice of employee ownership has proven to motivate employees, increase productivity, improve worker retention, and contribute to business longevity. If you directly benefit from the success of your company, you’ll be all the more motivated to succeed, and more importantly, to encourage your co-owners to succeed as well. If their success is your success, teamwork is inherently ingrained in everything you do!

To ensure we garner the very best from each and every one of our associates, from day one, we go all-in with investing in their development as a professional, and as a valued member of our team. So, just like many of the 6,400+ ESOPs in the country, we empower our people by educating, sharing, and involving. This includes the following measures:

  • Friend-tor Program: New associates are assigned a Friend-tor, another associate, by their manager, so they can have a colleague for one-on-one conversations if they have questions about the company or employee ownership.
  • ESOP 101: This course is held quarterly so new associates can take part in an in-depth lesson on ESOPs, and also how our business works. What affects the stock value? How does the performance of your department positively or negatively affect the bottom line? These questions and many more are addressed.
  • Finance 101: This course is held twice a year, and provides a foundation for associates to understand the performance figures that are shared with the company. This way, there are no surprises with the financials, and everyone understands what contributes to the stock price, which is determined once a year.
  • Lastly, we have the DVL ESOP Communications Committee, which bears the responsibility and mission of educating (and ultimately engaging) the employee owners of DVL Group. We strive to ensure the committee is an accurate cross-organizational representation of the company so that all departments and offices have a voice.

From our own experiences as an ESOP, we can wholeheartedly agree with the NCEO’s findings. After all, as CEO Gary Hill likes to say, “we have careers here at DVL, not jobs.” That is why going #BeyondTheProduct will always be our modus operandi.

 

Want a career at DVL?


Topics: beyond the product, employee owned, ESOP Association, National Center for Employee Ownership
