How to Troubleshoot Overheating in CRAC Units with Thermal Imaging

By Justin Sheard, Product Manager, Thermal and Acoustic Imaging

Data centers power our digital world, but they consume massive amounts of energy and generate significant heat. When computer room air conditioning (CRAC) units fail to control this heat, it can harm delicate electronics and lead to costly downtime.

Graphic: Common Causes of Overheating in CRAC Units

Overheating is a common failure mode for CRAC units. If left unaddressed, it wastes energy as units struggle to maintain the correct environment. Promptly identifying the issue is essential to protect valuable data center infrastructure and to avoid downtime and SLA penalties.

This article provides a practical, tool-based approach to diagnosing overheating issues.

Common Causes of Overheating in CRAC Units

Four common causes can lead to CRAC unit overheating:

  • Electrical Connection Faults: Electrical connection faults in fans or motors can disrupt proper operation. This causes CRAC units to overheat by reducing airflow or by forcing components to work harder than designed. These faults, such as loose wiring or short circuits, may lead to inefficient cooling and increased heat buildup within the unit.
  • Blocked or Poorly Managed Airflow: Clogged filters or obstructed ducts restrict the CRAC unit's ability to circulate cool air effectively. In data centers, airflow issues often extend beyond physical blockages. Poor hot-aisle/cold-aisle separation, bypass air leaks under raised floors, and misplaced perforated tiles can all cause cooled air to mix with return air before it reaches the equipment. These conditions reduce cooling efficiency, force CRAC units to work harder, generate excess heat, and accelerate wear. Even small airflow management problems can create persistent hot spots that thermal imaging helps identify.
  • Failing Bearings or Misaligned Fan Assemblies: Failing bearings or misaligned fan assemblies in CRAC units disrupt smooth operation. This increases friction, generates excess heat, and inhibits efficient energy transfer. These conditions strain components, which leads to overheating and potential system failure.
  • Voltage Imbalance/Power Quality Issues: Inconsistent electrical supply or fluctuating voltage in CRAC unit components like motors and compressors can cause overheating, reduced efficiency, and damaged components.
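To put a number on voltage imbalance, one widely used approach is to express the largest deviation from the average line voltage as a percentage of that average. The sketch below assumes three line-to-line readings taken with a meter. The function name and the sample values are illustrative and do not come from any Fluke tool.

```python
def percent_voltage_unbalance(v_ab: float, v_bc: float, v_ca: float) -> float:
    """Return line-voltage unbalance as a percentage.

    Uses the common definition:
    100 * (maximum deviation from the average voltage) / (average voltage).
    """
    voltages = (v_ab, v_bc, v_ca)
    average = sum(voltages) / len(voltages)
    if average <= 0:
        raise ValueError("average voltage must be positive")
    max_deviation = max(abs(v - average) for v in voltages)
    return 100.0 * max_deviation / average


if __name__ == "__main__":
    # Hypothetical line-to-line readings in volts, not measured data.
    unbalance = percent_voltage_unbalance(460.0, 467.0, 450.0)
    print(f"Voltage unbalance: {unbalance:.2f}%")
```

Compare the result with the limit recommended by the motor or compressor manufacturer rather than a fixed number hard-coded in a script.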

How Thermal Imaging Helps

Regular inspections with a thermal imager can reveal overheating components and help technicians identify problem areas before they lead to damage or downtime.

Thermal imaging allows for a fast, non-contact inspection of electrical panels, motors, ducts, and surrounding areas. Thermal imagers quickly identify hot spots that may indicate increased resistance, imbalance, or blocked airflow.

Predictive maintenance with a thermal imager helps technicians detect overheating issues in CRAC units early. By identifying abnormal heat patterns in components, it enables proactive repairs and prevents a full system shutdown.

Step-by-Step: Thermal Imaging of CRAC Units

As a starting point for creating your specific inspection procedures, review existing industry standards, such as those from NFPA and NETA. Establish baseline thermal images under normal operating conditions to determine typical operating temperatures. These images serve both as a temperature reference and as an example of what to capture during future inspections. Compare each new thermal image with its baseline to find component temperature differences (delta-T, or ΔT) that point to potential issues.
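The baseline comparison can be sketched in a few lines of Python. This is a minimal illustration that assumes spot temperatures have already been read from the thermal images and typed in by component name. The `Reading` class, the function name, and the alert threshold are hypothetical, and the threshold should come from your own procedure or the applicable standard.

```python
from dataclasses import dataclass


@dataclass
class Reading:
    component: str
    temperature_c: float


def delta_t_report(baseline, current, alert_threshold_c):
    """Return (component, delta_t_c, flagged) for components present in both sets."""
    baseline_by_component = {r.component: r.temperature_c for r in baseline}
    report = []
    for reading in current:
        if reading.component not in baseline_by_component:
            continue  # no baseline to compare against
        delta = reading.temperature_c - baseline_by_component[reading.component]
        report.append((reading.component, round(delta, 1), delta >= alert_threshold_c))
    return report


if __name__ == "__main__":
    # Hypothetical values for illustration only.
    baseline = [Reading("fan motor", 46.0), Reading("contactor L1", 38.5)]
    current = [Reading("fan motor", 58.4), Reading("contactor L1", 39.0)]
    for component, delta, flagged in delta_t_report(baseline, current, 10.0):
        status = "INVESTIGATE" if flagged else "ok"
        print(f"{component}: delta-T {delta:+.1f} C ({status})")
```

The comparison is only meaningful when the baseline and the new image were captured under similar load and ambient conditions.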

If a baseline was not established, it can be useful to compare thermal image patterns with other known equipment of the same type in good condition. Manufacturer specifications can also be used to determine if a thermal reading is outside recommended parameters.
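When no baseline exists, the peer comparison described above can be approximated by measuring how far each unit sits from the median of similar equipment. The following sketch assumes one reading per unit for the same component type, taken under comparable load. The unit names and temperatures are made up for illustration.

```python
from statistics import median


def deviation_from_peers(readings_c: dict[str, float]) -> dict[str, float]:
    """Return each unit's temperature difference from the group median, in degrees C."""
    reference = median(readings_c.values())
    return {name: round(temp - reference, 1) for name, temp in readings_c.items()}


if __name__ == "__main__":
    # Hypothetical fan motor casing temperatures for five similar units.
    fan_motor_temps = {
        "CRAC-1": 48.0,
        "CRAC-2": 47.5,
        "CRAC-3": 61.0,
        "CRAC-4": 48.5,
        "CRAC-5": 47.0,
    }
    for unit, deviation in deviation_from_peers(fan_motor_temps).items():
        print(f"{unit}: {deviation:+.1f} C from peer median")
```

The median is used instead of the mean so that a single overheating unit does not shift the reference it is being compared against.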

Step 1: Power On and Load the CRAC Unit Under Normal Operation

Turn on the CRAC unit and operate it with electrical components under load. This allows the thermal imager to capture the most accurate picture of what occurs during operation. Be aware of environmental factors that can affect thermal readings, such as wind, air currents, sunlight, and reflective surfaces. These factors can create false hot or cold spots.

Step 2: Use a Fluke Ti480 PRO Thermal Imager

Scan critical electrical and mechanical assets such as:

  • Electrical panels: Check for loose or overheated connections. These will appear as localized hot spots compared to surrounding components.
  • Motors and bearings: Scan the motor casing, bearing areas, and compressor body for hot spots or uneven heating patterns. Localized heat around bearings can indicate lubrication issues or misalignment, while elevated temperatures across the compressor shell may signal overloading, restricted refrigerant flow, or impending motor failure.
  • Airflow and ductwork: Check air outlets and ducts to verify cooling performance and identify airflow issues. Compressor inefficiencies or blocked filters may cause warmer-than-expected outlet temperatures. Look for cold spots that may indicate duct leaks, or warm spots that may indicate poor insulation.
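The idea of a localized hot spot standing out from its surroundings can be illustrated with a small grid of temperatures. The sketch below assumes the temperature data behind a thermal image is available as rows of values in degrees Celsius. The function name, the margin, and the sample grid are hypothetical and do not represent any imager's file format.

```python
from statistics import median


def find_hot_spots(grid_c, margin_c):
    """Return (row, column, temperature) for cells at least margin_c above the grid median."""
    all_values = [temp for row in grid_c for temp in row]
    reference = median(all_values)
    return [
        (r, c, temp)
        for r, row in enumerate(grid_c)
        for c, temp in enumerate(row)
        if temp - reference >= margin_c
    ]


if __name__ == "__main__":
    # Hypothetical 3x3 patch of an electrical panel scan, in degrees C.
    panel = [
        [35.0, 35.5, 36.0],
        [35.2, 52.0, 36.1],
        [34.9, 35.4, 35.8],
    ]
    for row, col, temp in find_hot_spots(panel, margin_c=10.0):
        print(f"Hot spot at row {row}, column {col}: {temp:.1f} C")
```

A flagged cell is a prompt for closer inspection, not a diagnosis, because reflections and emissivity differences can produce apparent hot spots.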

Step 3: Record Results

Use Fluke thermal imaging software to record thermal imaging results and temperature data from CRAC unit components, such as compressors, fan motors, and air outlets. This documentation serves as a reference point during repairs, and a follow-up scan compared against it helps confirm that completed repairs have solved the problem. Stored data enables comparisons with baseline readings or manufacturer specifications during future inspections. This helps detect warming trends in components that may indicate developing issues such as bearing wear or airflow restrictions. Early identification allows proactive heating, ventilation, and air conditioning (HVAC) troubleshooting to prevent costly downtime.
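One simple way to spot a warming trend in stored readings is to fit a straight line through a component's temperature history and look at the slope. The sketch below uses an ordinary least-squares fit. It assumes the history has been exported as date and temperature pairs, and the function name and sample data are illustrative only.

```python
from datetime import date


def warming_rate_c_per_day(history):
    """Least-squares slope of temperature over time, in degrees C per day.

    history is a list of (inspection_date, temperature_c) tuples.
    """
    if len(history) < 2:
        raise ValueError("need at least two readings")
    start = min(day for day, _ in history)
    xs = [(day - start).days for day, _ in history]
    ys = [temp for _, temp in history]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("readings must span at least two distinct dates")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx


if __name__ == "__main__":
    # Hypothetical bearing housing temperatures from three inspections.
    bearing_history = [
        (date(2024, 1, 1), 50.0),
        (date(2024, 1, 11), 51.0),
        (date(2024, 1, 21), 52.0),
    ]
    rate = warming_rate_c_per_day(bearing_history)
    print(f"Warming rate: {rate:.2f} C per day")
```

A steadily positive slope across inspections taken under similar load suggests a developing issue, while a single high reading may only reflect a hot day or a heavier load.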

Step 4: Perform Further Testing

If vibration or mechanical imbalance is suspected, use a Fluke 810 Vibration Tester to assess overall vibration levels, check fault severity, and obtain recommendations for corrective actions.

Compliance and Efficiency Impact

Regular inspections with a thermal imager help data centers maintain temperature SLAs and uptime certifications. They help technicians see early signs of component stress, airflow imbalances, and localized hot spots before these issues escalate into system-wide cooling failures.

Using thermal imaging to troubleshoot CRAC units helps data centers align with ASHRAE Technical Committee 9.9 guidelines. These guidelines emphasize efficient thermal management and reliable cooling in data centers to maintain equipment performance and minimize energy waste. Thermal imaging also supports ISO 50001 energy efficiency goals by allowing proactive identification of issues like hot spots or airflow restrictions and reducing unnecessary energy consumption.

Summary: Best Practices and Optional Tools

  • Pair Fluke Ti480 PRO thermal inspections with other tools, such as the Fluke 376 FC Clamp Meter to check motor current draw, or the Fluke 1770 PQ Analyzer to identify electrical supply issues.
  • Use eMaint™ CMMS to track inspection findings, schedule maintenance, and store documentation for audits. The CMMS can integrate with your DCIM software to allow continuous monitoring and real-time alerts through a single platform.

About the Author

Justin Sheard is an accomplished product development leader specializing in thermal and acoustic imaging technology, particularly in preventive maintenance applications. With multiple patents and published works, he contributes significantly to the industry. He is dedicated to shaping the future of preventive maintenance through innovative imaging solutions that help maintenance professionals prevent unplanned downtime and improve operational efficiency. Connect with Justin on LinkedIn.
