Monitoring a data center is an important part of managing how it operates and of making decisions for the future. Knowing what to measure and monitor is crucial, because a deluge of information and misleading metrics can lead you astray. About one in three facilities will have an outage of four hours or more, and many of their operators believed they had no warning. Proactive monitoring can help protect your facility from ‘unexpected’ outages caused by unknown environmental conditions.
Air Conditions – Temperature & Humidity: Temperature gradients and fluctuations are important to track throughout a data center. Setting the data center temperature from empirical data is better than having C-level managers increase cooling just because the data center is warmer than the hallway outside. Temperature monitoring helps you understand performance problems that can lead to shutdown, failure, or damage in older systems. Know the expected cold and hot aisle temperature averages. Find out where hot spots are and make plans to address them. Also watch out for high humidity in your data center. Unlike temperature, humidity disperses throughout a room, so high humidity anywhere brings a risk of condensation that can cause shorts in sensitive electrical equipment.
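One way to turn humidity readings into a condensation warning is to estimate the dew point and compare it with the coldest surface you measure. The sketch below uses the Magnus approximation with one commonly cited set of coefficients. The function names, the 2 °C safety margin, and the example readings are illustrative assumptions, not values from any particular monitoring product.

```python
import math

# Magnus approximation coefficients (one commonly cited set, for water
# vapour over liquid water). Treat them as an assumption of this sketch.
MAGNUS_A = 17.62
MAGNUS_B = 243.12  # degrees Celsius


def dew_point_c(air_temp_c: float, relative_humidity_pct: float) -> float:
    """Estimate the dew point in Celsius from air temperature and RH."""
    if not 0.0 < relative_humidity_pct <= 100.0:
        raise ValueError("relative humidity must be in (0, 100]")
    gamma = (math.log(relative_humidity_pct / 100.0)
             + MAGNUS_A * air_temp_c / (MAGNUS_B + air_temp_c))
    return MAGNUS_B * gamma / (MAGNUS_A - gamma)


def condensation_risk(air_temp_c: float, relative_humidity_pct: float,
                      surface_temp_c: float, margin_c: float = 2.0) -> bool:
    """True when a surface is at or below the dew point plus margin_c.

    margin_c is a hypothetical safety margin chosen for illustration.
    """
    dew_point = dew_point_c(air_temp_c, relative_humidity_pct)
    return surface_temp_c <= dew_point + margin_c


if __name__ == "__main__":
    # Example readings (made up): 25 C air at 50% RH, chilled surface at 15 C.
    print(round(dew_point_c(25.0, 50.0), 1))
    print(condensation_risk(25.0, 50.0, surface_temp_c=15.0))
```

In practice the surface temperature would come from a sensor on the coldest component in the airstream, such as a chilled-water pipe or cooling coil, and the margin would be tuned to how quickly conditions in your room can change.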
Airflow & Quality: Proper airflow monitoring allows better management of data center performance and cooling efficiency. It also overlaps with temperature monitoring, since hot spots are sometimes caused by restricted airflow. Airflow and temperature should be viewed as early warnings of potential issues that, if not addressed, will lead to unacceptable conditions throughout the data center. Airborne contaminants, from smoke and other sources, are becoming a greater concern, as recent studies have shown failure rates increasing with higher particulate counts. Smoke detection and suppression systems should be monitored and tested regularly so that false alarms are caught quickly and actual fire events are stopped before they cause catastrophic damage.
Power & Electrical Systems: Power and UPS systems affect reliability more than most other threats, and knowing where anomalies exist is key to prevention. Monitoring the power systems helps you anticipate the sags and spikes that can cause power-related issues. This monitoring should extend from the utility to the data center and, if needed, down to the individual server. In the same manner, UPS systems should have monitoring in place for all of their components: generators; fuel systems; each battery string (and individual batteries); and all the electrical gear that is expected to function automatically when needed.
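As a rough illustration of flagging sags and spikes, the sketch below compares each voltage sample with a nominal value and labels anything outside a tolerance band. The 230 V nominal, the 90% and 110% limits, and the names VoltageEvent and find_voltage_events are assumptions made for this example. Real power-quality meters also consider event duration and waveform shape.

```python
from dataclasses import dataclass
from typing import List, Sequence


@dataclass
class VoltageEvent:
    index: int   # position of the sample in the input sequence
    volts: float
    kind: str    # "sag" or "spike"


def find_voltage_events(samples: Sequence[float],
                        nominal_volts: float = 230.0,  # assumed nominal
                        sag_ratio: float = 0.9,        # assumed lower limit
                        spike_ratio: float = 1.1       # assumed upper limit
                        ) -> List[VoltageEvent]:
    """Return samples that fall outside the tolerance band around nominal."""
    events = []
    for i, volts in enumerate(samples):
        ratio = volts / nominal_volts
        if ratio < sag_ratio:
            events.append(VoltageEvent(i, volts, "sag"))
        elif ratio > spike_ratio:
            events.append(VoltageEvent(i, volts, "spike"))
    return events


if __name__ == "__main__":
    # Made-up RMS voltage samples for illustration only.
    for event in find_voltage_events([230.0, 228.0, 200.0, 231.0, 260.0]):
        print(event)
```

The same pattern could be applied at each point being monitored, from the utility feed down to a rack PDU, so that you can see where along the chain an anomaly first appears.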
Monitoring Products: If you don’t have environmental monitoring equipment yet and it isn’t in your budget, it should be given priority. Often, when a monitor sends an alert about rapidly developing changes, it recovers its cost in those crucial minutes by preventing a partial or full outage. Quality sensors that quickly convey data center conditions can be coupled with controls to determine what changes need to be made to prevent issues before they happen. However, not all environmental systems should be expected to do this; some, such as leak detection, may be reactive instead of proactive.
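To make the idea of alerting on fast-moving conditions concrete, here is a minimal sketch that flags readings whose rate of change exceeds a limit. The function name rapid_change_alerts, the limit of 1.0 unit per minute, and the sample data are hypothetical. A real monitor would take its thresholds from the equipment being protected.

```python
from typing import List, Sequence, Tuple

Reading = Tuple[float, float]  # (timestamp in seconds, sensor value)


def rapid_change_alerts(readings: Sequence[Reading],
                        max_rate_per_min: float = 1.0  # assumed limit
                        ) -> List[Tuple[float, float]]:
    """Return (timestamp, rate per minute) for each jump that is too fast.

    Rates are computed between consecutive readings, so the input is
    expected to be sorted by timestamp.
    """
    alerts = []
    for (t0, v0), (t1, v1) in zip(readings, readings[1:]):
        elapsed_min = (t1 - t0) / 60.0
        if elapsed_min <= 0:
            continue  # skip duplicate or out-of-order timestamps
        rate = (v1 - v0) / elapsed_min
        if abs(rate) > max_rate_per_min:
            alerts.append((t1, rate))
    return alerts


if __name__ == "__main__":
    # Made-up inlet temperatures in Celsius, sampled once a minute.
    samples = [(0, 22.0), (60, 22.2), (120, 24.0), (180, 24.1)]
    print(rapid_change_alerts(samples))
```

A rate-based check like this complements fixed high and low thresholds: it can raise an alert while the reading is still inside its normal range but heading out of it quickly.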
Monitoring Software: Products and sensors often come with a monitoring application that gives reliable information about your data center conditions. As the industry evolves, the interfaces can likely be customized for your needs, the data can be accessed securely from remote locations, and alerts can be pushed anywhere. Many consider DCIM (data center infrastructure management) part of the ultimate solution to environmental monitoring. This is true only if the DCIM can connect to the equipment. When it can, the benefits can far exceed the limitations, as a robust DCIM package may be able to optimize power and space as well as overlay the expected cooling conditions.
Having a reliable monitoring system is the next step toward being surprised less often by environmental changes. More than anything, getting a solution in place will give you better visibility into what you have. There are no perfect solutions, but having at least a rough picture of your data center is better than having no clue at all.