Monitoring Disk Usage in Windows Server

Hard disk controllers and disk drives are the two primary components of the disk subsystem. The two performance objects that gauge hard disk performance are Physical Disk and Logical Disk. Although disk subsystems keep getting more powerful, they remain the most common performance bottleneck because they are far slower than other system resources.

In Windows Server 2008 R2, the physical and logical disk counters are enabled by default. The Disk tab of Resource Monitor, shown below, gives a decent high-level overview of the current combined physical and logical disk activity. For more fine-grained monitoring of disk activity, consider using Performance Monitor with the desired counters from the Physical Disk and Logical Disk objects.

Monitor Disk Usage
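For scripted collection of the same Physical Disk and Logical Disk counters, one option is the typeperf command-line tool that ships with Windows. The sketch below only builds the command line; the helper name build_typeperf_command, the choice of the _Total instance, and the 5-second interval with 12 samples are illustrative assumptions, not values from this article.

```python
# Minimal sketch: build a typeperf command line for the disk counters.
# Assumptions (not from the article): the helper name, the _Total instance,
# and the default interval and sample count.

DISK_COUNTERS = [
    r"\PhysicalDisk(_Total)\% Disk Time",
    r"\PhysicalDisk(_Total)\Avg. Disk Queue Length",
    r"\LogicalDisk(_Total)\% Disk Time",
    r"\LogicalDisk(_Total)\Avg. Disk Queue Length",
]


def build_typeperf_command(counters, interval_s=5, samples=12):
    """Return an argv list that samples the given counter paths with typeperf.

    -si sets the sample interval in seconds, -sc the number of samples.
    """
    if interval_s < 1 or samples < 1:
        raise ValueError("interval_s and samples must be at least 1")
    return ["typeperf", *counters, "-si", str(interval_s), "-sc", str(samples)]


command = build_typeperf_command(DISK_COUNTERS)
print(command)
# On a Windows host this list could be passed to subprocess.run(command).
```

The command is returned as a list rather than a single string so that counter paths containing spaces do not need manual quoting.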

Monitoring with the Physical Disk and Logical Disk objects comes at a small price, however, because each object consumes a small amount of system resources while it is collecting data. As such, they should be disabled unless you are actively using them for monitoring.

The most useful counters for monitoring the disk subsystem are % Disk Time and Avg. Disk Queue Length.

  • % Disk Time measures the percentage of time that a given physical or logical drive spends servicing read and write requests.
  • Avg. Disk Queue Length counts the number of requests that have not yet been serviced on the physical or logical drive. Because it is an average over the sampling interval, it gives a numerical measure of how many requests are waiting on the drive. In general, if the value is often higher than 2, the disks are inadequate to service the system workload and performance may be compromised.
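The queue-length rule of thumb above can be sketched as a small check over collected samples. The function name queue_length_report is hypothetical, and reading "often" as "at least half of the samples" is an assumption made here for illustration; the article does not define it.

```python
# Minimal sketch: flag a disk whose Avg. Disk Queue Length is often above 2.
# Assumption: "often" means at least half of the samples (the `often` argument).

def queue_length_report(samples, threshold=2.0, often=0.5):
    """Summarise Avg. Disk Queue Length samples against the threshold."""
    if not samples:
        raise ValueError("at least one sample is required")
    over = sum(1 for value in samples if value > threshold)
    fraction_over = over / len(samples)
    return {
        "mean": sum(samples) / len(samples),
        "fraction_over": fraction_over,
        "suspect": fraction_over >= often,
    }


report = queue_length_report([0.5, 3.0, 4.0, 1.0])
print(report)
```

Reporting the fraction of samples over the threshold alongside the mean helps separate a sustained queue from a single burst.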

Monitoring Processor (CPU) Usage in Windows Server

To analyze the processor (CPU) utilization of your system, focus on two counters: % Processor Time and Interrupts/sec. % Processor Time shows the percentage of overall CPU utilization. If there is more than one processor on a system, a counter for each one is shown, as well as a total (combined) counter. If % Processor Time averages over 50% for extended durations, first review other system counters to identify processes that may be using the processor improperly, or alternatively consider upgrading the processor. Consistent CPU utilization around the 50% range does not necessarily impair performance; however, if average processor utilization goes beyond 65%, performance will almost certainly be impaired. If the system has multiple processors installed, use the % Total Processor Time counter to determine the average usage of all processors.
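As a rough illustration of the 50% and 65% guidance, the sketch below classifies a window of % Processor Time samples by their average. The function name and the three labels are hypothetical, and how long a window counts as an "extended duration" is left to the caller.

```python
# Minimal sketch: classify average % Processor Time using the 50% and 65%
# figures from the text. Labels and function name are illustrative only.
from statistics import mean


def classify_cpu_average(samples, review_at=50.0, impaired_at=65.0):
    """Return 'normal', 'investigate', or 'impaired' for the window average."""
    if not samples:
        raise ValueError("at least one sample is required")
    average = mean(samples)
    if average > impaired_at:
        return "impaired"
    if average > review_at:
        return "investigate"
    return "normal"


print(classify_cpu_average([55.0, 60.0, 58.0]))
```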

Interrupts/sec is useful as an overall guide to processor health. This counter indicates the number of device interrupts the processor is handling per second. Similar to the Page Faults/sec counter, this counter can show very high numbers (well into the thousands) without there being a significant performance drag.

In general, conditions which could indicate a processor bottleneck include the below:

  • “Average of % Processor Time” is consistently beyond 60%–70%. Additionally, frequent spikes of 90% or greater can also indicate a bottleneck, even if the average is below 60%–70%.
  • “Maximum of % Processor Time” is consistently beyond 90%.
  • “Average of the System Performance Counter; Context Switches/second” is consistently beyond 20,000.
  • “System Performance Counter; Processor Queue Length” is consistently higher than two.
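The indicators in this list can be combined into one check over a window of samples. This is a sketch under stated assumptions: the function name is hypothetical, 60% is taken as the lower end of the 60%–70% range, "frequent" spikes are read as at least 20% of samples, and "consistently higher than two" is read as every sample in the window exceeding two. The "Maximum of % Processor Time" indicator is left out because judging it "consistently" needs several windows.

```python
# Minimal sketch: report which processor bottleneck indicators fire for one
# window of samples. Thresholds marked "assumption" are not from the article.
from statistics import mean


def cpu_bottleneck_indicators(cpu_pct, ctx_switches, queue_len,
                              avg_limit=60.0,     # assumption: low end of 60%-70%
                              spike_share=0.2):   # assumption: meaning of "frequent"
    """Return the names of the indicators that are triggered."""
    flags = []
    if mean(cpu_pct) > avg_limit:
        flags.append("average % Processor Time")
    spikes = sum(1 for value in cpu_pct if value >= 90.0)
    if spikes / len(cpu_pct) >= spike_share:
        flags.append("frequent spikes of 90% or more")
    if mean(ctx_switches) > 20_000:
        flags.append("Context Switches/sec")
    if min(queue_len) > 2:  # every sample in the window is higher than two
        flags.append("Processor Queue Length")
    return flags


busy = cpu_bottleneck_indicators(
    cpu_pct=[65, 70, 95, 60, 92],
    ctx_switches=[25_000, 22_000, 21_000],
    queue_len=[3, 4, 5],
)
print(busy)
```

An empty list means none of the checked indicators fired for that window; it does not rule out a bottleneck elsewhere.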
