Hard disk controllers and disk drives are the two primary components of the disk subsystem. The two objects which gauge hard disk performance are Physical Disk and Logical Disk. Although disk subsystems have become more powerful, they are still the most common performance bottleneck, as their speeds are orders of magnitude slower than those of other system resources.
In Windows Server 2008 R2, the physical and logical disk counters are enabled by default and appear on the Disk tab of Resource Monitor. The Disk section in Resource Monitor, shown below, gives a decent high-level overview of the current combined physical and logical disk activity. For more fine-grained monitoring of disk activity, consider using Performance Monitor with the desired counters from the Physical Disk and Logical Disk objects.
Monitoring with the Physical Disk and Logical Disk objects comes at a small price, however, as each object consumes a small amount of system resources while it is in use. As such, they should be disabled unless you are actively using them for monitoring.
The most useful counters to monitor the disk subsystem are the % Disk Time and Avg. Disk Queue Length counters.
- % Disk Time reports the percentage of elapsed time that the selected physical or logical drive spends servicing read and write requests.
- Avg. Disk Queue Length counts the number of requests which have not yet been serviced on the physical or logical drive. It is an interval average, so it indicates how many requests were waiting on the drive, on average, during the sample interval. In general, if the queue length is often higher than 2, the disks are inadequate to service the system workload and performance may be compromised.
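The queue-length rule of thumb above can be sketched in a few lines of Python. This is only an illustration, not part of any Windows tooling: the function name `disk_queue_report` and the `often_fraction` cut-off (the share of samples that counts as "often") are assumptions made for the example, and the samples are expected to have been collected separately, for instance exported from Performance Monitor.

```python
from statistics import mean

QUEUE_LENGTH_THRESHOLD = 2.0  # rule-of-thumb value taken from the text above


def disk_queue_report(samples, threshold=QUEUE_LENGTH_THRESHOLD, often_fraction=0.5):
    """Summarise Avg. Disk Queue Length samples for one drive.

    samples: iterable of numbers, one per sampling interval.
    often_fraction: hypothetical cut-off for "often", expressed as the
        share of intervals whose queue length exceeds the threshold.
    """
    values = list(samples)
    if not values:
        raise ValueError("at least one sample is required")
    over = sum(1 for v in values if v > threshold)
    fraction_over = over / len(values)
    return {
        "average": mean(values),
        "fraction_over_threshold": fraction_over,
        "likely_bottleneck": fraction_over >= often_fraction,
    }


if __name__ == "__main__":
    # Made-up sample values, purely for illustration.
    print(disk_queue_report([0.4, 1.1, 3.2, 2.7, 4.0, 0.9]))
```

Counting how often the threshold is exceeded, rather than looking only at the overall average, keeps a single long burst from hiding an otherwise idle disk or the reverse.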
To analyze the processor (CPU) utilization of your system, focus on two counters: % Processor Time and Interrupts/sec. % Processor Time shows the percentage of overall CPU utilization. If there is more than one processor on a system, a counter is shown for each one as well as for the total (combined) value. If % Processor Time averages over 50% for extended durations, first review other system counters to identify processes which may be using the processor improperly, or alternatively consider upgrading the processor. Consistent CPU utilization around the 50% range does not necessarily impair performance; however, if the average processor utilization goes beyond 65%, performance will almost certainly be impaired. If the system has multiple processors installed, use the % Total Processor Time counter to determine the average usage of all processors.
Interrupts/sec is useful as an overall guide to processor health. This counter indicates the number of device interrupts which the processor is handling per second. Similar to the Page Faults/sec counter, this counter can show very high numbers (well into the thousands) without there being a significant performance drag.
In general, conditions which could indicate a processor bottleneck include the following:
- “Average of % Processor Time” is consistently beyond 60%–70%. Additionally, frequent spikes of 90% or greater can also indicate a bottleneck even if the average is below 60%–70%.
- “Maximum of % Processor Time” is consistently beyond 90%.
- “Average of the System Performance Counter; Context Switches/second” is consistently beyond 20,000.
- “System Performance Counter; Processor Queue Length” is consistently higher than two.
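As a rough illustration, the conditions listed above can be checked against a set of collected samples. The sketch below is one possible reading of them, not an official method: the function name, the returned labels, the choice of 60% as the average threshold, the 10% cut-off for "frequent" spikes, and the use of a simple mean for "consistently" are all assumptions made for the example.

```python
from statistics import mean


def processor_bottleneck_signs(cpu_percent, context_switches_per_sec, queue_length,
                               avg_cpu_threshold=60.0, spike_fraction=0.10):
    """Return labels for the rule-of-thumb conditions that the samples trip.

    Each of the three sample arguments is a non-empty list of numbers
    collected over the same monitoring period.
    """
    signs = []

    # Average % Processor Time beyond the lower end of the 60%-70% range.
    if mean(cpu_percent) > avg_cpu_threshold:
        signs.append("high_average_cpu")

    # Frequent spikes of 90% or more (hypothetical cut-off: 10% of samples).
    spikes = sum(1 for v in cpu_percent if v >= 90)
    if spikes / len(cpu_percent) >= spike_fraction:
        signs.append("frequent_cpu_spikes")

    # Average Context Switches/sec beyond 20,000.
    if mean(context_switches_per_sec) > 20000:
        signs.append("high_context_switches")

    # Processor Queue Length higher than two, averaged over the period.
    if mean(queue_length) > 2:
        signs.append("long_processor_queue")

    return signs


if __name__ == "__main__":
    # Made-up sample values, purely for illustration.
    print(processor_bottleneck_signs([95, 92, 40, 30],
                                     [25000, 26000, 24000, 25000],
                                     [3, 4, 3, 5]))
```

An empty list means none of the conditions was met for the sampled period; any label in the list is a prompt to investigate further rather than proof of a bottleneck.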
Available memory is usually the most common source of performance issues on a Windows Server installation. Fortunately, it is an easy metric to measure, since several counters in the Memory object can help troubleshoot memory issues. Most notably, two counters provide a reasonably accurate overview of memory pressure: Page Faults/sec and Pages/sec. These two counters alone can show whether the system is correctly configured or is experiencing memory issues. The following counters are needed to monitor memory and pagefile usage.
- Committed Bytes – Monitors the amount of virtual memory (in bytes) which has been committed by the various processes. As this rises above available physical memory, paging increases and so does pagefile usage.
- Pages/sec – Shows the number of pages per second which are read from or written to the disk to resolve hard page faults.
- Pages Output/sec – Shows the virtual memory pages written to the pagefile per second, which can help to identify paging as a bottleneck.
- Page Faults/sec – Reports the combined rate of soft and hard page faults.
- Working Set, _Total – Shows the amount of physical memory (in bytes) currently held in the working sets of all processes combined, that is, the memory which is actually being used.
- % pagefile in use – Shows the percentage of the paging file which is actually being used, which can help to check whether the Windows pagefile is a potential bottleneck. If this consistently remains above 50% to 75%, consider increasing the pagefile size or moving the pagefile to another disk.
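The pagefile guidance above can be expressed as a small check over collected usage samples. This is a minimal sketch under stated assumptions: the function name `pagefile_pressure`, the three return labels, and the reading of "consistently above" as "every sample exceeds the level" are choices made for the example, while the 50% and 75% levels come from the text.

```python
def pagefile_pressure(usage_percent_samples, warn_at=50.0, act_at=75.0):
    """Classify pagefile usage samples as "ok", "warn" or "act".

    usage_percent_samples: iterable of pagefile usage percentages (0-100),
        one per sampling interval.
    """
    values = list(usage_percent_samples)
    if not values:
        raise ValueError("at least one sample is required")

    # "Consistently above" is read here as: even the lowest sample
    # in the period exceeds the level.
    lowest = min(values)
    if lowest > act_at:
        return "act"    # consider resizing or relocating the pagefile
    if lowest > warn_at:
        return "warn"   # worth watching more closely
    return "ok"


if __name__ == "__main__":
    # Made-up sample values, purely for illustration.
    print(pagefile_pressure([55, 60, 80]))
```

Using the minimum of the samples is a deliberately strict interpretation; a percentile or moving average would be a reasonable alternative if occasional dips should not reset the verdict.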