Using esxtop to identify storage performance issues

The esxtop utility measures how much I/O is moving across various devices. It is interactive: pressing certain keys changes the view.

Configuring monitoring using esxtop

To monitor storage performance per HBA:

  1. Start esxtop by typing esxtop at the command line.
  2. Press d to switch to disk view (HBA mode).
  3. Press f to modify the fields that are displayed.
  4. Press b, c, d, e, h, and j to toggle the fields and press Enter.
  5. Press s, then 2 to alter the update time to every 2 seconds and press Enter.
  6. See Analyzing esxtop columns for a description of relevant columns.

To monitor storage performance per LUN:

Note: This option is only available in ESX 3.5 and later.

  1. Start esxtop by typing esxtop at the command line.
  2. Press u to switch to disk view (LUN mode).
  3. Press f to modify the fields that are displayed.
  4. Press b, c, f, and h to toggle the fields and press Enter.
  5. Press s, then 2 to alter the update time to every 2 seconds and press Enter.
  6. See Analyzing esxtop columns for a description of relevant columns.

To monitor storage performance per virtual machine:

Note: This option is only available in ESX 3.5 and later.

  1. Start esxtop by typing esxtop at the command line.
  2. Press v to switch to disk view (virtual machine mode).
  3. Press f to modify the fields that are displayed.
  4. Press b, d, e, h, and j to toggle the fields and press Enter.
  5. Press s, then 2 to alter the update time to every 2 seconds and press Enter.
  6. See Analyzing esxtop columns for a description of relevant columns.

Analyzing esxtop columns

The following table lists the relevant columns and a brief description of these values.

Column      Description
CMDS/s      This is the number of IOPS (Input/Output Operations Per Second) being sent to or coming from the device or virtual machine being monitored.
DAVG/cmd    This is the average response time in milliseconds per command being sent to the device.
KAVG/cmd    This is the amount of time the command spends in the VMkernel.
GAVG/cmd    This is the response time as it is perceived by the guest operating system. This number is calculated with the formula: DAVG + KAVG = GAVG
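
The GAVG formula in the table can be checked by hand against the values esxtop displays. The following is a minimal sketch in Python; the function name and the sample readings are hypothetical and are not taken from a real host.

```python
def guest_latency_ms(davg_ms: float, kavg_ms: float) -> float:
    """Return GAVG/cmd as device latency plus VMkernel latency, in ms."""
    return davg_ms + kavg_ms


# Hypothetical readings copied by hand from the DAVG/cmd and KAVG/cmd columns.
davg = 8.0
kavg = 0.5
print(f"GAVG/cmd = {guest_latency_ms(davg, kavg)} ms")
```

If the GAVG/cmd value on screen is much higher than DAVG/cmd, the difference is time spent in the VMkernel (KAVG/cmd) rather than at the device.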

These columns cover both reads and writes; the xAVG/rd columns cover reads only and the xAVG/wr columns cover writes only. The combined value is the best way to monitor performance, but a high read or write response time may indicate that the read or write cache is disabled on the array.

All arrays perform differently, but as a guideline DAVG/cmd, KAVG/cmd, and GAVG/cmd should not exceed 10 milliseconds (ms), and they should not exceed 20 to 30 ms for a sustained period of time.
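
The latency guidelines above could be applied to a series of readings noted down at each 2-second refresh. This is only a sketch under stated assumptions: the function name and labels are hypothetical, the 20 ms cut-off is the lower end of the 20/30 ms figure, and "sustained" is interpreted here as every sample in the window being above that cut-off, which the source does not define.

```python
GUIDELINE_MS = 10.0   # guideline ceiling for DAVG/KAVG/GAVG per command
SUSTAINED_MS = 20.0   # assumption: lower end of the 20/30 ms figure


def classify_latency(samples_ms):
    """Classify latency samples (ms) collected at a fixed refresh interval."""
    if not samples_ms:
        return "no data"
    # Assumption: "sustained" means every sample in the window is too high.
    if all(s > SUSTAINED_MS for s in samples_ms):
        return "sustained high latency"
    if any(s > GUIDELINE_MS for s in samples_ms):
        return "above guideline"
    return "ok"


# Hypothetical GAVG/cmd readings, one per refresh.
print(classify_latency([3.0, 5.0, 9.0]))
print(classify_latency([3.0, 12.0, 9.0]))
print(classify_latency([25.0, 31.0, 22.0]))
```

A single spike is reported separately from a sustained problem, since a brief excursion above 10 ms is less of a concern than a window that stays above 20 ms.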

Note: ESX 3.0.x cannot monitor individual LUNs or virtual machines. Many inactive LUNs on the HBA can lower the average of DAVG/cmd, KAVG/cmd, and GAVG/cmd.

These values are also visible in the VirtualCenter performance charts. For more information, see Performance Charts in the Basic System Administration Guide.

If you experience high latency, check the switches (either FC or TCP) and the SAN for errors that may indicate a delay in commands being sent to and acknowledged by the SAN. Also consider whether the array can process the I/O, both in terms of spindle count and in terms of the load being presented to it.

If the response time exceeds 5000 ms (5 seconds), SCSI aborts appear in the logs: a command that is sent to an array and not acknowledged within 5000 ms is aborted. Abort messages and other SCSI errors can be seen in the following logs:

  • ESX – /var/log/vmkernel
  • ESXi – /var/log/messages

The type of log entries you see in those files depends on your Advanced Options SCSI.Log* or SCSI.Print*. You can find the values of these options in Host > Configuration > Advanced Settings > SCSI > SCSI.Log* or SCSI.Print*.

Source : http://kb.vmware.com/selfservice/microsites/search.do?language=en_US&cmd=displayKC&externalId=1008205
