Using iostat for monitoring disk activities

An application can slow down for many reasons, and identifying the cause can be tricky. iostat is a tool for monitoring I/O activity on the system, which may be what is slowing your application down.

iostat monitors I/O activity on a per-disk and per-partition basis. It has a number of options that might suit your particular need, but I find the ones below good enough for mine (the trailing 5 10 asks for one report every 5 seconds, 10 reports in total):
iostat -x -n -p -z 5 10

-x : Show extended statistics for each disk.
-n : Don't show cryptic device names; show readable names where possible.
-p : Show per-partition statistics in addition to per-device statistics.
-z : Don't show rows whose values are all zeros.

Let us take a sample output and explore.

extended device statistics
   r/s    w/s    kr/s    kw/s   wait   actv  wsvc_t  asvc_t  %w   %b  device
   0.0    0.8     0.0    10.8    0.0    0.0     0.0     0.6   0    0  c2t7d3s6

Here is what 'man iostat' has to say about the columns above:

device   name of the disk
r/s      reads per second
w/s      writes per second
kr/s     kilobytes read per second
kw/s     kilobytes written per second
wait     average number of transactions waiting for service (queue length)
actv     average number of transactions actively being serviced (removed from the queue but not yet completed)
svc_t    average service time, in milliseconds
%w       percent of time there are transactions waiting for service (queue non-empty)
%b       percent of time the disk is busy (transactions in progress)

In the sample output above, svc_t is split into two columns: wsvc_t is the average time spent in the wait queue, and asvc_t is the average time spent being serviced, both in milliseconds.
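
If you want to work with these columns in a script, a data row can be split into named fields. This is only a minimal sketch: it assumes the exact column order of the sample output above, and the names COLUMNS and parse_iostat_line are made up for illustration.

```python
# Column order copied from the header line of the sample output above.
# Assumption: your iostat prints the same columns in the same order.
COLUMNS = ["r/s", "w/s", "kr/s", "kw/s", "wait", "actv",
           "wsvc_t", "asvc_t", "%w", "%b", "device"]


def parse_iostat_line(line):
    """Turn one data row into a dict; every field except 'device' becomes a float."""
    fields = line.split()
    if len(fields) != len(COLUMNS):
        raise ValueError(f"expected {len(COLUMNS)} fields, got {len(fields)}")
    row = {name: float(value) for name, value in zip(COLUMNS[:-1], fields[:-1])}
    row["device"] = fields[-1]
    return row


# The data row from the sample output above.
sample = "0.0 0.8 0.0 10.8 0.0 0.0 0.0 0.6 0 0 c2t7d3s6"
row = parse_iostat_line(sample)
print(row["device"], row["kw/s"], row["asvc_t"])
```

Header lines and the "extended device statistics" banner do not have numeric fields, so a real script would need to skip them before calling the parser.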

Two things matter to us here:
  • If your application is performing a lot of random reads/writes, the first four columns will show high values. (What counts as high depends on your system! There is no universal number.)
  • As a result, wsvc_t and asvc_t will be high too.
Here comes the tricky part: when these numbers go up, how do you know your application is the cause? To a reasonable extent, you can find out.

First, make sure you are looking at the device/partition your application is actually reading from and writing to. You can use mount to find the device that holds the directory you are interested in.
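
The lookup can be scripted as a longest-prefix match over the mount table. The sketch below assumes each line of mount output has the form "MOUNTPOINT on DEVICE ..." (check this against your own system), and the mount table in it is invented sample data, as are the function names.

```python
import posixpath


def parse_mount_output(text):
    """Return (mount_point, device) pairs from 'MOUNTPOINT on DEVICE ...' lines.

    Assumption: the mount point is the first field and the device the third.
    """
    mounts = []
    for line in text.splitlines():
        parts = line.split()
        if len(parts) >= 3 and parts[1] == "on":
            mounts.append((parts[0], parts[2]))
    return mounts


def device_for_path(path, mounts):
    """Pick the mount point that is the longest prefix of path."""
    path = posixpath.normpath(path)
    best = None
    for mount_point, device in mounts:
        mp = posixpath.normpath(mount_point)
        # Match whole path components only, so /data does not match /database.
        if path == mp or path.startswith(mp.rstrip("/") + "/"):
            if best is None or len(mp) > len(best[0]):
                best = (mp, device)
    return best


# Hypothetical mount table, for illustration only.
sample_mounts = """\
/ on /dev/dsk/c0t0d0s0 read/write/setuid on Mon Jan  1 00:00:00 2001
/data on /dev/dsk/c2t7d3s6 read/write/setuid on Mon Jan  1 00:00:00 2001
"""

mounts = parse_mount_output(sample_mounts)
print(device_for_path("/data/app/logs", mounts))
```

The last path component of the device (c2t7d3s6 here) is the name to look for in the iostat output.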

Second, as far as possible, isolate the numbers per partition rather than per device. Per-device statistics are aggregated over all the partitions on that device. For example, monitor c2t7d3s6 instead of c2t7d3 to get a slightly more accurate picture.

The following sample outputs of iostat should help you make a comparison.


extended device statistics
   r/s    w/s    kr/s    kw/s   wait   actv  wsvc_t  asvc_t  %w   %b  device
  27.8  181.6   497.8  4748.0  124.0  211.6   592.0  1010.5  72  100  c2t7d3s6

extended device statistics
   r/s    w/s    kr/s    kw/s   wait   actv  wsvc_t  asvc_t  %w   %b  device
 216.2  167.0  5534.2  5363.0    0.3   88.7     0.8   231.4   2  100  c2t7d3s6

extended device statistics
   r/s    w/s    kr/s    kw/s   wait   actv  wsvc_t  asvc_t  %w   %b  device
   0.0    0.2     0.0     0.2    0.0    0.0     0.0    11.1   0    0  c2t7d3s6

extended device statistics
   r/s    w/s    kr/s    kw/s   wait   actv  wsvc_t  asvc_t  %w   %b  device
   0.0    2.8     0.0     1.5    0.0    0.0     0.0     8.6   0    0  c2t7d3s6

The first two snapshots were taken while the system was under heavy I/O activity, and the last two when there was little I/O activity. Compare the wait times and the kilobytes read/written. It is interesting to ask the following questions:
  1. Why is the time spent waiting in the queue much lower than the time spent being serviced? And what would the reverse mean?
  2. Assume the device can serve 2N fixed-size I/O requests in a given time interval. Consider two applications making N requests each. How would this differ from one application making 2N requests? Which parameter would let you distinguish the two cases?
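
While thinking about these questions, it helps to cross-check the four snapshots with Little's law: average queue length is roughly the request rate multiplied by the average time spent in that queue. The sketch below is plain arithmetic on the numbers shown above, not something iostat prints; the names snapshots and derived are made up for illustration.

```python
# (r/s, w/s, kr/s, kw/s, wait, actv, wsvc_t, asvc_t) copied from the four snapshots.
snapshots = {
    "busy 1": (27.8, 181.6, 497.8, 4748.0, 124.0, 211.6, 592.0, 1010.5),
    "busy 2": (216.2, 167.0, 5534.2, 5363.0, 0.3, 88.7, 0.8, 231.4),
    "idle 1": (0.0, 0.2, 0.0, 0.2, 0.0, 0.0, 0.0, 11.1),
    "idle 2": (0.0, 2.8, 0.0, 1.5, 0.0, 0.0, 0.0, 8.6),
}


def derived(r, w, kr, kw, wait, actv, wsvc_t, asvc_t):
    """Derive request rate, average request size, and Little's-law queue estimates."""
    iops = r + w  # total requests per second
    return {
        "iops": iops,
        "avg_io_kb": (kr + kw) / iops if iops else 0.0,
        # Service times are in milliseconds, so divide by 1000 to get seconds.
        "est_wait": iops * wsvc_t / 1000.0,
        "est_actv": iops * asvc_t / 1000.0,
    }


for name, snap in snapshots.items():
    d = derived(*snap)
    print(f"{name}: iops={d['iops']:.1f} avg_io_kb={d['avg_io_kb']:.1f} "
          f"est_wait={d['est_wait']:.1f} (reported {snap[4]}) "
          f"est_actv={d['est_actv']:.1f} (reported {snap[5]})")
```

For both busy snapshots the estimates come out within rounding of the reported wait and actv columns, which suggests the columns are consistent with each other and that a long queue and a long service time are two views of the same congestion.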
Okay. Once you have identified that your application is choking the I/O, what is the next step?

One strategy for improving your application is to design it so that reads/writes are sequential rather than random. This may not always be possible. At the least, try to reduce the spread of random reads/writes with strategies such as always using the lowest-addressed free blocks. I have come across applications that used a linked list to keep track of free blocks on the disk; changing the linked list to a min-heap improved performance. But a min-heap has its downsides too :-)
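
To make the min-heap idea concrete, here is a minimal sketch of a free-block tracker that always hands out the lowest-numbered free block. It is not the code from the application mentioned above; the class name FreeBlockHeap and the block numbers are invented for illustration.

```python
import heapq


class FreeBlockHeap:
    """Track free block numbers and always allocate the lowest one."""

    def __init__(self, free_blocks):
        self._heap = list(free_blocks)
        heapq.heapify(self._heap)  # O(n) build of the min-heap

    def allocate(self):
        """Remove and return the lowest-numbered free block."""
        if not self._heap:
            raise RuntimeError("no free blocks")
        return heapq.heappop(self._heap)

    def release(self, block):
        """Return a block to the free set (no double-free check in this sketch)."""
        heapq.heappush(self._heap, block)


free = FreeBlockHeap([90, 7, 42, 3, 65])
a = free.allocate()   # lowest free block
b = free.allocate()   # next lowest
free.release(a)       # a becomes free again
c = free.allocate()   # the released block is the lowest once more
print(a, b, c)
```

One downside is visible here: allocate and release cost O(log n), where taking the head of a linked list is O(1). The bet is that keeping writes clustered at low addresses saves more disk time than the extra bookkeeping costs.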

Comments

Unknown said…
Don't forget the -r flag to output to CSV.
