What is an IOP Anyway?

There is a lot of confusion and misguided discussion about IOPs, what they are, why they matter or do not matter, etc. Hopefully, this will not add to the confusion. The goal of this post is to expose a number of issues with IOPs, their measurement and what they mean in different contexts.

The trouble with IOPs begins with the very definition of the term. To most people it means the number of Input or Output Operations happening per second. Wikipedia has a useful write-up on what the term means.

The problem with this definition, and with most other definitions that sound much like the Wikipedia one, is that they do not tell you where these IOPs are measured or why they matter at all. First of all, an IOP is very generically defined as an operation, but an operation could mean different things at different layers of the client <-> server relationship. Consider a network file protocol such as NFS. There are a number of commands that the NFS server and client use to talk to each other in order to accomplish useful work. By definition, each command issued by the client would then be an IO, and a number of these commands per second would fit the definition of IOPs. However, only some of these commands result in actual physical IO to the storage. Most commands, while technically IOs, are metadata operations, such as looking up the existence of something we are trying to open, updating an access time, etc. In extreme cases more than 50% of IOPs may result from metadata operations.
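To make the metadata point concrete, here is a minimal sketch that tallies a list of NFS operation names and reports what share of them are metadata. The procedure names are NFSv3 procedure names; the sample list and the choice to count only READ and WRITE as data operations are assumptions made for illustration, not measurements from a real system.

```python
from collections import Counter

# Assumption for this sketch: only READ and WRITE move file data.
# Every other NFSv3 procedure (LOOKUP, GETATTR, ACCESS, ...) counts as metadata.
DATA_OPS = {"READ", "WRITE"}


def metadata_share(ops):
    """Return the fraction of operations that are metadata (0.0 to 1.0)."""
    counts = Counter(op.upper() for op in ops)
    total = sum(counts.values())
    if total == 0:
        return 0.0
    data = sum(n for op, n in counts.items() if op in DATA_OPS)
    return (total - data) / total


# Hypothetical sample of operations seen from one client.
sample = ["LOOKUP", "GETATTR", "ACCESS", "READ", "READ",
          "GETATTR", "WRITE", "SETATTR"]

print(f"metadata share: {metadata_share(sample):.3f}")
```

In this made-up sample, five of the eight operations never touch file data, yet each of them would still be counted as an IOP under the broad definition.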

To be clear, some would argue that an IOP is only something that results in physical IO to the storage. Others would argue that any command a storage protocol issues is an IOP, whether or not any real IO happens. Who is right? I suppose it depends. Unfortunately, there is no strict definition of what an IOP is, so we have to accept, generically, that any command is an IOP, since any command results in some kind of work happening on the system.

But this gets worse. Not all IOPs are the same, as you may have already read in another blog post here or elsewhere. If we focus strictly on those resulting in physical IO, i.e. reads and writes, we must understand that what is really key about these IOs is their size. Generally speaking, IO size is a vital metric that we tend to ignore completely. We obsess over the number of IOPs our storage can support without realizing that the same storage system will support a vastly different number of IOPs depending on the size of the IOs that clients issue.

All storage systems are constrained by two basic realities. First, there are only so many operations that a system can do in any given time, so the larger a system the more it can do, up to a point. Second, only a finite amount of data can be written out to the drives at any time, which is the throughput potential of a storage system. These constraints typically sit at different points in the system, but they are universal, and we must accept this.
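The two constraints can be sketched as a simple model: the IOPs a system can sustain at a given IO size is the smaller of its operation ceiling and its throughput ceiling divided by that IO size. The ceilings below (100,000 operations per second and 1000 MiB/s) are hypothetical numbers chosen for illustration, and real systems are not this clean, but the shape of the result is the point.

```python
KIB = 1024
MIB = 1024 * KIB

# Hypothetical ceilings for an imaginary storage system.
MAX_OPS_PER_SEC = 100_000        # operation-rate limit (CPU, protocol, etc.)
MAX_BYTES_PER_SEC = 1000 * MIB   # throughput limit (drives, links, etc.)


def achievable_iops(io_size_bytes, max_ops_per_sec, max_bytes_per_sec):
    """IOPs at a given IO size: whichever of the two ceilings is hit first."""
    return min(max_ops_per_sec, max_bytes_per_sec / io_size_bytes)


for size in (4 * KIB, 64 * KIB, 128 * KIB, MIB):
    iops = achievable_iops(size, MAX_OPS_PER_SEC, MAX_BYTES_PER_SEC)
    mib_per_sec = iops * size / MIB
    print(f"{size // KIB:>5} KiB IOs: {iops:>9.0f} IOPs, {mib_per_sec:>7.1f} MiB/s")
```

Under these assumed ceilings, 4 KiB IOs run into the operation limit long before the throughput limit, while 1 MiB IOs saturate throughput at only 1,000 IOPs. The same system, two very different IOPs numbers.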

Sizes of IOs are not something we can easily control, as they are typically a product of applications performing work. Commonly, applications that want very low-latency responses will issue lots of small IOs, often 4KB in size, a very common number. If you have many such apps, they will surely tax the system in terms of the sheer number of operations it can support, which could be limited by CPU, system memory, protocol implementation issues, etc. This constraint may be experienced as increasing latency and increased IO-wait as reported by applications, which means applications are spending time waiting for IO to complete.

On the flip side of the same coin, there are apps that treat data differently, with throughput as the primary goal. Think of backup operations. We want throughput to be as high as possible so our backups finish sooner, but a lightning-fast response from each individual operation is not required. Large IO sizes result in far fewer IOs, but more work is actually being done, in terms of bytes transferred, than with a greater number of much smaller IOs. For example, a backup utility may be doing IOs in 64KB, 128KB, or 1MB units. Think about how many 4KB IOs it takes to write 1MB: it is 256 operations, for work we can potentially achieve with a single 1MB IO. You can get the same work done with 256 small IOs or 1 large IO, but doing it with small IOs is likely to be far more taxing on storage, and will result in lower throughput than the 1MB IO size.
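The arithmetic above is easy to check. This short sketch counts how many IOs of a given size are needed to move a fixed amount of data, treating 1MB as 1024KB as the 256-operation figure implies. The function name is made up for this example.

```python
KIB = 1024
MIB = 1024 * KIB


def ios_needed(total_bytes, io_size_bytes):
    """Number of IOs of io_size_bytes needed to move total_bytes (rounded up)."""
    return -(-total_bytes // io_size_bytes)  # integer ceiling division


for size in (4 * KIB, 64 * KIB, 128 * KIB, MIB):
    print(f"{size // KIB:>5} KiB IOs to move 1 MiB: {ios_needed(MIB, size)}")
```

The bytes moved are identical in every case. What changes is how many times the system has to accept, process, and acknowledge an operation.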

This reminds us that we need to think about latency differently for small IOs and for large IOs. Typically, large IOs result from non-interactive use, such as backups or volume-to-volume data transfers, and their latency is not important, since we are not looking to complete each such IO as quickly as possible. Small IOs, however, commonly result from interactive operations, such as a database query, a lookup of metadata on a file, an update of a file on save, etc. These IOs we actually want to see happen as fast as possible, because they are interactive, and we have users impatiently waiting to do the next thing. Unfortunately, when we talk about IOPs and how many IOPs a given system can handle, we almost never see any mention of either what we call an IOP, i.e. whether we count metadata operations or not, or the sizes of these IOPs. The truth is that on ANY production system shared by multiple users, the sizes of IOs are distributed anywhere from 512 bytes to 1MB or larger, all the time.
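One way to see this spread of IO sizes is to bucket observed sizes into a histogram. The sketch below assumes you already have a list of IO sizes in bytes from some trace or monitoring tool; the sample values are invented, and rounding each size up to the next power of two is just one reasonable bucketing choice.

```python
from collections import Counter


def size_bucket(io_size_bytes):
    """Round an IO size up to the nearest power of two (e.g. 5000 -> 8192)."""
    if io_size_bytes <= 0:
        raise ValueError("IO size must be positive")
    return 1 << (io_size_bytes - 1).bit_length()


def io_size_histogram(sizes):
    """Count IOs per power-of-two size bucket."""
    return Counter(size_bucket(s) for s in sizes)


# Hypothetical IO sizes in bytes, as might be pulled from a trace.
observed = [512, 4096, 4096, 5000, 65536, 1048576]

hist = io_size_histogram(observed)
for bucket in sorted(hist):
    print(f"<= {bucket:>8} bytes: {hist[bucket]} IOs")
```

A single IOPs figure says nothing about which of these buckets it was measured against, which is why a quoted number without an IO size attached is hard to interpret.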

The hardest question to answer is how to figure out how many IOPs, and of what size, we need to support to satisfy our environment. There is no panacea, unfortunately. We also have to bear in mind that we must be able to support a gamut of IO sizes, from large to small, because at any moment our environment can turn sharply from latency-critical to throughput-biased, and back again. This is complicated further by the fact that we don’t tend to know what our environment produces. Because we don’t know what we need, we cannot possibly ask for it. The only thing we can do is experiment, test, and measure. The better storage vendors out there can help you measure what happens on your array: how much of the work is random, how much is sequential, what the dominant IO sizes from clients are, etc. Working closely with your storage vendor is the way to go, and if they are unable to help, consider giving us the opportunity.