One of the most common problems on Linux, and sometimes on Windows (especially releases before Windows Server 2008), is poor I/O performance that seems unexplainable even after repeated digging. I have found that misaligned LUNs or disk partitions are most often to blame, especially if you have migrated to new storage such as NetApp or added a few disks.

So what is a misaligned LUN? In very simple terms, it is a LUN whose partition starts at an offset that does not fall on one of the storage array's block boundaries, so the blocks the OS writes do not line up with the blocks the storage uses. Most reads and writes on filers like NetApp are done in 4K increments (or some multiple of 4K). If the byte offset at which the OS writes its first byte is not divisible by 4096, you have a misaligned LUN. With 512-byte sectors, that means the starting sector must be a multiple of 8.

An easy way to check for misaligned LUNs in Linux is to run fdisk -l -u on the device in question (on ONTAP for NetApp you can probably use lun show -v) and look at the starting sector for that LUN as seen from the OS. For example, a starting sector of 62 or 63 is misaligned, since neither is a multiple of 8. You will then need to change the starting sector for this LUN to 64 or 128 so that it aligns with a 4K boundary when the storage does I/O to the disk.

Similarly, if you are hosting a VMware ESX environment, it is crucial that you align the vmdks on the VMFS datastore to the correct boundary on the LUN or partition, or that you use the VI Client to create the VMFS partition, since that tool aligns the partition automatically.
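The starting-sector check is simple enough to script. What follows is only a rough sketch in Python, assuming 512-byte sectors (so a 4K boundary falls on every 8th sector) and a 4K storage block; the function names is_aligned and next_aligned_sector are made up for illustration, and you would feed in the start sector you read from fdisk by hand.

```python
SECTOR_SIZE = 512   # bytes per sector as reported by the OS (assumed)
BLOCK_SIZE = 4096   # storage block size in bytes (4K, assumed)

def is_aligned(start_sector, sector_size=SECTOR_SIZE, block_size=BLOCK_SIZE):
    """Return True if the partition's starting byte offset sits on a block boundary."""
    return (start_sector * sector_size) % block_size == 0

def next_aligned_sector(start_sector, sector_size=SECTOR_SIZE, block_size=BLOCK_SIZE):
    """Return the first sector at or after start_sector that sits on a block boundary."""
    sectors_per_block = block_size // sector_size
    # Ceiling division, then scale back up to a sector number.
    return -(-start_sector // sectors_per_block) * sectors_per_block

for sector in (62, 63, 64, 128):
    print(sector, is_aligned(sector), next_aligned_sector(sector))
```

With these assumptions, sectors 62 and 63 come back as misaligned with 64 as the next usable start, while 64 and 128 are already fine.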
So what happens when you have a misaligned LUN or partition? The simplest answer is that you force your storage array to do up to double the work for every I/O. Since the storage reads and writes in chunks of 4K (or multiples thereof), a misaligned partition causes each 4K chunk from the OS to straddle a block boundary on the disk. The storage reads part of the 4K from one block and, since the remainder sits in the next block, it has to do a second I/O to fetch the rest. As the I/O transaction load on the server increases, your filer will not be able to keep up and will return data more slowly, which shows up in your sar or iostat reports as very high disk service times.
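To make the "double the work" point concrete, here is a small illustrative calculation in Python. It assumes 4K storage blocks and 512-byte sectors, and the helper blocks_touched is a hypothetical name, not part of any storage tool. It counts how many storage blocks a single 4K host I/O lands on for an aligned start (sector 64) versus a misaligned one (sector 63).

```python
BLOCK_SIZE = 4096   # storage block size in bytes (assumed)
SECTOR_SIZE = 512   # bytes per sector (assumed)

def blocks_touched(io_offset, io_size, partition_start_bytes, block_size=BLOCK_SIZE):
    """Count the storage blocks covered by one host I/O.

    io_offset is relative to the start of the partition;
    partition_start_bytes is where the partition begins on the LUN.
    """
    first_byte = partition_start_bytes + io_offset
    last_byte = first_byte + io_size - 1
    return last_byte // block_size - first_byte // block_size + 1

aligned = blocks_touched(0, 4096, 64 * SECTOR_SIZE)
misaligned = blocks_touched(0, 4096, 63 * SECTOR_SIZE)
print(aligned, misaligned)  # prints: 1 2
```

Every 4K I/O on the misaligned partition touches two storage blocks instead of one, which is where the extra back-end work comes from.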
Now comes the big issue. If you already have data on the misaligned LUN then you are screwed, because you will first need to shut down your applications and move the data off. Using a storage-provided SNAP or BCV copy is the best and fastest way to do this, rather than copying manually with cp, tar, cpio and so on. You can either move the data to an aligned LUN that becomes its permanent home, or move it to a temporary place, align the original partition or LUN, and copy the data back. To do the alignment you can use fdisk in Linux, diskpart in Windows, or mbralign/mbrscan on NetApp. Aligning essentially means repartitioning the LUN, which is destructive, hence the need to save the data first. If you have several gigabytes or terabytes of data on a misaligned LUN (especially with large vmdks for VMs), the downtime needed for the copy will be large and painful. So, in summary, it is best to catch LUN misalignment before you start using the partitions or LUNs and save yourself a heap of trouble.