Friday, March 27, 2015

EHR Performance Issues: Hard Drive Bottleneck (HDD Active Time, HDD Response Time, Disk Queue Length)

This post is about EHR performance issues as a result of a hard drive bottleneck. While this post relates to a server environment, the same principles apply to personal computers running Windows.

EHR performance issues can arise for many reasons, including poor network performance, poor client-side performance, and faulty configurations. Because ruling out every possible cause is tedious and time is critical, I recommend monitoring the server hardware frequently so that hardware problems can be addressed before they affect users. I do so daily. This is especially important in fast-growing health centers, where an increase in the number of supported users can overload the system. When an EHR is first deployed, the deployment team usually conducts a detailed analysis of the organization and its usage needs in order to allocate proper resources to the server hardware. Because these systems are sized from that early assessment, an EHR implementation planned for 100 to 250 users may develop performance issues if the user count grows to 350+ within two years, depending on the scalability that was built in.

In the image below, notice that there is sufficient drive space available on the server.


But drive space is not the issue in this scenario. When looking for EHR performance issues caused by a hard drive bottleneck, it is important to know how many physical disks are part of the array. Using Windows Resource Monitor, we can determine that there are too few hard drives to keep up with the Input/Output (I/O) requests. Notice that for Drive E, Active Time and Response Time are too high.



EHR Performance Issues – Hard Drive Bottleneck

Active Time is the percentage of time that the disk is in use. If Active Time is constantly reaching 100%, this is a strong indication that the disk cannot handle the current load. Response Time is the time it takes for the disk to respond to I/O requests. The lower the number, the better. Anything above 25 ms is bad, although this can depend on the size of the data being written and retrieved. For additional information regarding Windows Resource Monitor, click here.
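As a rough illustration, the two thresholds above can be written as a small Python check. This is only a sketch: the function name, the sample readings, and the choice to treat "constantly reaching 100%" as every reading being at or above 99% are my own assumptions, and the readings are not taken from the screenshots in this post.

```python
from statistics import mean

# Threshold quoted in this post: response times above 25 ms are bad.
RESPONSE_TIME_LIMIT_MS = 25.0
# Assumption: "constantly reaching 100%" is read as every sample >= 99%.
ACTIVE_TIME_SATURATED_PCT = 99.0


def disk_warnings(active_time_pct, response_time_ms):
    """Return a list of warnings for a series of readings from one disk."""
    warnings = []
    # Only flag active time when every reading is pinned near 100%.
    if active_time_pct and min(active_time_pct) >= ACTIVE_TIME_SATURATED_PCT:
        warnings.append("active time is pinned near 100%")
    # Flag response time when the average reading is above the limit.
    if response_time_ms and mean(response_time_ms) > RESPONSE_TIME_LIMIT_MS:
        warnings.append("average response time exceeds 25 ms")
    return warnings


# Hypothetical readings for a busy disk and an idle one.
print(disk_warnings([100, 100, 99.5], [40, 62, 55]))
print(disk_warnings([35, 60, 20], [4, 9, 6]))
```

The first call reports both warnings and the second reports none. Averaging the response time is a simplification; a single slow reading during a large write would not trip the check.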

The problem in this example is that there are more read/write requests than the two disks in the array can service. If you look at the Disk Queue Length, you will notice that too many requests are waiting in the queue, keeping the drive constantly at 100% Active Time. This causes lag when changing tabs, viewing documents, saving changes, closing nurse encounters, etc. As a rule of thumb, the Disk Queue Length should never exceed 2 per physical disk in the array; anything higher is a sign of a hard drive bottleneck. Since Drive E is composed of two disks, its Queue Length should not exceed 4; if a disk array is composed of four disks, its Disk Queue Length should not exceed 8. In this example, the Disk Queue Length for Drive E was constantly between 8 and 10.
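The queue-length rule of thumb is simple arithmetic, sketched below in Python. The function names are hypothetical and exist only for illustration; the limit of 2 per physical disk and the Drive E figures come from the paragraph above.

```python
# Rule of thumb from this post: at most 2 queued requests per physical disk.
MAX_QUEUE_PER_DISK = 2


def max_queue_length(physical_disks):
    """Return the highest acceptable Disk Queue Length for an array."""
    if physical_disks < 1:
        raise ValueError("an array needs at least one physical disk")
    return MAX_QUEUE_PER_DISK * physical_disks


def is_bottlenecked(queue_length, physical_disks):
    """Return True when the observed queue exceeds the rule-of-thumb limit."""
    return queue_length > max_queue_length(physical_disks)


print(max_queue_length(2))     # limit for Drive E (two disks): 4
print(max_queue_length(4))     # limit for a four-disk array: 8
print(is_bottlenecked(8, 2))   # low end of the observed range: True
print(is_bottlenecked(10, 2))  # high end of the observed range: True
```

Both ends of the observed range (8 to 10) are at least double the limit of 4 for a two-disk array, which matches the bottleneck described here.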

In this scenario, adding two more physical disks to the array will improve response time, since more disks will be available to service requests. This in turn reduces Response Time, Queue Length, and Active Time, eliminating EHR performance issues caused by a hard drive bottleneck.
