How To Troubleshoot High Load in Linux Web Hosting Servers?
For every problem, there is a “Right Approach”. Yet when people are confronted with problems, they often become confused and flustered. How we react to a problem matters. At times, an immediate action is better than a late solution, because the issue may get more complicated as time passes. The same applies to servers. Even though server technology is mature, administrators still face severe load spikes. Server load issues can certainly be resolved; the real problem is the time spent fixing them. In business, even a short period of downtime may affect sales and, ultimately, the credibility of the company. In our blogs, we usually bring to light issues pertaining to server platforms. This time we look at the Linux server platform and an approach to resolving the most chaotic server load spikes.
Heard a Million Times! Still, What Are Server Load Spikes All About?
Every server works with a limited set of resources. Let’s consider a server with 8 GB RAM, 75 IOPS SATA II hard disks, 4 processors, and a 1 Gigabit NIC. Assume that a user starts a backup of his/her account and the backup process takes up 7.5 GB of RAM; the other processes have to wait until it is over. The longer the backup runs, the longer the queue of waiting processes grows. This wait is what shows up as server load.
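To put a number on this wait, Linux reports load averages over the last 1, 5 and 15 minutes. The following is a minimal sketch in Python, assuming a Linux host, that compares the 1-minute load average with the number of processors. The helper names and the threshold of 1.0 per processor are illustrative assumptions, not values from this article; tune the threshold to your own servers.

```python
import os


def load_per_core(load_avg, cpu_count):
    """Return the load average divided by the number of processors."""
    if cpu_count < 1:
        raise ValueError("cpu_count must be at least 1")
    return load_avg / cpu_count


def describe_load(load_avg, cpu_count, threshold=1.0):
    """Label the load as 'ok' or 'overloaded'.

    threshold is an assumed rule of thumb: more than one waiting or
    running process per processor is treated as overloaded.
    """
    ratio = load_per_core(load_avg, cpu_count)
    return "overloaded" if ratio > threshold else "ok"


if __name__ == "__main__":
    # os.getloadavg() returns the 1, 5 and 15 minute load averages.
    one_min, five_min, fifteen_min = os.getloadavg()
    cores = os.cpu_count() or 1
    print("1-minute load:", one_min, "on", cores, "processors")
    print("status:", describe_load(one_min, cores))
```

On the 4-processor server from the example above, a load average of 8 would mean roughly two processes competing for each processor.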
How To Fix Server Load Spikes In Linux Web Hosting Servers?
Server load needs to be resolved quickly because, with every second that passes, more processes queue up one after the other. Once your commands start taking longer to execute, the server is on its way to becoming unresponsive, and a reboot may be the only option left.
The recovery should happen within the first few minutes, so 24/7 monitoring is necessary. Always start from ‘what you know’ and go to ‘what you don’t’. You already know the server resources: RAM, CPU and I/O. One of these resources is being abused, and your first job is to find out which one. The next step is to track which service is using that resource. It can be a database server, mail server, web server or another service. From the service, you can identify the user who is actually abusing the server.
Let’s discuss the process in detail.
The command ‘atop’ is an ideal tool if you are troubleshooting a physical server or a hardware virtualized instance. If you are working in an OS virtualization environment, the ‘top’ command is the suitable choice. On a VPS node, it is recommended to start with the ‘vztop’ command. Even though the commands and methods differ, the ultimate goal is the same: to locate the overloaded resource, viz. Memory, Disk, CPU, or Network. Never jump to a conclusion; observe the output for at least 30 seconds before deciding which resource is being hogged. If you are using top, use the ‘i’ switch to see only the active processes and the ‘c’ switch to see the full command line. To check how much time the CPU spends waiting on I/O, and so to know whether a non-CPU resource is being hogged, look at the %wa value in top. To spot any suspicious processes, use the ‘pstree’ command. To identify multiple connections from one particular IP, use ‘netstat’.
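As an illustration of the netstat check, here is a small Python sketch that counts connections per remote IP. It assumes the text comes from ‘netstat -ntu’ and that the foreign address is the fifth column, as in the usual net-tools layout. The function name is hypothetical and the sample output is made up for the example, using documentation-only IP addresses.

```python
from collections import Counter


def connections_per_ip(netstat_output):
    """Count connections per remote IP in 'netstat -ntu' style output."""
    counts = Counter()
    for line in netstat_output.splitlines():
        fields = line.split()
        # Skip headers and anything that is not a tcp/udp connection row.
        if len(fields) < 5 or not fields[0].startswith(("tcp", "udp")):
            continue
        # The foreign address looks like ip:port; drop the port.
        remote_ip = fields[4].rsplit(":", 1)[0]
        counts[remote_ip] += 1
    return counts


# Made-up sample in the usual netstat layout (not real traffic).
SAMPLE = """\
Active Internet connections (w/o servers)
Proto Recv-Q Send-Q Local Address           Foreign Address         State
tcp        0      0 192.0.2.10:80           203.0.113.5:51234       ESTABLISHED
tcp        0      0 192.0.2.10:80           203.0.113.5:51240       ESTABLISHED
tcp        0      0 192.0.2.10:80           198.51.100.7:40112      TIME_WAIT
tcp        0      0 192.0.2.10:443          203.0.113.5:51302       ESTABLISHED
"""

if __name__ == "__main__":
    # Print the remote IPs with the most connections first.
    for ip, total in connections_per_ip(SAMPLE).most_common():
        print(ip, total)
```

On a live server the same function could be fed the real output, for example from subprocess.run(["netstat", "-ntu"], capture_output=True, text=True).stdout. What counts as too many connections from one IP depends on the site, so no threshold is built in.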
The next step is to track down the service which is hogging the resource. Here too, commands such as ‘atop’ help to single out the overloaded service. Utilities such as ‘atop’ and ‘top’ are suitable for checking CPU usage per service, and ‘nethogs’ is well suited to checking network usage. The third step is to track the virtual host which is causing the server load. The individual access logs are the best place to start service-specific troubleshooting, and increasing the log verbosity helps to reveal which virtual host is taxing the service. Through these three stages of narrowing down, the exact source of the server load spike can be identified.
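For the third step, a quick first pass over the individual access logs can be sketched in Python as below. It assumes each virtual host writes to its own file ending in .log inside one directory and that each line is one request. Both are assumptions about your setup; the directory is passed in rather than hard-coded because its location varies between servers, and the function name is hypothetical.

```python
import sys
from pathlib import Path


def requests_per_vhost(log_dir):
    """Return (vhost, request_count) pairs, busiest first.

    Assumes one access log per virtual host, named <vhost>.log,
    with one request per line.
    """
    counts = {}
    for log_file in Path(log_dir).glob("*.log"):
        with log_file.open(errors="replace") as handle:
            counts[log_file.stem] = sum(1 for _ in handle)
    return sorted(counts.items(), key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    # Pass the access log directory as the first argument.
    log_dir = sys.argv[1] if len(sys.argv) > 1 else "."
    for vhost, total in requests_per_vhost(log_dir):
        print(vhost, total)
```

A raw line count only shows which virtual host receives the most requests. It does not show which requests are expensive, so treat the top entries as candidates to inspect more closely rather than as a verdict.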
A disciplined approach is required for troubleshooting server load spikes. The three-level process (resource, service, and host) helps the server administrator to track down the source of the hogging. As mentioned before, the way to reach the exact cause of the server load is to start from ‘what you know’ and go to ‘what you don’t’. Making a habit of checking the output of these commands on a healthy server helps the Linux server administrator to recognize what has gone wrong when a spike hits. There are specialist tools for spotting the source of server issues in different situations. Server load spikes require immediate resolution because the business of the enterprise suffers for as long as the delay lasts. Tracking and resolving these delays at an early stage helps enterprises to avoid heavy business losses and damage to their credibility.