While the cloud has helped reduce server outages, it cannot eliminate them entirely. And even if parts of your infrastructure have moved to the cloud, there is a good chance you are still running some servers on-premises.
This article explains how to maintain those servers effectively. Over the last couple of years, I’ve helped two businesses consolidate their servers. We replaced part of the existing hardware with new Xeon-based servers running Windows Server. I learnt a lot during that process, including how to do proper server maintenance and what happens when you don’t. I hope you find this post informative.
Keep Your OS Updated
It seems obvious, yet it is often overlooked. Then all it takes is a piece of malware such as the WannaCry worm to get everyone’s attention. WannaCry primarily hit unpatched Windows 7 PCs, but it also hit some Windows Server 2003 servers. There are two issues here. First, make sure you are running a version of Windows that Microsoft still supports with regular patch releases. Second, keep that operating system up to date. I talked with a number of people who were unaware that Microsoft had ended support for Windows 7.
I’ve spoken with far too many IT professionals who, once their servers are running correctly, are unwilling to touch them. A few go as far as disabling the Windows Update service, which is a recipe for disaster. I understand the reluctance: testing patches in a VM is time-consuming, and Microsoft has shipped flawed patches before. That does not mean you should stop trying to keep all of your systems current. You may not run into problems for years, but running unpatched servers will eventually catch up with you.
The latest versions of Windows give you more control over how and when updates are deployed. If you’re curious about how Microsoft updates Windows Server 2016, check out this Redmond Magazine article.
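One low-effort way to keep patching from slipping is to track when each server was last patched and flag the ones that have gone too long. The sketch below is only an illustration: the server names, the dates, and the 35-day threshold are all made up, and in practice you would pull the dates from whatever inventory or patch-management tool you already use.

```python
from datetime import date, timedelta

# Hypothetical inventory: server name -> date the last OS patch was installed.
LAST_PATCHED = {
    "file-01": date(2024, 1, 10),
    "sql-01": date(2023, 9, 2),
    "dc-01": date(2024, 2, 20),
}


def overdue_servers(last_patched, today, max_age_days=35):
    """Return (name, days_since_patch) for servers past the threshold, oldest first.

    max_age_days=35 is an assumed policy (a monthly cycle plus a few days of
    grace), not a recommendation from Microsoft.
    """
    cutoff = timedelta(days=max_age_days)
    overdue = [
        (name, (today - patched).days)
        for name, patched in last_patched.items()
        if today - patched > cutoff
    ]
    return sorted(overdue, key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    for name, age in overdue_servers(LAST_PATCHED, today=date(2024, 3, 1)):
        print(f"{name}: last patched {age} days ago")
```

Running a report like this on a schedule makes a forgotten server visible long before it becomes the one unpatched machine on the network.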
Physically Clean Your Server
But I keep my server in a locked cabinet! That is an excellent start. If you are fortunate enough to work for a business that provides server racks, cabinets, and a suitable environment for all of the company’s servers, be grateful to your CEO. Even then, your servers can still pull in dirt and dust, which hurts performance and reliability. Today’s high-performance CPUs and GPUs automatically reduce their clock speeds when they lack sufficient cooling.
A high-quality server is equipped with powerful fans that move air over and around critical components. With that much airflow, however, the fans also pull dirt and dust into the case. A few years back, I went to a dentist’s office to help him upgrade his server. He said that he never took it out of the beautiful glass enclosure he kept at the rear of his office. The server ran his patient management software and was restarting intermittently during the day. I asked when he had last cleaned the case filters.
When he just stared back at me, I had my answer. When I pulled the server out of the enclosure, I found the case filters so clogged that the server was throttling itself because of the heat inside the case.
I’ve cleaned both desktop and rackmount server cases using compressed air. Take care not to damage the fans while blowing compressed air through them. Make sure you remove and clean all of the filters in your server case. Some newer cases have bottom-mounted filters in addition to top- and rear-mounted ones.
Virtualization Helps Server Maintenance
Remember the heyday of the backup server? I recently talked with a day trader who, despite the extra expense and management overhead, is still a fan of the backup server. Fortunately, we live in an era where almost any server can be virtualized. In fact, it makes sense to virtualize every service you can. Why? Because it is so simple to set up a backup virtual machine these days. Consolidating several servers running on older hardware as virtual machines on a single newer host almost always results in increased uptime.
I realise that not all servers can be virtualized. Occasionally, licensing, performance, or hardware constraints rule it out. That still leaves plenty of opportunities to virtualize the servers where it makes sense. Some servers simply should not be virtualized.
Check Logs for Hardware Errors
If left unchecked, bad components can bring a server to its knees. Hardware problems often show up only after POST, once Windows has started all of its services. As part of your server maintenance plan, check the system logs for hardware errors. You may find that updating a GPU or RAID card driver resolves the problem. If the problem continues, the component must be replaced.
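If you export the system log to a CSV file, a short script can do the first pass for you. The following is a minimal sketch under several assumptions: the column names (LevelDisplayName, ProviderName) are modelled on what a PowerShell Get-WinEvent export typically contains, the keyword list is my own guess at hardware-related providers, and the sample rows are invented for illustration rather than copied from a real log.

```python
import csv
import io

# Assumed provider keywords and severity levels; adjust them to match your export.
HARDWARE_HINTS = ("disk", "ntfs", "whea", "storport", "raid")
BAD_LEVELS = {"Critical", "Error"}


def hardware_errors(csv_text):
    """Return rows that are Error/Critical and whose provider name looks hardware-related."""
    rows = csv.DictReader(io.StringIO(csv_text))
    return [
        row
        for row in rows
        if row["LevelDisplayName"] in BAD_LEVELS
        and any(hint in row["ProviderName"].lower() for hint in HARDWARE_HINTS)
    ]


# Invented sample rows, for illustration only (not real log output).
SAMPLE = """\
TimeCreated,Id,LevelDisplayName,ProviderName,Message
2024-03-01 02:14:07,7,Error,disk,Example message about a bad block
2024-03-01 02:15:00,1014,Warning,DNS Client Events,Example name resolution timeout
2024-03-01 03:40:12,19,Warning,Microsoft-Windows-WHEA-Logger,Example corrected hardware error
2024-03-01 04:02:55,1,Error,Microsoft-Windows-WHEA-Logger,Example fatal hardware error
"""

if __name__ == "__main__":
    for row in hardware_errors(SAMPLE):
        print(row["TimeCreated"], row["ProviderName"], row["Message"])
```

The filter deliberately ignores warnings to keep the output short. If you want to catch corrected hardware errors before they turn fatal, add "Warning" to BAD_LEVELS.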
RAID Controllers, like this model from LSI, run very hot
It is not a bad idea to remove any unused PCIe cards or drives. Server hardware is designed to run continuously. I rarely see problems with CPUs, motherboards, or RAM. Even today’s GPUs often run without problems for years. However, I’ve seen my share of failed power supplies, fans, and expansion cards over time. RAID cards are notorious for overheating, which significantly shortens their lifespan. It’s also a good idea to keep an eye out for system errors, but in my experience unchecked hardware failures are much more likely to bring a system down.
Verify Your Backups
So, you’ve set up a schedule for server backups. Each week, you check that the backup server is running properly. But are you taking the extra time to verify that your backups actually work? Verifying the integrity of your backups is often the most neglected step in the server backup process. How do you do this? To begin, you’ll want to run several test restores until you’re confident in your approach. After that, spot checks may suffice.
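One way to make a test restore more than an eyeball check is to compare checksums between the original files and the restored copies. The sketch below assumes you restore into a scratch directory and that the source files have not changed since the backup ran. The function names are my own, and most backup products ship their own verification features that you should use where available.

```python
import hashlib
from pathlib import Path


def sha256_of(path, chunk_size=1 << 20):
    """Hash a file in chunks so large files do not have to fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_restore(source_dir, restored_dir):
    """Return sorted relative paths that are missing or differ in the restored copy."""
    source_dir, restored_dir = Path(source_dir), Path(restored_dir)
    problems = []
    for src in source_dir.rglob("*"):
        if not src.is_file():
            continue
        rel = src.relative_to(source_dir)
        restored = restored_dir / rel
        if not restored.is_file() or sha256_of(src) != sha256_of(restored):
            problems.append(rel.as_posix())
    return sorted(problems)


if __name__ == "__main__":
    # Hypothetical paths; point these at a real source tree and a scratch restore.
    for path in verify_restore("D:/data", "E:/restore-test"):
        print("MISMATCH OR MISSING:", path)
```

An empty result means every source file came back byte-for-byte identical. Anything listed is worth investigating before you need that backup for real.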
If you’re outsourcing backups to a cloud provider, it’s critical to understand how they verify backups. Backup locations, schedules, and recovery times are all important parts of a robust backup strategy. You should have a solid grasp of all of them, regardless of whether the service is provided by your own team or a third party. When your reputation is at stake, you want to utilise tried-and-true methods.
Many factors contribute to keeping your services running effectively and with as little drama as possible. Some of the simplest recommendations are often overlooked. Keeping your server off the floor might seem obvious, yet I still visit businesses that have one or more servers sitting on the floor. Finding an appropriate location for your server should be your first priority.