Hardware Rarely Solves Database Performance Issues
Database Performance Use Case
Databases power mission-critical applications at every company, and when performance problems appear, companies very often buy new hardware or upgrade what they already have. In our experience, though, hardware upgrades do not solve most database problems. We have a saying: new hardware might increase performance by 50 to 75% over the long term, while database indexing and code changes can increase it by 1,000% or more.
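To put the saying above in comparable terms, the short sketch below converts each percentage increase into a speedup multiplier. The percentages are the ones quoted in the saying; the function name and the comparison ratio are only illustrative and are not measurements from any client system.

```python
def speedup_factor(percent_increase: float) -> float:
    """Convert a percentage increase in performance into a speedup multiplier."""
    return 1.0 + percent_increase / 100.0


# Figures quoted in the saying above (rules of thumb, not measurements).
hardware_low = speedup_factor(50)    # 1.5x
hardware_high = speedup_factor(75)   # 1.75x
tuning = speedup_factor(1000)        # 11x

# How many times larger the tuning gain is than the best-case hardware gain.
tuning_vs_hardware = tuning / hardware_high

print(f"Hardware upgrade: {hardware_low:.2f}x to {hardware_high:.2f}x")
print(f"Indexing and code changes: {tuning:.0f}x or more")
print(f"Tuning advantage over best-case hardware: {tuning_vs_hardware:.1f}x")
```

Read this way, a 1,000% increase means roughly eleven times the original performance, against at most 1.75 times for the hardware upgrade in the saying.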
When Fortified Data started working with a nationwide healthcare provider, the client was experiencing painfully slow turnaround times in both its billing and payment processing systems. By properly diagnosing the problem and understanding every solution option, Fortified Data was able to help the client get its daily business back on track.
The Challenge
The Database Couldn’t Keep Up With Daily Workload
Busy healthcare service providers offer a large range of products and services, so the daily demand on their data ecosystem to handle accounting, both accounts payable and accounts receivable, is enormous. For a medical service provider, generating bills for insurers and booking that revenue into the billing system needs to run smoothly.
When billing slows down, so does revenue. Our client was losing millions in revenue because its database systems couldn't keep up with demand.
The maximum number of tasks the system could handle simultaneously was too low to meet business requirements (105,000 transactions per second), so the database simply couldn't keep up with the application's daily workload.
Over time, the database had become overrun and could no longer handle the volume of bills in a designated pay period, which pushed back processing times. The client needed more capacity to handle both the payables going out and the receivables coming back into the application. Month-end was continually pushed to the 3rd and 4th of every month.
This left them continually playing catch up at the end of every month because of the database performance issues.
We knew there was a scalability problem: the database simply couldn't keep up with the workload being asked of it. We had to decide on the best solution: scale up, scale out, or performance-tune the data ecosystem.
Each option had benefits and potential drawbacks, and deciphering which was the best approach meant understanding exactly what the problem was with the existing server.
“Simply replacing the hardware isn’t a one-size-fits-all approach. Each situation needs to be carefully analyzed to determine the correct approach and solution.”
– Jeremy Lowell, CTO Fortified Data
Database Performance Solution
Scale Up – Replacing The Server Hardware
Fortified Data worked to first understand how the server was operating and why it was slow. This led us to consider:
- What should we improve?
- What hardware would work best?
- What are the trade-offs between size and cost, and would a bigger server make a difference?
In order to handle the daily demand, the entire system needed to speed up. We began by thoroughly investigating and comparing processor architectures. By changing the processors, we ended up with much faster bus and memory speeds. We also saw better balance between NUMA nodes and better PCIe options.
Enterprises typically just select a bigger piece of hardware, sometimes at the suggestion of the hardware vendor, but our diagnosis led us to use a smaller one instead.
Results
Overall Increase in Performance
The change improved every metric we tracked: transactions per second, write transactions per second, % CPU utilization, key business function calls per second, key business function duration per call, and peak business transaction class per hour.
As a result, our client could get its pay periods back on track and eliminate the lag in bill payments and the disruption to revenue.
Why the Smaller Server Worked
From our investigations we knew that the database was struggling because of the backplane. The existing DL 980 server we replaced had two 580 boards joined by a backplane. This disrupted communication between the boards and didn't allow the necessary throughput.
By changing to a single server with a single motherboard, with all the processors and memory on that board and direct attachment to the network and storage, we eliminated any backplane involvement and therefore improved overall throughput.
As a result, transaction time decreased and throughput increased.
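One rough way to reason about why removing the backplane helps is a weighted-average latency model: some fraction of accesses stays on the local board, and the rest must cross to the other board. The sketch below is only an illustration. The latency values and the remote fraction are hypothetical placeholders chosen for easy arithmetic, not figures measured on the client's servers.

```python
def effective_latency_ns(local_ns: float, remote_ns: float, remote_fraction: float) -> float:
    """Weighted-average access latency when some accesses must cross boards."""
    if not 0.0 <= remote_fraction <= 1.0:
        raise ValueError("remote_fraction must be between 0 and 1")
    return (1.0 - remote_fraction) * local_ns + remote_fraction * remote_ns


# Hypothetical numbers, for illustration only.
LOCAL_NS = 100.0    # assumed latency for an access that stays on one board
REMOTE_NS = 300.0   # assumed latency for an access that crosses the backplane

# Two boards: assume half of all accesses land on the other board.
two_boards = effective_latency_ns(LOCAL_NS, REMOTE_NS, remote_fraction=0.5)

# Single board: nothing crosses a backplane.
single_board = effective_latency_ns(LOCAL_NS, REMOTE_NS, remote_fraction=0.0)

print(f"Two boards with backplane: {two_boards:.0f} ns average")
print(f"Single board: {single_board:.0f} ns average")
```

Under these assumed numbers the average access is twice as slow on the two-board layout, even though that layout offers more total processors and memory. The real ratio depends on the workload and the hardware and has to be measured.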
Key Takeaway and Learning
Choosing the wrong hardware or server could have made the problem worse. If we had simply replaced the server without understanding exactly which problems we needed to solve, each individual transaction would have slowed down, defeating the purpose and failing to solve the problem.
While it is tempting to choose the bigger, more expensive server, it is dangerous to assume that it will always be a cure-all.
Sometimes less can be more.
Know your system. Seek to understand the problem. Choose the best solution.