When designing solutions for Lenovo server projects, especially for small and mid-sized businesses (SMBs), I frequently see customers over-focus on CPU selection. Questions like “How many cores will it have?” or “What GHz does Turbo Boost reach?” are often prioritized. However, in my experience, the primary factor determining a server’s overall performance and responsiveness under load is a well-planned disk infrastructure, rather than the CPU.
A server becomes efficient not just through processor power, but also through the harmonious operation of memory, network, and most importantly, storage units. A slow disk infrastructure can bottleneck even the most powerful processor, degrading the performance of the entire system. For this reason, our first step in a Lenovo server project is always to thoroughly analyze the storage requirements of the workload.
Understanding the Workload: Disk Needs Before CPU
When starting a server project, our first question is always, “What will this server do?” The follow-up question, “How much disk I/O will this workload generate?” is far more decisive than the number of CPU cores or the clock speed. In most SMB applications, the processor doesn’t continuously run at 100% load; database, file server, and ERP applications, however, constantly require disk access.
For example, an accounting package or ERP system needs the ability to read and write thousands of small data blocks per second (random I/O) far more than it needs raw processor power. For such workloads, fast storage offering high IOPS (Input/Output Operations Per Second) delivers a much larger performance gain than a processor with a few more cores. If the disk infrastructure cannot meet this demand, applications slow down and users experience delays while the processor sits idle.
Consider multiple virtual machines running in a typical virtualization environment. Each virtual machine generates disk I/O with its own operating system and applications. In this scenario, the underlying server’s disk infrastructure must be able to handle this combined I/O load. If the infrastructure is insufficient, each virtual machine might feel like it’s “freezing” internally, even though the server’s CPU utilization might not exceed 20%. This situation arises because the CPU is idle, waiting for data from the disk.
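The “idle CPU waiting on disk” pattern can be measured on a Linux host or guest. The following is a minimal sketch, assuming the standard field order of the aggregate `cpu` line in `/proc/stat`; the two sample lines are made up for illustration, and the function names are my own, not part of any Lenovo or VMware tool.

```python
def parse_cpu_line(line):
    """Return (iowait_ticks, total_ticks) from the aggregate 'cpu' line of /proc/stat."""
    fields = line.split()
    if not fields or fields[0] != "cpu":
        raise ValueError("expected the aggregate 'cpu' line")
    # Field order: user nice system idle iowait irq softirq steal guest guest_nice
    ticks = [int(value) for value in fields[1:]]
    # guest and guest_nice are already counted inside user and nice, so only
    # the first eight counters are summed for the total.
    return ticks[4], sum(ticks[:8])


def iowait_percent(before, after):
    """Share of CPU time spent waiting on I/O between two snapshots, in percent."""
    io_before, total_before = parse_cpu_line(before)
    io_after, total_after = parse_cpu_line(after)
    elapsed = total_after - total_before
    return 100.0 * (io_after - io_before) / elapsed if elapsed > 0 else 0.0


# Hypothetical snapshots taken a few seconds apart (illustrative values only).
snapshot_1 = "cpu 1000 0 500 8000 500 0 0 0 0 0"
snapshot_2 = "cpu 1100 0 550 8400 950 0 0 0 0 0"

print(f"iowait: {iowait_percent(snapshot_1, snapshot_2):.1f}%")
```

On a real system the two snapshots would be the first line of `/proc/stat`, read a few seconds apart. A high iowait share next to low user and system time is the signature described above: the processor has capacity to spare, but the disks cannot feed it.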
RAID Levels and the Intersection of Performance
When planning disks in Lenovo servers, the chosen RAID (Redundant Array of Independent Disks) level is vital for both performance and data security. Each RAID level has different performance characteristics, which must be carefully selected based on the type of workload. My approach in the field is generally to prefer RAID levels that offer the best balance between performance and redundancy.
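For instance, RAID 10 (RAID 1+0) offers both high read/write performance and data redundancy. It is guaranteed to survive one disk failure and can survive up to half the disks failing, as long as no mirrored pair loses both of its members. Its write performance is much better than RAID 5 or RAID 6 because there is no parity to calculate: each write simply goes to both disks of a mirror, a write penalty of 2 compared with the commonly cited 4 for RAID 5 and 6 for RAID 6. This makes it ideal for high-IOPS database servers or virtualization platforms. However, in RAID 10, half of the raw disk capacity is consumed by mirroring, which increases the cost per usable terabyte.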
# Typical RAID controller CLI output (e.g., LSI MegaRAID)
# Example RAID 10 configuration:
MegaCli -LDInfo -Lall -aALL
Adapter #0
Logical Drive #0 (Target Id 0):
RAID Level : Primary-1, Secondary-0, RAID Level Qualifier-0
Size : 1.817 TB
State : Optimal
Strip Size : 256 KB
Number of PDs : 4
Span Depth : 2
Default Cache Policy: WriteBack, ReadAdaptive, Direct, No Write Cache if Bad BBU
Current Cache Policy: WriteBack, ReadAdaptive, Direct, No Write Cache if Bad BBU
Access Policy : Read/Write
Disk Cache Policy : Enabled
Encryption Type : None
On the other hand, RAID 5 and RAID 6 offer more usable capacity but give up some write performance. RAID 5 can tolerate a single disk failure, while RAID 6 can tolerate two. However, because parity must be read, recalculated, and rewritten for every write operation, performance degrades, especially with small random writes. This can create a bottleneck for highly transactional databases. For this reason, I prefer RAID 10 for critical applications and RAID 5 or RAID 6 for file servers with lower I/O intensity.
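To put rough numbers on the difference between RAID levels, here is a minimal sketch using the commonly cited rule-of-thumb write penalties (2 for RAID 10, 4 for RAID 5, 6 for RAID 6). The disk count, per-disk IOPS, and 30% write ratio are assumptions chosen for illustration, and the model ignores controller cache, which can hide much of the penalty in practice.

```python
# Rule-of-thumb back-end I/Os generated by one front-end write.
WRITE_PENALTY = {"raid0": 1, "raid10": 2, "raid5": 4, "raid6": 6}


def effective_iops(level, disks, iops_per_disk, write_fraction):
    """Estimate front-end IOPS for a RAID group under a mixed read/write load."""
    if not 0.0 <= write_fraction <= 1.0:
        raise ValueError("write_fraction must be between 0 and 1")
    raw_iops = disks * iops_per_disk
    read_fraction = 1.0 - write_fraction
    # Each read costs one back-end I/O, each write costs 'penalty' back-end I/Os.
    return raw_iops / (read_fraction + write_fraction * WRITE_PENALTY[level])


# Illustrative assumption: 8 HDDs at 150 IOPS each, 30% writes.
for level in ("raid10", "raid5", "raid6"):
    estimate = effective_iops(level, disks=8, iops_per_disk=150, write_fraction=0.3)
    print(f"{level}: about {estimate:.0f} IOPS")
```

The same eight disks deliver noticeably fewer usable IOPS under parity RAID as the write share grows, which is why the write ratio of the workload is worth measuring before a RAID level is chosen.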
The Role of Disk Types and Technology: From HDD to NVMe
Today, the disk technologies we can use in Lenovo servers are quite diverse, each with its unique advantages and disadvantages. Choosing the right disk type directly impacts the server’s expected performance. For me, this choice always involves striking a balance between workload requirements and budget.
Traditional HDDs (Hard Disk Drives) offer high capacity at low cost but lag far behind SSDs in IOPS and latency. They are generally preferred where capacity matters more than access speed, such as archival storage, large file servers, or backup targets. Enterprise-class SAS HDDs offer higher IOPS and reliability than SATA HDDs.
SSDs (Solid State Drives) offer much higher IOPS and lower latency than HDDs and, having no moving parts, are less prone to mechanical failure. SATA SSDs are significantly faster than HDDs and provide sufficient performance for most general-purpose server workloads. SAS SSDs offer higher performance, reliability, and dual-port support in enterprise environments, making them suitable for more critical applications. NVMe (Non-Volatile Memory Express) SSDs, using the PCIe interface, deliver the highest IOPS and the lowest latency. NVMe SSDs are indispensable for ultra-high performance requirements such as databases, big data analytics, or artificial intelligence workloads.
Typical comparison table (representative values):
| Disk Type | Average IOPS (Read) | Average Latency (ms) | Cost/GB (Relative) | Use Case |
|---|---|---|---|---|
| HDD (7.2K RPM) | 80-150 | 5-10 | Low | Archive, large files |
| HDD (10K RPM) | 120-200 | 3-6 | Medium-Low | General file server |
| SAS SSD | 50,000-100,000 | 0.1-0.2 | High | Enterprise DB, virtualization |
| NVMe SSD | 300,000-1,000,000+ | 0.02-0.05 | Very High | HPC, AI, ultra-critical DB |
Capacity Planning and a Future-Oriented Approach
In server disk planning, capacity is a strategic decision: it must cover not only today’s needs but also future growth. In my experience, estimating capacity carefully and leaving a “growth buffer” up front costs much less and creates fewer operational difficulties in the long run than expanding storage later.
When we set up a monitoring infrastructure with tools like Grafana and InfluxDB to track a system’s disk usage, we can clearly see capacity trends. For example, if we determine that a file server is growing at an average rate of 2-3% per month, it’s important to equip the server with at least 2-3 years of growth allowance when it’s first deployed. Otherwise, early capacity issues will arise, and the cost of expansion can be much higher than the initial additional cost.
Let’s take an example: Suppose an SMB’s ERP system needs 2 TB of usable disk space and is expected to grow by 15% per year. Because RAID 10 yields half of the raw capacity, 4x1TB disks would cover exactly today’s 2 TB and nothing more. After three years of 15% compound growth, the need rises to approximately 3 TB (2 TB × 1.15³ ≈ 3.04 TB). Starting with 4x2TB disks, which provide 4 TB of usable space, would therefore both meet the current need and comfortably last for 3-4 years. This type of calculation, even with “approximate” values, significantly impacts the server’s lifecycle cost.
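The arithmetic in this example can be reproduced with a short sketch. It assumes simple compound annual growth and nominal (decimal) disk sizes; real formatted capacity is somewhat lower, and the function names are illustrative only.

```python
import math


def raid10_usable_tb(disks, disk_tb):
    """Usable capacity of a RAID 10 group: half of the raw capacity."""
    if disks < 4 or disks % 2 != 0:
        raise ValueError("RAID 10 needs an even number of disks, at least 4")
    return disks * disk_tb / 2


def projected_need_tb(current_tb, annual_growth, years):
    """Capacity needed after the given number of years of compound growth."""
    return current_tb * (1 + annual_growth) ** years


def years_until_full(current_tb, usable_tb, annual_growth):
    """Years until the data set outgrows the usable capacity."""
    return math.log(usable_tb / current_tb) / math.log(1 + annual_growth)


need_now_tb = 2.0   # usable space the ERP system needs today
growth = 0.15       # expected annual growth

print(f"Need after 3 years: {projected_need_tb(need_now_tb, growth, 3):.2f} TB")
for disks, disk_tb in ((4, 1.0), (4, 2.0)):
    usable = raid10_usable_tb(disks, disk_tb)
    runway = years_until_full(need_now_tb, usable, growth)
    print(f"{disks} x {disk_tb:g} TB in RAID 10: {usable:g} TB usable, "
          f"full after about {runway:.1f} years")
```

Under these assumptions the 4x1TB option has no runway at all, while the 4x2TB option only fills up after close to five years, which leaves a margin beyond the 3-4 years planned for.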
# Example of checking disk usage in Linux
df -h
Filesystem Size Used Avail Use% Mounted on
/dev/sda1 200G 50G 140G 27% /
/dev/sdb1 1.8T 900G 800G 53% /mnt/data
/dev/sdc1 5.0T 3.0T 2.0T 60% /mnt/backup
Even this simple command allows us to quickly see current disk usage and remaining free space. By regularly monitoring and collecting this data to analyze trends, we can more accurately predict future capacity needs.
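As a rough illustration of trend analysis on collected samples, the sketch below fits a straight line through periodic usage readings and estimates how many days remain until the volume is full. The history values are hypothetical, the linear model is an assumption (real growth may be seasonal or compound), and in practice the samples would come from a monitoring stack rather than a hard-coded list.

```python
import shutil


def used_fraction(path="/"):
    """Current used share of the filesystem that contains 'path' (0.0 to 1.0)."""
    usage = shutil.disk_usage(path)
    return usage.used / usage.total


def days_until_full(samples, capacity_gb):
    """Estimate days until full from (day_number, used_gb) samples.

    Uses a least-squares slope through the samples and extrapolates from the
    most recent reading. Returns None if usage is flat or shrinking.
    """
    if len(samples) < 2:
        raise ValueError("need at least two samples")
    n = len(samples)
    mean_day = sum(day for day, _ in samples) / n
    mean_used = sum(used for _, used in samples) / n
    sxx = sum((day - mean_day) ** 2 for day, _ in samples)
    sxy = sum((day - mean_day) * (used - mean_used) for day, used in samples)
    if sxx == 0:
        raise ValueError("samples must span more than one day")
    slope_gb_per_day = sxy / sxx
    if slope_gb_per_day <= 0:
        return None
    _, last_used = samples[-1]
    return (capacity_gb - last_used) / slope_gb_per_day


# Hypothetical history for a data volume: one reading every 30 days.
history = [(0, 840), (30, 860), (60, 880), (90, 900)]
print(f"Root filesystem is {used_fraction('/'):.0%} full right now")
print(f"Estimated days until full: {days_until_full(history, capacity_gb=1700):.0f}")
```

With these made-up readings the volume grows by about 20 GB per month, so the remaining 800 GB would last roughly 1,200 days; the value of the exercise is in rerunning it as new samples arrive.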
Firmware Discipline and Hardware Compatibility
For the disk infrastructure in Lenovo servers to operate smoothly and deliver its full performance, simply choosing the right hardware is not enough. Firmware discipline and hardware compatibility are, for me, as critical as the hardware selection itself. Many “strange” performance drops or random system freezes we encounter in the field are due to outdated firmware or driver incompatibilities.
A RAID controller, HBA (Host Bus Adapter), or even the disks’ own firmware should be at the current level specified by the manufacturer. Lenovo regularly releases firmware updates, and these updates typically improve performance, fix known bugs, and ensure compatibility with new operating systems or hardware. Applying these updates is essential for maintaining a server’s stability and efficiency throughout its lifecycle.
# Example of checking the RAID controller driver version on VMware ESXi
# These commands may vary depending on the server model and RAID controller manufacturer.
# Example: For an LSI-based MegaRAID controller
esxcli software vib list | grep lsi
# The output lists the installed driver packages, including the version of the lsi_mr3 driver.
# The firmware version itself is usually shown in the controller's BIOS setup utility or the server's management web interface.
# This information can also be centrally monitored via Lenovo XClarity Controller (XCC).
If we are using a server with Windows Server or a Linux distribution, we must ensure that the RAID controller drivers are fully compatible with the operating system and up-to-date. Old or incorrect drivers can lead to delays in disk access, data corruption, or unexpected system crashes. At ITWISE, we have made it a standard practice to always download and apply the latest firmware and driver packages from Lenovo’s support site before deploying a new server. This is the most effective way to prevent potential problems before they arise.
Disk Infrastructure for Backup and Recovery Processes
Disk infrastructure planning directly impacts not only daily operational performance but also the effectiveness of disaster recovery (DR) and backup processes. For me, when designing a server’s storage solution, the questions “How and how quickly will we back up this system?” and “How quickly can we restore data in the event of a disaster?” are key. When working with solutions like Acronis, the impact of disk infrastructure on these processes is very evident.
A high-performance disk infrastructure significantly shortens our backup windows. Especially for systems with large datasets or those that need to operate 24/7, completing the backup process as quickly as possible is critically important. If the source disks are slow, the backup process can take hours, which risks our RPO (Recovery Point Objective) targets and increases the load on the server. Similarly, data restoration after a disaster also depends on disk speed. Fast storage helps us achieve our RTO (Recovery Time Objective) goals.
For example, let’s consider taking a full image backup of a 1 TB server. If the server has a 4xSAS SSD RAID 10 configuration, the backup can typically be completed within 2-3 hours. However, if the same server is running on 4xSATA HDD RAID 5, this time could extend to 5-6 hours, or even longer. This difference creates a significant operational challenge, especially in scenarios where backups need to be performed during weekday business hours. A similar effect is seen during recovery; systems come back online much faster on fast disks.
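The backup windows above translate directly into sustained throughput, which is a convenient figure to compare against RPO and RTO targets. This is a minimal sketch assuming decimal units (1 TB = 1,000,000 MB) and a constant end-to-end rate; in practice the rate also depends on the network, the backup target, and the backup software, not only on the source disks.

```python
MB_PER_TB = 1_000_000  # decimal units, as used by drive vendors


def implied_throughput_mb_s(size_tb, hours):
    """Sustained end-to-end rate needed to move 'size_tb' within 'hours'."""
    return size_tb * MB_PER_TB / (hours * 3600)


def backup_hours(size_tb, throughput_mb_s):
    """Duration of a full backup at a constant sustained rate."""
    return size_tb * MB_PER_TB / throughput_mb_s / 3600


# Backup windows quoted in the text for a 1 TB full image backup.
for label, hours in (("SSD RAID 10, 2 h", 2), ("SSD RAID 10, 3 h", 3),
                     ("HDD RAID 5, 5 h", 5), ("HDD RAID 5, 6 h", 6)):
    rate = implied_throughput_mb_s(1, hours)
    print(f"{label}: about {rate:.0f} MB/s sustained")

# The reverse question: how long does a 1 TB restore take at an assumed 100 MB/s?
print(f"Restore at 100 MB/s: {backup_hours(1, 100):.1f} hours")
```

Working backwards from the RTO in this way shows early on whether a planned disk configuration can realistically meet the recovery target.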
Conclusion: Disk Planning is the Heart of the Server
I understand that focusing on the CPU when choosing a Lenovo server is a natural inclination. However, my experience at ITWISE and our practical applications in the field show that the real-world performance and business continuity of a server are determined by comprehensive and correctly executed disk infrastructure planning. If a server’s processor is its brain, its disk infrastructure is its heart; and when the heart doesn’t function correctly, no matter how powerful the brain, the entire system falters.