How to Detect Hard Disk Sector Size and Physical Geometry with Microbenchmarking Tools
When working with storage systems, databases, or kernel-level I/O optimization, understanding your hard disk's actual physical geometry is critical. Many modern drives report a 512-byte logical sector size while using 4096-byte physical sectors internally, and a mismatch between the two can lead to performance issues and alignment problems. This guide walks you through discovering true disk geometry using microbenchmarking techniques.
Why Physical Disk Geometry Matters
Operating systems and applications typically rely on advertised disk parameters, but these often don't reflect actual hardware behavior. Hard disk drives use internal buffers, caching strategies, and zone-bit recording that affect performance patterns. When you microbenchmark sequential access patterns, you can infer the true underlying geometry.
Understanding physical geometry helps you:
- Optimize partition alignment for better performance
- Predict performance degradation patterns
- Debug I/O bottlenecks in storage-intensive applications
- Make informed decisions about database block sizes
Common Issues with Reported Disk Parameters
Most tools report logical sector sizes (512 or 4096 bytes), but drives often have different physical characteristics:
- Logical vs. Physical sectors: A 4K physical sector is often exposed as eight 512-byte logical sectors (512e), so writes that are not 4K-aligned force a read-modify-write inside the drive
- Zone bit recording: Different tracks have different data densities
- Cylinder/head arrangements: Modern drives hide these, but performance cliffs still exist
- Buffer effects: Intelligent caching masks true sequential performance
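Before benchmarking, it helps to record what the drive advertises so there is a baseline to compare against. On Linux the kernel exposes both sector sizes under sysfs. The sketch below reads them; the function names `read_sector_sizes` and `describe_format` and the `sysfs_root` parameter are illustrative choices for this guide, not part of any existing tool.

```python
from pathlib import Path


def read_sector_sizes(device_name, sysfs_root="/sys/block"):
    """Return (logical, physical) sector sizes in bytes as reported by the kernel.

    device_name is the bare kernel name, e.g. "sdb" (not "/dev/sdb").
    sysfs_root is a parameter only so the function can be pointed at a
    test directory; on a real system leave it at the default.
    """
    queue_dir = Path(sysfs_root) / device_name / "queue"
    logical = int((queue_dir / "logical_block_size").read_text().strip())
    physical = int((queue_dir / "physical_block_size").read_text().strip())
    return logical, physical


def describe_format(logical, physical):
    """Classify the drive by its advertised logical/physical sector sizes."""
    if logical == 512 and physical == 512:
        return "512n (native 512-byte sectors)"
    if logical == 512 and physical == 4096:
        return "512e (4K physical sectors with 512-byte emulation)"
    if logical == 4096 and physical == 4096:
        return "4Kn (native 4K sectors)"
    return f"other (logical={logical}, physical={physical})"
```

Calling `describe_format(*read_sector_sizes("sdb"))` on the test machine gives the advertised format. These are still the values the drive chooses to report, so treat them as a starting point rather than ground truth.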
Setting Up Microbenchmarking on Linux
Prerequisites
You'll need tools for measuring I/O timing with nanosecond precision:
# Install required tools
sudo apt-get install blktrace fio sysbench linux-tools-generic  # blkparse ships inside the blktrace package
# Optional: build the latest fio from source for detailed timing analysis
git clone https://github.com/axboe/fio.git
cd fio
./configure && make && sudo make install
Measuring Sequential Access Patterns
Create a microbenchmark script that tests sequential reads at various block sizes to identify geometry:
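#!/bin/bash
# disk_geometry_test.sh
# Sequential-read sweep over block sizes. Run as root on an idle, unmounted disk.
DISK="/dev/sdb" # Change to your test disk
OUTPUT="geometry_results.txt"
echo "Testing disk geometry on $DISK" > "$OUTPUT"
# Test various block sizes from 512B to 1MB
# (with direct I/O, sizes below the logical sector size fail, e.g. 512 on a 4Kn drive)
for blocksize in 512 1024 2048 4096 8192 16384 32768 65536 131072 262144 524288 1048576; do
    # printf interprets \n; plain echo in bash does not
    printf '\n--- Block Size: %s bytes ---\n' "$blocksize" >> "$OUTPUT"
    # --direct=1 bypasses the page cache; --readonly guards against accidental writes
    fio --name=seq_read \
        --filename="$DISK" \
        --readonly \
        --direct=1 \
        --rw=read \
        --bs="$blocksize" \
        --size=100M \
        --numjobs=1 \
        --runtime=10 \
        --group_reporting \
        --output-format=normal >> "$OUTPUT" 2>&1
done
cat "$OUTPUT"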
Identifying Performance Cliffs
When you plot the throughput results against block size, you'll notice distinct performance plateaus and cliffs:
| Block Size | Expected Pattern | Indicates |
|------------|------------------|-----------|
| 512 B - 2K | Lower throughput | Sub-optimal alignment |
| 4K - 32K | Steady increase | Zone recording transitioning |
| 64K - 256K | Peak throughput | Optimal track buffer size |
| 256K+ | Plateau or drop | Exceeding physical buffer |
The inflection points reveal the actual physical sector organization and buffer boundaries.
Advanced Microbenchmarking with Seek Patterns
Beyond sequential reads, seek patterns expose cylinder geometry:
#!/bin/bash
# seek_pattern_test.sh
DISK="/dev/sdb"
# Test random access confined to LBA ranges of increasing size.
# The span bounds the maximum seek distance; it must not exceed the disk capacity.
echo "Testing seek distances..."
for span in 4M 64M 1G 16G; do
    printf '\nSeek span: %s\n' "$span"
    fio --name=random_seek \
        --filename="$DISK" \
        --readonly \
        --direct=1 \
        --rw=randread \
        --bs=4096 \
        --offset=0 \
        --size="$span" \
        --iodepth=1 \
        --numjobs=1 \
        --time_based \
        --runtime=30 \
        --group_reporting
done
Confining random reads to a limited span bounds the head travel, so the latency growth from one span to the next correlates with physical head movement distance. Spans smaller than the drive's cache may be served without any seek at all, so read the smallest spans with that in mind.
Windows-Specific Microbenchmarking
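On Windows, use CrystalDiskMark or DiskSpd for geometry analysis:
# Using DiskSpd for geometry testing (run from an elevated prompt)
$diskid = "1" # Physical disk number
$blocksize = 4096
# Download DiskSpd: https://github.com/microsoft/diskspd
# Arguments are quoted so PowerShell expands the variables; "#N" targets physical drive N
.\DiskSpd.exe "-b$blocksize" -t1 -o16 -d30 -Sh -L "#$diskid"
Repeat the run with different block sizes and watch for throughput changes at specific sizes; these can indicate alignment or buffer boundaries. Zone transitions show up instead as throughput changes across LBA offsets, since outer zones hold more data per track.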
Interpreting Microbenchmark Results
Performance Signatures
Sequential throughput plateaus often align with:
- 64KB-256KB: Common buffer sizes for 7200 RPM drives
- 256KB-1MB: Typical for high-capacity enterprise drives
- Drops at 1MB+: Indicates multi-track access penalties
Latency Patterns
When latency jumps at specific seek distances, compare the size of the jump with typical mechanical delays:
- Around 1 ms: Track-to-track seeks (head movement to an adjacent track)
- Up to about 8 ms on a 7200 RPM drive: Rotational latency (one full revolution takes 60/7200 s, roughly 8.3 ms)
- 10-20 ms: Long or full-stroke seeks across the platter
- 100 ms+: Error recovery, retries, or spin-up rather than a normal geometry effect
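The rotational part of any latency figure can be sanity-checked with simple arithmetic, because the time for one platter revolution follows directly from the spindle speed. A minimal sketch, with illustrative function names:

```python
def revolution_ms(rpm):
    """Time for one full platter revolution, in milliseconds."""
    if rpm <= 0:
        raise ValueError("rpm must be positive")
    return 60_000.0 / rpm


def average_rotational_latency_ms(rpm):
    """Expected wait for a randomly placed sector: half a revolution on average."""
    return revolution_ms(rpm) / 2.0


# Common spindle speeds; print the bounds to compare against measured latencies.
for rpm in (5400, 7200, 10000, 15000):
    print(f"{rpm:>5} RPM: full revolution {revolution_ms(rpm):5.2f} ms, "
          f"average rotational latency {average_rotational_latency_ms(rpm):4.2f} ms")
```

At 7200 RPM a full revolution takes about 8.33 ms, so a latency step much larger than that cannot be explained by rotation alone.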
Automated Geometry Detection Tool
Here's a Python script that automates detection and reports findings:
import json
import subprocess
import sys


def detect_disk_geometry(device):
    """Microbenchmark sequential reads at several block sizes and flag large throughput changes."""
    throughputs = []
    block_sizes = [512, 1024, 4096, 8192, 16384, 32768, 65536, 131072, 262144, 524288, 1048576]
    for bs in block_sizes:
        cmd = [
            'fio', '--name=test',
            f'--filename={device}',
            '--readonly',   # refuse to issue writes to the raw device
            '--direct=1',   # bypass the page cache so the drive itself is measured
            '--rw=read',
            f'--bs={bs}',
            '--size=50M',
            '--numjobs=1',
            '--output-format=json',
        ]
        try:
            output = subprocess.run(cmd, capture_output=True, text=True, timeout=60, check=True)
            data = json.loads(output.stdout)
            iops = data['jobs'][0]['read']['iops']
            throughput = (iops * bs) / (1024 * 1024)  # MiB/s
            throughputs.append({'block_size': bs, 'throughput': throughput})
            print(f"Block size {bs:7d}B: {throughput:8.2f} MiB/s")
        except Exception as e:
            # Covers fio failures, timeouts, and unparsable output
            print(f"Error testing block size {bs}: {e}")
            continue
    # Find performance cliffs between consecutive successful measurements
    for i in range(1, len(throughputs)):
        previous = throughputs[i - 1]['throughput']
        if previous == 0:
            continue  # avoid division by zero
        diff_percent = (throughputs[i]['throughput'] - previous) / previous * 100
        if abs(diff_percent) > 20:  # >20% change is treated as a possible geometry boundary
            print(f"\n⚠️ Performance cliff at {throughputs[i]['block_size']}B ({diff_percent:+.1f}%)")
    return throughputs


if __name__ == '__main__':
    device = sys.argv[1] if len(sys.argv) > 1 else '/dev/sdb'
    detect_disk_geometry(device)
Best Practices for Accurate Results
- Control caching: Use hdparm -F /dev/sdb to flush the drive's write cache before each run
- Perform multiple runs: Timing variations average out with 3-5 iterations
- Use dedicated test environments: Background I/O skews results significantly
- Test both cold and warm states: Latency differs when the drive must spin up from standby compared with when the spindle is already spinning
- Monitor temperature: Drive thermal throttling affects benchmarks
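Repeated runs are only useful if they are summarized consistently. The sketch below reduces several throughput measurements of the same configuration to a median and a coefficient of variation. The function name `summarize_runs`, the 5% noise threshold, and the sample numbers are assumptions made for illustration, not measurements from a real drive.

```python
import statistics


def summarize_runs(throughputs_mb_s):
    """Summarize repeated measurements of one benchmark configuration.

    Returns the median, the sample standard deviation, and the coefficient
    of variation (stdev / mean) so noisy configurations are easy to spot.
    """
    if len(throughputs_mb_s) < 2:
        raise ValueError("need at least two runs to estimate variation")
    median = statistics.median(throughputs_mb_s)
    mean = statistics.fmean(throughputs_mb_s)
    stdev = statistics.stdev(throughputs_mb_s)
    cv = stdev / mean if mean else float("inf")
    return {"median": median, "stdev": stdev, "cv": cv}


# Hypothetical numbers for illustration only; the last run was disturbed.
runs = [148.2, 151.0, 149.5, 150.3, 97.4]
summary = summarize_runs(runs)
print(summary)
if summary["cv"] > 0.05:  # assumed threshold; tune it for your environment
    print("High run-to-run variation; check for background I/O before trusting the result.")
```

The median is used instead of the mean because a single disturbed run, like the last value here, would otherwise pull the reported figure down noticeably.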
Common Pitfalls to Avoid
- Testing on mounted filesystems: Use raw device access (/dev/sdb) not mount points
- Insufficient test duration: Use at least 10-30 second runs for stable numbers
- Ignoring warm-up effects: First 100 operations often show different behavior
- Testing across system load: Run benchmarks on idle systems for consistency
Next Steps
Once you've identified physical geometry, optimize your systems:
- Align partitions to detected boundary sizes
- Configure database block sizes to match physical sectors
- Adjust kernel I/O schedulers based on actual geometry
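Partition alignment in particular is easy to verify with arithmetic. On Linux, the start of a partition is available from /sys/block/<disk>/<partition>/start, expressed in 512-byte units. The sketch below checks such a start value against a boundary size; `is_aligned` is an illustrative helper, and the start sectors used are well-known defaults rather than values read from a real disk.

```python
def is_aligned(start_sector, boundary_bytes, sector_unit=512):
    """Return True if a partition start falls on a multiple of boundary_bytes.

    start_sector is in sector_unit-byte units (512 for the sysfs 'start' file).
    boundary_bytes is the boundary to check, e.g. the physical sector size
    or a larger boundary found by benchmarking.
    """
    if boundary_bytes <= 0:
        raise ValueError("boundary_bytes must be positive")
    return (start_sector * sector_unit) % boundary_bytes == 0


# 63 is the legacy DOS-era start sector; 2048 (1 MiB) is the common modern default.
for start in (63, 2048):
    for boundary in (4096, 1024 * 1024):
        status = "aligned" if is_aligned(start, boundary) else "MISALIGNED"
        print(f"start sector {start:>5} vs {boundary:>8}-byte boundary: {status}")
```

A partition starting at sector 63 begins at byte 32256, which is not a multiple of 4096, so every 4K filesystem block on it straddles two physical sectors.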
For continuous monitoring of disk behavior, consider tools like Grafana with disk-specific metrics collectors to track performance over time.