Measure Hard Disk Geometry with Sequential Read Benchmarking on Linux
When optimizing storage performance or debugging I/O bottlenecks, understanding your hard disk's actual physical geometry (cylinder count, head count, and sectors per track) becomes essential. While modern systems abstract these details away, the underlying physics still impacts real-world performance characteristics.
This guide walks you through using sequential read microbenchmarking to reverse-engineer your disk's physical geometry on Linux, enabling you to make informed decisions about partition alignment, workload optimization, and performance tuning.
Why Disk Geometry Matters for Performance Tuning
Disk geometry directly affects:
- Seek time consistency: Understanding cylinder layout helps predict access patterns
- Partition alignment: Misaligned partitions cause unnecessary head movements
- Workload optimization: Knowing sector layout lets you optimize hot data placement
- Capacity calculations: Physical geometry reveals actual usable capacity vs. marketed capacity
Modern SSDs obscure geometry entirely, but traditional HDDs still exhibit measurable physical characteristics that influence performance under sustained workloads.
Sequential Read Benchmarking Methodology
Sequential reads expose disk geometry because:
- Track boundaries are detectable: Seek time spikes occur when the head moves to a new cylinder
- Sector organization is consistent: Regular patterns emerge across multiple sequential passes
- Performance plateaus reveal limits: Maximum sustained throughput matches the geometry's rotational constraints
Step 1: Prepare Your Test Environment
Ensure no other processes access the disk during testing:
# Check disk activity
iostat -x 1
# Unmount test partition (if not root filesystem)
sudo umount /dev/sdX1
# Verify exclusive access
sudo fuser -m /dev/sdX
Target a dedicated test partition or an external USB drive to avoid corrupting active data.
Step 2: Create Sequential Read Benchmark
Use dd with specific block sizes to measure throughput at different granularities:
# Test 1MB blocks across entire disk
dd if=/dev/sdX of=/dev/null bs=1M count=10000 iflag=direct 2>&1 | grep -E "bytes|copied"
# Test 4KB blocks (sector-aligned)
dd if=/dev/sdX of=/dev/null bs=4K count=100000 iflag=direct 2>&1 | grep -E "bytes|copied"
# Test 64KB blocks (track-sized on many drives)
dd if=/dev/sdX of=/dev/null bs=64K count=50000 iflag=direct 2>&1 | grep -E "bytes|copied"
The iflag=direct flag bypasses the kernel page cache, so dd measures actual disk performance rather than cached reads.
Step 3: Analyze Performance Variations
Performance variations indicate geometry boundaries:
| Block Size | Expected Pattern | Interpretation |
|-----------|-----------------|----------------|
| 4KB | Lowest throughput | Single-sector reads |
| 64KB | Moderate improvement | Multi-sector within track |
| 256KB+ | Plateau or slight variance | Multiple tracks per head pass |
| 1MB+ | Stable high throughput | Natural alignment with cylinder boundaries |
When throughput plateaus, you've exceeded the physical track size on that cylinder.
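The plateau heuristic above can be sketched as a small awk filter over your measured results. The block sizes and throughput numbers below are hypothetical placeholders (substitute your own dd measurements), and the 5% improvement threshold is an arbitrary choice:

```shell
# Hypothetical measurements: one "block_size throughput_MBps" pair per line.
cat > results.txt <<'EOF'
4K 45
64K 120
256K 175
1M 180
EOF

# Report the first block size whose throughput gain over the previous
# measurement falls below 5% -- the plateau point.
PLATEAU=$(awk 'NR>1 && prev>0 && ($2-prev)/prev < 0.05 {print "Plateau reached at block size " $1} {prev=$2}' results.txt)
echo "$PLATEAU"
```

With the sample numbers above, the 1M row is flagged because its gain over 256K is under 5%.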
Step 4: Measure Seek Time Patterns
Seek spikes reveal cylinder boundaries:
# Get a baseline with hdparm (-t times raw sequential device reads)
sudo hdparm -t /dev/sdX
# For detailed random access patterns:
# Install fio benchmark tool
sudo apt-get install fio
# Create random access workload
fio --name=random-4k --ioengine=libaio --iodepth=4 --rw=randread \
--bs=4k --direct=1 --numjobs=1 --runtime=30 --filename=/dev/sdX \
--output=random-test.log
Analyze the latency distribution—spikes correspond to head movements.
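One way to spot those spikes is to post-process an fio latency log (written with --write_lat_log=prefix). The sample log below is fabricated for illustration, and the 3x-average cutoff is an arbitrary heuristic; recent fio versions log latency in nanoseconds, in fields of roughly the form time_ms, latency, direction, block size, offset:

```shell
# Fabricated latency log standing in for an fio *_lat.1.log file.
cat > sample_lat.1.log <<'EOF'
10, 180000, 0, 4096, 0
20, 190000, 0, 4096, 4096
30, 2400000, 0, 4096, 8192
40, 185000, 0, 4096, 12288
EOF

# Flag samples whose latency exceeds 3x the average -- candidate head movements.
SPIKES=$(awk -F', ' '
  {sum += $2; n++; lat[n] = $2; t[n] = $1}
  END {
    avg = sum / n
    for (i = 1; i <= n; i++) if (lat[i] > 3 * avg) printf "Spike at t=%sms: %d ns (avg %.0f ns)\n", t[i], lat[i], avg
  }' sample_lat.1.log)
echo "$SPIKES"
```

In the sample data, only the 2.4 ms read at t=30 is flagged; on real logs, clusters of such spikes at regular LBA intervals suggest cylinder boundaries.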
Interpreting Results for Physical Geometry
Once you have benchmark data:
- Peak throughput ÷ rotational speed (in rotations per second) ≈ bytes per rotation
- Bytes per rotation ÷ physical sector size (4,096 bytes on modern Advanced Format drives) ≈ sectors per track
- Seek time spikes occur at boundaries between cylinders
For a 7200 RPM drive showing 180 MB/s peak throughput:
- Bytes per rotation: (180 MB/s) × (60s / 7200 rotations) ≈ 1.5 MB
- Sectors per track: 1,500,000 bytes / 4,096 bytes ≈ 366 sectors
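That arithmetic can be checked with shell integer math, using the same hypothetical 180 MB/s, 7200 RPM figures:

```shell
# Hypothetical inputs: measured peak throughput and drive rotational speed.
THROUGHPUT_MBPS=180
RPM=7200
SECTOR_BYTES=4096

# seconds per rotation = 60 / RPM, so
# bytes per rotation = (bytes per second) * 60 / RPM
BYTES_PER_ROT=$(( THROUGHPUT_MBPS * 1000000 * 60 / RPM ))
SECTORS_PER_TRACK=$(( BYTES_PER_ROT / SECTOR_BYTES ))

echo "Bytes per rotation: $BYTES_PER_ROT"     # 1500000 (~1.5 MB)
echo "Sectors per track:  $SECTORS_PER_TRACK" # 366
```

Real drives use zoned bit recording, so sectors per track varies from outer to inner cylinders; this gives a figure for the zone where you measured peak throughput.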
Common Pitfalls When Benchmarking
Mistake 1: Caching interference
Solution: Always use iflag=direct with dd, or disable system cache:
sync; echo 3 | sudo tee /proc/sys/vm/drop_caches
Mistake 2: Testing mounted filesystems
Solution: Unmount the partition first, or test a different disk entirely.
Mistake 3: Insufficient data volume
Solution: Run benchmarks across at least 10GB of disk space to capture full geometry patterns.
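A quick way to size the runs is to compute the dd count value that covers at least 10 GiB at each block size:

```shell
# dd 'count' needed to read 10 GiB at each block size (bytes),
# so every pass spans enough of the disk to expose geometry patterns.
TARGET_BYTES=$(( 10 * 1024 * 1024 * 1024 ))
for BS_BYTES in 4096 65536 1048576; do
  echo "bs=${BS_BYTES} -> count=$(( TARGET_BYTES / BS_BYTES ))"
done
```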
Mistake 4: Background I/O
Solution: Boot into single-user mode or test during off-hours:
sudo systemctl isolate rescue.target
Advanced: Automated Geometry Detection
For repeated testing across multiple drives, create a benchmarking script:
#!/bin/bash
# Usage: ./geometry-bench.sh /dev/sdX
DISK="$1"
echo "Testing disk: $DISK"
for BS in 4K 64K 256K 1M; do
    RESULT=$(dd if="$DISK" of=/dev/null bs="$BS" count=50000 iflag=direct 2>&1 | grep copied)
    # GNU dd prints the transfer rate as the last two fields, e.g. "190 MB/s"
    THROUGHPUT=$(echo "$RESULT" | awk '{print $(NF-1), $NF}')
    echo "Block size: $BS - Throughput: $THROUGHPUT"
done
Using Results for Real-World Optimization
Once you understand your disk geometry:
- Align partitions: Use partition tools that respect discovered sector boundaries
- Optimize application I/O: Request block sizes matching detected track size
- Tune filesystem parameters: Set the block size in mkfs operations to match the physical layout
- Predict latency: Model access patterns knowing actual seek distances
Conclusion
Microbenchmarking reveals hidden disk geometry that affects real-world I/O performance. While modern abstractions hide these details, developers working with high-performance storage systems benefit from understanding the physical reality beneath the software interface. Sequential benchmarking on Linux provides a straightforward method to reverse-engineer this geometry without specialized tools.
For production systems, this knowledge enables informed optimization decisions that can improve throughput by 5-15% through proper alignment and workload placement.