How to Detect Hard Disk Sector Size and Physical Geometry with Microbenchmarking Tools

When working with storage systems, databases, or kernel-level I/O optimization, understanding your hard disk's actual physical geometry is critical. Modern drives often report a 512-byte logical sector size while using 4096-byte physical sectors internally (the 512e "Advanced Format" arrangement), and the mismatch leads to performance issues and alignment problems. This guide walks you through discovering true disk geometry using microbenchmarking techniques.

Why Physical Disk Geometry Matters

Operating systems and applications typically rely on advertised disk parameters, but these often don't reflect actual hardware behavior. Hard disk drives use internal buffers, caching strategies, and zone-bit recording that affect performance patterns. When you microbenchmark sequential access patterns, you can infer the true underlying geometry.

Understanding physical geometry helps you:

  • Optimize partition alignment for better performance
  • Predict performance degradation patterns
  • Debug I/O bottlenecks in storage-intensive applications
  • Make informed decisions about database block sizes

Common Issues with Reported Disk Parameters

Most tools report logical sector sizes (512 or 4096 bytes), but drives often have different physical characteristics:

  • Logical vs. Physical sectors: A 4K physical sector is often exposed as eight 512-byte logical sectors, so a write that does not cover a whole physical sector forces a read-modify-write
  • Zone bit recording: Different tracks have different data densities
  • Cylinder/head arrangements: Modern drives hide these, but performance cliffs still exist
  • Buffer effects: Intelligent caching masks true sequential performance
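
Before benchmarking, it helps to record what the drive itself reports. On Linux the kernel exposes both sizes through sysfs. The following is a minimal sketch, assuming the standard /sys/block/<device>/queue layout; the function name and the sysfs_root parameter are illustrative and not part of any existing tool.

```python
from pathlib import Path


def read_reported_sector_sizes(device, sysfs_root="/sys/block"):
    """Return the sector sizes (in bytes) the kernel reports for a block device.

    `device` is a bare name such as "sdb" (no /dev/ prefix).
    """
    queue = Path(sysfs_root) / device / "queue"
    logical = int((queue / "logical_block_size").read_text().strip())
    physical = int((queue / "physical_block_size").read_text().strip())
    return {
        "logical": logical,
        "physical": physical,
        # How many logical sectors map onto one physical sector
        "logical_per_physical": physical // logical,
    }
```

Calling read_reported_sector_sizes("sdb") on a 512e drive would be expected to show a logical size of 512 and a physical size of 4096. These are still reported values, so they are a baseline to compare benchmark results against, not a measurement.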

Setting Up Microbenchmarking on Linux

Prerequisites

You'll need tools for measuring I/O timing with nanosecond precision:

# Install required tools (blkparse ships inside the blktrace package)
sudo apt-get install blktrace fio sysbench linux-tools-generic

# Optional: build the latest fio from source
git clone https://github.com/axboe/fio.git
cd fio
./configure && make && sudo make install

Measuring Sequential Access Patterns

Create a microbenchmark script that tests sequential reads at various block sizes to identify geometry:

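#!/bin/bash
# disk_geometry_test.sh
# Read-only sequential sweep; run as root against an idle disk.

DISK="/dev/sdb"  # Change to your test disk
OUTPUT="geometry_results.txt"

echo "Testing disk geometry on $DISK" > "$OUTPUT"

# Test various block sizes from 512B to 1MB
for blocksize in 512 1024 2048 4096 8192 16384 32768 65536 131072 262144 524288 1048576; do
    # printf interprets \n; plain echo in bash does not
    printf '\n--- Block Size: %s bytes ---\n' "$blocksize" >> "$OUTPUT"
    # --direct=1 bypasses the page cache; --readonly refuses to issue writes.
    # With direct I/O, block sizes below the logical sector size will fail.
    fio --name=seq_read \
        --filename="$DISK" \
        --readonly \
        --rw=read \
        --direct=1 \
        --bs="$blocksize" \
        --size=100M \
        --numjobs=1 \
        --runtime=10 \
        --group_reporting \
        --output-format=normal >> "$OUTPUT" 2>&1
done

cat "$OUTPUT"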

Identifying Performance Cliffs

When you plot the throughput results against block size, you'll notice distinct performance plateaus and cliffs:

| Block Size   | Expected Pattern | Indicates                    |
|--------------|------------------|------------------------------|
| 512 B - 2 KB | Lower throughput | Sub-optimal alignment        |
| 4 KB - 32 KB | Steady increase  | Zone recording transitioning |
| 64 KB - 256 KB | Peak throughput | Optimal track buffer size    |
| 256 KB+      | Plateau or drop  | Exceeding physical buffer    |

The inflection points reveal the actual physical sector organization and buffer boundaries.
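
A block-size sweep at a single location cannot by itself separate zone effects from buffer effects. One complementary approach is to repeat the same sequential read at several starting offsets, since zone bit recording makes outer tracks (low LBAs) faster than inner ones. The sketch below only builds the fio command lines; it assumes a fio version that accepts percentage values for --offset, and the function name and default values are illustrative.

```python
def build_offset_sweep(device, percents=(0, 25, 50, 75, 95),
                       block_size="1M", span="256M"):
    """Build one read-only fio command per starting offset.

    Each command reads `span` bytes sequentially, starting `pct` percent
    of the way into the device and bypassing the page cache.
    """
    commands = []
    for pct in percents:
        commands.append([
            "fio", f"--name=zone_{pct}",
            f"--filename={device}",
            "--readonly",            # refuse to issue writes
            "--rw=read",
            "--direct=1",            # bypass the page cache
            f"--bs={block_size}",
            f"--offset={pct}%",
            f"--size={span}",
            "--output-format=json",
        ])
    return commands


# Print the commands for review before running them as root.
for cmd in build_offset_sweep("/dev/sdb"):
    print(" ".join(cmd))
```

If throughput falls in steps as the offset grows, the step positions approximate zone boundaries. A flat curve suggests the device is not a conventional rotating disk, or that a cache is answering the reads.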

Advanced Microbenchmarking with Seek Patterns

Beyond sequential reads, seek patterns expose cylinder geometry:

#!/bin/bash
# seek_pattern_test.sh

DISK="/dev/sdb"

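# Restrict random reads to progressively larger LBA spans.
# The span bounds the maximum seek distance for each run and
# must not exceed the capacity of the disk.
echo "Testing seek distances..."

for span in 1M 16M 256M 1G 16G; do
    printf '\nSeek span: %s\n' "$span"
    fio --name=random_seek \
        --filename="$DISK" \
        --readonly \
        --rw=randread \
        --direct=1 \
        --bs=4096 \
        --size="$span" \
        --iodepth=1 \
        --numjobs=1 \
        --time_based \
        --runtime=30 \
        --norandommap \
        --group_reporting
done

Confining random reads to a span caps the seek distance, so the growth in average latency from one span to the next correlates with physical head movement distances. Spans small enough to fit in the drive's on-board cache mostly measure the cache, not the heads.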
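
To compare latency across runs, the mean completion latency can be read from fio's JSON output (--output-format=json). The sketch below assumes the clat_ns field layout emitted by fio 3.x. Both function names and the 1 ms default threshold are illustrative choices.

```python
import json


def mean_read_latency_ms(fio_json_text):
    """Return the mean read completion latency of the first job, in ms."""
    data = json.loads(fio_json_text)
    clat = data["jobs"][0]["read"]["clat_ns"]
    return clat["mean"] / 1_000_000  # nanoseconds -> milliseconds


def latency_steps(latency_by_distance_ms, min_jump_ms=1.0):
    """Find where latency rises by at least min_jump_ms.

    `latency_by_distance_ms` maps each tested distance (in bytes) to its
    mean latency. Returns (distance, increase_ms) pairs, comparing each
    distance with the next smaller one.
    """
    distances = sorted(latency_by_distance_ms)
    steps = []
    for prev, cur in zip(distances, distances[1:]):
        jump = latency_by_distance_ms[cur] - latency_by_distance_ms[prev]
        if jump >= min_jump_ms:
            steps.append((cur, jump))
    return steps
```

Feeding each run's JSON through mean_read_latency_ms and collecting the results in a dictionary keyed by distance gives latency_steps what it needs to list the distances at which latency steps up.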

Windows-Specific Microbenchmarking

On Windows, use CrystalDiskMark or DiskSpd for geometry analysis:

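# Using DiskSpd for geometry testing (run from an elevated PowerShell prompt)
$diskid = "1"  # Physical disk number
$blocksize = 4096

# Download DiskSpd: https://github.com/microsoft/diskspd
# "#<n>" selects physical drive n; -Sh disables software caching and hardware write caching
.\DiskSpd.exe "-b$blocksize" -t1 -o16 -d30 -Sh -L "#$diskid"

Repeat the run with different block sizes and watch for throughput changes. As with the Linux results, the block sizes where throughput shifts hint at buffer and zone boundaries.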

Interpreting Microbenchmark Results

Performance Signatures

Sequential throughput plateaus often align with:

  • 64KB-256KB: Common buffer sizes for 7200 RPM drives
  • 256KB-1MB: Typical for high-capacity enterprise drives
  • Drops at 1MB+: Indicates multi-track access penalties

Latency Patterns

When you see latency spike at specific seek distances, you've found a geometry boundary:

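  • Around 1ms jumps: Track-to-track seeks (indicates a short head movement)
  • Up to about 8ms jumps at 7200 RPM: Platter rotation waiting, bounded by one full revolution
  • 10-20ms jumps: Long seeks across a large share of the platter
  • 100ms+ jumps: Too long for a single seek or rotation; more likely error recovery, spin-up, or competing I/O than a geometry boundary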
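
Rotational delay has a hard upper bound set by spindle speed, which helps when deciding whether a given latency step can be rotational at all. A small sketch of that arithmetic follows; the function names are illustrative.

```python
def rotation_time_ms(rpm):
    """Time for one full platter revolution, in milliseconds."""
    return 60_000.0 / rpm


def average_rotational_latency_ms(rpm):
    """On average the target sector is half a revolution away."""
    return rotation_time_ms(rpm) / 2


for rpm in (5400, 7200, 10000, 15000):
    print(f"{rpm:>5} RPM: full rotation {rotation_time_ms(rpm):5.2f} ms, "
          f"average wait {average_rotational_latency_ms(rpm):5.2f} ms")
```

At 7200 RPM one revolution takes about 8.33 ms, so the average rotational wait is about 4.17 ms. A latency step much larger than one revolution needs another explanation.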

Automated Geometry Detection Tool

Here's a Python script that automates detection and reports findings:

import json
import subprocess
import sys


def detect_disk_geometry(device, output_file=None):
    """Microbenchmark disk to infer physical geometry"""
    throughputs = []
    block_sizes = [512, 1024, 4096, 8192, 16384, 32768, 65536, 131072, 262144, 524288, 1048576]

    for bs in block_sizes:
        cmd = [
            'fio', '--name=test',
            f'--filename={device}',
            '--readonly',            # refuse to issue writes
            '--rw=read',
            '--direct=1',            # bypass the page cache
            f'--bs={bs}',
            '--size=50M',
            '--numjobs=1',
            '--output-format=json',
        ]

        try:
            output = subprocess.run(cmd, capture_output=True, text=True,
                                    timeout=60, check=True)
            data = json.loads(output.stdout)
            iops = data['jobs'][0]['read']['iops']
            throughput = (iops * bs) / (1024 * 1024)  # MiB/s
            throughputs.append({'block_size': bs, 'throughput': throughput})
            print(f"Block size {bs:7d}B: {throughput:8.2f} MiB/s")
        except (subprocess.SubprocessError, OSError, ValueError, KeyError, IndexError) as e:
            # Covers fio missing or failing, timeouts, and unexpected JSON
            print(f"Error testing block size {bs}: {e}")
            continue

    # Find performance cliffs between consecutive block sizes that completed
    for prev, cur in zip(throughputs, throughputs[1:]):
        if prev['throughput'] == 0:
            continue  # avoid dividing by zero
        diff_percent = (cur['throughput'] - prev['throughput']) / prev['throughput'] * 100
        if abs(diff_percent) > 20:  # >20% change indicates geometry boundary
            print(f"\n⚠️  Performance cliff at {cur['block_size']}B ({diff_percent:+.1f}%)")

    # Optionally save the raw numbers for later plotting
    if output_file:
        with open(output_file, 'w') as f:
            json.dump(throughputs, f, indent=2)

    return throughputs


if __name__ == '__main__':
    device = sys.argv[1] if len(sys.argv) > 1 else '/dev/sdb'
    detect_disk_geometry(device)

Best Practices for Accurate Results

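  1. Disable caching temporarily: Use direct I/O (fio --direct=1) to bypass the page cache and hdparm -A0 to turn off drive read-lookahead; hdparm -F only flushes the drive's write cache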
  2. Perform multiple runs: Timing variations average out with 3-5 iterations
  3. Use dedicated test environments: Background I/O skews results significantly
  4. Test both cold and warm states: The first access is far slower when the drive must spin up from standby than when the spindle is already spinning
  5. Monitor temperature: Drive thermal throttling affects benchmarks
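
Repeated runs are only useful if their spread is checked. One simple way, sketched below, is to report the median together with the coefficient of variation and to treat a run set as unstable when the spread is too wide. The function name and the 5% default threshold are assumptions chosen for illustration, and the sample numbers are made up, not measurements.

```python
import statistics


def summarize_runs(throughputs_mb_s, max_cv_percent=5.0):
    """Summarize repeated measurements of the same test.

    Returns the median, the coefficient of variation (sample standard
    deviation as a percentage of the mean), and whether that spread is
    within max_cv_percent.
    """
    if len(throughputs_mb_s) < 2:
        raise ValueError("need at least two runs to estimate spread")
    mean = statistics.mean(throughputs_mb_s)
    cv = statistics.stdev(throughputs_mb_s) / mean * 100
    return {
        "median": statistics.median(throughputs_mb_s),
        "cv_percent": cv,
        "stable": cv <= max_cv_percent,
    }


# Illustrative values only
print(summarize_runs([151.2, 149.8, 150.5, 150.1, 149.9]))
```

An unstable result usually means background I/O or caching interfered, and the run set should be repeated before any cliff is read into the numbers.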

Common Pitfalls to Avoid

  • Testing on mounted filesystems: Use raw device access (/dev/sdb) not mount points
  • Insufficient test duration: Use at least 10-30 second runs for stable numbers
  • Ignoring warm-up effects: First 100 operations often show different behavior
  • Testing under system load: Run benchmarks on idle systems for consistency

Next Steps

Once you've identified physical geometry, optimize your systems:

  • Align partitions to detected boundary sizes
  • Configure database block sizes to match physical sectors
  • Adjust kernel I/O schedulers based on actual geometry
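
Checking partition alignment is simple arithmetic once a boundary size is known. The sketch below assumes Linux sysfs, where a partition's start is reported in 512-byte units regardless of the drive's logical sector size. The function names and the sysfs_root parameter are illustrative.

```python
from pathlib import Path


def is_aligned(start_sectors, boundary_bytes, sector_unit=512):
    """True if a partition's byte offset is a multiple of boundary_bytes."""
    return (start_sectors * sector_unit) % boundary_bytes == 0


def partition_start_sectors(disk, partition, sysfs_root="/sys/block"):
    """Read a partition's start (512-byte units), e.g. ("sdb", "sdb1")."""
    start_file = Path(sysfs_root) / disk / partition / "start"
    return int(start_file.read_text().strip())


# A partition starting at sector 2048 sits on a 1 MiB boundary.
print(is_aligned(2048, 1024 * 1024))
# The legacy start at sector 63 is not aligned to 4096-byte sectors.
print(is_aligned(63, 4096))
```

Passing the boundary detected by the benchmarks, such as a buffer or zone step size, as boundary_bytes shows whether an existing partition already sits on it.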

For continuous monitoring of disk behavior, consider tools like Grafana with disk-specific metrics collectors to track performance over time.
