Why RAID Controllers Refuse to Rebuild by Design
“Rebuild Won’t Start, All Drives Healthy” is a RAID failure state that confuses even experienced administrators.
A disk has been replaced.
All remaining drives report Online, Optimal, or Unconfigured Good.
SMART looks clean.
And yet — no rebuild begins.
No progress bar.
No error message.
No warning that tells you what’s wrong.
This is not a drive failure.
It is a controller confidence failure.
Across enterprise RAID platforms, including LSI MegaRAID, Dell PERC, and HP Smart Array, rebuilds are intentionally blocked when the controller cannot guarantee parity correctness. In these situations, the array may look healthy, but the controller no longer trusts its metadata well enough to proceed.
MegaRAID is a common place where this behavior appears, which is why this page uses it as a reference platform — but the underlying failure state is controller-agnostic.
What follows explains why rebuilds are blocked when drives are healthy, which actions can permanently destroy the chance of recovery, and how stalled rebuild states can often be recovered if handled correctly.
1. What You See
- Rebuild does not auto-start after inserting a replacement drive
- VD remains “Degraded” with no active reconstruction
- Replacement drive shows “Unconfigured Good” or “Foreign”
- Event logs show no rebuild, only identity or metadata warnings
- MegaRAID Storage Manager or BIOS shows “No Operations”
- All drives pass SMART, which makes the array look misleadingly healthy
2. Why It Happens (Controller Parity-Confidence Behavior)
MegaRAID is more cautious than people realize.
- Controller detected metadata epoch mismatch between members
- Previous rebuild attempt aborted mid-way
- A survivor returned UNC/CRC errors, so MegaRAID halted the rebuild and will not resume it
- Identity markers (sequence, timestamps, controller IDs) no longer align
- A foreign config exists that conflicts with the active VD
- MegaRAID refuses to rebuild if it cannot guarantee parity consistency
- The failed drive’s replacement lacks the “trust markers” to rejoin
MegaRAID will not rebuild without a verifiable parity baseline.
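The parity-confidence check described above can be pictured as a comparison of per-member metadata. The sketch below is a simplified model, not MegaRAID firmware logic: `MemberMeta`, its fields, and `rebuild_blockers` are hypothetical names invented for illustration, and real on-disk metadata is vendor-specific and more involved.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class MemberMeta:
    """Hypothetical per-drive metadata summary (not a real MegaRAID structure)."""
    slot: str
    sequence: int        # configuration sequence number, the "epoch"
    controller_id: str   # controller that last wrote this drive's metadata


def rebuild_blockers(members):
    """Return the reasons a cautious controller might refuse to rebuild."""
    if not members:
        return ["no surviving members with readable metadata"]
    # Treat the most common (sequence, controller) pair as the trusted baseline.
    pairs = Counter((m.sequence, m.controller_id) for m in members)
    (base_seq, base_ctrl), _ = pairs.most_common(1)[0]
    reasons = []
    for m in members:
        if m.controller_id != base_ctrl:
            reasons.append(f"{m.slot}: written by a different controller ({m.controller_id})")
        elif m.sequence != base_seq:
            reasons.append(f"{m.slot}: sequence {m.sequence} differs from baseline {base_seq}")
    return reasons
```

An empty list means the modeled markers agree. Any entry corresponds to the kind of mismatch that leaves the controller without a verifiable baseline.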
3. What NOT To Do
- Do not “Force Online” a questionable drive
- Do not clear foreign config without analysis
- Do not convert the replacement drive to “Rebuild” manually
- Do not attempt VD recreation (disaster)
- Do not delete/re-add drives
- Do not run MegaRAID “Check Consistency”; it may rewrite parity
One wrong action can overwrite the map that proves which data blocks belong where.
4. What You CAN Do
- Export current MegaRAID configuration
- Review controller logs for “Inconsistent Metadata” or “Aborted Rebuild”
- Identify which survivor triggered errors (UNC, pending sectors, timeouts)
- Capture NVRAM/cache state if possible
- Clone all members before any forced operations
- Validate the RAID geometry:
  - level
  - stripe size
  - parity rotation
  - start offset
  - member order
- From images, determine if the stalled rebuild left partial writes
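As a minimal sketch of the log-review and survivor-identification steps above, the following Python function scans an event log that has already been exported as text. The two metadata phrases come from this article. The slot notation (`PD 05`, `slot 5`) and the name `scan_log` are assumptions, so the patterns will likely need adjusting to the wording your controller actually emits.

```python
import re
from collections import defaultdict

# Phrases quoted in this article; real firmware wording may differ.
METADATA_PHRASES = ("inconsistent metadata", "aborted rebuild")
# Media and link error keywords; whole words only, so "Unconfigured" is not a hit.
ERROR_PATTERN = re.compile(r"\b(?:UNC|CRC|timeouts?|pending sectors?)\b", re.IGNORECASE)
# Hypothetical slot notation such as "PD 05" or "slot 5".
SLOT_PATTERN = re.compile(r"\b(?:PD|slot)\s*(\d+)", re.IGNORECASE)


def scan_log(lines):
    """Return (metadata warning lines, error count per slot number)."""
    metadata_hits = []
    errors_by_slot = defaultdict(int)
    for line in lines:
        if any(phrase in line.lower() for phrase in METADATA_PHRASES):
            metadata_hits.append(line.strip())
        if ERROR_PATTERN.search(line):
            slot = SLOT_PATTERN.search(line)
            # Errors with no recognizable slot are counted under None.
            errors_by_slot[int(slot.group(1)) if slot else None] += 1
    return metadata_hits, dict(errors_by_slot)
```

The slot with the highest error count is the first candidate for the survivor that interrupted the rebuild. Confirm it against the drive's own error counters before drawing conclusions.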
5. What This Means for Your Data
- Drives may appear healthy, but the metadata is not
- The controller is refusing the rebuild to avoid irreversible corruption
- Stalled rebuilds often leave “dirty stripes” that must be analyzed
- With imaging and parity-matching, the original layout can be reconstructed
- Most arrays are recoverable if the right steps are taken before the metadata is changed
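Dirty stripes can be located on the cloned images, never on the original drives. The sketch below assumes a single-parity (RAID 5) layout in which every member image starts at the same data offset. In a consistent stripe, the chunks from all members XOR to zero regardless of parity rotation. The name `find_dirty_stripes` and its parameters are illustrative. RAID 6 adds a second, non-XOR parity block and needs a different check.

```python
def find_dirty_stripes(images, chunk_size):
    """Return indices of stripes whose member chunks do not XOR to zero.

    images: one bytes-like object per member, all aligned to the same
            start offset (assumption: single-parity RAID 5 layout).
    chunk_size: per-member stripe unit size in bytes.
    """
    usable = min(len(img) for img in images)
    dirty = []
    for stripe, offset in enumerate(range(0, usable - chunk_size + 1, chunk_size)):
        acc = 0
        for img in images:
            # XOR whole chunks at once by treating each as one big integer.
            acc ^= int.from_bytes(img[offset:offset + chunk_size], "little")
        if acc != 0:
            dirty.append(stripe)
    return dirty
```

For full-size images, passing `mmap` objects instead of in-memory `bytes` avoids loading each clone into RAM. A clean result says only that parity is self-consistent under the assumed geometry, not that member order or rotation has been identified.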
Diagnostic Overview
- Applies To: LSI MegaRAID (9260/9270/9361 families), Dell PERC, similar enterprise RAID controllers
- Observed State: Rebuild Will Not Start / Drives Healthy
- Likely Cause: Metadata epoch mismatch or stalled prior rebuild: see ADR Technical Note TN-R6-001
- Do NOT: Force Online, clear foreign, or run consistency check: see Foreign Config Detected on RAID 6 — Import or Not?
- Recommended Action: Export config; review logs; clone drives; validate parity and geometry