ICAM Applied to Maintenance Failures
- Luke Dam
- 5 days ago

Introduction
When a machine fails, it’s tempting to point the finger at the technician, the faulty part, or “bad luck.” But seasoned investigators know maintenance failures are rarely random events; they are predictable outcomes of system weaknesses.
The ICAM (Incident Cause Analysis Method) framework provides a structured, learning-focused way to analyse these failures. By examining Human and Organisational Factors, ICAM reveals how decisions, processes, and culture shape maintenance outcomes. This article explores how ICAM can be applied to maintenance-related incidents to uncover root causes, strengthen reliability, and build organisational resilience.
1. The Nature of Maintenance Failures
Maintenance failures manifest in many forms:
Incorrect maintenance (wrong component fitted, wrong torque, missed steps)
Deferred maintenance (tasks postponed due to production pressure)
Poor-quality maintenance (shortcuts, inadequate testing)
Design or documentation gaps (ambiguous procedures, missing instructions)
Planning and scheduling errors (poor sequencing, conflicting tasks)
Each of these outcomes may appear to be a “technical issue,” but ICAM reminds us that technical errors are usually the symptoms, not the causes.
2. Why ICAM Is the Right Lens
ICAM is built on James Reason’s model of organisational accidents (the Swiss Cheese model) and focuses on identifying both active failures and latent conditions. In maintenance incidents, ICAM’s structured approach enables investigators to:
Map the sequence of events leading to the failure
Identify the Absent/Failed Defences (e.g. inspections, QA checks, alarms)
Analyse Individual/Team Actions (technician decisions, omissions)
Explore Task/Environmental Conditions (time pressure, lighting, access)
Examine Organisational Factors (procedures, training, planning, supervision)
This layered analysis ensures we move beyond “human error” toward understanding the systemic contributors.
3. Applying ICAM Step-by-Step
3.1. Event and Chronology
Establish a precise timeline:
When was the last maintenance performed?
Who did it, under what conditions?
What inspections or verifications followed?
When did the failure manifest?
Example: A compressor fails after an overhaul. Investigation shows a retaining bolt loosened, damaging blades. The timeline reveals that the torque step was missed during reassembly, and the QA check did not detect it.
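A chronology like this can be kept as structured data so that events collected out of order (from interviews, work orders, and logs) still sort into one timeline. The sketch below is a minimal illustration, not part of the ICAM method itself: the TimelineEvent class, the build_chronology function, and all timestamps are assumptions invented for the example.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class TimelineEvent:
    when: datetime
    actor: str
    description: str


def build_chronology(events):
    """Return the events ordered by timestamp, earliest first."""
    return sorted(events, key=lambda e: e.when)


# Illustrative timestamps only; the compressor example gives no dates or times.
events = [
    TimelineEvent(datetime(2024, 3, 4, 9, 0), "operations",
                  "Compressor fails; blade damage found"),
    TimelineEvent(datetime(2024, 3, 1, 14, 0), "technician",
                  "Reassembly completed; torque step missed"),
    TimelineEvent(datetime(2024, 3, 1, 16, 30), "QA",
                  "QA check completed; loose bolt not detected"),
]

for event in build_chronology(events):
    print(f"{event.when:%Y-%m-%d %H:%M}  {event.actor:<11} {event.description}")
```

Sorting on the timestamp alone keeps the sketch simple; a real investigation would also record the evidence source for each entry.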
3.2. Identify Absent or Failed Defences
Defences could include:
Double-person verification
Torque marking
Post-maintenance functional tests
Condition monitoring alarms
Ask: What should have prevented the failure? In our example, a final verification checklist existed but was not signed. The defence was present on paper but absent in practice.
3.3. Analyse Individual and Team Actions
Look at what technicians and supervisors did or did not do. Rather than judging, ICAM asks why these actions made sense at the time:
Did they have incomplete information?
Were they fatigued or rushed?
Did procedures match real-world conditions?
In the compressor example, the torque step was skipped because the job was running overtime and a second tech was unavailable to verify. The technician believed hand-tightening was sufficient, based on prior experience. This is local rationality at work: the decision made sense to the person making it at the time.
3.4. Task and Environmental Conditions
These are immediate influences on human performance:
Poor lighting or cramped access
Inadequate tooling or missing torque wrenches
Noise, heat, distractions
Conflicting job demands
In our case, the work bay was crowded, the torque wrench was being used elsewhere, and the task list was lengthy. Each of these acted as a performance-shaping factor.
3.5. Organisational Factors
ICAM examines systemic conditions that allowed the failure path to exist:
Planning: Was the maintenance window sufficient?
Procedures: Were they clear and verified?
Training: Did technicians have their competency sign-off?
Supervision: Was oversight adequate?
Resource allocation: Were tools and staff available?
Culture: Was there pressure to release equipment quickly?
The investigation finds that the maintenance plan underestimated task durations, that no contingency existed for tool sharing, and that supervision was stretched across multiple jobs. All three are organisational factors.
3.6. Reconstructing the ICAM Chart
By mapping the sequence of events, failed defences, and factor categories, the ICAM chart visualises how latent conditions aligned to permit the failure. This chart becomes the foundation for recommendations addressing causes, not symptoms.
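As a rough illustration, the chart can be thought of as findings grouped under the four ICAM categories used in this article. The sketch below assumes a simple text layout; the Category enum and build_chart function are hypothetical names, and the findings are taken from the compressor example above.

```python
from collections import defaultdict
from enum import Enum


class Category(Enum):
    ABSENT_FAILED_DEFENCE = "Absent/Failed Defences"
    INDIVIDUAL_TEAM_ACTION = "Individual/Team Actions"
    TASK_ENVIRONMENT = "Task/Environmental Conditions"
    ORGANISATIONAL_FACTOR = "Organisational Factors"


def build_chart(findings):
    """Group (category, text) pairs by category, in Category order."""
    grouped = defaultdict(list)
    for category, text in findings:
        grouped[category].append(text)
    return {c: grouped[c] for c in Category if c in grouped}


# Findings from the compressor example, entered in no particular order.
findings = [
    (Category.ORGANISATIONAL_FACTOR, "Plan underestimated task durations"),
    (Category.TASK_ENVIRONMENT, "Torque wrench in use elsewhere"),
    (Category.ABSENT_FAILED_DEFENCE, "Final verification checklist not signed"),
    (Category.INDIVIDUAL_TEAM_ACTION, "Torque step skipped; bolt hand-tightened"),
    (Category.ORGANISATIONAL_FACTOR, "No contingency for tool sharing"),
    (Category.TASK_ENVIRONMENT, "Crowded work bay"),
    (Category.ORGANISATIONAL_FACTOR, "Supervision stretched across multiple jobs"),
]

chart = build_chart(findings)
for category, items in chart.items():
    print(category.value)
    for item in items:
        print(f"  - {item}")
```

Keeping each finding tagged with its category makes it easy to check that every recommendation traces back to a cause rather than a symptom.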
4. Common Patterns in Maintenance ICAMs
4.1. Inadequate Procedures
Steps missing or ambiguous (“tighten bolts” without specifying torque)
Outdated drawings or mismatched part numbers
Complex tasks relying on memory
4.2. Competence and Training Gaps
Workers “signed off” without a practical assessment
Limited refresher training on infrequent tasks
Contractors unfamiliar with site-specific procedures
4.3. Planning and Scheduling Pressures
Maintenance squeezed between production demands
Shift handovers mid-task, causing continuity errors
“Catch-up” culture leading to corners being cut
4.4. Supervision and Verification Weaknesses
Supervisors responsible for too many teams
Incomplete independent inspections
Overreliance on self-checks
4.5. Resource Constraints
Shared tools, missing spares
Inadequate workspaces
Deferred maintenance due to budget limits
4.6. Culture and Priorities
“Get it running” mindset overrides procedure
Fear of reporting errors
Normalisation of deviations (“we always do it this way”)
ICAM helps reveal these patterns across incidents, enabling trend analysis and strategic improvement.
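One simple way to approach that trend analysis is to count how often each contributing factor appears across completed investigations. The sketch below assumes each investigation has already been coded with factor labels; the incident IDs, labels, and the factor_trends function are hypothetical.

```python
from collections import Counter

# Hypothetical coding of three past investigations.
incidents = {
    "INC-001": ["planning", "supervision", "resources"],
    "INC-002": ["procedures", "planning"],
    "INC-003": ["planning", "culture", "resources"],
}


def factor_trends(incidents):
    """Count the number of incidents in which each factor appears."""
    counts = Counter()
    for factors in incidents.values():
        # Deduplicate so a factor is counted once per incident.
        counts.update(sorted(set(factors)))
    return counts.most_common()


for factor, count in factor_trends(incidents):
    print(f"{factor}: {count} of {len(incidents)} incidents")
```

In this made-up data set, planning appears in all three incidents, which is the kind of signal that would justify a strategic fix rather than another one-off corrective action.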
5. Case Study: Hydraulic Pump Failure
Incident
A hydraulic pump seized hours after an overhaul, causing a production shutdown.
Event Sequence
Pump disassembled and inspected; wear observed
Replacement bearings installed
Reassembly completed without aligning the coupling
No vibration check performed
Pump started; abnormal vibration ignored
Seizure occurred within hours
Failed Defences
Alignment check omitted
No sign-off on the QA sheet
Condition monitoring alert ignored
Individual/Team Actions
The technician skipped alignment because the tool was unavailable
The supervisor was unaware of the omission because of multiple concurrent jobs
Task/Environmental Conditions
Time pressure from delayed parts delivery
Shared laser alignment tool in another area
Organisational Factors
Planning failed to account for tool sharing
No escalation path for unavailable tools
Production pressure prioritised the schedule over quality
Outcomes
$250k downtime cost
Rework and safety exposure
Recommendations
Establish tool availability check in pre-plan
Implement go/no-go QA sign-off
Create an escalation protocol for missing resources
Reinforce alignment training and verification
This ICAM revealed a systemic weakness in planning and QA, not just technician error.
6. Lessons from ICAM Applied to Maintenance
Latent conditions dominate: Most failures stem from organisational weaknesses rather than isolated mistakes.
Defences must be verified, not assumed: Paper systems fail without behavioural reinforcement.
Local rationality matters: Understanding why actions made sense at the time builds trust and learning.
Cross-functional collaboration is essential: Engineering, planning, and operations all influence maintenance quality.
Data integration improves insight: Linking ICAM findings with reliability data (MTBF, defect logs) strengthens continuous improvement.
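For readers less familiar with the reliability side, MTBF is commonly calculated as total operating time divided by the number of failures in that period. The sketch below shows that arithmetic with invented figures; the mtbf_hours function name and the numbers are assumptions for illustration only.

```python
def mtbf_hours(operating_hours, failure_count):
    """Mean Time Between Failures: operating time divided by failure count."""
    if failure_count <= 0:
        raise ValueError("MTBF is undefined without at least one failure")
    return operating_hours / failure_count


# Hypothetical figures for one asset over two equal operating periods.
before = mtbf_hours(operating_hours=4000, failure_count=5)
after = mtbf_hours(operating_hours=4000, failure_count=2)

print(f"MTBF before corrective actions: {before:.0f} h")  # 800 h
print(f"MTBF after corrective actions:  {after:.0f} h")   # 2000 h
```

Comparing the figure before and after ICAM recommendations are implemented gives one indication of whether they are working, although other changes over the same period can also move the number.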
7. Turning Findings into Action
7.1. Strengthening Procedures
Use visual work instructions and torque diagrams
Include verification points with signatures
Regularly review procedures with frontline technicians
7.2. Building Competence
Competency frameworks linked to task criticality
Mentoring and peer verification
Simulation or VR training for complex tasks
7.3. Improving Planning and Scheduling
Incorporate tool/resource availability checks
Allocate realistic task durations
Integrate production and maintenance planning meetings
7.4. Enhancing Supervision
Define supervisor workload limits
Use digital checklists with mandatory sign-offs
Introduce independent QA inspections for critical tasks
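A digital checklist with mandatory sign-offs can act as a go/no-go gate: the equipment is not released until every verification point has a signature. The sketch below is one possible shape for that check, using verification points from the hydraulic pump case study; the class, function, and signer names are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class VerificationPoint:
    name: str
    signed_by: Optional[str] = None


def release_decision(points):
    """Return ("GO", []) only if every verification point is signed."""
    missing = [p.name for p in points if not p.signed_by]
    return ("NO-GO", missing) if missing else ("GO", [])


# Checklist state resembling the pump case study (signer name is invented).
checklist = [
    VerificationPoint("Replacement bearings installed", signed_by="Technician A"),
    VerificationPoint("Coupling alignment check"),
    VerificationPoint("Post-maintenance vibration check"),
]

decision, missing = release_decision(checklist)
print(decision)
for name in missing:
    print(f"  unsigned: {name}")
```

The point of the design is that an unsigned step blocks release by default, so the defence cannot exist on paper only.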
7.5. Fostering a Learning Culture
Reward error reporting
Conduct after-action reviews for all significant maintenance events
Share ICAM learnings across sites
8. Metrics and Monitoring
Use ICAM findings to drive measurable improvement:
Reduction in repeat maintenance errors
Increased compliance with QA steps
Fewer emergency call-outs
Improved Mean Time Between Failures (MTBF)
Track recommendations through an Action Plan with owners, deadlines, and verification of effectiveness. That last item is a critical ICAM step that is often overlooked.
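A minimal sketch of such an action tracker is shown below. It treats an action as closed only when it is both completed and verified as effective. The Action class, the open_items function, the owners, and the dates are all assumptions; the action descriptions come from the pump case study recommendations.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass
class Action:
    description: str
    owner: str
    due: date
    completed_on: Optional[date] = None
    effectiveness_verified: bool = False


def open_items(actions, today):
    """Return (action, status) for every action that still needs attention."""
    result = []
    for action in actions:
        if action.completed_on is None:
            status = "overdue" if action.due < today else "in progress"
        elif not action.effectiveness_verified:
            status = "awaiting effectiveness check"
        else:
            continue  # completed and verified: closed
        result.append((action, status))
    return result


# Owners and dates are invented for illustration.
actions = [
    Action("Establish tool availability check in pre-plan",
           "Planning lead", date(2024, 5, 15)),
    Action("Implement go/no-go QA sign-off",
           "QA lead", date(2024, 5, 31), completed_on=date(2024, 5, 20)),
    Action("Create an escalation protocol for missing resources",
           "Maintenance manager", date(2024, 5, 31),
           completed_on=date(2024, 5, 28), effectiveness_verified=True),
]

for action, status in open_items(actions, today=date(2024, 6, 1)):
    print(f"{status}: {action.description} ({action.owner})")
```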
9. Challenges in Applying ICAM to Maintenance
Data availability: Limited documentation or informal practices hinder analysis.
Cultural resistance: Technicians may fear blame.
Investigation fatigue: Frequent small failures may not all warrant full ICAMs.
Scope creep: Distinguishing between isolated issues and systemic patterns requires discipline.
Mitigation strategies include scaled ICAMs (lite versions for minor events) and strong communication of learning intent.
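Scaling can be made explicit with a simple triage rule that decides how deep an investigation should go. The sketch below is only an illustration: the thresholds, the level names, and the investigation_level function are invented here, and each site would need to set its own criteria.

```python
def investigation_level(downtime_cost, safety_exposure, repeats_in_last_year):
    """Suggest an investigation depth. All thresholds are hypothetical."""
    if safety_exposure or downtime_cost >= 100_000:
        return "full ICAM"
    if repeats_in_last_year >= 2:
        # Repeated small failures may point to a systemic pattern.
        return "full ICAM"
    if downtime_cost >= 10_000:
        return "ICAM lite"
    return "log and trend"


# The pump case study: $250k downtime cost with safety exposure.
print(investigation_level(250_000, True, 0))   # full ICAM
# A minor, first-time fault with no safety exposure.
print(investigation_level(2_000, False, 0))    # log and trend
```

Routing minor events to a lighter process, while still logging them for trend analysis, is one way to limit investigation fatigue without losing sight of recurring patterns.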
10. The Payoff: From Reactive to Proactive
By systematically applying ICAM to maintenance failures, organisations move from reactive repairs to proactive risk control. Key benefits include:
Enhanced asset reliability
Reduced downtime and costs
Stronger safety and compliance
An empowered workforce engaged in learning
Every maintenance failure becomes a learning opportunity, not just a repair job.
Conclusion
Maintenance failures are not inevitable accidents; they are signals of deeper system vulnerabilities. ICAM transforms these events into powerful catalysts for organisational learning. By examining the defences, actions, conditions, and organisational factors, leaders gain clarity on what truly drives reliability, and on how to build a culture where doing it right is the easy, obvious choice.
In the long run, applying ICAM to maintenance isn’t just about preventing the next breakdown; it’s about building a workplace where learning never stops, and excellence becomes routine.



