
ICAM Applied to Maintenance Failures

  • Luke Dam
  • 5 days ago
  • 6 min read

Introduction

When a machine fails, it’s tempting to point the finger at the technician, the faulty part, or “bad luck.” But seasoned investigators know that maintenance failures are rarely random events: they are predictable outcomes of system weaknesses.


The ICAM (Incident Cause Analysis Method) framework provides a structured, learning-focused way to analyse these failures. By examining Human and Organisational Factors, ICAM reveals how decisions, processes, and culture shape maintenance outcomes. This article explores how ICAM can be applied to maintenance-related incidents to uncover root causes, strengthen reliability, and build organisational resilience.


1. The Nature of Maintenance Failures

Maintenance failures manifest in many forms:


  • Incorrect maintenance (wrong component fitted, wrong torque, missed steps)

  • Deferred maintenance (tasks postponed due to production pressure)

  • Poor-quality maintenance (shortcuts, inadequate testing)

  • Design or documentation gaps (ambiguous procedures, missing instructions)

  • Planning and scheduling errors (poor sequencing, conflicting tasks)


Each of these outcomes may appear to be a “technical issue,” but ICAM reminds us that technical errors are usually the symptoms, not the causes.


2. Why ICAM Is the Right Lens

ICAM is built on James Reason’s Swiss Cheese model and focuses on identifying both active failures (the errors made at the point of work) and latent conditions (the weaknesses already sitting in the system). In maintenance incidents, ICAM’s structured approach enables investigators to:


  • Map the sequence of events leading to the failure

  • Identify the Absent/Failed Defences (e.g. inspections, QA checks, alarms)

  • Analyse Individual/Team Actions (technician decisions, omissions)

  • Explore Task/Environmental Conditions (time pressure, lighting, access)

  • Examine Organisational Factors (procedures, training, planning, supervision)


This layered analysis ensures we move beyond “human error” toward understanding the systemic contributors.


3. Applying ICAM Step-by-Step

3.1. Event and Chronology

Start by establishing a precise timeline:


  • When was the last maintenance performed?

  • Who did it, under what conditions?

  • What inspections or verifications followed?

  • When did the failure manifest?


Example: A compressor fails after an overhaul. Investigation shows a retaining bolt loosened, damaging blades. The timeline reveals that the torque step was missed during reassembly, and the QA check did not detect it.
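A chronology like this can be kept as structured data so that it sorts itself and the gap between maintenance and failure is easy to read off. The following is a minimal sketch, not part of ICAM itself: the `TimelineEvent` fields, the `kind` labels, and all timestamps are hypothetical and chosen only to mirror the compressor example.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import List, Optional


@dataclass(frozen=True)
class TimelineEvent:
    when: datetime
    kind: str   # "maintenance", "verification" or "failure" (illustrative labels)
    actor: str
    note: str


def build_timeline(events: List[TimelineEvent]) -> List[TimelineEvent]:
    """Return the events in chronological order."""
    return sorted(events, key=lambda e: e.when)


def hours_from_maintenance_to_failure(events: List[TimelineEvent]) -> Optional[float]:
    """Hours between the first failure and the last maintenance before it."""
    timeline = build_timeline(events)
    failure = next((e for e in timeline if e.kind == "failure"), None)
    if failure is None:
        return None
    prior = [e for e in timeline if e.kind == "maintenance" and e.when <= failure.when]
    if not prior:
        return None
    return (failure.when - prior[-1].when).total_seconds() / 3600


# Hypothetical timestamps, deliberately entered out of order.
events = [
    TimelineEvent(datetime(2024, 3, 6, 9, 0), "failure", "Operator",
                  "Retaining bolt loosened, blades damaged"),
    TimelineEvent(datetime(2024, 3, 4, 14, 0), "maintenance", "Technician",
                  "Reassembly; torque step missed"),
    TimelineEvent(datetime(2024, 3, 4, 16, 0), "verification", "QA",
                  "QA check did not detect the missed step"),
]

for event in build_timeline(events):
    print(event.when.isoformat(), event.kind, event.actor, event.note, sep=" | ")
print(hours_from_maintenance_to_failure(events))
```

Keeping the verification entries in the same list as the maintenance entries makes it obvious when a check happened but still failed to catch the problem.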


3.2. Identify Absent or Failed Defences

Defences could include:


  • Double-person verification

  • Torque marking

  • Post-maintenance functional tests

  • Condition monitoring alarms


Ask: What should have prevented the failure? In our example, a final verification checklist existed but was not signed. The defence was present on paper but absent in practice.


3.3. Analyse Individual and Team Actions

Look at what technicians and supervisors did or did not do. Rather than judging, ICAM asks why these actions made sense at the time:


  • Did they have incomplete information?

  • Were they fatigued or rushed?

  • Did procedures match real-world conditions?


The torque step was skipped because the job was running overtime and a second technician was unavailable to verify. The technician believed hand-tightening was sufficient, based on prior experience. This is a case of local rationality: the decision made sense to the person making it, given what they knew at the time.


3.4. Task and Environmental Conditions

These are immediate influences on human performance:


  • Poor lighting or cramped access

  • Inadequate tooling or missing torque wrenches

  • Noise, heat, distractions

  • Conflicting job demands


In our case, the work bay was crowded, the torque wrench was being used elsewhere, and the task list was lengthy, all of which acted as performance-shaping factors.


3.5. Organisational Factors

ICAM examines systemic conditions that allowed the failure path to exist:


  • Planning: Was the maintenance window sufficient?

  • Procedures: Were they clear and verified?

  • Training: Did technicians have their competency sign-off?

  • Supervision: Was oversight adequate?

  • Resource allocation: Were tools and staff available?

  • Culture: Was there pressure to release equipment quickly?


The investigation finds that the maintenance plan underestimated task durations, that no contingency existed for tool sharing, and that supervision was stretched across multiple jobs, all of them organisational factors.

3.6. Reconstructing the ICAM Chart

By mapping the sequence of events, failed defences, and factor categories, the ICAM chart visualises how latent conditions aligned to permit the failure. This chart becomes the foundation for recommendations addressing causes, not symptoms.
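One way to hold the chart's content before drawing it is to group findings under the four contributing-factor categories used in this article. The sketch below is an assumption about representation only: `build_chart` and `render_chart` are hypothetical helper names, and the findings are paraphrased from the compressor example.

```python
# Category names as used in this article, listed from latent to active.
ICAM_CATEGORIES = (
    "Organisational Factors",
    "Task/Environmental Conditions",
    "Individual/Team Actions",
    "Absent/Failed Defences",
)


def build_chart(findings):
    """Group (category, finding) pairs under each ICAM category."""
    chart = {category: [] for category in ICAM_CATEGORIES}
    for category, finding in findings:
        if category not in chart:
            raise ValueError(f"Unknown ICAM category: {category!r}")
        chart[category].append(finding)
    return chart


def render_chart(chart):
    """Render the chart as indented plain text, one category per block."""
    lines = []
    for category, items in chart.items():
        lines.append(category)
        lines.extend(f"  - {item}" for item in items)
    return "\n".join(lines)


# Findings paraphrased from the compressor example in this section.
findings = [
    ("Absent/Failed Defences", "Final verification checklist not signed"),
    ("Individual/Team Actions", "Torque step skipped; bolt hand-tightened"),
    ("Task/Environmental Conditions", "Torque wrench in use elsewhere"),
    ("Task/Environmental Conditions", "Crowded work bay"),
    ("Organisational Factors", "Plan underestimated task durations"),
    ("Organisational Factors", "No contingency for tool sharing"),
]

chart = build_chart(findings)
print(render_chart(chart))
```

Rejecting unknown categories keeps every finding tied to one of the four headings, which is what later makes incidents comparable.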


4. Common Patterns in Maintenance ICAMs

4.1. Inadequate Procedures


  • Steps missing or ambiguous (“tighten bolts” without specifying torque)

  • Outdated drawings or mismatched part numbers

  • Complex tasks relying on memory


4.2. Competence and Training Gaps


  • Workers “signed off” without a practical assessment

  • Limited refresher training on infrequent tasks

  • Contractors unfamiliar with site-specific procedures


4.3. Planning and Scheduling Pressures


  • Maintenance squeezed between production demands

  • Shift handovers mid-task, causing continuity errors

  • “Catch-up” culture leading to corners being cut


4.4. Supervision and Verification Weaknesses


  • Supervisors responsible for too many teams

  • Incomplete independent inspections

  • Overreliance on self-checks


4.5. Resource Constraints


  • Shared tools, missing spares

  • Inadequate workspaces

  • Deferred maintenance due to budget limits


4.6. Culture and Priorities


  • “Get it running” mindset overrides procedure

  • Fear of reporting errors

  • Normalisation of deviations (“we always do it this way”)


ICAM helps reveal these patterns across incidents, enabling trend analysis and strategic improvement.
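A simple form of this trend analysis is to count how many investigations each pattern shows up in. The following sketch assumes each completed ICAM has been tagged with the pattern headings from this section; the incident IDs and tags are invented for illustration.

```python
from collections import Counter


def pattern_frequency(incidents):
    """Count how many incidents each pattern appears in, most frequent first."""
    counts = Counter()
    for patterns in incidents.values():
        # dict.fromkeys removes duplicates so a pattern counts once per incident.
        for pattern in dict.fromkeys(patterns):
            counts[pattern] += 1
    return counts.most_common()


# Hypothetical incident tags using the pattern headings above.
incidents = {
    "INC-001": ["Planning and Scheduling Pressures", "Resource Constraints"],
    "INC-002": ["Inadequate Procedures", "Planning and Scheduling Pressures"],
    "INC-003": ["Planning and Scheduling Pressures", "Resource Constraints",
                "Resource Constraints"],
}

for pattern, count in pattern_frequency(incidents):
    print(f"{count} incident(s): {pattern}")
```

Counting incidents rather than raw mentions stops one heavily documented investigation from dominating the trend.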


5. Case Study: Hydraulic Pump Failure

Incident

A hydraulic pump seized hours after an overhaul, causing a production shutdown.

Event Sequence


  1. Pump disassembled and inspected; wear observed

  2. Replacement bearings installed

  3. Reassembly completed without aligning the coupling

  4. No vibration check performed

  5. Pump started; abnormal vibration ignored

  6. Seizure occurred within hours


Failed Defences


  • Alignment check omitted

  • No sign-off on the QA sheet

  • Condition monitoring alert ignored


Individual/Team Actions


  • The technician skipped alignment due to tool unavailability

  • The supervisor was unaware of the omission because of multiple concurrent jobs


Task/Environmental Conditions


  • Time pressure from delayed parts delivery

  • Shared laser alignment tool in another area


Organisational Factors


  • Planning failed to account for tool sharing

  • No escalation path for unavailable tools

  • Production pressure prioritised the schedule over quality


Outcomes


  • $250k downtime cost

  • Rework and safety exposure


Recommendations


  • Establish tool availability check in pre-plan

  • Implement go/no-go QA sign-off

  • Create an escalation protocol for missing resources

  • Reinforce alignment training and verification


This ICAM revealed a systemic weakness in planning and QA, not just technician error.


6. Lessons from ICAM Applied to Maintenance


  1. Latent conditions dominate: Most failures stem from organisational weaknesses rather than isolated mistakes.

  2. Defences must be verified, not assumed: Paper systems fail without behavioural reinforcement.

  3. Local rationality matters: Understanding why actions made sense at the time builds trust and learning.

  4. Cross-functional collaboration is essential: Engineering, planning, and operations all influence maintenance quality.

  5. Data integration improves insight: Linking ICAM findings with reliability data (MTBF, defect logs) strengthens continuous improvement.


7. Turning Findings into Action

7.1. Strengthening Procedures


  • Use visual work instructions and torque diagrams

  • Include verification points with signatures

  • Regularly review procedures with frontline technicians


7.2. Building Competence


  • Competency frameworks linked to task criticality

  • Mentoring and peer verification

  • Simulation or VR training for complex tasks


7.3. Improving Planning and Scheduling


  • Incorporate tool/resource availability checks

  • Allocate realistic task durations

  • Integrate production and maintenance planning meetings
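The tool and resource availability check can be as simple as comparing what a job needs against what is free for the planned window. This is a minimal sketch under that assumption; `missing_resources` and `go_no_go` are hypothetical names, and the quantities are invented around the pump case study.

```python
def missing_resources(required, available):
    """Return {item: shortfall} for every required item that is short."""
    shortfalls = {}
    for item, needed in required.items():
        have = available.get(item, 0)
        if have < needed:
            shortfalls[item] = needed - have
    return shortfalls


def go_no_go(required, available):
    """Return a (decision, shortfalls) pair for the pre-plan review."""
    shortfalls = missing_resources(required, available)
    decision = "GO" if not shortfalls else "NO-GO: escalate"
    return decision, shortfalls


# Hypothetical pre-plan for the pump overhaul.
required = {"laser alignment tool": 1, "torque wrench": 1, "technician": 2}
available = {"torque wrench": 1, "technician": 1}

print(go_no_go(required, available))
```

Returning the shortfalls alongside the decision gives the planner something concrete to escalate rather than a bare refusal.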


7.4. Enhancing Supervision


  • Define supervisor workload limits

  • Use digital checklists with mandatory sign-offs

  • Introduce independent QA inspections for critical tasks


7.5. Fostering a Learning Culture


  • Reward error reporting

  • Conduct after-action reviews for all significant maintenance events

  • Share ICAM learnings across sites


8. Metrics and Monitoring

Use ICAM findings to drive measurable improvement:


  • Reduction in repeat maintenance errors

  • Increased compliance with QA steps

  • Fewer emergency call-outs

  • Improved Mean Time Between Failures (MTBF)
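Two of these measures reduce to simple ratios. The sketch below uses the standard definition of MTBF (total operating time divided by the number of failures) and treats QA compliance as signed steps over required steps; the function names and sample figures are illustrative assumptions, not values from any real site.

```python
def mtbf(operating_hours, failures):
    """Mean Time Between Failures: operating hours divided by failure count."""
    if failures == 0:
        return None  # undefined when no failures occurred in the period
    return operating_hours / failures


def qa_compliance(signed_steps, required_steps):
    """Fraction of required QA steps that were actually signed off."""
    if required_steps == 0:
        raise ValueError("required_steps must be greater than zero")
    return signed_steps / required_steps


# Illustrative figures only.
print(mtbf(1200, 4))          # 300.0 hours
print(qa_compliance(18, 20))  # 0.9
```

Comparing both figures before and after the recommendations are implemented is one way to see whether an ICAM action actually changed anything.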


Track recommendations through an Action Plan with owners, deadlines, and verification of effectiveness, a critical ICAM step that is often overlooked.
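An action register along these lines might look like the following sketch. The recommendations are taken from the pump case study, while the `Action` fields, status labels, owners, and dates are hypothetical. The point it illustrates is that an action only counts as closed once its effectiveness has been verified.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass
class Action:
    recommendation: str
    owner: str
    due: date
    completed_on: Optional[date] = None
    effectiveness_verified: bool = False


def status(action: Action, today: date) -> str:
    """Classify an action; completion alone is not enough to close it."""
    if action.completed_on is None:
        return "overdue" if today > action.due else "open"
    if action.effectiveness_verified:
        return "closed"
    return "awaiting verification"


# Owners and dates are invented for illustration.
today = date(2024, 6, 1)
actions = [
    Action("Establish tool availability check in pre-plan", "Planning lead",
           date(2024, 5, 15), date(2024, 5, 10), True),
    Action("Implement go/no-go QA sign-off", "QA lead",
           date(2024, 5, 20), date(2024, 5, 18)),
    Action("Create an escalation protocol for missing resources",
           "Maintenance manager", date(2024, 5, 25)),
    Action("Reinforce alignment training and verification", "Training lead",
           date(2024, 7, 1)),
]

for action in actions:
    print(f"{status(action, today):<22} {action.owner:<20} {action.recommendation}")
```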


9. Challenges in Applying ICAM to Maintenance


  • Data availability: Limited documentation or informal practices hinder analysis.

  • Cultural resistance: Technicians may fear blame.

  • Investigation fatigue: Frequent small failures may not all warrant full ICAMs.

  • Scope creep: Distinguishing between isolated issues and systemic patterns requires discipline.


Mitigation strategies include scaled ICAMs (lite versions for minor events) and strong communication of learning intent.


10. The Payoff: From Reactive to Proactive

By systematically applying ICAM to maintenance failures, organisations move from reactive repairs to proactive risk control. Key benefits include:


  • Enhanced asset reliability

  • Reduced downtime and costs

  • Stronger safety and compliance

  • An empowered workforce engaged in learning


Every maintenance failure becomes a learning opportunity, not just a repair job.


Conclusion

Maintenance failures are not inevitable accidents; they are signals of deeper system vulnerabilities. ICAM transforms these events into powerful catalysts for organisational learning. By examining the defences, actions, conditions, and organisational factors, leaders gain clarity on what truly drives reliability and on how to build a culture where doing it right is the easy, obvious choice.

In the long run, applying ICAM to maintenance isn’t just about preventing the next breakdown; it’s about building a workplace where learning never stops and excellence becomes routine.

 
 
 
