Charli Howe, ADINJC General Council

Introduction

One of the standout sessions at ICE Live 2026 was the Workshop on Evaluation for Intervention, delivered by Ian Edwards MSc.

It was a thoughtful, grounded, and at times challenging exploration of what it really means to evaluate virtual reality (VR) and other safety interventions.

This is especially true in road safety, where good intentions don’t always lead to good outcomes.

What Do We Mean by “Evaluation”?

Ian offered a refreshingly direct definition:

“Evaluation is a process that produces objective evidence of the outcome of an intervention.”

The key word here is objective.

Ian framed evaluation as a question:
Is there sufficient evidence to stand up in a court of law?
If the answer is no, then regardless of how convincing or innovative an intervention feels, it cannot be justified.

As Ian put it:

“If you can’t objectively demonstrate that something works, you may be wasting money and, worse, you could be making the problem bigger.”

When Interventions Backfire

To underline this point, Ian shared an example from a review of novice driver skid-control programmes. These interventions were designed to improve vehicle handling and reduce crashes.

Instead, the evaluation revealed the opposite effect.

The programmes:

  • Encouraged drivers to underestimate danger
  • Increased driver confidence without increasing judgement
  • Led to higher speeds and increased collision rates

This is a well-documented phenomenon in road safety known as risk compensation. When people feel more skilled or protected, they may take greater risks.

Without evaluation, these unintended consequences would have remained invisible.

The Kirkpatrick Model: A Practical Framework

A significant part of the session focused on the Kirkpatrick Model (1959), a four-level framework widely used to evaluate training and interventions.

Ian highlighted its particular value for practitioners because it forces clarity about what is being measured and why.

Together, the four levels provide a structured way to assess impact.

Level 1: Reaction

This is the simplest level of evaluation to carry out.

  • Did participants enjoy it?
  • Was it engaging or immersive?
  • Did they find it useful or thought-provoking?
  • Would they recommend it?

Reaction data is helpful, but on its own it tells us very little about effectiveness.

Level 2: Learning

This level asks whether learning has actually occurred.

  • Has understanding changed?
  • Has knowledge increased?
  • What new insights have participants gained?

Ian shared evaluation data showing clear increases in knowledge and understanding after participants experienced the VR content.

This is encouraging but still not enough.
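To illustrate how Level 2 results might be quantified, here is a minimal sketch in Python comparing pre- and post-intervention knowledge scores. The function name and the quiz data are my own illustrative assumptions, not part of Ian's evaluation.

```python
from statistics import mean, stdev

def paired_change(pre, post):
    """Mean pre-to-post change in score, plus a standardised effect
    size (Cohen's d for paired data: mean difference / SD of differences)."""
    diffs = [after - before for before, after in zip(pre, post)]
    return mean(diffs), mean(diffs) / stdev(diffs)

# Hypothetical knowledge-quiz scores (out of 10) before and after a session
pre_scores = [4, 5, 6, 5]
post_scores = [7, 8, 8, 9]
change, effect = paired_change(pre_scores, post_scores)
```

Reporting a standardised effect size alongside the raw change makes results comparable across different questionnaires and interventions.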

Level 3: Behaviour

This is where many interventions falter. Learning does not automatically translate into behaviour change.
Ian used smoking as a simple example: most smokers understand the risks extremely well, yet behaviour persists.

Behaviour change:

  • Is influenced by habit, context, incentives, and social norms
  • Requires more than information alone

In the VR evaluation data, behavioural intention scores did increase, though not as strongly as learning outcomes; this gap is a pattern commonly seen across safety education.

Level 4: Results

The final and most important level asks:

  • Did the intervention succeed?
  • Were the outcomes what we expected?
  • Did it reduce harm, risk, or negative outcomes?

This is also the hardest level to measure, as it often requires long-term data, comparison groups, and careful control of confounding factors.

Together, these four levels form the backbone of a robust evaluation.

Designing Evaluation the Right Way Round

A key theme running through Ian’s session was that evaluation must be designed before the intervention, not retrofitted afterwards.

The evaluation process should include:

What needs to be measured

  • Choose an intervention that fits your objective, not the other way around
  • Be clear on what success looks like and understand the entire intervention, not just the VR element

Evaluation design

  • Decide early how outcomes will be assessed and compared

Data requirements

  • Quantitative data (scores, measures, rates)
  • Qualitative data (feedback, perceptions, lived experience)

Measures

Ensure measures are:

  • Reliable
  • Valid
  • Capable of detecting improvement
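To make "reliable" concrete: the internal consistency of a multi-item questionnaire is often summarised with Cronbach's alpha. A minimal sketch is below; the function name and data layout are my own assumptions, not something presented in the session.

```python
from statistics import pvariance

def cronbach_alpha(item_scores):
    """Cronbach's alpha for internal consistency.
    item_scores: one list of participant scores per questionnaire item."""
    k = len(item_scores)
    item_vars = sum(pvariance(item) for item in item_scores)
    # Each participant's total score across all items
    totals = [sum(vals) for vals in zip(*item_scores)]
    return (k / (k - 1)) * (1 - item_vars / pvariance(totals))
```

Values near 1 indicate that the items move together (consistent measurement); a common rule of thumb treats alpha above roughly 0.7 as acceptable for research use.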

Sampling

Effect size matters.

  • Small effects require large samples
  • Larger effects can be detected with smaller samples, though small samples carry a higher risk of error
  • As a rough guide, Ian suggested a minimum sample size of around 100 participants
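The relationship between effect size and sample size can be sketched with a standard power calculation. The following Python fragment uses the conventional normal-approximation formula for comparing two group means (two-sided alpha = 0.05, power = 0.80); the function name is my own, and this is a rough planning aid rather than a substitute for proper power analysis.

```python
from math import ceil

Z_ALPHA = 1.960   # 97.5th percentile of the standard normal (two-sided 0.05)
Z_BETA = 0.8416   # 80th percentile of the standard normal (power = 0.80)

def sample_size_per_group(effect_size):
    """Approximate participants needed per group to detect a difference
    of `effect_size` (Cohen's d, in standard-deviation units)."""
    return ceil(2 * ((Z_ALPHA + Z_BETA) / effect_size) ** 2)
```

For a small-to-medium effect (d = 0.4) this gives roughly 99 participants per group, which sits close to the rough guide of around 100 mentioned in the session; halving the effect size roughly quadruples the required sample.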

Data analysis and reporting

  • Results should be published where possible so others can scrutinise and learn from them
  • In road safety, the Road Safety GB Knowledge Centre is a strong platform for sharing findings

Ethics and Legal Considerations

Evaluation isn’t just technical; it’s ethical.

Ian emphasised the importance of:

  • Informed consent
  • Minimising harm to participants
  • The right to withdraw data
  • Confidentiality and data protection
  • Professional competence
  • Openness and honesty in reporting results

These considerations are especially important when immersive technologies like VR are involved, where emotional and physical responses can be stronger than those produced by traditional training methods.

Practical Tools for Practitioners

For those looking to put this into practice, there is an interactive evaluation methods document available via the National Fire Chiefs Council (NFCC).

It provides structured guidance on evaluation design and is a genuinely useful resource for anyone working with interventions, not just VR.


Final Thoughts

This workshop was a timely reminder that innovation alone does not equal impact.

VR can be immersive, engaging, and powerful, but without rigorous evaluation we simply don’t know whether it is helping, doing nothing, or quietly making things worse.

As Ian Edwards demonstrated so clearly, good evaluation isn’t about proving we’re right; it’s about making sure we’re not wrong, and that’s a responsibility no safety practitioner can afford to ignore.
