Reframing XAI from model outputs to system understanding

Role

Design Researcher

Timeline

9 weeks

Company

CMU SEI

Description

Previous prototypes explained individual matches. I shifted focus to system-level data quality monitoring that addressed real workflow needs.

Problem

The SEI had built explainable AI prototypes that users weren't engaging with, and the team didn't know what to create next.

The needfinding had been too narrow. We knew how researchers used the matching interface. We didn't know why they were matching in the first place.

What we knew

How researchers use the matching interface: upload images, review suggestions, verify identifications.

What we didn't know

Why researchers were matching animals and what they were trying to accomplish beyond the interface.

Solution

I expanded the needfinding and shifted the prototype direction to system explainability.

By understanding the full research workflow, I reframed explainable AI (XAI) from algorithm-level outputs to system-level data transparency.

Deliverables

Analysis of 37 use cases

Coded researcher workflows and data collection methods to understand why researchers match animals.

Data quality monitoring prototype

A proof-of-concept that shifted explainability from individual match explanations to system-level data source transparency.

Impact

The research shifted how both WildMe and the SEI thought about XAI implementation.

"

Knowing these use cases is extremely helpful, we will be data mining this for a long time.

– Executive Director, WildMe

Ripple into research

The project's findings contributed to a new direction in the SEI's XAI research, moving toward questions about what happens when XAI is imperfect and how flawed explanations affect human decision-making in practice.

Approach

Navigating uncharted waters

Problem

Previous prototypes didn't resonate with users.

The SEI had already created XAI prototypes that explained individual match decisions using saliency maps and confidence indicators. But during testing, users weren't engaging with them.

"

This is great, but I'm not sure it will actually speed up my matching process. It might make my teammembers more confused.

– Shark researcher

Pivot

Sketching led to new ideas, but revealed gaps in our knowledge of researcher workflows.

A sketch of an interactive map that can be scrubbed like a video to view animal sightings in a given area over time.

Early sketches assumed researchers wanted to analyze patterns in their data.

But I couldn't answer a more basic question:

Why were researchers matching animals in the first place?

Research

I went back to research with a wider lens to understand why researchers match.

Without time for extensive user interviews, I analyzed 37 academic papers published by WildMe users, looking for their research questions, data collection methods, workflows, and pain points.

Findings

1

Researchers pulled data from multiple sources with varying quality.

73%

of papers relied on crowd-sourced data outside the research team.

Camera traps, citizen scientists, ecotourism operators, and external research teams each produced images with different quality and conditions. Managing data source variety was a constant struggle.

which meant,

2

Poor data quality caused inaccurate matches, which undermined projects.

A bad image upstream didn't just produce a wrong match. It corrupted population counts, skewed movement models, and invalidated fieldwork.

which mattered because,

3

Matching was never the destination; it was just a piece of something bigger.

Researchers were matching animals to feed statistical models that calculated things like population demographics and movement tracking. The match was a data point, not a finding. Getting it wrong had consequences beyond the interface.

Key insight

Users needed explainability at the system level, not the algorithm level.

To make accurate research deductions, researchers needed accurate matches. To make accurate matches, they needed good input data. Solving data quality at the source would let researchers focus on their findings, not debug their methods.

Design direction

I shifted the prototype focus from algorithm explainability to system explainability.

Rather than explaining individual AI outputs, I explored interfaces that helped researchers understand data collection over time.

Key feature

Dashboard surfaces gaps in image source coverage and collection quality across time periods and locations.

Key feature

Source-by-source breakdowns provide insight into individual collection method performance.

Why this was different

Previous prototypes explained problems after the fact, but this one prevented issues at the source.

Solving data quality at the source meant researchers could focus on the findings and impact of their work, not on debugging the methods that got them there.

Previous prototypes

Reactive explainability

Explained why the AI made a specific match after it happened. Helpful, but researchers already had workarounds.

"This match has high confidence."

New prototype

Proactive transparency

Surfaced data quality issues before they produced bad matches. Helped researchers intervene upstream, where the actual problem lived.

"Camera A's image quality has degraded. Let's reposition it to get better matches."

Tradeoffs

I chose a narrow scope and a new direction. What did that cost?

What I gave up

Breadth and the original brief

vs

What I moved toward

Depth and real user needs

Narrow scope to camera trap data quality.

Gave up

Coverage of manual surveys, citizen science, and ecotourism data sources. Each had its own quality challenges that went unaddressed.

The case

Gave WildMe a concrete, extendable idea rather than a broad prototype that didn't solve researchers' challenges.

Redirect from evaluating XAI techniques to defining what XAI should do.

Gave up

The SEI's original goal of formally evaluating existing XAI methods. That evaluation didn't happen.

The case

The SEI and WildMe had different definitions of success. This direction served both more honestly than building another prototype users would ignore.

Reflections

Asking why earlier would have changed everything.

This project was ambiguous from the start. The stakeholders had different goals, the problem space was undefined, and XAI itself is still largely unproven in real-world systems.

1

Don't design on assumptions you haven't tested

I spent the first weeks sketching prototypes built on assumptions about what researchers wanted. A single question, asked earlier, would have redirected the work before it started.

2

Ask why until the answer changes the problem

The most valuable thing I did on this project was press on a question everyone else had skipped. "Why are researchers matching animals?" seemed obvious, but asking it reframed the entire design direction.

3

Ambiguity is not a blocker

Without a clear end goal and with two stakeholders pulling in different directions, it was easy to lose the thread. Going back to users when everything else was unclear was the only step that consistently produced direction.