Federal Health Agency

Informing High-Stakes Procurement Through Comparative Usability Evidence

Context & Problem

 

A 300,000-person federal health agency needed to select a new point-of-care (PoC) Clinical Information Resource (CIR) vendor. Vendor proposals alone could not answer critical questions such as:

  • Which system best supports real clinical information-seeking under time pressure?

  • How do usability differences translate into efficiency, accuracy, and perceived usefulness?

  • Where does prior familiarity bias distort evaluation results?

Choosing the wrong system would scale inefficiencies across thousands of clinicians and other medical personnel, ultimately affecting patient care.

My Role

 

Senior UX Researcher: I supported the study design, task development, data collection, synthesis, and reporting.

I worked with a six-person evaluation team that included licensed independent practitioners, clinical informaticists, and usability engineers.

Research Strategy & Rationale

 

We designed a comparative usability study that balanced research rigor with real-world constraints, intentionally combining:

  • Quantitative performance metrics

  • Standardized perception measures based on the Technology Acceptance Model (TAM)

  • Qualitative observations during live clinical tasks

This combination allowed decision-makers to compare products beyond surface-level satisfaction scores.

Methods

 

We conducted moderated, 60-minute usability testing sessions with 45 clinicians. Each session included:

  • Scripted clinical scenarios with standardized information-seeking tasks

  • Counterbalanced product order to reduce order effects and familiarity bias

  • A mix of quantitative and qualitative measures, including:

    • Task completion time

    • Search success and clinical correctness

    • Perceived usefulness and ease of use (TAM) scores
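The counterbalancing above can be sketched in code. This is a hypothetical illustration, not the study's actual assignment procedure: the product names and the number of systems are invented, and the rotation simply cycles through every possible ordering so that no system is consistently seen first.

```python
from itertools import permutations

# Illustrative sketch: rotate participants through all possible product
# orderings so each ordering is used roughly equally often.
# (Product names and counts are hypothetical, not from the study.)
products = ["System A", "System B", "System C"]
orders = list(permutations(products))  # all 6 possible orderings

def assign_order(participant_index: int) -> tuple:
    """Cycle through orderings so each appears equally often."""
    return orders[participant_index % len(orders)]

# 45 participants, as in the study
assignments = [assign_order(i) for i in range(45)]
# Across 45 participants, each of the 6 orderings is used 7 or 8 times.
```

With a sample that is not an exact multiple of the number of orderings, a few orderings are necessarily used one extra time; a real protocol would also randomize which participants receive which rotation slot.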

Key Risks Addressed

 

Familiarity Bias

All participants had prior experience with the incumbent system. I helped design recruitment and orientation protocols to limit how this bias influenced interpretation of the results.

Measurement Accuracy

Manual observation introduced variability in timed tasks. We implemented dual-observer reconciliation and clear task-timing boundaries to improve the reliability of recorded task times.

Insights

 

This evaluation surfaced meaningful, statistically significant differences between clinical information resource systems across:

  • Efficiency of information retrieval

  • Accuracy of clinical interpretation

  • Perceived usefulness under real task conditions

In addition, qualitative observations helped distinguish true usability issues from clinician knowledge gaps and environmental distractions. This prevented over-attributing task failures to the tools themselves.
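To make the "statistically significant differences" claim concrete, here is a sketch of the kind of paired comparison such a study typically relies on: each clinician completes the same task on both systems, and a paired t statistic tests whether the mean difference in task time is nonzero. The data values below are invented for illustration, and the actual analysis may have used different tests.

```python
import math
from statistics import mean, stdev

# Invented task times (seconds) for the same 8 clinicians on two systems;
# a within-subjects design pairs each clinician's times across systems.
times_system_a = [95, 110, 102, 88, 120, 99, 105, 93]
times_system_b = [80, 98, 90, 85, 104, 88, 95, 84]

# Paired t statistic: mean difference over its standard error.
diffs = [a - b for a, b in zip(times_system_a, times_system_b)]
t_stat = mean(diffs) / (stdev(diffs) / math.sqrt(len(diffs)))
print(f"paired t statistic: {t_stat:.2f}")  # compare to a t critical value
```

A within-subjects (paired) test is the natural fit here because every participant used every system, which controls for large individual differences in clinician speed.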

Impact

 

The findings of this evaluation directly supported:

  • Evidence-based vendor comparison

    • This study replaced vendor claims and anecdotal preferences with comparative, statistically grounded usability evidence, directly supporting a major procurement decision.

  • Risk-aware procurement decisions

    • Measures such as task time, search success, and clinical correctness translated individual user behaviors into system-level implications for productivity and patient safety.

  • Clear tradeoffs between familiarity, efficiency, and long-term scalability

    • By separating prior familiarity bias from true usability performance, the research clarified which advantages were temporary and which would scale across the organization.

Strategic Outcome

 

Leadership teams could move forward with confidence, backed by empirical evidence rather than vendor claims or anecdotal feedback.

Without controlled comparative testing, the agency risked selecting a system based on familiarity or perception, potentially scaling inefficiencies across thousands of clinicians and other medical personnel.
