The problem

Would customers actually understand what the bar was telling them?

Lifeway was introducing a Savings Progress Bar to the checkout experience — a UI showing customers their current discount tier and free shipping eligibility as they built their cart. Three design variations were created, each communicating the same pricing system through different visual hierarchies, labels, and progress indicators.

The question wasn't which design looked better. It was whether customers would understand it. If users misread the pricing system, they'd face unexpected totals at checkout — eroding trust at the highest-friction point in the purchase journey.

Objective 01

Progress bar comprehension

Do users understand what the two tracking bars represent — savings thresholds vs. free shipping eligibility?

Objective 02

Savings understanding

Can users correctly identify what discounts are active and what they need to spend to unlock the next tier?

Objective 03

Comparative design clarity

Across all three design concepts, which communicates the pricing system most accurately?

The approach

Unmoderated and comparative — each participant saw one design only.

A between-subjects design eliminated order bias. Participants reviewed one design and answered a structured set of questions — multiple choice comprehension tasks, 5-point rating scales, and open-ended questions — analyzed together for a complete picture. Results were compared across participant groups post-study.
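One way to picture the between-subjects setup is a balanced random assignment: every participant is dealt exactly one design, and group sizes stay even. The sketch below is illustrative only. The participant IDs, the group size of 30, and the function name `assign_between_subjects` are assumptions, not details from the study.

```python
import random
from collections import Counter

DESIGNS = ["Design 1", "Design 2", "Design 3"]

def assign_between_subjects(participant_ids, designs=DESIGNS, seed=None):
    """Randomly assign each participant to exactly one design, keeping groups balanced."""
    rng = random.Random(seed)
    shuffled = list(participant_ids)
    rng.shuffle(shuffled)
    # Deal shuffled participants round-robin so group sizes differ by at most one.
    return {pid: designs[i % len(designs)] for i, pid in enumerate(shuffled)}

# Hypothetical participant IDs, for illustration only.
assignment = assign_between_subjects([f"P{n:02d}" for n in range(1, 31)], seed=7)
group_sizes = Counter(assignment.values())
print(group_sizes)
```

Because no participant ever sees a second design, there is no "first design" to color the reading of the next one. The cost is that each design is judged by a different group of people, so groups need to be large enough and comparably recruited.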

Feb 6

Request received

Feb 12

Study in field

Feb 13

Analysis complete

Feb 16

Executive shareout

4 days

The results

Design 3 felt easiest. Design 2 was understood best. Those are not the same thing.

Users rated Design 3 highest on ease of use (4.50) and savings clarity (4.50). But Design 2 outperformed it on every actual comprehension measure: 92% vs. 76% on core accuracy and 85% vs. 69% on overall understanding, a 16-point gap on each.

Winner on comprehension

Design 2

92%

Core accuracy

85%

Overall understanding

Outperformed both other designs on every comprehension measure. Savings clarity rating: 4.30. Ease of use: 3.90. Users understood what was happening even when they rated it slightly harder to process.

Felt easiest — but

Design 3

76%

Core accuracy

69%

Overall understanding

Highest ease rating (4.50) and highest savings clarity rating (4.50), but underperformed Design 2 on every actual comprehension measure. It felt simple. It was less well understood.

Perceived ease ≠ actual understanding.

A design that feels simpler can still ship users into confusion — especially when the stakes involve pricing. Optimizing for feeling over comprehension is a risk this data made visible.
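The inversion is easy to see when the reported figures are placed side by side. The sketch below uses only the numbers stated in the results above; the dictionary layout and the helper names `leader` and `gap` are assumptions made for illustration. Design 1 is left out because its scores are not reported here.

```python
# Figures as reported in the results above (ratings on a 5-point scale,
# comprehension as percent correct). Design 1 is not reported, so it is omitted.
results = {
    "Design 2": {"ease": 3.90, "savings_clarity": 4.30,
                 "core_accuracy": 92, "overall_understanding": 85},
    "Design 3": {"ease": 4.50, "savings_clarity": 4.50,
                 "core_accuracy": 76, "overall_understanding": 69},
}

def leader(metric):
    """Return the design with the highest value on the given metric."""
    return max(results, key=lambda design: results[design][metric])

def gap(metric):
    """Return Design 2's value minus Design 3's value on the given metric."""
    return results["Design 2"][metric] - results["Design 3"][metric]

for metric in ("ease", "savings_clarity", "core_accuracy", "overall_understanding"):
    print(f"{metric}: leader={leader(metric)}, Design 2 minus Design 3 = {gap(metric):+.2f}")
```

Design 3 leads on both perception ratings, while Design 2 leads on both comprehension measures by 16 percentage points.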

Recommendation

Ship Design 2 — and selectively incorporate Design 3 where it genuinely outperformed.

Primary

Ship Design 2

Highest core accuracy (92%) and overall comprehension (85%). The recommendation is grounded in evidence across all three comprehension metrics — not perception. This is the design that correctly communicates the pricing system to the most users.

Secondary

Incorporate Design 3 refinements

Design 3 excelled at perceived savings clarity. Selective copy and visual elements from Design 3 can be layered into Design 2 without compromising comprehension — getting the accuracy of 2 with the clarity signals of 3 where they don't conflict.

If stakeholders remain divided

Run a live A/B test

A live A/B between Design 2 and a refined hybrid provides real behavioral data at scale. If the stakeholder conversation continues after refinements are applied, the A/B gives you production-level evidence to close it.

Reflection

What worked and what I'd do differently.

What worked well

Combining quant and qual analysis produced a richer picture than either alone: the open-ended responses added texture to the comprehension numbers.
Screener was designed to match actual Lifeway customer segments rather than general population panels. Results were more applicable because of it.
The TL;DR summary was written for exec audiences, not just researchers. The team could act on it immediately without a translation layer.

What I'd do differently

Add a think-aloud component for a subset of participants to capture in-the-moment confusion — post-task reflection misses what happens in the moment of processing.
Test a fourth hybrid design combining Design 2 comprehension with Design 3 perceived clarity before recommending A/B. The recommendation was logical but the hybrid wasn't validated. That's a gap.

Common questions

Comprehension studies and comparative design evaluation

What is a comprehension study in UX research?

A comprehension study measures whether users correctly understand what a design is communicating — not just whether they find it easy to use. It uses structured comprehension tasks alongside rating scales and open-ended questions to separate perceived clarity from actual accuracy. This type of study is especially valuable when the design communicates complex information like pricing systems, multi-tier discounts, or eligibility thresholds.
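As a rough sketch of how the two signals are kept apart, comprehension can be scored against an answer key while ease is averaged from the rating scale. Everything in the example below is hypothetical: the question names, the answer key, and the three responses are invented for illustration and are not the study's actual instrument or data.

```python
from statistics import mean

# Hypothetical answer key; the study's real questions are not reproduced here.
ANSWER_KEY = {
    "q1_current_tier": "B",
    "q2_spend_to_next_tier": "C",
    "q3_free_shipping": "A",
}

# Hypothetical responses: multiple-choice answers plus a 1-5 ease rating.
responses = [
    {"answers": {"q1_current_tier": "B", "q2_spend_to_next_tier": "C", "q3_free_shipping": "A"},
     "ease_rating": 3},
    {"answers": {"q1_current_tier": "B", "q2_spend_to_next_tier": "A", "q3_free_shipping": "A"},
     "ease_rating": 5},
    {"answers": {"q1_current_tier": "D", "q2_spend_to_next_tier": "A", "q3_free_shipping": "A"},
     "ease_rating": 5},
]

def accuracy(answers, key=ANSWER_KEY):
    """Fraction of comprehension questions answered correctly."""
    correct = sum(answers.get(question) == right for question, right in key.items())
    return correct / len(key)

measured = mean(accuracy(r["answers"]) for r in responses)   # actual comprehension
perceived = mean(r["ease_rating"] for r in responses)        # self-reported ease

print(f"Mean comprehension accuracy: {measured:.0%}")
print(f"Mean ease rating (1-5): {perceived:.2f}")
```

In this made-up sample, the two participants who rated the design easiest also made the most errors, which is the kind of pattern a ratings-only study would miss.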

How do you run a comparative design study without order bias?

Each participant sees only one design, eliminating order effects that occur when participants evaluate designs sequentially. Results are then compared across participant groups post-study. This between-subjects approach requires more participants than a within-subjects design but produces cleaner data — especially important when measuring comprehension, where seeing one design first would color perception of the next.
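The participant cost shows up when comparing accuracy rates between groups. A minimal sketch, assuming a two-proportion z-test (a standard choice, not necessarily the analysis used in this study): the same 92% vs. 76% split is tested at two hypothetical group sizes. The group sizes of 25 and 100 are assumptions chosen for illustration, since this write-up does not report sample sizes.

```python
from math import erf, sqrt

def two_proportion_z_test(correct_a, n_a, correct_b, n_b):
    """Two-sided z-test for a difference between two independent proportions."""
    p_a, p_b = correct_a / n_a, correct_b / n_b
    pooled = (correct_a + correct_b) / (n_a + n_b)
    standard_error = sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_a - p_b) / standard_error
    # Standard normal CDF via erf, then a two-sided p-value.
    normal_cdf = 0.5 * (1 + erf(abs(z) / sqrt(2)))
    p_value = 2 * (1 - normal_cdf)
    return z, p_value

# Hypothetical group sizes; only the 92% and 76% rates come from the study.
z_small, p_small = two_proportion_z_test(23, 25, 19, 25)      # 92% vs. 76%, n = 25 per group
z_large, p_large = two_proportion_z_test(92, 100, 76, 100)    # 92% vs. 76%, n = 100 per group

print(f"n=25 per group:  z={z_small:.2f}, p={p_small:.3f}")
print(f"n=100 per group: z={z_large:.2f}, p={p_large:.3f}")
```

Under these assumed sizes, the identical 16-point gap clears the conventional 0.05 threshold only with the larger groups. With small groups the normal approximation is also rough, so an exact test may be the safer choice.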

Why does perceived ease not always predict actual understanding?

Perceived ease reflects how effortful a design feels, which is a subjective experience. Actual comprehension measures whether users correctly understood what the design communicated, which is an objective outcome. Designs that feel simple sometimes achieve that by omitting information or reducing visual complexity in ways that prevent users from forming an accurate mental model. This study is a clear example: Design 3 felt easiest but trailed Design 2 on every comprehension measure (76% vs. 92% core accuracy, 69% vs. 85% overall understanding).

What research methods work best for evaluating checkout UI?

Checkout UI evaluation benefits from a combination of comprehension tasks (to test accuracy), rating scales (to measure perceived ease and clarity), and open-ended questions (to capture users' own language and confusion points). Adding a think-aloud component, even for a subset of participants, surfaces confusion at the moment of processing rather than in post-task reflection. For high-stakes checkout elements like pricing systems, behavioral data from A/B tests in production complements comprehension study findings.

Checkout Savings Comprehension Study
