[Hero image: two AI product interfaces side by side, with the left example flagged for unclear states and the right example showing labelled confidence and editable outputs.]

Designing AI Products: UX Principles That Actually Work

Berty Bhuruth

May 3, 2026

10 min read

TL;DR — AI products break the usual contract between user and interface. Outputs are probabilistic, the system can be confidently wrong, and users don't know whether to trust it. The UX work isn't decorative — it's what makes the technology usable. Here are eight principles we use at UntilNow when we design AI products that people actually keep coming back to.

Why AI product design is different

Traditional UX design operates on a fairly reliable contract. The user clicks a button and something predictable happens. Forms validate. Pages load. Cause and effect are clear.

AI breaks that contract.

When you're designing AI products, you're working with outputs that are probabilistic, not deterministic. The same input can produce different results. The system can be confident and wrong, or right and unable to explain why. And users — reasonably — don't know whether to trust it.

That's what makes AI product design genuinely hard. You're not just designing screens; you're designing the relationship between a person and a system that behaves more like a colleague than a calculator. A colleague who's sometimes brilliant, sometimes confused, and never quite tells you how they got to the answer.

The teams shipping the AI products people actually use aren't the ones with the best models. They're the ones who've thought carefully about how the model meets the user — about uncertainty, trust, and control. The eight principles below are the patterns we keep coming back to when we work on AI products at UntilNow.

Principle 1: Set expectations clearly

Users need to know what the AI can do, what it can't, and roughly how well it does both. This sounds obvious. Almost nobody does it well.

ChatGPT gets this partly right — when you open a new conversation, it warns that the model can make mistakes and suggests checking important information. The placement is correct; the problem is that most users skim past it.

Grammarly does it better. Its writing suggestions come labelled — Clarity, Engagement, Delivery — so you understand the scope of the AI's opinion before you accept it.

What this looks like in practice:

  • Onboarding copy that's specific, not generic. "This AI summarises meeting notes and suggests action items. It won't always catch context from side conversations" beats "AI-powered productivity."
  • Inline capability boundaries. When Notion AI generates text, it places the output in context and clearly marks it as AI-generated. There's no ambiguity about what you're looking at.
  • Honest naming. Don't call it an "AI Assistant" if it does three things. Call it what it is.

The point isn't to undersell. It's to build trust by being straight with people about what the experience will actually be.

Principle 2: Design for uncertainty

AI outputs aren't binary. They sit on a spectrum of confidence, and your UX should reflect that.

Google Photos handles this well. When its AI identifies faces, it doesn't label them definitively — it groups similar ones and asks "Is this the same person?" Uncertainty becomes a collaborative interaction, not a potential mistake.

Linear takes a different but equally smart approach. Its AI project estimates come as ranges, not single numbers. "3–5 days" reads as honest in a way "3 days" doesn't.

Patterns that work:

  • Confidence indicators. Show users how sure the AI is. It doesn't have to be a percentage — "Best match" vs "Other suggestions" is enough.
  • Multiple options, ranked. Instead of one AI output, offer two or three with clear ranking. Spotify's Discover Weekly is a playlist, not a song, because the AI is openly guessing about your taste.
  • Visual differentiation. AI-generated content should look different from user-created content. A subtle border, background tint, or icon. Notion AI's purple sparkle does this without being intrusive.

When users can see how confident the system is, they trust it more — even when the underlying output is identical. Perception is part of the experience.
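
For teams wiring this up, the mechanics are lightweight. Here's a minimal TypeScript sketch of turning a raw confidence score into the tiered, plain-language labels described above; the types, threshold, and labels are illustrative assumptions, not any particular product's API.

```typescript
// Hypothetical shape of a ranked AI suggestion; names are illustrative, not a real API.
interface Suggestion {
  id: string;
  label: string;
  confidence: number; // 0-1 score from the model
}

// Map raw confidence onto the kind of plain-language tiers users actually read.
function confidenceTier(confidence: number): "Best match" | "Other suggestions" {
  return confidence >= 0.8 ? "Best match" : "Other suggestions";
}

// Group suggestions under those tiers, highest confidence first, so the UI can
// visually separate the confident pick from the rest.
function groupByTier(suggestions: Suggestion[]): Map<string, Suggestion[]> {
  const groups = new Map<string, Suggestion[]>();
  for (const s of [...suggestions].sort((a, b) => b.confidence - a.confidence)) {
    const tier = confidenceTier(s.confidence);
    groups.set(tier, [...(groups.get(tier) ?? []), s]);
  }
  return groups;
}
```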

Principle 3: Make it correctable

If users can't fix what the AI gets wrong, they'll stop using it. Every AI output should be editable, undoable, and overridable.

Figma AI nails this. Whatever its AI generates — design suggestions, auto-layouts — lands on the canvas as regular, fully editable layers. No "AI lock," no special mode. It's just another design element from the moment it appears.

ChatGPT's Canvas mode is similar. AI suggestions show as tracked changes that can be accepted, rejected, or modified individually. The user stays in control of every edit.

Grammarly understood this from day one. Every suggestion is one click to accept or dismiss. You can ignore the AI on any given suggestion without affecting future ones.

Correction patterns to build in:

  • Inline editing of AI outputs. Users should be able to click into AI-generated text and just type. No special mode required.
  • Undo that actually works. Reverting to the pre-AI state, not just undoing the last character. If the AI rewrote three paragraphs, one undo should bring all three back.
  • Feedback loops. Let users tell the AI why it was wrong — "Not relevant," "Too formal," "Wrong topic." This trains better outputs and gives users agency.
  • Override without friction. If the AI auto-categorises an email, changing the category should be one click, not three.

The principle is simple: AI should suggest, not dictate. The moment users feel locked into an AI's decision, trust evaporates.
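
If you're implementing the undo pattern above, the key is snapshotting the whole pre-AI state rather than relying on character-level undo. A rough TypeScript sketch, with hypothetical names and a deliberately simplified document model:

```typescript
// Snapshot-based undo for AI edits: one undo restores everything the AI changed,
// not just the last keystroke. Names and the document model are illustrative.
interface DocumentState {
  content: string;
}

class AiEditSession {
  private snapshots: DocumentState[] = [];

  // Capture the document before the AI touches it, then apply the AI output.
  applyAiEdit(current: DocumentState, aiOutput: string): DocumentState {
    this.snapshots.push({ ...current });
    return { content: aiOutput };
  }

  // Revert to the most recent pre-AI state in a single step.
  undoAiEdit(current: DocumentState): DocumentState {
    return this.snapshots.pop() ?? current;
  }
}
```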

Principle 4: Progressive disclosure

Don't dump every AI feature on users at once. Standard UX wisdom — but especially important for AI, where capabilities are unintuitive and the learning curve is steep.

Notion AI does this well. You start with a simple / command. As you use it, you discover it can summarise, translate, extract action items, change tone — but none of that complexity hits you on day one.

Spotify has run this play for years with its recommendation engine. Simple home feed first. Over time, Discover Weekly, then Daily Mixes, then AI DJ, then daylist — each layered in as you engage with the previous one.

How to layer AI features:

  • Start with one clear AI capability. What's the single most valuable thing your AI does? Lead with that.
  • Unlock based on usage. If someone hasn't used the basic summarisation feature, don't push the advanced "generate report from multiple sources" feature.
  • Use contextual prompts. Show AI capabilities when they're relevant — "AI can help you get started" on a blank page; "AI can spot anomalies" on a data view.
  • Let power users opt in. Custom prompts, model selection, advanced settings — keep them available but tucked away for the people who want them.
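
The "unlock based on usage" idea can be as simple as a gate over usage counts. A hypothetical TypeScript sketch; the feature names and thresholds are placeholders, not a recommendation for specific numbers:

```typescript
// Illustrative feature gating: advanced AI features appear only after the basics
// have actually been used. Feature names and thresholds are placeholders.
interface UsageStats {
  summariesRun: number;
  actionItemsExtracted: number;
}

function visibleAiFeatures(usage: UsageStats): string[] {
  // Lead with the single clearest capability; it is always available.
  const features = ["Summarise page"];
  if (usage.summariesRun >= 3) {
    features.push("Extract action items");
  }
  if (usage.summariesRun >= 3 && usage.actionItemsExtracted >= 3) {
    features.push("Generate report from multiple sources");
  }
  return features;
}
```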

Principle 5: Explain without lecturing

Users want to understand enough about how the AI works to trust it. They don't want a machine learning lecture.

Spotify Wrapped is the perfect balance. It tells you the AI "noticed" patterns in your taste — without ever mentioning collaborative filtering or neural networks. The explanation is in the user's language, not the engineer's.

Google Photos teaches its search capability with examples: "Try searching for 'beach' or 'birthday'." You learn the AI understands scene content without anyone explaining computer vision.

Transparency that actually helps:

  • Show the "why," not the "how." "We recommended this because you liked similar items" beats "Our collaborative filtering algorithm identified a cosine similarity score of 0.87."
  • Use plain-language labels. "Based on your recent activity" beats "Personalised using machine learning."
  • Offer detail on demand. A small "Why this suggestion?" link lets curious users dig deeper without cluttering the interface for everyone else.
  • Be honest about limitations. "This summary might miss nuance from longer discussions" is more useful than pretending the AI is perfect.

Transparency builds trust, but only when it's accessible. Match the depth of explanation to the consequence of getting it wrong — a writing tool needs less; a medical assistant needs much more.
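
One way to keep explanations in the user's language is to translate internal signals into pre-written plain-language reasons, with extra detail shown only when someone asks for it. A small TypeScript sketch along those lines; the signal names, copy, and score are invented for illustration:

```typescript
// Translate an internal recommendation signal into a plain-language "why",
// with extra detail only on demand. Signal names and copy are hypothetical.
type Signal = "similar_items_liked" | "recent_activity" | "popular_in_team";

const plainLanguage: Record<Signal, string> = {
  similar_items_liked: "Because you liked similar items",
  recent_activity: "Based on your recent activity",
  popular_in_team: "Popular with your team",
};

function explain(signal: Signal, detailRequested: boolean, score?: number): string {
  const base = plainLanguage[signal];
  // Only users who click something like "Why this suggestion?" see the extra detail.
  return detailRequested && score !== undefined
    ? `${base} (match strength: ${Math.round(score * 100)}%)`
    : base;
}
```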

Principle 6: Design the empty state

What happens before the AI has enough data to be useful? This is the moment most AI products fall apart.

If your recommendation engine needs 50 interactions to be accurate, what does the user see for the first 49? If your writing assistant needs context about someone's voice, what does it do on the first document?

Spotify solved this with onboarding. New users pick artists they like, giving the engine enough signal to start making useful suggestions immediately. The empty state is the training phase, disguised as a fun activity.

Linear pre-populates AI features with team-level data rather than waiting for individual usage. When you join a team, the AI already knows typical sprint velocities and project patterns. The empty state is never truly empty.

Empty state strategies:

  • Collect useful input during onboarding. Preferences, examples, goals — anything that gives the AI a head start.
  • Use sensible defaults. Pre-populate with industry benchmarks, popular choices, or team-level data until individual data accumulates.
  • Be transparent about the ramp-up. "Your recommendations will improve as you use the app. Here are popular picks to start with" sets expectations cleanly.
  • Make the empty state useful on its own. Even without personalisation, can the AI offer generic value — templates, popular content, getting-started flows? Don't let the empty state be literally empty.

[Inline image — three stages of an AI product: empty state with onboarding prompts, partially trained state, fully personalised state.]
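
In code, the empty-state fallback is often just a switch on how much signal the product actually has. A hypothetical TypeScript sketch; the 50-interaction threshold echoes the example above and isn't a magic number:

```typescript
// Empty-state fallback: lean on popular or team-level defaults until enough
// individual signal has accumulated. Names and threshold are illustrative.
interface Recommendation {
  title: string;
  source: "personalised" | "popular_default";
}

function recommendations(
  personalised: Recommendation[],
  popularDefaults: Recommendation[],
  interactionCount: number
): { items: Recommendation[]; rampUpMessage?: string } {
  if (interactionCount < 50 || personalised.length === 0) {
    return {
      items: popularDefaults,
      rampUpMessage:
        "Your recommendations will improve as you use the app. Here are popular picks to start with.",
    };
  }
  return { items: personalised };
}
```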

Principle 7: Handle errors gracefully

AI will be wrong. Not occasionally — regularly. Your error handling isn't a nice-to-have; it's what determines whether users give the AI another chance.

ChatGPT has improved here. When it generates code that doesn't work, it can acknowledge the error, explain what went wrong, and try again. The conversational format makes recovery feel natural rather than like a system failure.

Grammarly handles false positives gracefully — when it flags something that's actually correct, dismissing the suggestion takes one click, and the AI adjusts its confidence for similar patterns.

Error patterns for AI:

  • Distinguish AI errors from system errors. "The AI misunderstood your request" is different from "Something went wrong." Users need to know whether to retry, rephrase, or report a bug.
  • Offer recovery paths. When the AI is wrong, give users clear next steps — "Try rephrasing," "Choose from these alternatives," "Edit the output directly."
  • Don't pretend it didn't happen. If the AI generates something inaccurate or nonsensical, acknowledge it. "That doesn't look right — let me try again" goes a long way.
  • Learn from corrections visibly. When a user fixes the AI and the AI incorporates the fix, show it. "Got it — I'll remember that preference" builds confidence the product is improving.
  • Set up graceful fallbacks. When the AI genuinely can't help, hand off to a non-AI path. A search bar, a support link, a manual workflow — never leave users at a dead end.

Users are surprisingly forgiving of AI that gets things wrong, as long as the recovery path is obvious. They're much less forgiving of AI that fails silently.
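
A practical way to keep these error categories separate is to model them as distinct failure types, so the interface can't show a generic "something went wrong" for an AI misunderstanding. A rough TypeScript sketch; the categories and copy are illustrative assumptions:

```typescript
// Model AI failures separately from system failures so the UI can offer the
// right recovery path. Categories and copy are illustrative assumptions.
type AiFailure =
  | { kind: "ai_misunderstood"; suggestion: string }
  | { kind: "low_confidence"; alternatives: string[] }
  | { kind: "system_error"; fallbackLabel: string };

function recoveryMessage(failure: AiFailure): string {
  switch (failure.kind) {
    case "ai_misunderstood":
      return `That doesn't look right. ${failure.suggestion}`;
    case "low_confidence":
      return `Not sure about that one. Choose from: ${failure.alternatives.join(", ")}.`;
    case "system_error":
      // Never a dead end: hand off to a non-AI path such as manual search.
      return `Something went wrong on our side. ${failure.fallbackLabel}`;
  }
}
```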

Principle 8: Keep humans in the loop

Automation is powerful. But the moment users feel the AI is making decisions for them rather than with them, you've lost them.

Figma AI gets this. Its design tools suggest layouts and components, but the designer chooses what to accept. The AI accelerates the work without removing the human from the creative process.

Notion AI follows the same pattern. It'll draft content, but the draft sits in a clearly marked block you choose to keep, edit, or discard. The AI never overwrites your existing work without permission.

Google Photos, when it creates auto-generated albums or collages, presents them as suggestions you can accept, modify, or ignore. It never permanently reorganises your photo library without consent.

Human-in-the-loop patterns:

  • Confirm before acting. AI should preview its planned actions before executing them. "I'm about to reorganise these 200 files. Here's a preview — proceed?"
  • Offer manual alternatives. Every AI-automated workflow should have a manual override. If the AI sorts your inbox, you should be able to sort it yourself just as easily.
  • Respect corrections. If a user overrides the AI, don't re-suggest the same thing five minutes later. That's not persistent; it's annoying.
  • Separate suggestions from actions. AI that suggests you delete old files is helpful. AI that deletes old files is terrifying. The gap between suggestion and action should always involve explicit consent.
  • Let users adjust the automation level. Some users want aggressive automation. Others want minimal suggestions. Give them a dial, not a switch.
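
Under the hood, most of these patterns reduce to one rule: keep a gap between what the AI proposes and what actually runs. A hypothetical TypeScript sketch of that gap, including an automation-level setting; the names and the confirm callback are placeholders:

```typescript
// Keep a gap between what the AI proposes and what actually runs.
// Names and the confirm callback are placeholders, not a real API.
interface ProposedAction {
  description: string; // e.g. "Reorganise these 200 files into 5 folders"
  preview: string[];   // what will change, shown before anything happens
  execute: () => Promise<void>;
}

type AutomationLevel = "suggest_only" | "confirm_each" | "auto_with_undo";

async function runWithConsent(
  action: ProposedAction,
  level: AutomationLevel,
  confirm: (description: string, preview: string[]) => Promise<boolean>
): Promise<void> {
  if (level === "suggest_only") {
    return; // surface the suggestion in the UI, never act on it automatically
  }
  if (level === "confirm_each") {
    const approved = await confirm(action.description, action.preview);
    if (!approved) {
      return; // respect the rejection; don't immediately re-suggest the same action
    }
  }
  await action.execute(); // "auto_with_undo" still needs a working undo path
}
```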

The eight principles at a glance

| # | Principle | Core question | Real product example |
|---|-----------|---------------|----------------------|
| 1 | Set expectations clearly | Does the user know what the AI can and can't do? | Grammarly's labelled suggestion types |
| 2 | Design for uncertainty | Can users see how confident the AI is? | Google Photos' face-grouping confirmations |
| 3 | Make it correctable | Can users edit, undo, and override AI outputs? | Figma AI's fully editable design layers |
| 4 | Progressive disclosure | Are AI features introduced gradually? | Notion AI's / command discovery |
| 5 | Explain without lecturing | Do users understand the "why" without the "how"? | Spotify Wrapped's plain-language insights |
| 6 | Design the empty state | Is the product useful before the AI has data? | Spotify's artist-picking onboarding |
| 7 | Handle errors gracefully | Can users recover when the AI is wrong? | ChatGPT's conversational error recovery |
| 8 | Keep humans in the loop | Do users feel in control of AI actions? | Google Photos' suggested (not automatic) albums |

How we put this into practice

If you're building an AI product, the most useful thing we can tell you is this: bring a UX designer into the conversation before you've finished the model, not after.

The biggest mistake we see is teams treating AI UX as a skin you apply at the end. It isn't. How you handle uncertainty, errors, and user control shapes the architecture of the product itself. These aren't cosmetic decisions.

A useful external reference if you want to browse real-world AI patterns: Shape of AI maintains a curated library of interface patterns from shipping products. Worth a scroll when you're trying to figure out how others have solved a specific UX problem.

At UntilNow we work with founders and product teams on digital product design where these principles sit at the centre of the process — from concept through to shipped product. If your team is designing something with AI in it and you want it to actually work for real users, we should talk.

FAQ

What's the biggest UX mistake in AI product design?

Overpromising and underdelivering. Most AI products position themselves as magical, then fail to meet the expectations they've created. The fix is straightforward: be specific about what the AI does, show its limitations upfront, and design for the cases where it gets things wrong. Users are surprisingly forgiving of imperfect AI — as long as they weren't promised perfection.

How do you test UX for AI products when outputs are non-deterministic?

This is one of the trickiest parts of UX design for AI applications. You can't run a standard usability test where every user sees the same thing. Instead, test across a range of AI outputs — best case, average case, worst case. Prototype with realistic AI responses (including wrong ones) and watch how users react. Test the error recovery flows as rigorously as you test the happy path.

Should AI-generated content be visually labelled in the interface?

Yes, almost always. Users should be able to tell the difference between content they created, content the AI generated, and content the AI modified. It doesn't have to be heavy — a subtle icon, a different background colour, a small label. Notion AI and ChatGPT both do this with distinctive markers for AI-generated blocks.

How much should you explain about how the AI works?

Enough to build trust, not so much that you overwhelm people. The right amount depends on the audience and the stakes involved. A creative writing tool might only need "Inspired by your recent stories." A medical AI needs much more — "This assessment considered your symptoms, history, and current research; here are the primary factors." Match the depth of explanation to the consequence of getting it wrong.

When should you NOT use AI in a product?

When the cost of an error is too high and you can't design adequate safeguards. When a simpler, deterministic solution would work just as well. When the AI doesn't meaningfully improve the user's experience and you're adding it because investors expect it. AI for the sake of AI is bad product design — and users can spot it instantly.
