Almost every complaint I hear about AI sounds like a complaint about a model. "It hallucinates." "It is too slow." "It loses track of what I asked it to do." When you actually sit with the person making the complaint, though, what you usually find is that the model performed fine and the surface failed them.
The hallucination was unflagged. The wait was unindicated. The lost thread was the result of a session boundary that the product never showed. The model did the thing; the design did not catch the user when the thing went sideways.
Three patterns that keep failing
The first is the unindicated wait. Users will wait a remarkably long time for an AI response if the product is honest about what is happening. They will lose patience in five seconds if the interface goes silent. Streaming is a design choice, not a technical one.
The second is the unverifiable claim. The model produces a confident-sounding answer; the interface presents it as fact. The user has no easy way to ask "where did this come from" without re-asking the question. Citations, source highlights, and provenance trails are not nice-to-haves. They are the difference between an answer the user can trust and one they cannot.
The third is the lost thread. The model has a context window. The product has a session. The user has a project that spans days. When those three things are not aligned, the user ends up doing the coordination work themselves — remembering what they told the model yesterday, re-pasting context they should not have to re-paste, learning to be a context manager for software that is supposed to manage context for them.
The model did the thing. The design did not catch the user when the thing went sideways.
What good design adds
Good design in this space adds three things the model alone cannot. It adds verification surfaces — the citations, the previews, the "show your work" affordances that let a user trust an answer without re-doing the work. It adds recovery paths — the easy edits, the alternative drafts, the undo, the "try again with this constraint" button. And it adds continuity — the visible record of what the agent has been told, what it remembers, what it has done on the user's behalf.
The Nielsen Norman Group's research on AI UX patterns has been making this case carefully for a while now: the bar for good AI design is higher than for traditional software, because the system is doing more on the user's behalf and the user has less to anchor their trust to.
A short field guide
If you are auditing an AI feature for design quality, three questions will get you most of the way. What does the user see while they wait? If the answer is "a spinner," you have work to do. What does the user do when the model is wrong? If the answer is "start over," you have work to do. What does the user see when they come back tomorrow? If the answer is "nothing," you have work to do.
The capability is mostly there. The surface, mostly, is not. That is the work.