You ship a new onboarding flow. You ask one of your most engaged customers what she thinks. She smiles, says it is much cleaner than before, and tells you she would definitely use it.
Three weeks later, activation has not moved. The workaround she said she would abandon is still visible in session recordings. The feature is not broken. The feedback was.
That does not mean she lied to you. She was being kind. She wanted the call to go well. She could see that you cared about the answer. So she answered the social question in front of her, not the product question you thought you had asked.
This is social desirability bias: the pressure to answer in a way that sounds competent, cooperative, generous, modern, or acceptable to the person asking. In user research, it is one of the reasons founder interviews produce so many warm feelings and so little decision-quality evidence.
If you want honest product feedback, do not start by telling users to be honest. Design the conversation so honesty costs them less.
What social desirability bias means
The construct is usually traced to Douglas Crowne and David Marlowe’s 1960 work on people’s tendency to seek approval through culturally acceptable answers, the work behind the Marlowe-Crowne Social Desirability Scale. Springer’s reference entry describes the same bias as a tendency to respond in socially sanctioned ways, which is why the concept appears so often in survey and psychology research.
For product teams, the plain-English version is enough:
People shade answers towards the version of themselves they would rather present.
That version is not always fake. A user may sincerely believe they are organised, price-insensitive, security-conscious, eager to adopt new workflows, and likely to recommend good products to peers. In practice, they may ignore onboarding emails, avoid configuration work, delay renewals, forget features, and choose the cheaper workaround when the invoice arrives.
The gap is human. People want to be helpful. They do not want to look confused. They do not want to hurt the founder’s feelings. They often do not have clean access to why they did something, so they reconstruct a tidy explanation after the fact.
Calling this “users lying” is tempting, but it points you towards the wrong fix. The problem is usually not dishonesty. It is the interview situation.
Why founder interviews amplify the bias
Social desirability bias gets louder when the answer is visible, sensitive, or socially costly. That is exactly the shape of many founder-led interviews.
The participant is not talking to a neutral researcher. They are talking to the person who built the product, wrote the roadmap, and clearly hopes the answer is positive. The founder’s face is on the call. Their excitement is obvious. Their follow-up questions reveal what they want to be true.
In that setting, criticism has a social price.
Telling a founder “I could not understand your onboarding” may make the participant feel incompetent. Saying “I would not pay for this” may make them feel cheap or rude. Admitting “we bought it because the champion pushed it through, but nobody else uses it” may feel politically awkward. So the answer softens:
- “It was pretty clear.”
- “The price seems reasonable.”
- “I can see us using this.”
- “It is nice.”
Survey-method research gives the mechanism some weight. Kreuter, Presser, and Tourangeau compared telephone interviewer, interactive voice response, and web surveys, using administrative records for validation. Their study found that self-administered modes tended to reduce socially desirable reporting compared with interviewer-administered ones. Gnambs and Kaspar’s meta-analysis of sensitive-behaviour disclosure similarly found that computerised self-administration could increase reporting of socially undesirable behaviours compared with paper modes.
The practical lesson is not that every interview should become a survey. It is that the presence of an interested human changes the answer. A founder is a particularly interested human.
The evidence problem: what users say is not what they do
Social desirability bias is only one part of a larger evidence problem: self-report is a weak guide to behaviour.
Nielsen Norman Group’s “First Rule of Usability” article is blunt about this. In Nielsen and Levy’s analysis of 113 pairwise interface comparisons, stated preference and task performance correlated at 0.44. A later dataset focused on websites put the correlation at 0.53. That is not zero, but it is far too weak to treat preference as a substitute for behaviour.
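One way to make "far too weak" concrete is to square each coefficient. Under a simple linear reading, the square of a correlation is the share of variance the two measures have in common. The sketch below is only that arithmetic applied to the two figures quoted above; the function name is made up for illustration, and the linear interpretation is an assumption, not something the cited analyses claim.

```python
def shared_variance(r: float) -> float:
    """Return r squared: the share of variance two measures share under a linear model."""
    if not -1.0 <= r <= 1.0:
        raise ValueError("a correlation coefficient must lie in [-1, 1]")
    return r * r


# The two correlations quoted in the text; the labels are informal shorthand.
quoted = [
    ("interface comparisons", 0.44),
    ("website-focused dataset", 0.53),
]

for label, r in quoted:
    print(f"{label}: r = {r:.2f}, shared variance = {shared_variance(r):.0%}")
# interface comparisons: r = 0.44, shared variance = 19%
# website-focused dataset: r = 0.53, shared variance = 28%
```

Read that way, stated preference accounts for roughly a fifth to a quarter of the variation in performance, and the rest is unexplained by what people say they prefer.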
Nielsen’s example still fits product work: if users say they would buy more from a site with 3D product views, that does not prove 3D views will change buying behaviour. It proves the idea sounds appealing when described in a survey.
The same pattern shows up in SaaS research:
- “Would you use automated reporting?” sounds useful.
- “Would you pay for advanced permissions?” sounds responsible.
- “Would you recommend us?” sounds friendly.
- “Would you switch if we added this integration?” sounds plausible.
None of those answers has met a deadline, a budget holder, a migration cost, a teammate, a broken Zapier flow, or a credit card form. They are social answers about imaginary behaviour.
Even sincere intention is not the same as future action. Sheeran and Webb’s work on the intention-behaviour gap is a useful reminder: people often intend to do something and still fail to do it when real constraints arrive. For user research, that means “I would use this” is discounted twice: first because the participant may want to please you, and again because intention itself is not behaviour.
The three filters between reality and feedback
Nielsen’s diagnostic is useful because it shows how far a verbal answer sits from the thing you actually care about.
First, people edit for the room. They say what feels socially acceptable or what they think the interviewer wants to hear. That is social desirability bias proper.
Second, they remember imperfectly. Most product decisions depend on small moments: the field they skipped, the report they rebuilt, the teammate who ignored the invite, the point where they opened a spreadsheet instead. Memory compresses those details into a cleaner story.
Third, they rationalise. A participant may say they missed the button because it was too small. Maybe. All you know for sure is that they missed it. The explanation may be a theory that protects their self-image.
Stack those filters and the problem becomes obvious. “What did you think?” asks a user to summarise a messy event, preserve their self-image, and manage your feelings at the same time.
The answer can be pleasant and still be weak evidence.
How to design for honesty
You cannot remove social desirability bias by asking, “Please be honest.” Most participants already think they are being honest. The better move is to lower the social cost of the truth and anchor the conversation in evidence that already happened.
Ask about the past, not the future
This is the core lesson of The Mom Test and the Y Combinator customer-interview transcript that popularises the same pattern for founders: talk about the person’s life, ask about specifics in the past, and talk less than you want to.
“Would you use this?” invites a kind prediction. “Tell me about the last time you tried to do this” invites a story.
Stories are not perfect. People still forget, compress, and rationalise. But a recent story gives you people, tools, timing, constraints, workarounds, and consequences. A hypothetical gives you optimism.
Ask for scenes, not summaries
Teresa Torres recommends questions that get people to describe recent concrete examples. That is the useful move. A summary like “we usually handle that in Slack” hides the exceptions. A scene like “walk me through the last time this came up” reveals who posted, who replied, what was missing, how long it took, and what happened next.
Good prompts sound like this:
- “When did this last happen?”
- “What were you trying to do?”
- “What did you do next?”
- “Who else was involved?”
- “What did you use instead?”
- “What made that hard?”
Those questions do not ask the participant to evaluate your idea. They ask them to reconstruct reality.
Make the negative answer easy
Some questions make the flattering answer too easy.
“Was onboarding clear?” asks the user to admit confusion. “Where did you hesitate during onboarding?” assumes hesitation is normal and asks them to locate it.
“Does the price feel fair?” asks them to call you expensive. “Walk me through the last tool like this you paid for, and how you decided it was worth it” asks them to describe a purchasing event.
“Do you like the new dashboard?” asks for approval. “What did you try to do first, and where did you end up?” asks for behaviour.
Normalising the undesirable answer matters. You can say, “A lot of people get stuck somewhere in this flow; I am trying to find the rough spots.” That gives the participant permission to point at the rough spot without feeling like they failed.
Reduce your visible investment
If you are the founder, you cannot become neutral. You can become quieter.
Do not pitch before the interview. Do not explain the roadmap. Do not defend the confusing part. Do not reward praise with visible relief. Do not punish criticism with a long silence and a wounded face.
Open with the job:
I am trying to find what is broken, confusing, or not worth the effort. Blunt examples are more useful than polite approval.
Then act as if you meant it.
When someone criticises the product, thank them and ask for the scene. When someone praises it, thank them and ask for the scene. Your emotional response trains the participant on what kind of answer is safe.
Watch behaviour wherever you can
The strongest antidote to social desirability bias is behaviour.
Usage data, session recordings, support tickets, search logs, renewal notes, failed imports, ignored invites, and repeated exports all tell you what people did when no interviewer was watching.
Interviews should explain behaviour, not replace it. If the analytics show a sharp drop after setup, recruit people who dropped and ask them to reconstruct the last session. If the interview says a workflow is painful, inspect whether behaviour shows the same friction. If stated preference and behaviour disagree, treat the mismatch as the research question.
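If you already have interview notes and usage data for the same people, finding those mismatches can be a small script. The sketch below is a minimal illustration, not a prescribed method: the `Participant` fields, the 30-day window, the threshold of three sessions, and the sample records are all assumptions invented for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Participant:
    name: str
    said_would_use: bool      # what they told you in the interview
    sessions_last_30d: int    # what the usage data shows (hypothetical metric)


def classify(p: Participant, active_threshold: int = 3) -> str:
    """Label one participant by comparing stated intent with observed use."""
    did_use = p.sessions_last_30d >= active_threshold
    if p.said_would_use and not did_use:
        return "said yes, did not use"
    if not p.said_would_use and did_use:
        return "said no, used anyway"
    return "consistent"


def follow_up_list(
    participants: list[Participant], active_threshold: int = 3
) -> list[tuple[str, str]]:
    """Return (name, label) for everyone whose words and behaviour disagree."""
    labelled = [(p.name, classify(p, active_threshold)) for p in participants]
    return [(name, label) for name, label in labelled if label != "consistent"]


# Made-up records for illustration only.
sample = [
    Participant("P1", said_would_use=True, sessions_last_30d=0),
    Participant("P2", said_would_use=True, sessions_last_30d=12),
    Participant("P3", said_would_use=False, sessions_last_30d=7),
    Participant("P4", said_would_use=False, sessions_last_30d=1),
]

for name, label in follow_up_list(sample):
    print(f"{name}: {label}")
# P1: said yes, did not use
# P3: said no, used anyway
```

Both kinds of mismatch are worth a follow-up conversation. The person who said yes and never came back shows where politeness replaced evidence; the person who said no and uses the feature anyway shows where the interview missed the real job.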
Probe every compliment
Praise is not useless. It is just incomplete.
When someone says “this is great”, ask:
- “What specifically made it useful?”
- “When did you last need that?”
- “What would it replace?”
- “What would happen if you did not have it?”
- “Who else would need to care?”
Sometimes the compliment resolves into evidence. Sometimes it dissolves into politeness. Both outcomes are better than writing “loved it” in your notes and calling it a finding.
A pricing example
Suppose you want to know whether a GBP 49 monthly plan is viable.
The obvious question is:
Does GBP 49 a month feel fair for what you get?
That question is socially loaded. Saying no may feel rude. Saying yes costs nothing. The participant can approve your pricing without ever facing a purchase decision.
Rewrite the question around past behaviour:
- “Walk me through the last tool in this category you paid for.”
- “What was happening when you decided it was worth paying?”
- “Who had to approve it?”
- “What did you compare it with?”
- “What subscription did you cancel recently, and what tipped the decision?”
- “When price came up internally, what did people say?”
Now the answer has context. You learn whether the buyer owns budget, whether the purchase solved an urgent problem, whether the comparison set is a spreadsheet or a funded competitor, and what kind of value proof survives a real approval process.
You still may not get a perfect answer. But you are no longer asking someone to perform generosity in front of the person who wants the money.
Where Maren fits
Maren is built around the idea that honesty is partly a design problem.
The participant gets a high-entropy link and can answer asynchronously. There is no founder on the call waiting for reassurance. Maren can ask patient follow-ups when an answer is vague, keep the conversation anchored in recent behaviour, and bring back themes across interviews without turning one polished compliment into a roadmap.
That does not remove the founder’s responsibility. You still choose the research question. You still recruit the right participants. You still decide what the evidence means. But the conversation itself can happen in a lower-pressure setting, with a moderator that is not visibly invested in being liked.
That is the practical opportunity: not “AI makes people truthful”, but “a well-designed interview can make truthful answers easier.”
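For readers wondering what "high-entropy link" means in practice, it usually describes a URL containing a long random token that cannot be guessed or enumerated. The sketch below shows one common way to generate such a link with Python's standard library. It is not Maren's implementation: the base URL, the function names, and the 32-byte token size are assumptions chosen for illustration.

```python
import secrets

# Hypothetical base URL, used only for this example.
BASE_URL = "https://example.com/interview/"


def new_interview_link(token_bytes: int = 32) -> str:
    """Build a link whose path ends in a cryptographically random token."""
    # token_urlsafe draws token_bytes random bytes from the OS and
    # encodes them as URL-safe Base64 text.
    token = secrets.token_urlsafe(token_bytes)
    return BASE_URL + token


def entropy_bits(token_bytes: int) -> int:
    """Bits of randomness in a token built from token_bytes random bytes."""
    return token_bytes * 8


link = new_interview_link()
print(link)
print(f"{entropy_bits(32)} bits of entropy")
```

With 32 random bytes the token carries 256 bits of entropy, so a participant can open the link without an account while the chance of anyone guessing another participant's link stays negligible.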
The short version
Social desirability bias is not a character flaw in your users. It is a predictable response to a social situation.
If you ask invested, visible, future-tense questions, you will collect invested, visible, future-tense answers. They will sound warm. They may even be sincere. They will still be weak evidence.
To get closer to the truth, ask about recent past behaviour. Ask for scenes. Make criticism normal. Stay neutral when praise arrives. Compare self-report with behavioural data. Treat every compliment as the start of a probe, not the end of a finding.
The honest conversation is not the one where you beg users to be brave. It is the one where telling the truth is the easiest thing to do.