The Craft

Every question is designed to land

The engineering behind 2400 questions.

Can AI create questions that actually open people up?

That was the experiment. Not 'can AI generate content' — anyone can prompt GPT to spit out icebreakers. The real question was whether we could build a system that consistently produces questions worth asking.

This isn't about AI generating content. It's about AI as a craft tool.

A carpenter doesn't just swing a hammer. They choose the right wood, measure twice, sand the rough edges. The tool is just part of a larger process of care.

That's what we built: a multi-stage pipeline where AI generates, scores, filters, and translates — but always in service of human connection.

The pipeline

Every question passes through seven distinct stages before it reaches you.

1. Generate: AI creates candidate questions based on context, deck theme, and target depth.

2. Validate: Structural checks ensure proper formatting and language quality.

3. Deduplicate: Semantic filtering removes near-duplicates before scoring, which saves LLM costs.

4. Score: Each question is rated across 8 quality dimensions (more on this below).

5. Select: Top-scoring questions are chosen based on context-specific weights.

6. Translate: Questions are localized into 5 languages, preserving cultural nuance.

7. Assemble: Final questions are organized into decks across 8 contexts.

The result: 2400 questions across 8 relationship contexts, available in 5 languages. Each one scored, filtered, and curated.
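As a rough illustration of how stages like these chain together, here is a minimal Python sketch. It is not the production pipeline: every stage body is a placeholder assumption (validation only checks for a question mark, deduplication is exact-match rather than semantic, scoring is word count rather than an 8-dimension rating, translation is a no-op), and the names `Candidate`, `STAGES`, and `run_pipeline` are hypothetical. Assembly into decks is left out. The point is the shape: each stage takes a list of candidates and returns the survivors.

```python
from dataclasses import dataclass, field


@dataclass
class Candidate:
    text: str
    score: float = 0.0
    translations: dict = field(default_factory=dict)


def validate(cands):
    # Placeholder structural check: keep only text that ends in "?".
    return [c for c in cands if c.text.strip().endswith("?")]


def deduplicate(cands):
    # Placeholder: exact match after lowercasing (the real stage is semantic).
    seen, kept = set(), []
    for c in cands:
        key = c.text.strip().lower()
        if key not in seen:
            seen.add(key)
            kept.append(c)
    return kept


def score(cands):
    # Placeholder score: word count, standing in for the 8-dimension rating.
    for c in cands:
        c.score = float(len(c.text.split()))
    return cands


def select(cands, keep=2):
    # Keep the top `keep` candidates by score.
    return sorted(cands, key=lambda c: c.score, reverse=True)[:keep]


def translate(cands):
    # Placeholder: store the English text only; nothing is translated here.
    for c in cands:
        c.translations["en"] = c.text
    return cands


STAGES = [
    ("validate", validate),
    ("deduplicate", deduplicate),
    ("score", score),
    ("select", select),
    ("translate", translate),
]


def run_pipeline(generated):
    """Run generated texts through every stage and record the funnel."""
    survivors = [Candidate(t) for t in generated]
    funnel = [("generate", len(survivors))]
    for name, stage in STAGES:
        survivors = stage(survivors)
        funnel.append((name, len(survivors)))
    return survivors, funnel
```

Recording the count after each stage makes it easy to see where candidates drop out, which is useful when tuning any single stage.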

Keeping the corpus fresh

Two questions can use completely different words but ask the same thing.

"What's your biggest fear?" and "What scares you most?" — different words, same territory.

If we let near-duplicates slip through, you'd burn through the deck faster than you should. Same conversation, just rephrased. So we compare questions by meaning, not just words — and only keep the ones that open genuinely new ground.

How it works

Each question gets converted into a mathematical fingerprint that captures its meaning. Questions that ask similar things have similar fingerprints — even if they use completely different words. When two questions are too similar, we keep the higher-scoring one and discard the other.
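The idea can be sketched in a few lines of Python. This is an illustration under assumptions, not the actual implementation: the "fingerprints" are taken to be embedding vectors compared by cosine similarity, the 0.9 threshold is arbitrary, the vectors in the usage example are toy values rather than output from a real embedding model, and each question is assumed to already carry a score when the comparison runs.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def deduplicate(questions, threshold=0.9):
    """Keep the higher-scoring question from every near-duplicate group.

    `questions` is a list of (text, embedding, score) tuples. Visiting them
    from highest to lowest score means that whenever two are too similar,
    the better one is already in `kept` and the weaker one is skipped.
    """
    kept = []
    for text, emb, score in sorted(questions, key=lambda q: q[2], reverse=True):
        if all(cosine(emb, kept_emb) < threshold for _, kept_emb, _ in kept):
            kept.append((text, emb, score))
    return [text for text, _, _ in kept]


# Toy vectors for illustration only.
corpus = [
    ("What's your biggest fear?", [0.90, 0.10, 0.00], 7.5),
    ("What scares you most?", [0.88, 0.12, 0.01], 8.1),
    ("What smell takes you back to childhood?", [0.10, 0.20, 0.95], 7.0),
]
print(deduplicate(corpus))
```

With these toy values the two fear questions point in almost the same direction, so only the higher-scoring one survives alongside the childhood question.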

Life of a question

Every question you see has survived a gauntlet. Here's what it takes for a single question to make it into a deck:

Born from context

AI generates a candidate based on deck theme, target depth, and relationship context. For example: "Couples intimacy deck, depth 7, exploring vulnerability."

Passes validation

Structural checks: Is it actually a question? Is the grammar clean? Is it the right length?

Survives deduplication

Compared against all 2400 existing questions. If it's too similar to something we already have, it's discarded.

Scores high enough

Rated across 8 quality dimensions. Only questions that hit the threshold for their context make it through.

Translated with nuance

Localized into 5 languages by native speakers who understand cultural context, not just word-for-word translation.

Reaches you

Finally lands in a deck, ready to spark a real conversation.

Most candidates don't survive. That's the point. The questions that reach you have earned their place.
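The validation step in that gauntlet is the easiest one to picture in code. Here is a minimal Python sketch of structural checks of that kind. The specific rules and limits (a trailing question mark, an initial capital, no doubled spaces, 4 to 30 words) are illustrative assumptions, and a real grammar check would need far more than this.

```python
import re


def validate(text, min_words=4, max_words=30):
    """Return a list of problems; an empty list means the candidate passes."""
    problems = []
    stripped = text.strip()
    if not stripped.endswith("?"):
        problems.append("not a question")
    if stripped and not stripped[0].isupper():
        problems.append("does not start with a capital letter")
    if re.search(r"\s{2,}", stripped):
        problems.append("repeated whitespace")
    word_count = len(stripped.split())
    if not min_words <= word_count <= max_words:
        problems.append("length out of range")
    return problems
```

Returning the list of problems, rather than a bare true or false, makes it possible to see why a candidate was rejected.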

You shape the corpus

The pipeline built the initial 2400 questions. But the corpus isn't frozen — it learns from how people actually use it.

After each card, you can vote on the question you just answered.

Those votes aren't just UI decorations.

Every vote teaches us something. Questions that consistently get thumbs-up in a particular context rise in priority. Questions that fall flat get flagged for review.

The algorithm got us started. Real conversations refine us.

Over time, the corpus evolves — shaped by thousands of real moments between real people. The questions that actually land, stay. The ones that don't, make room for better ones.
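One way to picture that feedback loop is a per-context vote tally. The Python sketch below is hypothetical: the class name, the existence of a thumbs-down vote, the minimum of 20 votes, and the 0.4 review threshold are all assumptions made for illustration, not details of the real system.

```python
from collections import defaultdict


class FeedbackTally:
    """Tracks votes per (question, context) pair."""

    def __init__(self, min_votes=20, review_below=0.4):
        self.min_votes = min_votes
        self.review_below = review_below
        # (question_id, context) -> [thumbs_up_count, thumbs_down_count]
        self._votes = defaultdict(lambda: [0, 0])

    def record(self, question_id, context, thumbs_up):
        self._votes[(question_id, context)][0 if thumbs_up else 1] += 1

    def approval(self, question_id, context):
        up, down = self._votes.get((question_id, context), (0, 0))
        # Add-one smoothing so a single early vote cannot push the rate
        # all the way to 0 or 1; with no votes the rate is 0.5.
        return (up + 1) / (up + down + 2)

    def needs_review(self, question_id, context):
        up, down = self._votes.get((question_id, context), (0, 0))
        enough_votes = up + down >= self.min_votes
        return enough_votes and self.approval(question_id, context) < self.review_below
```

Keying the tally by context matters: the same question can land well in a couples deck and fall flat as an icebreaker, and the two signals should not cancel each other out.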

A living system

This isn't a static product. Every game you play, every vote you cast, helps shape which questions future players will experience. You're not just using SonderSync — you're improving it.

The 8 quality dimensions

Not all questions are created equal. A great conversation starter needs to hit multiple notes at once — surprising enough to be interesting, specific enough to get a real answer, calibrated to the moment.

We score every question across 8 dimensions:

Surprise

Does this take an unexpected angle? The best questions make you pause and think 'huh, I've never considered that.'

Specificity

Will this get a real answer, not a generic one? 'What's your favorite memory?' is too broad. 'What smell takes you back to childhood?' lands.

Entertainment

Is this fun to answer? Even deep questions should feel engaging, not like a therapy homework assignment.

Trojan Horse

Simple surface, unexpected depth. The question seems light but opens a door you didn't see coming.

Sonder Twist

Shifts perspective beyond your own ego. Helps you see through someone else's eyes, or see yourself through theirs.

Intensity Match

Calibrated to the moment. A first-date question shouldn't feel like a job interview. A deep check-in shouldn't feel like small talk.

Tone Match

Feels right for the relationship. Questions for couples feel different from questions for coworkers, even at the same depth.

Uniqueness

Fresh, not déjà vu. You've never seen this question on a list of '50 conversation starters.' It feels genuinely new.

Context-adaptive weights

Different contexts prioritize different dimensions. Icebreaker decks weight surprise and entertainment heavily. Couples decks prioritize trojan horse and sonder twist. The scoring adapts to what each relationship needs.
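A weighted average is one simple way to express this. The Python sketch below assumes each dimension is rated on a common numeric scale and that a context boosts its priority dimensions by a multiplier. The weight values (3.0 against a default of 1.0) and the snake_case dimension keys are made up for illustration; only the choice of which dimensions each context emphasizes comes from the description above.

```python
DIMENSIONS = (
    "surprise", "specificity", "entertainment", "trojan_horse",
    "sonder_twist", "intensity_match", "tone_match", "uniqueness",
)

# Hypothetical multipliers; any dimension not listed defaults to 1.0.
CONTEXT_WEIGHTS = {
    "icebreaker": {"surprise": 3.0, "entertainment": 3.0},
    "couples": {"trojan_horse": 3.0, "sonder_twist": 3.0},
}


def weighted_score(scores, context):
    """Weighted average of the 8 dimension scores for a given context."""
    overrides = CONTEXT_WEIGHTS.get(context, {})
    weights = {d: overrides.get(d, 1.0) for d in DIMENSIONS}
    total_weight = sum(weights.values())
    return sum(scores[d] * weights[d] for d in DIMENSIONS) / total_weight


# A question that is surprising and fun but otherwise average.
scores = {d: 5.0 for d in DIMENSIONS}
scores["surprise"] = 9.0
scores["entertainment"] = 9.0

print(weighted_score(scores, "icebreaker"))
print(weighted_score(scores, "couples"))
```

The same question scores higher for the icebreaker context than for the couples context, because its strengths sit in the dimensions the icebreaker weights favor.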

Depth calibration

Every question has a depth score from 1 to 10 that measures how much vulnerability it asks for. This isn't arbitrary: it's calibrated to match how much trust the question requires.

1-3: Safe icebreaker

Fun, low-stakes questions anyone can answer. No vulnerability required.

"If your name was a magic spell, what would it do?"

4-6: Personal territory

Sharing opinions, memories, preferences. Some self-disclosure, but still comfortable.

"What's a compliment you received that you still think about?"

7-10: Trust-requiring deep dive

Fears, failures, desires, regrets. These require a secure relationship to answer honestly.

"What behavior do you judge harshly because you fear it in yourself?"

The depth score isn't just a label. It affects how questions are weighted during selection. An icebreaker deck shouldn't accidentally include a question that requires deep trust. A couples' intimacy deck shouldn't include questions you could ask a stranger.
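The three bands map directly to code. In the Python sketch below, the band boundaries come from the ranges above, while the exact depth values given to the three example questions (2, 5, and 8) are assumptions chosen to fall inside their bands. The hard filter in `eligible` is also a simplification: the real selection step may weight by depth rather than exclude outright.

```python
def depth_band(depth):
    """Map a 1-10 depth score to its band name."""
    if not 1 <= depth <= 10:
        raise ValueError("depth must be between 1 and 10")
    if depth <= 3:
        return "safe icebreaker"
    if depth <= 6:
        return "personal territory"
    return "trust-requiring deep dive"


def eligible(questions, min_depth, max_depth):
    """Keep only questions whose depth falls inside the deck's range."""
    return [q for q in questions if min_depth <= q["depth"] <= max_depth]


# Depth values here are illustrative assumptions.
questions = [
    {"text": "If your name was a magic spell, what would it do?", "depth": 2},
    {"text": "What's a compliment you received that you still think about?", "depth": 5},
    {"text": "What behavior do you judge harshly because you fear it in yourself?", "depth": 8},
]

icebreaker_deck = eligible(questions, 1, 3)
intimacy_deck = eligible(questions, 7, 10)
```

With this filter, the icebreaker deck contains only the magic spell question and the intimacy deck contains only the question about judging others.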

The foundation

All of this engineering is grounded in relationship science. The dimensions and depth calibration come from decades of research:

  • Social Penetration Theory [1]: Altman & Taylor's model of how relationships deepen through gradual self-disclosure, layer by layer.
  • Attachment Theory [2]: Bowlby's work on how secure attachment enables vulnerability and exploration.
  • Dunbar's Number [3]: The cognitive limit on stable relationships, and why your closest 5 people matter most.
  • Narrative Identity [4]: McAdams' research on how we construct our sense of self through the stories we tell.
  • Psychological Safety [5]: Edmondson's work on the conditions required for people to take interpersonal risks.

The AI is a tool. The science is the blueprint.

References

[1] Altman, I., & Taylor, D. A. (1973). Social Penetration: The Development of Interpersonal Relationships. Holt, Rinehart & Winston.

[2] Bowlby, J. (1988). A Secure Base: Parent-Child Attachment and Healthy Human Development. Basic Books.

[3] Dunbar, R. I. M. (1992). "Neocortex size as a constraint on group size in primates". Journal of Human Evolution, 22(6).

[4] McAdams, D. P. (2001). "The Psychology of Life Stories". Review of General Psychology, 5(2).

[5] Edmondson, A. (1999). "Psychological Safety and Learning Behavior in Work Teams". Administrative Science Quarterly, 44(2).

Try the questions yourself

2400 questions. 8 contexts. 5 languages. All crafted with care.