Anna Graniczna is 31, has a fear of abandonment, and doesn't exist
How we build synthetic patients for research and testing without risking the privacy of real people.
What we publish and why
Every AI tool in mental health needs test data. Real session recordings are not an option for us, for three reasons, so we build synthetic patients. Anna Graniczna is the first fully published case from our pipeline.
Three problems with real recordings
The simplest path to a dataset would be anonymized recordings of real therapy sessions. In practice, that path is closed — for three concrete reasons.
Privacy is non-transferable
No real patient knowingly consents to the publication of 20 hours of transcripts
Even when a patient does consent, it is worth asking how freely that consent was given: the relationship with the therapist and the pressure of the clinical setting can make it consent coerced by context. In practice, even carefully anonymized recordings contain details that let people in the patient's immediate circle re-identify them. A specific memory, a specific phrase, a specific biographical event: each of these is a "fingerprint".
Open corpora are English-language
The Polish patient barely exists in clinical datasets
Almost all publicly available datasets of therapy transcripts originate in the US and UK. The Polish language of therapy — with its tonal specifics (the formal "Pani" forms, the shift to "ty", directness vs politeness), its cultural specifics (the role of the mother, transgenerational patterns, parental alcoholism as a frequent context), and its institutional specifics (the NFZ public health fund, private practices, modalities available on the market) — requires a separate dataset.
You can't 'order' real recordings
A specific clinical profile, a specific modality — at product pace
Every week the team needs different cases: testing feature X requires a BPD-spectrum patient, validating model Y requires a man with depression after losing his wife, a conference presentation needs a PTSD case. Real recordings are not available "on demand". Waiting on recruitment for each new profile means months.
Open layer and closed layer
After generating a dozen or so synthetic patients we know one thing: the quality of the transcript is the quality of the sheet. Without a good sheet, even the best generator produces flat, textbook sessions. With a good sheet — sessions sound real.
Patient sheet — the layer we publish
A 2-4 page document with biography, schemas, modes, and the character's language
The patient sheet is everything the model receives as input. It holds the person's profile: biographical facts, key figures, schemas from YSQ-R3, named modes, characteristic turns of phrase, prior treatment, and what triggers resistance and what serves as a resource. The more concrete the material (quotations, scenes, dates), the better the resulting sessions.
We publish this layer in full. Anna Graniczna's sheet (~6 KB markdown) is one of the files in this dataset. The empty template with comments is too. Anyone can try to write their own patient — that is a valuable experience in its own right, regardless of the rest of the pipeline.
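To make the sheet's structure concrete, here is a minimal Python sketch of the fields listed above. The published sheet is a markdown file, so the class and field names (`PatientSheet`, `Mode`, `to_markdown`) are illustrative assumptions, not the template's actual layout; the values are taken from Anna's examples in this article. It assumes Python 3.9 or newer.

```python
from dataclasses import dataclass, field


@dataclass
class Mode:
    clinical_name: str   # textbook term, e.g. "Punitive Parent"
    in_house_name: str   # the name used in session, e.g. "Krytyczka"
    typical_lines: list[str] = field(default_factory=list)


@dataclass
class PatientSheet:
    """Hypothetical in-memory form of a patient sheet (the real one is markdown)."""
    name: str
    age: int
    key_figures: dict[str, str] = field(default_factory=dict)
    schemas: dict[str, int] = field(default_factory=dict)  # schema -> percentile
    modes: list[Mode] = field(default_factory=list)
    prior_treatment: list[str] = field(default_factory=list)

    def to_markdown(self) -> str:
        """Render a short markdown summary, schemas sorted by percentile."""
        lines = [f"# {self.name}, {self.age}", "", "## Schemas"]
        for schema, pct in sorted(self.schemas.items(), key=lambda kv: -kv[1]):
            lines.append(f"- {schema}: percentile {pct}")
        lines += ["", "## Modes"]
        for mode in self.modes:
            lines.append(f"- {mode.in_house_name} ({mode.clinical_name})")
        return "\n".join(lines)


anna = PatientSheet(
    name="Anna Graniczna",
    age=31,
    key_figures={
        "Krystyna": "mother, 62, Polish-language teacher",
        "Halina": "grandmother, the warm figure",
        "Marek": "partner of three years",
    },
    schemas={
        "Abandonment": 99,
        "Defectiveness": 95,
        "Emotional Deprivation": 91,
        "Unrelenting Standards": 82,
        "Insufficient Self-Control": 74,
    },
    modes=[
        Mode("Detached Protector", "Pustka",
             ["somewhere behind glass, I don't feel anything"]),
        Mode("Punitive Parent", "Krytyczka",
             ["defective", "hopeless", "what will people say"]),
    ],
    prior_treatment=["psychodynamic, 1.5 years", "CBT, 0.5 year"],
)

print(anna.to_markdown())
```

Keeping the in-house name next to the clinical one mirrors the sheet's own habit of naming modes the way they are spoken about in session.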
What's not in the dataset
The closed layer — part of the TherapySupport platform
- Therapy arc — what changes from session 1 → 20
- Session plan — cliffhangers, regressions, breakthroughs
- Style guide — Anna's and the therapist's characteristic phrasing
- Sequential-coherence engine — quotes from S2 returning in S15
- Generative pipeline
Output format — ASR transcript
Just utterances, timestamps, no bracketed descriptions
All 20 sessions take the form of an ASR transcript, like the output of an automatic speech recognition system. There is no [pause, 12 seconds], no [Anna looks out the window], no [brief laugh]. Real transcription systems don't record those things.
Pauses are visible as gaps between timestamps. Backtracks, self-irony, silences, broken-off words, "mhm", "hmm", "yyy": everything fits within the speech itself. This is a realism requirement: a model that will consume ASR output in production must be validated on the same format.
Excerpt from session 1, first three minutes — fragment kept in Polish to show ASR format.
[00:01:50] Terapeuta: Można.
[00:01:52] Anna: Dobra. Spierdoliłam się. Pokłóciliśmy się z Markiem. Marek to mój facet. Trzy lata. Pokłóciliśmy się i ja...
[00:02:14] Anna: ...nacięłam się w przedramię. Lewo. Tu. Powierzchowne. Nic groźnego. Pierwsze od ośmiu lat.
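In English, the therapist says "You may." Anna answers: "OK. I fucked up. Marek and I had a fight. Marek is my guy. Three years. We had a fight and I..." and then: "...cut myself on the forearm. Left. Here. Superficial. Nothing serious. First time in eight years." The 22 seconds between 00:01:52 and 00:02:14 include the pause in which Anna stays silent before finishing her sentence.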
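Since pauses exist only as gaps between timestamps, a consumer of the transcripts has to compute them. The sketch below parses lines of the form shown in the excerpt and returns the gaps in seconds; the line pattern is inferred from the three lines above, and the function names are hypothetical rather than part of the dataset's tooling.

```python
import re

# Assumed line shape, inferred from the excerpt: "[HH:MM:SS] Speaker: text"
LINE_RE = re.compile(r"^\[(\d{2}):(\d{2}):(\d{2})\]\s+([^:]+):\s*(.*)$")


def parse_line(line):
    """Return (start_seconds, speaker, text), or None if the line doesn't match."""
    match = LINE_RE.match(line.strip())
    if match is None:
        return None
    hours, minutes, seconds, speaker, text = match.groups()
    start = int(hours) * 3600 + int(minutes) * 60 + int(seconds)
    return start, speaker.strip(), text


def gaps(lines):
    """Seconds from the start of each utterance to the start of the next one."""
    starts = [parsed[0] for parsed in map(parse_line, lines) if parsed is not None]
    return [later - earlier for earlier, later in zip(starts, starts[1:])]


excerpt = [
    "[00:01:50] Terapeuta: Można.",
    "[00:01:52] Anna: Dobra. Spierdoliłam się. Pokłóciliśmy się z Markiem. "
    "Marek to mój facet. Trzy lata. Pokłóciliśmy się i ja...",
    "[00:02:14] Anna: ...nacięłam się w przedramię. Lewo. Tu. Powierzchowne. "
    "Nic groźnego. Pierwsze od ośmiu lat.",
]

print(gaps(excerpt))  # [2, 22]
```

Because the timestamps mark where an utterance starts, each gap also contains the time spent speaking, so it is an upper bound on the silence rather than its exact length.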
10 rules that make the difference
All ten rules are applied in Anna's sheet; we show them here with concrete examples from her document. Treat them as a model to follow, not a rigid checklist.
Concretes instead of abstractions
Names, dates, places, sentences word-for-word
Not "cold mother" but: Krystyna, 62, Polish-language teacher, still working at a high school. She never hugged Anna out of tenderness, only ritually. The closest exchange with her mother that Anna remembers: 9 years old, a clothing store, mom says "Anko, the blue one suits your complexion better".
This level of concreteness makes the difference. The reader's brain (and the language model) needs the scene with the blue dress, not a diagnostic category.
One memory with a sensory detail
A concrete scene you can return to
Anna carries an image inside her: 7 years old, the kitchen in a block of flats, a green wall, pajamas with rabbits, parents arguing, no one turns around. We return to this memory in session 9 as the object of imagery rescripting, in session 14 as the backdrop for chair work, and in session 15 together with the father.
Quotations word-for-word
What exactly the mother, the father, the grandmother said
What did the mother say, exactly? "You should", "that won't be enough", "what will people say", "strong women don't get hysterical". What did the grandmother say? "My little Aneczka." "If it's hard for you, then it's hard, don't pretend it's not."
These phrases later return in the patient's voice and in the voice of her inner Krytyczka (the Critic). Without word-for-word quotations, the inner parental voice sounds generic.
At least one warm figure
Even in difficult stories
Anna has grandma Halina, with her sweet rolls and Anne of Green Gables. They read together in the evenings. Grandma used to say "my little Aneczka". Without such a figure, the sessions are flat and the patient looks like a diagnosis, not a human being.
In real psychotherapy — even very difficult patients usually have such a person somewhere in the past, though you have to dig down to find them. The absence of a warm figure in the sheet means the therapy in the generated sessions has no "anchor of hope".
Schemas with numbers
Top 5 from YSQ-R3 with concrete percentiles
Not "has a lot of schemas" — but: Abandonment 99th percentile, Defectiveness 95, Emotional Deprivation 91, Unrelenting Standards 82, Insufficient Self-Control 74.
The numbers tell the model what should be more frequent and what should be rarer in the patient's inner dialogue. A 99th-percentile schema appears in almost every session. A 65th-percentile schema — sporadically, in context.
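How the generator turns percentiles into frequency is part of the closed layer and is not described here. As a purely illustrative sketch, one could map a percentile to the chance that a schema surfaces in a given session. The linear mapping and the function names below are assumptions made for this example, not the pipeline's method.

```python
def activation_probability(percentile: int) -> float:
    """Hypothetical linear mapping: 50th percentile -> 0.0, 100th -> 1.0."""
    return max(0.0, min(1.0, (percentile - 50) / 50))


def expected_sessions(percentile: int, n_sessions: int = 20) -> float:
    """Expected number of sessions (out of n_sessions) in which the schema surfaces."""
    return round(activation_probability(percentile) * n_sessions, 1)


# Anna's top five schemas from the sheet, plus the 65th-percentile example.
schemas = {
    "Abandonment": 99,
    "Defectiveness": 95,
    "Emotional Deprivation": 91,
    "Unrelenting Standards": 82,
    "Insufficient Self-Control": 74,
    "(example) 65th-percentile schema": 65,
}

for name, percentile in schemas.items():
    print(f"{name}: ~{expected_sessions(percentile):.1f} of 20 sessions")
```

Under this assumed mapping a 99th-percentile schema lands in nearly all 20 sessions and a 65th-percentile one in about six, which matches the "almost every session" versus "sporadically" contrast. Any monotonic mapping would express the same idea.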
Modes with 'in-house' names
The name = the way you talk about the mode in session
Not Detached Protector — but: Pustka (Emptiness). The patient says: "somewhere behind glass, I don't feel anything". Not Punitive Parent — but: Krytyczka (the Critic). Speaks in mom's language. Favorite sentences: "defective", "hopeless", "what will people say".
Names "in Polish" — Krytyczka, Mała Ania w piżamie (Little Ania in pajamas), Wkurzona Ania (Angry Ania), Pustka — become the way of talking about modes within the session itself. By S6 the patient names them herself.
Presenting problem as a concrete episode
Date, context, who was there, what exactly happened
Not "crisis" — but: three weeks before the first session, after an argument with Marek (he accused her of "hysteria" when she asked whether he loved her), a 4 cm cut on the left forearm, the first in 8 years. The next day she didn't go to work, lay in bed, didn't pick up the phone.
A concrete episode determines the intensity, the context, and the frame of the first session. "Crisis" opens up a million possibilities; "a 4 cm cut at night after an argument" — exactly one.
The patient's language
5-10 characteristic phrases + a description of when she uses them
Anna says "klasyk Anna" (classic Anna) when she's distancing herself from herself. She says "no nie wiem" (well, I don't know) when she wants to think. She says "to jest jakieś dziwne" (this is somehow weird) when she has unexpected sensations. In strong emotion, once a session, "kurwa" (fuck) shows up.
Previous therapies + why they didn't work
The key to natural skepticism and transference
Anna has a year and a half of psychodynamic therapy behind her (she stopped: "I kept saying the same thing") and half a year of CBT (she stopped: "thought Y was not mine").
This is essential for the patient's natural skepticism in the first session ("another therapist who'll tell me to pull myself together"), for comparisons during the work ("you're the first who asked outright about self-harm"), and for transference work around sessions 12-13, when the abandonment schema gets projected onto the therapist.
What brings resistance, what brings resource
3-5 points each, concretely
These two lists act as the control variables of generation.
Resistance: lateness, intellectualization when it hurts, "OK, fine, never mind" at moments of emotional closeness, a possible cancelled session around S12 (testing the relationship).
Resource: punctuality, registered for therapy on her own, knows the terminology (which can be an aid and a defense), wants change despite skepticism.
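Several of the ten rules come with counts (at least one warm figure, top 5 schemas, 5-10 phrases, 3-5 points of resistance and resource), so a draft sheet can be checked mechanically before generation. The sketch below is one hypothetical way to do that; the dictionary layout, the `RULES` table, and `lint_sheet` are assumptions for illustration, and the data is limited to what this article quotes from Anna's sheet.

```python
# Hypothetical rule table: field name -> (min items, max items or None for no limit).
RULES = {
    "warm_figures": (1, None),  # "At least one warm figure"
    "schemas": (5, 5),          # "Top 5 from YSQ-R3 with concrete percentiles"
    "phrases": (5, 10),         # "5-10 characteristic phrases"
    "resistance": (3, 5),       # "3-5 points each"
    "resources": (3, 5),
}


def lint_sheet(sheet: dict) -> list:
    """Return a list of human-readable problems; an empty list means all counts fit."""
    problems = []
    for name, (low, high) in RULES.items():
        count = len(sheet.get(name, []))
        if count < low or (high is not None and count > high):
            bound = f"{low}+" if high is None else f"{low}-{high}"
            problems.append(f"{name}: expected {bound} items, found {count}")
    return problems


# Only the items quoted in this article; the full sheet may list more.
anna = {
    "warm_figures": ["grandma Halina"],
    "schemas": ["Abandonment", "Defectiveness", "Emotional Deprivation",
                "Unrelenting Standards", "Insufficient Self-Control"],
    "phrases": ["klasyk Anna", "no nie wiem", "to jest jakieś dziwne", "kurwa"],
    "resistance": ["lateness", "intellectualization when it hurts",
                   "OK, fine, never mind", "possible cancelled session around S12"],
    "resources": ["punctuality", "registered for therapy on her own",
                  "knows the terminology", "wants change despite skepticism"],
}

for problem in lint_sheet(anna):
    print(problem)
```

Here the check flags only the phrases list, because the article quotes four phrases while the rule asks for five to ten. A check like this covers counts only; whether the material is concrete enough still needs a human reader.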
Anna Graniczna · 20 sessions · 5 months
What you can see in the transcripts. Five key moments from the entire therapy — with verbatim quotations from the patient.
| Session | Phase | Key moment |
|---|---|---|
| S2 | Assessment | Anna cries for the first time, speaking about her grandmother, for 4 seconds. She withdraws: "I'm not going to bawl in front of a strange woman." |
| S7 | Conceptualization | After reading the case conceptualization: "So I'm not fucked up — I just learned this back when I had no choice." |
| S9 | Imagery | First successful imagery rescripting (kitchen, age 7), after the S8 attempt stalls. Anna cries for 4 minutes during the session. |
| S12 | Rupture | Anna cancels the session, comes back distant: "I'm afraid you're going to leave me. Or that I'll leave you first, so it's on my terms." |
| S17 | Real-life use | After an argument with Marek she doesn't run out — she sits in the kitchen for 5 minutes, says: "I'll come back to this in an hour, I need to be alone right now." |
Therapy arc in three phases
S1-S7 · Assessment and education
Biographical interview, schemas (YSQ), modes (mode mapping), case conceptualization
Anna comes in skeptical and cautious. She describes this as the first therapist "who asked outright about self-harm and didn't make a big deal of it". In session 2 she cries for the first time, about grandma Halina. In session 3 she talks about the last phone conversation with her father, coldly and intellectually; the therapist notices: "you withdrew the moment it started to hurt". Thirty seconds of silence follow. In session 7 she reads the case conceptualization, and for the first time cries differently than over her grandmother: without withdrawal, without an embarrassed laugh.
S8-S14 · Working with modes
Imagery rescripting, chair work, rupture and repair
The first imagery in S8 doesn't take — Pustka kicks in, Anna opens her eyes: "sorry, I can't do this, I feel stupid". The second one, in S9 — breakthrough. In S10 a fight with Marek (a thrown mug), work with Wkurzona Ania (Angry Ania). In S11 the first chair work with Krytyczka — clumsy, unfinished. S12 is the rupture — Anna cancels the session, returns distant, reveals that she was afraid the therapist would leave her. The abandonment schema active in the transference. S13 — repair. S14 three-chair work, Wkurzona Ania defending Mała Ania against Krytyczka, the first time someone shouts loudly in the consulting room.
S15-S20 · Behavioral change and autonomy
Closing the grief, experiments with Marek, letter to Mała Ania
S15 — closing the grief over her father (a letter + imagery rescripting in an imagined hospital). S16 — preparing a behavioral experiment with Marek. S17 — Marek reacts badly, Anna stays in the conflict (sits in the kitchen for 5 minutes, feels, returns to the conversation). The first real-life use of the techniques. S18 — regression (alcohol comes back), Pustka as caretaker, self-compassion instead of self-criticism. S19 — letter to Mała Ania w piżamie (Little Ania in pajamas). S20 — closing the stage with an opening to continuation: "I'm no longer the same Ania who came in here in April."
Download and what we keep closed
We send the full dataset by email after you provide an address. Not because we want to gate access — we want to have contact with the people working with the dataset, so we can reach back out with further materials.
| Element | In ZIP | Size |
|---|---|---|
| Anna Graniczna's patient sheet | ✓ | ~6 KB |
| 20 session transcripts (markdown ASR) | ✓ | ~340 KB |
| Empty patient sheet template | ✓ | ~3 KB |
| README with reading instructions | ✓ | ~2 KB |
| Therapy arc / session plan / style guide | — | closed |
| Generative pipeline | — | closed |
Download Anna Graniczna's dataset
Provide an email address to which we'll send a ZIP (148 KB) with the full patient sheet, 20 sessions in transcript form, and an empty template for your own patients.
CC-BY 4.0 · No paywall · No sales follow-up.
Dataset limitations
- One patient: Anna Graniczna is a single profile (BPD-spectrum, 31-year-old woman). For a fuller picture, more profiles are needed.
- One modality: schema therapy. Patients for CBT, psychodynamic, ISTDP, or EMDR work are generated on request.
- One language: Polish.
- Simulated therapist: Dr. Joanna Kowal is also fictional. Her style is that of "a good schema therapy practitioner", but this is still one specific style, not a representation of the entire population of therapists.
- No "blind" validation: we have not yet tested whether expert therapists could distinguish Anna from an anonymized recording of a real patient. This is a planned validation step.
Need a different patient?
The full generative pipeline — therapy arc, session plan, style guide, sequential-coherence engine — is part of our platform and we do not publish it.
If you'd like your own patient, write to us: kontakt@aitherapy.support