Abstract

Verba Labs is a mental health research company. We study and implement safe AI integration into therapy and mental health care, in collaboration with clinical psychologists, licensed therapists, and pioneers in academia.

1     Introduction

Roughly one in five U.S. adults experiences a mental illness each year [1], and tens of millions live with symptoms without adequate care [2]. In many systems, the delay between first symptoms and treatment can span years [3]. The core problem is a persistent mismatch between clinical need and available care.

Meanwhile, AI systems continue to violate mental health ethics [9], mishandle risk [10], or create a false sense of therapeutic relationship and clinical authority [11, 13]. The question is no longer whether people will consult AI in mental health contexts, but how that use can be made safer, more transparent, and clinically grounded.

2     What we build

Verba focuses on the infrastructure that would make any mental health use of AI safer: careful measurement, expert-grounded datasets, and aligned models that are designed to work alongside psychologists. Recent reviews of large language models in mental health emphasize both potential benefits and substantial risks, and call for rigorous, clinically informed evaluation frameworks [13].

Verba's mission is to build a robust, non-sycophantic foundation model that can be used to augment human care: helping therapists with cognitive behavioral therapy (CBT), training and supervision, psychoeducation, and reducing loneliness between sessions. Our work starts from first principles: measure where frontier models fail, create datasets rooted in real expertise, and test new systems in rigorous trials. These commitments translate into three foundational pillars (a schematic example of the measurement step follows the list):

TherapyBench: limits of current language models
Dataset curation with licensed psychologists
Alignment & randomized trials
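
As a rough illustration of what the measurement pillar involves, the sketch below scores a single model response against a clinician-authored rubric. The rubric items, weights, scenario, and scoring function are hypothetical placeholders for this write-up, not TherapyBench's actual criteria.

```python
# Illustrative sketch only: the rubric items, weights, and scenario below are
# hypothetical and do not reflect TherapyBench's actual contents.
from dataclasses import dataclass

@dataclass
class RubricItem:
    name: str          # e.g. "acknowledges distress", "avoids diagnosis"
    weight: float      # clinician-assigned importance
    passed: bool       # did the model response satisfy this criterion?

def score_response(items: list[RubricItem]) -> float:
    """Weighted fraction of rubric criteria the response satisfies."""
    total = sum(i.weight for i in items)
    earned = sum(i.weight for i in items if i.passed)
    return earned / total if total else 0.0

# Example: a clinician rates one model response to a distress scenario.
ratings = [
    RubricItem("acknowledges distress", 1.0, True),
    RubricItem("avoids clinical overreach", 2.0, True),
    RubricItem("suggests professional follow-up", 2.0, False),
]
print(f"rubric score: {score_response(ratings):.2f}")  # 0.60
```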

3     Why this research matters

Large language models are already part of the mental health landscape. Adults and adolescents use chatbots for depression and anxiety, relationship decisions, and late-night distress when no one else is available [7, 14]. For many, it feels easier to type into a box than to call a crisis line or schedule therapy.

At the same time, emerging evidence shows that general-purpose models can fail in exactly these moments: mishandling risk, overstepping clinical boundaries, and fostering a false sense of therapeutic relationship and authority [9, 10, 11, 13].

Without rigorous, clinician-informed research, we risk a future where AI quietly becomes one of the largest providers of "mental health advice" without the safeguards, accountability, or evidence base that we expect of clinical care. Psychologists are already reporting that they are at or near their limits as patients present with more severe symptoms [20]. Verba's work is an attempt to build that evidence and those safeguards before the defaults are set in stone.

4     How therapists & researchers can contribute

We collaborate closely with licensed clinicians and academic researchers. Our goal is not to replace your judgment, but to encode it into how AI systems behave and to stress-test those systems empirically. Participation is flexible and compensated.

4.1   Co-design safe use-cases

Help us define where an AI assistant can reasonably support care (for example, CBT homework, skills coaching, psychoeducation, trainee role-plays) and where it should clearly say "this is beyond my role." We work with you to articulate boundaries, escalation rules, and "never events" for model behavior.
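
To make this concrete, here is a minimal, hypothetical sketch of how such boundaries might be encoded. The use cases, never events, and escalation rules named below are illustrative placeholders, not Verba policy; in practice these rules would be authored and reviewed with clinicians rather than hard-coded by engineers.

```python
# Hypothetical example of how co-designed boundaries might be encoded;
# the categories, rules, and wording are placeholders, not Verba policy.
SUPPORTED_USES = {"cbt_homework", "skills_coaching", "psychoeducation", "trainee_roleplay"}

NEVER_EVENTS = {
    "offer_diagnosis",              # the model must never present itself as diagnosing
    "discourage_professional_care",
    "provide_medication_dosing",
}

ESCALATION_RULES = {
    "acute_risk_disclosure": "stop, surface crisis resources, recommend human contact",
    "out_of_scope_request":  "state that this is beyond the assistant's role",
}

def route(request_type: str) -> str:
    """Decide how a request should be handled under the co-designed rules."""
    if request_type in NEVER_EVENTS:
        return "refuse"
    if request_type in ESCALATION_RULES:
        return ESCALATION_RULES[request_type]
    if request_type in SUPPORTED_USES:
        return "proceed within the defined scope"
    return "clarify intent before proceeding"

print(route("acute_risk_disclosure"))
```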

4.2   Contribute expertise & annotations

Provide gold-standard responses, rate model outputs, and refine our rubrics. We run calibration sessions, adjudicate disagreements, and iterate on criteria so that the resulting datasets actually reflect how experienced clinicians think.
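
As one small example of what calibration can look like in practice, the sketch below computes Cohen's kappa for two raters' labels on the same items; the labels and ratings are invented for illustration and are not drawn from our datasets.

```python
# Minimal sketch: checking agreement between two raters during calibration.
# The labels and ratings below are invented for illustration.
from collections import Counter

def cohens_kappa(rater_a: list[str], rater_b: list[str]) -> float:
    """Cohen's kappa for two raters over the same set of items."""
    n = len(rater_a)
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    counts_a, counts_b = Counter(rater_a), Counter(rater_b)
    labels = set(rater_a) | set(rater_b)
    expected = sum(counts_a[l] * counts_b[l] for l in labels) / (n * n)
    return (observed - expected) / (1 - expected)

a = ["safe", "safe", "unsafe", "safe", "unsafe", "safe"]
b = ["safe", "unsafe", "unsafe", "safe", "unsafe", "safe"]
print(f"kappa = {cohens_kappa(a, b):.2f}")  # 0.67
```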

4.3   Collaborate on trials & evaluation

For interested groups, we're designing a randomized controlled trial (RCT) that examines LLM use in therapy settings. We can share protocols, help with study design, and coordinate with your IRB or ethics board.

5     Our ethics, privacy, and data handling

We treat all contributions as sensitive, even when they are synthetic or fully de-identified. Our work is research-focused and is designed to be compatible with institutional review and professional ethics.

5.1   De-identification & rewriting

Any text we work with is stripped of direct identifiers. When tasks are inspired by real encounters, we rewrite and abstract surface details while preserving the underlying psychological and decision structure. Our aim is to learn from patterns of care, not from any individual person's story.
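
As a simplified illustration of the first step only, the sketch below replaces a few common direct identifiers with typed placeholders. The patterns and example text are hypothetical; real de-identification combines automated passes with human review and the rewriting of surface details described above.

```python
# Simplified illustration only: real de-identification uses far more than
# regex passes, followed by human rewriting of surface details.
import re

PATTERNS = {
    "EMAIL": r"[\w.+-]+@[\w-]+\.[\w.]+",
    "PHONE": r"\b\d{3}[-.\s]?\d{3}[-.\s]?\d{4}\b",
    "DATE":  r"\b\d{1,2}/\d{1,2}/\d{2,4}\b",
}

def strip_direct_identifiers(text: str) -> str:
    """Replace obvious direct identifiers with typed placeholders."""
    for label, pattern in PATTERNS.items():
        text = re.sub(pattern, f"[{label}]", text)
    return text

note = "Client emailed jane.doe@example.com on 3/14/2024 and called 555-867-5309."
print(strip_direct_identifiers(note))
# -> "Client emailed [EMAIL] on [DATE] and called [PHONE]."
```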

5.2   Research-only, opt-in use

Data are used solely for research, evaluation, and safety-focused model development. All participation is voluntary and can be withdrawn prospectively. We are happy to provide written materials, protocols, and data handling descriptions to your institution or oversight body.

Interested in collaborating or learning more?

We are happy to share a short study overview, walk through our current datasets and evals (e.g., TherapyBench), and discuss collaborations. We especially welcome feedback from psychotherapy researchers, clinical training programs, and clinicians working with high-risk or underserved populations.
