Schizophrenia (Heidelb). 2025 Oct 15;11(1):125.
doi: 10.1038/s41537-025-00669-z.

Collecting language, speech acoustics, and facial expression to predict psychosis and other clinical outcomes: strategies from the AMP® SCZ initiative

Zarina R Bilgrami  1 Eduardo Castro  2 Carla Agurto  2 Einat Liebenthal  3   4 Michaela Ennis  4 Justin T Baker  3   4 Isabelle Scott  5   6 Beau-Luke Colton  5   6 Kang Ik K Cho  7 Linying Li  1 Zailyn Tamayo  8 Mara Henecks  8 Habiballah Rahimi Eichi  3   4 Tae'lar Henry  1 Jean Addington  9 Luis K Alameda  10   11 Celso Arango  12 Nicholas J K Breitborde  13   14 Matthew R Broome  15   16 Kristin S Cadenhead  17 Monica E Calkins  18 Eric Yu Hai Chen  19 Jimmy Choi  20 Philippe Conus  10   21 Barbara A Cornblatt  22   23 Lauren M Ellman  24 Paolo Fusar-Poli  11   25 Pablo A Gaspar  26 Carla Gerber  27 Louise Birkedal Glenthøj  28 Leslie E Horton  29 Christy Hui  19 Joseph Kambeitz  30 Lana Kambeitz-Ilankovic  30 Matcheri S Keshavan  3 Sung-Wan Kim  31   32 Nikolaos Koutsouleris  11   33 Jun Soo Kwon  34   35 Kerstin Langbein  36 Daniel Mamah  37 Covadonga M Diaz-Caneja  12 Daniel H Mathalon  38   39 Vijay A Mittal  40 Merete Nordentoft  41   42 Godfrey D Pearlson  1   8 Jesus Perez  43   44 Diana O Perkins  45 Albert R Powers 3rd  8   46 Jack Rogers  15 Fred W Sabb  47 Jason Schiffman  48 Jai L Shah  49   50 Steven M Silverstein  51 Stefan Smesny  36 William S Stone  4   52 Walid Yassin  4   52 Gregory P Strauss  53 Judy L Thompson  54   55 Rachel Upthegrove  15 Swapna Verma  56 Jijun Wang  57 Daniel H Wolf  18 Patrick D McGorry  5   6 Rene S Kahn  58 John M Kane  23   59 Alan Anticevic  8   46 Carrie E Bearden  60   61 Dominic Dwyer  5   6 Tashrif Billah  7 Sylvain Bouix  7   62 Ofer Pasternak  7 Martha E Shenton  4   7 Scott W Woods  8 Barnaby Nelson  5   6 Accelerating Medicines Partnership® Schizophrenia (AMP® SCZ)Guillermo A Cecchi  2 Cheryl M Corcoran #  58 Phillip M Wolff #  63
Zarina R Bilgrami et al. Schizophrenia (Heidelb).

Abstract

Speech-based detection of early psychosis is progressing at a rapid pace. Within this evolving field, the Accelerating Medicines Partnership® in Schizophrenia (AMP® SCZ) is uniquely positioned to deepen our understanding of how language and related behaviors reflect early psychosis. We begin with detailed standard operating procedures (SOPs) that govern every stage of collection. These SOPs specify how to elicit speech, capture facial expressions, and record acoustics in synchronized audio-video files, both on-site and through remote platforms. We then explain how we chose our sampling tasks, hardware, and software, and how we built streamlined pipelines for data acquisition, aggregation, and processing. Robust quality-assurance and quality-control (QA/QC) routines, along with standardized interviewer training and certification, ensure data integrity across sites. Using natural language processing parsers, large language models, and machine-learning classifiers, we analyzed Data Release 3.0 to uncover systematic grammatical markers of psychosis risk. Speakers at clinical high risk (CHR) produced more referential language but fewer adjectives, adverbs, and nouns than community controls (CC), a pattern that replicated across sampling tasks. Some effects were task-specific: CHR participants showed elevated use of complex syntactic embeddings in two elicitation conditions but not in the third, underscoring the importance of the language sampling task. Together, these results demonstrate how computational linguistics can turn everyday speech into a scalable, objective biomarker, paving the way for earlier and more precise detection of psychosis.

Video link: https://vimeo.com/1112291965?fl=pl&fe=sh


Conflict of interest statement

Competing Interests: The authors declare the following competing interests: K.A. is on the Australian Cognitive Impairment Associated with Schizophrenia Advisory Board for Boehringer Ingelheim and receives honorary funds. D.D. has received honorary funds for one educational seminar for CSL Seqirus. A.A. is a cofounder, serves as a member of the Board of Directors, as a scientific adviser, and holds equity in Manifest Technologies, Inc., and is a coinventor on the following patent: Anticevic A, Murray JD, and Ji JL: Systems and Methods for NeuroBehavioral Relationships in Dimensional Geometric Embedding, PCT International Application No. PCT/US2119/022110, filed Mar 13, 2019. E.C. has received speaker fees at non-promotional educational events. P.F.-P. has received research funds or personal fees from Lundbeck, Angelini, Menarini, Sunovion, Boehringer Ingelheim, Proxymm Science, Otsuka, outside the current study. J.K. has received speaking or consulting fees from Janssen, Boehringer Ingelheim, ROVI, and Lundbeck. C.M.D.-C. has received grant support from Instituto de Salud Carlos III, Spanish Ministry of Science and Innovation, and honoraria or travel support from Angelini, Janssen, and Viatris; RU has received speaker fees at a non-promotional educational event: Otsuka: Consultancy for Viatris and Springer Healthcare. Honorary General Secretary, British Association for Psychopharmacology (unpaid). J.M.K. is a consultant to or receives honoraria and/or travel support and/or speakers bureau: Alkermes, Allergan, Boehringer-Ingelheim, Cerevel, Dainippon Sumitomo, H. Lundbeck, HealthRhythms, HLS Therapeutics, Indivior, Intracellular Therapies, Janssen Pharmaceutical, Johnson & Johnson, Karuna Therapeutics/Bristol Myers Squibb, LB Pharmaceuticals, Mapi, Maplight, Merck, Minerva, Neurocrine, Newron, Novartis, NW PharmaTech, Otsuka, Roche, Saladax, Sunovion, Teva; RSK provides consulting to Alkermes, Boehringer-Ingelheim. S.W.W. has received speaking fees from the American Psychiatric Association and from Medscape Features. He has been granted a US patent no. 8492418 B2 for a method of treating prodromal schizophrenia with glycine agonists. He owns stock in NW PharmaTech. C.A. has been a consultant to or has received honoraria or grants from Acadia, Angelini, Biogen, Boehringer, Gedeon Richter, Janssen Cilag, Lundbeck, Medscape, Menarini, Minerva, Otsuka, Pfizer, Roche, Sage, Servier, Shire, Schering Plough, Sumitomo Dainippon Pharma, Sunovion, and Takeda; GDH has been a consultant for Bristol Myers Squibb. P.J.M. has been a consultant for Otsuka and TEVA; and Z.T. has been a consultant for Manifest Technologies. C.M.C. is an Associate Editor of Schizophrenia. All other authors report no competing interests.

Figures

Fig. 1. AVL processing pipeline.
Files collected by the sites are uploaded to a secure cloud storage system. Raw files are separated into combined audio, diarized audio, and video files, which are sent to feature processing services or servers to produce transcripts, acoustic analyses, and facial analyses. The results of quality checks, conducted at the initial submission of the files to the data aggregate server and later, after the files are processed for features, are sent to the data visualization platform DPdash for QA/QC monitoring. Finalized files are sent to the NIMH data archive (NDA), which conducts the final curating of the data prior to releasing it to the collaboration server for further analysis, and to the general research community.
Fig. 2
A fictional example of an open-ended interview transcript complete with speaker labeling, timestamps, verbatim encodings, and redactions.
Fig. 3. Audio files undergo a series of processes to identify acoustic features.
Zoom allows the audio from each speaker to be saved to separate files, here labeled Recording file 1 and Recording file 2. These files are then renamed S1 and S2, corresponding to the order in which the participants speak, with S1 designating the first speaker and S2 the second. During a pre-processing step, a step function is used to identify valid speech signals. The resulting recordings are used to extract two types of acoustic features: low-level descriptors (LLDs) and higher-level ‘functional’ features, the latter of which represents global properties of a participant’s acoustic signal.
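The pre-processing and feature steps in this caption can be illustrated with a minimal sketch. The caption does not say which quantity the step function is applied to, so the example below assumes a simple threshold on per-frame RMS energy; the function names, the frame length, and the threshold are all hypothetical, and RMS energy stands in for whatever low-level descriptors the actual pipeline extracts.

```python
import math
from statistics import mean, pstdev

def frame_rms(samples, frame_len):
    """Low-level descriptor (assumed): RMS energy of each non-overlapping frame."""
    frames = [samples[i:i + frame_len]
              for i in range(0, len(samples) - frame_len + 1, frame_len)]
    return [math.sqrt(sum(x * x for x in f) / frame_len) for f in frames]

def speech_mask(rms, threshold):
    """Step function (assumed form): 1 where frame energy exceeds the threshold, else 0."""
    return [1 if r > threshold else 0 for r in rms]

def functionals(rms, mask):
    """'Functional' features: global summary statistics over frames flagged as speech."""
    voiced = [r for r, m in zip(rms, mask) if m]
    if not voiced:
        return {"mean_rms": 0.0, "std_rms": 0.0, "speech_fraction": 0.0}
    return {
        "mean_rms": mean(voiced),
        "std_rms": pstdev(voiced),
        "speech_fraction": len(voiced) / len(rms),
    }

# Toy signal: one silent frame, two frames of a 0.5-amplitude tone, one silent frame.
signal = ([0.0] * 160
          + [0.5 * math.sin(2 * math.pi * 10 * n / 160) for n in range(320)]
          + [0.0] * 160)
rms = frame_rms(signal, frame_len=160)
mask = speech_mask(rms, threshold=0.05)
summary = functionals(rms, mask)
```

The per-frame values play the role of LLDs, while the dictionary returned by `functionals` summarizes the whole recording, which is the distinction the caption draws between the two feature types.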
Fig. 4
Face processing involves a sequence of four stages: face detection, landmark detection, face pose detection, and action unit detection.
Fig. 5
Pipeline for extracting grammatical features, syntactic dependencies, and parts of speech from language samples to assess their ability to distinguish CHR individuals from CCs.
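As a rough illustration of the last stage of such a pipeline, the sketch below turns part-of-speech tags into per-sample rates that a classifier could consume. It assumes the tokens have already been tagged by an upstream parser; the tag labels, the hand-tagged toy sample, and the `pos_rates` helper are hypothetical and are not taken from the authors' code.

```python
from collections import Counter

def pos_rates(tagged_tokens):
    """Return each tag's share of all tokens in one language sample."""
    counts = Counter(tag for _, tag in tagged_tokens)
    total = sum(counts.values())
    return {tag: n / total for tag, n in counts.items()} if total else {}

# Hand-tagged toy sample (hypothetical); real input would come from a parser.
sample = [("I", "PRON"), ("saw", "VERB"), ("it", "PRON"), ("there", "ADV")]
rates = pos_rates(sample)
```

Normalizing by sample length keeps the rates comparable across samples of very different durations, such as interviews and short audio diaries.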
Fig. 6. ORs for grammatical features, syntactic dependencies, and parts of speech across three language sample types—PSYCHS interviews (blue), open-ended interviews (green), and audio diaries (audio).
An OR of 1 indicates no association with CHR status. ORs greater than 1 suggest that higher feature values are associated with increased odds of being classified as CHR, while ORs less than 1 indicate a negative association. Features marked with an “x” were statistically significant based on the Wald z-test.
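How an OR and its Wald z-test relate can be shown with a short sketch. It assumes the ORs come from a logistic regression, so that each OR is the exponential of a coefficient with a standard error; the caption does not state the model, and the function name and the input values are hypothetical.

```python
import math

def wald_summary(beta, se):
    """Summarize one logistic-regression coefficient (assumed source of the OR)."""
    odds_ratio = math.exp(beta)            # OR = exp(beta); beta = 0 gives OR = 1
    z = beta / se                          # Wald z statistic
    p = math.erfc(abs(z) / math.sqrt(2))   # two-sided p-value under a standard normal
    ci95 = (math.exp(beta - 1.96 * se), math.exp(beta + 1.96 * se))
    return {"or": odds_ratio, "z": z, "p": p, "ci95": ci95}

# Hypothetical coefficient: a feature whose one-unit increase doubles the odds of CHR status.
result = wald_summary(beta=math.log(2), se=0.2)
```

A coefficient of zero maps to an OR of exactly 1 and a z of 0, which matches the caption's reading of 1 as "no association".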

