Speech-Based Detection of Cognitive Impairments in Older Adults: Longitudinal Validity Analysis and Cross-Lingual Generalization

Restricted; Files Only

Zheng, Xin Ran (Spring 2025)

Permanent URL: https://etd.library.emory.edu/concern/etds/kp78gh786?locale=it
Published

Abstract

Early detection of Alzheimer’s Disease (AD)—the most common form of dementia—and its prodromal stage, Mild Cognitive Impairment (MCI), is crucial for enabling timely interventions and effective care planning. However, current diagnostic practices—relying on clinical assessments, neuropsychological testing, and biomarker analysis—are often costly, time-intensive, and inaccessible, especially in underserved or resource-limited settings. These limitations have driven the development of speech-based screening tools, which offer the advantages of being scalable, non-invasive, and accessible. Most importantly, speech is highly sensitive to early neurodegenerative changes, often reflected in fluency and acoustic patterns. However, key challenges remain regarding the generalizability and stability of such models over time and across diverse linguistic populations.

This dissertation addresses these challenges in two parts. The first part focuses on the longitudinal analysis of speech-based models to assess their ability to track cognitive changes over time. As part of this, it explores the use of both hand-crafted and deep learning-derived speech features, encompassing acoustic and linguistic aspects, for pre-screening MCI. It also examines psychological well-being measures, such as loneliness and neuroticism, as complementary indicators of cognitive status. Results show that speech-based models remain stable over time, underscoring their potential for continuous cognitive monitoring. Speech features offer moderate utility for MCI pre-screening, whereas well-being measures have limited predictive value.

The second part turns to the challenge of cross-lingual generalizability, a key barrier in global dementia screening efforts. To address this, the study analyzes speech-based AD detection models across English, Greek, and Slovak datasets, evaluating their robustness and adaptability in multilingual settings. Multiple transfer learning techniques are applied to enhance the transferability of models trained in one language to others. Training approaches that leverage multilingual data, along with fine-tuning, yield strong results; however, transferability varies across language pairs, reflecting the complexity of cross-lingual generalization. These findings highlight both the promise and the limitations of current transfer learning approaches, emphasizing the need for more sophisticated techniques that can bridge linguistic boundaries in speech-based cognitive screening.

Together, these investigations advance the development of speech-based tools for early, accessible, and globally applicable cognitive impairment screening, while also identifying key limitations and future directions for improving their robustness and reach.

Table of Contents

- 1. Introduction

- 2. Related Work

 - 2.1 Speech Biomarker

 - 2.2 Lack of Longitudinal Validity Analysis

 - 2.3 Lack of Cross-lingual Generalization Analysis

- 3. Longitudinal Validity Analysis for Cognitive Impairment with Speech and Psychological Well-being Patterns

 - 3.1 Introduction

 - 3.2 Methods

  - 3.2.1 Dataset

  - 3.2.2 Outcomes and Clinical Assessment

  - 3.2.3 Audio Processing Pipeline Overview

  - 3.2.4 Preprocessing

  - 3.2.5 Feature Extraction

  - 3.2.6 Participant-Level Feature Aggregation

  - 3.2.7 Experiment Setting

 - 3.3 Results

  - 3.3.1 Longitudinal Validity Analysis of Cognitive Impairment Over Time

  - 3.3.2 User-Independent Audio-Based Classification of Cognitive Impairment

  - 3.3.3 User-Independent Classification via Well-being Scores

 - 3.4 Discussion

 - 3.5 Conclusion

- 4. Feasibility of Cross-Lingual Audio-Based AD Classification with Domain Adaptation

 - 4.1 Introduction

 - 4.2 Methods

  - 4.2.1 Datasets

  - 4.2.2 Feature Extraction and Evaluation

  - 4.2.3 Neural Network Model

  - 4.2.4 Cross-Lingual Adaptation Strategies

  - 4.2.5 Training and Evaluation Setup

 - 4.3 Results

  - 4.3.1 Monolingual Performance with Feature Set Comparison

  - 4.3.2 Within- and Zero-Shot Cross-lingual Inference

  - 4.3.3 Mixed-Batch Training

  - 4.3.4 Fine-Tuning Results

  - 4.3.5 Adversarial Learning

 - 4.4 Discussion

 - 4.5 Conclusion

- 5. Conclusion

About this Honors Thesis

Rights statement
  • Permission granted by the author to include this thesis or dissertation in this repository. All rights reserved by the author. Please contact the author for information regarding the reproduction and use of this thesis or dissertation.
School
Department
Degree
Submission
Language
  • English
Research Field
Keyword
Committee Chair / Thesis Advisor
Committee Members
Last modified
