Machine Learning Methods for Quantification of Depression Severity and Prediction of Recovery Trajectory using Longitudinal Video and Audio Data, with Applications to Deep Brain Stimulation Treatment Optimization

Harati, Sahar (Summer 2019)

Permanent URL: https://etd.library.emory.edu/concern/etds/s1784m96x?locale=es
Published

Abstract

Predictive analytics and computational phenotyping techniques have shown promising results in several areas of medicine, including automated classification of radiology images, survival analysis in cancer patients, and prediction of life-threatening events in hospitalized patients. In recent years, computational psychiatry has emerged as a field that combines multiple levels and types of data with computational modeling to improve the understanding, prediction, and treatment of mental illness. Mental health patients often undergo a variety of non-invasive (e.g., cognitive counseling) and invasive (e.g., surgery) therapies before finding an effective treatment plan. Improved prediction of treatment response can shorten the duration of clinical trials and improve patient experience and outcomes. A key challenge in applying predictive modeling to this problem is that the effectiveness of a treatment regimen often remains unknown for several weeks. In this thesis, we propose machine learning approaches for extracting audio-visual features to predict the likely outcome of Deep Brain Stimulation (DBS) treatment several weeks in advance for patients suffering from major depressive disorder, a common psychiatric illness for which there are no objective, non-verbal, automated markers that can reliably track treatment response. We first explore the use of video analysis of facial expressivity in a cohort of severely depressed patients before and after DBS. We introduce a set of variability measurements to obtain unsupervised features from muted video recordings. We then leverage the link between short-term emotions and long-term depressed mood states and use a neural network model on top of emotion-based audio features. The results show that unsupervised features extracted from these audio and video recordings, when incorporated in classification models, can discriminate between different levels of depression severity during ongoing DBS treatment.
Moreover, for long-term prediction in the absence of immediate treatment-response feedback, we use a joint state-estimation and temporal difference learning approach, implemented with deep neural networks, to model both the trajectory of a patient’s response and the delayed nature of the feedback. The results, based on longitudinal recordings of patients with depression, show that the learned state values are predictive of the long-term success of DBS treatment. Our findings suggest that machine learning models can discover objective biomarkers of depression and of patient response to treatment, which have the potential to standardize treatment protocols and enhance the design of future clinical trials.
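To illustrate the temporal difference idea mentioned in the abstract, the following is a minimal sketch of tabular TD(0) value estimation when feedback arrives only at the end of a trajectory. It is not the dissertation's method, which uses deep neural networks with joint state estimation. The function names, the four-week chain of states, the binary end-of-treatment reward, and the response probability are all hypothetical choices made for this example.

```python
import random


def td0_values(episodes, n_states, alpha=0.1, gamma=1.0):
    """Tabular TD(0) value estimation.

    Each episode is a list of (state, reward, next_state) steps,
    with next_state set to None on the final step.
    """
    values = [0.0] * n_states
    for episode in episodes:
        for state, reward, next_state in episode:
            # Bootstrap from the current estimate of the next state's value.
            bootstrap = 0.0 if next_state is None else values[next_state]
            td_error = reward + gamma * bootstrap - values[state]
            values[state] += alpha * td_error
    return values


def simulate_episode(rng, n_weeks=4, p_response=0.7):
    """Hypothetical toy trajectory: one state per week, visited in order.

    The only nonzero reward is a binary outcome delivered on the last
    step, which mimics feedback that is delayed by several weeks.
    """
    outcome = 1.0 if rng.random() < p_response else 0.0
    steps = []
    for week in range(n_weeks):
        is_last = week == n_weeks - 1
        reward = outcome if is_last else 0.0
        next_state = None if is_last else week + 1
        steps.append((week, reward, next_state))
    return steps


rng = random.Random(0)
episodes = [simulate_episode(rng) for _ in range(5000)]
values = td0_values(episodes, n_states=4, alpha=0.02)
print([round(v, 2) for v in values])
```

In this toy setting the delayed outcome propagates backward through the weekly states, so early-week value estimates come to reflect the eventual response rate even though no reward is observed in those weeks.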

Table of Contents

1 Introduction
1.1 Machine Learning and Applications in Mental Health
1.1.1 Machine Learning Methods
1.2 Major Depressive Disorder
1.2.1 Assessment
1.2.2 Treatment and Recovery
1.3 Automatic Depression Assessment from Visual Cues and Audio
1.4 Contributions
1.5 Data
1.5.1 Subjects and Clinical Assessment
1.5.2 Video Collection

2 Visual Feature Extraction and Analysis
2.1 Introduction
2.2 Preprocessing
2.2.1 Face Detection
2.2.2 Normalization
2.2.3 Face Alignment and Registration
2.2.4 Downsampling
2.3 Methods
2.3.1 Multi-scale Entropy
2.3.2 Switching Linear Dynamical Systems
2.4 Results
2.4.1 Visualization of Features
2.5 Conclusion

3 Depression Classification via Visual Features
3.1 Introduction
3.2 Methods
3.2.1 Evaluation Methods and Statistical Analysis
3.2.2 Feature Selection
3.3 Results
3.4 Conclusion

4 Depression Classification via Audio Features
4.1 Introduction
4.2 Methods
4.2.1 Preprocessing
4.2.2 Basic Features
4.2.3 Emotion Features
4.2.4 Aggregation
4.2.5 Prediction
4.2.6 Baselines
4.3 Results
4.4 Conclusion

5 Treatment Outcome Prediction
5.1 Introduction
5.2 Methods
5.2.1 Feature Extraction
5.2.2 Temporal Difference Learning
5.2.3 Baselines and Performance Measure
5.3 Results
5.4 Conclusion

6 Conclusion

Appendix A Switching Linear Dynamical Systems
A.1 Modeling
A.2 System Identification
A.3 Experimental Setup
A.4 Latent Dynamical Analysis

Appendix B Elastic Net Ordinal Logistic Regression

Appendix C Audio Features

Appendix D Data

Bibliography

About this Dissertation

Rights statement
  • Permission granted by the author to include this thesis or dissertation in this repository. All rights reserved by the author. Please contact the author for information regarding the reproduction and use of this thesis or dissertation.
School
Department
Degree
Submission
Language
  • English
Research Field
Keyword
Committee Chair / Thesis Advisor
Committee Members
Last modified
