Ph.D., South Dakota School of Mines & Technology
Decoding the Microbiome’s Role in Neuropsychiatric Disorders Using Computational and Machine Learning Approaches
Neuropsychiatric disorders (NPDs)—including autism spectrum disorder, major depressive disorder, schizophrenia, bipolar disorder, and Alzheimer’s disease—are highly prevalent, multifactorial conditions that affect millions globally. These disorders are notoriously difficult to treat due to their complex etiology, variability across individuals, and the lack of therapies that address underlying biological mechanisms. Emerging research suggests that the gut microbiome plays a critical role in regulating brain function and behavior through the gut-brain axis. Disruption of the microbial balance, or dysbiosis, is increasingly implicated in the pathogenesis of these disorders. Understanding the microbiome’s influence on mental health could unlock transformative therapeutic strategies. This project integrates computational biology, systems microbiology, and machine learning to investigate microbiome-driven mechanisms in NPDs and to identify novel microbial biomarkers and therapeutic leads. The research is divided into three interconnected stages:
(1) We will curate publicly available microbiome datasets from major repositories such as the Human Microbiome Project (HMP), Human Connectome Project (HCP), and CNGBdb. These datasets include metagenomic, metabolomic, and transcriptomic data from healthy and affected individuals. Standardized preprocessing using tools like QIIME2, Trimmomatic, and PICRUSt will ensure high-quality, reproducible data for downstream analysis. Clinical metadata (age, sex, diet, lifestyle, comorbidities) will be integrated to enhance demographic representation and control for confounders.
(2) In this phase, we will extract key features such as microbial taxa, functional genes, and metabolite profiles that distinguish healthy individuals from those with NPDs. Diversity analyses (Shannon, Simpson, and Chao1 indices for α-diversity; Bray-Curtis, Jaccard, and UniFrac for β-diversity) will be performed to assess community structure. Statistical modeling will help validate microbial features linked to disease phenotypes. Manual curation of selected datasets will reinforce interpretability and ensure that domain knowledge complements machine-driven outputs.
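As an illustration of the diversity measures named above, the following is a minimal sketch in plain Python, not the project's actual pipeline. It assumes each sample is a non-empty list of raw taxon counts in a shared taxon order; the function names and the toy counts are hypothetical. Simpson is computed in its Gini-Simpson form (1 - Σp²), Chao1 in its bias-corrected form, and Jaccard on presence/absence. UniFrac is omitted because it additionally requires a phylogenetic tree.

```python
import math


def shannon(counts):
    """Shannon index H' = -sum(p_i * ln p_i) over taxa with nonzero counts."""
    total = sum(counts)
    return -sum((c / total) * math.log(c / total) for c in counts if c > 0)


def simpson(counts):
    """Gini-Simpson index: 1 - sum(p_i^2)."""
    total = sum(counts)
    return 1.0 - sum((c / total) ** 2 for c in counts)


def chao1(counts):
    """Bias-corrected Chao1 richness: S_obs + F1*(F1-1) / (2*(F2+1))."""
    s_obs = sum(1 for c in counts if c > 0)   # observed taxa
    f1 = sum(1 for c in counts if c == 1)     # singletons
    f2 = sum(1 for c in counts if c == 2)     # doubletons
    return s_obs + f1 * (f1 - 1) / (2 * (f2 + 1))


def bray_curtis(a, b):
    """Bray-Curtis dissimilarity: sum|a_i - b_i| / sum(a_i + b_i)."""
    return sum(abs(x - y) for x, y in zip(a, b)) / (sum(a) + sum(b))


def jaccard_distance(a, b):
    """Jaccard distance on presence/absence: 1 - |A & B| / |A | B|."""
    present_a = {i for i, c in enumerate(a) if c > 0}
    present_b = {i for i, c in enumerate(b) if c > 0}
    return 1.0 - len(present_a & present_b) / len(present_a | present_b)


# Hypothetical toy counts for two samples over the same four taxa.
sample_a = [10, 10, 10, 10]
sample_b = [20, 0, 10, 10]

print("Shannon (a):", shannon(sample_a))
print("Gini-Simpson (a):", simpson(sample_a))
print("Chao1 (a):", chao1(sample_a))
print("Bray-Curtis (a, b):", bray_curtis(sample_a, sample_b))
print("Jaccard distance (a, b):", jaccard_distance(sample_a, sample_b))
```

In practice these quantities would more likely come from an established package in the preprocessing toolchain; the sketch only makes the definitions concrete.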
(3) We will apply advanced machine learning methods, including support vector machines (SVMs), random forests, convolutional neural networks (CNNs), and graph neural networks (TSP-GNN, GCNN), to classify disease states and identify predictive microbial signatures. Embedding algorithms will be used to map co-occurrence patterns and microbe-metabolite interactions.
We will further extend the project into therapeutic discovery by screening microbial metabolites for agonistic/antagonistic effects on human enzymes using molecular simulations. Promising metabolites will guide the in silico design of small-molecule drugs evaluated for drug-likeness, blood-brain barrier permeability, and ADME properties using ML pipelines and cheminformatics tools.
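One common first-pass filter for drug-likeness is Lipinski's rule of five. The sketch below is offered only as an example of such a filter, under the assumption that molecular descriptors have already been computed by a cheminformatics tool; the `Descriptors` class, the function names, and the example values are hypothetical, and the abstract does not state which drug-likeness criteria the project will use. Blood-brain barrier permeability and ADME prediction are not covered here.

```python
from dataclasses import dataclass


@dataclass
class Descriptors:
    """Precomputed descriptors for one candidate molecule (hypothetical container)."""
    mol_weight: float       # molecular weight in daltons
    log_p: float            # octanol-water partition coefficient
    h_bond_donors: int
    h_bond_acceptors: int


def lipinski_violations(d: Descriptors) -> int:
    """Count how many of Lipinski's four criteria the molecule violates."""
    checks = [
        d.mol_weight > 500,
        d.log_p > 5,
        d.h_bond_donors > 5,
        d.h_bond_acceptors > 10,
    ]
    return sum(checks)


def passes_rule_of_five(d: Descriptors, max_violations: int = 1) -> bool:
    """The rule is commonly applied with at most one violation allowed."""
    return lipinski_violations(d) <= max_violations


# Hypothetical descriptor values, for illustration only.
small_candidate = Descriptors(mol_weight=180.2, log_p=1.2,
                              h_bond_donors=1, h_bond_acceptors=4)
large_candidate = Descriptors(mol_weight=650.0, log_p=6.1,
                              h_bond_donors=2, h_bond_acceptors=8)

print(passes_rule_of_five(small_candidate))
print(passes_rule_of_five(large_candidate))
```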
This project directly supports the SD INBRE mission by combining innovation, core bioinformatics resource utilization, and undergraduate training. Students will be actively engaged in every stage, from dataset handling and annotation to ML implementation and result interpretation, gaining hands-on experience in biomedical research. The project will foster a culture of inquiry-driven learning and mentorship while contributing open-access data and workflows for the broader scientific community. Results from this research may pave the way toward microbiome-based diagnostics and personalized interventions for neuropsychiatric disorders.