Dissertation Defense: Yutong Luo
Candidate: Yutong Luo
Major: Biostatistics
Advisor: Ruzong Fan, Ph.D.
Deconvolution Analysis and Differential Expression Inference for Bulk Tissues and Spatial Transcriptomics
Understanding the cell type composition and differential gene expression of RNA-sequencing data is crucial for explaining phenotypic variability and for detecting key factors that influence disease susceptibility for complex traits. Detecting cell-type-specific expression fractions, expression patterns, and differential expression is important for studying cellular components, the gene expression of individual cell classes, and structural architecture. Using single-cell RNA-sequencing (scRNA-seq) reference data, I develop linear fixed and mixed effect models to perform deconvolution analysis for bulk tissues, and mixed effect multiplicative-additive Poisson-Gamma models to perform deconvolution analysis and cell-type-specific inference for spatial transcriptomics data. The rationale is that the cell-type-specific gene expression information from one scRNA-seq dataset can be transferred to bulk RNA-sequencing data and spatial transcriptomics data.
To estimate expression fractions in bulk RNA-sequencing data, the linear fixed effect models are built from the mean parameters estimated from reference single-cell data, while the linear mixed effect models use both the mean and the variance-covariance parameters, which reflect within-cell-type stochasticity. For deconvolution analysis of spatial transcriptomics data, gene expression counts are treated as dependent variables, and the mean and variance parameters of the single-cell RNA-sequencing data are used to construct independent variables that explain them on the basis of a hierarchical Poisson-Gamma mixture. One novelty of the proposed mixed models is that the variance parameters of the single-cell RNA-sequencing data are used to describe stochastic within-cell-type variation. To detect cell-type-specific differentially expressed genes in spatial transcriptomics data, the estimated expression fractions and their covariances from the deconvolution analysis are used to build Poisson-Gamma mixture models. In simulation studies and real data analyses, the proposed models perform as well as or better than well-performing existing methods.
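
As a rough illustration of the fixed effect idea described above, the following minimal sketch treats bulk expression as a linear combination of cell-type mean expression profiles and estimates non-negative fractions. It is not the candidate's implementation: the abstract does not state an estimation procedure, so the multiplicative-update non-negative least squares solver, the function name deconvolve, the toy signature matrix, and the noise-free bulk sample are all assumptions made for illustration.

```python
# Illustrative sketch only: bulk[g] ~ sum_k signature[g][k] * p[k], with p >= 0.
# The solver (multiplicative updates for non-negative least squares) and all
# names and numbers below are assumptions, not taken from the dissertation.

def deconvolve(signature, bulk, n_iter=2000, eps=1e-12):
    """Estimate cell type fractions from mean expression profiles.

    signature: G rows (genes), each a list of K non-negative cell-type means
    bulk:      G non-negative bulk expression values
    Returns K non-negative fractions normalized to sum to 1.
    """
    n_genes, n_types = len(signature), len(signature[0])
    # Precompute S^T y and S^T S once; they do not change across iterations.
    sty = [sum(signature[g][k] * bulk[g] for g in range(n_genes))
           for k in range(n_types)]
    sts = [[sum(signature[g][j] * signature[g][k] for g in range(n_genes))
            for k in range(n_types)]
           for j in range(n_types)]
    p = [1.0 / n_types] * n_types  # start from equal fractions
    for _ in range(n_iter):
        denom = [sum(sts[j][k] * p[k] for k in range(n_types))
                 for j in range(n_types)]
        # The multiplicative update keeps every p[j] non-negative.
        p = [p[j] * sty[j] / (denom[j] + eps) for j in range(n_types)]
    total = sum(p)
    return [x / total for x in p]


# Hypothetical reference means: 6 genes (rows) by 3 cell types (columns).
signature = [
    [10.0, 1.0, 0.5],
    [8.0, 0.5, 1.0],
    [1.0, 9.0, 0.5],
    [0.5, 7.0, 1.0],
    [1.0, 0.5, 12.0],
    [0.5, 1.0, 6.0],
]
true_fractions = [0.5, 0.3, 0.2]
# Noise-free bulk sample generated from the assumed fractions.
bulk = [sum(s * f for s, f in zip(row, true_fractions)) for row in signature]

estimated = deconvolve(signature, bulk)
print([round(x, 3) for x in estimated])
```

The sketch uses only the mean parameters, as in the fixed effect setting. A mixed effect version would additionally weight genes by the within-cell-type variance-covariance estimates, which is not shown here.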