Integrating T-cell receptor and transcriptome for large-scale single-cell immune profiling analysis
Abstract
Recent advances in single-cell immune profiling enable simultaneous measurement of the transcriptome and T-cell receptor (TCR) sequences, offering a promising approach to studying immune responses at cellular resolution. Yet combining these different types of information from multiple datasets into a joint representation is complicated by the unique characteristics of each modality and by technical effects between datasets. Here, we present mvTCR, a multimodal generative model that learns a unified representation across modalities and datasets for joint analysis of single-cell immune profiling data. We show that mvTCR allows the construction of large-scale, multimodal T-cell atlases by distilling modality-specific properties into a shared view, enabling unique and improved data analysis. Specifically, we demonstrate mvTCR’s potential by revealing SARS-CoV-2-specific T-cell clusters and separating them from bystander cells, a distinction that would have been missed when analyzing either modality alone. Finally, mvTCR can enable automated analysis of new datasets when combined with transfer-learning approaches. Overall, mvTCR provides a principled solution for standard analysis tasks on single-cell immune profiling data, such as multimodal integration, clustering, specificity analysis, and batch correction.