Shenzhen Bay Laboratory

Computational and Disease Genomics Lab


Genotype Tissue-Plasma Omics Project (GTOP)


Current functional genomics resources, most notably GTEx, are overwhelmingly based on European-ancestry cohorts and short-read sequencing. This leaves two critical blind spots: East Asian populations lack dedicated regulatory references, and structural variants (SVs) and tandem repeats (TRs) are systematically missed. We lead the GTOP project to close both gaps simultaneously, integrating PacBio HiFi whole-genome sequencing, full-length transcriptomics, and proteomics across 33 tissues from 160 East Asian donors. GTOP has revealed that SVs/TRs are 4.5-fold enriched among high-confidence causal variants, demonstrating that a large fraction of GWAS signals previously attributed to SNVs are in fact driven by structural variation. It has also shown that over half of East Asian disease loci cannot be explained by existing European resources. Together, these results establish GTOP as an indispensable foundation for East Asian precision medicine.


Multi-dimensional Noncoding Variant Interpretation


Genome-wide association studies have identified thousands of noncoding disease variants, yet functional interpretation remains a major challenge because existing approaches focus almost exclusively on gene expression. Our work on alternative polyadenylation (APA) is the first to systematically link 3′UTR regulation to disease genetics: we constructed the first human 3′aQTL atlas (Nature Genetics, 2021), showing that APA alone explains ~16.1% of previously uninterpretable disease heritability. We then constructed the first immune-response APA regulatory map (Nature Communications, 2023) and have since expanded this work into a complete regulatory framework, extending from 3′aQTL to the discovery of 5′aQTL, a new class of transcription-initiation QTL (Science Advances, 2024; Nature Communications, 2026), and to the first genetic effect map of enhancer RNAs (Advanced Science, 2025). Collectively, these studies reveal that the majority of post-transcriptional causal genes are invisible to conventional eQTL analysis. We have further demonstrated clinical translation potential by showing that 3′UTR shortening drives tumor immune evasion and by developing a CRISPR-based 3′UTR-targeted therapeutic strategy (Nature Biomedical Engineering, 2026).


Single-cell Genetic Regulation


Tissue-level analyses average out cell-type-specific effects, which are often the most relevant to disease. Our sc-xQTL framework addresses this by simultaneously mapping gene expression, APA, and RNA editing QTLs at single-cell resolution across 31 immune cell types from over 6 million cells. The key finding is striking: post-transcriptional QTLs explain substantially more of the heritability of immune disease than traditional expression QTLs, and 70% of the causal genes identified through APA and RNA editing are entirely independent of gene expression, fundamentally redefining the regulatory landscape of immune disease. Supporting this direction, we have built scQTLbase, the largest integrated single-cell eQTL database, covering 57 cell types and 95 cell states (Nucleic Acids Research, 2024), and developed MAAS for multi-modal single-cell integration (Genome Medicine, 2026).