- We are a Bioinformatics and Computational Genomics research group at Peking University, Beijing. Our research focuses on developing computational methods to decipher the “coded messages” in the genome by mining biological “big data”. Over the past decades, we have developed multiple bioinformatics tools to analyze, integrate, and visualize omics data, with applications in profiling and modeling the function and evolution of gene regulatory systems in humans and various model organisms.
- Reflecting the interdisciplinary nature of our research, members of our group come from very diverse backgrounds. People from Biology, Medicine, Chemistry, Mathematics, and Computer Science work closely together in the lab to advance the boundaries of knowledge.
- We are seeking talented and motivated students, postdocs, technicians, and research scientists. Please do not hesitate to contact us if that describes you!
scRNA-seq has developed rapidly in recent years, providing powerful technological support for studying important biological problems, including cellular functions and gene regulatory networks. In a typical scRNA-seq study, annotating cells by cell type and differentiation progress is usually the first step. However, the standard annotation process is rather tedious, and cannot guarantee Read more about Searching large-scale scRNA-seq databases via unbiased cell embedding with Cell BLAST[…]
Long noncoding RNAs (lncRNAs) have been demonstrated to play many important regulatory roles via various mechanisms (1), and many characteristics have been used to infer their functions. For example, analyzing the natural selection pressures on lncRNAs can reveal whether they are functional (2), and their subcellular localization indicates where they perform functions (3). Functional enrichment Read more about AnnoLnc2: the one-stop portal to systematically annotate novel lncRNAs for human and mouse[…]
By analyzing the largest compendium of 14,166 samples from normal and tumor tissues, Ge Gao’s team significantly expanded the landscape of human long noncoding RNA with a high-quality atlas: RefLnc (Reference catalog of LncRNA, http://reflnc.gao-lab.org/). Considering the influence RefLnc has had on the human lncRNA field, RefLnc has been named one of China’s top ten bioinformatics databases of Read more about RefLnc for China’s top ten bioinformatics database of 2019[…]
Well done, Dr. Jiang! We had a happy time together, along with some meaningful challenges, over the past few years. We wish you more success and prosperity in the days to come!
Convolutional neural networks (CNNs) have been widely used in sequence classification and mechanism mining of omics datasets since the publication of the two pilot studies, DeepBind (1) and DeepSEA (2). The intrinsic operation of CNNs, however, has long been regarded as a ‘black box’. Recently, some studies have attempted to tackle this problem; for example, Read more about Expectation Pooling: a novel pooling layer with explicit probabilistic interpretation[…]
On November 8, 2019, the research group of Ge Gao, a principal investigator at the State Key Laboratory of Protein and Plant Gene Research at Peking University, published a research paper entitled “PlantRegMap: charting functional regulatory maps in plants” in the journal Nucleic Acids Research (https://doi.org/10.1093/nar/gkz1020). The study charted the functional transcriptional regulatory map of 63 Read more about PlantRegMap: charting functional regulatory maps in plants[…]
Long noncoding RNAs (lncRNAs) are defined as non-coding transcripts longer than 200 nt (1,2). They have been shown to perform diverse functions in multiple biological and pathological processes (3). A high-quality and comprehensive lncRNA annotation is a cornerstone requirement for subsequent functional investigation. However, large discrepancies still exist among the current major annotations. By analyzing Read more about RefLnc: An expanded landscape of human long noncoding RNA[…]
Well done, Dr. Kang! We wish you more success and prosperity in the days to come!
A Comprehensive Database for Genetic Variants Associated with Esophageal Squamous Cell Carcinoma in Chinese Population
Esophageal squamous-cell carcinoma (ESCC) is one of the most lethal malignancies in the world and occurs at a particularly high frequency in China. While several genome-wide association studies (GWAS) of germline variants and whole-genome or whole-exome sequencing studies of somatic mutations for ESCC in the Chinese population have been published, there is no comprehensive database publicly available. Read more about A Comprehensive Database for Genetic Variants Associated with Esophageal Squamous Cell Carcinoma in Chinese Population[…]
Transcription factors bind to cis-regulatory elements (transcription factor binding sites, TFBSs) to regulate the transcription of downstream genes (Latchman, 1997). Variants within TFBSs can alter the binding strength of transcription factors and contribute to the pathogenesis of human diseases, including cancers (Huang et al., 2014; Liu et al., 2017). Although many tools have been Read more about Accurately annotating compound effects resulting from multiple variants at promoter regions[…]