A Review of Gene and Isoform Expression Analysis across Multiple Experimental Platforms
Wang Kaili1, Zhang Li2, Liu Xuejun1*
1(College of Computer Science and Technology, Nanjing University of Aeronautics and Astronautics, Nanjing 211106, China) 2(College of Information Science and Technology, Nanjing Forestry University, Nanjing 210037, China)
Abstract: Transcriptomics has become an active area of life science and medical research in recent years. From the expression point of view, transcriptomics rests on the measurement of gene expression levels. Differential expression (DE) analysis of genes is important for understanding gene function, and DE analysis of isoforms offers a practical way to detect changes in alternative splicing. Two kinds of large-scale experimental platform are currently used to measure gene expression: microarrays and high-throughput sequencing (RNA-Seq). This paper first introduces the technical principles of four mainstream platforms: Affymetrix's traditional 3' GeneChip, the Exon array, the Human Transcriptome Array 2.0, and the Illumina RNA-Seq platform. It then reviews the mainstream analysis methods, together with our own methods, for calculating gene expression levels and performing DE analysis on each platform. Finally, it compares expression measurement and DE analysis results across the platforms on a well-defined benchmark data set.
Wang Kaili, Zhang Li, Liu Xuejun. A Review of Gene and Isoform Expression Analysis across Multiple Experimental Platforms [J]. Chinese Journal of Biomedical Engineering, 2017, 36(2): 211-218.