Genome-Wide Smoke Related Methylation Signature Genes Identification for Lung Adenocarcinomas
Wang Shixiang1, Zhang Fei1, Wang Ling2, Song Kai1, 3*
1(School of Chemical Engineering and Technology, Tianjin University, Tianjin 300072, China)
2(First Affiliated Hospital of Dalian Medical University, Dalian 116011, Liaoning, China)
3(University of Texas Southwestern Medical Center, Dallas 75235, USA)
Abstract: To understand the biological mechanism of lung adenocarcinoma in never smokers, we used genome-wide methylation values (ME) to identify signature genes that distinguish current smokers from never smokers. To cope with small sample size, high dimensionality, and high noise, and to keep the whole-genome background from overwhelming the few dozen signature genes, we applied a new integrative selection method iteratively. Instead of relying on a single criterion for gene selection, we selected genes according to their performance in significance tests, the relationship between their methylation and expression levels, their biological function, and their contribution to current/never smoker classification. From 127 lung adenocarcinoma samples downloaded from the TCGA database, 48 genes were identified as smoking-related ME signature genes. We then used 64 EDRN lung adenocarcinoma samples as an independent validation set. Using only the methylation values of these 48 signature genes, the current/never smoker classification accuracy was 87.5% (sensitivity SN = 87.2%, specificity SP = 87.8%) on the TCGA training set and 76.4% (SN = 80.2%, SP = 73.6%) on the EDRN validation set. Cross-study comparison showed that 17 of the 48 signature genes are strongly cancer related. In addition, we demonstrated the importance of their corresponding methylation values. Ingenuity Pathway Analysis (IPA) and Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway analysis revealed the relationships among these genes at the gene network and pathway levels, and indicated that they are involved in pathways strongly associated with cancer.
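The accuracy, SN, and SP figures quoted above follow the usual confusion-matrix definitions. As a minimal sketch, assuming current smokers are treated as the positive class (the abstract does not say which class is positive), the three measures could be computed as below. The function name and the toy labels are hypothetical and are not taken from the study.

```python
from typing import Dict, Sequence


def classification_metrics(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[str, float]:
    """Return accuracy, sensitivity (SN) and specificity (SP) for binary labels.

    Assumption (not stated in the abstract): 1 = current smoker (positive),
    0 = never smoker (negative).
    """
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("y_true and y_pred must be non-empty and of equal length")

    # Count the four cells of the confusion matrix.
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)

    return {
        "accuracy": (tp + tn) / len(y_true),
        # SN: fraction of true positives that were recovered.
        "sensitivity": tp / (tp + fn) if (tp + fn) else float("nan"),
        # SP: fraction of true negatives that were recovered.
        "specificity": tn / (tn + fp) if (tn + fp) else float("nan"),
    }


# Toy labels for illustration only; these are not data from the paper.
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 0]
metrics = classification_metrics(y_true, y_pred)
```

If never smokers were taken as the positive class instead, SN and SP would simply swap roles, while the accuracy would be unchanged.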
Wang Shixiang, Zhang Fei, Wang Ling, Song Kai. Genome-Wide Smoke Related Methylation Signature Genes Identification for Lung Adenocarcinomas [J]. Chinese Journal of Biomedical Engineering (中国生物医学工程学报), 2016, 35(3): 301-309.