Abstract: The pathology report is the main basis for the diagnosis and treatment of breast cancer. In clinical practice, however, histological information is sometimes missing. In this study, imaging features of the lesion area in dynamic contrast-enhanced magnetic resonance imaging (DCE-MRI) were combined with the histological information of the corresponding breast cancer patients to build a radiomics model based on non-negative matrix factorization (NMF) for imputing missing molecular subtypes and cytokeratin 5/6 (CK5/6) gene expression. A total of 139 breast cancer cases were collected before surgery or chemotherapy and randomly divided into a training set of 89 cases and a test set of 50 cases. Breast tumor regions were segmented, and morphological and texture features were extracted from the lesion area and statistically analyzed. Support vector machine recursive feature elimination with cross-validation (SVM-RFECV) was used for feature selection, and the image features were further filtered through a union-based method. Combining these features with the clinicopathological information of the patients, an NMF imputation model and a collaborative filtering (CF) imputation model were established, and the area under the curve (AUC) was calculated to evaluate imputation performance. Across the different missing rates of clinicopathological information, the AUC of the NMF model was higher than that of the CF model, with a highest AUC of 0.772, and NMF imputation was significantly better than CF (P<0.05) when the missing rate was between 20% and 40%. Across the different numbers of image features, the AUC of the NMF model was also higher than that of the CF model, with a highest AUC of 0.780, and the difference between the two was statistically significant (P<0.05) when 140 image features were used. These results indicate that DCE-MRI radiomics combined with non-negative matrix factorization can effectively impute missing molecular subtypes and CK5/6 clinical indicators.
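The NMF imputation idea summarized in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes a generic masked variant of the multiplicative-update NMF, in which only observed entries contribute to the fit and the product of the two factors supplies the missing ones. The function name `nmf_impute`, the rank, the iteration count, and the toy matrix are all hypothetical choices made for illustration.

```python
import numpy as np


def nmf_impute(X, observed, rank=2, n_iter=500, eps=1e-9, seed=0):
    """Fill the unobserved entries of a non-negative matrix X.

    X        : (n_patients, n_columns) array; unobserved entries may be NaN.
    observed : boolean array of the same shape, True where X is known.
    Assumption: masked multiplicative updates minimizing the squared
    error over observed entries only (illustrative, not the paper's code).
    """
    rng = np.random.default_rng(seed)
    M = observed.astype(float)             # 1 = observed, 0 = missing
    Xo = np.where(observed, X, 0.0)        # zero out missing entries
    W = rng.random((X.shape[0], rank)) + 0.1
    H = rng.random((rank, X.shape[1])) + 0.1
    for _ in range(n_iter):
        # Update W, then H; the mask removes missing entries from both
        # the numerator (via Xo) and the denominator (via M * WH).
        W *= (Xo @ H.T) / ((M * (W @ H)) @ H.T + eps)
        H *= (W.T @ Xo) / (W.T @ (M * (W @ H)) + eps)
    # Keep observed values as they are; take missing ones from W @ H.
    return np.where(observed, X, W @ H)


# Toy example (hypothetical data): 8 "patients" x 6 columns, where the
# last column stands in for a clinical indicator missing for 3 patients.
rng = np.random.default_rng(1)
X_true = rng.random((8, 2)) @ rng.random((2, 6))   # non-negative, rank 2
observed = np.ones(X_true.shape, dtype=bool)
observed[[0, 3, 5], -1] = False
X_missing = X_true.copy()
X_missing[~observed] = np.nan

X_filled = nmf_impute(X_missing, observed, rank=2)
print("imputed:", np.round(X_filled[~observed], 3))
print("truth:  ", np.round(X_true[~observed], 3))
```

In a setting like the one described, the columns would hold the selected image features together with numerically encoded clinical indicators, and the imputed value of a binary indicator could be treated as a score when computing the AUC. How the indicators are encoded and scaled is an assumption left open here.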
Fu Zhenyu, Fan Ming, Li Lihua. DCE-MRI Radiomics Based Non-negative Matrix Factorization for Imputation of Missing Histological Information of Breast Cancer [J]. Chinese Journal of Biomedical Engineering, 2021, 40(4): 401-409.