Diagnosis of Breast Cancer Based on Tumor Parameters and Visualization of the Attribute Partial Order Structure Diagram
Liang Huaixin1, Song Jialin1, Zheng Cunfang1,2, Hong Wenxue1*
1(Institute of Electrical Engineering, Yanshan University, Qinhuangdao 066004, Hebei, China)
2(LiRen College, Yanshan University, Qinhuangdao 066004, Hebei, China)
Abstract To visualize the rules in breast cancer data, a method combining Lasso and incremental learning was proposed, using the optimized attribute partial order structure diagram as a tool. First, Lasso was used to select features of the breast cancer data and reduce its dimensionality: the four attributes with the largest correlation were selected from nine features. The granulation process was then completed using the Gini index, and the formal context was generated by means of the incremental learning algorithm. Next, a second Lasso step reduced the dimensionality from 17 to 3. Meanwhile, a new method for processing the rows and columns of the formal context, based on the Gini index and covering theory, was proposed to generate the attribute partial order structure diagram and visualize the rules concerned. Seven rules were extracted by analyzing the diagram, and the classification accuracy of the proposed method was compared with that of classical mainstream classifiers reported in the literature. The results showed that the classification precision of our method reached 96.52%, higher than that of the other five classifiers: Random Forest (94.25%), AdaBoost (90.00%), 1NN (91.33%), 3NN (90.67%), and SVM (95.00%). Finally, incremental proportions of the data (10%-90%) were used to verify the effect of the incremental learning algorithm. The results showed that the model was complete once the amount of data reached 30%, with precision close to that of the support vector machine, which indicates that the proposed method is an effective means of visualizing the diagnosis rules of breast cancer.
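The granulation step mentioned in the abstract can be illustrated with a minimal sketch. It assumes that granulation means cutting a numeric feature at the threshold that minimizes the weighted Gini impurity and turning the two sides of the cut into binary attributes of a formal context; the abstract does not spell out the exact procedure, so this is one plausible reading and not the authors' implementation. The function names, the feature name `thickness`, and the toy values are all hypothetical.

```python
from collections import Counter


def gini(labels):
    """Gini impurity of a list of class labels (0.0 for a pure or empty list)."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def best_split(values, labels):
    """Return (threshold, weighted_gini) for the cut with the lowest weighted Gini."""
    pairs = sorted(zip(values, labels))
    n = len(pairs)
    best = (None, float("inf"))
    for i in range(1, n):
        if pairs[i - 1][0] == pairs[i][0]:
            continue  # no cut is possible between equal values
        threshold = (pairs[i - 1][0] + pairs[i][0]) / 2
        left = [lab for v, lab in pairs if v <= threshold]
        right = [lab for v, lab in pairs if v > threshold]
        score = (len(left) * gini(left) + len(right) * gini(right)) / n
        if score < best[1]:
            best = (threshold, score)
    return best


def granulate(values, threshold, name):
    """Binarize one numeric feature into two attributes (columns) of a formal context."""
    low, high = f"{name}<={threshold}", f"{name}>{threshold}"
    return [{low: v <= threshold, high: v > threshold} for v in values]


# Toy data invented for illustration; not taken from the paper's dataset.
thickness = [1, 2, 3, 8, 9, 10]
diagnosis = ["benign"] * 3 + ["malignant"] * 3

threshold, score = best_split(thickness, diagnosis)
context_rows = granulate(thickness, threshold, "thickness")
print(threshold, score)
print(context_rows[0])
```

Each dictionary in `context_rows` is one object (row) of the formal context, with one Boolean per granulated attribute. In an incremental setting, new rows would be appended as further data arrive, though how the paper updates thresholds at that point is not stated in the abstract.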
Received: 20 June 2017
Corresponding Authors:
E-mail: hongwx@ysu.edu.cn