A Novel Evaluation Criterion for the Discrimination of Features and its Application in the Diagnosis of Erythemato-Squamous Disease
1 School of Computer Science, Shaanxi Normal University, Xi′an 710062, China
2 School of Electronic Engineering, Xidian University, Xi′an 710071, China
3 School of Information Engineering, National Laboratory of Automatic Target Recognition (ATR), Shenzhen University, Shenzhen 518060, China
Abstract The F-score does not account for the influence that the different measurement units of features have on each feature's discrimination between classes. To overcome this disadvantage, this paper presents a new criterion, referred to as D-score, which measures the discrimination of sample features between classes without being affected by the differing units of the features. The D-score criterion is combined with the Sequential Forward Search (SFS) strategy and the Sequential Forward Floating Search (SFFS) strategy, respectively, to select features, while a Support Vector Machine (SVM) serves as the classification tool that guides feature selection by evaluating the classification performance of the selected feature subset. The two resulting hybrid feature selection methods combine the advantages of filters and wrappers. They were applied to the dermatology dataset from the UCI machine learning repository to find the key features for diagnosing erythemato-squamous disease and thereby help doctors make correct decisions. Tenfold cross-validation experiments were conducted to compare D-score with our previously improved F-score in evaluating the discriminability of features between classes for this diagnosis. The experimental results show that the D-score criterion is much more effective than the improved F-score. With the SFS search strategy, it improves accuracy by 1.11%. With the SFFS search strategy, it improves the minimum accuracy among the 10 folds by about 3.00% and the average accuracy by around 0.3%, and it reaches a best accuracy of 100%. The common features selected across the 10 folds via D-score with SFFS are a subset of those selected via the improved F-score combined with SFFS.
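The hybrid filter-plus-wrapper idea described in the abstract can be illustrated with a minimal sketch. It rests on several assumptions that are not taken from the paper: the abstract does not give the D-score formula, so `placeholder_score` is a hypothetical stand-in (a simple between-class over within-class spread ratio); a nearest-centroid classifier replaces the SVM to keep the example dependency-free; and the search shown is one plausible reading of a score-ranked forward search, not the authors' exact SFS or SFFS procedure. All function names are illustrative.

```python
from statistics import mean


def placeholder_score(column, labels):
    """Hypothetical filter score (NOT the paper's D-score):
    between-class spread divided by summed within-class variance."""
    overall = mean(column)
    between, within = 0.0, 0.0
    for c in set(labels):
        values = [v for v, y in zip(column, labels) if y == c]
        m = mean(values)
        between += (m - overall) ** 2
        within += sum((v - m) ** 2 for v in values) / max(len(values) - 1, 1)
    if within == 0:
        return float("inf") if between > 0 else 0.0
    return between / within


def nearest_centroid_accuracy(train_X, train_y, test_X, test_y, features):
    """Stand-in wrapper classifier (the paper uses an SVM):
    accuracy of a nearest-centroid rule restricted to `features`."""
    centroids = {}
    for c in set(train_y):
        rows = [x for x, y in zip(train_X, train_y) if y == c]
        centroids[c] = [mean(r[f] for r in rows) for f in features]
    correct = 0
    for x, y in zip(test_X, test_y):
        predicted = min(
            centroids,
            key=lambda c: sum((x[f] - m) ** 2
                              for f, m in zip(features, centroids[c])),
        )
        correct += predicted == y
    return correct / len(test_y)


def ranked_forward_selection(train_X, train_y, val_X, val_y,
                             score=placeholder_score,
                             evaluate=nearest_centroid_accuracy):
    """Filter step: rank features by `score`, highest first.
    Wrapper step: try features in that order and keep one only if
    validation accuracy strictly improves."""
    n_features = len(train_X[0])
    ranking = sorted(
        range(n_features),
        key=lambda f: score([row[f] for row in train_X], train_y),
        reverse=True,
    )
    selected, best = [], 0.0
    for f in ranking:
        accuracy = evaluate(train_X, train_y, val_X, val_y, selected + [f])
        if accuracy > best:
            selected.append(f)
            best = accuracy
    return selected, best


# Toy data: feature 0 separates the classes, feature 1 is noise.
toy_train_X = [[0.0, 5.0], [0.2, 1.0], [1.0, 4.0], [1.2, 2.0]]
toy_train_y = [0, 0, 1, 1]
toy_val_X = [[0.1, 4.5], [1.1, 1.5]]
toy_val_y = [0, 1]
selected, best = ranked_forward_selection(
    toy_train_X, toy_train_y, toy_val_X, toy_val_y)
print(selected, best)
```

Because `score` and `evaluate` are passed in as parameters, a real scoring criterion and an SVM-based evaluator could be substituted without changing the search loop. In a tenfold cross-validation setting, the selection would be repeated once per fold.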