Vol.39 No.04

Journal of Xi'an Jiaotong University

Nov.2005

Rough Set Clustering Algorithm Based on Entropy and Information Granularity
He Ming, Feng Boqin, Ma Zhaofeng, Fu Xianghua
(Department of Computer Science and Technology, Xi'an Jiaotong University, Xi'an 710049, China)


Abstract: Most existing clustering algorithms handle only numeric data or only categorical data, not mixed data. To address this, a clustering algorithm based on entropy and information granularity is proposed, using different granular partitions within the framework of rough set theory. Using a similarity relation, the entropy at each data point is calculated, and the data point with minimum entropy is selected as a clustering center. The numeric granule structure is formed by aggregating all data points whose similarity with the chosen clustering center is larger than a threshold β. The entropy value at each data point does not need to be adjusted during the clustering procedure, which saves computation time. The character granule structure is formed by using the indiscernibility relation in rough set theory. Cluster analysis of mixed-attribute data is accomplished by iteratively modifying and agglomerating these two granule structures. Experimental comparisons show that the algorithm is effective and feasible. When β is 0.8, the clustering validity of the algorithm reaches its maximum of 0.96, which is higher than that of the other algorithms under the same conditions.
Keywords: rough set; entropy; clustering analysis; information granularity
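
The two granulation steps described in the abstract can be illustrated with a minimal Python sketch. The abstract does not give the similarity measure or the entropy formula, so the choices below are assumptions: similarity is taken as exp(-alpha * Euclidean distance), and the entropy of a point sums the binary entropy of its similarities to all other points, as is common in entropy-based clustering. The function names, the parameter alpha, and the sample data are hypothetical. The iterative merging of the numeric and character granule structures is not shown, since the abstract does not describe it in detail.

```python
import math
from collections import defaultdict


def similarity(p, q, alpha=1.0):
    """Assumed similarity measure: exp(-alpha * Euclidean distance), in (0, 1]."""
    return math.exp(-alpha * math.dist(p, q))


def point_entropies(points, alpha=1.0):
    """Assumed entropy of each point, built from its similarities to all others."""
    entropies = []
    for i, p in enumerate(points):
        total = 0.0
        for j, q in enumerate(points):
            if i == j:
                continue
            s = similarity(p, q, alpha)
            # Similarities of exactly 0 or 1 contribute nothing (limit of x*log x).
            if 0.0 < s < 1.0:
                total -= s * math.log2(s) + (1.0 - s) * math.log2(1.0 - s)
        entropies.append(total)
    return entropies


def numeric_granules(points, beta=0.8, alpha=1.0):
    """Form numeric granules around minimum-entropy centers.

    Entropies are computed once and not updated while points are removed,
    matching the abstract's remark that entropy values need no adjustment.
    """
    entropies = point_entropies(points, alpha)
    remaining = set(range(len(points)))
    granules = []
    while remaining:
        # The remaining point with minimum entropy becomes the next center.
        center = min(remaining, key=lambda i: (entropies[i], i))
        # Aggregate every remaining point whose similarity to the center exceeds beta.
        members = {
            i for i in remaining
            if similarity(points[i], points[center], alpha) > beta
        }
        members.add(center)  # keep the center even if beta >= 1
        granules.append((center, sorted(members)))
        remaining -= members
    return granules


def character_granules(rows):
    """Indiscernibility classes: objects with identical categorical values."""
    classes = defaultdict(list)
    for idx, row in enumerate(rows):
        classes[tuple(row)].append(idx)
    return list(classes.values())


if __name__ == "__main__":
    pts = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (5.0, 5.0), (5.1, 5.0)]
    for center, members in numeric_granules(pts, beta=0.8):
        print("center", center, "members", members)
    print(character_granules([("a", "x"), ("b", "y"), ("a", "x")]))
```

In this sketch a larger beta yields smaller, tighter numeric granules, while alpha controls how quickly similarity decays with distance. Both would need tuning for real data.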