Rule Extraction and Attribute Reduction Methods for Complex Data Based on Rough Sets (基于粗糙集的复杂数据规则提取和属性约简方法) - Details

Author：

Zhang Xiao (张晓)

Indexed by：

Dissertation Database

Abstract：

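Classical rough set theory, proposed by Z. Pawlak in 1982, is a mathematical theory for handling inconsistency in datasets. The practical need to analyze complex data has driven the extension of the classical rough set model to more general rough set models, and these generalized models have been widely applied in fields such as data mining and machine learning.

Attribute reduction and rule extraction are active research topics in the application of rough sets. Because real-world data usually contain redundant attributes, which make the extracted rules less compact, this dissertation uses the relevant generalized rough sets to study attribute reduction for covering decision systems, interval-valued decision systems, and fuzzy decision systems from the viewpoint of extracting compact rules. The main innovative contributions are as follows:

(1) The concept of a decision rule and the measure of its confidence in classical decision systems are extended to covering decision systems. The concept of covering granular rules is proposed, and the implication relationships among covering granular rules are studied. A confidence-preserving attribute reduction framework for covering decision systems is established to extract compact rules whose confidence is not less than a given threshold. A combinatorial optimization method for computing all reducts is further proposed, and the effectiveness of the reduction method is evaluated through numerical experiments.

(2) By defining an appropriate binary relation on interval-valued information systems, the concept of interval-valued granular rules is proposed together with a measure of their confidence. Methods are studied for extracting, from interval-valued decision systems, compact certain rules as well as compact possible rules whose confidence is not less than a given threshold. A confidence-preserving attribute reduction framework for interval-valued decision systems is established, along with a combinatorial optimization algorithm for computing all reducts. Numerical experiments then evaluate the effectiveness of the reduction method and the role of possible rules in decision making.

(3) The granular structure of fuzzy rough sets is studied on the basis of general fuzzy relations. Based on this granular structure, the concept of fuzzy granular rules is proposed and an attribute reduction framework for fuzzy decision systems is established. A Boolean reasoning method for computing all reducts of a fuzzy decision system and a heuristic algorithm for finding a single reduct are then given. As an application of fuzzy granular rules, a rule-based classifier is designed, and its effectiveness is evaluated through numerical experiments.

(4) For mixed datasets, a new information entropy is proposed based on the granular structure of the fuzzy lower approximation, and it is used to give an equivalent characterization of attribute reduction for fuzzy rough sets. A feature selection algorithm for mixed data based on this entropy is then given, and the effectiveness of the improved algorithm is evaluated through numerical experiments.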

Keyword：

Rough sets; Rule extraction; Decision systems; Attribute reduction; Feature selection

Author Community：

[ 1 ] School of Mathematics and Statistics, Xi'an Jiaotong University

Translated Abstract

The classical rough set theory was proposed by Z. Pawlak in 1982 to deal with inconsistency in datasets. The need for complex data analysis in practice has led to a variety of extensions of the classical rough set model, and the extended rough sets have been widely applied in fields such as data mining and machine learning.

Attribute reduction and rule acquisition are among the most important issues in the applications of rough sets. Because a real-world dataset usually contains redundant attributes, which make the extracted rules less compact, this dissertation focuses on attribute reduction methods for covering decision systems, interval-valued decision systems, and fuzzy decision systems from the viewpoint of acquiring compact rules. The main innovative work includes the following aspects:

(1) The definition of decision rules and the index of rule confidence in classical decision systems are extended to the case of covering decision systems. A concept of covering granular rules is presented, and the implication relationship between covering granular rules is investigated. Then, a confidence-preserving attribute reduction framework for covering decision systems is formulated to extract compact decision rules whose confidence is not less than a pre-specified threshold. Furthermore, a combinatorial optimization algorithm is developed to compute all the reducts of a covering decision system. Numerical experiments are conducted to demonstrate the validity of the proposed reduction method.

(2) By defining a proper binary relation on an interval-valued information system, the concept of an interval-valued granular rule is proposed and a new index is introduced to measure the confidence of an interval-valued granular rule. Then, the extraction of both compact certain rules and compact possible rules whose confidence is not less than a given threshold is studied, and a confidence-preserving attribute reduction framework is established for interval-valued decision systems. A combinatorial optimization approach is suggested to compute all the reducts of an interval-valued decision system. Furthermore, numerical experiments are conducted to evaluate the performance of the reduction approach and the gain from using the possible rules in decision making.

(3) The granular structures of fuzzy rough sets are established based on generic fuzzy relations. With the granular structures of the lower approximations, the concept of fuzzy granular rules is presented for fuzzy decision systems. An attribute reduction framework for fuzzy decision systems is formulated to extract compact fuzzy granular rules. Furthermore, a Boolean reasoning method is proposed to compute all the reducts, and a heuristic algorithm is provided to search for one of the reducts. Finally, a classifier is constructed to demonstrate the application of the fuzzy granular rules, and numerical experiments are conducted to assess the performance of the classifier.

(4) Based on the granular structure of the fuzzy lower approximation, a novel information entropy is presented for mixed data. The entropy is used to equivalently characterize the attribute reduction of fuzzy rough sets. Furthermore, a feature selection algorithm based on the entropy is developed for mixed data, and numerical experiments are conducted to demonstrate the effectiveness of the improved algorithm.
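
For readers unfamiliar with the classical notions that item (1) starts from, the following is a minimal sketch of rule confidence and confidence-preserving reducts in a classical (equivalence-relation) decision system. It is not the dissertation's algorithm: the toy table `ROWS`, the function names, and the brute-force search are all assumptions made for illustration, and the sketch preserves every rule's confidence exactly rather than working with a threshold or with covering, interval-valued, or fuzzy data.

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical toy decision table: (condition attribute values, decision).
ROWS = [
    ({"a": 1, "b": 0, "c": 0}, "yes"),
    ({"a": 1, "b": 0, "c": 1}, "yes"),
    ({"a": 0, "b": 1, "c": 0}, "no"),
    ({"a": 0, "b": 1, "c": 1}, "no"),
    ({"a": 1, "b": 1, "c": 0}, "yes"),
    ({"a": 1, "b": 1, "c": 0}, "no"),  # conflicts with the previous row
]


def blocks(rows, attrs):
    """Group row indices into equivalence classes by their values on attrs."""
    groups = defaultdict(list)
    for i, (cond, _) in enumerate(rows):
        groups[tuple(cond[a] for a in attrs)].append(i)
    return list(groups.values())


def rule_confidences(rows, attrs):
    """For each row i, confidence of the rule 'class of i -> decision of i':
    the fraction of rows in i's equivalence class sharing i's decision."""
    conf = {}
    for block in blocks(rows, attrs):
        for i in block:
            same = sum(1 for j in block if rows[j][1] == rows[i][1])
            conf[i] = same / len(block)
    return conf


def preserves_confidence(rows, attrs, all_attrs):
    """True if dropping to attrs leaves every rule confidence unchanged."""
    return rule_confidences(rows, attrs) == rule_confidences(rows, all_attrs)


def all_reducts(rows, all_attrs):
    """Brute force: minimal attribute subsets preserving all confidences."""
    found = []
    for k in range(1, len(all_attrs) + 1):
        for subset in combinations(all_attrs, k):
            if any(set(r) <= set(subset) for r in found):
                continue  # a smaller preserving subset already exists
            if preserves_confidence(rows, subset, all_attrs):
                found.append(subset)
    return found


if __name__ == "__main__":
    print(rule_confidences(ROWS, ("a", "b", "c")))
    print(all_reducts(ROWS, ("a", "b", "c")))
```

In this toy table the last two rows agree on every condition attribute but disagree on the decision, so their rules have confidence 0.5 while the others have confidence 1. Attribute `c` is redundant in the sense that removing it changes none of these confidences. The exhaustive subset search is exponential in the number of attributes and is used here only because it is easy to check by hand.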

Translated Keyword

[Attribute reduction, Decision systems, Feature selection, Rough sets, Rule acquisition]

Basic Info ：

Degree：Doctor of Science (理学博士)

Student No.：

Year： 2014

Language： Chinese

Cited Count：

WoS CC Cited Count： 0

30 Days PV： 3

Affiliated Colleges：

School of Mathematics and Statistics; data not explicitly attributed to a unit within this school/department
