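A maximal frequent itemset mining algorithm based on effective pruning
Cite this article: Chen Peng, Lü Weifeng. A maximal frequent itemset mining algorithm based on effective pruning[J]. Journal of Beijing University of Aeronautics and Astronautics, 2006, 32(2): 218-223
Authors: Chen Peng, Lü Weifeng
Affiliation: School of Computer Science, Beijing University of Aeronautics and Astronautics, Beijing 100083
Funding: Research project of the Ministry of Science and Technology; project funded by the Chinese Academy of Sciences
Abstract: This paper studies the problem of mining maximal frequent itemsets in association mining and proposes a mining algorithm based on pruning the itemset lattice. Using an itemset lattice spanning tree as the data structure, the mining process is turned into a depth-first search of that tree that collects all maximal frequent nodes. An important measure for improving efficiency is pruning the spanning tree while traversing it. Three properties of the itemset lattice spanning tree are given, and on that basis three pruning mechanisms are proposed: direct superset pruning, indirect superset pruning, and transaction set equivalence pruning. These skip infrequent nodes and the extension nodes generated from them wherever possible, reducing the number of nodes traversed. Experimental results show that all three pruning mechanisms reduce the search space effectively, that transaction set equivalence pruning has the greatest effect, and that the algorithm's performance depends on the density of the input dataset.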

Keywords: data mining; association rules; association mining
Article ID: 1001-5965(2006)02-0218-06
Received: 2005-01-10
Revised: 2005-01-10

Maximal frequent itemsets mining algorithm based on effective pruning mechanisms
Chen Peng,Lü Weifeng. Maximal frequent itemsets mining algorithm based on effective pruning mechanisms[J]. Journal of Beijing University of Aeronautics and Astronautics, 2006, 32(2): 218-223
Authors: Chen Peng, Lü Weifeng
Affiliation: School of Computer Science and Technology, Beijing University of Aeronautics and Astronautics, Beijing 100083, China
Abstract: The problem of mining maximal frequent itemsets was studied, and an algorithm based on effective pruning of the itemset lattice was proposed. An itemset lattice tree data structure was adopted, turning maximal frequent itemset mining into a depth-first search of the itemset lattice tree. A key measure for improving the algorithm's performance is pruning the tree while traversing it. Three properties of the itemset lattice tree were given, and three pruning mechanisms were proposed on that basis: direct superset pruning, indirect superset pruning, and transaction set equivalence pruning. These skip infrequent nodes and their extension nodes, reducing the number of nodes visited during traversal. Test results indicate that all three pruning mechanisms reduce the search space effectively, that transaction set equivalence pruning contributes most to performance, and that the algorithm's performance depends on the density of the dataset.
Keywords: data mining; association rule; association mining; lattice
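
The abstract names the search strategy and the pruning mechanisms but does not state the three tree properties or the exact pruning rules, so the following is only a minimal sketch of the general idea, not the authors' algorithm. It assumes a vertical layout (each item mapped to the set of transaction ids containing it), a superset check against maximal itemsets already found, and a shortcut that folds an item into the current node when adding it leaves the transaction set unchanged. The function name `mine_maximal` and all parameter names are hypothetical.

```python
from typing import FrozenSet, Hashable, Iterable, List, Set, Tuple


def mine_maximal(transactions: Iterable[Iterable[Hashable]],
                 min_support: int) -> List[FrozenSet]:
    """Depth-first search for maximal frequent itemsets (illustrative sketch)."""
    transactions = [set(t) for t in transactions]

    # Vertical layout: item -> ids of the transactions that contain it.
    tidsets = {}
    for tid, items in enumerate(transactions):
        for item in items:
            tidsets.setdefault(item, set()).add(tid)

    frequent = sorted(i for i, tids in tidsets.items() if len(tids) >= min_support)
    maximal: List[FrozenSet] = []

    def covered(itemset: FrozenSet) -> bool:
        # True if a maximal itemset found so far already contains itemset.
        return any(itemset <= m for m in maximal)

    def dfs(head: FrozenSet, head_tids: Set[int], tail: List) -> None:
        # Keep only the tail items whose extension of head is still frequent.
        exts: List[Tuple[Hashable, Set[int]]] = []
        for item in tail:
            tids = head_tids & tidsets[item]
            if len(tids) >= min_support:
                exts.append((item, tids))

        # Equivalence shortcut (assumed form): an item that occurs in every
        # transaction containing head belongs to every maximal set below
        # this node, so fold it into head instead of branching on it.
        same = [i for i, t in exts if len(t) == len(head_tids)]
        head = head | frozenset(same)
        exts = [(i, t) for i, t in exts if len(t) < len(head_tids)]

        # Superset pruning (assumed form): if head plus every remaining
        # candidate is inside a known maximal set, the subtree adds nothing.
        if covered(head | frozenset(i for i, _ in exts)):
            return

        if not exts:
            if head:
                maximal.append(head)
            return

        for k, (item, tids) in enumerate(exts):
            dfs(head | {item}, tids, [i for i, _ in exts[k + 1:]])

    dfs(frozenset(), set(range(len(transactions))), frequent)
    return maximal


if __name__ == "__main__":
    data = [{"a", "b", "c"}, {"a", "b", "c"}, {"a", "b", "d"}, {"b", "d"}, {"c", "d"}]
    for itemset in mine_maximal(data, min_support=2):
        print(sorted(itemset))
```

Because branches are explored in a fixed item order, any frequent superset of a leaf that uses an earlier item has already been reached, which is why a single check against the sets found so far is enough to decide whether a leaf is maximal.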
This article is indexed in CNKI, VIP, Wanfang Data, and other databases.