BATCH PROCESSING FOR INCREMENTALLY MINING CLOSED ITEMSETS WITH MAPREDUCE
THANH-TRUNG NGUYEN *
Department of Computer Science, University of Information Technology, Vietnam National University, HCM City, Vietnam
HUE-MINH NGUYEN
Department of Computer Science, University of Information Technology, Vietnam National University, HCM City, Vietnam
PHI-KHU NGUYEN
Department of Computer Science, University of Information Technology, Vietnam National University, HCM City, Vietnam
*Author to whom correspondence should be addressed.
Abstract
Discovering closed frequent itemsets is a fundamental problem in data mining, with applications in numerous domains. Research on incrementally mining closed itemsets mainly relies on an intermediate structure, the concept lattice, which is updated whenever the data change. In previous work, we proposed incrementally mining all closed itemsets with a linear list instead of the concept lattice. Meanwhile, MapReduce offers a complete infrastructure for parallel processing. Continuing the previous study, this paper proposes a batch-processing method for incrementally mining closed itemsets in MapReduce. To the best of our knowledge, this is the first batch-processing algorithm for incrementally mining closed itemsets in MapReduce. Initial experiments indicate that the proposed algorithm is effective.
Keywords: Batch processing, closed itemsets, data mining, incremental mining, MapReduce