Document Type

Article

Publication Date

2018

DOI

10.3390/info9050119

Publication Title

Information

Volume

9

Issue

5

Pages

119 (13 pages)

Abstract

High utility itemsets (HUIs) are sets of items with high utility, such as profit, in a database. Efficient mining of high utility itemsets is an important problem in data mining. Many mining algorithms adopt a two-phase framework: they first generate a set of candidate itemsets by roughly overestimating the utilities of all itemsets in a database, and then compute the exact utility of each candidate to identify HUIs. The major costs in these algorithms therefore come from candidate generation and utility computation. To the best of our knowledge, previous works mainly focus on reducing the number of candidates and pay little attention to utility computation. However, we find that, for a mining task, utility computation dominates the total running time of two-phase algorithms, so optimizing it is important. In this paper, we first give a basic algorithm for HUI identification, the core of which is a utility computation procedure. We then propose a novel candidate tree structure for storing candidate itemsets and develop a candidate tree-based algorithm for fast HUI identification that includes an efficient utility computation procedure. Extensive experimental results show that the candidate tree-based algorithm outperforms the basic algorithm, and that the performance of two-phase algorithms can be significantly improved by integrating the candidate tree-based algorithm as their second step.
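
To make the second phase described in the abstract concrete, the following is a minimal sketch of a naive HUI identification step: it scans the database once per candidate check and sums the exact utility of each candidate over the transactions that contain it. It is not the article's basic algorithm or its candidate tree-based algorithm. The function name `identify_huis`, the transaction representation (a mapping from item to that item's utility in the transaction), and the threshold name `min_util` are assumptions made for illustration.

```python
from typing import Dict, FrozenSet, Iterable, List

# Assumed representation: each transaction maps an item to the utility
# that item contributes in that transaction (e.g. quantity * unit profit).
Transaction = Dict[str, int]
Itemset = FrozenSet[str]


def identify_huis(
    database: List[Transaction],
    candidates: Iterable[Itemset],
    min_util: int,
) -> Dict[Itemset, int]:
    """Return the candidates whose exact utility is at least min_util."""
    candidates = list(candidates)
    totals: Dict[Itemset, int] = {cand: 0 for cand in candidates}

    for tx in database:
        for cand in candidates:
            # A candidate only gains utility from transactions that
            # contain every one of its items.
            if all(item in tx for item in cand):
                totals[cand] += sum(tx[item] for item in cand)

    return {cand: util for cand, util in totals.items() if util >= min_util}


if __name__ == "__main__":
    # Hypothetical toy data, not taken from the article.
    db = [
        {"a": 5, "b": 10},
        {"a": 5, "c": 3},
        {"a": 10, "b": 4, "c": 6},
    ]
    cands = [frozenset({"a"}), frozenset({"a", "b"}), frozenset({"b", "c"})]
    for itemset, utility in identify_huis(db, cands, min_util=20).items():
        print(sorted(itemset), utility)
```

Every candidate is tested against every transaction here, which is the kind of repeated work that makes utility computation expensive when the candidate set is large.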

Comments

This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license.

(C) 2018 by the authors. Licensee MDPI, Basel, Switzerland.

Original Publication Citation

Qu, J.-F., Liu, M., Xin, C., & Wu, Z. (2018). Fast identification of high utility itemsets from candidates. Information, 9(5), 119. doi:10.3390/info9050119
