Authors
1 Assistant Professor, Shahid Sattari University of Science and Technology
2 Assistant Professor, Shahid Sattari Air University
Abstract
Keywords
Article Title [English]
Authors [English]
Mining frequent patterns over data streams is challenging because of the speed of input streams in real applications and the limits on processing and storage. Several models exist for mining frequent patterns over data streams. The time-sensitive sliding window model is preferable because it captures both concept change and the varying speed of the input data. Adding transactions to and removing them from the sliding window changes the set of frequent patterns, and the approach used to compute or approximate the frequency of new itemsets directly affects the efficiency of the mining algorithm. In this study, for the first time, a probabilistic estimation is used to approximate the support values of new frequent itemsets. Based on this approximation, a new algorithm is proposed that mines the set of frequent patterns within a time-sensitive sliding window. The algorithm uses a novel prefix-tree-based data structure to store the set of frequent patterns of the active window. Experimental evaluations on real-life and synthetically generated datasets show that the proposed algorithm outperforms previously proposed approaches in terms of memory usage and runtime.
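To make the idea of a time-sensitive sliding window concrete, the following is a minimal Python sketch, not the algorithm proposed in the paper. It keeps only transactions whose timestamps fall within the last `span` time units, so the number of transactions in the window varies with the speed of the stream. The class and method names are hypothetical, and the support estimate shown assumes that items occur independently; it is a simple illustrative stand-in and should not be read as the probabilistic estimation the authors use. The prefix-tree structure mentioned in the abstract is not modeled here.

```python
from collections import Counter, deque


class TimeSlidingWindow:
    """Hypothetical helper: holds transactions from the last `span` time units."""

    def __init__(self, span):
        self.span = span
        self.transactions = deque()   # (timestamp, frozenset of items), oldest first
        self.item_counts = Counter()  # exact support count of each single item

    def add(self, timestamp, items):
        """Insert a transaction, then expire those that left the window."""
        itemset = frozenset(items)
        self.transactions.append((timestamp, itemset))
        self.item_counts.update(itemset)
        # The active window covers the interval (timestamp - span, timestamp].
        while self.transactions and self.transactions[0][0] <= timestamp - self.span:
            _, expired = self.transactions.popleft()
            self.item_counts.subtract(expired)

    def exact_support(self, itemset):
        """Count window transactions containing every item of `itemset`."""
        target = frozenset(itemset)
        return sum(1 for _, t in self.transactions if target <= t)

    def estimated_support(self, itemset):
        """Assumed independence estimate: N * product of count(item) / N."""
        n = len(self.transactions)
        if n == 0:
            return 0.0
        estimate = float(n)
        for item in itemset:
            estimate *= self.item_counts[item] / n
        return estimate


window = TimeSlidingWindow(span=10)
window.add(1, {"a", "b"})
window.add(3, {"a", "c"})
window.add(5, {"a", "b", "c"})
window.add(12, {"b"})  # the transaction at time 1 expires here
```

Comparing `window.exact_support({"a", "b"})` with `window.estimated_support({"a", "b"})` shows the trade-off such an approximation makes: the estimate needs only per-item counts rather than a scan of the window, at the cost of accuracy when items are correlated.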
Keywords [English]