Horizontal Fragmentation of Data Warehouses Using Decision Trees
Date
2020-08-01
Author
Rodríguez-Mazahua, Nidia
Rodríguez-Mazahua, Lisbeth
López-Chau, Asdrúbal
Alor-Hernández, Giner
Abstract
One of the main problems faced by Data Warehouse (DW) designers is fragmentation. Several studies have proposed data mining-based horizontal fragmentation methods, which focus on optimizing query response time and execution cost to make the DW more efficient. However, to the best of our knowledge, there is no horizontal fragmentation technique that uses a decision tree to carry out fragmentation. Decision trees are important in classification because they yield pure partitions (subsets of tuples) of a data set using measures such as Information Gain, Gain Ratio, and the Gini Index; the aim of this work is therefore to use decision trees in DW fragmentation. To this end, the requirements for horizontal fragmentation using decision trees will be determined and the fragmentation method will be designed. The method will consist of determining the most frequent OLAP (On-line Analytical Processing) queries, analyzing the predicates used by those queries, and building the decision tree from them, from which the horizontal fragments will be generated. The method will be implemented and validated using a case study in tourism.
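As an illustration of the purity measures named in the abstract, the following minimal sketch scores candidate query predicates by Information Gain and splits a small table into two horizontal fragments on the best one. It is not the authors' method: the tourism rows, the idea of labelling each tuple with the frequent query that accesses it, and all function and variable names are assumptions made for this example.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * log2(c / n) for c in Counter(labels).values())


def gini(labels):
    """Gini index of a list of class labels."""
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())


def information_gain(rows, labels, predicate):
    """Entropy reduction obtained by splitting the rows on a boolean predicate."""
    left = [lab for row, lab in zip(rows, labels) if predicate(row)]
    right = [lab for row, lab in zip(rows, labels) if not predicate(row)]
    if not left or not right:
        return 0.0  # the predicate does not separate the tuples at all
    n = len(labels)
    remainder = (len(left) / n) * entropy(left) + (len(right) / n) * entropy(right)
    return entropy(labels) - remainder


# Hypothetical tourism tuples (assumed data, not taken from the case study).
rows = [
    {"region": "coast", "season": "summer"},
    {"region": "coast", "season": "winter"},
    {"region": "inland", "season": "summer"},
    {"region": "inland", "season": "winter"},
]
# Assumed class label: the frequent OLAP query that accesses each tuple.
labels = ["Q1", "Q1", "Q2", "Q2"]

# Candidate predicates, as they might be extracted from frequent queries.
predicates = {
    "region = 'coast'": lambda r: r["region"] == "coast",
    "season = 'summer'": lambda r: r["season"] == "summer",
}

# Pick the predicate with the highest Information Gain as the split.
best = max(predicates, key=lambda name: information_gain(rows, labels, predicates[name]))

# The two branches of the split become two horizontal fragments.
fragment_a = [r for r in rows if predicates[best](r)]
fragment_b = [r for r in rows if not predicates[best](r)]
```

In this toy data the region predicate separates the tuples accessed by Q1 from those accessed by Q2, so it is selected and each fragment is pure. A full decision tree would apply the same selection recursively inside each fragment.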
Subjects
Fragmentation
Data Warehouse
Decision trees