(Journal Article)
Special Issue on Scalable Computing for Big Data 1 (0): 23-36 (2014)

GDPS: An Efficient Approach for Skyline Queries over Distributed Uncertain Data

Xiaoyong Li, Yijie Wang, Xiaoling Li, Xiaowei Wang, Jie Yu
Abstract: The skyline query, an important aspect of big data management, has received considerable attention from the database community because of its importance in many applications, including multi-criteria decision making and preference answering. Moreover, the uncertain data produced by many applications have become increasingly distributed, which makes assembling the data at one central location for storage and querying infeasible and inefficient. The lack of global knowledge, together with the computational complexity introduced by data uncertainty, makes the skyline query over distributed uncertain data extremely challenging. Although many efforts have addressed the skyline query problem in various distributed scenarios, existing studies still lack approaches that process the query efficiently. In this paper, we extensively study the distributed probabilistic skyline query problem and propose an efficient approach, GDPS, which addresses the problem with an optimized iterative feedback mechanism based on a grid summary. Furthermore, we propose several strategies for further optimizing the query, covering local pruning, tuple selection, and server pruning. Extensive experiments on real and synthetic data sets have been conducted to verify the effectiveness and efficiency of our approach by comparing it with state-of-the-art approaches.
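The abstract does not define the probabilistic skyline formally. As a rough illustration of the underlying query only, the sketch below assumes the tuple-level uncertainty model common in the literature: each tuple exists independently with some probability, and its skyline probability is its own existence probability multiplied by the probability that no tuple dominating it exists. This is a centralized, brute-force computation and not the GDPS approach itself; it includes no grid summary, feedback mechanism, or pruning. All names (`dominates`, `skyline_probability`, `probabilistic_skyline`) and the sample data are hypothetical, and smaller attribute values are assumed to be preferred.

```python
from typing import List, Sequence, Tuple

# Assumed model: an uncertain tuple is (attribute values, existence probability).
UncertainTuple = Tuple[Sequence[float], float]


def dominates(a: Sequence[float], b: Sequence[float]) -> bool:
    """True if a is no worse than b in every dimension and strictly
    better in at least one (assuming smaller values are preferred)."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))


def skyline_probability(index: int, data: List[UncertainTuple]) -> float:
    """Probability that data[index] exists and no dominating tuple exists,
    assuming tuples exist independently of one another."""
    values, prob = data[index]
    result = prob
    for j, (other_values, other_prob) in enumerate(data):
        if j != index and dominates(other_values, values):
            result *= 1.0 - other_prob
    return result


def probabilistic_skyline(data: List[UncertainTuple], threshold: float) -> List[int]:
    """Indices of tuples whose skyline probability reaches the threshold."""
    return [i for i in range(len(data)) if skyline_probability(i, data) >= threshold]


# Hypothetical sample data: (1, 4), (2, 2), and (4, 1) are mutually
# non-dominated, while (3, 3) is dominated only by (2, 2).
sample = [((1, 4), 0.9), ((2, 2), 0.5), ((3, 3), 0.8), ((4, 1), 0.6)]
qualifying = probabilistic_skyline(sample, threshold=0.5)
```

In this sample, the tuple `(3, 3)` has existence probability 0.8 but is dominated by a tuple that exists with probability 0.5, so its skyline probability drops to 0.8 × 0.5 = 0.4 and it falls below the 0.5 threshold. In a distributed setting the dominating tuples may reside on other sites, which is why a site cannot compute this product from its local data alone.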
ISSN: 2214-5796