Finding the most influential product under distribution constraints through dominance tests

  • Bo Yin
  • Xuetao Wei
  • Yonghe Liu
Article

Abstract

Market analysis is crucial for companies that want to stay competitive in an increasingly fierce market. A typical application is to find the most influential product, that is, the one that attracts the largest number of customers, from a collection of candidate products. Previous work assumes that the candidates are randomly distributed. In many cases, however, the distribution of candidate products is subject to a set of constraints. In this paper, we study the most influential product problem under such distribution constraints. We model two kinds of constraints, non-linear and linear, under which the candidate products reside in a hyper-rectangle and on a hyper-plane of the data space, respectively. We use reverse skyline queries to define the most influential product as the product with the largest reverse skyline set. We propose a general framework that solves the problem efficiently by exploiting the candidate distribution. Specifically, we introduce a constraint-based filtering scheme that, through pre-processing based on the distribution constraints, prunes the search space and quickly identifies some reverse skyline points. We also propose a distance-based ordering technique, so that the processing results of one candidate can be used to prune data for subsequent candidates. By combining the filtering scheme and the ordering technique, we present two algorithms for handling the different constraint models. Experimental results on both real and synthetic datasets demonstrate the effectiveness and efficiency of the proposed algorithms.
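
To make the definition concrete, the following is a minimal brute-force sketch of the "largest reverse skyline set" criterion. It is not the filtering or ordering algorithm proposed in the paper. It assumes the common bichromatic formulation, in which a customer belongs to a candidate's reverse skyline if no existing product is at least as close to the customer's preference in every dimension and strictly closer in at least one. The function names and the sample data are hypothetical.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, ...]


def dynamically_dominates(p: Point, q: Point, c: Point) -> bool:
    """Return True if p dominates q with respect to customer c.

    Assumed dominance test: p is at least as close to c as q in every
    dimension and strictly closer in at least one.
    """
    dp = [abs(pi - ci) for pi, ci in zip(p, c)]
    dq = [abs(qi - ci) for qi, ci in zip(q, c)]
    return all(a <= b for a, b in zip(dp, dq)) and any(a < b for a, b in zip(dp, dq))


def reverse_skyline(q: Point, products: Sequence[Point],
                    customers: Sequence[Point]) -> List[Point]:
    """Customers for whom no existing product dominates candidate q."""
    return [c for c in customers
            if not any(dynamically_dominates(p, q, c) for p in products)]


def most_influential(candidates: Sequence[Point], products: Sequence[Point],
                     customers: Sequence[Point]) -> Point:
    """Candidate with the largest reverse skyline set (ties: first wins)."""
    return max(candidates,
               key=lambda q: len(reverse_skyline(q, products, customers)))


# Hypothetical toy data: one existing product, three customer preferences,
# two candidate products, all in a 2-dimensional data space.
products = [(2.0, 2.0)]
customers = [(1.0, 1.0), (5.0, 5.0), (4.0, 1.0)]
candidates = [(0.0, 0.0), (4.0, 4.0)]

best = most_influential(candidates, products, customers)
```

This sketch runs one dominance test per candidate, product, and customer triple. That exhaustive cost is what pruning techniques such as the filtering scheme and ordering technique described above aim to reduce.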

Keywords

Dominance tests, Reverse skyline queries, Influential products, Potential customers, Distribution constraints

Notes

Acknowledgements

This research was supported by the Natural Science Foundation of Hunan Province under Grant Number 2016JJ3012.

Copyright information

© Springer Science+Business Media, LLC, part of Springer Nature 2018

Authors and Affiliations

  1. Hunan Provincial Key Laboratory of Intelligent Processing of Big Data on Transportation, ChangSha University of Science and Technology, Changsha, China
  2. School of Computer and Communication Engineering, ChangSha University of Science and Technology, Changsha, China
  3. School of Information Technology and Department of Electrical Engineering and Computing Systems, University of Cincinnati, Cincinnati, USA
  4. Department of Computer Science and Engineering, University of Texas at Arlington, Arlington, USA