On solving the multiple p-median problem based on biclustering

Abstract

In this paper, we discuss the multiple p-median problem (MPMP), an extension of the original p-median problem, and present several potential applications. The objective of the well-known p-median problem is to locate p facilities so as to minimize the total distance between demand points and facilities, with each demand point covered by its closest facility. In the MPMP, each demand point must be covered by more than one of its closest facilities; the number of facilities covering each demand point is given by the parameter mc. The MPMP can be applied to various location problems, e.g. the provision of emergency services, where alternative facilities are needed to hedge against the unavailability of the primary facility, as well as to other domains, e.g. recommender systems, where it may be desirable to respond to each user query with more than one available choice that satisfies the user's preferences. We efficiently solve the MPMP by using a biclustering heuristic, which creates biclusters from the distance matrix. In the proposed approach, a bicluster represents a subset of demand points covered by a subset of facilities. The heuristic selects appropriate biclusters taking into account the objective of the problem. In experimental tests on known benchmark problems, our method provides solutions slightly inferior to the optimal ones in significantly less computational time than the CPLEX optimizer. In larger test instances, our method outperforms CPLEX in terms of both computational time and solution quality when a time bound of 1 h is set for obtaining a solution.
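To make the difference between the p-median problem and the MPMP concrete, the following minimal sketch evaluates the objective for a given set of open facilities and finds the best set by exhaustive enumeration on a toy instance. It rests on assumptions not stated in the abstract: the objective is taken to be the sum, over all demand points, of the distances to their mc closest open facilities, and the function names (`mpmp_cost`, `brute_force_mpmp`) and the distance matrix `DIST` are hypothetical. It illustrates the problem only; it is not the biclustering heuristic proposed in the paper.

```python
from itertools import combinations


def mpmp_cost(dist, open_facilities, mc):
    """Total distance when each demand point is served by its mc nearest
    open facilities (assumed objective; mc = 1 gives the classical p-median)."""
    if mc > len(open_facilities):
        raise ValueError("mc cannot exceed the number of open facilities")
    total = 0
    for row in dist:  # one row of the distance matrix per demand point
        nearest = sorted(row[j] for j in open_facilities)[:mc]
        total += sum(nearest)
    return total


def brute_force_mpmp(dist, p, mc):
    """Enumerate every set of p facilities; only viable for tiny instances."""
    n_facilities = len(dist[0])
    best = min(
        combinations(range(n_facilities), p),
        key=lambda subset: mpmp_cost(dist, subset, mc),
    )
    return best, mpmp_cost(dist, best, mc)


# Made-up toy data: 4 demand points (rows) and 4 candidate facilities (columns).
DIST = [
    [0, 2, 7, 9],
    [2, 0, 5, 8],
    [7, 5, 0, 3],
    [9, 8, 3, 0],
]

if __name__ == "__main__":
    print(brute_force_mpmp(DIST, p=2, mc=1))  # classical p-median
    print(brute_force_mpmp(DIST, p=3, mc=2))  # each point covered twice
```

Exhaustive enumeration grows combinatorially with the number of candidate facilities, which is why heuristics or exact solvers with time bounds are used on benchmark-sized instances.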

Acknowledgements

The authors are grateful to the anonymous reviewers, whose detailed evaluation and constructive comments and suggestions significantly improved this work. The authors also wish to thank Aristotelis Kompothrekas, Ph.D. candidate at Patras University, for his valuable contribution in performing the experimental tests.

Author information

Corresponding author

Correspondence to Antiopi Panteli.

About this article

Cite this article

Panteli, A., Boutsinas, B. & Giannikos, I. On solving the multiple p-median problem based on biclustering. Oper Res Int J (2019). https://doi.org/10.1007/s12351-019-00461-9

Keywords

  • Location
  • p-median problem
  • Data mining
  • Biclustering
  • Artificial intelligence