Network-based Models and Algorithms in Data Mining and Knowledge Discovery

5 Concluding Remarks

In this chapter, we have addressed several issues regarding the use of network-based mathematical programming techniques for solving various problems arising in the broad area of data mining. These approaches have often proved effective in applications such as biomedicine, finance, and telecommunications. In particular, if a real-world massive dataset can be appropriately represented as a network structure, its analysis using standard graph-theoretical techniques often yields important practical results.
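
As a rough illustration of what representing a dataset as a network can look like, the following minimal Python sketch connects two items whenever the correlation between their data series reaches a threshold, then reads off two standard graph-theoretical quantities (vertex degrees and connected components). The correlation-threshold construction, the function names, the threshold value, and the toy data are all assumptions made for this example, not a procedure prescribed by the chapter.

```python
from itertools import combinations
from statistics import mean


def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def build_graph(series, threshold):
    """Connect two items when their correlation is at least `threshold`."""
    adjacency = {name: set() for name in series}
    for u, v in combinations(series, 2):
        if pearson(series[u], series[v]) >= threshold:
            adjacency[u].add(v)
            adjacency[v].add(u)
    return adjacency


def connected_components(adjacency):
    """Return the connected components as a list of sets (iterative DFS)."""
    seen, components = set(), []
    for start in adjacency:
        if start in seen:
            continue
        stack, component = [start], set()
        while stack:
            node = stack.pop()
            if node in component:
                continue
            component.add(node)
            stack.extend(adjacency[node] - component)
        seen |= component
        components.append(component)
    return components


# Hypothetical toy data: four short series invented for illustration.
series = {
    "a": [1.0, 2.0, 3.0, 4.0],
    "b": [2.0, 4.1, 5.9, 8.0],
    "c": [4.0, 3.0, 2.0, 1.0],
    "d": [8.0, 6.1, 3.9, 2.0],
}
graph = build_graph(series, threshold=0.9)  # threshold is an assumed value
degrees = {name: len(neighbors) for name, neighbors in graph.items()}
components = connected_components(graph)
```

On this toy input the rising series "a" and "b" end up in one component and the falling series "c" and "d" in another. On a real dataset the choice of similarity measure and threshold determines the resulting structure, so both would need to be justified for the data at hand.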

However, one should clearly understand that the success or failure of a given methodology depends essentially on the structure of the dataset under consideration, and there is no “universal recipe” that would allow one to obtain useful information from any type of data. This indicates that, despite the availability of a great variety of data mining techniques and software packages, choosing an appropriate method for analyzing a given dataset remains a non-trivial task.

Moreover, as technological progress continues, new types of datasets may emerge in different practical fields, which will motivate further research on data mining algorithms. Therefore, developing and modifying mathematical programming approaches in data mining will remain an exciting and challenging research area for years to come.

© 2004 Kluwer Academic Publishers

About this chapter

Cite this chapter

Boginski, V., Pardalos, P.M., Vazacopoulos, A. (2004). Network-based Models and Algorithms in Data Mining and Knowledge Discovery. In: Du, DZ., Pardalos, P.M. (eds) Handbook of Combinatorial Optimization. Springer, Boston, MA. https://doi.org/10.1007/0-387-23830-1_5
