5 Concluding Remarks
In this chapter, we have addressed several issues regarding the use of network-based mathematical programming techniques for solving problems arising in the broad area of data mining. We have pointed out that these approaches have often proved effective in many applications, including biomedicine, finance, and telecommunications. In particular, if a massive real-world dataset can be appropriately represented as a network structure, its analysis using standard graph-theoretical techniques often yields important practical results.
However, one should clearly understand that the success or failure of a given methodology depends essentially on the structure of the dataset under consideration, and there is no “universal recipe” that would allow one to obtain useful information from any type of data. This indicates that, despite the availability of a great variety of data mining techniques and software packages, choosing an appropriate method for analyzing a particular dataset is a non-trivial task.
Moreover, as technological progress continues, new types of datasets may emerge in different practical fields, which will motivate further research on data mining algorithms. Therefore, developing and modifying mathematical programming approaches in data mining will remain an exciting and challenging research area for years to come.
Keywords
- Data Mining
- Cluster Center
- Degree Distribution
- Maximum Clique
- Greedy Randomized Adaptive Search Procedure
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.
Copyright information
© 2004 Kluwer Academic Publishers
About this chapter
Cite this chapter
Boginski, V., Pardalos, P.M., Vazacopoulos, A. (2004). Network-based Models and Algorithms in Data Mining and Knowledge Discovery. In: Du, DZ., Pardalos, P.M. (eds) Handbook of Combinatorial Optimization. Springer, Boston, MA. https://doi.org/10.1007/0-387-23830-1_5
DOI: https://doi.org/10.1007/0-387-23830-1_5
Publisher Name: Springer, Boston, MA
Print ISBN: 978-0-387-23829-6
Online ISBN: 978-0-387-23830-2
eBook Packages: Mathematics and Statistics (R0)