SNP-Schizo: A Web Tool for Schizophrenia SNP Sequence Classification

  • Vanessa Aguiar-Pulido
  • José A. Seoane
  • Cristian R. Munteanu
  • Alejandro Pazos
Part of the Lecture Notes in Computer Science book series (LNCS, volume 6692)


This work presents a web tool that implements online the best machine learning-based model obtained in an exhaustive computational study. In that study, twelve techniques were applied to schizophrenia data to derive Quantitative Genotype-Disease Relationships (QGDRs) for disease prediction. The model implemented online is a linear neural network. The tool accepts SNP sequences containing the SNPs considered in the study and uses them to classify a patient. In the future, QGDR models could be extended to other diseases.


Keywords: SNP; schizophrenia; machine learning; neural networks; data mining; bioinformatics





Copyright information

© Springer-Verlag Berlin Heidelberg 2011

Authors and Affiliations

  • Vanessa Aguiar-Pulido (1)
  • José A. Seoane (1)
  • Cristian R. Munteanu (1)
  • Alejandro Pazos (1)
  1. Information and Communication Technologies Department, Faculty of Informatics, University of A Coruña, Spain
