Classifying modeling and simulation as a scientific discipline

Scientometrics

Abstract

The body of knowledge related to modeling and simulation (M&S) comes from a variety of constituents: (1) practitioners and users, (2) tool developers, and (3) theorists and methodologists. Previous work has shown that categorizing M&S as a concentration in an existing, broader discipline is inadequate because it does not provide a uniform basis for research and education across all institutions. This article presents an approach for the classification of M&S as a scientific discipline and a framework for ensuing analysis. The novelty of the approach lies in its application of machine learning classification to documents containing unstructured text (e.g., publications, funding solicitations) from a variety of established and emerging disciplines related to modeling and simulation. We demonstrate that machine learning classification models can be trained to accurately separate M&S from related disciplines using the abstracts of well-indexed research publication repositories. We evaluate the accuracy of our trained classifiers using cross-validation. Then, we demonstrate that our trained classifiers can effectively identify a set of previously unseen M&S funding solicitations and grant proposals. Finally, we use our approach to uncover new funding trends in M&S and support a uniform basis for education and research.
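The workflow summarized in the abstract (train a text classifier on labeled abstracts, estimate its accuracy with cross-validation, then apply it to unseen documents) can be illustrated with a minimal sketch. The abstract does not say which classifier or how many folds were used, so the multinomial naive Bayes model with Laplace smoothing, the fold count, the function names, and the six toy "abstracts" below are all assumptions made for illustration, not the authors' implementation or data.

```python
import math
import re
from collections import Counter, defaultdict

def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())

def train_naive_bayes(docs, labels):
    """Collect per-class document counts, per-class word counts, and the vocabulary."""
    class_docs = Counter(labels)
    word_counts = defaultdict(Counter)
    vocab = set()
    for text, label in zip(docs, labels):
        tokens = tokenize(text)
        word_counts[label].update(tokens)
        vocab.update(tokens)
    return class_docs, word_counts, vocab

def predict(model, text):
    """Return the class with the highest log-posterior (Laplace-smoothed)."""
    class_docs, word_counts, vocab = model
    total_docs = sum(class_docs.values())
    best_label, best_score = None, -math.inf
    for label in sorted(class_docs):
        score = math.log(class_docs[label] / total_docs)
        denom = sum(word_counts[label].values()) + len(vocab)
        for tok in tokenize(text):
            if tok in vocab:  # words never seen in training are ignored
                score += math.log((word_counts[label][tok] + 1) / denom)
        if score > best_score:
            best_label, best_score = label, score
    return best_label

def cross_validate(docs, labels, k):
    """Estimate accuracy with k folds: train on k-1 folds, test on the held-out fold."""
    correct = 0
    for fold in range(k):
        train_idx = [i for i in range(len(docs)) if i % k != fold]
        test_idx = [i for i in range(len(docs)) if i % k == fold]
        model = train_naive_bayes([docs[i] for i in train_idx],
                                  [labels[i] for i in train_idx])
        correct += sum(predict(model, docs[i]) == labels[i] for i in test_idx)
    return correct / len(docs)

# Hypothetical toy abstracts, invented for this sketch (not data from the article).
DOCS = [
    "discrete event simulation model of hospital queues",
    "gene expression sequencing of tumor cells",
    "agent based simulation model validation study",
    "protein sequencing and gene annotation pipeline",
    "verification of a simulation model for traffic flow",
    "genome sequencing reveals gene variants in cells",
]
LABELS = ["M&S", "other", "M&S", "other", "M&S", "other"]

if __name__ == "__main__":
    print("cross-validated accuracy:", cross_validate(DOCS, LABELS, k=3))
    model = train_naive_bayes(DOCS, LABELS)
    # Apply the trained model to a previously unseen (made-up) solicitation text.
    print(predict(model, "call for proposals on simulation model verification"))
```

On a realistic corpus, the held-out accuracy from `cross_validate` is the quantity to inspect before trusting predictions on unseen solicitations. A toy set this small only shows the mechanics.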

Acknowledgments

We gratefully acknowledge the support of our colleagues at the Virginia Modeling, Analysis and Simulation Center (VMASC), the University of Virginia (UVA), and Gettysburg College in manually classifying the 1,000 NSF and NIH grants used in the evaluation.

Author information

Corresponding author

Correspondence to Ross Gore.

About this article

Cite this article

Gore, R., Diallo, S. & Padilla, J. Classifying modeling and simulation as a scientific discipline. Scientometrics 109, 615–628 (2016). https://doi.org/10.1007/s11192-016-2050-y

  • DOI: https://doi.org/10.1007/s11192-016-2050-y
